<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Smackdown! &#187; search engines</title>
	<atom:link href="http://smackdown.blogsblogsblogs.com/category/search-engines/feed/" rel="self" type="application/rss+xml" />
	<link>http://smackdown.blogsblogsblogs.com</link>
	<description>Smackdown!</description>
	<lastBuildDate>Tue, 22 Nov 2011 22:40:24 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Google Says &#8220;Fuck It&#8221; For The Christmas Season, Removes The Ability To Report AdSense Violations</title>
		<link>http://smackdown.blogsblogsblogs.com/2011/11/22/google-says-fuck-it-for-the-christmas-season-removes-the-ability-to-report-adsense-violations/</link>
		<comments>http://smackdown.blogsblogsblogs.com/2011/11/22/google-says-fuck-it-for-the-christmas-season-removes-the-ability-to-report-adsense-violations/#comments</comments>
		<pubDate>Tue, 22 Nov 2011 20:57:49 +0000</pubDate>
		<dc:creator>Michael VanDeMar</dc:creator>
				<category><![CDATA[blogthropology]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[lackofmeds]]></category>
		<category><![CDATA[marketing]]></category>
		<category><![CDATA[scams]]></category>
		<category><![CDATA[search engines]]></category>

		<guid isPermaLink="false">http://smackdown.blogsblogsblogs.com/?p=1053</guid>
		<description><![CDATA[It has to be tough policing a program like AdSense. It must be exceptionally difficult during the holiday season, when the payoff to running scams grows so much more. It is so tough, in fact, that this year as the holiday shopping season grows near, with Black Friday just a few short days away, that [...]]]></description>
			<content:encoded><![CDATA[<div style="float:right; margin: 4px;"><img src="http://smackdown.blogsblogsblogs.com/images/googlecanthearyou.png" onmouseup="hl2l(event);" alt="Google Cant Hear You!"></div>
<p>It has to be tough policing a program like AdSense. It must be exceptionally difficult during the holiday season, when the payoff from running scams grows so much larger. It is so tough, in fact, that this year, as the holiday shopping season grows near, with Black Friday just a few short days away, Google has apparently finally decided to say &#8220;fuck it&#8221;, make it easier on themselves, just remove the ability for anyone to report any violations of the program whatsoever, and allow the scammers to have a field day in the meantime.</p>
<p>While Google may want to give the impression to their stockholders and the public that they have both the search engine spam and advertising program cheaters fully under control, the truth is that they rely quite a bit on reports from the community and consumers for both spam and AdSense violations. For any spam that they find, Google asks <span id="more-1053"></span>people to submit a <a href="https://www.google.com/webmasters/tools/spamreport?hl=en" target="_blank">Google spam report</a>. At this point they require that someone log in before actually filing the report itself. This makes sense, since it helps prevent people erroneously filing large numbers of spam reports against their competitors. For the AdSense violations they supply a separate form that does not require a log in, titled simply <a href="http://www.google.com/adsense/support/bin/topic.py?hl=en&#038;topic=1190500&#038;ctx=as2&#038;rd=1" target="_blank">Reporting a Violation &#8211; AdSense Help</a>. Usually I don&#8217;t run into offending sites with AdSense on them that fill me with enough of a sense of civic duty that I feel compelled to actually fill out a report, but I happened to land on one such site today that actually tricked me into clicking on an ad in such a way that it really did annoy me. The page I landed on was <a href="http://www.bigsiteofamazingfacts.com/how-much-does-the-earth-weigh" target="_blank" rel="nofollow">BigSiteofAmazingFacts How Much Does The Earth Weigh</a> (yes, I was distracted by trivial shit again, don&#8217;t judge me), and in the right sidebar there was what appeared to be an embedded Youtube Video from Family Guy:</p>
<p>&nbsp;</p>
<p><img src="http://smackdown.blogsblogsblogs.com/images/howmuchdoestheearthweigh.png" onmouseup="hl2l(event);" alt="I see a video"></p>
<p>&nbsp;</p>
<p>Still distracted (of course) I clicked Play on the video, only instead of playing it suddenly brought me to a site trying to sell me bras. So, thinking I must have <em>missed</em> the rather large video in the sidebar when I tried to click on it, I hit the back button&#8230; and noticed that suddenly the video was gone altogether, and where before I had seen 2 AdSense blocks and a video, now there were 3 AdSense blocks instead:</p>
<p>&nbsp;</p>
<p><img src="http://smackdown.blogsblogsblogs.com/images/howmuchdoestheearthweigh2.png" onmouseup="hl2l(event);" alt="What video?"></p>
<p>&nbsp;</p>
<p>I hit refresh a few times but the video didn&#8217;t return. At that point I realized that it was actually a scam, so I cleared my cookies for that domain, hit refresh again, and voil&#224;, the &#8220;video&#8221; reappeared. At this point I was sufficiently irked that I actually decided I was going to report this asshole. It&#8217;s bad enough that a site with crap content like this is ranking #1 (the weight of the Earth is increasing each year from salt from the ocean spray? Seriously, wtf?), while people with content that is just fine are getting penalized supposedly from the Panda fallout. To add in that the guy who owns the site is ripping off advertisers as well just makes it so much worse. So, I headed on over to the AdSense Violation report to be a good citizen&#8230; and I was greeted by this:</p>
<p>&nbsp;</p>
<p><img src="http://smackdown.blogsblogsblogs.com/images/adsense-violation-report-missing.png" onmouseup="hl2l(event);" alt="What AdSense violation report?"></p>
<p>&nbsp;</p>
<p>An essentially blank page, with only a header, navigation, and a box asking me to tell AdSense how they can improve. Go figure.</p>
<p>From a financial perspective it does make sense for Google to make reporting AdSense violators more difficult, especially during the holidays. People who run scams like this actually generate money for Google through the AdSense program, a program which currently has <a href="http://musictechpolicy.wordpress.com/2011/09/27/will-google-adsense-submit-the-power-of-google-to-voluntary-oversight/" target="_blank">absolutely no oversight</a>. It is exactly this lack of oversight that means that Google is the only one who knows how much, if any, of the advertising dollars are credited back to the advertisers once these scams are revealed. Hiding the violations report means that far fewer sites will be reported, more scams will be able to run for longer periods of time, and more money will wind up in Google&#8217;s pockets.</p>
<p>Is this profit motive really the reason that the report form is missing? If you ask Google I am sure they would say &#8220;of course not, we&#8217;re Google, you can trust us&#8221;. And since everything with Google is proprietary &#8220;behind closed doors&#8221; trade secrets with them, there is no way to know exactly how many violation reports suddenly went missing that apparently no one has noticed yet. My hunch though is that with something like this, as online shopping hits the holiday rush, the lack of reports that are coming in at the moment is actually too big for them not to have noticed by now, and them not fixing it for this long must be at least in some part intentional on their end.</p>
<p><strong>Update</strong>: As Jen from <a href="http://www.jensense.com/" target="_blank">JenSense.com</a> pointed out in the comments, there is another newer page available where you can actually file the report <a href="http://www.google.com/adsense/support/as/bin/static.py?page=ts.cs&#038;ts=1190500" target="_blank">located here</a>. However, I am not sure that makes it any better, and it may in fact make it worse. I wound up on the empty page by actually going to Google and searching for [<a href="http://www.google.com/search?q=report+adsense+violation&#038;num=10" target="_blank">report adsense violation</a>]. The page that Jen provided is in the list, but it is down under the blank page that I found, another unhelpful blank page, and a list of discussions from other people looking for the form. This raises the question&#8230; why did Google leave an otherwise empty page behind with just enough text (i.e. header and title) and all of the old link juice there to outrank the &#8220;real&#8221; form? If they redesigned the site, then why not 301 redirect the old form(s) to the new one? It&#8217;s not like they don&#8217;t know how search engines work, ya know?</p>
]]></content:encoded>
			<wfw:commentRss>http://smackdown.blogsblogsblogs.com/2011/11/22/google-says-fuck-it-for-the-christmas-season-removes-the-ability-to-report-adsense-violations/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>How Matt Cutts Leveraged The Stack Overflow And Hacker News Communities In Redefining The Phrase &#8220;Content Farms&#8221;</title>
		<link>http://smackdown.blogsblogsblogs.com/2011/01/31/how-matt-cutts-leveraged-the-stack-overflow-and-hacker-news-communities-in-redefining-the-phrase-content-farms/</link>
		<comments>http://smackdown.blogsblogsblogs.com/2011/01/31/how-matt-cutts-leveraged-the-stack-overflow-and-hacker-news-communities-in-redefining-the-phrase-content-farms/#comments</comments>
		<pubDate>Mon, 31 Jan 2011 21:52:15 +0000</pubDate>
		<dc:creator>Michael VanDeMar</dc:creator>
				<category><![CDATA[blogthropology]]></category>
		<category><![CDATA[Cuttisms]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[marketing]]></category>
		<category><![CDATA[search engines]]></category>
		<category><![CDATA[SEO]]></category>
		<category><![CDATA[spin]]></category>

		<guid isPermaLink="false">http://smackdown.blogsblogsblogs.com/?p=892</guid>
		<description><![CDATA[A little over a week ago, on the Friday before last, Matt Cutts, the head of Google&#8217;s Web Spam Team, wrote a post on the Official Google Blog titled &#8220;Google search and search engine spam&#8221;. This post, and the upcoming changes it discussed, were most likely in response to a growing trend of dissatisfaction with [...]]]></description>
			<content:encoded><![CDATA[<p>A little over a week ago, on the Friday before last, Matt Cutts, the head of Google&#8217;s Web Spam Team, wrote a post on the Official Google Blog titled <a href="http://googleblog.blogspot.com/2011/01/google-search-and-search-engine-spam.html" target="_blank">&#8220;Google search and search engine spam&#8221;</a>. This post, and the upcoming changes it discussed, were most likely in response to a <a href="http://www.codinghorror.com/blog/2011/01/trouble-in-the-house-of-google.html" title="Trouble In the House of Google" target="_blank">growing trend of dissatisfaction with Google&#8217;s results</a> that has been cropping up around the blogosphere. In the post Matt talks about how Google feels that things are in fact not as bad as people are saying, and that &#8220;Google&#8217;s search quality is better than it has ever been in terms of relevance, freshness and comprehensiveness.&#8221; He does say that recently, due to an increase in both &#8220;size and freshness&#8221;, of course some spam did get indexed, and also states that as the old, tired, run of the mill spam has decreased in Google&#8217;s index, Google will now be shifting its focus to content that just sucks:</p>
<blockquote><p>As &#8220;pure webspam&#8221; has decreased over time, attention has shifted instead to &#8220;content farms,&#8221; which are sites with shallow or low-quality content. <em>- Matt Cutts</em></p></blockquote>
<p>Whoa. This, especially coming from Matt Cutts, is huge. For those who don&#8217;t know, <a href="http://en.wikipedia.org/wiki/Content_farm" target="_blank">&#8220;content farms&#8221;</a> are <span id="more-892"></span>organizations that generate websites composed of large amounts of low cost &#8220;fluff&#8221; or filler content, with little to no regard to quality. The content is generated not based on having information and the desire to share it, but rather in response to queries that might get typed into a search engine, and are built for search spiders rather than human consumption. They include companies like <a href="http://www.seobook.com/demand-medias-ehow-com-using-interesting-expired-domain-redirect-seo-strategy" target="_blank">Demand Media</a>, <a href="http://smackdown.blogsblogsblogs.com/2010/03/08/mahalo-com-meet-the-new-spam-worse-than-the-old-spam/" target="_blank">Mahalo</a>, and Associated Content.</p>
<p>Historically speaking, Matt has pretty much refused to come right out and say that these content farms were indeed spam, despite the fact that they <a href="http://www.google.com/support/webmasters/bin/answer.py?answer=66355" target="_blank">clearly violated Google&#8217;s quality guidelines</a>:</p>
<blockquote><p>Doorway pages are typically large sets of poor-quality pages where each page is optimized for a specific keyword or phrase&#8230; Google&#8217;s aim is to give our users the most valuable and relevant search results. Therefore, we frown on practices that are designed to manipulate search engines and deceive users by directing them to sites other than the ones they selected, and that provide content solely for the benefit of search engines. Google may take action on doorway sites and other sites making use of these deceptive practice, including removing these sites from the Google index. <em>- Google Webmaster Tools Help</em></p></blockquote>
<p>Regardless of the very clear wording of their policies, Google has to date not banned any of these content farms for their violations. In fact, quite the opposite &#8211; Matt has in the past even defended these sites, and in Mahalo&#8217;s case at least given warnings to them which he then allowed them to ignore. He alluded to the fact that one of the algorithm updates from last year, Mayday, was supposed to help filter out &#8220;really kind of lower quality&#8221; sites, and many people thought he must be talking about content farms back then, but alas that <a href="http://smackdown.blogsblogsblogs.com/2010/06/11/was-the-google-mayday-update-a-complete-failure-then/" target="_blank">turned out to be a bust</a>. So when he comes right out and says, hey, you&#8217;ve waited long enough, now we&#8217;re going to target content farms for reals, y&#8217;all, then yeah, that&#8217;s a Pretty Big Deal.</p>
<p>Now, Richard Rosenblatt, the CEO of Demand Media, may be in denial about his company being a content farm, but that definition has existed for quite some time, and regardless of what you call it, low quality content built specifically for search engines is in violation of Google&#8217;s guidelines. However, he still persists in his belief that as long as you can get some people to call it something else, his <a href="http://mediamemo.allthingsd.com/20110127/demand-media-says-its-getting-along-just-fine-with-google-thank-you-very-much/" target="_blank">&#8220;partnership with Google&#8221;</a> will keep them protected regardless of what happens:</p>
<blockquote><p>This is why our partnership with Google makes sense. 1) We help them fill the gaps in their index, where they don’t have quality content. 2) We’re the largest supplier of all video to YouTube, over two billion views and 3) we’re a large AdSense partner. So our relationship is synergistic, and it’s a great partnership. And it’s a partnership that we’re excited to continue to expand. <em>- Richard Rosenblatt, attempting to give Google&#8217;s PR team a heart attack</em></p></blockquote>
<p>I am guessing that Mr. Rosenblatt missed the section in Matt&#8217;s post where he very specifically discussed the fact that <em>no</em> special partnerships would protect the content mills from these changes:</p>
<blockquote><p>One misconception that we’ve seen in the last few weeks is the idea that Google doesn’t take as strong action on spammy content in our index if those sites are serving Google ads. To be crystal clear:</p>
<ul>
<li>Google absolutely takes action on sites that violate our quality guidelines regardless of whether they have ads powered by Google;</li>
<li>Displaying Google ads does not help a site’s rankings in Google; and</li>
<li>Buying Google ads does not increase a site’s rankings in Google’s search results.</li>
</ul>
<p><em> &#8211; Matt Cutts, being crystal clear</em></p></blockquote>
<p>Then Friday rolls around, and <a href="http://www.mattcutts.com/blog/algorithm-change-launched/" target="_blank">Matt announces that these changes already happened earlier in the week</a>. If you didn&#8217;t notice any changes, then that&#8217;s probably because, according to Matt, less than half of a percent of queries would show any perceptible ranking differences. If you didn&#8217;t notice any changes in queries involving content farms, well&#8230; as near as I can tell that is because there weren&#8217;t any. In fact, in his announcement post Matt doesn&#8217;t even use the phrase &#8220;content farms&#8221; at all, and instead only discusses that the net effect of these changes is that in cases where content was scraped, searchers are more likely to see the original content first. He then thanks <a href="http://www.codinghorror.com/blog/" target="_blank">Jeff Atwood</a> (one of the ones who wrote a story discussing Google&#8217;s decline in quality that had a large audience) and <a href="http://stackoverflow.com/" target="_blank">Stack Overflow&#8217;s team</a> (a site that Jeff co-founded) for their feedback. A few people asked about the omission in the comments, but as of yet anyway Matt has not replied to any of them.</p>
<p>As to the results themselves, for the most part I am seeing what I was seeing before, so that &#8220;less than half of a percent&#8221; doesn&#8217;t surprise me. If you search for [<a href="http://www.google.com/search?q=mcdonalds+coupons" target="_blank">mcdonalds coupons</a>] the #1 site is still a Mahalo page that doesn&#8217;t actually have any coupons on it, and very little original content. If you search for [<a href="http://www.google.com/search?q=mcdonalds+free+salad+coupons" target="_blank">mcdonalds free salad coupons</a>] you get a different Mahalo page that does actually have a picture of a coupon on it (good only in Canada, and expired in July 2010, however), and if you search for [<a href="http://www.google.com/search?q=mcdonalds+happy+meal+coupons" target="_blank">mcdonalds happy meal coupons</a>] the second listing is a Mahalo page, again with no coupons on it. These pages are filled with riveting dialog, such as the section labeled &#8220;McDonalds Happy Meal Coupons Coupon Policies,&#8221; which states:</p>
<blockquote><p>The policies for McDonalds Happy Meal coupons may have certain restrictions and these might include not being able to combine discounts or limiting the period of use. Make sure you read and understand the instructions listed on the coupon carefully to ensure that you know when the coupon will become valid and when it will expire as well as what special restrictions apply. Also included in this information will be which product or products the coupon can be used to purchase. Insuring that you understand the coupon policy can help you to avoid any mistakes during the checkout process. <em>- Content Mahalo actually paid for</em></p></blockquote>
<p>Seriously?</p>
<p>It&#8217;s not just Mahalo, of course&#8230; type in [<a href="http://www.google.com/search?q=how+to+reset+your+blackberry" target="_blank">how to reset your blackberry</a>] and you will find ranking just fine a page from eHow that is nothing more than the phrase <a href="http://www.ehow.com/how_4776425_reset-blackberry-removing-battery.html" target="_blank" rel="nofollow">&#8220;hit alt+right-shift+delete&#8221;</a> wrapped in light, fluffy filler content. I also still see queries where the duplicate content outranks the original, such as the copy of a Wikipedia page that ranks #1 for [<a href="http://www.google.com/search?q=elvett+semic" target="_blank">elvett semic</a>]. The changes, whatever they were, truly are barely (if at all) perceptible. The change was so small that one of Matt&#8217;s readers asked, &#8220;I&#8217;m wondering why announce it if you&#8217;ve gotten the feedback and the algorithm update would presumably be of such little consequence that no one would likely notice or comment on it unless you told everyone.&#8221; Indeed, why make such a big deal out of something when almost no one can tell the difference?</p>
<p>To answer that you need to take a look at exactly what it was that did change. When I search in Google now for questions that were asked on Stack Overflow, at least for the queries I checked, I now see SO ranking instead of sites that scrape their content. This is of course how it should be, and it was the main concern that the people from that community were <s>bitching</s> giving feedback about to Matt. Stack Overflow is, as I mentioned, the site that was co-founded by Jeff Atwood, who is the author of the much quoted post that generated quite a bit of buzz about Google&#8217;s decline in quality. Many of the frequenters of Stack Overflow are also regulars on <a href="http://news.ycombinator.com/" target="_blank">Hacker News</a>, which (not so) coincidentally is where Matt decided to hold a good portion of the discussion about these changes, both before and after they were implemented. While the HN and SO communities in and of themselves might be tiny compared to the web as a whole, the fact is that their voices do carry within the online community. Start buzz there about Google showing quite a bit of improvement and it has a very good chance of spreading, even if the data set demonstrating that is overall quite small. Add to that the fact that Richard Rosenblatt, CEO of Demand Media, <em>knows</em> that the changes aren&#8217;t targeted at his company (and when asked if Google had discussed the changes with him, replies &#8220;I can’t comment on that.&#8221;), and then toss in Jason Calacanis&#8217;s ingratiating comments on Matt&#8217;s blog post about the changes going live:</p>
<blockquote><p>It was clear that Mahalo was getting grouped into the &#8220;content farm&#8221; space&#8230; <em>- Jason Calacanis</em></p></blockquote>
<p>No kidding? Really? Past tense there, eh Jason?</p>
<p>So Matt loosely ties the concepts of &#8220;content farms&#8221; and &#8220;scrapers&#8221; together in a blog post on the official Google Blog, and claims that they are taking action against them. He then announces a change that appears to only affect scraper sites, and furthermore only those scraping a specific dissatisfied community, publicly thanks that community for their help, and then doesn&#8217;t mention the phrase &#8220;content farm&#8221; again. Even though the changes were practically non-existent, there is a good chance that the overall impression from those who don&#8217;t look too closely is that action was indeed taken, and that if what were <em>formerly</em> referred to as content farms are still ranking well, then obviously they must be there for a reason.</p>
<p>From a strategic standpoint it&#8217;s actually rather clever. If I were Google and I needed to conceal special relationships I had with companies (especially if I was thinking that the FTC might want to get involved in my business) then I too would probably try very hard to sway the public opinion about the labels attached to the sites those companies owned, and shift the focus to something I could fix without caring about the damage, and then crowd source a tech community to help spread the impression that things were better. Most people probably won&#8217;t even pay enough attention to notice.</p>
<p>&nbsp;</p>
<p>&nbsp;</p>
<p><a href="http://www.seobook.com/images/content-farms.gif" target="_blank"><img src="/images/not-content-farms.gif" onmouseup="hl2l(event);" alt="Were not content farms! No! Moo!" border="0" width="500px"></a><br />
(<em>click to view original</em>)</p>
]]></content:encoded>
			<wfw:commentRss>http://smackdown.blogsblogsblogs.com/2011/01/31/how-matt-cutts-leveraged-the-stack-overflow-and-hacker-news-communities-in-redefining-the-phrase-content-farms/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Matt Cutts Criticizes Deceptive Ads, Doesn&#8217;t Realize Google Is The One Serving Them</title>
		<link>http://smackdown.blogsblogsblogs.com/2011/01/30/matt-cutts-criticizes-deceptive-ads-doesnt-realize-google-is-the-one-serving-them/</link>
		<comments>http://smackdown.blogsblogsblogs.com/2011/01/30/matt-cutts-criticizes-deceptive-ads-doesnt-realize-google-is-the-one-serving-them/#comments</comments>
		<pubDate>Sun, 30 Jan 2011 21:00:56 +0000</pubDate>
		<dc:creator>Michael VanDeMar</dc:creator>
				<category><![CDATA[blogthropology]]></category>
		<category><![CDATA[Cuttisms]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[lackofmeds]]></category>
		<category><![CDATA[marketing]]></category>
		<category><![CDATA[scams]]></category>
		<category><![CDATA[search engines]]></category>

		<guid isPermaLink="false">http://smackdown.blogsblogsblogs.com/?p=894</guid>
		<description><![CDATA[Yesterday over on Daggle.com Danny Sullivan published a post titled, Of Misleading Acai Berry Ads &#038; Fake Editorial Sites. In the article Danny discusses a rising trend of deceptive marketing practices involving fake news sites, the way they rip people off with products they are selling, and the fact that authority sites such as the [...]]]></description>
			<content:encoded><![CDATA[<p>Yesterday over on <a href="http://daggle.com" target="_blank">Daggle.com</a> Danny Sullivan published a post titled, <a href="http://daggle.com/misleading-acai-berry-ads-fake-editorial-sites-2435" target="_blank">Of Misleading Acai Berry Ads &#038; Fake Editorial Sites</a>. In the article Danny discusses a rising trend of deceptive marketing practices involving fake news sites, the way they rip people off with products they are selling, and the fact that authority sites such as the <a href="http://www.latimes.com/" target="_blank">LA Times</a> are the ones carrying these ads, lending them some credibility in the public eye. Danny states in the post that the ads showing are being served by Zedo, and that he wishes the ad network would raise its standards and not allow such blatantly misleading advertising:</p>
<blockquote><p>Personally, I’d like to see Zedo up its standards for the type of ads it will accept. This type of junk shouldn’t be allowed. <em>- Danny Sullivan</em></p></blockquote>
<p>He&#8217;s right, too, the ad networks <em>should</em> be policing this type of deception, by all means. Matt Cutts, Google&#8217;s head of the web spam team, agrees. He <a href="http://twitter.com/mattcutts/status/31751730140024832">tweeted about the story</a>, and also<span id="more-894"></span> commented with his take on the matter on the post itself:</p>
<blockquote><p>    My favorite part of the disclaimer for those type of sites is &#8220;This website, and any page on the website, is based loosely off a true story, but has been modified in multiple ways including, but not limited to: the story, the photos, and the comments.&#8221;</p>
<p>    Oh, so I can trust the website except for the story, photos, and comments? In other words, the entire website?</p>
<p>    And if you read the disclaimer carefully, most of these sites promise a &#8220;free trial&#8221; with $1.95 in shipping, but actually set your card up with a recurring subscription. The &#8220;one weird old tip&#8221; ad that I clicked from the L.A. Times mentioned this in the fine print: &#8220;If you do not cancel within seven (7) days of the date that you enroll in the Program, we will charge the same card you provided at enrollment the non-refundable one-year membership fee of $149.95&#8221;. Then they also start charging you $12.95 a month. Grr. <em> &#8211; Matt Cutts, on deceptive &#8220;flat belly&#8221; ads</em></p></blockquote>
<p>Grr, indeed. </p>
<p>Danny also mentions in his post that &#8220;The ad, unlike Google&#8217;s ads, doesn’t report what ad network is delivering them,&#8221; which if they did would be a form of disclosure. And Danny is right&#8230; except for one thing. Danny derived the fact that the ad was being served by Zedo by examining the URL. However, if you view the source on the LA Times article and go to the spot on the page where the ad is showing, you don&#8217;t see the Zedo ad network code. The ad itself is being generated by Javascript that is being pulled from yet another ad network:</p>
<p>&nbsp;</p>
<p><img src="/images/latimes-source-doubleclick.png" onmouseup="hl2l(event);" alt="Doubleclick is the real culprit" border="0"></p>
<p>&nbsp;</p>
<p>The actual ad network that the LA Times has a relationship with, and the one responsible for which ads show on their site, is Doubleclick. And who owns Doubleclick, you might ask? As most of you probably already know, <a href="http://www.google.com/doubleclick/" target="_blank">Google does</a>, since they <a href="http://www.nytimes.com/2007/04/14/technology/14DoubleClick.html" target="_blank">bought them back in 2007 for $3.1 billion</a>. So obviously not all of the ads Google delivers disclose what network they are from.</p>
<p>It gets better. AdSense, Google&#8217;s flagship advertising network, serves what are known as &#8220;contextual ads&#8221;, where in theory the ad targeting is based on the context of the page contents where the ad blocks are placed. Danny uses AdSense on his site, with one of the blocks being at the very top of the page. Due to the various feeds in the sidebar, the content of the article, and the title, &#8220;Acai Berry&#8221; is mentioned 8 times on that same page. Therefore it is only natural, of course, that this is what we see when we look at the ads being served on the top:</p>
<p>&nbsp;</p>
<p><a href="/images/fake-news-ads-daggle2.png" target="_blank"><img src="/images/fake-news-ads-daggle2-sm.png" onmouseup="hl2l(event);" alt="The worlds most resilient bittorrent site." border="0"></a><br />
(<em>click to enlarge</em>)</p>
<p>&nbsp;</p>
<p>Now, can you guess where that ad leads? That&#8217;s right:</p>
<p>&nbsp;</p>
<p><a href="/images/fake-news-site2.png" target="_blank"><img src="/images/fake-news-site2-sm.png" onmouseup="hl2l(event);" alt="The worlds most resilient bittorrent site." border="0"></a><br />
(<em>click to enlarge</em>)</p>
<p>&nbsp;</p>
<p>It&#8217;s a fake news site identical to the one Danny is discussing, with the same text, layout, and even images embedded in the &#8220;story&#8221;, with the only variation being that the one Danny landed on is &#8220;News 7&#8221;, and this one is &#8220;News 8&#8221;. </p>
<p>What makes this story particularly interesting is that recently Matt Cutts <a href="http://searchengineland.com/mr-cutts-goes-to-washington-61234" target="_blank">visited Washington D.C., lobbying the FTC</a> about Google&#8217;s integrity, trying to convince them that Google doesn&#8217;t require government oversight and can be trusted to police itself. Google also happens to be in a unique position to help clean up these kinds of abuses. Not only could they pull these ads from their own vast array of properties, and require their third party partners to do the same, but they could also warn publishers who use networks that continue to promote scams that their sites&#8217; rankings could suffer, in the same way that they have punished websites in the past for what they said was deceptive marketing, in the form of <a href="http://www.mattcutts.com/blog/hidden-links/" target="_blank">undisclosed paid links</a>. Instead, they themselves appear to be participating in the problem, not the solution.</p>
<p>So, Matt, are you willing to back up your testimony to the FTC about Google&#8217;s integrity, and lobby within your own company to help eradicate deceptive marketing from the web? Do you feel that websites that allow deceptive advertising to be shown on their sites should have their trust revoked? </p>
]]></content:encoded>
			<wfw:commentRss>http://smackdown.blogsblogsblogs.com/2011/01/30/matt-cutts-criticizes-deceptive-ads-doesnt-realize-google-is-the-one-serving-them/feed/</wfw:commentRss>
		<slash:comments>21</slash:comments>
		</item>
		<item>
		<title>Breaking News: Google Borks the Earth</title>
		<link>http://smackdown.blogsblogsblogs.com/2010/08/23/breaking-news-google-borks-the-earth/</link>
		<comments>http://smackdown.blogsblogsblogs.com/2010/08/23/breaking-news-google-borks-the-earth/#comments</comments>
		<pubDate>Mon, 23 Aug 2010 21:22:53 +0000</pubDate>
		<dc:creator>Michael VanDeMar</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[lackofmeds]]></category>
		<category><![CDATA[nerdiness]]></category>
		<category><![CDATA[search engines]]></category>

		<guid isPermaLink="false">http://smackdown.blogsblogsblogs.com/?p=820</guid>
		<description><![CDATA[Want to explore the entire planet from your computer? Normally all anyone wanting to do so would have to do would be to trot on over to Google Earth, download and install their application, and off globe trotting they could go. Today, unfortunately, those who do not already have the program installed are apparently out [...]]]></description>
			<content:encoded><![CDATA[<p>Want to explore the entire planet from your computer? Normally all anyone wanting to do so would have to do would be to trot on over to <a href="" target="_blank">Google Earth</a>, download and install their application, and off globe trotting they could go. Today, unfortunately, those who do not already have the program installed are apparently out of luck. It looks like today one of the brighter Google engineers working for one of the world&#8217;s leading tech companies has somehow broken not just one of the download links for the application, but all of them. <span id="more-820"></span></p>
<p>The first place many people would find one of the download links is right in the Google serps, once under the Google Earth sitelinks and once as its own listing:</p>
<p>&nbsp;</p>
<p><img src="/images/google-earth-download-serps.png" onmouseup="hl2l(event);" alt="Google Earth in the serps"></p>
<p>&nbsp;</p>
<p>That particular download link, <a href="http://earth.google.com/download-earth.html" target="_blank">earth.google.com/download-earth.html</a>, is being redirected to what I am guessing is an agreement page, <a href="http://www.google.com/earth/download/ge/agree.html" target="_blank">http://www.google.com/earth/download/ge/agree.html</a>. This, however, returns a 404:</p>
<p>&nbsp;</p>
<p><img src="/images/google-earth-404.png" onmouseup="hl2l(event);" alt="Agreement page not found"></p>
<p>&nbsp;</p>
<p>The second place people could normally download Google Earth from would be the Google Earth homepage, which was previously located at <a href="http://earth.google.com" target="_blank">earth.google.com</a>, but is now being redirected to <a href="http://www.google.com/earth/index.html" target="_blank">http://www.google.com/earth/index.html</a>. There you can find two links, one in the left navigation and one as a large blue button with the text &#8220;Download Google Earth 5&#8221;:</p>
<p>&nbsp;</p>
<p><img src="/images/downloadbutton.png" onmouseup="hl2l(event);" alt="Big Blue Button"></p>
<p>&nbsp;</p>
<p>As inviting as that button is, however, it is simply teasing you. Both the link and the button trigger a JavaScript function named earth.downloadEarth(). Normally downloading the entire planet would be a huge power trip&#8230; today, however, all you get from clicking the button is &#8220;Server not found&#8221;:</p>
<p>&nbsp;</p>
<p><img src="/images/problemloading.png" onmouseup="hl2l(event);" alt="Whole server not found"></p>
<p>&nbsp;</p>
<p>It looks like the reason this one isn&#8217;t working is that someone got sloppy when changing the links from earth.google.com to www.google.com, and simply combined the two into <a href="http://earth.googlewww.google.com/intl/en/download-earth.html" target="_blank">http://earth.googlewww.google.com/intl/en/download-earth.html</a>, although that particular page doesn&#8217;t exist on either domain, so obviously they messed up more than once. What is even odder is that the Google Earth packages are <em>also</em> missing from the Ubuntu download repositories:</p>
<p>&nbsp;</p>
<p><img src="/images/google-earth-linuxpkgmanager.png" onmouseup="hl2l(event);" alt="Google Earth gone from Ubuntu too?"></p>
<p>&nbsp;</p>
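<p>The mangled hostname above (earth.googlewww.google.com) is the kind of thing a very small sanity check can catch before a link goes live. Here is a minimal sketch in Python; the list of expected hosts and the function name are my own assumptions for illustration, and I have no idea what Google&#8217;s actual tooling looks like:</p>

```python
from urllib.parse import urlsplit

# Hosts we expect Google Earth links to live on (an assumption for this sketch).
EXPECTED_HOSTS = {"earth.google.com", "www.google.com"}


def unexpected_hosts(urls, allowed=EXPECTED_HOSTS):
    """Return the URLs whose hostname is not in the allowed set."""
    bad = []
    for url in urls:
        # hostname is None when the URL has no scheme, e.g. "example.com/page"
        host = urlsplit(url).hostname
        if host not in allowed:
            bad.append(url)
    return bad


links = [
    "http://www.google.com/earth/index.html",
    "http://earth.googlewww.google.com/intl/en/download-earth.html",
]
print(unexpected_hosts(links))
```

<p>Only the second link gets flagged, since its hostname is neither of the two expected ones. A link with no scheme at all would be flagged too, because it has no hostname to compare.</p>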
<p>To have Google Earth not be installable from anywhere seems almost as if there is something deliberate going on. Is Google going to phase out one of its cooler applications? Or is something new coming down the pipes from them that will replace it? Only time will tell.</p>
<p><em>Thanks to <a href="http://twitter.com/DonnaFontenot" target="_blank">Donna Fontenot</a> for discovering this today!</em></p>
]]></content:encoded>
			<wfw:commentRss>http://smackdown.blogsblogsblogs.com/2010/08/23/breaking-news-google-borks-the-earth/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>zOMG! Jason Calacanis Lied Again?? Shocker!</title>
		<link>http://smackdown.blogsblogsblogs.com/2010/06/21/zomg-jason-calacanis-lied-again-shocker/</link>
		<comments>http://smackdown.blogsblogsblogs.com/2010/06/21/zomg-jason-calacanis-lied-again-shocker/#comments</comments>
		<pubDate>Mon, 21 Jun 2010 12:45:58 +0000</pubDate>
		<dc:creator>Michael VanDeMar</dc:creator>
				<category><![CDATA[blogthropology]]></category>
		<category><![CDATA[Cuttisms]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[lackofmeds]]></category>
		<category><![CDATA[scams]]></category>
		<category><![CDATA[search engines]]></category>
		<category><![CDATA[SEO]]></category>

		<guid isPermaLink="false">http://smackdown.blogsblogsblogs.com/?p=799</guid>
		<description><![CDATA[Last Thursday, in response to Matt Cutts stating that he needed more than &#8220;arbitrary inurl searches&#8221; to sway him (which was in turn in response to a Hacker News submission about Mahalo and the plethora of keyword rich domains they were apparently building out) I wrote a post explaining in some detail how the latest [...]]]></description>
			<content:encoded><![CDATA[<p>Last Thursday, in response to Matt Cutts stating that he needed more than &#8220;arbitrary inurl searches&#8221; to sway him (which was in turn in response to a <a href="http://news.ycombinator.com/item?id=1433676" target="_blank">Hacker News submission</a> about Mahalo and the plethora of keyword rich domains they were apparently building out) I wrote a post explaining in some detail how the latest <a href="http://smackdown.blogsblogsblogs.com/2010/06/17/need-help-understanding-the-latest-mahalo-spam/" target="_blank">Mahalo spam is in fact spam</a>. I demonstrated in the post how Jason had developed a linkfarm which was being used as a link source back to Mahalo.com. It wasn&#8217;t just that the individual sites were all linking back to the mother site, which would in fact be normal, but also that the pages were linking back to specific pages within the main site, pages that in many cases had few, if any, links going to them aside from the ones from this linkfarm.</p>
<p>Each time it happens Matt&#8217;s defense of Mahalo spamming Google just gets more perplexing. In this latest round he started by saying that his job was not to have knee jerk reactions, as if Mahalo hadn&#8217;t already established a <a href="http://www.seobook.com/official-mahalo-com-spam-according-googles-internal-spam-documents" target="_blank">pattern of spamming</a> over a long period of time, and as if Matt hadn&#8217;t already had a talk with Jason and told him that if he didn&#8217;t raise the bar with his site, <a href="http://outspokenmedia.com/internet-marketing-conferences/ask-the-search-engines/" target="_blank">Google would take action</a> on Mahalo. From there it got even weirder &#8211; Matt looked at the linkfarm and basically told me that a) he didn&#8217;t care as long as it wasn&#8217;t passing link juice, and b) he&#8217;s the only one who could tell if that was the case.</p>
<p>I could have sworn the rule was that if you were caught <em>trying</em> to spam you were penalized, and you couldn&#8217;t get the penalty removed unless you <em>promised not to do it again</em>. Now, where did I get such a crazy and wild idea? Oh yeah, I remember now&#8230; <span id="more-799"></span><em><a href="http://www.mattcutts.com/blog/reinclusion-request-howto/" target="_blank">it was from Matt Cutts</a></em>:</p>
<blockquote><p>Now we come to the heart of things: what goes into a reinclusion request. Fundamentally, Google wants to know two things: 1) that any spam on the site is gone or fixed, and 2) that it’s not going to happen again. &#8211; <em>Matt Cutts on the bare essentials of a reconsideration request</em></p></blockquote>
<p>The reasons Matt gives out for defending Mahalo seem to be getting more and more creative (even if not more believable). Jason&#8217;s, on the other hand, are the same old song and dance he has been spouting since I first <a href="http://smackdown.blogsblogsblogs.com/2010/02/22/apparently-jason-calacanis-knows-hes-spamming-he-just-thinks-its-no-big-deal/" target="_blank">called him on his bs</a> and demonstrated that the <a href="http://smackdown.blogsblogsblogs.com/2010/03/10/dear-jason-calacanis-this-isnt-an-absurd-microscope/" target="_blank">vast majority of his site</a> was nothing more than empty, auto-generated pages. On Thursday&#8217;s post, before he started to lose it with his <a href="http://smackdown.blogsblogsblogs.com/2010/06/17/need-help-understanding-the-latest-mahalo-spam/comment-page-1/#comment-50660" target="_blank">&#8220;fuck you losers, I&#8217;m rich&#8221;</a> tirade, Jason made this statement:</p>
<blockquote><p>We have humans write pages of at least 300 words. We don’t index 99.99% of pages with &lt; 300 (it would have to be something unique), and we police the system to get short pages up to 300 words within 30 days. - <em>Jason Calacanis, 4 days ago</em></p></blockquote>
<p>Orly? Let&#8217;s take a look at those claims, shall we?</p>
<p>The Mahalo coupon pages are about the crappiest pages I have found on the site. When I was doing my initial investigation I stumbled across quite a few of them. My guess is that [{brand} coupon] generates AdSense blocks with a decent eCPM since they are, after all, &#8220;targeted&#8221; pages. None of the Mahalo &#8220;coupon&#8221; pages actually have any coupons, which of course means that the end user is much more likely to click on one of the ads when they land there, and more required clicks means a poorer user experience. What content these pages do have is fluff text that gives ample opportunity for Mahalo to link back to itself, and it has spammy signals that are easy to spot, like near-identical versions of the same topic page, usually one page for &#8220;coupons&#8221; and another for &#8220;printable coupons&#8221; (and no, there is nothing to print out on those pages either). Therefore I picked those as where I would look first to point out, yet again, how Jason was simply pulling these claims out of his ass with no supporting truths behind them.</p>
<p>Digging back into my old data, from March 13th, I was able to determine that from the day the site started adding content up until that point in time Mahalo had amassed 2,655 coupon-based pages. When I re-scanned and looked this time I found that there were now 16,601 of these pages. That is a huge increase for 3 months, and a ton of content to create uniquely, even if you ditch quality altogether. Mahalo currently only has a grand total of 90,494 actual pages on that side of things, so that means 18% of the site is made up of &#8220;coupon&#8221; pages &#8211; and by that I mean coupon pages that don&#8217;t actually <em>have</em> any coupons on them.</p>
<p>What&#8217;s more, it actually looks like there is a chance that 9,932 of those pages were added last week, over a <em>3 day period</em>. How the hell do you get writers to create 9,932 pages of even crappy content, all about <em>coupons</em>, in only 3 days?</p>
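<p>To put that in perspective, here is a quick back-of-the-envelope sketch. The 9,932 pages, the 3 days, and the 300 word minimum all come from the numbers above; the 3,000 words per writer per day is purely my own assumption, picked to be generous:</p>

```python
import math

NEW_PAGES = 9_932          # coupon pages that appear to have been added
DAYS = 3                   # over this many days
CLAIMED_MIN_WORDS = 300    # the claimed minimum word count per page

pages_per_day = NEW_PAGES / DAYS
words_needed = NEW_PAGES * CLAIMED_MIN_WORDS


def writers_needed(words_per_writer_per_day):
    """Writers required to produce words_needed within DAYS (rate is assumed)."""
    return math.ceil(words_needed / (DAYS * words_per_writer_per_day))


print(f"{pages_per_day:.0f} pages per day")
print(f"{words_needed:,} words in total")
print(f"{writers_needed(3_000)} writers at an assumed 3,000 words each per day")
```

<p>Even with that generous assumption, it would take a few hundred people writing about nothing but coupons for three days straight.</p>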
<p>As I started looking into it I suddenly understood&#8230; they didn&#8217;t just ditch the quality to create those pages, they went ahead and ditched the <em>content</em>, yet again. I checked over 30 pages, and time after time what I found were auto-generated pages that were nothing but ads, affiliate links, and scraper feeds.</p>
<p><a href="http://www.mahalo.com/1800pools-coupons" target="_blank" rel="nofollow">http://www.mahalo.com/1800pools-coupons</a>:</p>
<p>&nbsp;</p>
<p><a href="/images/mahalo-1800Pools-coupons.png" target="_blank"><img src="/images/mahalo-1800Pools-coupons-sm.png" onmouseup="hl2l(event);" alt="Mahalo 1800pools (non)coupons" border="0"></a><br />
(<em>click to view full page screenshot</em>)</p>
<p>&nbsp;</p>
<p><a href="http://www.mahalo.com/tigerdirect-coupons" target="_blank" rel="nofollow">http://www.mahalo.com/tigerdirect-coupons</a>:</p>
<p>&nbsp;</p>
<p><a href="/images/mahalo-tigerdirect-coupons.png" target="_blank"><img src="/images/mahalo-tigerdirect-coupons-sm.png" onmouseup="hl2l(event);" alt="Mahalo TigerDirect (non)coupons" border="0"></a><br />
(<em>click to view full page screenshot</em>)</p>
<p>&nbsp;</p>
<p><a href="http://www.mahalo.com/topnotchcare-com-coupons" target="_blank" rel="nofollow">http://www.mahalo.com/topnotchcare-com-coupons</a>:</p>
<p>&nbsp;</p>
<p><a href="/images/mahalo-topnotchcare.com-coupons.png" target="_blank"><img src="/images/mahalo-topnotchcare.com-coupons-sm.png" onmouseup="hl2l(event);" alt="Mahalo TopNotchCare (non)coupons" border="0"></a><br />
(<em>click to view full page screenshot</em>)</p>
<p>&nbsp;</p>
<p>Most of the pages I checked had the affiliate links provided by Savings.com, and most linked to the same two questions pages: one discussing the Outback coupons page, and one discussing &#8220;grocery coupons&#8221;&#8230; and in every case neither question had anything to do with what the actual &#8220;coupon&#8221; page was supposedly about:</p>
<p>&nbsp;</p>
<p><img src="/images/mahalo-topnotchcare.com-coupons-qna-sm.png" onmouseup="hl2l(event);" alt="Mahalo TopNotchCare coupons questions?" border="0"></p>
<p>&nbsp;</p>
<p><img src="/images/mahalo-1800Pools-coupons-qna-sm.png" onmouseup="hl2l(event);" alt="Mahalo 1800pools coupons questions?" border="0"></p>
<p>&nbsp;</p>
<p><img src="/images/mahalo-tigerdirect-coupons-qna-sm.png" onmouseup="hl2l(event);" alt="Mahalo TigerDirect coupons questions?" border="0"></p>
<p>&nbsp;</p>
<p>For the pages that did not have Savings.com affiliate feeds on them it was because they were using as keywords the names of sites that wouldn&#8217;t actually be Savings.com publishers, like <a href="http://www.gbb.org/" target="_blank">GBB.org</a> and <a href="http://rlsforum.net/" target="_blank" rel="nofollow">RLS Forum</a>. It looks like Jason somehow got his hands on a list of sites that for some reason or another looked like they <em>might</em> have offered some sort of coupon. These were then dumped into the database in the form of pages, and were then checked to see if they matched up with the Savings.com feed. If they did, great; if not, that&#8217;s ok too, they still had AdSense on them &#8211; despite the fact that putting AdSense on pages without actual content is a <a href="" target="_blank">direct violation of Google AdSense policies</a>:</p>
<p>&nbsp;</p>
<p><img src="/images/adsense-policies.png" onmouseup="hl2l(event);" alt="Mahalo violates AdSense policies" border="0"></p>
<p>&nbsp;</p>
<p>That&#8217;s ok though, I am sure Jason doesn&#8217;t care that he is risking the bulk of the site&#8217;s revenue stream by violating the terms of the program, since it looks like the AdSense team is giving him just as much of a pass as the spam team is.</p>
<p>In addition to the pages simply being devoid of content, Jason also uses the tactic of creating near-duplicate versions of some of these pages in order to get the most out of the long-tail phrase variations:</p>
<p><a href="http://www.mahalo.com/1and1-coupons" target="_blank" rel="nofollow">http://www.mahalo.com/1and1-coupons</a><br />
<a href="http://www.mahalo.com/1and1-internet-coupons" target="_blank" rel="nofollow">http://www.mahalo.com/1and1-internet-coupons</a><br />
<a href="http://www.mahalo.com/1and1-web-hosting-coupons" target="_blank" rel="nofollow">http://www.mahalo.com/1and1-web-hosting-coupons</a><br />
<a href="http://www.mahalo.com/1and1affiliate-com-coupons" target="_blank" rel="nofollow">http://www.mahalo.com/1and1affiliate-com-coupons</a></p>
<p>Let&#8217;s look at Jason&#8217;s statements again&#8230;</p>
<p>&nbsp;</p>
<p><strong>We have humans write pages</strong></p>
<p>&nbsp;</p>
<p>Well, no. You have humans write <em>some</em> pages, but an assload are still auto-generated. In addition to the ones shown here, Google also says that you still have 13,200 pages that you <a href="http://smackdown.blogsblogsblogs.com/2010/03/11/jason-calacanis-backup-plan-for-replacing-content-steal-it-from-wikipedia/" target="_blank">scraped from Wikipedia</a> in their index:</p>
<p>&nbsp;</p>
<p><img src="/images/mahalo-13.2k-wikipedia-pages.png" onmouseup="hl2l(event);" alt="Mahalo Wikipedia scraped pages" border="0"></p>
<p>&nbsp;</p>
<p>Adding the above auto-generated pages in with the Wikipedia ones, that means that at this point an estimated 33% of the Mahalo content pages are scraped or auto-generated, <em>and that&#8217;s just the stuff that&#8217;s easy to find</em>. Yay footprints.</p>
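<p>For anyone who wants to check my math, here is a small sketch of where the 18% and 33% figures come from. It uses the page counts quoted in this post, and it assumes the coupon pages and the Wikipedia pages don&#8217;t overlap:</p>

```python
COUPON_PAGES = 16_601      # "coupon" pages found in the re-scan
WIKIPEDIA_PAGES = 13_200   # scraped Wikipedia pages reported in Google's index
TOTAL_PAGES = 90_494       # total content pages on that side of the site


def share(part, whole):
    """Return part/whole as a percentage."""
    return 100 * part / whole


coupon_share = share(COUPON_PAGES, TOTAL_PAGES)
# Assumes the two sets of pages do not overlap.
combined_share = share(COUPON_PAGES + WIKIPEDIA_PAGES, TOTAL_PAGES)

print(f"coupon pages: {coupon_share:.0f}%")
print(f"coupon + Wikipedia pages: {combined_share:.0f}%")
```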
<p>&nbsp;</p>
<p><strong>of at least 300 words</strong></p>
<p>&nbsp;</p>
<p>Again, no, even on the human generated pages that is not always true. Take a look, for instance, at the 1and1 page on Mahalo.com that all 4 of the above coupons reference:</p>
<p>&nbsp;</p>
<p><img src="/images/mahalo-1and1.png" onmouseup="hl2l(event);" alt="Mahalo 1and1 (very) short page" border="0"></p>
<p>&nbsp;</p>
<p>Including words of 3 letters or fewer, that page still only has 212 words of human generated content on it. I also pointed out last week that some of the Wikipedia scraped pages remained thin, such as the one on &#8220;The Alice B. Toklas Cookbook&#8221;, which has only 261 words on it.</p>
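<p>Counting words on a page is easy enough to automate. What follows is a rough sketch of one way to do it, not the exact method I used for the numbers above: it strips the tags, skips script and style blocks, and counts every run of letters, digits, or apostrophes as a word. The sample markup is made up for illustration:</p>

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> blocks."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def count_words(html):
    """Count words in the visible text, short words like 'a' and 'the' included."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return len(re.findall(r"[A-Za-z0-9']+", " ".join(parser.chunks)))


# Made-up sample markup, not the actual Mahalo page.
sample = "<p>1and1 is a <b>web hosting</b> company.</p><script>var x = 1;</script>"
print(count_words(sample))
```

<p>Note that this counts the whole document, so to measure only the human-written part you would feed it just that block of the page rather than the navigation, ads, and feeds around it.</p>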
<p>&nbsp;</p>
<p><strong>We don’t index 99.99% of pages with &lt; 300 [words]</strong></p>
<p>&nbsp;</p>
<p>Bullshit. Not one single one of the pages I examined had a &#8220;noindex&#8221; tag on it, or was blocked by robots.txt. In fact, just the opposite &#8211; every single one of them was pushed to Mahalo&#8217;s sitemap, to make it <em>easier</em> for Google to find (and index) them.</p>
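<p>Checking a page for a &#8220;noindex&#8221; tag doesn&#8217;t take much. Here is a minimal sketch of that kind of check, which looks for a robots (or googlebot) meta tag carrying a noindex directive. The class and function names are mine, and it deliberately ignores the X-Robots-Tag HTTP header, which would need a separate look at the response headers:</p>

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Set self.noindex if a robots/googlebot meta tag asks not to be indexed."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() not in ("robots", "googlebot"):
            return
        directives = [d.strip().lower() for d in attr.get("content", "").split(",")]
        # "none" is shorthand for "noindex, nofollow"
        if "noindex" in directives or "none" in directives:
            self.noindex = True


def has_noindex(html):
    finder = RobotsMetaFinder()
    finder.feed(html)
    finder.close()
    return finder.noindex


print(has_noindex('<head><meta name="robots" content="noindex, follow"></head>'))
print(has_noindex('<head><meta name="robots" content="index, follow"></head>'))
```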
<p>&nbsp;</p>
<p><strong>we police the system to get short pages up to 300 words within 30 days</strong></p>
<p>&nbsp;</p>
<p>Again, bullshit. The 1and1 page has been that way since at least March 11th:</p>
<p>&nbsp;</p>
<p><img src="/images/mahalo-1and1-lastmod.png" onmouseup="hl2l(event);" alt="Mahalo 1and1 page last modified March 11th" border="0"></p>
<p>&nbsp;</p>
<p>And the Alice B. Toklas one since March 12th:</p>
<p>&nbsp;</p>
<p><img src="/images/mahalo-alice-b-toklas-lastmod.png" onmouseup="hl2l(event);" alt="Mahalo Alice B. Toklas page last modified March 12th" border="0"></p>
<p>&nbsp;</p>
<p>So Jason, please, enough with the bs. Quit claiming stuff that simply is not true, especially when it&#8217;s <em>so</em> damn easy to disprove what you say. I still have no idea why it is that Matt Cutts is choosing to ignore your spam, but to the rest of us it&#8217;s as plain as day. And no, Jason&#8230; going in now and trying to clean it up in no way changes the fact that you spammed in the first place.</p>
]]></content:encoded>
			<wfw:commentRss>http://smackdown.blogsblogsblogs.com/2010/06/21/zomg-jason-calacanis-lied-again-shocker/feed/</wfw:commentRss>
		<slash:comments>22</slash:comments>
		</item>
		<item>
		<title>Was The Google Mayday Update A Complete Failure Then?</title>
		<link>http://smackdown.blogsblogsblogs.com/2010/06/11/was-the-google-mayday-update-a-complete-failure-then/</link>
		<comments>http://smackdown.blogsblogsblogs.com/2010/06/11/was-the-google-mayday-update-a-complete-failure-then/#comments</comments>
		<pubDate>Fri, 11 Jun 2010 15:52:35 +0000</pubDate>
		<dc:creator>Michael VanDeMar</dc:creator>
				<category><![CDATA[blogthropology]]></category>
		<category><![CDATA[Cuttisms]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[lackofmeds]]></category>
		<category><![CDATA[search engines]]></category>
		<category><![CDATA[SEO]]></category>

		<guid isPermaLink="false">http://smackdown.blogsblogsblogs.com/?p=722</guid>
		<description><![CDATA[Earlier this week at SMX Advanced Seattle, during the You&#038;A With Matt Cutts, the topic of the latest Google update, dubbed Mayday by webmaster last month, happened to come up. According to Ryan Jones&#8217; live blogging account of the SMX Keynote the update had nothing to do with the web spam team. It was an [...]]]></description>
			<content:encoded><![CDATA[<p>Earlier this week at <a href="http://searchmarketingexpo.com/advanced/" target="_blank">SMX Advanced Seattle</a>, during the <strong>You&#038;A With Matt Cutts</strong>, the topic of the latest Google update, dubbed <a href="http://searchengineland.com/google-confirms-mayday-update-impacts-long-tail-traffic-43054" target="_blank">Mayday</a> by webmasters last month, happened to come up. According to Ryan Jones&#8217; <a href="http://www.dotcult.com/live-blogging-matt-cutts-you-a" target="_blank">live blogging account of the SMX Keynote</a> the update had nothing to do with the web spam team. It was an algorithmic change that was intended to &#8220;make long tail results more useful&#8221;. Matt made statements in effect telling webmasters who might have been affected by Mayday that they should look at their content and see how usefulness or unique content could be added to those pages. This indicates that the point of the Mayday update was to filter out or penalize results that are <em>not</em> unique content, or that are simply autogenerated results.</p>
<p>Matt made similar statements when he was <a href="http://www.webpronews.com/topnews/2010/06/09/google-mayday-update-designed-to-hit-auto-generated-pages-content-farms" target="_blank">interviewed by WebProNews</a> and the topic came up:<span id="more-722"></span></p>
<blockquote><p>How do I make sure that I am returning the highest quality content, stuff that&#8217;s really useful for users, whether it&#8217;s editorial discretion, unique content user generated content, you know, stuff that&#8217;s not available anywhere else, versus just something that&#8217;s scraped, or duplicate, or really kind of lower quality. &#8211; <em>Matt Cutts, explaining how not to get penalized by the Mayday Update</em></p></blockquote>
<p>During the keynote, in response to Matt&#8217;s explanation of what Mayday was supposed to accomplish, Danny Sullivan indicated that he hopes that this update will help filter out results from <a href="http://www.seobook.com/content-mills" target="_blank">content mills</a> like Mahalo. <a href="http://smackdown.blogsblogsblogs.com/2010/03/08/mahalo-com-meet-the-new-spam-worse-than-the-old-spam/" target="_blank">Mahalo</a> certainly sounds exactly like what Matt was describing, as I have written about in the past.</p>
<p>So, did the Mayday update actually accomplish filtering out this &#8220;low quality&#8221; content from the search results? I went back and checked some of <a href="http://smackdown.blogsblogsblogs.com/2010/02/22/apparently-jason-calacanis-knows-hes-spamming-he-just-thinks-its-no-big-deal/" target="_blank">Jason Calacanis&#8217; spam pages on Mahalo</a> that I had blogged about in the past. Unsurprisingly enough, most of the pages I checked were still ranking just fine in Google. What was slightly unexpected, however, was what the listing looked like for one of them, [<a href="http://www.google.com/search?q=need+for+speed+walkthrough" target="_blank">need for speed walkthrough</a>]:</p>
<p>&nbsp;</p>
<p><img src="/images/mahalo-need-for-speed-stub-listing-sm.png" onmouseup="hl2l(event);" alt="Need for speed walkthrough stub listing"></p>
<p>&nbsp;</p>
<p>Notice the lack of a snippet for that listing in the search results? That is because, after my earlier write-ups about Mahalo and Google, and in an attempt to <a href="http://smackdown.blogsblogsblogs.com/2010/03/08/jason-calacanis-makes-matt-cutts-a-liar/" target="_blank">keep up appearances with Matt Cutts</a>, Jason had the team move a bunch of the pages that I wrote about into a directory named /stub, and then <a href="http://www.mahalo.com/robots.txt" target="_blank">blocked that directory via robots.txt</a>. When Google encounters blocked content that has enough link juice it lists those pages, like you see here, as URL-only. What Google usually doesn&#8217;t do, however, is actually rank those pages well in the search results.</p>
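<p>If you want to test whether a given URL is blocked by a robots.txt file, Python&#8217;s standard library can do it. This is just a sketch: the rules below are my guess at what a /stub block would look like, not a copy of Mahalo&#8217;s actual file, and the stub URL is a hypothetical example:</p>

```python
from urllib.robotparser import RobotFileParser

# Assumed rules for illustration; not copied from Mahalo's real robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /stub/
"""


def is_blocked(robots_txt, url, agent="Googlebot"):
    """Return True if the given robots.txt rules disallow the URL for the agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


# Hypothetical stub URL, followed by one of the coupon URLs mentioned elsewhere.
print(is_blocked(ROBOTS_TXT, "http://www.mahalo.com/stub/need-for-speed-walkthrough"))
print(is_blocked(ROBOTS_TXT, "http://www.mahalo.com/1and1-coupons"))
```

<p>Keep in mind that a robots.txt block only stops crawling. It does not stop the URL itself from being listed, which is why the blocked pages can still show up URL-only.</p>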
<p>Thinking that maybe it was a fluke, and that perhaps that particular listing just had some extra ranking power because I had blogged about it before and therefore it gained a few extra links, I delved further. This is just a sampling of what I found:</p>
<p>[<a href="http://www.google.com/search?&#038;q=fallen+angel+walkthrough" target="_blank">fallen angel walkthrough</a>] (#4)</p>
<p>&nbsp;</p>
<p><img src="/images/mahalo-fallen-angel-sm.png" onmouseup="hl2l(event);" alt="fallen angel walkthrough stub listing" border="0"></p>
<p>&nbsp;</p>
<p>[<a href="http://www.google.com/search?q=jack+keane+walkthrough" target="_blank">jack keane walkthrough</a>] (#7)</p>
<p>&nbsp;</p>
<p><img src="/images/mahalo-jack-keane-sm.png" onmouseup="hl2l(event);" alt="jack keane walkthrough stub listing" border="0"></p>
<p>&nbsp;</p>
<p>[<a href="http://www.google.com/search?q=how+to+plan+thanksgiving+dinner" target="_blank">how to plan thanksgiving dinner</a>] (#7)</p>
<p>&nbsp;</p>
<p><img src="/images/mahalo-thanksgiving-dinner-sm.png" onmouseup="hl2l(event);" alt="how to plan thanksgiving dinner stub listing" border="0"></p>
<p>&nbsp;</p>
<p>[<a href="http://www.google.com/search?q=allheart+coupons&#038;num=10&#038;start=10" target="_blank">allheart coupons</a>] (#11)</p>
<p>&nbsp;</p>
<p><img src="/images/mahalo-allheart-coupons-sm.png" onmouseup="hl2l(event);" alt="allheart coupons" border="0"></p>
<p>&nbsp;</p>
<p>[<a href="http://www.google.com/search?q=adult+friend+finder+coupons" target="_blank">adult friend finder coupons</a>] (#1)</p>
<p>&nbsp;</p>
<p><img src="/images/mahalo-adult-friend-finder-sm.png" onmouseup="hl2l(event);" alt="adult friend finder coupons" border="0"></p>
<p>&nbsp;</p>
<div class="standout">
<p><strong>Let&#8217;s look at that, especially the last one.</strong></p>
<p>This is a page with 0 (as in none, nil, nada, zilch) spiderable content, yet Google has deemed it worthy of ranking it #1, above <em>every other page in the index</em> that matches that phrase. Every.Single.One. We&#8217;re given the company line telling webmasters that in order to succeed in ranking in Google one must focus on &#8220;quality&#8221; and &#8220;unique&#8221; content, yet Google decides to give Mahalo a golden ticket for pages they can&#8217;t even <em>see</em>?</p>
<p><strong>Wtf?</strong></p>
</div>
<p>&nbsp;</p>
<p>Hm. Maybe the key is that if you want your duplicate, low quality, spammy content ranked then all you have to do is block it with robots.txt&#8230;?</p>
<p>Obviously that is not the case, and anyone who understands anything about how these things work will recognize that statement as ludicrous&#8230; but just to be sure, let&#8217;s look at some of the non-blocked content Mahalo pages are currently ranking for:</p>
<p>[<a href="http://www.google.com/search?q=julmust" target="_blank">julmust</a>] (#7)</p>
<p>&nbsp;</p>
<p><img src="/images/mahalo-julmust-sm.png" onmouseup="hl2l(event);" alt="Mahalo julmust scraped content" border="0"></p>
<p>&nbsp;</p>
<p>[<a href="http://www.google.com/search?q=The+Alice+B.+Toklas+Cookbook&#038;num=10&#038;start=10" target="_blank">The Alice B. Toklas Cookbook</a>] (#11)</p>
<p>&nbsp;</p>
<p><img src="/images/mahalo-alice-b-toklas-sm.png" onmouseup="hl2l(event);" alt="Mahalo Alice B Toklas scraped content" border="0"></p>
<p>&nbsp;</p>
<p>These Mahalo pages being returned are both examples of the many pages on Mahalo.com that are nothing more than <a href="http://smackdown.blogsblogsblogs.com/2010/03/11/jason-calacanis-backup-plan-for-replacing-content-steal-it-from-wikipedia/" target="_blank">content scraped directly from Wikipedia</a>:</p>
<p><a href="http://en.wikipedia.org/wiki/Julmust" target="_blank">http://en.wikipedia.org/wiki/Julmust</a><br />
<a href="http://en.wikipedia.org/wiki/The_Alice_B._Toklas_Cookbook">http://en.wikipedia.org/wiki/The_Alice_B._Toklas_Cookbook</a></p>
<p>Note also that Wikipedia identifies some of their content as being &#8220;stubs&#8221; (the Alice B Toklas Cookbook page has a mere 261 words of content, including numbers and &#8220;a&#8221;, &#8220;and&#8221;, and &#8220;the&#8221;), but Mahalo is fine with presenting that exact same content as non-stub for whatever reason.</p>
<p>So, Danny, sorry&#8230; it looks as if Google did not achieve what it reportedly wanted to do with this latest update. It looks like both the low quality and completely duplicate (or even non-existent) content on Mahalo.com continues to rank. Maybe next time.</p>
]]></content:encoded>
			<wfw:commentRss>http://smackdown.blogsblogsblogs.com/2010/06/11/was-the-google-mayday-update-a-complete-failure-then/feed/</wfw:commentRss>
		<slash:comments>31</slash:comments>
		</item>
		<item>
		<title>Jason Calacanis: Screw You Google, Now I&#8217;ll Sell Links Too</title>
		<link>http://smackdown.blogsblogsblogs.com/2010/03/12/jason-calacanis-screw-you-google-now-ill-sell-links-too/</link>
		<comments>http://smackdown.blogsblogsblogs.com/2010/03/12/jason-calacanis-screw-you-google-now-ill-sell-links-too/#comments</comments>
		<pubDate>Fri, 12 Mar 2010 13:45:43 +0000</pubDate>
		<dc:creator>Michael VanDeMar</dc:creator>
				<category><![CDATA[Cuttisms]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[lackofmeds]]></category>
		<category><![CDATA[scams]]></category>
		<category><![CDATA[search engines]]></category>
		<category><![CDATA[SEO]]></category>
		<category><![CDATA[Social Media]]></category>

		<guid isPermaLink="false">http://smackdown.blogsblogsblogs.com/?p=605</guid>
		<description><![CDATA[By now Google has to be getting more than a little embarrassed about the behavior of Mr. Jason Calacanis and his site, Mahalo.com. Aaron Wall did a very well written piece explaining how Mahalo Makes Black Look White and the spammy techniques they were employing. This isn&#8217;t the first time Aaron has blogged about Mahalo [...]]]></description>
			<content:encoded><![CDATA[<p>By now Google has to be getting more than a little embarrassed about the behavior of Mr. Jason Calacanis and his site, Mahalo.com. Aaron Wall did a very well written piece explaining how <a href="http://www.seobook.com/black-hat-seo-case-study" target="_blank">Mahalo Makes Black Look White</a> and the spammy techniques they were employing. This isn&#8217;t the first time Aaron has <a href="http://www.seobook.com/mark-cubans-mahalo-wants-your-blood-and-gets-it-too" target="_blank">blogged about Mahalo</a> either, and talked about exactly how <a href="http://www.seobook.com/why-mahalo-and-other-content-scrapers-render-googles-spam-team-flaccid">this makes Google look bad</a>. For those who might not know, I <a href="http://smackdown.blogsblogsblogs.com/2010/02/22/apparently-jason-calacanis-knows-hes-spamming-he-just-thinks-its-no-big-deal/" target="_blank">have also</a> <a href="http://smackdown.blogsblogsblogs.com/2010/03/08/jason-calacanis-makes-matt-cutts-a-liar/" target="_blank">been blogging</a> <a href="http://smackdown.blogsblogsblogs.com/2010/03/08/mahalo-com-meet-the-new-spam-worse-than-the-old-spam/" target="_blank">about this</a> <a href="http://smackdown.blogsblogsblogs.com/2010/03/11/jason-calacanis-backup-plan-for-replacing-content-steal-it-from-wikipedia/" target="_blank">recently</a>.</p>
<p>While Google will <a href="http://groups.google.com/groups/search?hl=en&#038;ie=UTF-8&#038;q=banned+google" target="_blank">ban smaller websites</a> from their search results or from AdSense on a whim, usually it takes heavier coverage<span id="more-605"></span> for bigger players to get hit. Like, for instance, when <a href="http://blogoscoped.com/archive/2006-02-01-n31.html" target="_blank">Google Blogoscoped outed BMW</a> for spammy doorway pages. The story spread relatively fast, and Google wound up banning BMW for a short period of time. So when someone has a &#8220;special&#8221; relationship with Google, as Jason appears to have, and keeps getting second (and third, and fourth, and fifth&#8230;) chances to clean up their act, yet continues to thumb their nose in Google&#8217;s general direction, it makes one wonder. Google has to be at least somewhat concerned that someone in the mainstream media will eventually notice and start to ask why someone like Jason would continually be allowed to get away with this stuff. Considering the unfairness and lack of impartiality of letting a site like Mahalo slide while punishing so many smaller sites for lesser offenses, my guess is that Google doesn&#8217;t actually want to discuss their reasons behind ignoring it. And so, each time Google does nothing, Jason decides to push things a little more.</p>
<p>This time it looks like Jason has decided to go ahead and violate the rules <a href="http://www.mattcutts.com/blog/selling-links-that-pass-pagerank/" target="_blank">closest to Matt Cutts&#8217; heart</a>. While the layout I am showing here will change with time, since the header contains rotating articles, currently if you go to Mahalo, at the top of every page (on the non-Answers side, anyways) you will see the following block of stories that Mahalo is highlighting (usually this area contains trending or hot news items):</p>
<p>&nbsp;</p>
<p><a href="/images/mahalo-ad-in-header.png" target="_blank"><img src="/images/mahalo-ad-in-header-sm.png" onmouseup="hl2l(event);" alt="Conundrum?" border="0"></a><br />
(<em>Click to enlarge</em>)</p>
<p>&nbsp;</p>
<p>While technically speaking just by looking at them there is nothing to distinguish one of those &#8220;featured stories&#8221; from another, the one that doesn&#8217;t actually belong is the third one in, with the caption &#8220;Best Pickup Line Ever?&#8221;. The reason that one is different from all the rest is simple&#8230; it&#8217;s not a Mahalo featured story at all, and has nothing to do with anything going on in the news. It&#8217;s an ad. It is a paid link that Mahalo.com sold, one that leads to a site built to <a href="http://www.whatsyourconundrum.com/love-and-relationships/best-pickup-line-ever" target="_blank" rel="nofollow">market a wine company</a>. There is nothing visual to distinguish or disclose that <em>as</em> an ad, and if we view the source of the page&#8230;</p>
<p>&nbsp;</p>
<p><a href="/images/mahalo-ad-in-header-source.png" target="_blank"><img src="/images/mahalo-ad-in-header-source-sm.png" onmouseup="hl2l(event);" alt="Wheres the nofollow...?" border="0"></a><br />
(<em>Click to enlarge</em>)</p>
<p>&nbsp;</p>
<p>&#8230; we can see that there is nothing <em>machine readable</em> (i.e., a nofollow attribute) to distinguish it as an ad, either.</p>
<p>Matt Cutts has been very, very clear on his take on sites that sell links that pass PageRank, or ones that don&#8217;t disclose that they are in fact ads: they are spamming. No ifs, ands, or buts about it, they deserve to get punished. In fact, he has even gone so far as to state that in his view undisclosed paid links <a href="http://www.mattcutts.com/blog/hidden-links/" target="_blank">violate FTC guidelines</a>.</p>
<p>So, Matt, recently you put out a <a href="http://www.mattcutts.com/blog/calling-for-link-spam-reports/" target="_blank">call for link spam reports</a>, including &#8220;paid links that pass PageRank&#8221;. Really, though, is there any point in reporting Mahalo to you? Are you going to actually take action, or, like you have done with Jason&#8217;s spam in the past, will you continue to simply look the other way? Any other site would be faced with a penalize/ban first, make nice nice with Google later. Hell, with the BMW site you penalized them <em>after</em> they cleaned it up, just to make an example. I get the strangest feeling, though, that won&#8217;t happen with Mahalo&#8230;</p>
<p>Can you at least say <em>something</em> about this issue&#8230;?</p>
<p><strong>Update:</strong> in response to a comment below and a question posed by Matt Cutts about what makes me believe that this is indeed a paid link:</p>
<p><a href="http://smackdown.blogsblogsblogs.com/2010/03/13/the-mahalo-paid-link-evidence-trail/">The Mahalo Paid Link Evidence Trail</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://smackdown.blogsblogsblogs.com/2010/03/12/jason-calacanis-screw-you-google-now-ill-sell-links-too/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>Jason Calacanis&#8217; Backup Plan For Replacing Content: Steal It From Wikipedia</title>
		<link>http://smackdown.blogsblogsblogs.com/2010/03/11/jason-calacanis-backup-plan-for-replacing-content-steal-it-from-wikipedia/</link>
		<comments>http://smackdown.blogsblogsblogs.com/2010/03/11/jason-calacanis-backup-plan-for-replacing-content-steal-it-from-wikipedia/#comments</comments>
		<pubDate>Thu, 11 Mar 2010 18:11:08 +0000</pubDate>
		<dc:creator>Michael VanDeMar</dc:creator>
				<category><![CDATA[blogthropology]]></category>
		<category><![CDATA[Cuttisms]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[lackofmeds]]></category>
		<category><![CDATA[scams]]></category>
		<category><![CDATA[search engines]]></category>
		<category><![CDATA[SEO]]></category>
		<category><![CDATA[Social Media]]></category>

		<guid isPermaLink="false">http://smackdown.blogsblogsblogs.com/?p=590</guid>
		<description><![CDATA[/sigh Ok Jason, we get it, you&#8217;re desperate. But stealing content from Wikipedia in order to replace what you deleted? Come on! I am flipping through Mahalo.com today, just seeing if you&#8217;re keeping your word or not, when all of a sudden I notice this huge amount of pages with odd names that somehow I [...]]]></description>
			<content:encoded><![CDATA[<p>/sigh</p>
<p>Ok Jason, we get it, you&#8217;re desperate. But stealing content from Wikipedia in order to replace what you deleted? Come on!</p>
<p>I am flipping through Mahalo.com today, just seeing if<span id="more-590"></span> you&#8217;re keeping your word or not, when all of a sudden I notice this <em>huge</em> amount of pages with odd names that somehow I missed before:</p>
<p><a href="http://www.mahalo.com/cgs-20625" target="_blank" rel="nofollow">http://www.mahalo.com/cgs-20625</a><br />
<a href="http://www.mahalo.com/cgs-9896" target="_blank" rel="nofollow">http://www.mahalo.com/cgs-9896</a><br />
<a href="http://www.mahalo.com/cp-154-526" target="_blank" rel="nofollow">http://www.mahalo.com/cp-154-526</a><br />
<a href="http://www.mahalo.com/daa-1097" target="_blank" rel="nofollow">http://www.mahalo.com/daa-1097</a><br />
<a href="http://www.mahalo.com/daa-1106" target="_blank" rel="nofollow">http://www.mahalo.com/daa-1106</a></p>
<p>These are all nothing more than content stolen from Wikipedia. Your version:</p>
<p>&nbsp;</p>
<p><img src="/images/mahalo-cgs-20625.png" onmouseup="hl2l(event);" alt="Mahalo CGS-20625" border="0"></p>
<p>&nbsp;</p>
<p><a href="http://en.wikipedia.org/wiki/CGS-20625" target="_blank">Wikipedia&#8217;s version</a>:</p>
<p>&nbsp;</p>
<p><img src="/images/wikipedia-cgs-20625.png" onmouseup="hl2l(event);" alt="Wikipedia CGS-20625" border="0"></p>
<p>&nbsp;</p>
<p>You even hyperlinked the same internal linking scheme to the same topics Wikipedia does, regardless of whether or not those pages exist on Mahalo.com. How the hell can you claim this is original content when it is nothing more than cut and paste? I mean, wtf, you just claimed that <a href="http://smackdown.blogsblogsblogs.com/2010/02/22/apparently-jason-calacanis-knows-hes-spamming-he-just-thinks-its-no-big-deal/" target="_blank">Wikipedia is nothing more than a free for all</a> a little under 3 weeks ago&#8230; does the content somehow take on some magical value after you scrape it and host it on your servers, trying to pass it off as something one of your users wrote? THIS is the &#8220;<a href="http://twitter.com/Jason/status/10187602318" target="_blank" rel="nofollow">our users build it</a>&#8221; content that you were referring to?</p>
<p>&nbsp;</p>
<p><img src="/images/jason-calacanis-our-users-build-it.png" onmouseup="hl2l(event);" alt="Our users build it" border="0"></p>
<p>&nbsp;</p>
<p>News flash, Jason&#8230; your users and Wikipedia&#8217;s users are <em>not</em> the same people. You lumped Squidoo into the same category back then as well. If I look close enough, will I find content stolen from them too?</p>
<p>Now, to be fair, maybe this content existed all along but was much, much less noticeable when you had all of those <em>other</em> pages of fluff in there, but now that you have <a href="http://smackdown.blogsblogsblogs.com/2010/03/10/dear-jason-calacanis-this-isnt-an-absurd-microscope/" target="_blank">deleted 78% of that side of Mahalo</a>, these scraped pages are practically impossible to miss. None of the pages I looked at were actually indexed in Google, but it looks like <a href="" target="_blank">at least 27,900</a> of them currently are:</p>
<p>&nbsp;</p>
<p><a href="/images/mahalo-wikipedia-pages-indexed.png" target="_blank"><img src="/images/mahalo-wikipedia-pages-indexed-sm.png" onmouseup="hl2l(event);" alt="27,900 indexed pages scraped from Wikipedia on Mahalo" border="0"></a><br />
(<em>Click to enlarge</em>)</p>
<p>&nbsp;</p>
<p>Considering that Google only has a portion of those indexed, and that at last count there were only 128,324 pages left on that side of your site, that means that at minimum over 21% (and in all likelihood 30% &#8211; 40%) of the remaining pages on Mahalo.com are these scraped ones. Is that really what you want on your &#8220;Human Powered Search Engine&#8221;&#8230;?</p>
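<p>For anyone who wants to check the &#8220;over 21%&#8221; floor, here is a quick sketch of the arithmetic in Python. The two figures are the ones quoted above; the variable names are just made up for illustration.</p>

```python
# Figures quoted in the post; the variable names are illustrative only.
indexed_scraped_pages = 27_900   # scraped pages Google reports as indexed
remaining_pages = 128_324        # pages left on that side of Mahalo.com

# Lower bound on the share of remaining pages that are scraped copies.
minimum_share = indexed_scraped_pages / remaining_pages
print(f"{minimum_share:.1%}")    # 21.7%
```

<p>Since Google only has a portion of the scraped pages indexed, this number is a minimum, not an estimate of the real share.</p>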
<p>There are 2 major differences between the original content and your version. 1) The original content cites the sources directly there on the page, whereas Mahalo does not, and 2) Mahalo is using each and every one of these scraped pages to automatically create 2 (and sometimes 3) additional contentless pages under the guise of questions being asked anonymously, questions that no one ever actually asked, but that are there solely for the purpose of bolstering your indexed page count in Google. The first question asked of every drug is <a href="http://www.mahalo.com/answers/health/what-are-the-side-effects-of-cgs-20625" target="_blank" rel="nofollow">What are the side effects of {insert drug}?</a>:</p>
<p>&nbsp;</p>
<p><img src="/images/mahalo-answers-cgs-20625.png" onmouseup="hl2l(event);" alt="What are the side effects of CGS-20625" border="0"></p>
<p>&nbsp;</p>
<p>And the second is always <a href="http://www.mahalo.com/answers/health/where-can-i-get-cgs-20625" target="_blank" rel="nofollow">Where can I get {insert drug}</a>:</p>
<p>&nbsp;</p>
<p><img src="/images/mahalo-answers-cgs-20625-b.png" onmouseup="hl2l(event);" alt="Where can I get CGS-20625" border="0"></p>
<p>&nbsp;</p>
<p>I also saw a &#8220;Who makes {insert drug}&#8221; question here and there as well. These are empty questions, asked by a bot, that for the most part will never get answered (or even looked at) and were never intended to. Three plus free pages (two of them <em>completely</em> devoid of content) for the price of one scraped page. Jason, seriously, do you really think you are slick doing this?</p>
<p>By the way, a huge number of those pages are less than 100 words in length, yet none of them have the noindex tag. Regardless of what the length is, however, it&#8217;s not what you are claiming Mahalo.com <em>is</em>, Jason, it&#8217;s more fluff. Wikipedia already gives us those articles. You add nothing to the interwebs by copying them. In all seriousness you should just go through and delete them all.</p>
]]></content:encoded>
			<wfw:commentRss>http://smackdown.blogsblogsblogs.com/2010/03/11/jason-calacanis-backup-plan-for-replacing-content-steal-it-from-wikipedia/feed/</wfw:commentRss>
		<slash:comments>20</slash:comments>
		</item>
		<item>
		<title>Is Google Referrer Spamming Too Now?</title>
		<link>http://smackdown.blogsblogsblogs.com/2010/02/16/is-google-referrer-spamming-too-now/</link>
		<comments>http://smackdown.blogsblogsblogs.com/2010/02/16/is-google-referrer-spamming-too-now/#comments</comments>
		<pubDate>Tue, 16 Feb 2010 13:59:41 +0000</pubDate>
		<dc:creator>Michael VanDeMar</dc:creator>
				<category><![CDATA[Google]]></category>
		<category><![CDATA[lackofmeds]]></category>
		<category><![CDATA[MSN]]></category>
		<category><![CDATA[On The Ball-ness]]></category>
		<category><![CDATA[search engines]]></category>
		<category><![CDATA[SEO]]></category>

		<guid isPermaLink="false">http://smackdown.blogsblogsblogs.com/?p=463</guid>
		<description><![CDATA[Yesterday a friend of mine sent me a section of her traffic logs that were showing some odd information. According to what was recorded there her brand new, as of yet unlinked-to website was ranking on the first page of Google for the single keyword, [free]. If she actually had managed to rank for that [...]]]></description>
			<content:encoded><![CDATA[<p>Yesterday a friend of mine sent me a section of her traffic logs that were showing some odd information. According to what was recorded there her brand new, as of yet unlinked-to website was ranking on the first page of Google for the single keyword, [<a href="http://www.google.com/search?q=free" target="_blank">free</a>]. If she actually had managed to rank for that phrase it would be an amazing feat to say the least. The competition for that single word is enormous. Unsurprisingly, when performing that actual search her site is nowhere to be found. The site in question is barely one week old, and hasn&#8217;t even been launched yet.</p>
<p>What is surprising, to me anyways, is that it appears that the traffic is actually coming from a bot at Google&#8230; a bot that is cloaked, sending fake<span id="more-463"></span> referrers, and behaving in exactly the same manner as <a href="http://smackdown.blogsblogsblogs.com/2007/11/13/microsoft-needs-to-quit-fucking-with-my-adsense-scripts/" target="_blank">MSN&#8217;s referrer spamming</a> bot that first showed up a little over 2 years ago. I blogged about it back then, as did <a href="http://sebastians-pamphlets.com/msn-admits-clueless-and-ineffective-spamming/" target="_blank">many</a> <a href="http://ekstreme.com/thingsofsorts/blogging/yell-if-microsofts-livecom-spammed-you-too" target="_blank">others</a>. Eventually, after much feedback from the community, they <a href="http://www.seroundtable.com/archives/020672.html" target="_blank">did halt</a> the referrer spam practice. It was a bad idea for them to do it in the first place, and quite a few webmasters were perturbed about it. Two years was too long for it to go on, but at least they did finally stop doing it.</p>
<p>Now it looks like Google, for some unfathomable reason, has decided to start doing the exact same thing. The entries in my friend&#8217;s traffic logs looked like this:</p>
<blockquote class="eml"><p>74.125.126.81 - - [14/Feb/2010:16:34:03 -0600] "GET / HTTP/1.1" 200 19361 "http://www.google.com/search?hl=en&#038;q=free&#038;btnG=Google+Search&#038;aq=f&#038;oq=" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"</p>
<p>72.14.192.3 - - [14/Feb/2010:16:36:28 -0600] "GET / HTTP/1.1" 200 19361 "http://www.google.com/search?hl=en&#038;q=free&#038;btnG=Google+Search&#038;aq=f&#038;oq=" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"</p></blockquote>
<p>The IPs in question definitely belong to Google (as can be seen here <a href="http://ws.arin.net/whois/?queryinput=74.125.126.81" target="_blank">74.125.126.81</a>, and here <a href="http://ws.arin.net/whois/?queryinput=72.14.192.3" target="_blank">72.14.192.3</a>). However, unlike normal Googlebot IPs, these are not associated with the Google domain name via DNS. For instance, if you do a host lookup on 66.249.71.233 you will see that it resolves to the hostname crawl-66-249-71-233.googlebot.com. The IPs that the referrer spam is coming from do not resolve to any hostname. Presumably, going on the logic that MSN gave when they were first called out for doing this, the reason for not having a reverse DNS entry associated with the IPs is to hide the fact that they actually are from Google. Similarly, the user-agent of these bots is being cloaked as well. Instead of actually identifying as Googlebot, &#8220;Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)&#8221;, these bots are pretending to be an actual user using IE6 on Windows.</p>
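<p>If you want to run the same host lookup against your own logs, a minimal Python sketch might look like the following. The function names are my own invention for illustration, and the only library call used is the standard <code>socket.gethostbyaddr</code>. It checks just the reverse lookup described above, nothing more.</p>

```python
import socket


def reverse_dns(ip):
    """Return the PTR hostname for ip, or None when no reverse DNS entry exists."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
        return hostname
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None


def identifies_as_googlebot(ip, lookup=reverse_dns):
    """True when the reverse DNS name of ip ends in .googlebot.com.

    lookup is injectable (a hypothetical convenience, not part of any
    library) so the check can be exercised without network access.
    """
    hostname = lookup(ip)
    return hostname is not None and hostname.lower().endswith(".googlebot.com")
```

<p>Calling <code>identifies_as_googlebot("66.249.71.233")</code> performs a live lookup, so the answer depends on whatever DNS returns on the day you run it. A stricter check would also resolve the returned hostname forward and confirm that it maps back to the same IP; that step is left out of this sketch.</p>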
<p>Unlike actual web surfers, Google, you have no expectation of privacy. When you are a bot, skulking around trying to disguise yourself as someone else is poor netiquette to say the least. I am not sure exactly what prompted you to start doing this, but you really should just stop.</p>
<p><strong>Update:</strong></p>
<p>Barry Schwartz of Search Engine Land contacted Google about this, and they <a href="http://searchengineland.com/is-google-referrer-spamming-to-detect-spam-36453/" target="_blank">replied back</a> that this is indeed them performing cloaked spidering. However, according to them it is not being done for spam detection purposes, and the particular referrers used were in error:</p>
<blockquote><p>Turns out, we were running an experiment to detect malware targeting Hot Trends queries related to the Haiti crisis. Because this experiment was developed in response to an urgent situation we moved quickly and as a result used an incorrect Google search referrer which we’re now working to fix. Thanks for calling this issue to our attention and we apologize for any confusion we may have caused.</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://smackdown.blogsblogsblogs.com/2010/02/16/is-google-referrer-spamming-too-now/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Google Fails 5th Grade Math Test</title>
		<link>http://smackdown.blogsblogsblogs.com/2010/02/01/google-fails-5th-grade-math-test/</link>
		<comments>http://smackdown.blogsblogsblogs.com/2010/02/01/google-fails-5th-grade-math-test/#comments</comments>
		<pubDate>Tue, 02 Feb 2010 01:58:32 +0000</pubDate>
		<dc:creator>Michael VanDeMar</dc:creator>
				<category><![CDATA[bad research]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[lackofmeds]]></category>
		<category><![CDATA[nerdiness]]></category>
		<category><![CDATA[search engines]]></category>

		<guid isPermaLink="false">http://smackdown.blogsblogsblogs.com/?p=448</guid>
		<description><![CDATA[So, I think I finally discovered the cause of global warming. No, for reals. From what I can tell, miss Mother Nature started using Google Calculator in helping her figure out what kind of weather she should serve up to us. Now, if she were trying to bake a cake, or perhaps get driving directions, [...]]]></description>
			<content:encoded><![CDATA[<p><img src="/images/wrongcalc.png" border="0" alt="Calculator says... idk, 7?" style="float: right;"  onmouseup="hl2l(event);"> So, I think I finally discovered the cause of global warming. No, for reals. From what I can tell, miss Mother Nature started using <a href="http://www.google.com/help/calculator.html" target="_blank">Google Calculator</a> in helping her figure out what kind of weather she should serve up to us. Now, if she were trying to bake a cake, or perhaps get driving directions, I am sure Google would have worked just fine. But for doing math involving temperatures&#8230;? Not so much.</p>
<p>I was playing around with the functions on Google Calculator last week, when I noticed some of the calculations weren&#8217;t quite right. Maybe Michael Bolton from <a href="http://www.imdb.com/title/tt0151804/" target="_blank">Office Space</a> was involved<span id="more-448"></span> in writing the Google Calculator app, and wound up putting a decimal in the wrong place, but <em>something</em> sure isn&#8217;t adding up.</p>
<p>For those that don&#8217;t know, water boils at 100 degrees Celsius and freezes at 0 degrees Celsius. In Fahrenheit the range is 212 degrees (boiling) to 32 degrees (freezing). Even for those of you who might not have remembered those figures off the top of your head, most of us did learn them fairly early in our academic careers&#8230; probably right around fourth or fifth grade. While those numbers may vary slightly under extreme pressures, for the most part they are pretty much standard. A simple search on Google will in fact verify that they are correct.</p>
<p>Overall I think that Google Calculator is a pretty cool tool. You can even type math in using English, and it will do its best to figure out how to interpret the numbers. Usually it does an excellent job. When I was playing with it the other day, however, I got this odd response:</p>
<p>&nbsp;</p>
<p><img src="/images/gmath1.png" onmouseup="hl2l(event);" alt="No, 64 degrees/2 does not equal a new ice age..."><br />
<em>Erm&#8230; no.</em></p>
<p>&nbsp;</p>
<p>Way off. Not even in the same ballpark. Google did manage to group the numbers in a meaningful way, correctly guessing what I actually meant by that question, and yet somehow still came up with the wrong answer. If you can&#8217;t do it in your head, that should be (64F/2) = 32F = 0C, or water&#8217;s freezing point. </p>
<p>Maybe it just has a problem with temperatures on the low end, I thought, and that if I go the other way we might get better results:</p>
<p>&nbsp;</p>
<p><img src="/images/gmath2.png" onmouseup="hl2l(event);" alt="Hot hot hot!"><br />
<em>Wow. That&#8217;s hot. Also wrong.</em></p>
<p>&nbsp;</p>
<p>Apparently not. That one should be 106F * 2 = 212F = 100C, or the temperature at which water boils. Instead we get temperatures hotter than most home ovens can go. <img src='http://smackdown.blogsblogsblogs.com/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' /> </p>
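<p>For reference, here is a small Python sketch of the arithmetic I was expecting. It treats each reading as a plain number on the Fahrenheit scale, does the division or multiplication there, and only then converts to Celsius. The function name is made up for this example.</p>

```python
def fahrenheit_to_celsius(deg_f):
    """Convert a Fahrenheit reading to Celsius."""
    return (deg_f - 32) * 5 / 9


# The two calculations from the post, done on the Fahrenheit reading itself.
half_of_64f = 64 / 2      # 32 F
double_106f = 106 * 2     # 212 F

print(fahrenheit_to_celsius(half_of_64f))   # 0.0
print(fahrenheit_to_celsius(double_106f))   # 100.0
```
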
<p>To give Google the benefit of the doubt, I decided to try and take out the conversions altogether, and just let it do simple temperature calculations, staying in just one measurement system:</p>
<p>&nbsp;</p>
<p><img src="/images/gmath3.png" onmouseup="hl2l(event);" alt="No conversion involved, Google Calculator still gets it wrong?"><br />
<em>Not even close. Google fails.</em></p>
<p>&nbsp;</p>
<p>Nope, still no math love from the search giant. I guess Google just needs to go back to school. <img src='http://smackdown.blogsblogsblogs.com/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://smackdown.blogsblogsblogs.com/2010/02/01/google-fails-5th-grade-math-test/feed/</wfw:commentRss>
		<slash:comments>20</slash:comments>
		</item>
	</channel>
</rss>

