<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Smackdown! &#187; psychoblogging</title>
	<atom:link href="http://smackdown.blogsblogsblogs.com/category/psychoblogging/feed/" rel="self" type="application/rss+xml" />
	<link>http://smackdown.blogsblogsblogs.com</link>
	<description>Smackdown!</description>
	<lastBuildDate>Tue, 22 Nov 2011 22:40:24 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Apparently Jason Calacanis Knows He&#8217;s Spamming &#8211; He Just Thinks It&#8217;s No Big Deal</title>
		<link>http://smackdown.blogsblogsblogs.com/2010/02/22/apparently-jason-calacanis-knows-hes-spamming-he-just-thinks-its-no-big-deal/</link>
		<comments>http://smackdown.blogsblogsblogs.com/2010/02/22/apparently-jason-calacanis-knows-hes-spamming-he-just-thinks-its-no-big-deal/#comments</comments>
		<pubDate>Mon, 22 Feb 2010 13:30:17 +0000</pubDate>
		<dc:creator>Michael VanDeMar</dc:creator>
				<category><![CDATA[blogthropology]]></category>
		<category><![CDATA[Cuttisms]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[lackofmeds]]></category>
		<category><![CDATA[marketing]]></category>
		<category><![CDATA[psychoblogging]]></category>
		<category><![CDATA[scams]]></category>
		<category><![CDATA[SEO]]></category>

		<guid isPermaLink="false">http://smackdown.blogsblogsblogs.com/?p=490</guid>
		<description><![CDATA[Last month Jason Calacanis wrote a rather sarcastic post aimed at Aaron Wall, which I am assuming was written in response to Aaron&#8217;s post, &#8220;Black Hat SEO Case Study: How Mahalo Makes Black Look White!&#8220;. In it Aaron discusses how sites that are composed largely of nothing more than auto-generated pages wrapped in adsense can [...]]]></description>
			<content:encoded><![CDATA[<p>Last month Jason Calacanis wrote a <a href="http://calacanis.com/2010/01/25/my-thank-you-email-to-aaronwall-for-the-free-seo-advice-great-seo-great-guy/" target="_blank">rather sarcastic post</a> aimed at <a href="http://www.seobook.com/" target="_blank">Aaron Wall</a>, which I am assuming was written in response to Aaron&#8217;s post, &#8220;<a href="http://www.seobook.com/black-hat-seo-case-study" target="_blank">Black Hat SEO Case Study: How Mahalo Makes Black Look White!</a>&#8220;. In it Aaron discusses how sites that are composed largely of nothing more than auto-generated pages wrapped in adsense can get accepted and even gain authority in Google if they have enough financing and press. In Jason&#8217;s rebuttal to this was a claim about rankings that Mahalo had &#8220;earned&#8221; (and I use the term loosely) for &#8220;VIDEO GAME walkthrough&#8221;. I originally misinterpreted what he was trying to say, and thought that he meant rankings for that exact phrase. I commented how that wasn&#8217;t exactly a great accomplishment before realizing that what he actually meant was rankings for [{insert video game name} walkthrough], and that Mahalo has a couple top 10 rankings for that genre of search phrases.</p>
<p>Jason sent me an email to correct me on what he was talking about. We replied to each other back and forth a couple times, and a few very interesting things were revealed in that conversation:<span id="more-490"></span></p>
<blockquote class="eml"><p>On Mon, 2010-01-25 at 14:46 -0800, Jason Calacanis wrote:<br />
sorry, didn&#8217;t mean ranking for &#8220;video game walkthroughs&#8221; literally&#8230;.<br />
more like VIDEO GAME NAME walkthrough:<br />
<a href="http://www.google.com/search?&#038;q=call+of+duty+walkthrough" target="_blank">http://www.google.com/search?&#038;q=call+of+duty+walkthrough</a></p></blockquote>
<blockquote class="eml"><p>On Mon, Jan 25, 2010 at 3:38 PM, Michael VanDeMar wrote:<br />
Yeah, realized that and pointed out that I might have misinterpreted in the next comment. I only saw one or two that you were in the top 5 though, not a slew of them. But yeah, better than the generic version.</p>
<p>Thing is that you still have tons of auto or near-auto generated content out there ranked, stuff that is nothing more than noise, including stuff you are boasting about ranking for:</p>
<p><a href="http://www.mahalo.com/need-for-speed-prostreet-walkthrough" target="_blank" rel="nofollow">http://www.mahalo.com/need-for-speed-prostreet-walkthrough</a></p>
<p>I commented about this last year I believe&#8230; this is not a legitimate or ethical way to go about what you are doing:</p>
<p>1) Auto-generate tons of pages with no content, with AdSense embedded on them, based solely on search phrases<br />
2) See what winds up ranking<br />
3) Go in and put valid content in afterwards, pretend that all along the content was not only valid, but actually better that what you outranked.</p>
<p>You want a more palatable model? How about this&#8230; there are tons of sites out there that are full of great content that no one will ever see, because they&#8217;re simply not seo&#8217;d (ie. unknown and unlinked). Why don&#8217;t you:</p>
<p>1) Find a way to identify the search results that are already full of crap sites<br />
2) Identify quality sites that should be there instead, and<br />
3) Use your pull and popularity to get *those* sites ranked.</p>
<p>If you find a way to do just that and somehow build a successful business model around it then that would be much, much better than what Mahalo is now.</p>
<p>    -Michael</p></blockquote>
<blockquote class="eml"><p>On Mon, 2010-01-25 at 15:43 -0800, Jason Calacanis wrote:<br />
the part you&#8217;re leaving out is:</p>
<p>a) we used to noindex these and we are going to again<br />
b) if any page gets any traffic we pay someone to build it out&#8211;so it&#8217;s only short for a couple of days. <img src='http://smackdown.blogsblogsblogs.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>if a page is started by a user and is short it a) won&#8217;t rank 99% of the time and b) if it happens to we see it in analytics and build it out. </p>
<p>not really a big deal IMO</p>
<p>best j</p></blockquote>
<p>Let&#8217;s take a closer look at some of the things that Jason is saying here:</p>
<p><strong>&#8220;we used to noindex these and we are going to again&#8221;</strong></p>
<p>At one point Mahalo had the noindex tag on the pages that were fully automated, and Jason is freely admitting that <em>they should have the noindex tag now</em>. These pages have no added value and should not be indexed by Google. Jason must believe this, or he would not say that they were planning to noindex them again. If he thought they deserved to be in the index then there would be no reason to tell the search engine spiders not to include them. Now, according to Jason&#8217;s rebuttal to Aaron, the removal of the noindex tag was an &#8220;accident&#8221; that happened in the migration to Mahalo 3.0. For those who do not know, Mahalo 3.0 was released back in November of last year. Mahalo is a template driven site. All it takes to &#8220;fix&#8221; the tag that was lost &#8220;by accident&#8221; is a coder opening up the template header and including 1 line of conditional code. In the 3 months prior to our conversation, and in the 3 weeks since, no one has bothered to do so. To this day those pages are still getting added to the index at a very large rate.</p>
<p>How large, you might ask? I have been checking here and there for the past few weeks, and when I looked on any given day Google was reporting from 2,000 to 20,000 new pages on Mahalo.com:</p>
<p>&nbsp;</p>
<p><img src="/images/mahalo-site-24hrs-sm.png" onmouseup="hl2l(event);" alt="Tons of automated spammy pages from Mahalo.com"></p>
<p>&nbsp;</p>
<p>I clicked through quite a few of these pages to see what the content looked like. Almost every single page I looked at was completely automated content. In fact, when I looked today, the only one I found that wasn&#8217;t <em>totally</em> automated was the Mahalo entry on <a href="http://www.mahalo.com/hootsuite" target="_blank" rel="nofollow">Hootsuite</a>&#8230; which happened to have 2 additional manually added sentences in it, content which &#8220;enhanced&#8221; the scraped content that surrounded it: &#8220;HootSuite is the professional Twitter client. With HootSuite, you can manage multiple Twitter profiles, pre-schedule tweets, and measure your success.&#8221; The entire &#8220;Human Powered&#8221; portion of that page is tinier than just one AdSense block located directly under it:</p>
<p>&nbsp;</p>
<p><img src="/images/mahalo-hootsuite-entry-sm.png" onmouseup="hl2l(event);" alt="Yay, its Human Powered content to the rescue!"></p>
<p>&nbsp;</p>
<p>And that AdSense block is one of 4 located on the page. Yeah, let&#8217;s hear it for &#8220;added value&#8221; folks.</p>
<p>As a side note, it looks like Jason is <a href="https://www.google.com/adsense/support/bin/answer.py?hl=en&#038;answer=9735" target="_blank">violating Google AdSense TOS</a> by placing more than 3 AdSense blocks on each page. Not entirely sure why they are letting him do that, since supposedly that lowers the overall value each advertiser gets&#8230; but I digress.</p>
<p><strong>&#8220;if any page gets any traffic we pay someone to build it out&#8221;</strong></p>
<p>Here Jason is clearly admitting that he thinks it&#8217;s fine to rank your scraped content first, and then add quality if and only if it gets traffic (and remember, 1-2 sentences of &#8220;quality&#8221; is fine). Essentially he&#8217;s saying, go ahead and spam, but if it looks like it might get enough traffic someone will actually notice, add content.</p>
<p>I wrote back:</p>
<blockquote class="eml"><p>On Mon, Jan 25, 2010 at 4:51 PM, Michael VanDeMar wrote:<br />
So, you&#8217;re not denying that is indeed your model here (rank first, quality later), you&#8217;re just saying that you think that&#8217;s it&#8217;s fine to do it that way, since you pay people to build it out within a couple of days after you start to get traffic?</p>
<p>Your assertion that a page on your devoid of content won&#8217;t rank is completely untrue, by the way. Generally people with quality content wind up ranking because other people link to them. That&#8217;s what the whole basis of Google&#8217;s base algorithm is built upon. Your site, however, gets pages ranked solely by virtue of other internal pages linking to them. You have so much link juice, due mostly to controversy and press, ranking power that you spread around your site from one page to another, that you can rank relatively competitive phrases with no effort at all. That does not somehow make the page quality due to some sort of mystical link juice feedback. Quality attracts link juice. Link juice does not impart quality.</p>
<p>Take a look at these pages, all which rank with pretty much nothing but links from within Mahalo itself:</p>
<p><a href="http://www.google.com/search?q=valentine%27s+day+cupcakes" target="_blank">http://www.google.com/search?q=valentine%27s+day+cupcakes</a><br />
#1: <a href="http://www.mahalo.com/valentines-day-cupcakes" target="_blank" rel="nofollow">http://www.mahalo.com/valentines-day-cupcakes</a><br />
1 external link, from a scraper: <a href="http://search.yahoo.com/search?p=link%3Ahttp%3A%2F%2Fwww.mahalo.com%2Fvalentines-day-cupcakes+-site%3Amahalo.com" target="_blank">http://search.yahoo.com/search?p=link%3Ahttp%3A%2F%2Fwww.mahalo.com%2Fvalentines-day-cupcakes+-site%3Amahalo.com</a></p>
<p><a href="http://www.google.com/search?hl=en&#038;safe=off&#038;num=10&#038;q=halo+3+walkthrough&#038;btnG=Search" target="_blank">http://www.google.com/search?hl=en&#038;safe=off&#038;num=10&#038;q=halo+3+walkthrough&#038;btnG=Search</a><br />
#5: <a href="http://www.mahalo.com/halo-3-walkthrough" target="_blank" rel="nofollow">http://www.mahalo.com/halo-3-walkthrough</a><br />
Again, only linked externally from another scraper: <a href="http://search.yahoo.com/search?p=link%3Ahttp%3A%2F%2Fwww.mahalo.com%2Fhalo-3-walkthrough+-site%3Amahalo.com" target="_blank">http://search.yahoo.com/search?p=link%3Ahttp%3A%2F%2Fwww.mahalo.com%2Fhalo-3-walkthrough+-site%3Amahalo.com</a></p>
<p><a href="http://www.google.com/search?q=how+to+invest+online" target="_blank">http://www.google.com/search?q=how+to+invest+online</a><br />
#8: <a href="http://www.mahalo.com/how-to-invest-online" target="_blank" rel="nofollow">http://www.mahalo.com/how-to-invest-online</a><br />
Zero links from external sites, even from scrapers: <a href="http://search.yahoo.com/search?p=link%3Ahttp%3A%2F%2Fwww.mahalo.com%2Fhow-to-invest-online+-site%3Amahalo.com" target="_blank">http://search.yahoo.com/search?p=link%3Ahttp%3A%2F%2Fwww.mahalo.com%2Fhow-to-invest-online+-site%3Amahalo.com</a></p>
<p>Google describes it&#8217;s ranking algorithm as something that leverages the democratic nature of the web. No one, however, is voting for your pages. Why should they rank? Simply because you paid someone to flesh them out?</p>
<p>    -Michael</p>
<p>PS. Do you mind if I blog this? Or are any of your answers being said with an expectation of privacy?</p></blockquote>
<blockquote class="eml"><p>On Mon, 2010-01-25 at 17:00 -0800, Jason Calacanis wrote:<br />
Actually, most of our pages come in VERY LARGE now because we don&#8217;t allow folks to just make them on the live site any more. They have to go through Mahalo Tasks now. in the old days it was a free for all like Wikipedia/Squidoo&#8230;. we didn&#8217;t like the results. </p>
<p>Very few of our pages start short. So, it&#8217;s not really a strategy to put up short pages and wait&#8230; our strategy is to get a TON of people to make pages in Mahalo Tasks. </p>
<p>Think about it: which is a better way to build a business&#8230; make a bunch of stubs or make a bunch of high-quality pages? The later is better and it&#8217;s really not very expensive. Especially now that we have revenue sharing with our page managers. </p>
<p>The three pages you&#8217;re talking about all have a LARGE amount of original content, and I would say they are B+ to A+ content, no? I think google and yahoo take into account a large amount of content plus the amount of time a user sits on a page. At least i&#8217;ve been told that they look at time spent on page. as you can imagine&#8230; the time spent on a recipe, walkthrough or howto article is VERY LONG. <img src='http://smackdown.blogsblogsblogs.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  </p>
<p>thanks for the feedback!</p>
<p>best j</p></blockquote>
<p>Again, let&#8217;s look at what he&#8217;s saying here:</p>
<p><strong>&#8220;we don&#8217;t allow folks to just make [pages] on the live site any more. They have to go through Mahalo Tasks now.&#8221;</strong></p>
<p>That&#8217;s right, Jason would like us to believe that the Mahalo pages, like the one on <a href="http://www.mahalo.com/13year-rape" target="_blank" rel="nofollow">13 year olds and rape</a>, all now pass an editorial review before going live. Even if that were true (which it isn&#8217;t, btw) I don&#8217;t think that I would be boasting that, based on the pages that I looked at.</p>
<p><strong>in the old days it was a free for all like Wikipedia/Squidoo&#8230;. we didn&#8217;t like the results.</strong></p>
<p>Now Jason Calacanis is saying that the 100% user contributed, zero ads Wikipedia is nothing more than a &#8220;free for all&#8221;, and that his 99% auto-generated content laden with AdSense is therefore better. Gotcha.</p>
<p><strong>&#8220;Very few of our pages start short.&#8221;</strong></p>
<p>Either Jason thinks he&#8217;s talking about a different site, or he has absolutely no concept of what the phrase &#8220;very few&#8221; actually means. Currently when I look, Google tells me that <a href="http://www.google.com/search?q=site%3Awww.mahalo.com" target="_blank">Mahalo has 356,000 pages indexed</a>:</p>
<p>&nbsp;</p>
<p><img src="/images/mahalo-site-20100219.png" onmouseup="hl2l(event);" alt="356,000 pages indexed, most of them scraped content"></p>
<p>&nbsp;</p>
<p>As I mentioned earlier, in contrast to &#8220;very few&#8221; of these pages being &#8220;short&#8221;, the vast majority of them are nothing more than scraped content. From what I can tell it looks like less than 4% of the Mahalo.com pages getting indexed on a daily basis actually involve humans adding unique content to them. The rest of the pages are being auto-generated by a Mahalo bot aptly named &#8220;searchclick&#8221;. If you go to any of these non-content pages and click on the &#8220;View Page History&#8221; link at the bottom of the right column you can see exactly who it is that generates the majority of this &#8220;Human Powered Search&#8221; website:</p>
<p>&nbsp;</p>
<p><img src="/images/13year-rape-history-sm.png" onmouseup="hl2l(event);" alt="History of 13 year old rape topic"></p>
<p>&nbsp;</p>
<p>&#8220;searchclick&#8221; is of course not an actual user. It&#8217;s the name of the robot that Mahalo uses to generate all of these pages on its site. According to <a href="http://www.mahalo.com/answers/mahalo-profile/who-is-the-user-searchclick" target="_blank" rel="nofollow">the Mahalo website</a>, </p>
<blockquote><p>Searchclick is not really a user at all. The name &#8220;searchclick&#8221; is used to represent a individual &#8220;search click&#8221; by a Mahalo user who has searched a particular term and Mahalo made a &#8220;created by searchclick&#8221; meaning the page had been searched for the first time.</p></blockquote>
<p>These pages on Mahalo that are getting indexed by the thousands are nothing more than searches that users performed that no one thought valuable enough to create a topic page for in the first place. The content used to populate these pages is nothing more than scraped versions of <em>other</em> websites&#8217; search pages, such as Youtube, Flickr, and Google itself. </p>
<p>Google is very clear on its viewpoint on indexing search pages. Matt Cutts, head of Google&#8217;s Webspam team, <a href="http://www.mattcutts.com/blog/search-results-in-search-results/" target="_blank">wrote about the subject</a> back in 2007:</p>
<blockquote><p>But just to close the loop on the original question on that thread and clarify that Google reserves the right to reduce the impact of search results and proxied copies of web sites on users, Vanessa also had someone add a line to the <a href="http://www.google.com/support/webmasters/bin/answer.py?answer=35769" target="_blank">quality guidelines</a> page. The new webmaster guideline that you&#8217;ll see on that page says &#8220;Use robots.txt to prevent crawling of search results pages or other auto-generated pages that don’t add much value for users coming from search engines.&#8221; &#8211; <em>Matt Cutts</em></p></blockquote>
<p>There really is not a whole lot of room for ambiguity in what Matt is saying.</p>
<p>As it turns out, Mahalo can not block these pages using robots.txt. They go out of their way to make these search generated pages blend in with every other page on their site. Not only are they all located in the same root directory, there is nothing at all that allows the fact that they are auto-generated to be detected in any sort of machine readable way. Is this by accident? From the way Jason talks, you would think so. <em>However&#8230;</em></p>
<p>The mere creation of pages is not enough to get them discovered and indexed in the search engines. The search engine spiders must have some means of locating these pages in order to know that they exist. Most sites simply rely on links from other pages on their sites, or links from other people&#8217;s sites, for their content to get noticed. There are alternate means though. For instance, Google allows webmasters to set up special sitemaps, meant for search engine spiders alone, in order to ensure that sets of pages that the webmaster considers important do in fact get found. These spider-only sitemaps are in XML format, and are not meant for human visitors to use. Does Mahalo use XML sitemaps? </p>
<p>Yep, they sure do. In fact, they use a dynamically generated sitemap, one that is generated on the fly with a url parameter, &#8220;p=&#8221;, used to distinguish which page of the sitemap you want to view. If we actually open up Mahalo&#8217;s sitemap, what do we see?</p>
<p>&nbsp;</p>
<p><img src="/images/mahalo-spam-in-xml-sm.png" onmouseup="hl2l(event);" alt="Spam pages included in Mahalos xml sitemap"></p>
<p>&nbsp;</p>
<p>That&#8217;s right&#8230; Mahalo&#8217;s XML sitemap is packed with tons of spammy auto-generated pages for Google to find. Not only is Mahalo <em>not</em> blocking these pages from getting indexed, they are actually <em>going out of their way</em> to make sure Googlebot finds them.</p>
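<p>For anyone who has never opened one up, here is a rough Python sketch of what a spider-only sitemap boils down to: a flat list of <em>loc</em> entries in the sitemaps.org namespace, which a crawler reads and queues. The helper names and the example.com URLs are made up for illustration. This is not Mahalo&#8217;s code, just the general shape of the format.</p>

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Serialize a list of page URLs as a minimal XML sitemap string."""
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in urls:
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
    return ET.tostring(urlset, encoding="unicode")


def listed_urls(sitemap_xml):
    """Return every URL a spider would discover from the sitemap."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text for loc in root.iter(f"{{{SITEMAP_NS}}}loc")]


# Hypothetical pages: whatever goes into the sitemap is what gets advertised.
pages = [
    "http://www.example.com/halo-3-walkthrough",
    "http://www.example.com/some-auto-generated-search",
]
sitemap_xml = build_sitemap(pages)
print(listed_urls(sitemap_xml))
```

<p>The point of the sketch is simply that nothing lands in a sitemap by itself. Something on the site has to decide which URLs get written into it.</p>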
<p>While Jason&#8217;s assertion that these pages don&#8217;t directly generate a ton of traffic for him may in fact be true, they do play a very important role in his overall ranking scheme. With nothing but the power of the domain they reside on, many of these low competition, zero traffic phrases will actually result in rankings. On occasion you will find spammers scraping these rankings from Google, which then results in a very low pagerank link back to Mahalo.com. Individually each of these links adds almost no value. However, as I mentioned earlier, Mahalo has <em>hundreds of thousands</em> of these pages indexed. In conjunction with link juice from 1 or 2 links from his own blog, often times Jason can now get some moderately competitive phrases ranked with no effort, or votes from the rest of the world, at all. For instance, currently at the top of many pages on Mahalo you will see a link to the page for <a href="http://www.mahalo.com/robert-kissel" target="_blank" rel="nofollow">Robert Kissel</a>, who happens to be in the news currently. Search on Google for him and you can see that Mahalo page is currently in the top 10. If we look at the links pointed to this page we see that aside from interior pages on Mahalo.com, exactly <a href="http://search.yahoo.com/search?p=link%3Ahttp%3A%2F%2Fwww.mahalo.com%2Frobert-kissel+-site%3Amahalo.com" target="_blank">4 sites</a> link to it, <em>3 of which are from scraping Google&#8217;s results</em>:</p>
<p>&nbsp;</p>
<p><img src="/images/mahalo-links-robert-kissel-sm.png" onmouseup="hl2l(event);" alt="Links to Robert Kissel on Mahalo"></p>
<p>&nbsp;</p>
<p>Even if we are to believe that Jason himself believes the naive assertion that the sheer amount of content or time a visitor spends on a page are major ranking factors, we all know that those alone will not get you ranked, even if your content truly is quality. As to Jason&#8217;s statement that the pages on Mahalo are &#8220;quality&#8221; and deserve to rank? In most of the cases that I saw, and I am talking about the &#8220;human powered&#8221; ones now, they simply aren&#8217;t. Take for instance Mahalo&#8217;s top 10 ranking for [<a href="http://www.google.com/search?q=need+for+speed+prostreet+walkthrough" target="_blank">need for speed prostreet walkthrough</a>]. If you look at the page <em>there isn&#8217;t even a walkthrough on it</em>:</p>
<p>&nbsp;</p>
<p><img src="/images/mahalo-need-for-speed-walkthrough-sm.png" onmouseup="hl2l(event);" alt="Need for speed walkthrough my ass!"></p>
<p>&nbsp;</p>
<p>Aside from that tiny little blurb and links scraped from the search engines, the only thing on that page is an embedded video of a walkthrough that was made by someone other than a Mahalo user&#8230; and that video doesn&#8217;t even exist anymore. If Mahalo used content that they generated and hosted themselves that would never be an issue.</p>
<p>At this point in the game of course, simply noindexing the pages in and of itself is not really a solution. Unless Mahalo moves the search auto-generated content into its own subdirectory so it can be blocked by robots.txt, noindexes <em>and</em> nofollows the existing pages (to prevent grabbing unwarranted juice from serps scrapers), and removes them from their sitemap, then they are in clear violation of Google&#8217;s Webmaster Guidelines. Why Google won&#8217;t actually take action on them is anyone&#8217;s guess.</p>
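<p>To show why the subdirectory matters, here is a small Python sketch using the standard library&#8217;s robots.txt parser. The <em>/search-generated/</em> directory and the example.com URLs are hypothetical, invented purely for illustration. The idea is that a single Disallow rule can only fence off auto-generated pages if they live under a path of their own.</p>

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: auto-generated search pages live in their own
# directory, so one rule covers all of them.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search-generated/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A page inside the (made up) auto-generated directory is blocked...
blocked = parser.can_fetch(
    "Googlebot", "http://www.example.com/search-generated/some-query"
)

# ...while an editorial page in the root directory stays crawlable.
allowed = parser.can_fetch(
    "Googlebot", "http://www.example.com/halo-3-walkthrough"
)

print("auto-generated page crawlable:", blocked)
print("editorial page crawlable:", allowed)
```

<p>When both kinds of pages sit side by side in the root directory, there is no path prefix to write a rule against, short of listing every URL one by one.</p>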
]]></content:encoded>
			<wfw:commentRss>http://smackdown.blogsblogsblogs.com/2010/02/22/apparently-jason-calacanis-knows-hes-spamming-he-just-thinks-its-no-big-deal/feed/</wfw:commentRss>
		<slash:comments>14</slash:comments>
		</item>
		<item>
		<title>Why The Renewed Interest In The Linkscape Scams And Deception..?</title>
		<link>http://smackdown.blogsblogsblogs.com/2010/01/22/why-the-renewed-interest-in-the-linkscape-scams-and-deception/</link>
		<comments>http://smackdown.blogsblogsblogs.com/2010/01/22/why-the-renewed-interest-in-the-linkscape-scams-and-deception/#comments</comments>
		<pubDate>Fri, 22 Jan 2010 22:15:25 +0000</pubDate>
		<dc:creator>Michael VanDeMar</dc:creator>
				<category><![CDATA[blogthropology]]></category>
		<category><![CDATA[lackofmeds]]></category>
		<category><![CDATA[psychoblogging]]></category>
		<category><![CDATA[scams]]></category>
		<category><![CDATA[SEO]]></category>
		<category><![CDATA[WTF]]></category>

		<guid isPermaLink="false">http://smackdown.blogsblogsblogs.com/?p=433</guid>
		<description><![CDATA[Yesterday a friend of mine, Sebastian, wrote a post titled, &#8220;How do Majestic and LinkScape get their raw data?&#8220;. Basically it is a renewed rant about SEOmoz and their deceptions surrounding the Linkscape product that they launched back in October 2008, a little over 15 months ago. The controversy is based around the fact that [...]]]></description>
			<content:encoded><![CDATA[<p>Yesterday a friend of mine, <a href="http://twitter.com/SebastianX" target="_blank">Sebastian</a>, wrote a post titled, &#8220;<a href="http://sebastians-pamphlets.com/linkscape-opensiteexplorer-majestic-data-sources-shady-or-not/" target="_blank">How do Majestic and LinkScape get their raw data?</a>&#8221;. Basically it is a renewed rant about SEOmoz and their deceptions surrounding the Linkscape product that they launched back in October 2008, a little over 15 months ago. The controversy is based around the fact that moz basically lied about exactly how they were obtaining their data, which in part was probably motivated by wanting to make themselves look more technically capable than they actually are.</p>
<p>Now, I covered this back when the launch actually happened, in <a href="http://smackdown.blogsblogsblogs.com/2008/10/17/how-to-block-the-bots-seomoz-isnt-telling-you-about/" target="_blank">this Linkscape post</a>, resulting in quite a few comments, and there was more than a little heated conversation in the <a href="http://sphinn.com/story/79700" target="_blank">Sphinn thread</a> as well. This prompted some people, both on Sebastian&#8217;s post and in the Sphinn thread on it, to ask <a href="http://sebastians-pamphlets.com/linkscape-opensiteexplorer-majestic-data-sources-shady-or-not/#comment-2184" target="_blank">why all of the renewed interest</a>?</p>
<blockquote><p>It is not extreme, its just that it isn’t new. The fact that they bought the index (partially)? That was known from the beginning. The fact that they don’t provide a satisfying way of blocking their bots (or the fact that they didn’t want to reveal their bots user agent)? Check. The fact that they make hyped statements to push Linkscape? Check. {&#8230;} I don’t get the renewed excitement. &#8211; <em>Branko, aka <a href="http://www.seo-scientist.com/" target="_blank">SEO Scientist</a></em></p></blockquote>
<p>Well, I guess you could say that it&#8217;s my fault. Or, you could blame it on SEOmoz themselves, or their employees, depending on how you look at it. You see, the story goes like this&#8230;</p>
<p>Back when SEOmoz first launched Linkscape, it would have been damn near impossible for a shop their size to have performed the feats they were claiming, all on their own. Rand was making the claim &#8220;Yes &#8211; We spidered all 30 billion pages&#8221;. He also claimed to have done it within &#8220;several weeks&#8221;. Now, even if we stretch &#8220;several&#8221; to mean something that it normally would not, say, 6 (since a 6 week update period is now what they are claiming for the tool), we&#8217;re still talking a huge amount of resources to accomplish that task. A conservative estimate of the average web page, considering only html, is 25KB of text:</p>
<p>30,000,000,000 pages x (25 x 1024) bytes per page = 768,000,000,000,000 bytes of data (768 trillion bytes, which is 698.4TB)</p>
<p>(698.4TB / 45 days of crawling) x 30 days in a month = 465.6TB bandwidth per month</p>
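<p>If you want to check the math yourself, here is the same back-of-the-envelope estimate as a short Python sketch. The inputs are the ones used above (30 billion pages, 25KB each, a 45 day crawl, a 30 day month), with 1TB counted as 1024<sup>4</sup> bytes. The function name is my own, and the 25KB figure is still just an estimate.</p>

```python
PAGES = 30_000_000_000        # pages claimed to have been spidered
BYTES_PER_PAGE = 25 * 1024    # estimated 25KB of html per page
CRAWL_DAYS = 45               # crawl window (roughly 6 weeks)
DAYS_PER_MONTH = 30
BYTES_PER_TB = 1024 ** 4      # binary terabyte, as used in the figures above


def crawl_bandwidth(pages, bytes_per_page, crawl_days):
    """Return (total bytes, total TB, TB per 30 day month) for one full crawl."""
    total_bytes = pages * bytes_per_page
    total_tb = total_bytes / BYTES_PER_TB
    monthly_tb = total_tb / crawl_days * DAYS_PER_MONTH
    return total_bytes, total_tb, monthly_tb


total_bytes, total_tb, monthly_tb = crawl_bandwidth(PAGES, BYTES_PER_PAGE, CRAWL_DAYS)
print(f"{total_bytes:,} bytes")
print(f"{total_tb:.2f} TB total")
print(f"{monthly_tb:.2f} TB per month")
```

<p>Rounded to one decimal place this comes out to about 698.5TB and 465.7TB per month. The figures above are the same numbers with the extra decimals chopped off rather than rounded.</p>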
<p>Now, I know that one of the reasons that Rand can get away with some of his claims is that most people just don&#8217;t grasp the sheer size<span id="more-433"></span> of those numbers. In today&#8217;s age, bandwidth is cheap, with many hosts even boasting of unmetered, or unlimited, bandwidth on their accounts, and computers are fast. But in reality the reason hosts can make those claims is that in all likelihood no one on a shared server or a cluster will ever hit their bandwidth limit, because their processor usage will cause them to go over their limits way before actual data transfer becomes an issue. On dedicated servers, where the resources are not shared, hosts actually care about how much bandwidth you use. For instance, last August <a href="http://www.shareasale.com/r.cfm?b=106084&#038;u=189767&#038;m=15362&#038;urllink=&#038;afftrack=" target="_blank">The Planet</a> (one of the best hosts I know of for dedicated servers) upgraded their plans to offer 10TB/month at no additional cost. Prior to that they only included 1TB with their plans. On most hosts the charges for people who go over their bandwidth allotment are usually rather steep. </p>
<p>This means that basically for what Rand was claiming to be 100% true, they pretty much would have needed to own their own datacenter. Now, these days, of course, there is another option. Five months ago a new company, named <a href="http://80legs.com/" target="_blank">80legs</a>, came out of beta. With 80legs pretty much anyone can build their own spiders, run them on 80legs&#8217; servers, and spider 2 billion pages per day. They can do this of course because they rent the service out to many people, it&#8217;s not just one company powering one link tool. However, 15 months ago when moz launched their tool, 80legs wasn&#8217;t an option.</p>
<p>So, I called them on their claims, and a bit of controversy followed from it. Moz refused to clearly identify how they were actually gathering the data, and would not release information on how to keep whatever spiders were being used off of their sites. They did release a list of fairly widespread bots, and suggested that if you wanted to keep SEOmoz from scraping your sites via robots.txt, well, then, you&#8217;re just going to have to block Google, Yahoo, and MSN as well. They also came up with their lame assed version of a &#8220;solution&#8221; to people&#8217;s concerns, and stated that people could also add an SEOmoz meta tag to their pages to keep them from being indexed (which would not, however, keep them from being crawled in the first place). Despite the fact that many webmasters made it clear that this was unacceptable, to date nothing about that situation has changed. They still do not offer a clear concise way to allow webmasters to instruct SEOmoz to not spider their site, or give people an option to keep information about their site from showing in the Linkscape data.</p>
<p>The thread on Sphinn went where it did, and the next day one of the admins decided to close the discussion, even though it was far from being resolved. No more comments were allowed. Period. End of story.</p>
<p>I moved on.</p>
<p>Fast forward 15 months. I get an email from SEOmoz, touting their new tool, which is apparently powered from the Linkscape index. So, I trot on over and take a look. There, on the front page, are their same outrageous claims&#8230; only more so. The graphic states that in the past 45 days, they have crawled 700 Billion Links, 55 Billion URLs, and 63 Million Root Domains:</p>
<p><img src="/images/ls-crawl-stats.png" onmouseup="hl2l(event);"></p>
<p>Now, I and others, when this first happened, took <a href="http://sebastians-pamphlets.com/crawling-vs-indexing/" target="_blank">Rand to task</a> for trying to interchange &#8220;crawling&#8221; with &#8220;indexing&#8221;. Therefore, when he states in that graphic that they &#8220;crawled&#8221; 700 billion links in 45 days it&#8217;s not because he&#8217;s too stupid to know the difference. The SEOmoz employees know very well that while they may have &#8220;found&#8221; a huge number of links in their index, they did not crawl them. This is actually aside from whether or not it was them who did the actual crawling. Of course, they do try to toss in some confusion there, just in case someone calls them on their bullshit again, by stating that they crawled 55 billion urls at the same time, as if there is some sort of relevant distinction between a url and a link&#8230; which, for crawling purposes, there isn&#8217;t. The only real way there would be a difference is if they were trying to say that 645 billion of the links they found were mailto: or javascript: links, but even if that were the case, you wouldn&#8217;t &#8220;crawl&#8221; those anyways.</p>
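<p>For a sense of scale, here is a quick back-of-the-envelope sketch in Python. The 55 billion URLs and the 45 days come straight from the graphic. The 25 KB average page size is purely an assumption for illustration, not a measured figure, so treat the transfer total as a rough order of magnitude only.</p>

```python
# Back-of-the-envelope check of the crawl rate implied by the graphic.
# Figures from the graphic: 55 billion URLs in 45 days.
# ASSUMPTION: the average page size is a guess for illustration only.

SECONDS_PER_DAY = 24 * 60 * 60


def pages_per_second(total_pages, days):
    """Average fetch rate needed to crawl total_pages within the given days."""
    return total_pages / (days * SECONDS_PER_DAY)


def terabytes_transferred(total_pages, avg_page_kb):
    """Total transfer in decimal TB for an assumed average page size in KB."""
    return total_pages * avg_page_kb / 1_000_000_000


claimed_urls = 55_000_000_000
window_days = 45
assumed_avg_page_kb = 25  # hypothetical value, not from the source

rate = pages_per_second(claimed_urls, window_days)
transfer_tb = terabytes_transferred(claimed_urls, assumed_avg_page_kb)

print(f"{rate:,.0f} pages per second, around the clock")
print(f"{transfer_tb:,.0f} TB transferred at {assumed_avg_page_kb} KB per page")
```

<p>That works out to roughly 14,146 fetches every second for 45 days straight. Under the assumed page size it also comes to about 1,375 TB of transfer, against the 10TB/month that a generous dedicated server plan includes.</p>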
<p>So, upon seeing this I of course get irked all over again. I went back and revisited the <a href="http://sphinn.com/story/79700" target="_blank">unresolved Sphinn thread</a> that had gotten locked, just to refresh my memory of how the conversation went. I got to the end of the conversation, and I saw something that struck me as just a teensy bit odd:</p>
<p><img src="/images/after-the-fact.png" onmouseup="hl2l(event);"></p>
<p>Wtf? Apparently Scott Willoughby (<em>note: please see update below</em>), an employee of SEOmoz, contacted an admin or mod on Sphinn a little over 5 months ago, 9 months <em>after</em> the conversation ended, and had them unlock the thread, all so he could post this way out of left field comment calling me a liar, and then had them lock it again. I mean, seriously. Why the hell would someone do that? A little over 5 months ago&#8230; hm&#8230; what happened 5 months ago&#8230; wait! Wasn&#8217;t that when 80legs.com went live? I wonder&#8230;</p>
<p>So, off I went to look at the list of &#8220;sources&#8221; that SEOmoz had listed on Linkscape. Lo and behold, there it was:</p>
<p><img src="/images/new-ls-source.png" onmouseup="hl2l(event);"></p>
<p>So it seems that what happened is that in the summer of 2009 SEOmoz learned that there was a new service about to go live, one that, had it existed way back when Linkscape launched, would have provided an alibi for moz&#8217;s claims, one that would at least put them in the realm of feasibility. Therefore they went through the effort of having the thread re-opened, just so that someone could post one more claim that yes, they actually did crawl their own data. Of course, this still doesn&#8217;t explain a damn thing about what user agent they were (or are, for that matter) using, or how to keep those bots from hitting your site. Apparently someone in the organization felt strongly enough that it is possible to have future technology retroactively bolster bullshit claims that they actually went down the path of trying to cover their tracks that way.</p>
<p>I sent some messages to Sebastian about all this, since I knew he&#8217;d get a kick out of them yet again trying to confuse people about crawling vs. indexing, and that prompted him to blog about the whole thing again. </p>
<p>On a side note, I do want to address a recurring theme that keeps coming up in the comments throughout this whole issue. Some people are asking, if the tool is useful, who cares if they lie to promote it? Without getting into the whole argument over whether or not link intelligence is worth $800/year when the majority of it is available for free, there are both ethical as well as legal ramifications about what SEOmoz is doing. One of the biggest selling points for this is that this data is presented with SEOmoz&#8217;s own metric, something that they have dubbed as mozRank (mR). This metric is exclusive to SEOmoz, and <em>only holds value if it&#8217;s not more made up bullshit</em>. If they do indeed get exposed for selling snake oil, then anything sold under the pretext of &#8220;we&#8217;re experts&#8230; trust us!&#8221; becomes worthless. </p>
<p>Additionally, they are still gathering this data without full disclosure on how to keep their alleged bots off of our servers, and therefore doing so without our permission. According to the Revised Code of Washington <a href="http://apps.leg.wa.gov/RCW/default.aspx?cite=9A.52.110" target="_blank">9A.52.110</a> (SEOmoz is headquartered in WA), <strong>Computer trespass in the first degree</strong>:</p>
<blockquote><p>(1) A person is guilty of computer trespass in the first degree if the person, without authorization, intentionally gains access to a computer system or electronic database of another; and</p>
<p>&nbsp;&nbsp;(a) The access is made with the intent to commit another crime; or</p>
<p>&nbsp;&nbsp;(b) The violation involves a computer or database maintained by a government agency.</p>
<p>(2) Computer trespass in the first degree is a class C felony.</p></blockquote>
<p>So, it&#8217;s only a crime to deliberately scrape people&#8217;s content if you are doing so in conjunction with committing a crime. According to RCW <a href="http://apps.leg.wa.gov/rcw/default.aspx?cite=9.04.050" target="_blank">9.04.050</a> <strong>False, misleading, deceptive advertising</strong>:</p>
<blockquote><p>It shall be unlawful for any person to publish, disseminate or display, or cause directly or indirectly, to be published, disseminated or displayed in any manner or by any means, including solicitation or dissemination by mail, telephone, electronic communication, or door-to-door contacts, any false, deceptive or misleading advertising, with knowledge of the facts which render the advertising false, deceptive or misleading, for any business, trade or commercial purpose or for the purpose of inducing, or which is likely to induce, directly or indirectly, the public to purchase, consume, lease, dispose of, utilize or sell any property or service, or to enter into any obligation or transaction relating thereto: PROVIDED, That nothing in this section shall apply to any radio or television broadcasting station which broadcasts, or to any publisher, printer or distributor of any newspaper, magazine, billboard or other advertising medium who publishes, prints or distributes, such advertising in good faith without knowledge of its false, deceptive or misleading character.</p></blockquote>
<p>While many of us may take it in stride that we will get lied to when people try and sell us things, trust me, it still does not make it acceptable, and there is law that backs that up.</p>
<p><strong><a name="update1" class="nolink">Update:</a></strong> Apparently there was a glitch in Sphinn when they migrated to new software. The comment that I accused Scott Willoughby of making 9 months after the conversation had been closed (which would have required the involvement of a Sphinn employee) was in fact a Desphinn that he made at the time the post was first submitted. This glitch caused that and 1,530 <em>other</em> Desphinns to all incorrectly get imported as comments&#8230; and all with the exact same timestamp, i.e. 7/14/2009. Whoops. <img src='http://smackdown.blogsblogsblogs.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>Thank you to <a href="http://sphinn.com/user/Michelle/" target="_blank">Michelle Robbins</a>, Third Door Media&#8217;s Director of Technology, for discovering how that actually happened. It does prove that not all conspiracy theories are true. <img src='http://smackdown.blogsblogsblogs.com/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' />  I do, however, stand by the rest of the post.</p>
]]></content:encoded>
			<wfw:commentRss>http://smackdown.blogsblogsblogs.com/2010/01/22/why-the-renewed-interest-in-the-linkscape-scams-and-deception/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Robert Scoble Chews Out Lisa Barone&#8217;s Ass For Taking His Name In Vain &#8211; WTF?</title>
		<link>http://smackdown.blogsblogsblogs.com/2009/03/02/robert-scoble-chews-out-lisa-barones-ass-for-taking-his-name-in-vain-wtf/</link>
		<comments>http://smackdown.blogsblogsblogs.com/2009/03/02/robert-scoble-chews-out-lisa-barones-ass-for-taking-his-name-in-vain-wtf/#comments</comments>
		<pubDate>Tue, 03 Mar 2009 04:36:34 +0000</pubDate>
		<dc:creator>Michael VanDeMar</dc:creator>
				<category><![CDATA[blogthropology]]></category>
		<category><![CDATA[lackofmeds]]></category>
		<category><![CDATA[On The Ball-ness]]></category>
		<category><![CDATA[psychoblogging]]></category>
		<category><![CDATA[Social Media]]></category>

		<guid isPermaLink="false">http://smackdown.blogsblogsblogs.com/?p=292</guid>
		<description><![CDATA[Tonight Robert &#8216;I Am Thy Lord And Thou Shalt Kneel, Bitches!&#8217; Scoble, a blogger who has some claim to internet fame through his blog Scobleizer, decided that the title of &#8220;technical evangelist&#8221; that has been often attributed him simply wasn&#8217;t enough, and that deity is apparently more fitting. Lisa Barone wrote a piece talking about [...]]]></description>
			<content:encoded><![CDATA[<p>Tonight <a href="http://twitter.com/Scobleizer" target="_blank">Robert &#8216;I Am Thy Lord And Thou Shalt Kneel, Bitches!&#8217; Scoble</a>, a blogger who has some claim to internet fame through his blog Scobleizer, decided that the title of &#8220;<a href="http://en.wikipedia.org/wiki/Robert_Scoble" target="_blank">technical evangelist</a>&#8221; that has often been attributed to him simply wasn&#8217;t enough, and that deity is apparently more fitting.</p>
<p>Lisa Barone <a href="http://outspokenmedia.com/branding/false-idols/" target="_blank">wrote a piece</a> talking about personal brands and false idols on the web. In it she wrote the following paragraph:</p>
<blockquote><p>Don&#8217;t support personal brands built on smoke and mirrors. Make people work for the brands they&#8217;re trying to create. Don&#8217;t let them <strong>scoble</strong> their way in. Don&#8217;t accept that someone is important just because they act like they are or someone told you they were.</p></blockquote>
<p>Apparently Robert is the ultra sensitive type, and didn&#8217;t take too kindly<span id="more-292"></span> to her choice of wordage. Here is his reply:</p>
<p><img src="/images/scoble.png" alt="...you might do some research behind how I actually got here before you take my name in vain. - Robert Scoble" onmouseup="hl2l(event);" class="centered"></p>
<p>Wow, Bob. Way to identify with the lowly masses out there. <img src='http://smackdown.blogsblogsblogs.com/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://smackdown.blogsblogsblogs.com/2009/03/02/robert-scoble-chews-out-lisa-barones-ass-for-taking-his-name-in-vain-wtf/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>My Blog Hacked, Yet Again &#8211; WordPress 2.6.5 Vulnerability / Exploit?</title>
		<link>http://smackdown.blogsblogsblogs.com/2009/01/16/my-blog-hacked-yet-again-wordpress-265-vulnerability-exploit/</link>
		<comments>http://smackdown.blogsblogsblogs.com/2009/01/16/my-blog-hacked-yet-again-wordpress-265-vulnerability-exploit/#comments</comments>
		<pubDate>Fri, 16 Jan 2009 20:51:16 +0000</pubDate>
		<dc:creator>Michael VanDeMar</dc:creator>
				<category><![CDATA[blogthropology]]></category>
		<category><![CDATA[coding]]></category>
		<category><![CDATA[psychoblogging]]></category>
		<category><![CDATA[web design]]></category>
		<category><![CDATA[Wordpress]]></category>

		<guid isPermaLink="false">http://smackdown.blogsblogsblogs.com/?p=218</guid>
		<description><![CDATA[Again, I&#8217;ve been hacked. Well, not me personally&#8230; I wear the most up to date tinfoil attire, I assure you, and no one is getting into my head&#8230; but my blog was. This time I was running WordPress 2.6.5 when it happened. Those who know me know that I always prefer to do manual upgrades, [...]]]></description>
			<content:encoded><![CDATA[<p><img src="/images/broken-wordpress-lock.png" border="0" alt="Busted WordPress security." style="float: right;"  onmouseup="hl2l(event);"> Again, I&#8217;ve been hacked. Well, not me personally&#8230; I wear the most up to date tinfoil attire, I assure you, and no one is getting into <em>my</em> head&#8230; but my blog was. This time I was running WordPress 2.6.5 when it happened. </p>
<p>Those who know me know that I always prefer to do manual upgrades, wiping everything out and starting over completely fresh each time, whether I have been hacked or not. This way if there was an intrusion it should still <a href="" target="_blank">clean the hack</a> out completely, even if I don&#8217;t know it&#8217;s there. As it happens, when I upgraded to 2.6.5 from 2.6.2 I did not do this. I merely upgraded the 2 files involved in the security portion of the <a href="http://wordpress.org/development/2008/11/wordpress-265/" target="_blank">WP 2.6.5 upgrade</a> (which were wp-includes/feed.php and wp-includes/version.php). However, <span id="more-218"></span>to date those are still the only two files from that version with a security risk according to WordPress, and I upgraded them well before I was hacked.</p>
<p>I noticed something was wrong earlier this week, after I wrote the post on how to easily find <a href="http://smackdown.blogsblogsblogs.com/2009/01/12/how-to-find-the-best-free-imagephotographics-downloads-for-your-blog-posts/" target="_blank">free photo downloads</a> for your blog posts. I was checking to see if the post had any rankings a couple of days later, when I noticed that it wasn&#8217;t showing in Google. I don&#8217;t mean that it wasn&#8217;t ranking, either&#8230; I mean it wasn&#8217;t showing at all. I checked, and sure enough Google had definitely cached the post shortly after I had published it. It was showing when I did a site: command too. I realized that something was weird, however, when I saw that the description for my homepage still showed my post from November as being the snippet in the serps:</p>
<p><img src="/images/smackdown-old-snippet.png" alt="Old Smackdown snippet showing in Google" onmouseup="hl2l(event);"></p>
<p>I checked that as well, and just as I thought they had already re-cached the homepage too, which means that showing the old snippet made no sense:</p>
<p><img src="/images/smackdown-recent-cache-sm.png" alt="Freshly cached post in Google" onmouseup="hl2l(event);"></p>
<p>Neither the post itself nor the homepage were showing in the serps at all, even for exact phrases unique to those pages, phrases that <em>were</em> showing in Google&#8217;s cache, and therefore should have been searchable. My first thought was that I had been penalized for some reason (the conspiracy theorist in me even considered it might be the &#8220;PageRank for Sale&#8221; alt text on an image from November&#8217;s post <img src='http://smackdown.blogsblogsblogs.com/wp-includes/images/smilies/icon_razz.gif' alt=':P' class='wp-smiley' />  ), so I sent off a couple of <a href="http://twitter.com/mvandemar" target="_blank">Tweets</a> asking people if they saw anything I might have missed. </p>
<p>Luckily, one of the people listening who was kind enough to respond was <a href="http://twitter.com/JohnMu" target="_blank">John Mueller</a>. After telling me that I need to upgrade my tinfoil hat to a <a href="" target="_blank">Mu-metal one</a> (heh, thanks John! <img src='http://smackdown.blogsblogsblogs.com/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' />  ), he found the issue fairly quickly:</p>
<p><img src="/images/johnmu-smackdowns-hacked-sm.png" alt="Link to the text-only version of Googles cache of my pages" onmouseup="hl2l(event);"></p>
<p>The link he gave me pointed to the text-only version of Google&#8217;s <a href="http://cli.gs/sLVArj" target="_blank">cache of my homepage</a>, and when I scrolled down, sure enough there it was:</p>
<p><img src="/images/smackdown-hacked-2.6.5.png" alt="Text-only version of Googles cache of Smackdowns homepage" onmouseup="hl2l(event);"></p>
<p>I upgraded last night (complete wipe and reinstall this time), but I&#8217;m a little concerned still. Since there has still been no word from WordPress about 2.6.5 being vulnerable, that may mean that it is something that they are completely unaware of, and therefore was carried over into WordPress 2.7. I did some research on other hacked blogs, and while I did find one other 2.6.5 blog and one 2.7, comparing their caches against other older cached pages on the same site it looks like both of those were hacked prior to them upgrading. If anyone else finds more information about blogs that have gotten hacked <em>after</em> upgrading to WP 2.6.5 or above, please let me know.</p>
<div><em>Original <a title="rusty-lock" href="http://flickr.com/photos/8323834@N07/500995147/">rusty lock image</a> by <a href="http://www.subcircle.co.uk/">subcircle</a></em></div>
]]></content:encoded>
			<wfw:commentRss>http://smackdown.blogsblogsblogs.com/2009/01/16/my-blog-hacked-yet-again-wordpress-265-vulnerability-exploit/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>How To Remove Your Website From Linkscape *Without* An SEOmoz Meta Tag</title>
		<link>http://smackdown.blogsblogsblogs.com/2008/10/21/how-to-remove-your-website-from-linkscape-without-an-seomoz-meta-tag/</link>
		<comments>http://smackdown.blogsblogsblogs.com/2008/10/21/how-to-remove-your-website-from-linkscape-without-an-seomoz-meta-tag/#comments</comments>
		<pubDate>Tue, 21 Oct 2008 07:55:50 +0000</pubDate>
		<dc:creator>Michael VanDeMar</dc:creator>
				<category><![CDATA[blogthropology]]></category>
		<category><![CDATA[lackofmeds]]></category>
		<category><![CDATA[nerdiness]]></category>
		<category><![CDATA[On The Ball-ness]]></category>
		<category><![CDATA[psychoblogging]]></category>
		<category><![CDATA[scams]]></category>
		<category><![CDATA[SEO]]></category>
		<category><![CDATA[Social Media]]></category>
		<category><![CDATA[web design]]></category>

		<guid isPermaLink="false">http://smackdown.blogsblogsblogs.com/?p=156</guid>
		<description><![CDATA[Over the past couple of weeks, one of the biggest concerns about SEOmoz&#8217;s new Linkscape tool (which I recently blogged about in reference to the bots that Rand refuses to identify, and then again due to suspicious additions of a phantom 7 billion pages to one of his index sources) has been the complete lack [...]]]></description>
			<content:encoded><![CDATA[<p><img src="/images/gavel.jpg" border="0" alt="You do have rights to your content." style="float: right;"  onmouseup="hl2l(event);"> Over the past couple of weeks, one of the biggest concerns about SEOmoz&#8217;s new Linkscape tool (which I recently blogged about in reference to the <a href="/2008/10/17/how-to-block-the-bots-seomoz-isnt-telling-you-about/" target="_blank">bots that Rand refuses to identify</a>, and then again due to <a href="/2008/10/20/how-to-add-7-billion-pages-to-your-index-overnight" target="_blank">suspicious additions of a phantom 7 billion pages</a> to one of his index sources) has been the complete lack of a method available for someone to remove their data from the tool. Assuming that all of the hints Rand has been so &#8220;subtly&#8221; dropping are accurate, and the one bot that they do actually have control over is in fact <a href="http://www.dotnetdotcom.org/" target="_blank">DotBot</a>, then from the beginning the data was collected under false pretenses. The DotBot website clearly states<span id="more-156"></span> the following as its purpose:</p>
<blockquote><p>Our purpose is rather simple. We want to make the internet as open as possible. Currently only a select few corporations have a complete and useful index of the web. Our goal is to change that fact by crawling the web and releasing as much information about its structure and content as possible. We plan on doing this in a manner that will cover our costs (selling our index) and releasing it for free for the benefit of all webmasters.</p></blockquote>
<p>If, again, DotBot is owned by SEOmoz, then the actual goal of collecting those webpages was the development of a commercial tool. With that in mind, Rand&#8217;s refusal to remove pages from the index that the owners do not want in there takes on a whole new level of unreasonableness. When <a href="http://sphinn.com/story/80142#c56146" target="_blank">pressed about it</a>, this is the most Rand is willing to compromise as far as removing sites from the index:</p>
<blockquote><p>3)SEOmoz will ONLY remove your site from DISPLAYING your data through Linkscape if you add a customized SEOmoz meta tag to each and every page on your site, and even then, only after a 30-60 day time period.</p>
<p>Yes, although we are looking at ways to block an entire site from being shown in the future through a registration system. And yes, we can&#8217;t block anything until we&#8217;ve re-crawled and re-indexed that page, which can take 30-60 days depending on the speed with which we crawl/re-crawl a given URL.</p>
<p>4)SEOmoz is &#8220;unwilling to provide a clear concise way to keep data out of Linkscape.&#8221;</p>
<p>That&#8217;s what you said, and I merely copied it to point out that it had an exception. I know it&#8217;s a fun soundbyte, but without the important caveat in the sentence it was in, it&#8217;s really unfair to keep using this phrase. That caveat is that we are willing to provide one clear, concise way to keep data out of Linkscape &#8211; the seomoz noindex meta tag.</p></blockquote>
<p>So, the only way Rand will <em>voluntarily</em> remove your site from his index is if you agree to basically brand your website with a meta tag using his company name, and then wait 30-60 days. Unfortunately for him, that&#8217;s really not his call.</p>
<p>You own your website and the data it contains (assuming you did not scrape it from somewhere else, of course), and that ownership is protected under US copyright law. Anyone whose rights are violated under that law has specific remedies available to them under the <a href="http://en.wikipedia.org/wiki/Digital_Millennium_Copyright_Act" target="_blank">Digital Millennium Copyright Act</a>.</p>
<p><strong>Now, I cannot stress this strongly enough&#8230;</strong> these remedies are <em>not</em> intended to harass a website owner. They should be used neither frivolously nor fraudulently, and <em>there are penalties for filing false information</em>. You should under no circumstances perform this process for any urls or domains that you do not explicitly own, and if a counter-notification does get filed then you should in fact follow through with a lawsuit.</p>
<p>For all <em>valid</em> claims, I am outlining an easy to follow process for requesting that your information be removed from his index.</p>
<p>First, verify that your content is indeed in their tool. If it is, then the next step is to contact SEOmoz directly. Give them a chance to rectify the situation within a timely manner. Send a polite request that your entire domain be completely removed from the index powering their Linkscape tool, and for a way to confirm that it has indeed been done once they have. The support email for SEOmoz is listed on the site as <a href="mailto:sitesupport@seomoz.org">sitesupport@seomoz.org</a>, or you can fax them the request at (206) 338-3797. In this request you should list who you are, the address of your domain, and your contact information. Despite Rand&#8217;s insistence that they cannot do this, it might turn out that they do in fact have the ability after all. Do not skip the step of contacting them first. For tracking purposes, you might want to CC their ISP with this initial request, to document that you did indeed attempt to resolve the issue with them first, although this is not required. If you do decide to do that, SEOmoz&#8217;s ISP is <a href="http://www.hopone.net/" target="_blank">HopOne Internet Corporation</a>. The appropriate email to use for these matters, according to <a href="http://www.hopone.net/aup.php" target="_blank">HopOne&#8217;s AUP</a>, is <a href="mailto:abuse@hopone.net">abuse@hopone.net</a>, and their fax is (604) 608-2953.</p>
<p>If after a reasonable amount of time, say, 24 hours, they still have not removed your site&#8217;s information, then you can consider sending a formal DMCA letter to their ISP, HopOne. The requirements for such a letter are very specific, and are laid out in <a href="http://www4.law.cornell.edu/uscode/17/512.html#c_3" target="_blank">17 U.S.C. § 512(c)(3)</a>, &#8220;Elements of notification&#8221;. A sample DMCA notice for this purpose might look something like this:</p>
<blockquote><p>To: abuse@hopone.net<br />
Subject: Notice of Copyright Infringement<br />
The copyrighted work at issue is the entire set of links appearing on my domain at {<strong>www.mydomain.com</strong>}, each comprised of their respective URLs, anchor texts, and attributes, including both those constituting my website&#8217;s navigation, as well as those linking my website to other websites on the Internet. While I acknowledge that an individual url in and of itself may not be copyrightable, I maintain that the set of links residing on my website taken as a whole or in sections do in fact comprise a structure that is unique and my own property.</p>
<p>The freely accessible URL where my copyrighted material is located is accessed through the gateway page located at http://www.seomoz.org/linkscape . Since the interface that is displaying my content is only visible via an http POST request, it is necessary to enter my domain {<strong>www.mydomain.com</strong>} into the text box presented, and then press the button labeled &#8220;GO&#8221;, in order to view the infringing material. Note that while this does demonstrate the existence of the infringing material being used on the server, it is only the one open to the general public without paying a fee, although this request is for the removal of the information from the index completely, including from areas accessible only to paying members of the website.</p>
<p>The contact information for the company of the infringing website, as indicated by their Contact Us page, is as follows:<br />
Office: (206) 632-3171<br />
Fax: (206) 338-3797<br />
sitesupport@seomoz.org<br />
SEOmoz.org<br />
1221 E. Pike St., Suite 200<br />
Seattle, WA 98122</p>
<p>I can be reached at {<strong>your@email.com</strong>}, or via telephone at {<strong>your telephone number</strong>}. My mailing address is {<strong>your full mailing address, including street and number, any apartment number, city, state, and zip code</strong>}.</p>
<p>I have a good faith belief that use of the copyrighted materials described above as allegedly infringing is not authorized by the copyright owner, its agent, or the law.</p>
<p>I swear, under penalty of perjury, that the information in the notification is accurate and that I am the copyright owner or am authorized to act on behalf of the owner of an exclusive right that is allegedly infringed.</p>
<p>At your earliest convenience, please respond to this letter at my email address listed above, and let me know what actions have been taken to resolve this matter. Thank you.</p>
<p>My electronic signature is below:<br />
{<strong>Put Your Name Here</strong>}</p></blockquote>
<p>Bottom line is, it would be nice if Rand would simply step up to the plate and actually <em>be</em> the nice guy he wants everyone to believe that he is. Until such time as that actually happens, however, as sad as it may be, this may be our only recourse to keep him from using our information without consent.</p>
]]></content:encoded>
			<wfw:commentRss>http://smackdown.blogsblogsblogs.com/2008/10/21/how-to-remove-your-website-from-linkscape-without-an-seomoz-meta-tag/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>How To Add 7 Billion Pages To Your Index Overnight</title>
		<link>http://smackdown.blogsblogsblogs.com/2008/10/20/how-to-add-7-billion-pages-to-your-index-overnight/</link>
		<comments>http://smackdown.blogsblogsblogs.com/2008/10/20/how-to-add-7-billion-pages-to-your-index-overnight/#comments</comments>
		<pubDate>Mon, 20 Oct 2008 07:22:00 +0000</pubDate>
		<dc:creator>Michael VanDeMar</dc:creator>
				<category><![CDATA[blogthropology]]></category>
		<category><![CDATA[coding]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[psychoblogging]]></category>
		<category><![CDATA[scams]]></category>
		<category><![CDATA[SEO]]></category>
		<category><![CDATA[Social Media]]></category>

		<guid isPermaLink="false">http://smackdown.blogsblogsblogs.com/?p=150</guid>
		<description><![CDATA[A couple of days ago I posted my assertion that Rand Fishkin had lied about the details of the new Linkscape tool on SEOmoz. During the discussion that followed, Rand continued to maintain that they owned the bots that collected the data that powered the tool, despite several points on that being very unclear, and [...]]]></description>
			<content:encoded><![CDATA[<p>A couple of days ago I posted my assertion that <a href="/2008/10/17/how-to-block-the-bots-seomoz-isnt-telling-you-about/" target="_blank">Rand Fishkin had lied about the details of the new Linkscape</a> tool on SEOmoz. During the discussion that followed, Rand continued to maintain that they owned the bots that collected the data that powered the tool, despite several points on that being very unclear, and that his bots had collected those 30 billion pages.</p>
<p>Right in the heat of the argument, someone decided to drop a comment on my blog that struck me as a little odd<span id="more-150"></span> for some reason:</p>
<blockquote><p>So who&#8217;s behind dotnetdotcom.org? &#8220;few Seattle based guys&#8221; &#8220;Trust us&#8221; ? WTF!? Why are there absolutely no names on that site? &#8211; <em>some guy called smallfish</em></p></blockquote>
<p>I had looked at that site before when Rand had released all of the info as to where the data from his tool actually came from. I had dismissed it, since Rand was claiming to have 30 billion pages in his index. The download on this site was only for 3.2 million pages out of the initial 11 million pages that they had collected so far, what they were calling &#8220;the first part&#8221; of their index.</p>
<p>Since right at that moment Rand and I were arguing about whether or not Linkscape actually had a bot of its own that had collected the pages in their index, it hit me. &#8220;Aha!&#8221;, I thought. &#8220;Rand is probably going to reveal at some point that they actually own the DotBot. I mean, being able to say that you collected 11 million of the pages is better than having not collected <em>any</em> of them, right?&#8221; </p>
<p>So, I trotted off to <a href="http://www.dotnetdotcom.org/" target="_blank">dotnetdotcom.org</a> to take a second look, just in case that turned out to be what was happening. Once I got there, I noticed that something was different. When I had visited the page on Friday, these were the stats I saw:</p>
<p><a href="/images/dotbot-numbers20081015.png" target="_blank"><img src="/images/dotbot-numbers20081015-sm.png" alt="Original DotBot claims" onmouseup="hl2l(event);"></a><br />
(<em>Click to enlarge.</em>)</p>
<p>Saturday night, however, when I went to look, this is what I saw:</p>
<p><a href="/images/dotbot-numbers20081019.png" target="_blank"><img src="/images/dotbot-numbers20081019-sm.png" alt="Vastly inflated DotBot claims" onmouseup="hl2l(event);"></a><br />
(<em>Click to enlarge.</em>)</p>
<p>That&#8217;s right&#8230; smack in the middle of an argument between Rand and me, where he was insisting that he owned a bot capable of spidering an index of the size he was boasting about, one of the sources he listed (the one whose real owners no one knew) jumped 7 <em>billion</em> pages in size. Talk about your random coincidences.</p>
<p>Let&#8217;s take a peek behind the scenes here, and see exactly how they managed to accomplish this truly humongous task. Viewing the source on the page, we can see that the counter is driven by a Javascript routine:</p>
<p><a href="/images/dotbot-js-orig.png" target="_blank"><img src="/images/dotbot-js-orig-sm.png" alt="Original DotBot Javascript" onmouseup="hl2l(event);"></a><br />
(<em>Click to enlarge.</em>)</p>
<p>It starts with the date that the DotBot went online, which is June 10th, 2008, calculates the number of seconds between then and now, and uses that as the starting point for how many pages it has spidered so far. It then counts up the display at one page per second.</p>
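<p>As a rough illustration of that logic (a Python sketch rather than the site&#8217;s actual Javascript, with the launch time assumed to be midnight UTC and every name made up for the example), the counter boils down to something like this:</p>

```python
from datetime import datetime, timezone

# Assumption: the post only gives the launch date, so midnight UTC is a guess.
LAUNCH = datetime(2008, 6, 10, tzinfo=timezone.utc)
PAGES_PER_SECOND = 1  # the rate the on-page counter ticks at


def estimated_pages(now):
    """Return the displayed page count: seconds since launch times the rate."""
    elapsed_seconds = int((now - LAUNCH).total_seconds())
    return elapsed_seconds * PAGES_PER_SECOND


if __name__ == "__main__":
    # 127 days after launch, i.e. mid-October 2008.
    print(estimated_pages(datetime(2008, 10, 15, tzinfo=timezone.utc)))
```

<p>Run for October 15th, 2008, that comes out to 10,972,800 pages, which is in the same ballpark as the 11 million pages mentioned above.</p>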
<p>While obviously this would only be an estimate of how many pages were crawled, <em>this is a perfectly reasonable way to do it</em>. For a smaller, non-Google sized company, with a server dedicated to nothing but crawling web pages, grabbing a page a second is about what we would expect. However, what they did next is another story altogether. How did a company capable of spidering a page a second around the clock get such an amazing boost in capacity? When we view the current Javascript, we see this:</p>
<p><a href="/images/dotbot-js-new.png" target="_blank"><img src="/images/dotbot-js-new-sm.png" alt="Modified DotBot Javascript" onmouseup="hl2l(event);"></a><br />
(<em>Click to enlarge.</em>)</p>
<p>All they did was add 7 billion pages to the start number, along with a proportional boost to the other figures they display (domains, robots.txt, and &#8220;clogged tubes&#8221;). They even left the clock counting up at 1 page per second. <img src='http://smackdown.blogsblogsblogs.com/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' /> </p>
<p>Now, to put that in perspective&#8230; 7 billion pages at 1 page per second would in fact take 7 billion seconds to spider. This is not counting any processing or indexing time, this is just the collection of the raw data itself. That is:</p>
<p>116.6 million minutes<br />
 or<br />
1.94 million hours<br />
 or<br />
81,018 days<br />
 or<br />
221 <em>years</em> (not counting leap years or time travel, of course)</p>
<p>in order for them to actually gather all of those pages. And they&#8217;re claiming that they did it in 4 months. </p>
<p>Uh huh. Right. <img src='http://smackdown.blogsblogsblogs.com/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' />  </p>
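<p>For anyone who wants to check the math, here is the same arithmetic as a short Python sketch. The 7 billion figure and the 1 page per second rate come from the counter; treating &#8220;4 months&#8221; as 120 days is my own rounding, and the names are just for illustration:</p>

```python
PAGES_ADDED = 7_000_000_000  # the overnight jump in the counter
PAGES_PER_SECOND = 1         # the rate the counter still ticks at
SECONDS_PER_DAY = 24 * 60 * 60


def crawl_time(pages, rate):
    """Return the crawl time for `pages` at `rate` pages/second, in several units."""
    seconds = pages / rate
    return {
        "minutes": seconds / 60,
        "hours": seconds / 3600,
        "days": seconds / SECONDS_PER_DAY,
        "years": seconds / (SECONDS_PER_DAY * 365),  # ignoring leap years
    }


def required_rate(pages, days):
    """Return the pages/second needed to collect `pages` within `days`."""
    return pages / (days * SECONDS_PER_DAY)


if __name__ == "__main__":
    for unit, value in crawl_time(PAGES_ADDED, PAGES_PER_SECOND).items():
        print(f"{unit}: {value:,.1f}")
    # Assumption: "4 months" is approximated as 120 days.
    print(f"needed rate: {required_rate(PAGES_ADDED, 120):,.0f} pages/second")
```

<p>The conversions land on the same figures as above, give or take rounding. Under the 120 day assumption, they would have needed to pull in something like 675 pages every second, around the clock, rather than one.</p>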
<p>I figured that even if my hunch about this being related to the Linkscape issue was wrong, it&#8217;s still noteworthy that a company that professes to be worried about making the internet &#8220;as open as possible&#8221; would be trying to pull a fast one like this.</p>
]]></content:encoded>
			<wfw:commentRss>http://smackdown.blogsblogsblogs.com/2008/10/20/how-to-add-7-billion-pages-to-your-index-overnight/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>How To Block The Bots SEOmoz *Isn&#8217;t* Telling You About</title>
		<link>http://smackdown.blogsblogsblogs.com/2008/10/17/how-to-block-the-bots-seomoz-isnt-telling-you-about/</link>
		<comments>http://smackdown.blogsblogsblogs.com/2008/10/17/how-to-block-the-bots-seomoz-isnt-telling-you-about/#comments</comments>
		<pubDate>Fri, 17 Oct 2008 18:05:54 +0000</pubDate>
		<dc:creator>Michael VanDeMar</dc:creator>
				<category><![CDATA[blogthropology]]></category>
		<category><![CDATA[coding]]></category>
		<category><![CDATA[lackofmeds]]></category>
		<category><![CDATA[nerdiness]]></category>
		<category><![CDATA[On The Ball-ness]]></category>
		<category><![CDATA[psychoblogging]]></category>
		<category><![CDATA[scams]]></category>
		<category><![CDATA[SEO]]></category>
		<category><![CDATA[Social Media]]></category>

		<guid isPermaLink="false">http://smackdown.blogsblogsblogs.com/?p=132</guid>
		<description><![CDATA[Ok, so, looks like Rand and gang finally decided to reveal their top-secret recipe about how they gathered all that information on everybody&#8217;s websites without anyone noticing what they were doing. There was quite a bit of hoopla over the fact that when they announced their new index of 30 billion web pages (and the [...]]]></description>
			<content:encoded><![CDATA[<p><img src="/images/oath_witness.png" border="0" alt="I swear to tell the... wait, what did you say..?" style="float: right;"  onmouseup="hl2l(event);"> Ok, so, looks like Rand and gang finally decided to reveal their top-secret recipe about how they gathered all that information on everybody&#8217;s websites without anyone noticing what they were doing. There was <a href="http://ekstreme.com/thingsofsorts/fun-web/the-seomoz-linkscape-ghost" target="_blank">quite a bit of hoopla</a> <a href="http://www.seomoz.org/blog/announcing-seomozs-index-of-the-web-and-the-launch-of-our-linkscape-tool" target="_blank">when they announced their new index of 30 billion web pages</a> (and the new tool powered by that index), because they never gave webmasters the chance to block them from gathering this data. In fact, they never even<span id="more-132"></span> announced their presence at all.</p>
<p>While this is a huge breach of netiquette as it pertains to crawlers, at least today <a href="http://sphinn.com/story/77000#c55704" target="_blank">Rand finally announced</a> that they are now disclosing their sources for data. In fact, this was how he worded it to the community:</p>
<blockquote><p>we are now disclosing our sources for data &#8211; <em>Rand Fishkin, SEOmoz CEO and really, really open guy</em></p></blockquote>
<p>Better late than never, right?</p>
<p>The thing is, as I was looking over the list of bots that you would need to block in order to prevent mozzers from gathering your data, I noticed this subtle, easy-to-miss pattern in what they were listing. You have to look really, really closely to see it, and the untrained eye might never see it at all, but luckily, eventually, I did see it for myself:</p>
<p><img src="/images/moz-crawlers2.png" alt="other data sources and additional crawls...?" onmouseup="hl2l(event);"></p>
<p>That&#8217;s right folks, if you <em>do</em> decide to keep moz out by blocking all of the big guys (Google, Yahoo, MSN, Ask, Amazon, and Alexa), the lesser known guys (Dotnetdotcom, Grub, Page-Store, and Exalead), and that one <em>fictional</em> guy they threw in there (Gagablast), you still won&#8217;t have them blocked. Fear not though&#8230; after much work, I finally figured out that Rand was indeed true to his word, and that they did in fact release enough information to block the bots. All you have to do is add the following lines to your robots.txt, and you&#8217;ll be golden*:</p>
<pre>
<code>User-Agent: *and other data sources
Disallow: /
User-Agent: Additional crawls*
Disallow: /</code>
</pre>
<p>See? I got ya covered! <img src='http://smackdown.blogsblogsblogs.com/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' /> </p>
<p>Seriously though, despite the fact that Rand and Co. are still less than forthcoming about all of the bots that are used (I&#8217;m guessing he doesn&#8217;t actually know, to be honest), something much more revealing is highlighted in this information, namely, <em>they lied about having their own crawler</em>.</p>
<p>Let&#8217;s take a quick review of some statements Rand made in the initial announcement about the tool:</p>
<ul>
<li>Our crawl biases towards having pages and data&#8230;</li>
<li>As others who&#8217;ve invested energy into crawling the web&#8230;</li>
<li>our crawl biases towards this &#8220;center&#8221;&#8230;</li>
<li>Our process for crawling the web&#8230;</li>
<li>Moving forward, we&#8217;ll&#8230; invest in better and faster crawling&#8230;</li>
<li>In comparing our crawls against the engines&#8230;</li>
<li>we&#8217;ll be releasing more information about our crawl&#8230;</li>
</ul>
<p>You also have statements made by moz employee Nick Gerner like this:</p>
<ul>
<li>we&#8217;re crawling everything we can&#8230;</li>
</ul>
<p>You even have him claiming bullshit like this:</p>
<blockquote><p>We do prioritize the crawl according to pages we think are important. For now, and probably for the foreseeable future we&#8217;re going to rely on link endorsement to make that decision. Make good content, get good links. Keep it publicly available. We&#8217;ll get there soon enough &#8211; <em>Nick Gerner, moz employee</em></p></blockquote>
<p>That&#8217;s right, not only did they claim that they were crawling the web, they wanted us to believe that they prioritized how they crawled based on an <em>importance</em> algorithm! <img src='http://smackdown.blogsblogsblogs.com/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' /> </p>
<p>I have to admit, Rand, it&#8217;s pretty bold to basically admit this late in the game that you guys lied through your teeth and grossly misrepresented the facts, just so you could appear to have accomplished a much bigger task than you actually did, all in the name of getting more money from webmasters. That&#8217;s a much bigger admission than saying you cloaked your bot, if you ask me. Gratz on coming clean.</p>
<div><em><strong>*Disclaimer:</strong> The code I listed is sarcasm, btw. Those robots.txt lines won&#8217;t actually block anything, just in case you didn&#8217;t know.</em></div>
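<p>If you want to see that for yourself, here is a small sketch using the robots.txt parser that ships with Python (<code>urllib.robotparser</code>). The crawler names and URL are only examples, and the <code>dotbot</code> token in the second file is an assumption on my part; check a crawler&#8217;s own documentation for the token it actually honors:</p>

```python
from urllib.robotparser import RobotFileParser

# The sarcastic rules from the post, verbatim.
JOKE_RULES = """\
User-Agent: *and other data sources
Disallow: /
User-Agent: Additional crawls*
Disallow: /
"""

# A rule that names one crawler by its token (token assumed for illustration).
REAL_RULES = """\
User-agent: dotbot
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url="http://example.com/page.html"):
    """Return True if `user_agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "DotBot"):
        print(agent, "under joke rules:", is_allowed(JOKE_RULES, agent))
        print(agent, "under real rules:", is_allowed(REAL_RULES, agent))
```

<p>At least as Python&#8217;s parser reads them, the joke rules match no crawler at all, so everything stays allowed. Only the rule that names a specific crawler blocks anything, and only for that crawler, and only if it chooses to obey robots.txt in the first place.</p>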
]]></content:encoded>
			<wfw:commentRss>http://smackdown.blogsblogsblogs.com/2008/10/17/how-to-block-the-bots-seomoz-isnt-telling-you-about/feed/</wfw:commentRss>
		<slash:comments>78</slash:comments>
		</item>
		<item>
		<title>Google Allows Ads Mocking Suicide</title>
		<link>http://smackdown.blogsblogsblogs.com/2008/10/10/google-allows-ads-mocking-suicide/</link>
		<comments>http://smackdown.blogsblogsblogs.com/2008/10/10/google-allows-ads-mocking-suicide/#comments</comments>
		<pubDate>Fri, 10 Oct 2008 19:40:16 +0000</pubDate>
		<dc:creator>Michael VanDeMar</dc:creator>
				<category><![CDATA[blogthropology]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[lackofmeds]]></category>
		<category><![CDATA[On The Ball-ness]]></category>
		<category><![CDATA[psychoblogging]]></category>
		<category><![CDATA[Social Media]]></category>

		<guid isPermaLink="false">http://smackdown.blogsblogsblogs.com/?p=123</guid>
		<description><![CDATA[During the Great Depression, the suicide rate jumped over 21.4%. It was a sad time for all, and the unemployment rate skyrocketed. Many people lost their homes and farms. The shame of not being able to provide for their families was simply too much for some. Last June, &#8220;Good Morning America&#8221; did a segment titled [...]]]></description>
			<content:encoded><![CDATA[<p><img src="/images/jumper.png" border="0" alt="Google jumper" style="float: right;"  onmouseup="hl2l(event);"> During the Great Depression, the suicide rate jumped over 21.4%. It was a sad time for all, and the unemployment rate skyrocketed. Many people lost their homes and farms. The shame of not being able to provide for their families was simply too much for some. Last June, &#8220;Good Morning America&#8221; did a segment titled &#8220;Recession Depression&#8221;, where reporter Chris Cuomo drew analogies between the events back then and our current financial crisis, warning that we could <a href="http://businessandmedia.org/articles/2008/20080610143517.aspx" target="_blank">possibly see similar psychological impacts</a> with today&#8217;s economy:<span id="more-123"></span></p>
<blockquote><p>The link between financial troubles and psychological problems is well documented. In the U.S. the seminal example is the Great Depression, when the suicide rate jumped from 14 to 17 for every 100,000 Americans. And today, with the threat of recession looming large, the price we pay physically may skyrocket as well. &#8211; Chris Cuomo</p></blockquote>
<p>The Great Depression certainly wasn&#8217;t a funny time in our history, and the current financial market really isn&#8217;t a laughing matter either. While we aren&#8217;t quite as bad (not yet, anyways) as we were back then, if things did get to the point where people started killing themselves due to their circumstances I hardly think it would be something we should joke about in the mainstream media. However, one company, Woot.com, apparently thinks it is. </p>
<p>I was performing my daily search to see how badly Google (<a href="http://finance.google.com/finance?client=ob&#038;q=NASDAQ:GOOG">NASDAQ: GOOG</a>) was doing. As <a href="http://www.techcrunch.com/2008/10/10/google-employees-watch-in-horror-as-60-percent-of-their-stock-options-drown/" target="_blank">others have noted</a>, they aren&#8217;t doing so hot. This I pretty much expected. What I <em>didn&#8217;t</em> expect was the AdSense ad making fun of the situation off to the right:</p>
<p><a href="/images/woot-ad.png" target="_blank"><img src="/images/woot-ad-sm.png" alt="Before you jump out that window, why not spend your last remaining dollars at Woot?" onmouseup="hl2l(event);"></a><br />
(<em>Click to enlarge.</em>)</p>
<p>The ad reads, &#8220;Before you jump out that window, why not spend your last remaining dollars at Woot?&#8221; Considering that the ad is showing on a stock quote search, there is no room for mistaking their message. Pretty fucking tacky, Woot. Don&#8217;t get me wrong, it is funny&#8230; but really not appropriate advertising for mainstream media. Considering all the press Google has gotten in the past for <em>accidental</em> inappropriate ads (like <a href="http://www.reubenyau.com/black-people-on-ebay-again/" target="_blank">Ebay selling Black People</a>, and <a href="/2007/12/18/matt-cutts-says-paid-ads-are-a-type-of-search/" target="_blank">low prices on Child Brides</a> at Amazon.com), I&#8217;m surprised that they are letting something like this slide.</p>
]]></content:encoded>
			<wfw:commentRss>http://smackdown.blogsblogsblogs.com/2008/10/10/google-allows-ads-mocking-suicide/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Write A Bad Check: Bond $1,500&#8230; Possession Of Cocaine: Bond $5 Bucks. Wtf?</title>
		<link>http://smackdown.blogsblogsblogs.com/2008/07/21/write-a-bad-check-bond-1500-possession-of-cocaine-bond-5-bucks-wtf/</link>
		<comments>http://smackdown.blogsblogsblogs.com/2008/07/21/write-a-bad-check-bond-1500-possession-of-cocaine-bond-5-bucks-wtf/#comments</comments>
		<pubDate>Mon, 21 Jul 2008 15:31:22 +0000</pubDate>
		<dc:creator>Michael VanDeMar</dc:creator>
				<category><![CDATA[blogthropology]]></category>
		<category><![CDATA[lackofmeds]]></category>
		<category><![CDATA[On The Ball-ness]]></category>
		<category><![CDATA[psychoblogging]]></category>

		<guid isPermaLink="false">http://smackdown.blogsblogsblogs.com/?p=87</guid>
		<description><![CDATA[I live in Largo, FL, which is located in Pinellas County. People and the way things work aren&#8217;t always what you would call &#8220;normal&#8221; around here. Not sure if it&#8217;s the incessant heat, or something they put in the water, or the fact that we live within shouting distance of Scientology Central (which is located [...]]]></description>
			<content:encoded><![CDATA[<p><img src="/images/parkingticket.jpg" border="0" alt="Ok, so, thats $2,000 for illegally parking. Oh, and another $5 for the 3 kilos of cocaine in your trunk. Have a nice day." style="float: right;"  onmouseup="hl2l(event);"> I live in Largo, FL, which is located in Pinellas County. People and the way things work aren&#8217;t always what you would call &#8220;normal&#8221; around here. Not sure if it&#8217;s the incessant heat, or something they put in the water, or the fact that we live within shouting distance of <a href="http://scientology.fso.org/" target="_blank">Scientology Central</a> (which is located in Clearwater). Either way, for whatever the reason, sometimes things just aren&#8217;t right around these parts.</p>
<p>I swear, you can only find this stuff in the St. Petersburg, FL area. Someone showed me this recently on the Pinellas County Jail Inmate search: <span id="more-87"></span></p>
<p><img src="/images/posscocaine-5bucks-sm.png" alt="Bad Check: Bond $1,500... Possession Of Cocaine: Bond $5 Bucks" onmouseup="hl2l(event);"></p>
<p>From: <a href="http://www.pcsoweb.com/Inmate/" target="_blank">http://www.pcsoweb.com/Inmate/</a></p>
<p>I mean, seriously, wtf?</p>
<div><em>Original <a href="http://www.flickr.com/photos/dasqfamily/189785933/" target="_blank">ticket writing officer</a> attribution goes to <a href="http://www.flickr.com/photos/dasqfamily/">Qfamily</a>.</em></div>
]]></content:encoded>
			<wfw:commentRss>http://smackdown.blogsblogsblogs.com/2008/07/21/write-a-bad-check-bond-1500-possession-of-cocaine-bond-5-bucks-wtf/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>No, Just Lying About It Is NOT Effective Reputation Management</title>
		<link>http://smackdown.blogsblogsblogs.com/2008/07/08/no-just-lying-about-it-is-not-effective-reputation-management/</link>
		<comments>http://smackdown.blogsblogsblogs.com/2008/07/08/no-just-lying-about-it-is-not-effective-reputation-management/#comments</comments>
		<pubDate>Wed, 09 Jul 2008 01:46:25 +0000</pubDate>
		<dc:creator>Michael VanDeMar</dc:creator>
				<category><![CDATA[blogthropology]]></category>
		<category><![CDATA[lackofmeds]]></category>
		<category><![CDATA[psychoblogging]]></category>
		<category><![CDATA[scams]]></category>
		<category><![CDATA[Social Media]]></category>

		<guid isPermaLink="false">http://smackdown.blogsblogsblogs.com/?p=83</guid>
		<description><![CDATA[If your reputation management strategy is going to be centered around lying about things, then you should at least have the sense to lie in ways that aren&#8217;t easy to refute. For instance, you should steer well clear of trying to lie about things that were said publicly on the Internet. It takes a special [...]]]></description>
			<content:encoded><![CDATA[<p><img src="/images/takingoath.jpg" border="0" alt="Well, I swear to toss some truth in... that count?" style="float: right;" onmouseup="hl2l(event);"> If your reputation management strategy is going to be centered around lying about things, then you should at least have the sense to lie in ways that aren&#8217;t easy to refute. For instance, you should steer well clear of <span id="more-83"></span>trying to lie about things that were said publicly on the Internet. It takes a special kind of stupid, honestly, to think that you won&#8217;t be called on it. For real now.</p>
<p>Daniel Scocco said it best on <a href="http://www.dailyblogtips.com/" target="_blank">Daily Blog Tips</a> last year, in his post titled &#8220;<a href="http://www.dailyblogtips.com/put-honesty-and-integrity-above-everything-else/" target="_blank">Put Honesty and Integrity Above Everything Else</a>&#8221;:</p>
<blockquote><p>One slip and you will ruin your reputation for good. And there are no erasers on the Internet. &#8211; <em>Daniel Scocco</em></p></blockquote>
<p>Obviously, this is true for more than just what you put out there on your blog. It extends to all forms of communication on the internet that are indexed and searchable: Twitter, Plurk, Digg/Sphinn/Mixx/blog comments&#8230; and of course, forum posts.</p>
<p>Recently Michael Martinez wrote a blog post about some experimentation with <a href="http://smackdown.blogsblogsblogs.com/2007/10/25/single-source-page-link-test-using-multiple-links-with-varying-anchor-text-part-two/" target="_blank">multiple links from one page to another</a>, and whether they all passed weight or only the first one counted (the second scenario is what my own testing and <a href="http://www.seo-scientist.com/first-link-counted-rebunked.html" target="_blank">others&#8217; testing</a> showed). The results of his test aren&#8217;t really that important, and contain more words than substance, but at the end of his post he felt the need to include the following assertion:</p>
<blockquote><p>Mr. VanDeMar has been bashing me on various occasions for at least a year. He made the comment on the SEO Scientist blog that you can safely ignore me because 9 times out of 10 I’m wrong. Frankly, after being flamed and attacked by him numerous times on SEO Refugee I just stopped trying to respond to his nonsense. People who resort to unprofessional behavior and poison pen campaigns in order to win arguments will never impress me as being knowledgable or trustworthy. Anyone who wants to believe Michael VanDeMar’s assertions about me, PageRank, or anything else is more than welcome to. But I will certainly appreciate people ignoring all the flame-bait and insults and reaching their own conclusions (about how often I may be right or wrong) based on the evidence I provide. &#8211; <em>Michael Martinez</em></p></blockquote>
<p>I did in fact <a href="http://www.seo-scientist.com/first-link-counted-rebunked.html#comment-5046" target="_blank">make the statement</a> that 9 times out of 10 he is wrong about what he says. This was based on interactions with him personally over the past 2 years or so, and a random sampling of posts or comments he has made that other people have seen fit to show me. I do not read his blog on a regular basis, so there is always a chance that my estimation of his bs percentage is slightly off&#8230; but I have yet to see <em>anything</em> to make me feel that I would be off by much.</p>
<p>The &#8220;flaming&#8221; that Michael refers to in his commentary started on <a href="http://www.seorefugee.com/forums/" target="_blank">SEO Refugee Forums</a> a couple of years ago, because apparently trying to get him to clarify statements he makes is an offense against his sensibilities, and grounds for him to start acting like an asshole. Among the claims I was trying to get clear on were concepts such as:</p>
<ul>
<li>You can have off-page optimization without having links.</li>
<li>You can have competitive phrases that hardly anyone else is trying to rank for.</li>
<li>&#8220;All links have SEO value&#8221; (said on his blog) and &#8220;link building is useless&#8221; (said in the forum) are both true statements.</li>
</ul>
<p>When faced with these questions, Michael&#8217;s response was to immediately get belligerent and very, very wordy, as if just piling on more words and obfuscating the issues would prove his point. It was extremely similar to the tactic that <a href="http://smackdown.blogsblogsblogs.com/2007/08/06/rand-fishkin-the-troll-defense/" target="_blank">Rand Fishkin</a> used the following year when confronted about his own words, with one very clear distinction&#8230; it really and truly did appear that Michael Martinez had somehow convinced himself that his statements made sense.</p>
<p>I commented on Michael&#8217;s blog, trying to point out the truth to him, again, about what was actually said back then. Of course, as with most emotionally challenged people, having evidence to refute the reality they have built up in their minds is simply unacceptable, so he did delete the comment. Since I had a strong feeling he would do that, I went ahead and saved it first:</p>
<blockquote><p>*Ahem*</p>
<blockquote><p>Mr. VanDeMar has been bashing me on various occasions for at least a year. He made the comment on the SEO Scientist blog that you can safely ignore me because 9 times out of 10 I’m wrong. Frankly, after being flamed and attacked by him numerous times on SEO Refugee I just stopped trying to respond to his nonsense.</p></blockquote>
<p>See, Michael, it&#8217;s complete and utter bullshit such as that causing you and I to not get along. I&#8217;d say you were simply lying through your teeth, except that I honestly think you&#8217;ve deluded yourself into believing what you post. Not that I expect you to actually allow this comment to go live, but here is the exact &#8220;flaming&#8221; you are referring to.</p>
<p>Started here, when I tried to get you to clarify some statements that you made and then you got belligerent, after which you refused to reply to me and started referring to me as a troll:</p>
<p><a href="http://www.seorefugee.com/forums/showthread.php?p=35745" target="_blank">http://www.seorefugee.com/forums/showthread.php?p=35745</a></p>
<p>Continued here, where I disagreed with an assertion you tried to make, that Toolbar PageRank, because it was a rounded value and not an exact figure, could not be passed on:</p>
<p><a href="http://www.seorefugee.com/forums/showthread.php?p=46585" target="_blank">http://www.seorefugee.com/forums/showthread.php?p=46585</a></p>
<p>From that point on, in a very troll-like manner, you simply refused to back up <i>anything</i> you said. Now, of course, this being the internet, all of our conversations are still out there for everyone to see&#8230; so of course, feel free to &#8220;debunk&#8221; this claim through links or exact quotes you feel are relevant. You&#8217;ll find my questioning your claims on how someone can have off-page optimization that has nothing to do with links. You&#8217;ll see where I pointed out that you made this claim in a forum post:</p>
<blockquote><p>I&#8217;ve seen the insults. I&#8217;ve seen the disagreements. I&#8217;ve seen the attempts to box me with meanings for my words that I didn&#8217;t associate with them. But I haven&#8217;t seen anything to persuade me that there might be a reason to care about backlinks or to believe they are relevant to SEO.</p></blockquote>
<p>which is a statement you made a mere <i>3 weeks</i> after making this claim on your blog:</p>
<blockquote><p>Just because people have engaged in cheap link building strategies with low-value sites doesn’t mean that all links from unrelated pages are harmful or unhelpful. Quite the contrary: most such links do help, and in many cases they help better than so-called &#8220;relevant links from relevant pages&#8221;.</p></blockquote>
<p>You&#8217;re trying to call me a troll for pointing out your completely meaningless and contradictory statements? Michael, in case you didn&#8217;t know it, your behavior is the very <i>definition</i> of &#8220;troll&#8221;. You come to the forums and other people&#8217;s blogs, spewing this nonsense, and you do it with the attitude that other people must be idiots. You come off insulting people from the get go. </p>
<p>It&#8217;s obvious to me that your brain is cooked, and that you should be pitied&#8230; but you make it very hard to empathize, Michael, when you behave like an ass.</p>
<p>Why don&#8217;t you go ahead and let this comment go live, with the links and quotes intact, and allow people to decide for themselves if your claims about me are true, Michael. &#8211; <em>Michael VanDeMar</em></p></blockquote>
<p>So, Michael, if you truly meant what you said about people drawing their own conclusions about you based on the evidence you provide, then maybe you should consider not trying to actually hide some of the evidence. Just a thought. <img src='http://smackdown.blogsblogsblogs.com/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://smackdown.blogsblogsblogs.com/2008/07/08/no-just-lying-about-it-is-not-effective-reputation-management/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

