<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: How To Block The Bots SEOmoz *Isn&#8217;t* Telling You About</title>
	<atom:link href="http://smackdown.blogsblogsblogs.com/2008/10/17/how-to-block-the-bots-seomoz-isnt-telling-you-about/feed/" rel="self" type="application/rss+xml" />
	<link>http://smackdown.blogsblogsblogs.com/2008/10/17/how-to-block-the-bots-seomoz-isnt-telling-you-about/</link>
	<description>Smackdown!</description>
	<lastBuildDate>Sun, 14 Mar 2010 03:45:59 -0500</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Why The Renewed Interest In The Linkscape Scams And Deception..? &#124; Smackdown!</title>
		<link>http://smackdown.blogsblogsblogs.com/2008/10/17/how-to-block-the-bots-seomoz-isnt-telling-you-about/comment-page-2/#comment-32818</link>
		<dc:creator>Why The Renewed Interest In The Linkscape Scams And Deception..? &#124; Smackdown!</dc:creator>
		<pubDate>Fri, 22 Jan 2010 22:15:36 +0000</pubDate>
		<guid isPermaLink="false">http://smackdown.blogsblogsblogs.com/?p=132#comment-32818</guid>
		<description>[...] I covered this back when the launch actually happened, in this Linkscape post, resulting in quite a few comments, and there was more than a little heated conversation in the [...]</description>
		<content:encoded><![CDATA[<p>[...] I covered this back when the launch actually happened, in this Linkscape post, resulting in quite a few comments, and there was more than a little heated conversation in the [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Zaphod</title>
		<link>http://smackdown.blogsblogsblogs.com/2008/10/17/how-to-block-the-bots-seomoz-isnt-telling-you-about/comment-page-2/#comment-24294</link>
		<dc:creator>Zaphod</dc:creator>
		<pubDate>Fri, 04 Sep 2009 07:24:36 +0000</pubDate>
		<guid isPermaLink="false">http://smackdown.blogsblogsblogs.com/?p=132#comment-24294</guid>
		<description>Well pardon me, I was just trying to help. Seems a GNU General Public License author can&#039;t get any respect anywhere.

Please delete this post, and the previous one I made. I surely don&#039;t want to be associated with the likes of you!</description>
		<content:encoded><![CDATA[<p>Well pardon me, I was just trying to help. Seems a GNU General Public License author can&#8217;t get any respect anywhere.</p>
<p>Please delete this post, and the previous one I made. I surely don&#8217;t want to be associated with the likes of you!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ryan</title>
		<link>http://smackdown.blogsblogsblogs.com/2008/10/17/how-to-block-the-bots-seomoz-isnt-telling-you-about/comment-page-2/#comment-24269</link>
		<dc:creator>Ryan</dc:creator>
		<pubDate>Thu, 03 Sep 2009 22:02:29 +0000</pubDate>
		<guid isPermaLink="false">http://smackdown.blogsblogsblogs.com/?p=132#comment-24269</guid>
		<description>Sorry to resurrect an old thread, but has ANYONE figured out how to block Linkscape other than with the ridiculous meta tag?</description>
		<content:encoded><![CDATA[<p>Sorry to resurrect an old thread, but has ANYONE figured out how to block Linkscape other than with the ridiculous meta tag?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Zaphod</title>
		<link>http://smackdown.blogsblogsblogs.com/2008/10/17/how-to-block-the-bots-seomoz-isnt-telling-you-about/comment-page-2/#comment-24231</link>
		<dc:creator>Zaphod</dc:creator>
		<pubDate>Thu, 03 Sep 2009 10:40:03 +0000</pubDate>
		<guid isPermaLink="false">http://smackdown.blogsblogsblogs.com/?p=132#comment-24231</guid>
		<description>Thanks for the pertinent info on these goofballs. I have also found that blocking them by user agent helps. The updated signatures for {my spammy product} will be able to deal with these keyword scrapers even better.

I did try to block them by domain before, but they weaseled around that; the giveaway was their user agent showing up from other IPs. I just wanted to make sure (and this page helped me decide) that I wanted to deep-six their crawler wherever it came from.

Personally, I would rather toss them a 403 a few times, with reasons, than give them a robots.txt. Most scrapers and bot probes (laycat.com) just ignore robots.txt anyway. After they hit {my spammy product} 3 times, it switches them to a 503 permanently. Much better for saving bandwidth.</description>
		<content:encoded><![CDATA[<p>Thanks for the pertinent info on these goofballs. I have also found that blocking them by user agent helps. The updated signatures for {my spammy product} will be able to deal with these keyword scrapers even better.</p>
<p>I did try to block them by domain before, but they weaseled around that; the giveaway was their user agent showing up from other IPs. I just wanted to make sure (and this page helped me decide) that I wanted to deep-six their crawler wherever it came from.</p>
<p>Personally, I would rather toss them a 403 a few times, with reasons, than give them a robots.txt. Most scrapers and bot probes (laycat.com) just ignore robots.txt anyway. After they hit {my spammy product} 3 times, it switches them to a 503 permanently. Much better for saving bandwidth.</p>
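<p>The escalation described above (a 403 for the first few hits, then a 503) could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the <code>dotbot</code> signature, the limit of 3, and the names <code>BLOCKED_SIGNATURES</code> and <code>status_for</code> are hypothetical, and this is not how the product mentioned in the comment works internally.</p>

```python
from collections import Counter

# Assumption: "dotbot" appears somewhere in the crawler's User-Agent string.
BLOCKED_SIGNATURES = ("dotbot",)
FORBIDDEN_LIMIT = 3  # hits answered with 403 before switching to 503

_hits = Counter()  # blocked hits seen so far, keyed by matched signature


def status_for(user_agent: str) -> int:
    """Return the HTTP status to send for a request with this User-Agent."""
    ua = user_agent.lower()
    for signature in BLOCKED_SIGNATURES:
        if signature in ua:
            _hits[signature] += 1
            # First FORBIDDEN_LIMIT hits get a 403, every later hit a 503.
            return 403 if _hits[signature] <= FORBIDDEN_LIMIT else 503
    return 200  # not a blocked agent; serve normally
```

<p>Counting per signature rather than per IP matches the point that the crawler shows up from other addresses. The counter here lives in memory only, so a truly permanent switch to 503 would need persistent storage.</p>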
]]></content:encoded>
	</item>
	<item>
		<title>By: Steve</title>
		<link>http://smackdown.blogsblogsblogs.com/2008/10/17/how-to-block-the-bots-seomoz-isnt-telling-you-about/comment-page-2/#comment-10539</link>
		<dc:creator>Steve</dc:creator>
		<pubDate>Wed, 17 Dec 2008 23:14:08 +0000</pubDate>
		<guid isPermaLink="false">http://smackdown.blogsblogsblogs.com/?p=132#comment-10539</guid>
		<description>I have also been ordered to fight this on our European servers, which currently host 32,000 websites, and complaints have been filed. I think this will have serious repercussions.</description>
		<content:encoded><![CDATA[<p>I have also been ordered to fight this on our European servers, which currently host 32,000 websites, and complaints have been filed. I think this will have serious repercussions.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Lance</title>
		<link>http://smackdown.blogsblogsblogs.com/2008/10/17/how-to-block-the-bots-seomoz-isnt-telling-you-about/comment-page-2/#comment-10505</link>
		<dc:creator>Lance</dc:creator>
		<pubDate>Wed, 22 Oct 2008 22:37:39 +0000</pubDate>
		<guid isPermaLink="false">http://smackdown.blogsblogsblogs.com/?p=132#comment-10505</guid>
		<description>Interestingly enough, the project&#039;s code name was Carhole, and the registered name for dotnetdotcom.org is Jeff Albertson. Jeff Albertson just happens to be Comic Book Guy&#039;s name...</description>
		<content:encoded><![CDATA[<p>Interestingly enough, the project&#8217;s code name was Carhole, and the registered name for dotnetdotcom.org is Jeff Albertson. Jeff Albertson just happens to be Comic Book Guy&#8217;s name&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: NotHappy</title>
		<link>http://smackdown.blogsblogsblogs.com/2008/10/17/how-to-block-the-bots-seomoz-isnt-telling-you-about/comment-page-2/#comment-10480</link>
		<dc:creator>NotHappy</dc:creator>
		<pubDate>Tue, 21 Oct 2008 18:08:32 +0000</pubDate>
		<guid isPermaLink="false">http://smackdown.blogsblogsblogs.com/?p=132#comment-10480</guid>
		<description>Doug (or anyone else), feel free to write a post with this info, but as Michael points out, it&#039;s not a single bulletproof solution, because they can buy more IPs and it doesn&#039;t stop the &quot;other sources&quot;. But it will block the /28 IP block that has been assigned to Dotbot.

The /28 is a block of 16 IP addresses, in this case from:

208.115.111.240

to

208.115.111.255

So if you firewall those 2 IPs and the 14 in between, you will be blocking the Dotbot IP range. The Dotbot site itself is hosted at the http://208.115.111.242 IP.

In the past I have found that the best way to deal with situations like this is to target the data sources rather than trying to fight the offender.

Our company is writing letters to all data sources on the LinkScrape page besides the big 3 (G, Y and MSN), notifying them that we have completely blocked their crawlers from 23 webservers encompassing thousands of websites, due to the data harvesting practices of SEOMoz.org, who are utilizing and selling their data services.

I doubt these companies will be impressed when the value of their services is lowered, user experience reduced, etc., because of some company in Seattle flogging $79-a-month subscriptions with the data. Especially when they realize they have been blocked from sub-5,000 Alexa sites due to SEOMoz.</description>
		<content:encoded><![CDATA[<p>Doug (or anyone else), feel free to write a post with this info, but as Michael points out, it&#8217;s not a single bulletproof solution, because they can buy more IPs and it doesn&#8217;t stop the &#8220;other sources&#8221;. But it will block the /28 IP block that has been assigned to Dotbot.</p>
<p>The /28 is a block of 16 IP addresses, in this case from:</p>
<p>208.115.111.240</p>
<p>to</p>
<p>208.115.111.255</p>
<p>So if you firewall those 2 IPs and the 14 in between, you will be blocking the Dotbot IP range. The Dotbot site itself is hosted at the <a href="http://208.115.111.242" rel="nofollow">http://208.115.111.242</a> IP.</p>
<p>In the past I have found that the best way to deal with situations like this is to target the data sources rather than trying to fight the offender.</p>
<p>Our company is writing letters to all data sources on the LinkScrape page besides the big 3 (G, Y and MSN), notifying them that we have completely blocked their crawlers from 23 webservers encompassing thousands of websites, due to the data harvesting practices of SEOMoz.org, who are utilizing and selling their data services.</p>
<p>I doubt these companies will be impressed when the value of their services is lowered, user experience reduced, etc., because of some company in Seattle flogging $79-a-month subscriptions with the data. Especially when they realize they have been blocked from sub-5,000 Alexa sites due to SEOMoz.</p>
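<p>For anyone who wants to double-check the /28 arithmetic above, here is a small sketch using Python&#8217;s standard <code>ipaddress</code> module. The helper name <code>in_dotbot_range</code> is made up for illustration; only the range itself comes from this thread.</p>

```python
import ipaddress

# The /28 quoted in the comment above.
DOTBOT_RANGE = ipaddress.ip_network("208.115.111.240/28")


def in_dotbot_range(ip: str) -> bool:
    """Return True if the given IPv4 address falls inside the /28 block."""
    return ipaddress.ip_address(ip) in DOTBOT_RANGE


# A /28 leaves 4 host bits, so 2**4 = 16 addresses in the block.
print(DOTBOT_RANGE.num_addresses)          # number of addresses in the block
print(DOTBOT_RANGE[0], DOTBOT_RANGE[-1])   # first and last address
print(in_dotbot_range("208.115.111.242"))  # the address the Dotbot site uses
```

<p>The same check can be used to confirm that an address seen in your logs really belongs to the range before you firewall it.</p>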
]]></content:encoded>
	</item>
	<item>
		<title>By: Michael VanDeMar</title>
		<link>http://smackdown.blogsblogsblogs.com/2008/10/17/how-to-block-the-bots-seomoz-isnt-telling-you-about/comment-page-2/#comment-10479</link>
		<dc:creator>Michael VanDeMar</dc:creator>
		<pubDate>Tue, 21 Oct 2008 16:10:30 +0000</pubDate>
		<guid isPermaLink="false">http://smackdown.blogsblogsblogs.com/?p=132#comment-10479</guid>
		<description>Well, blocking by IP is OK as long as they don&#039;t decide to switch or add new IPs. The thing is, that&#039;s one bot, and I still firmly believe that it only accounts for a small amount of the collection process they are claiming. It&#039;s the only way that things add up.

Also, since that&#039;s like closing the barn door after SEOmoz kicked it in and raped the horses, here&#039;s a possible method for dealing with what they took already:

http://smackdown.blogsblogsblogs.com/2008/10/21/how-to-remove-your-website-from-linkscape-without-an-seomoz-meta-tag/</description>
		<content:encoded><![CDATA[<p>Well, blocking by IP is OK as long as they don&#8217;t decide to switch or add new IPs. The thing is, that&#8217;s one bot, and I still firmly believe that it only accounts for a small amount of the collection process they are claiming. It&#8217;s the only way that things add up.</p>
<p>Also, since that&#8217;s like closing the barn door after SEOmoz kicked it in and raped the horses, here&#8217;s a possible method for dealing with what they took already:</p>
<p><a href="http://smackdown.blogsblogsblogs.com/2008/10/21/how-to-remove-your-website-from-linkscape-without-an-seomoz-meta-tag/" rel="nofollow">http://smackdown.blogsblogsblo.....-meta-tag/</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Doug Heil</title>
		<link>http://smackdown.blogsblogsblogs.com/2008/10/17/how-to-block-the-bots-seomoz-isnt-telling-you-about/comment-page-2/#comment-10478</link>
		<dc:creator>Doug Heil</dc:creator>
		<pubDate>Tue, 21 Oct 2008 16:02:59 +0000</pubDate>
		<guid isPermaLink="false">http://smackdown.blogsblogsblogs.com/?p=132#comment-10478</guid>
		<description>@NotHappy: I would appreciate it if you could please write up a blog post about how &quot;exactly&quot; website owners and webmasters can block this bot. If not you, someone you give the info to would suffice. I think it would be very beneficial to the industry. This bot is like any ol&#039; rogue bot not wanted on our servers. It needs to be dealt with by a good majority. What I mean is, make things very clear, so that a new webmaster/owner can clearly see what to do, step by step.</description>
		<content:encoded><![CDATA[<p>@NotHappy: I would appreciate it if you could please write up a blog post about how &#8220;exactly&#8221; website owners and webmasters can block this bot. If not you, someone you give the info to would suffice. I think it would be very beneficial to the industry. This bot is like any ol&#8217; rogue bot not wanted on our servers. It needs to be dealt with by a good majority. What I mean is, make things very clear, so that a new webmaster/owner can clearly see what to do, step by step.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: NotHappy</title>
		<link>http://smackdown.blogsblogsblogs.com/2008/10/17/how-to-block-the-bots-seomoz-isnt-telling-you-about/comment-page-2/#comment-10477</link>
		<dc:creator>NotHappy</dc:creator>
		<pubDate>Tue, 21 Oct 2008 15:35:52 +0000</pubDate>
		<guid isPermaLink="false">http://smackdown.blogsblogsblogs.com/?p=132#comment-10477</guid>
		<description>Just as I thought, Rand: the old &quot;Can&#039;t say, competitive intelligence&quot; line, which puts SEOMoz at the bottom of the barrel with all the other scrapers. Actually, the other scrapers are better: they are harvesting it for things like spam, which I can identify and do something about.

So it looks like your Dotbot (which may be a red herring anyway) has been issued a /28, so:

# iptables -A INPUT -s 208.115.111.240/28 -j DROP

That will block 208.115.111.240 - 208.115.111.255 at the firewall. (The Dotbot site is using 208.115.111.242 if you visit it in the browser.)

BTW Rand, how did you receive a /28 IP allocation? Running a scraper network is not valid ARIN justification. Might have to send off an email.

Anyhow, everyone can issue this command to block the range:

# iptables -A INPUT -s 208.115.111.240/28 -j DROP

Alternatively, you can block that IP range with .htaccess if you don&#039;t have root access. I will dig into the logs and see what this LinkScraper is doing.

Sorry to say, all the respect I had for SEOMoz is now gone, and I will not be visiting it again.</description>
		<content:encoded><![CDATA[<p>Just as I thought, Rand: the old &#8220;Can&#8217;t say, competitive intelligence&#8221; line, which puts SEOMoz at the bottom of the barrel with all the other scrapers. Actually, the other scrapers are better: they are harvesting it for things like spam, which I can identify and do something about.</p>
<p>So it looks like your Dotbot (which may be a red herring anyway) has been issued a /28, so:</p>
<p># iptables -A INPUT -s 208.115.111.240/28 -j DROP</p>
<p>That will block 208.115.111.240 &#8211; 208.115.111.255 at the firewall. (The Dotbot site is using 208.115.111.242 if you visit it in the browser.)</p>
<p>BTW Rand, how did you receive a /28 IP allocation? Running a scraper network is not valid ARIN justification. Might have to send off an email.</p>
<p>Anyhow, everyone can issue this command to block the range:</p>
<p># iptables -A INPUT -s 208.115.111.240/28 -j DROP</p>
<p>Alternatively, you can block that IP range with .htaccess if you don&#8217;t have root access. I will dig into the logs and see what this LinkScraper is doing.</p>
<p>Sorry to say, all the respect I had for SEOMoz is now gone, and I will not be visiting it again.</p>
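<p>For the log digging mentioned above, something along these lines might be a starting point. It is a rough Python sketch that assumes an access log whose first whitespace-separated field is the client IP (as in the common and combined log formats); the function name <code>count_dotbot_hits</code> is hypothetical.</p>

```python
import ipaddress
from collections import Counter

# The /28 from the iptables rule above.
DOTBOT_RANGE = ipaddress.ip_network("208.115.111.240/28")


def count_dotbot_hits(log_lines):
    """Count requests per client IP for addresses inside the /28.

    Assumes the client IP is the first whitespace-separated field on each
    line; lines that do not start with a valid IP are skipped.
    """
    hits = Counter()
    for line in log_lines:
        fields = line.split(maxsplit=1)
        if not fields:
            continue  # blank line
        try:
            ip = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # first field is a hostname or garbage, not an IP
        if ip in DOTBOT_RANGE:
            hits[str(ip)] += 1
    return hits
```

<p>Pass it any iterable of lines, for example an open file object for your access log, and it returns a per-address tally for the range.</p>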
]]></content:encoded>
	</item>
</channel>
</rss>
