Blocking spider crawling on Apache, IIS6, and IIS7 dedicated-IP hosts (applies to VPS, cloud hosts, and servers)

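If the visits come from legitimate search engine spiders, blocking them is not recommended: the site would lose its indexing and ranking in Baidu and other search engines, which in turn means lost visitors and customers. Consider first upgrading the virtual hosting plan to get a larger traffic allowance, or upgrading to a cloud server (unmetered traffic).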

Linux: rule file .htaccess (create the .htaccess file manually in the site root directory)


<IfModule mod_rewrite.c>

RewriteEngine On

#Block spider

RewriteCond %{HTTP_USER_AGENT} "Webdup|AcoonBot|AhrefsBot|Ezooms|EdisterBot|EC2LinkFinder|jikespider|Purebot|MJ12bot|WangIDSpider|WBSearchBot|Wotbox|xbfMozilla|Yottaa|YandexBot|Jorgee|SWEBot|spbot|TurnitinBot-Agent|mail.RU|curl|perl|Python|Wget|Xenu|ZmEu" [NC]

RewriteRule !(^robots\.txt$) - [F]

</IfModule>
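
Before deploying the rule, it can help to check which user agents it would reject. The following is a minimal Python sketch that approximates the logic of the rule above, not Apache's actual evaluation: the function name `would_block` and the sample user-agent strings are assumptions made for illustration.

```python
import re

# Same alternation as the RewriteCond above; re.IGNORECASE stands in for [NC].
BLOCKED_UA = re.compile(
    r"Webdup|AcoonBot|AhrefsBot|Ezooms|EdisterBot|EC2LinkFinder|jikespider|"
    r"Purebot|MJ12bot|WangIDSpider|WBSearchBot|Wotbox|xbfMozilla|Yottaa|"
    r"YandexBot|Jorgee|SWEBot|spbot|TurnitinBot-Agent|mail.RU|curl|perl|"
    r"Python|Wget|Xenu|ZmEu",
    re.IGNORECASE,
)


def would_block(user_agent, path):
    """Approximate the rule: answer 403 unless the request is for robots.txt."""
    if path.lstrip("/") == "robots.txt":
        return False  # robots.txt stays readable, as in the RewriteRule
    return BLOCKED_UA.search(user_agent) is not None


if __name__ == "__main__":
    samples = [
        ("Mozilla/5.0 (compatible; AhrefsBot/7.0; +http://ahrefs.com/robot/)", "/index.html"),
        ("Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)", "/"),
        ("curl/8.4.0", "/robots.txt"),
    ]
    for ua, path in samples:
        print(would_block(ua, path), path, ua)
```

Because the pattern is matched as a substring, short tokens such as `curl`, `perl`, and `Python` also reject any client whose user agent merely contains those words.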


Windows 2003: rule file httpd.conf


#Block spider

RewriteCond %{HTTP_USER_AGENT} (Webdup|AcoonBot|AhrefsBot|Ezooms|EdisterBot|EC2LinkFinder|jikespider|Purebot|MJ12bot|WangIDSpider|WBSearchBot|Wotbox|xbfMozilla|Yottaa|YandexBot|Jorgee|SWEBot|spbot|TurnitinBot-Agent|mail.RU|curl|perl|Python|Wget|Xenu|ZmEu) [NC]

RewriteRule !(^/robots\.txt$) - [F]


Windows 2008: web.config


<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <system.webServer>
     <rewrite>  
       <rules>         

<rule name="Block spider">

      <match url="(^robots\.txt$)" ignoreCase="false" negate="true" />

      <conditions>

        <add input="{HTTP_USER_AGENT}" pattern="Webdup|AcoonBot|AhrefsBot|Ezooms|EdisterBot|EC2LinkFinder|jikespider|Purebot|MJ12bot" ignoreCase="true" />

      </conditions>

      <action type="CustomResponse" statusCode="403" statusReason="Forbidden" statusDescription="Forbidden" />

</rule>

        </rules>  
      </rewrite>  
     </system.webServer>
</configuration>




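Note: by default the rules block a number of unidentified spiders. To block other spiders, add their names to the pattern in the same way.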
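
Adding a name means extending the `|`-separated alternation in the user-agent pattern. As a rough sketch, the pattern can also be generated from a list so that special characters such as the dot in `mail.RU` are escaped consistently. The helper `build_user_agent_pattern` and the bot name `ExampleBot` are hypothetical and used only for illustration.

```python
import re


def build_user_agent_pattern(names):
    """Join bot names into one alternation, escaping regex metacharacters."""
    return "|".join(re.escape(name) for name in names)


# A few names taken from the rules above.
blocked = ["AhrefsBot", "MJ12bot", "mail.RU"]
blocked.append("ExampleBot")  # hypothetical spider name, not a real crawler

pattern = build_user_agent_pattern(blocked)

# The same pattern string goes into the RewriteCond line or the
# pattern="..." attribute in web.config.
rewrite_cond = 'RewriteCond %{HTTP_USER_AGENT} "' + pattern + '" [NC]'
print(rewrite_cond)
```

The generated string is only a starting point; compare it with the existing rule before replacing anything on a live site.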

Appendix: names of the major spiders:

Google spider: googlebot

Baidu spider: baiduspider

Yahoo spider: slurp

Alexa spider: ia_archiver

MSN spider: msnbot

Bing spider: bingbot

AltaVista spider: scooter

Lycos spider: lycos_spider_(t-rex)

AllTheWeb spider: fast-webcrawler

Inktomi spider: slurp

Youdao spider: YodaoBot and OutfoxBot

Retu (热土) spider: Adminrtspider

Sogou spider: sogou spider

SOSO spider: sosospider

360 Search spider: 360spider
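
Since blocking legitimate search engine spiders costs indexing and ranking, it may be worth checking a modified pattern against the names above before deploying it. The sketch below is one way to do that. The function name `accidentally_blocked` is an assumption, and matching against the bare names is only an approximation, because the full user-agent strings that real spiders send contain more text.

```python
import re

# Spider names from the list above.
MAJOR_SPIDERS = [
    "googlebot", "baiduspider", "slurp", "ia_archiver", "msnbot", "bingbot",
    "scooter", "lycos_spider_(t-rex)", "fast-webcrawler", "YodaoBot",
    "OutfoxBot", "Adminrtspider", "sogou spider", "sosospider", "360spider",
]


def accidentally_blocked(pattern):
    """Return the major spider names that a block pattern would match."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [name for name in MAJOR_SPIDERS if rx.search(name)]


if __name__ == "__main__":
    # A narrow pattern leaves the major spiders alone.
    print(accidentally_blocked("AhrefsBot|MJ12bot|spbot"))
    # An overly broad token such as "spider" catches several of them.
    print(accidentally_blocked("AhrefsBot|spider"))
```

An empty result means none of the listed names match; any name that is returned points to a token in the pattern that is too broad.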