Fast-Webcrawler??

  • MarkB
    Senior Member
    • Apr 2001
    • 1253

    Fast-Webcrawler??

    Does anyone know what Fast-Webcrawler is? It's showing up as a spider on my web logs, and in a day and a half has made (according to the logs) over 23,000 hits on my site -- could this be why my bandwidth is going through the roof?

    It's not something I want, obviously, so any tips on stopping spiders in their tracks?
  • filburt1
    Senior Member
    • Feb 2002
    • 6606

    #2
    Check out how to add a robots.txt to your root dir to stop bots.
    --filburt1, vBulletin.org/vBulletinTemplates.com moderator
    Web Design Forums.net: vB Board of the Month
    vBulletin Mail System (vBMS): webmail for your forum users
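
A quick way to sanity-check a robots.txt rule before waiting on the logs is to run it through Python's standard `urllib.robotparser`. This is only a sketch: the two-line robots.txt, the `is_blocked` helper, and the example.com URL are assumptions for illustration, and the user-agent token is the one that appears in the log entry posted further down the thread. Keep in mind that robots.txt is advisory, so it only helps if the crawler chooses to honor it.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents: block FAST-WebCrawler everywhere,
# leave every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: FAST-WebCrawler
Disallow: /
"""


def is_blocked(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt tells user_agent not to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URL, used only to exercise the rule.
    url = "http://example.com/forum/showthread.php?threadid=21745"
    print(is_blocked("FAST-WebCrawler/3.5", url))
    print(is_blocked("Googlebot/2.1", url))
```

If the first check does not report the crawler as blocked, the rule has a typo or sits under the wrong User-agent line.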


    • MarkB
      Senior Member
      • Apr 2001
      • 1253

      #3
      I've just added one, thanks. But I understand webcrawler ignores robots.txt files? Or at least that's what I'm understanding after reading some Usenet posts...

      Here's hoping it's blocked now. Cheers!


      • andrewpfeifer
        Senior Member
        • Oct 2000
        • 729
        • 3.5.x

        #4


        That may help you, also, if the previous thing didn't.
        - Andrew Pfeifer


        • MarkB
          Senior Member
          • Apr 2001
          • 1253

          #5
          I have put in the robots.txt file as well as the NOINDEX meta tag, which has slowed it a little, but it's still made almost 10,000 hits since yesterday.


          • Thomas P
            Senior Member
            • Apr 2001
            • 1497
            • 5.6.4

            #6
            Hi,
            maybe checking the netblock of the IP the bot comes from (e.g. at www.all-nettools.com or a similar tool) and contacting that ISP will help.
            good luck anyway,
            -Tom
            www.MCSEboard.de
            German Windows Server & IT Pro Community dedicated to Windows Client & Server Systems. MVPs inside


            • filburt1
              Senior Member
              • Feb 2002
              • 6606

              #7
              You can also try asking your host to block requests from that source. Better yet, add a .htaccess file that bans that IP.

              • MarkB
                Senior Member
                • Apr 2001
                • 1253

                #8
                Originally posted by filburt1
                You can also try asking your host to block requests from that source. Better yet, add a .htaccess file that bans that IP.
                What format would the .htaccess file take?


                • Ian
                  Senior Member
                  • Mar 2002
                  • 132

                  #9
                  Originally posted by MarkB


                  What format would the .htaccess file take?
                  Something like this:
                  Code:
                  <Limit GET>
                  order allow,deny
                  deny from 123.456.789.0
                  deny from 123.45.67
                  allow from all
                  </Limit>


                  • MarkB
                    Senior Member
                    • Apr 2001
                    • 1253

                    #10
                    Thanks - did that, and access to my forums came up with a 500 Error


                    • filburt1
                      Senior Member
                      • Feb 2002
                      • 6606

                      #11
                      500 is Internal Server Error. You didn't deny yourself access, did you?

                      • neocivitas
                        Senior Member
                        • Apr 2002
                        • 142

                        #12
                        Originally posted by Ian

                        Code:
                        <Limit GET>
                        order allow,deny
                        deny from 123.456.789.0
                        deny from 123.45.67
                        allow from all
                        </Limit>
                        What IPs should you have listed to deny?


                        • MarkB
                          Senior Member
                          • Apr 2001
                          • 1253

                          #13
                          Originally posted by filburt1
                          500 is Internal Server Error. You didn't deny yourself access, did you?
                          hehehe - no

                          This is the log entry for FAST-WebCrawler:

                          Code:
                          66.77.73.69 - - [28/Apr/2002:20:23:56 -0400] "GET /forum/showthread.php?goto=lastpost&threadid=21745 HTTP/1.0" 302 0 "-" "FAST-WebCrawler/3.5 (atw-crawler at fast dot no; http://fast.no/support.php?c=faqs/crawler)"
                          So I put 66.77.73 in the htaccess file.

                          I also have robots.txt with the correct deny info as per their FAQ...
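
On the question of which IPs to list: one option is to pull them out of the access log by user agent rather than reading entries by eye. The sketch below assumes the combined log format shown in the entry above. The `crawler_prefixes` helper and the second sample line (a made-up browser hit from a documentation-only address) are illustrative and not from anyone's real logs.

```python
import re
from collections import Counter

# Combined log format: ip, two ident fields, [timestamp], "request",
# status, size, "referer", "user agent".
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"$'
)


def crawler_prefixes(lines, agent_substring):
    """Count hits per three-octet IP prefix for user agents containing agent_substring."""
    counts = Counter()
    needle = agent_substring.lower()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and needle in match.group("agent").lower():
            # "66.77.73.69" -> "66.77.73", the partial-IP form used in .htaccess
            prefix = ".".join(match.group("ip").split(".")[:3])
            counts[prefix] += 1
    return counts


SAMPLE = [
    # Entry taken from the post above.
    '66.77.73.69 - - [28/Apr/2002:20:23:56 -0400] "GET /forum/showthread.php?goto=lastpost&threadid=21745 HTTP/1.0" 302 0 "-" "FAST-WebCrawler/3.5 (atw-crawler at fast dot no; http://fast.no/support.php?c=faqs/crawler)"',
    # Hypothetical ordinary visitor, invented for contrast.
    '192.0.2.10 - - [28/Apr/2002:20:24:01 -0400] "GET /forum/index.php HTTP/1.0" 200 5120 "-" "Mozilla/4.0 (compatible)"',
]

if __name__ == "__main__":
    for prefix, hits in crawler_prefixes(SAMPLE, "FAST-WebCrawler").most_common():
        print(prefix, hits)
```

Run over a full log file, the prefixes with the highest counts are the candidates for `deny from` lines. It is worth checking that no ordinary visitors share a prefix before banning the whole block.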


                          • MarkB
                            Senior Member
                            • Apr 2001
                            • 1253

                            #14
                            This is the htaccess file:

                            Code:
                            <Limit GET>
                            order allow,deny
                            deny from 66.77.73
                            allow from all
                            </Limit>
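
It can look odd that `allow from all` comes after the `deny` line and the crawler is still refused. With `order allow,deny`, Apache evaluates all the Allow directives first and then all the Deny directives, so a client that matches both is refused, and a client that matches neither is refused by default. The Python below is only a rough model of that rule, not Apache's own code, and the function names are made up for illustration. Note also that `<Limit GET>` leaves other methods such as POST unrestricted.

```python
def matches(spec, ip):
    """True if ip is covered by spec: 'all', a full IP, or a partial IP prefix."""
    if spec.lower() == "all":
        return True
    # A partial IP such as "66.77.73" covers every address starting "66.77.73."
    return ip == spec or ip.startswith(spec.rstrip(".") + ".")


def is_allowed(ip, allow, deny):
    """Model of 'order allow,deny': must match an Allow and must not match a Deny."""
    allowed = any(matches(spec, ip) for spec in allow)
    denied = any(matches(spec, ip) for spec in deny)
    return allowed and not denied


if __name__ == "__main__":
    allow = ["all"]
    deny = ["66.77.73"]
    print(is_allowed("66.77.73.69", allow, deny))  # the crawler's address from the log
    print(is_allowed("66.77.74.1", allow, deny))   # a neighbouring block, not listed
```

Swapping to `order deny,allow` would reverse the outcome for the crawler, since `allow from all` would then be evaluated last and win.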


                            • MarkB
                              Senior Member
                              • Apr 2001
                              • 1253

                              #15
                              Problem solved! I had a clashing <directory> entry in my httpd.conf file - fixed that, and the bot hasn't hit my site for the last 30 minutes (and it was there 24/7 seemingly!).

                              Ahh... The Days Of Our Server continues
