Thread: Inktomi Slurp Attack

  1. #1

    Inktomi Slurp Attack

    On August 3, 2003, the vbulletin forum on my site was targeted by the "Slurp" bot from Inktomi. For 36 hours, there were as many as 50 connections from 66.196.72.xx. I knew at the time that it was a search engine spider, but assumed that its activity would be benign. That was until I checked my logs. In those 36 hours, the spider generated 10GB of data transfer. This is over 2/3 of our monthly allotment and means that I will likely have to pay for excess bandwidth. As soon as I discovered this excess traffic, I searched the web and found out how to implement "robots.txt" to exclude bot access to the vbulletin directory. This worked, and sent the bot on its way.
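    For anyone who has not set up "robots.txt" before, here is a minimal sketch of the kind of rule described above, checked with Python's standard `urllib.robotparser`. The `/forum/` path is an assumption standing in for wherever the forum is installed, and `is_allowed` is a hypothetical helper name; the actual file used on this site is not shown in the thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents. "/forum/" is an assumed install path;
# substitute the directory your forum actually lives in.
ROBOTS_TXT = """\
User-agent: *
Disallow: /forum/
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A well-behaved spider should skip the forum but may still crawl the rest.
    print(is_allowed("Slurp", "/forum/showthread.php?threadid=1"))
    print(is_allowed("Slurp", "/index.html"))
```

    The file itself goes in the web root (so it is served as `/robots.txt`). It only helps with spiders that choose to honor it.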

    I assumed the bulk of my problems were over, but this was not the case. Just this weekend, access to my site was locked due to exceeding the disk storage allotment from our ISP. An investigation revealed that the Inktomi bot had generated so much activity that the log file of its access was 155MB! I have subsequently deleted this to restore access.

    I fully blame Inktomi for this mess and am in the process of seeking restitution. However, I fully realize that my prospects are slim. The reason that I am writing on this forum is to ask if there is something about the structure of vbulletin that compounds this problem. I would also like to know if there is something that can be done in the standard installation so that others do not experience a similar horror story. Obviously, I know that "robots.txt" is a solution. My guess is that few users will even be aware of the need for such a solution until after they experience an attack like I did.

    I have had previous forum packages installed on my site and have been visited by "Slurp" before. However, nothing like this has ever happened. For whatever reason, once the bot hits a vbulletin forum, it appears to latch on in an endless loop. I don't know that it ever would have detached on its own.

    Regards
    Don McRitchie
    Webmaster
    Lansing Heritage Website
    http://www.audioheritage.org
    Last edited by Don McR; Tue 12th Aug '03 at 1:12pm.

  2. #2
    firewire (Senior Member)
    Join Date
    May 2000
    Location
    Frankfurt/Germany
    Posts
    187
    The server logs should really NOT count for your disk quota; that's a bad business practice for a provider.

    The best tip is robots.txt. Most search engines and web spiders respect that file; I know of only a few that don't, mostly user-controlled mirroring tools. There should be a vB option to limit requests from a single IP per timeframe, but I realize this is difficult to achieve.
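    To illustrate the "limit requests from a single IP per timeframe" idea, here is a minimal in-memory sketch in Python. It is not a vB option or anything shipped with the forum software; `PerIpRateLimiter` and its parameters are hypothetical names, and a real deployment would need the counters in shared storage rather than in one process.

```python
import time
from collections import defaultdict, deque


class PerIpRateLimiter:
    """Allow at most max_requests per client IP within window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)  # ip -> timestamps of recently allowed requests

    def allow(self, ip: str) -> bool:
        """Record a request from ip and report whether it should be served."""
        now = self.clock()
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: the caller could answer with an error page
        hits.append(now)
        return True


if __name__ == "__main__":
    limiter = PerIpRateLimiter(max_requests=2, window_seconds=10.0)
    for _ in range(3):
        print(limiter.allow("192.0.2.1"))
```

    Part of the difficulty is that a spider may spread its requests across many addresses in a range, so a strict per-IP key can miss it; keying on an address prefix is one possible variation.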

  3. #3
    Quote Originally Posted by firewire
    The server logs should really NOT count for your disk quota; that's a bad business practice for a provider.

    The best tip is robots.txt. Most search engines and web spiders respect that file; I know of only a few that don't, mostly user-controlled mirroring tools. There should be a vB option to limit requests from a single IP per timeframe, but I realize this is difficult to achieve.
    The actual server logs do not count against the disk quota. However, there is a utility that generates detailed reports on site access on a weekly basis. If you don't delete the old reports, they count against the quota. One of the reports lists access to individual directories. The report for the vbulletin directory was 155MB for the one week that included the "Slurp" attack.

    I'm still curious why the spider would latch onto vbulletin and not let go. The forum database is 20MB, yet the bot downloaded 10GB just from the vbulletin directory. That is the equivalent of downloading the entire forum 500 times. I can assume that the dynamic nature of the forum has something to do with it, but this is ludicrous. Of course, Inktomi (Yahoo) will not respond to my emails.

    Regards
    Don McRitchie
    Webmaster
    Lansing Heritage Website
    http://www.audioheritage.org

  4. #4

    I have the same problem

    My forums are new and very small - www.hostcompanies.com/forums. However, that hasn't stopped the Inktomi bot from using over 1GB of my bandwidth in the last 21 days!
    Add your web design and web hosting companies to my Web Hosting Directory

  5. #5
    Joe (Senior Member)
    Join Date
    May 2000
    Location
    Highland, Utah.
    Age
    31
    Posts
    2,435
    Block it. Better yet, add some code to your vB script so bots are not shown session IDs. Apparently Slurp doesn't handle session IDs very well...
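    A plausible reason session IDs would trip up a spider: if a visitor without cookies gets a fresh ID embedded in every link, the same page shows up under an endless supply of distinct URLs. The sketch below shows the general idea of hiding the ID from bots. It is written in Python for illustration only and is not vBulletin code; the parameter name `s`, the `BOT_MARKERS` list, and the function names are all assumptions.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumptions for illustration: substrings that identify a spider's
# User-Agent, and the query parameter that carries the session ID.
BOT_MARKERS = ("slurp",)
SESSION_PARAM = "s"


def is_bot(user_agent: str) -> bool:
    """Crude check: does the User-Agent contain a known spider marker?"""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def link_for(url: str, user_agent: str) -> str:
    """Return url unchanged for normal visitors, without the session ID for bots."""
    if not is_bot(user_agent):
        return url
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != SESSION_PARAM
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    url = "showthread.php?s=abc123&threadid=42"
    print(link_for(url, "ExampleSpider Slurp"))   # session ID removed
    print(link_for(url, "ExampleBrowser/1.0"))    # unchanged
```

    With the ID stripped, every crawl sees one stable URL per page, so the spider has a finite set of links to follow.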

  6. #6

    Where do I get that from?

    Quote Originally Posted by Joe
    Block it. Better yet, add some code to your vB script so bots are not shown session IDs. Apparently Slurp doesn't handle session IDs very well...
    Do you know where I could get that code from?
    If it's just a few lines of code, could you post them here in the thread? I would appreciate it.
    I still have lots of bandwidth before I go over, but I don't like so much bandwidth being used by a bot. Besides, if I got it as bad as the poor 10 GB guy, I might go over my bandwidth.
