Here is a nice, simple way to stop the majority of robots from spidering files they don't need (or shouldn't have) access to.
Place the following code in a file named robots.txt and upload it to your domain root, so that visiting http://forums.site.com/robots.txt returns this file.
This will stop them from trying to access files that won't have anything interesting to spider. It also stops them from fetching images, which should save bandwidth when you have lots of spiders on your forums.
Note: the majority of spiders check for this file, but not all do.
Code:
User-agent: *
Disallow: /attachment.php
Disallow: /avatar.php
Disallow: /editpost.php
Disallow: /member.php
Disallow: /member2.php
Disallow: /misc.php
Disallow: /moderator.php
Disallow: /newreply.php
Disallow: /newthread.php
Disallow: /online.php
Disallow: /poll.php
Disallow: /postings.php
Disallow: /printthread.php
Disallow: /private.php
Disallow: /private2.php
Disallow: /report.php
Disallow: /search.php
Disallow: /sendtofriend.php
Disallow: /threadrate.php
Disallow: /usercp.php
Disallow: /admin/
Disallow: /images/
Disallow: /mod/

(Each Disallow path must start with a leading slash, and each rule goes on its own line, or spiders will ignore them.)
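If you want to double-check the rules before uploading, Python's standard library ships a robots.txt parser you can run against them locally. This is just a sketch using a few of the rules above; the forums.site.com URLs are the example domain from this post, so substitute your own.

```python
from urllib.robotparser import RobotFileParser

# A few of the rules from the robots.txt above, parsed locally
# instead of being fetched over HTTP.
rules = """\
User-agent: *
Disallow: /search.php
Disallow: /admin/
Disallow: /images/
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

# Well-behaved spiders should skip the disallowed paths...
print(parser.can_fetch("*", "http://forums.site.com/search.php"))       # False
print(parser.can_fetch("*", "http://forums.site.com/admin/index.php"))  # False
# ...but remain free to crawl normal thread pages.
print(parser.can_fetch("*", "http://forums.site.com/showthread.php?t=1"))  # True
```

Remember this only tests what a compliant spider would do; as noted above, badly behaved robots ignore robots.txt entirely.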