Google spider question


amykhar
Thu 25th Sep '03, 3:21pm
I am trying to keep Google off of certain functions because I don't want the bot wasting time trying to edit and report every post it sees.

I have set up my robots.txt file to keep it away from register.php, newreply.php, member.php, memberlist.php, newthread.php and usercp.php. I realize the bot only checks the robots.txt file when it first hits the site; so I don't expect that to take effect right away.
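A quick way to sanity-check rules like these is to run them through Python's standard `urllib.robotparser`. This is only a sketch: the robots.txt text below is assumed from the list of scripts named above, and the sample paths (including `editpost.php`, which is not in that list) are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents, reconstructed from the scripts named in the post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /register.php
Disallow: /newreply.php
Disallow: /member.php
Disallow: /memberlist.php
Disallow: /newthread.php
Disallow: /usercp.php
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Hypothetical request paths; only scripts listed above are blocked.
for path in ("/newreply.php?p=1", "/showthread.php?t=1", "/editpost.php?p=1"):
    print(path, parser.can_fetch("Googlebot", path))
```

Under these assumed rules, any script that is not listed stays crawlable, so an edit or report script would need its own Disallow line.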

BUT, I also changed my templates to not show the postbit buttons (email, pm, report, edit, reply, etc.) if somebody isn't logged in. The spider still seems to be trying to edit every post, though. I cleared the session table, and the spiders showed right back up, editing and reporting posts and trying to check IPs.

My question is, does the spider load up a bunch of main pages at once, and then start processing the links contained on those pages in a batch? If so, that would explain why I am still seeing the spiders reporting and editing posts. (Yes, I know they are getting the no permission screen.)

If that's not the case, what am I missing and why are they still monkeying around where they don't belong?

Amy

Zecherieh
Sat 27th Sep '03, 9:01pm
Just write some if-then statements based on whether it's Google or not.
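A rough sketch of what such a check might look like, written in Python for illustration rather than in the forum's own template language. The function names are hypothetical, and matching on the User-Agent header is an assumption that only works for crawlers that identify themselves honestly.

```python
def is_googlebot(user_agent: str) -> bool:
    """Return True when the User-Agent header contains the 'Googlebot' token."""
    return "googlebot" in user_agent.lower()


def show_post_buttons(user_agent: str, logged_in: bool) -> bool:
    """Hypothetical template condition: show buttons only to logged-in non-bots."""
    return logged_in and not is_googlebot(user_agent)


sample_agent = "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
print(show_post_buttons(sample_agent, logged_in=False))
```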

merk
Sat 27th Sep '03, 9:27pm
Zecherieh wrote: "Just write some if-then statements based on whether it's Google or not."
amykhar wrote: "I also changed my templates to not show the postbit buttons (email, pm, report, edit, reply, etc.) if somebody isn't logged in."
This message is too short, apparently :(

Zecherieh
Sun 28th Sep '03, 10:42am
That's what I get for not reading the whole thing :)

That said, the question remains: are you hiding the buttons but not the links? If the URL is still there in the HTML, there's a good chance Google will find it. Also realize that some of what it is crawling comes from what it already has, so if Google already holds five thousand links to your posting pages or the like, it will keep going there.
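One way to test the "hidden button, live link" theory is to save the page as a guest sees it and scan the HTML for links to the restricted scripts. The sketch below uses Python's standard `html.parser`; the script names in `RESTRICTED` and the sample markup are assumptions for illustration, not taken from the actual templates.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumed names of scripts a guest page should not link to.
RESTRICTED = {"editpost.php", "report.php", "newreply.php"}


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def restricted_links(html: str, restricted=RESTRICTED) -> list:
    """Return hrefs whose script name is in the restricted set."""
    collector = LinkCollector()
    collector.feed(html)
    found = []
    for href in collector.hrefs:
        script = urlparse(href).path.rsplit("/", 1)[-1]
        if script in restricted:
            found.append(href)
    return found


# Hypothetical guest-view markup: the edit link is hidden by CSS but still present.
sample = (
    '<a href="showthread.php?t=1">Thread</a> '
    '<a href="editpost.php?p=5" style="display:none">Edit</a>'
)
print(restricted_links(sample))
```

If the list comes back non-empty for a logged-out page view, the link is still in the markup and a crawler can follow it regardless of whether the button is visible.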