I just thought of the best little anti-bad-robot trick ever. It’s pretty nasty, though. Many bad robots read the robots.txt file and head straight for the places they’re told to stay out of, thinking there will be good stuff for them. So in my robots.txt I added a line that disallows honeypot.php. (DON’T GO THERE!)
Anyone who accesses that page gets IP banned from my website. Thus, every robot that ignores the robots.txt file and follows the disallowed path gets banned. The only bad thing is that if you go there, you get banned too. Anyway, the honeypot file simply appends a rewrite rule containing the visitor’s IP to the htaccess file, which denies access for that IP. It also logs the event in a log file, bans.log.
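To make the idea concrete, here is a minimal sketch of what such a honeypot page could look like. It is not my actual script, just an illustration under a few assumptions: the function names `build_ban_rule` and `ban_ip` are made up for this example, it only handles IPv4 addresses, the htaccess file is assumed to sit next to the script, and `RewriteEngine On` is assumed to be set there already.

```php
<?php
// Hypothetical sketch of a honeypot page; names and file locations are assumptions.

// Build a mod_rewrite rule pair that answers 403 Forbidden for one IPv4 address.
// Returns null when the input is not a valid IPv4 address.
function build_ban_rule(string $ip): ?string
{
    if (filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) === false) {
        return null;
    }
    // Escape the dots so they match literally in the regular expression.
    $escaped = str_replace('.', '\.', $ip);
    return 'RewriteCond %{REMOTE_ADDR} ^' . $escaped . '$' . "\n"
         . 'RewriteRule ^ - [F]' . "\n";
}

// Append the ban rule to the htaccess file and note the event in the log file.
function ban_ip(string $ip, string $htaccess, string $log): bool
{
    $rule = build_ban_rule($ip);
    if ($rule === null) {
        return false;
    }
    // The leading newline guards against a file that lacks a trailing newline;
    // LOCK_EX keeps two simultaneous hits from interleaving their writes.
    file_put_contents($htaccess, "\n" . $rule, FILE_APPEND | LOCK_EX);
    file_put_contents($log, date('c') . " banned $ip\n", FILE_APPEND | LOCK_EX);
    return true;
}

// Only act when served over the web, not when run from the command line.
if (PHP_SAPI !== 'cli' && isset($_SERVER['REMOTE_ADDR'])) {
    ban_ip($_SERVER['REMOTE_ADDR'], __DIR__ . '/.htaccess', __DIR__ . '/bans.log');
    http_response_code(403);
}
```

Each ban adds its own `RewriteCond`/`RewriteRule` pair, so the htaccess file grows by two lines per banned IP. Keep in mind that a script that writes to its own htaccess file needs write permission on it, which is worth thinking twice about on a shared host.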
Now, my site is not visited by robots very often, and I have a load of anti-robot rules in my htaccess file, but I think it’s just as well to have this too. I mean, sooner or later bots are going to visit, and when that time comes they are screwed. Anyway, you can find the source of my nasty little script here: honeypot.phps. You’ll also need a robots.txt file that looks something like mine.
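If you just want the gist, a robots.txt along these lines should do. This is a minimal sketch rather than a copy of my file, and it assumes honeypot.php sits in the site root:

```
User-agent: *
Disallow: /honeypot.php
```

Well-behaved crawlers will skip the page, and anything that requests it anyway has told you what kind of robot it is.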