Posted In Web Design on April 13th, 2011
Over the weekend I have been coding a new BOT protection system for the new version of PixelGrafters. For many years now PixelGrafters has been bombarded by bots because of the dynamic content I display, and it annoys me that I cannot easily tell which visitors are human. I decided I wanted that to change, so I got coding. Here are some details on what I came up with over the past 3 days.

^ PixelGrafters Admin Panel ^
The idea was simple: ban nasty bots by sending them 403 errors, and allow good ones to crawl as normal. I started off with a little research on the practices of bad bots. It turns out that a bad bot will usually spoof its user agent, so you can’t block it by name through robots.txt, and most of the time it will do the opposite of the rules you lay down in the robots.txt file. This made me think I could trap them with some “honey traps”.
The first idea was to deny access to a directory in robots.txt and then ban any bot that visits that directory anyway. To ban them I needed to store data about them, so with some nifty PHP I logged all the info the server could give me and did a reverse DNS lookup on them, which is only a single function in PHP (nice!). Once that data is stored, if they come back with the same IP address they are still banned and sent a 403 error.
The ban is IP address specific, so if they come back with a different IP address they will be allowed in. But if they act in the same way they will eventually be banned again, and I imagine the owners of the BOT should eventually run out of IP addresses to use.
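To make the honey trap idea concrete, here is a minimal sketch. It is written in Python rather than the PHP used on the site, and it is not the actual code: the trap directory name, the in-memory `banned` dictionary and the `handle_request` function are all made up for illustration. The reverse DNS lookup uses Python's `socket.gethostbyaddr`, and it can be swapped out through the `lookup` parameter.

```python
import socket

TRAP_PREFIX = "/trap/"  # hypothetical directory, disallowed in robots.txt
banned = {}             # IP address -> details logged at ban time (stand-in for a database)


def reverse_dns(ip):
    """Return the host name for an IP address, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def handle_request(ip, path, user_agent, lookup=reverse_dns):
    """Return the HTTP status code to send for this request."""
    if ip in banned:
        return 403  # a banned IP address keeps getting a 403
    if path.startswith(TRAP_PREFIX):
        # A crawler that obeys robots.txt should never request this path.
        banned[ip] = {"user_agent": user_agent, "path": path, "host": lookup(ip)}
        return 403
    return 200
```

A visitor that requests `/trap/anything` gets a 403 and is remembered, so later requests from the same IP address also get a 403, while a different IP address starts with a clean slate.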
I also wanted to create an Admin Panel for any BOT that wasn’t automatically caught by a honey trap, which would allow me to ban manually. This prompted a restructuring of the database and the system I started with, so that it now contains 4 tables: one for nice bots (Google etc.), one for banned bad bots, one for the locations an IP address has been to, and one for unassigned visits.
This was a massive job and took a long time and quite a bit of debugging to work how I wanted, but now I can log every location a visitor has been to. I can move visitors to other tables, assigning them as either a banned bot or a nice bot, and I can drop all the info about their visits from the system altogether.
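The four-table layout could look something like the sketch below. This is only a guess at the shape, written in Python with an in-memory SQLite database. The table names, column names and the `log_visit` and `assign` helpers are all hypothetical, and it assumes that assigning a visitor also clears their logged visits.

```python
import sqlite3

# Hypothetical schema: one table per category, plus the visited locations.
SCHEMA = """
CREATE TABLE nice_bots   (ip TEXT PRIMARY KEY, host TEXT);
CREATE TABLE banned_bots (ip TEXT PRIMARY KEY, host TEXT);
CREATE TABLE unassigned  (ip TEXT PRIMARY KEY, host TEXT);
CREATE TABLE locations   (ip TEXT, url TEXT);
"""


def open_db():
    """Create an in-memory database with the four tables."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def log_visit(db, ip, host, url):
    """Record a visitor as unassigned (once) and log the URL they requested."""
    db.execute("INSERT OR IGNORE INTO unassigned (ip, host) VALUES (?, ?)", (ip, host))
    db.execute("INSERT INTO locations (ip, url) VALUES (?, ?)", (ip, url))


def assign(db, ip, target):
    """Move a visitor into nice_bots or banned_bots and drop their visit data."""
    if target not in ("nice_bots", "banned_bots"):
        raise ValueError("unknown table: " + target)
    db.execute(
        "INSERT OR REPLACE INTO " + target + " (ip, host) "
        "SELECT ip, host FROM unassigned WHERE ip = ?",
        (ip,),
    )
    db.execute("DELETE FROM unassigned WHERE ip = ?", (ip,))
    db.execute("DELETE FROM locations WHERE ip = ?", (ip,))
```

The target table name is checked against a fixed list before being put into the SQL string, since table names cannot be passed as query parameters.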
As far as I can tell this now works flawlessly. I still need to add a few more honey traps, as nasty bots intentionally look for files on the server such as “/wp-login.php” so they can attempt to hack people’s WordPress accounts. This will simply be a regular expression that checks the URL for similarity to certain patterns, and if they are obviously looking for stuff they shouldn’t be, they will be banned, haha.
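A URL check of that kind might be sketched as follows, again in Python rather than PHP. Only “/wp-login.php” comes from the post; the other patterns and the `is_probe` function name are assumptions added for illustration, and a check like this only makes sense on a site that does not run WordPress itself.

```python
import re

# Paths that a normal visitor has no reason to request on this site.
# Only wp-login.php is from the post; the rest are hypothetical examples.
SUSPICIOUS = re.compile(
    r"/(wp-login\.php|wp-admin(/|$)|xmlrpc\.php|phpmyadmin(/|$))",
    re.IGNORECASE,
)


def is_probe(path):
    """Return True if the requested path looks like a probe for a known target."""
    return SUSPICIOUS.search(path) is not None
```

Any request where `is_probe` returns True could then be treated the same way as a visit to a honey trap directory.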
As far as looks go, I made the Admin Panel all pretty with my usual CSS skills. Each visit is logged in what looks like a table but is actually an unordered list. Each list item has a drop-down, which I made with MooTools jQuery API, so you can see more info when it is expanded, such as the locations the visitor has been to. Then there are the links on the right for managing bans, deleting data, etc. The image at the top of the post is the Admin Panel.