Lots of people want to block the Baidu spider, since it doesn't care about robots.txt and can eat up your server's resources.
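For a well-behaved crawler, a robots.txt rule like this would normally be enough (Baiduspider is the user-agent name Baidu's crawler announces itself with), but since Baidu is reported to ignore it, it only serves as a polite first step:

```
User-agent: Baiduspider
Disallow: /
```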
For those who don't know what Baidu is: it's a Chinese search engine.
I run a big vBulletin forum, and Baidu really gives the server a hard time; it's constantly on the forum indexing stuff. The forum is in a European language, so I don't really see any users from China. Why would I need Baidu's spiders? So I blocked them.
This is how you can block Baidu too.
Find your .htaccess file. It's an Apache feature, so if your server runs IIS on Windows you're out of luck, but most Linux servers run Apache and support .htaccess.
Log in via FTP, look for a file called ".htaccess" in your site's root directory and open it. Add these lines to the top of the file:
BrowserMatchNoCase Baiduspider bad_bot
Order Allow,Deny
Allow from all
Deny from env=bad_bot
Then save it. If you don't have a ".htaccess" file, create one in a plain-text editor like Notepad and save it as ".htaccess" (make sure the editor doesn't tack a ".txt" extension onto it). If it doesn't work, you might have to ask your web host what's wrong.
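Note that the Order/Allow/Deny directives are the Apache 2.2 style; on Apache 2.4 they only work if the mod_access_compat module is loaded. A sketch of the equivalent using the newer Require syntax (assuming mod_setenvif and mod_authz_core are available, which they are in a default Apache 2.4 build) would look like this:

```
# Apache 2.4 variant: tag requests whose User-Agent contains
# "Baiduspider" and refuse them with 403 Forbidden
<IfModule mod_authz_core.c>
    BrowserMatchNoCase Baiduspider bad_bot
    <RequireAll>
        Require all granted
        Require not env bad_bot
    </RequireAll>
</IfModule>
```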
If you're thinking about blocking Baidu's IPs instead, you'll have a lot of work ahead of you, since you'll have to block new IPs nearly every day; Baidu seems to get new ones all the time. So the easiest way to block Baidu is with .htaccess.
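If your host has mod_rewrite enabled (most do, since vBulletin and similar scripts rely on it), an alternative sketch is to match the user-agent with a rewrite rule and return 403 Forbidden:

```
# Block Baiduspider by User-Agent via mod_rewrite
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} Baiduspider [NC]
RewriteRule .* - [F,L]
```

This does the same job as the env-variable approach above; use whichever style fits the rules already in your .htaccess.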