That’s what I wanted to believe when I started with search engine optimization. I knew that you could block bots with .htaccess, but I thought the more bots that reached my pages, the better. The truth is that you should always protect your website against content scrapers. One common way to pull down a website’s content is the Wget command. Thousands of people these days want to take advantage of your hard work and mine to make a quick buck. They write robots that crawl the net, pretend to come from legitimate sites such as Google and MSN, and grab your content. Now who wants that?
The problem with these robots is not just that they steal your content, but also the bandwidth they consume. If your site is in high demand, you can expect these robots to eat into your bandwidth and slow down your website’s response time. At the end of the day, you want to make sure that only legitimate sites get access to your valuable content. Using robots.txt lets you at least turn away the scrapers that honor it; the ones that ignore it have to be blocked on the server, for example with .htaccess. Read more about it here.
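To make the robots.txt idea concrete, here is a minimal sketch in Python of how a well-behaved crawler would read such a file. The rules in ROBOTS_TXT, the example.com URLs, and the helper names build_parser and is_allowed are made up for illustration and are not taken from any real site. Keep in mind that robots.txt is only a request: this shows what a polite bot does with it, not a guarantee that a scraper will obey.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: turn Wget away entirely and keep
# every other bot out of /private/.
ROBOTS_TXT = """\
User-agent: Wget
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules let this user agent fetch the URL."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = build_parser(ROBOTS_TXT)
    for agent, url in [
        ("Wget/1.21", "https://example.com/index.html"),
        ("Mozilla/5.0", "https://example.com/index.html"),
        ("Mozilla/5.0", "https://example.com/private/a.html"),
    ]:
        print(agent, url, is_allowed(rules, agent, url))
```

A scraper can simply skip this check or send a fake user agent, which is why the server-side blocking mentioned at the start still matters.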




