Over the last few days there has been a bit of talk about the standard being developed to replace the robots.txt file that every SEO uses. Robots.txt was developed in 1994, when bandwidth was more precious, to stop search engines crawling sites too often and using up people’s bandwidth. The search engines followed the standard, though compliance was entirely voluntary. The new standard is called ACAP, which stands for Automated Content Access Protocol. Strictly speaking this is not Robots.txt 2.0, because robots.txt does not have a working group behind it pushing the standard forward. The ACAP group was put together in 2006 and is backed by major European newspapers and publishers. ACAP has been devised to give publishers more control over their content.
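For reference, here is a rough sketch of the voluntary robots.txt check that a well-behaved crawler performs before fetching a page. It uses Python's standard `urllib.robotparser` module. The robots.txt content, the `ExampleBot` name and the example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a parser a crawler can query."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Anything not explicitly disallowed may be crawled.
news_allowed = parser.can_fetch("ExampleBot", "https://example.com/news/story.html")

# Paths under /private/ are disallowed for every user agent.
private_allowed = parser.can_fetch("ExampleBot", "https://example.com/private/report.html")

print(news_allowed, private_allowed)
```

Nothing forces a crawler to run this check, which is what makes the standard voluntary.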
As it stands, search engines assume by default that a whole site can be crawled. ACAP wants to change that, so that by default sites are not crawled and the site owner tells the search engine what it can crawl and when. One of the main features included in ACAP is ‘time based inclusion / exclusion’. This could mean that a news website could publish an article on its site and tell search engines to drop the page from the index after 2 weeks, when the news becomes old or irrelevant. Another feature that gives publishers more control is the ‘Present’ command, which allows content owners to control a search engine’s ability to display a copy of the website, whether it be a snippet, a miniature version of the site or a thumbnail of the website.
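To make the idea of a default of "no crawling" plus time based inclusion a bit more concrete, here is a minimal sketch of the decision a search engine might make. This is not ACAP syntax and not taken from the ACAP specification; the `Permission` class, its fields and the `may_index` function are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Permission:
    """Hypothetical publisher rule: a path may be indexed for a limited time."""
    path_prefix: str                       # part of the site the rule covers
    index_from: datetime                   # when indexing becomes allowed
    index_for: Optional[timedelta] = None  # None means no expiry


def may_index(path: str, now: datetime, permissions: List[Permission]) -> bool:
    """Default is deny: a page is indexable only if some rule allows it right now."""
    for rule in permissions:
        if not path.startswith(rule.path_prefix):
            continue
        if now < rule.index_from:
            continue
        if rule.index_for is not None and now >= rule.index_from + rule.index_for:
            continue  # the permission has expired, so the page should be dropped
        return True
    return False


# A news article that may stay in the index for two weeks after publication.
published = datetime(2008, 1, 1)
rules = [Permission("/news/", published, timedelta(weeks=2))]

print(may_index("/news/story.html", datetime(2008, 1, 10), rules))     # within two weeks
print(may_index("/news/story.html", datetime(2008, 1, 20), rules))     # after two weeks
print(may_index("/archive/old.html", datetime(2008, 1, 10), rules))    # no rule, so denied
```

The important difference from robots.txt is the last case: with no matching rule, the answer is no rather than yes.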
The question is: should we use it? My current opinion is that, in its current state, no. Search engines have yet to adopt the new standard, though Google has said in a statement that it will be looking into it. As the standard has been backed by major publishing websites, it seems to be geared towards protecting content for the major sites. An update, and a group pushing the technology forward, is good, as robots.txt was basically devised and then left. People should be able to have complete control over their website and access to it. Whether ACAP is the answer, nobody knows at this point. If the search engines don’t recognise it and take it on board then it’s been a big waste of time, but if they do then this could be the next step up from robots.txt and something we will all be using in the future. In the meantime we need to wait and see what happens.
Gary
SEO Programmer