The more complex a site gets, the harder it is to keep track of and monitor duplicate pages on it. Many widely used e-commerce platforms suffer from this, and it can make the search engines kick up a bit of a fuss.
As with a lot of ‘bug hunting’ in computing, half the battle is actually looking for the duplicates, but if you know the most likely places to look, you can make life a bit easier for yourself.
Session IDs:
These are strings of text that are often added to the end of a URL to help the system track a user throughout the site. They are very useful but can cause a big duplication problem: if the crawler is handed a new ID on each visit, the search engine ends up caching a different URL for the same page every time. Keep an eye out on your site for something like this:
www.domain.com?sid=78954378256232653654
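As a rough illustration of how you might spot or tidy these up, here is a minimal Python sketch that strips session parameters from a URL so that two visits to the same page compare as equal. The parameter names in SESSION_PARAMS and the function name strip_session_id are assumptions made for this example; check what your own platform actually appends.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list of session parameter names; adjust to your platform.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid"}


def strip_session_id(url: str) -> str:
    """Return the URL with any session-tracking query parameters removed."""
    parts = urlsplit(url)
    # Keep every query parameter except the ones that only track the session.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    print(strip_session_id("http://www.domain.com/shoes?page=2&sid=78954378256232653654"))
```

Running every URL from a crawl or a log file through a function like this, and then counting how many collapse to the same result, gives a quick feel for how much duplication session IDs are causing.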
Pagination:
When you have too much content for one page, it is good practice to split it into sections with links at the bottom showing page 1, page 2, page 3 and so on. Take care that your site isn’t doing something dodgy with the URLs here. Often, the URLs will differ from one link to the next even though the content they point to is the same.
Products:
It’s easy to pick up different URLs on e-commerce sites when the same product can be reached from many different locations. A separate URL for one product can be cached from category pages, search pages, popular-product lists, direct links and so on, so keep an eye on it.
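One simple way to hunt for this kind of duplication is to fingerprint the content of each page and group the URLs that share a fingerprint. The Python sketch below assumes you have already fetched the pages into a dictionary of URL to HTML; the function name find_duplicates and the example paths are made up for illustration, and an exact hash will only catch pages that are identical apart from whitespace.

```python
import hashlib
from collections import defaultdict


def find_duplicates(pages: dict[str, str]) -> list[list[str]]:
    """Group URLs whose page content is identical after whitespace cleanup."""
    groups = defaultdict(list)
    for url, html in pages.items():
        # Collapse whitespace so trivial formatting differences do not hide a match.
        normalized = " ".join(html.split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    # Only fingerprints shared by more than one URL are duplicates.
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


if __name__ == "__main__":
    # Hypothetical pages: the same product reached via a category and via search.
    pages = {
        "/category/shoes/red-boot": "<h1>Red Boot</h1>",
        "/search?q=boot&item=red-boot": "<h1>Red  Boot</h1>",
        "/about": "<h1>About</h1>",
    }
    for group in find_duplicates(pages):
        print(group)
```

On a real site the pages will rarely match byte for byte, because navigation, breadcrumbs and the like tend to differ, so you would probably want to fingerprint just the main product content rather than the whole page.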
Sort functions:
Any page where a list can be sorted, such as a category page, can easily be duplicated by the sort function, because each sort order typically gets its own URL while showing the same items.
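A common remedy for sorted lists is to point every sorted variant back at the unsorted page with a canonical link. The Python sketch below builds such a tag by dropping the sort parameters from the URL. The names in SORT_PARAMS and the function canonical_link_tag are assumptions for this example; your platform may well use different parameter names.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical sort-related parameter names; adjust to your platform.
SORT_PARAMS = {"sort", "order", "dir"}


def canonical_link_tag(url: str) -> str:
    """Build a canonical link tag that ignores sort parameters and fragments."""
    parts = urlsplit(url)
    # Drop the parameters that only change the ordering of the list.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SORT_PARAMS
    ]
    canonical = urlunsplit(parts._replace(query=urlencode(kept), fragment=""))
    # Escape the URL so it is safe inside an HTML attribute.
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}">'


if __name__ == "__main__":
    print(canonical_link_tag("http://www.domain.com/shoes?sort=price&order=desc"))
```

The resulting tag would go in the head of every sorted version of the page, so the search engine knows which URL to treat as the original.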
Sorting these problems out can make a real difference, and it is usually pretty easy to do once you know how. There is plenty of information on 301 redirects, .htaccess, robots.txt files, nofollow attributes and canonical link tags right here on this site to sort out any duplication problems you may have. Good luck.
Simon Davies
SEO Programmer
From my experience duplicate content on the same domain name does not cause any penalties.
It’s logical that you could have duplicate content on your domain.