Sebastian's Pamphlets: New Google Dupe Filters?

Folks at WebmasterWorld, ThreadWatch and other hang-outs discuss a new duplicate content filter from Google. This odd thing seems to wipe out the SERPs, producing way more collateral damage than any other filter known to SEOs.

From what I've read, all threads concentrate on on-page and on-site factors, trying to find a way out of Google's trash can. I admit that on-page/site factors like near-duplicates produced with copy, paste and modify operations, or excessive quoting, can trigger duplicate content filters. But I don't buy that that's the whole story.

If a fair number of the vanished sites mentioned in the discussions are rather large, those sites are probably dedicated to popular themes. Popular themes are the subject of many Web sites, and the amount of unique information on popular topics isn't infinite. That is, many Web sites provide the same piece of information. The wording may differ, but there are only so many ways to rewrite a press release. The core information is identical, so many pages are considered near-duplicates, and inserting longer quotes even duplicates whole text snippets or blocks.
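
To illustrate why rewritten press releases still look alike to a machine, here is a minimal sketch of one textbook near-duplicate measure: word shingles compared with Jaccard similarity. This is not Google's filter, whose internals nobody outside Google knows. The function names, the shingle size of 3 and the sample sentences are my own assumptions.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        # Text shorter than one shingle: treat the whole text as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity(text_a, text_b, k=3):
    """Similarity between 0.0 (no shared shingles) and 1.0 (identical shingle sets)."""
    return jaccard(shingles(text_a, k), shingles(text_b, k))


# Hypothetical sample texts: two lightly reworded versions of one press release.
release_a = "Acme today announced the launch of its new widget for small businesses"
release_b = "Acme today announced the launch of its new widget for large enterprises"
unrelated = "completely unrelated text about cooking pasta at home"

score_reworded = similarity(release_a, release_b)
score_unrelated = similarity(release_a, unrelated)
```

Changing the last two words leaves most shingles intact, so the two releases score far closer to each other than to the unrelated text. A filter of this kind would only need a threshold to call them near-duplicates.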

Semantic block analysis of Web pages is not a new thing. What if Google just bought a few clusters of new machines and is now applying well-known filters to a broader set of data? This would perfectly explain why a year ago four very similar pages all ranked fine, then three of the four disappeared, and since yesterday all four are gone, because the page having the source bonus resides on a foreign Web site. To come to this conclusion, just expand the scope of the problem analysis to the whole Web. This makes sense, since Google says "Google's mission is to organize the world's information".
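
The four-pages scenario can be played through with a toy model. The sketch below is pure speculation on my part, not a description of how Google works: it assumes a filter that groups pages sharing an identical content block and keeps only the page first seen (my stand-in for the "source bonus"). The URLs, dates and the `surviving_pages` function are made up for illustration, and a real filter would match near-duplicates rather than exact blocks.

```python
import hashlib


def surviving_pages(pages, scope):
    """Return the URLs that survive a toy duplicate filter.

    pages: list of (url, site, first_seen, block_text) tuples.
    scope: set of sites the filter compares against each other.
    Within each group of pages sharing the same normalized block,
    only the earliest-seen page (the assumed "source") is kept.
    """
    groups = {}
    for url, site, first_seen, block in pages:
        if site not in scope:
            continue
        # Normalize case and whitespace, then fingerprint the block.
        normalized = " ".join(block.lower().split())
        key = hashlib.sha1(normalized.encode("utf-8")).hexdigest()
        groups.setdefault(key, []).append((first_seen, url))
    return sorted(min(group)[1] for group in groups.values())


quote = "Acme announced the launch of its new widget today."

# Hypothetical data: one foreign source page, four similar pages on my site.
pages = [
    ("source.example/press", "source.example", 1, quote),
    ("mysite.example/a", "mysite.example", 2, quote),
    ("mysite.example/b", "mysite.example", 3, quote),
    ("mysite.example/c", "mysite.example", 4, quote),
    ("mysite.example/d", "mysite.example", 5, quote),
]

# Filter applied per site: one of my four pages survives.
site_scope = surviving_pages(pages, {"mysite.example"})

# Same filter applied across sites: only the foreign source survives.
web_scope = surviving_pages(pages, {"mysite.example", "source.example"})
```

The filter itself is the same in both runs. Only the set of pages it compares has grown, and that alone is enough to make the last surviving page on my site vanish.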

Read more here: Thoughts on new Duplicate Content Issues with Google.

Tags: Search Engine Optimization (SEO) Google