Sebastian's Pamphlets: Code Monkey Very Simple Man/Woman

Rcjordan over at Threadwatch pointed me to a nice song perfectly explaining rumors like "Google's verification tags get you into supplemental hell" and thoughtless SEO theories like "self-closing meta tags in HTML 4.x documents and uppercase element/attribute names in XHTML documents prevent search engine crawlers from indexing". You don't believe such crappy "advice" can make it to the headlines? Just wait for an appropriate thread at your preferred SEO forum picked by a popular but technically challenged blogger. This wacky hogwash is the most popular lame excuse for MSSA issues (aka "Google is broke coz my site sitting at top10 positions since the stone age disappeared all of a sudden") at Google's very own Webmaster Central.

Here is a quote:

"The robot [search engine crawler] HAS to read syntactically ... And I opt for this explanation exactly because it makes sense to me [the code monkey] that robots have to be dilligent in crawling syntactically in order to do a good job of indexing ... The old robots [Googlebot 2.x] did not actually parse syntactically - they sucked in all characters and sifted them into keywords - text but also tags and JS content if the syntax was broken, they didn't discrimnate. Most websites were originally indexed that way. The new robots [Mozilla compatible Googlebot] parse with syntax in mind. If it's badly broken (and improper closing of a tag in the head section of a non-xhtml dtd is badly broken), they stop or skip over everything else until they find their bearings again. With a broken head that happens the </html> tag or thereabouts".

Basically this means that the crawler ignores the remaining code in HEAD or even hops to the end of the document not reading the page's contents.

In reality search engine crawlers are pretty robust and fault tolerant, designed to eat and digest the worst code one can provide. Syntax errors like these break strict validation, and they lead to all kinds of trouble when you try to parse such erroneous documents with a standard XML parser. Search engine crawlers OTOH are designed to handle the worst stuff available on the net and can extract meta data and textual content from malformed code that even IE cannot render.
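To make the difference between a strict and a forgiving parser concrete, here is a minimal sketch in Python. It says nothing about how Googlebot actually works. It only uses the lenient `html.parser` module from Python's standard library as a stand-in for a fault tolerant parser, and `xml.etree.ElementTree` as the strict XML parser. The sample page, the `TolerantExtractor` class and the helper names are made up for illustration.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical sample page: self-closing meta tags under an HTML 4.01 doctype,
# uppercase element/attribute names, and unclosed <P> and <B> elements.
SAMPLE = """<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<HTML><HEAD>
<META NAME="description" CONTENT="Blue widgets" />
<meta name="verify-v1" content="abc123" />
<TITLE>Widgets</TITLE>
</HEAD>
<BODY><P>Cheap <B>blue widgets<P>Order now
</BODY></HTML>"""


class TolerantExtractor(HTMLParser):
    """Collects meta name/content pairs and text without choking on bad markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.meta = {}
        self.text = []

    def handle_starttag(self, tag, attrs):
        # html.parser hands us tag and attribute names already lowercased.
        if tag == "meta":
            attributes = dict(attrs)
            if "name" in attributes:
                self.meta[attributes["name"].lower()] = attributes.get("content", "")

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax (<meta ... />) arrives here; treat it as a start tag.
        self.handle_starttag(tag, attrs)

    def handle_data(self, data):
        # Keep every non-empty text chunk, including the title text.
        if data.strip():
            self.text.append(data.strip())


def extract(html):
    """Return (meta dict, list of text chunks) from possibly malformed HTML."""
    parser = TolerantExtractor()
    parser.feed(html)
    parser.close()
    return parser.meta, parser.text


def strict_xml_ok(markup):
    """Return True only if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False
```

Calling `extract(SAMPLE)` returns both meta tags and the body text despite the sloppy markup, while `strict_xml_ok(SAMPLE)` returns `False` because the XML parser gives up on the document. A crawler built like the strict parser would index next to nothing on the real Web.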

I'm all for clean code but IMHO explaining each and every site-disappeared case with syntax errors is misleading and distracts webmasters and site owners asking for advice.

Let's face it, for 98% of all questions in the sense of "Why the heck did my site disappear all of a sudden after holding top positions for years" the causes are one or more of the following:

Laziness (lack of promotion, relying on reciprocal linkage etc.)
Ignorance (outdated business models like sheer affiliate sites)
Cheating ("undetectable" SEO shortcuts which Google learned to discover recently)
Oversaturation and quality issues (most of them covered under the MSSA umbrella)

Another 1%:

Simple outranking (competitors hired smart experts)
Loss of demand (site theme no longer hot or better products available)

And the remaining 1%:

Technical issues (IIS, maintenance windows ...)
Unintentional use of architectures / methods considered questionable
Magic (insulted Googler strikes back)

(Percentages made up to bring the point home)

In many cases a downranked site can recover when the site owner follows good advice, and I mean marketing advice, not SEO tips primarily. Repeatedly posted presumptions, opinions and assertions get quoted as facts after a while, and new myths spread at lightning speed. We don't need another revisit-after meta tag flooding the net (myth), endless discussions of over-optimization penalties (rumor, more exactly a lame excuse when nabbed for keyword stuffing) ... or non-existent age delays (very popular bullshit aka "Google's Sandbox").

Just hire code monkeys for code monkey tasks, and SEOs for everything else ;)