Sebastian's Pamphlets

If you've read my articles somewhere on the Internet, expect something different here.


Wednesday, May 02, 2007

Yahoo! Search is going to torture Webmasters

According to Danny, Yahoo! supports a multi-class nonsense called the robots-nocontent tag. CRAP ALERT!

Can you senseless and cruel folks at Yahoo! Search imagine how many of my clients who'd like to use that feature have copied and pasted their pages? Do you have a clue how many sites out there don't make use of SSI, PHP or ASP includes, how many sites have never heard of dynamic content delivery, or how many sites can't use proper content delivery techniques because they have to deal with legacy systems and ancient business processes? Did you ask how common templated Web design is, and I mean the weird static variant, where a new page gets built from a randomly selected source page saved as new-page.html?

It's great that you came out with a bastardized copy of Google's somewhat hapless (in the sense of cluttering structured code) section targeting, because we dreadfully need that functionality across all engines. And I admit that your approach is a little better than AdSense section targeting, because you don't mark payload and paydirt with comments. But why the heck did you design it so crappily? The unthoughtful draft of a microformat from which you've "stolen" that unfortunate idea didn't become a standard for very good reasons. Because it's crap. Assigning multiple class names to markup elements for the sole purpose of setting crawler directives is as crappy as inline style assignments.
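For reference, here's roughly what the two approaches look like in markup. The comments in the first snippet are Google's documented AdSense section targeting syntax; the second snippet is a sketch of Yahoo!'s class-based approach, where every class name other than robots-nocontent is made up for illustration:

<!-- Google's AdSense section targeting: the payload is marked with HTML comments -->
<!-- google_ad_section_start -->
<p>The unique content you actually want weighted ...</p>
<!-- google_ad_section_end -->

<!-- Yahoo!'s approach: a crawler directive crammed into the class attribute -->
<div class="sidebar robots-nocontent">
Site-wide navigation, ads and other boilerplate Yahoo! should ignore ...
</div>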

Well, due to my zero-bullshit tolerance I'm somewhat upset, so I repeat: Yahoo!'s robots-nocontent class name is crap by design. Don't use it, boycott it, because if you make use of it you'll be changing gazillions of files for each and every proprietary syntax a single search engine comes up with in the future. If the united search geeks can agree on flawed standards like rel-nofollow, they should be able to talk about a sensible evolution of robots.txt.

There's a way easier solution which doesn't require editing tons of source files: standardizing a CSS-like syntax to assign crawler directives to existing classes and DOM IDs. For example, extend the robots.txt syntax like this:

A.advertising { rel: nofollow; } /* devalue aff links */

DIV.hMenu, TD#bNav { content: noindex; rel: nofollow; } /* make site-wide links unsearchable */


Unsupported robots.txt syntax does no harm; proprietary attempts do!
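To illustrate that point: a robots.txt carrying such a (purely hypothetical, currently unsupported) CSS-like section would simply be skipped by today's parsers, which ignore records they don't understand, while the standard directives keep working. The Disallow path below is just a placeholder:

User-agent: *
Disallow: /cgi-bin/

# hypothetical CSS-like crawler directives -- ignored as unknown syntax today
A.advertising { rel: nofollow; }
DIV.hMenu, TD#bNav { content: noindex; rel: nofollow; }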

Dear search engines, get together and define something useful, before each of you comes out with different half-baked workarounds like section targeting or robots-nocontent class values. Thanks!


5 Comments:

  • At Thursday, May 03, 2007, Blogger Scott G said…

    Yeah, I don't support this at all and definitely will not recommend it to clients or developers.

     
  • At Sunday, July 29, 2007, Anonymous Anonymous said…

    This is definitely useful for SEO.

    If someone has to update every page of their site for such a change, then they're building their sites inefficiently and, dare I say, wrong.

    After all, it's the developer's decision to use this in the first place and to weigh whether the change is worth it.

    The implementation might not be the greatest, but it's good that Yahoo! is thinking about this kind of stuff and actually has enough balls to throw something out there to get feedback on.

    Adding CSS selectors to the robots.txt file is a ridiculous suggestion IMO. How about just Noindex-class:advertising,nav or Noindex-id:main-nav?

    Full disclosure: I work for Yahoo! as a Web Developer but not on search.

     
  • At Monday, July 30, 2007, Blogger Sebastian said…

    Anonymous, I agree that the functionality is useful and dreadfully needed, but the implementation is crappy, and in your comment you tell us why.

    A search engine should not ignore the fact that a huge number of Web sites *are* built inefficiently. A caring search engine should be smart enough to come out with something well thought out that can get applied to legacy code without hassles.

    Your robots.txt syntax suggestion shows that Yahoo! has the potential, so why was it designed so thoughtlessly in the first place?

    Because you guys think that everybody not following state-of-the-art principles is wrong and not worth a second thought?

    If you want to develop great stuff for the masses, you should study the great unwashed to discover that many of them are non-geeks still making use of outdated processes and technologies. That's wrong from a geek's perspective, but reality.

    If Yahoo! had done it a bit smarter, I would have praised a good initiative. Unfortunately I can't do that.

     
  • At Tuesday, July 31, 2007, Anonymous Anonymous said…

    The more I thought about it, the more I came to agree with your point, and I applaud you for speaking your mind.

    It should be better.

    Hopefully this is just a step in the right direction and we're not stuck with a half-baked attempt at something that could be much better (and easier to implement).

     
  • At Tuesday, July 31, 2007, Blogger Sebastian said…

    Thanks :) I can't wait to blog a better approach!

     
