Google Blog Search, Why The Spam?

Jonathan Bailey posted an insightful article the other day on his blog, Plagiarism Today, about why blog search services have failed. In the post he raises an interesting point about Google. While Google has done an OK job of handling splogs in its regular search engine, its blog search is pretty bad at filtering them (assuming blog search even finds the splogs; it seems to have a hard time finding blogs at all).

At first I thought part of the reason might be a lack of spam reporting for the blog search, but after a brief discussion Jonathan found out that Google’s normal spam report form covers both the blog search and the regular search engine (supposedly).

So the question still remains: why does it give splogs good rankings?

I don’t know the full answer, but I do think a good chunk of the reason has to do with the blog search algorithm. Unlike normal search, the blog search algorithm relies heavily on when something was posted rather than on how important the posted content is. The number of links pointing to an individual post, the PageRank of the blog, and similar factors that matter in the normal search arena have very little to do with where content appears in the blog search. New content goes on top, old content goes on the bottom.

While factors like the number of links can have a slight effect on a post’s ranking, they won’t make a post so “important” that it holds the #1 position for a year (as can happen with regular search). So unless the algorithm considers a blog to be a spam blog, a good chunk of its posts will briefly be given decent ranks.
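To make the difference concrete, here is a rough sketch of the two ranking styles described above. This is not Google’s actual algorithm, which I don’t know; the `Post` fields, the scoring functions, and the weights (`link_weight`, `half_life_hours`) are all made up for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    age_hours: float     # time since the post was published
    inbound_links: int   # links pointing at this post


def recency_score(post, link_weight=0.01):
    # Hypothetical "blog search" style score: freshness dominates,
    # and links only nudge the result slightly.
    return -post.age_hours + link_weight * post.inbound_links


def authority_score(post, half_life_hours=24 * 30):
    # Hypothetical "regular search" style score: links dominate,
    # with a slow decay so a well-linked post can stay on top for a long time.
    decay = 0.5 ** (post.age_hours / half_life_hours)
    return math.log1p(post.inbound_links) * decay


def rank(posts, score):
    # Highest score first.
    return [p.title for p in sorted(posts, key=score, reverse=True)]


posts = [
    Post("well-linked tutorial", age_hours=72, inbound_links=200),
    Post("fresh splog entry", age_hours=1, inbound_links=0),
    Post("ordinary post", age_hours=10, inbound_links=3),
]

print(rank(posts, recency_score))    # splog first: it is simply the newest
print(rank(posts, authority_score))  # splog last: nobody links to it
```

With the recency-style score, the hour-old splog entry lands on top even though nothing links to it, which is the behavior described above. With the authority-style score it drops to the bottom.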

In addition, I get the feeling Google Blog Search is more of a hobby for Google than a serious product. When it first came out it caused a bit of buzz, but since then it hasn’t really changed at all. Call me crazy, but I wouldn’t be too surprised to see it get dropped at some point in the next couple of years. If it doesn’t get dropped, it’ll at least end up under the “even more” section of Google along with their other miserable failures (errr… “less accepted” products).


10 Comments

  1. Joshua Clanton - Design for the WEB Says:

    I’ve been disappointed with Blog Search as well. It just doesn’t provide good authoritative results. And if I’m searching rather than browsing, authority is more important than recentness.

  2. Jeremy Steele Says:

    I agree. Someone really needs to make a service like Technorati or Google Blog Search that goes one extra step and makes it a real search engine, one that ranks based on the content itself and how authoritative the blog is instead of how recently an article was posted.

  3. vishnu Says:

    I agree.

  4. Peter T Says:

    I agree with your notes. I think Technorati and Blog Catalog are more useful for finding blogs on a specific topic.

  5. Jeremy Steele Says:

    Blog Catalog especially. Every time I’ve seen a spam site on there, it’s been removed within 1 or 2 hours of my reporting it. Plus they are constantly improving and adding features, something Technorati and Google Blog Search have stopped doing.

  6. Jonathan Bailey Says:

    The problem I have with BlogCatalog as a solution is two-fold.

    First, only sites that have registered are in the index. For example, mine is not.

    Second, it is only good for finding sites, not specific entries. When most people do a “blog search” they really want to skip straight to a post with the information they need. Most targeted blog searches will not have a whole blog dedicated to them.

    It’s got its place, but I don’t think it is a fair replacement for Blog Search.

  7. Jeremy Steele Says:

    Your first point is valid.

    But I don’t really get your second point. BC does have a post search function: http://www.blogcatalog.com/posts/

  8. Jonathan Bailey Says:

    Jeremy: Ok, I didn’t see that. You can tell I don’t use BC :)

    However, it is still hamstrung by the first problem. I would hate to think that we would be setting up blog search engines like “clubs”. That would make it hard to have one central place to search for information.

    It’s an inelegant solution, but so is Google BlogSearch. Much pondering is ahead…

  9. Jeremy Steele Says:

    Screw keeping data in one central area, I could get rich off of that blog search engine club idea…

    Bet I could convince some fools to pay $500/year for inclusion in a wonderful “blog search engine”? Just gotta tell them it is better than all those lame free ones. Could even pay off some popular bloggers to claim it is the “hottest blog club” in the world. Then, after it gets half a million subscribers I could sell it for about $500 mil and live happily ever after. :twisted:

    Now… to find a venture capitalist…

  10. Jonathan Bailey Says:

    Jeremy: Fine, but only if you take me on as a consultant…
