Jonathan Bailey posted an insightful article the other day on his blog, Plagiarism Today, about why blog search services have failed. In the post he raises an interesting point about Google. While Google has done an OK job of handling splogs (spam blogs) in its regular search engine, its blog search is pretty bad at filtering them. That assumes blog search even finds the splogs, since it seems to have a hard time finding blogs at all.
At first I thought part of the reason might be a lack of spam reporting for the blog search, but after a brief discussion Jonathan found out that Google’s normal spam report form covers both blog search and regular search (supposedly).
So the question remains: why does blog search give splogs good rankings?
I don’t know the full answer, but I do think a good chunk of the reason has to do with the blog search algorithm. Unlike normal search, the blog search algorithm relies heavily on when something was posted instead of how important the posted content is. The number of links going to an individual post, the PageRank of the blog, and similar factors that matter in normal search have very little to do with where content appears in the blog search. New content goes on top, old content goes on the bottom.
While factors like the number of links can have a slight effect on a post’s ranking, they won’t make a post so “important” that it holds the #1 position for a year, as can happen in regular search. So unless the algorithm flags a blog as spam, a good chunk of its posts will briefly be given decent ranks.
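To make the difference concrete, here is a rough sketch of the two ranking styles. Google has not published how Blog Search scores posts, so everything here is my own assumption for illustration: the `Post` fields, the `recency_score` and `authority_score` formulas, and the weights are hypothetical, not Google’s actual algorithm.

```python
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    age_hours: float     # time since the post was published
    inbound_links: int   # links pointing at this individual post


def recency_score(post: Post, link_weight: float = 0.01) -> float:
    """Hypothetical blog-search style score: freshness dominates.

    Links are capped and given a tiny weight, so they can only nudge
    the ranking, never keep an old post on top.
    """
    freshness = 1.0 / (1.0 + post.age_hours)
    link_bonus = link_weight * min(post.inbound_links, 10) / 10
    return freshness + link_bonus


def authority_score(post: Post, freshness_weight: float = 0.01) -> float:
    """Hypothetical regular-search style score: links dominate.

    Freshness only acts as a small tie-breaker.
    """
    freshness = 1.0 / (1.0 + post.age_hours)
    return post.inbound_links + freshness_weight * freshness


def rank(posts, score):
    """Return posts sorted from highest to lowest score."""
    return sorted(posts, key=score, reverse=True)


posts = [
    Post("In-depth guide", age_hours=720, inbound_links=150),
    Post("Scraped splog post", age_hours=1, inbound_links=0),
]

for label, score in (("recency", recency_score), ("authority", authority_score)):
    print(label, [p.title for p in rank(posts, score)])
```

With these made-up numbers, the hour-old splog post outranks the month-old guide under the recency score, while the authority score puts the guide first. That is the behavior I suspect is at work: any splog the spam filter misses gets a brief turn near the top simply for being new.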
In addition, I get the feeling Google Blog Search is more of a hobby for Google than a serious product. When it first came out it caused a bit of buzz, but since then it hasn’t really changed at all. Call me crazy, but I wouldn’t be too surprised to see it get dropped at some point in the next couple of years. If it doesn’t get dropped, it’ll at least end up under the “even more” section of Google along with their other miserable failures (errr… “less accepted” products).

March 12th, 2008 at 10:44 am
I’ve been disappointed with Blog Search as well. It just doesn’t provide good authoritative results. And if I’m searching rather than browsing, authority is more important than recentness.
March 12th, 2008 at 11:21 am
I agree. Someone really needs to make a service like Technorati or Google Blog Search that goes one extra step and becomes a real search engine, one that ranks based on the content itself and how authoritative the blog is instead of how recently an article was posted.
March 13th, 2008 at 1:09 pm
I agree.
March 16th, 2008 at 4:23 am
I agree with your notes. I think Technorati and Blog Catalog are more useful for finding blogs on a specific topic.
March 16th, 2008 at 10:23 am
Blog Catalog especially. Every time I’ve seen a spam site on there, it’s been removed within 1 or 2 hours of my reporting it. Plus they are constantly improving and adding features, something Technorati and Google Blog Search have stopped doing.
March 16th, 2008 at 10:34 am
The problem I have with BlogCatalog as a solution is two-fold.
First, only sites that have registered are in the index. For example, mine is not.
Second, it is only good for finding sites, not specific entries. When most people do a “blog search” they really want to skip straight to a post with the information they need. Most targeted blog searches will not have a whole blog dedicated to them.
It’s got its place, but I don’t think it is a fair replacement for Blog search.
March 16th, 2008 at 4:06 pm
Your first point is valid.
But I don’t really get your second point. BC does have a post search function: http://www.blogcatalog.com/posts/
March 16th, 2008 at 9:27 pm
Jeremy: OK, I didn’t see that. You can tell I don’t use BC.
However, it is still hamstrung by the first problem. I would hate to think that we would be setting up blog search engines like “clubs”. That would make it hard to have one central place to search for information.
It’s an inelegant solution, but so is Google BlogSearch. Much pondering is ahead…
March 16th, 2008 at 10:21 pm
Screw keeping data in one central area, I could get rich off of that blog search engine club idea…
Bet I could convince some fools to pay $500/year for inclusion in a wonderful “blog search engine”. Just gotta tell them it is better than all those lame free ones. Could even pay off some popular bloggers to claim it is the “hottest blog club” in the world. Then, after it gets half a million subscribers, I could sell it for about $500 mil and live happily ever after.
Now… to find a venture capitalist…
March 17th, 2008 at 8:47 am
Jeremy: Fine, but only if you take me on as a consultant…