Blogs versus the NY Times in Google
In 2002, Dave Winer of Scripting News and Martin Nisenholtz of the New York Times made a Long Bet about the authority of weblogs versus that of the NY Times in Google:
In a Google search of five keywords or phrases representing the top five news stories of 2007, weblogs will rank higher than the New York Times’ Web site.
I decided to see how well each side is doing by checking the results for the top news stories of 2005. Eight news stories were selected and an appropriate Google keyword search was chosen for each one of them. I went through the search results for each keyword and noted the positions of the top results from 1) “traditional” media, 2) citizen media, 3) blogs, and 4) nytimes.com. Finally, the scores were tallied and an “actual” winner (blogs vs. nytimes.com) and an “in-spirit” winner (any traditional media source vs. any citizen media source) were calculated. (For more on the methodology, definitions, and caveats, read the methodology section below.)
So how did the NY Times fare against blogs? Not very well. For eight top news stories of 2005, blogs were listed in Google search results before the Times six times, and the Times came first only twice. The in-spirit winner was traditional media by a 6-2 score over citizen media. Here are the specific results:
1) Hurricane Katrina hits New Orleans.
Search term: “hurricane katrina”
3. Top citizen media result (Wikipedia)
13. Top media result (CNN)
56. Top NY Times mention (NY Times)
61. Top blog result (Kaye’s Hurricane Blog)
Winner (in spirit): Citizen media
Winner (actual): NY Times
2) Big changes in the US Supreme Court (Rehnquist dies, O’Connor retires, Roberts appointed Chief Justice, Harriet Miers rejected).
Search term: “harriet miers”
4. Top media result (Washington Post)
5. Top citizen media result (Wikipedia)
8. Top NY Times mention (NY Times)
11. Top blog result (TalkLeft)
Winner (in spirit): Media
Winner (actual): NY Times
3) Terrorists bomb London, killing 52.
Search term: “london bombing”
1. Top media result (CNN)
2. Top citizen media result (Wikipedia)
21. Top blog result (Schneier on Security)
No NY Times article appears in the first 100 results.
Winner (in spirit): Media
Winner (actual): Blogs
4) First elections in Iraq after Saddam.
Search term: “iraq election”
1. Top media result (BBC News)
6. Top blog result (Iraq elections newswire)
6. Top citizen media result (Iraq elections newswire)
14. Top NY Times mention (NY Times)
Winner (in spirit): Media
Winner (actual): Blogs
5) Terri Schiavo legal fight and death.
Search term: “terri schiavo”
2. Top blog result (Abstract Appeal)
2. Top citizen media result (Abstract Appeal)
4. Top media result (CNN)
65. Top NY Times mention (NY Times)
Winner (in spirit): Citizen media
Winner (actual): Blogs
6) Pope John Paul II dies and Cardinal Joseph Ratzinger appointed Pope Benedict XVI.
Search term: “pope john paul ii death”
1. Top media result (CNN)
3. Top citizen media result (Wikipedia)
58. Top blog result (The Pope Blog: Pope Benedict XVI)
No NY Times article appears in the first 100 results.
Winner (in spirit): Media
Winner (actual): Blogs
7) The Israeli withdrawal from the Gaza Strip.
Search term: “gaza withdrawal”
1. Top media result (Worldpress.org)
31. Top blog result (Simply Appalling)
31. Top citizen media result (Simply Appalling)
No NY Times article appears in the first 100 results.
Winner (in spirit): Media
Winner (actual): Blogs
8) The investigation into the Valerie Plame affair, Judith Miller, Scooter Libby indicted, etc.
Search term: “scooter libby indicted”
1. Top media result (CNN)
15. Top blog result (Seven Generational Ruminations)
15. Top citizen media result (Seven Generational Ruminations)
43. Top NY Times mention (NY Times)
Winner (in spirit): Media
Winner (actual): Blogs
And just for fun here’s a search for “judith miller jail” (not included in the final tally):
1. Top media result (Washington Post)
3. Top blog result (Gawker)
3. Top citizen media result (Gawker)
No NY Times article appears in the first 100 results (even though there are several matching articles on the Times site).
In covering the jailing of their own reporter, the Times lagged in the Google results behind such informational juggernauts as Drinking Liberally, GOP Vixen, and Feral Scholar.
Winner (in spirit): Media
Winner (actual): Blogs
Here are the overall results, excluding the Judith Miller search:
Overall winner (in spirit): Media (beating citizen media 6-2).
Overall winner (actual): Blogs (beating the NY Times 6-2).
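For anyone who wants to check the arithmetic, here is a minimal sketch in Python that recomputes both tallies from the ranks listed above. The dictionary keys ("media", "citizen", "blog", "nyt") and the function names are labels made up for this sketch, and treating "not in the first 100 results" as an automatic loss is an assumption that simply matches how the winners were called above.

```python
from collections import Counter

# Ranks transcribed from the eight result lists above.
# None means no result appeared in the first 100.
RESULTS = {
    "hurricane katrina":       {"media": 13, "citizen": 3,  "blog": 61, "nyt": 56},
    "harriet miers":           {"media": 4,  "citizen": 5,  "blog": 11, "nyt": 8},
    "london bombing":          {"media": 1,  "citizen": 2,  "blog": 21, "nyt": None},
    "iraq election":           {"media": 1,  "citizen": 6,  "blog": 6,  "nyt": 14},
    "terri schiavo":           {"media": 4,  "citizen": 2,  "blog": 2,  "nyt": 65},
    "pope john paul ii death": {"media": 1,  "citizen": 3,  "blog": 58, "nyt": None},
    "gaza withdrawal":         {"media": 1,  "citizen": 31, "blog": 31, "nyt": None},
    "scooter libby indicted":  {"media": 1,  "citizen": 15, "blog": 15, "nyt": 43},
}


def beats(rank_a, rank_b):
    """Return True if rank_a is strictly better (lower) than rank_b.

    Assumption for this sketch: a missing rank (None) always loses.
    """
    if rank_a is None:
        return False
    if rank_b is None:
        return True
    return rank_a < rank_b


def tally(results, side_a, side_b):
    """Count how many searches each side wins; ties are not counted."""
    score = Counter()
    for ranks in results.values():
        if beats(ranks[side_a], ranks[side_b]):
            score[side_a] += 1
        elif beats(ranks[side_b], ranks[side_a]):
            score[side_b] += 1
    return score


in_spirit = tally(RESULTS, "media", "citizen")
actual = tally(RESULTS, "blog", "nyt")
print("In spirit:", dict(in_spirit))
print("Actual:", dict(actual))
```

The Judith Miller search is left out of the dictionary, just as it is left out of the overall results.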
Some observations:
- My feeling is that Mr. Nisenholtz will likely lose his bet come 2007. Even though nytimes.com fares very well in getting linked to by the blogosphere, it does very poorly in Google. This isn’t exactly surprising given that most NY Times articles disappear behind a paywall after a week and some of their content (TimesSelect) isn’t publicly accessible at all. Also, I didn’t look too closely at the HTML markup of the NY Times, but it could also be that it’s not as well optimized for Google as that of some weblogs and other media outlets.
- “www.nytimes.com” has a PageRank of 10/10, higher than that of “www.cnn.com” (9/10), yet stories from CNN consistently appeared higher in the search results than those from the Times. The Times clearly has overall authority according to Google, but when it comes to specific instances, it falls short. In some cases, a NY Times story didn’t even appear in the first 100 search results for these keyword searches.
- By 2007, it may be difficult to differentiate a blog from a traditional media source. All of the Gawker and Weblogs, Inc. sites are presented in a blog format and are referred to as blogs, but otherwise how are they distinguishable from traditional media? Engadget paid to send 12 people to cover the CES technology conference, probably as many as or more than the Times sent. The Sundance film festival was heavily covered by paid writers for both companies as well. In the spirit in which this bet was made, I’d have a hard time counting any of their sites as blogs. (And what about kottke.org? I get paid to write it. Am I still a member of the citizen media or have I crossed over?)
- Choosing appropriate news stories and keywords for those stories was difficult in some cases. Katrina was a no-brainer, but was the Terri Schiavo story really one of the top eight news stories of 2005? Resolving the methodology for this bet in 2007 will be tricky. I wonder how the Long Bets Foundation will handle its determination of the victory.
- Wikipedia does very well in Google results for topical search terms. Overall, traditional media still dominates (in first appearance as well as number of results), but blogs and Wikipedia do very well in some instances.
- What do these results mean? Probably not a whole lot. Nisenholtz asserts that “[news] organizations like the Times can provide that far more consistently than private parties can” while Winer says that “in five years, the publishing world will have changed so thoroughly that informed people will look to amateurs they trust for the information they want”. It’s difficult to draw any conclusions on this matter based on these results. Contrary to what most people believe, PageRank has a bias, a point of view. That POV is based largely (but not entirely) on what people are linking to. As someone said in the discussion of this bet, this bet is about Google more than influence or reputation, so these results probably tell us more about how Google determines influence on a keyword basis than about how readers of online informational sources value or rate those sources. Do web users prefer the news coverage of blogs to that of the NY Times? I don’t think you can even come close to answering that question based on these results.
Methodology
The eight news stories were culled from various sources (Lexis-Nexis, Wikipedia, NY Times) and narrowed down to the top stories that would have been prominently covered in both the NY Times and blogs.
The keyword phrase for each of the eight stories was selected by trial and error, looking for the shortest possible phrase that yielded targeted search results about the subject in question. In some cases, the keyword phrase chosen only returned results for a part of a larger news story. For instance, the phrase “pope john paul” was not specific enough to get targeted results, so “pope john paul ii death” was used, but that didn’t give results about the larger story of his death, the conclave to select a new pope, and the selection of Cardinal Joseph Ratzinger as Pope Benedict XVI. In the case of “katrina”, that single keyword was enough to produce hundreds of targeted search results for both Hurricane Katrina and its aftermath. Keyword phrases were not tinkered with to promote or demote particular types of search results (i.e. those for blogs or nytimes.com); they were only adjusted for the relevance of overall results.
The searches were all done on January 27, 2006 with Google’s main search engine, not their news specific search.
Since the spirit of the bet deals with the influence of traditional media versus that of citizen-produced media, I tracked the top traditional media (labeled just “media” above) results and the top citizen media results in addition to blog and nytimes.com results. For the purposes of this exercise, relevant results were those that linked to pages that an interested reader would use as a source of information about a news story. For citizen media, this meant pages on Wikipedia, Flickr (in some cases), weblogs, message boards, wikis, etc. were fair game. For traditional media, this meant articles, special news packages, photo essays, videos, etc.
The distinctions between “media” and citizen media, and between relevant and non-relevant results, mattered in only one instance. Harriet Miers’s Blog!!!, a fictional satire written as if the author were Harriet Miers, was the third result for this keyword phrase, but since the blog was not an informational resource, I excluded it. In all other cases, it was pretty clear-cut.
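The bookkeeping described above (walk down the results, skip anything that isn't an informational resource, and note the first position at which each category shows up) can be sketched in a few lines of Python. This is only an illustration of the procedure, not the way the tally was actually done: the function name, the category labels, and the sample list are all invented for the sketch, and the categories are assumed to have been assigned by hand. Letting one result carry several labels reflects the lists above, where a blog also counts as citizen media; treating a nytimes.com page as also counting toward "media" is an assumption.

```python
def top_ranks(ranked_results):
    """Return the first 1-based position at which each category appears.

    ranked_results is a list of (categories, relevant) pairs in search
    result order. categories is a set of hand-assigned labels, and
    relevant is False for results excluded as non-informational.
    """
    tops = {}
    for rank, (categories, relevant) in enumerate(ranked_results, start=1):
        if not relevant:
            continue  # excluded results still occupy a position
        for category in categories:
            tops.setdefault(category, rank)  # keep only the first hit
    return tops


# Made-up example, NOT real search data: the third result is a satire
# blog that gets excluded, so the top blog result is the fifth one.
example = [
    ({"media"}, True),
    ({"citizen"}, True),
    ({"citizen", "blog"}, False),
    ({"media", "nyt"}, True),
    ({"citizen", "blog"}, True),
]

print(top_ranks(example))
```

A category that never appears is simply absent from the returned dictionary, which corresponds to the "no article appears in the first 100 results" lines above.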