Regarding the Twitter vs. Blogger thing from earlier in the week, I took another stab at the faulty Twitter data. Using some educated guesses and fitting some curves, I’m 80-90% sure that this is what the Twitter message growth looks like:
These graphs cover the following time periods: 8/23/1999 - 3/7/2002 for Blogger and 3/21/2006 - 5/7/2007 for Twitter. It’s important to note that the Twitter trend is not comprised of actual data points but is rather a best-guess line, an estimate based on the data. Take it as fact at your own risk. (More specifically, I’m more sure of the general shape of the curve than of its steepness. My gut tells me that the curve is probably a little flatter than depicted rather than steeper.)
That said, most of what I wrote in the original post still holds, as do the comments in the subsequent thread. Twitter did not grow as fast as the faulty data indicated, but it did get to ~6,000,000 messages in about half the time of Blogger. Here are the reasons I offered for the difference in growth:
1. Twitter is easier to use than Blogger was and had a lower barrier to entry.
2. Twitter has more ways to update (web, phone, IM, Twitterific) than did Blogger.
3. Blogger’s growth was limited by a lack of funding.
4. Twitter had a larger pool of potential users to draw on.
5. Twitter has a built-in social aspect that Blogger did not.
And commenters in the thread noted that:
6. Twitter’s 140-character limit encourages more messages.
7. More people are using Twitter for conversations than was the case with Blogger.
What’s interesting is that these seeming advantages (in terms of message growth potential) for Twitter didn’t result in higher message growth than Blogger over the first 9-10 months. But then the social and network effects (#5 and #7 above) kicked in and Twitter took off.
This morning I posted a comparison of the growth in messages with both Blogger and Twitter. The Twitter data was based on information collected by Andy Baio in a post that was widely read in the blogosphere. In the course of looking at the Twitter data, neither of us noticed that from Nov 21, 2006 to Feb 4, 2007 and March 9, 2007 to the present, the Twitter post IDs had the same last digit, indicating that the data is not strictly sequential. If you look at Twitter’s public timeline, the Twitter post IDs skip around by multiples of 10.
Anil suggested via email that this could be an artifact of database sharding and lo and behold, if you take off the last digit of the post ID, they seem to become sequential again, more or less. He’s going to ask the Twitter gang about it.
For right now though, the parts of this morning’s post that rely on Twitter data from the above dates are incorrect. Basically, all of it. Here it is in all caps: WRONG WRONG WRONG ERROR ERROR, F——-, WOULD NOT BUY DATA ANALYSIS FROM AGAIN. In hindsight, it seems obvious that the data was incorrect…that sort of growth seems impossible, especially when Twitter was having all sorts of scaling problems. Anyway, good thing this is just a blog and not a refereed journal, eh? Big thanks to the commenters in the other post for pointing me toward the error. More as I have it.
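The last-digit check is simple enough to sketch in code. This is only an illustration, assuming a made-up list of IDs (not real Twitter data) and hypothetical helper names; it tests whether a run of IDs shares a final digit and, if so, drops that digit to see whether the remainder looks sequential.

```python
def shares_last_digit(ids):
    """Return True if every ID in the list ends in the same digit."""
    return len({i % 10 for i in ids}) == 1


def strip_last_digit(ids):
    """Drop the final digit of each ID (integer division by 10)."""
    return [i // 10 for i in ids]


# Made-up IDs for illustration only; these are not real Twitter post IDs.
sample_ids = [58123402, 58123412, 58123422, 58123442]

if shares_last_digit(sample_ids):
    normalized = strip_last_digit(sample_ids)
else:
    normalized = list(sample_ids)

# Gaps between consecutive IDs before and after normalization.
raw_gaps = [b - a for a, b in zip(sample_ids, sample_ids[1:])]
normalized_gaps = [b - a for a, b in zip(normalized, normalized[1:])]
print(raw_gaps)         # [10, 10, 20]
print(normalized_gaps)  # [1, 1, 2]
```

Gaps that are multiples of 10 in the raw IDs shrink to small steps once the shared digit is gone, which is the "sequential again, more or less" pattern described above.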
Update: Email from Biz Stone, who works for Twitter. He says:
There’s truth in the essence of what you’re talking about here — Twitter updates *are* coming in faster and furiouser than Blogger updates. However, the way we number Twitter updates has switched back and forth a few times which pretty much screws up the exactness of your analysis.
We have been doubling the number of active users about every three weeks for a sustained period of months now which is definitely contributing significantly to more and more updates. Also, active users of Twitter are measured by how many times they update per day (at Blogger it was per month). So activity in general at Twitter is crazy by comparison.
We’re going to start digging in to more data visualization, user patterns, etc in the coming weeks so if there’s anything you think we should be looking at specifically please let us know!
So we’ll have to wait a few weeks for an accurate look at this stuff. (thx, biz)
Important update: I’ve re-evaluated the Twitter data and came up with what I think is a much more accurate representation of what’s going on.
Further update: The Twitter data is bad, bad, bad, rendering Andy’s post and most of this here post useless. Both jumps in Twitter activity in Nov 2006 and March 2007 are artificial in nature. See here for an update.
Update: A commenter noted that sometime in mid-March, Twitter stopped using sequential IDs. So that big upswing that the below graphs currently show is partially artificial. I’m attempting to correct now. This is the danger of doing this type of analysis with “data” instead of data.
In mid-March, Andy Baio noted that Twitter uses publicly available sequential message IDs and employed Twitter co-founder Evan Williams’ messages to graph the growth of the service over the first year of its existence. Williams co-founded Blogger back in 1999, a service that, as it happens, also exposed its sequential post IDs to the public. Itching to compare the growth of the two services from their inception, I emailed Matt Webb about a script he’d written a few years ago that tracked the daily growth of Blogger. His stats didn’t go back far enough so I borrowed Andy’s idea and used Williams’ own blog to get his Blogger post IDs and corresponding dates. Here are the resulting graphs of that data. [1]
The first one covers the first 253 days of each service. The second graph shows the Twitter data through May 7, 2007 and the Blogger data through March 7, 2002. [Some notes about the data are contained in this footnote.]
As you can see, the two services grew at a similar pace until around 240 days in, with Blogger posts increasing faster than Twitter messages. Then around November 21, 2006, Twitter took off and never looked back. At last count, Twitter has amassed five times the number of messages that Blogger did in just under half the time period. But Blogger was not the slouch that the graph makes it out to be. Plotting the service by itself reveals a healthy growth curve:
From late 2001 to early 2002, Blogger doubled the number of messages in its database from 5M to 10M in under 200 days. Of course, it took Twitter just over 40 days to do the same and under 20 days to double again to 20M. The curious thing about Blogger’s message growth is that large events like 9/11, SXSW 2000 & 2001, new versions of Blogger, and the launch of blog*spot didn’t affect the growth at all. I expected to see a huge message spike on 9/11/01 but there was barely a blip.
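To put those doubling times on a common footing, here is a rough sketch that converts each one into an implied daily growth rate. It assumes steady exponential growth within each period, which is a simplification, and it treats the approximate figures above (200, 40, and 20 days) as exact. The function name is just for illustration.

```python
def daily_growth_rate(doubling_days):
    """Daily growth rate implied by a doubling time, assuming steady exponential growth."""
    return 2 ** (1 / doubling_days) - 1


# Approximate doubling times taken from the paragraph above (in days).
doubling_times = {
    "Blogger, 5M to 10M": 200,
    "Twitter, 5M to 10M": 40,
    "Twitter, 10M to 20M": 20,
}

for label, days in doubling_times.items():
    rate = daily_growth_rate(days)
    print(f"{label}: about {rate:.2%} per day")
```

Halving the doubling time roughly doubles the daily rate, so under these assumptions the 20-day stretch works out to about ten times Blogger's daily growth rate over its 200-day doubling.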
The second graph also shows that Twitter’s post-SXSW 2007 growth is real and not just a temporary bump…a bunch of people came to check it out, stayed on, and everyone messaged like crazy. However, it does look like growth is slowing just a bit if you look at the data on a logarithmic scale:
Actually, as the graph shows, the biggest rate of growth for Twitter didn’t occur following SXSW 2007 but after November 21.
As for why Twitter took off so much faster than Blogger, I came up with five possible reasons (there are likely more):
1. Twitter is easier to use than Blogger was. All you need is a web browser or mobile phone. Before blog*spot came along in August 2000, you needed web space with FTP access to set up a Blogger blog, not something that everyone had.
2. Twitter has more ways to create a new message than Blogger did at that point. With Blogger, you needed to use the form on the web site to create a post. To post to Twitter, you can use the web, your phone, an IM client, Twitterrific, etc. It’s also far easier to send data to Twitter programmatically…the NY Times account alone sends a couple dozen new messages into the Twitter database every day without anyone having to sit there and type them in.
3. Blogger was more strapped for cash and resources than Twitter is. The company that built Blogger ran out of money in early 2001 and nearly out of employees shortly after that. Hard to say how Blogger might have grown if the dot com crash and other factors hadn’t led to the severe limitation of its resources for several key months.
4. Twitter has a much larger pool of available users than Blogger did. Blogger launched in August 1999 and Twitter almost 7 years later in March 2006. In the intervening time, hundreds of millions of people, the media, and technology & media companies have become familiar and comfortable with services like YouTube, Friendster, MySpace, Typepad, Blogger, Facebook, and GMail. Hundreds of millions more now have internet access and mobile phones. The potential user base for the two probably differed by an order of magnitude or two, if not more.
5. But the biggest factor is that the social aspect of Twitter is built in and that’s where the super-fast growth comes from. With Blogger, reading, writing, and creating social ties were decoupled from each other but they’re all integrated into Twitter. Essentially, the top graph shows the difference between a site with social networking and one largely without. Those steep parts of the Twitter trend on Nov 21 and mid-March? That’s crazy insane viral growth [2], very contagious, users attracting more users, messages resulting in more messages, multiplying rapidly. With the way Blogger worked, it just didn’t have the capability for that kind of growth.
A few miscellaneous thoughts:
It’s important to keep in mind that these graphs depict the growth in messages, not users or web traffic. It would be great to have user growth data, but that’s not publicly available in either case (I don’t think). It’s tempting to look at the growth and think of it in terms of new users because the two are obviously related. More users = more messages. But that’s not a static relationship…perhaps Twitter’s userbase is not increasing all that much and the message growth is due to the existing users increasing their messaging output. So, grain of salt and all that.
What impact does Twitter’s API have on its message growth? As I said above, the NY Times is pumping dozens of messages into Twitter daily and hundreds of other sites do the same. This is where it would be nice to have data for the number of active users and/or readers. The usual caveats apply, but if you look at the Alexa trends for Twitter, pageviews and traffic seem to be leveling out. Compete, which only offers data as recently as March 2007, still shows traffic growing quickly for Twitter.
Just for comparison, here’s a graph showing the adoption of various technologies ranging from the automobile to the internet. Here’s another graph showing the adoption of four internet-based applications: Skype, Hotmail, ICQ, and Kazaa (source: a Tim Draper presentation from April 2006).
[Thanks to Andy, Matt, Anil, Meg, and Jonah for their data and thoughts.]
[1] Some notes and caveats about the data. The Blogger post IDs were taken from archived versions of Evhead and Anil Dash’s site stored at the Internet Archive and from a short-lived early collaborative blog called Mezzazine. For posts prior to the introduction of the permalink in March 2000, most pages output by Blogger didn’t publish the post IDs. Luckily, both Ev and Anil republished their old archives with permalinks at a later time, which allowed me to record the IDs.
The earliest Blogger post ID I could find was 9871 on November 23, 1999. Posts from before that date had higher post IDs because they were re-imported into the database at a later time so an accurate trend from before 11/23/99 is impossible. According to an archived version of the Blogger site, Blogger was released to the public on August 23, 1999, so for the purposes of the graph, I assumed that post #1 happened on that day. (As you can see, Anil was one of the first 2-3 users of Blogger who didn’t work at Pyra. That’s some old school flavor right there.)
Regarding the re-importing of the early posts, that happened right around mid-December 1999…the post ID numbers jumped from ~13,000 to ~25,000 in one day. In addition to the early posts, I imagine some other posts were imported from various Pyra weblogs that weren’t published with Blogger at the time. I adjusted the numbers subsequent to this discontinuity and the resulting numbers are not precise but are within 100-200 of the actual values, an error of less than 1% at that point and becoming significantly smaller as the number of posts grows large. The last usable Blogger post ID is from March 7, 2002. After that, the database numbering scheme changed and I was unable to correct for it. A few months later, Blogger switched to a post numbering system that wasn’t strictly sequential.
The data for Twitter from March 21, 2006 to March 15, 2007 is from Andy Baio. Twitter data subsequent to 3/15/07 was collected by me.
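The adjustment for the mid-December 1999 discontinuity described in this footnote might look something like the sketch below. This is not the exact method used for the graphs: the observations are invented apart from the rough ~13,000 to ~25,000 jump, and the jump date, import size, and function name are all assumptions made for illustration.

```python
from datetime import date

# Illustrative (date, post ID) observations; values are invented, apart from
# the rough ~13,000 -> ~25,000 jump described in the footnote.
observations = [
    (date(1999, 12, 10), 12_800),
    (date(1999, 12, 14), 13_000),
    (date(1999, 12, 15), 25_000),
    (date(1999, 12, 20), 25_600),
]

JUMP_DATE = date(1999, 12, 15)  # assumed; the footnote only says "mid-December 1999"
IMPORTED_POSTS = 12_000         # assumed size of the re-import (25,000 - 13,000)


def adjust(observations, jump_date, offset):
    """Subtract the assumed import size from every ID on or after the jump."""
    return [
        (day, post_id - offset if day >= jump_date else post_id)
        for day, post_id in observations
    ]


adjusted = adjust(observations, JUMP_DATE, IMPORTED_POSTS)
for day, post_id in adjusted:
    print(day.isoformat(), post_id)
```

A single constant offset ignores whatever posts were written normally on the day of the import, which is one reason adjusted numbers like these can only be approximate.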
[2] “Crazy insane viral growth” is a very technical epidemiological term. I don’t expect you to understand its precise meaning.
Marc Hedlund, founder of the intriguing Wesabe, recently made this interesting observation:
One of my favorite business model suggestions for entrepreneurs is, find an old UNIX command that hasn’t yet been implemented on the web, and fix that. talk and finger became ICQ, LISTSERV became Yahoo! Groups, ls became (the original) Yahoo!, find and grep became Google, rn became Bloglines, pine became Gmail, mount is becoming S3, and bash is becoming Yahoo! Pipes. I didn’t get until tonight that Twitter is wall for the web. I love that.
A slightly related way of thinking about how to choose web projects is to take something that everyone does with their friends and make it public and permanent. (Permanent as in permalinked.) Examples:
- Blogger, 1999. Blog posts = public email messages. Instead of “Dear Bob, Check out this movie.” it’s “Dear People I May or May Not Know Who Are Interested in Film Noir, Check out this movie and if you like it, maybe we can be friends.”
- Twitter, 2006. Twitter = public IM. I don’t think it’s any coincidence that one of the people responsible for Blogger is also responsible for Twitter.
- Flickr, 2004. Flickr = public photo sharing. Flickr co-founder Caterina Fake said in a recent interview: “When we started the company, there were dozens of other photosharing companies such as Shutterfly, but on those sites there was no such thing as a public photograph -- it didn’t even exist as a concept -- so the idea of something ‘public’ changed the whole idea of Flickr.”
- YouTube, 2005. YouTube = public home videos. Bob Saget was onto something.
Not that this approach leads naturally to success. Several companies are exploring music sharing (and musical opinion sharing), but no one’s gotten it just right yet, due in no small measure to the rights issues around much recorded music.