Mountains of data, right at your fingertips

Last week, two announcements caught my eye. The first was from Amazon.com, which announced that there is now more than 1 TB of public data available to developers through its Public Data Sets on AWS project. The second was from the New York Times, which announced its Newswire API, providing access to all NYTimes articles as they are published.

This is a big deal. Never before has so much data been so readily available to anyone. The AWS data is particularly interesting. All of a sudden, any developer in the world has cost-effective access to all publicly available DNA sequences (including the entire Human Genome), an entire dump of Wikipedia, US Census data, and much more. Perhaps most importantly, the data is in machine-readable formats. It’s relatively easy for developers to tap into the data sources for cross-referencing, statistical analysis, and who knows what else.

The Newswire API is also really intriguing. It’s part of a growing set of APIs that the New York Times has made available. With the Newswire API, developers can get links and metadata for new articles the minute they are published. What will developers do with this data? Again, who knows. Imagination is the only limitation now that everyone can have immediate access.
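
To make "new articles the minute they are published" a little more concrete, here is a minimal sketch of the consuming side: remember which links you have already seen, and keep only the fresh ones from each batch. The field names (`url`, `title`) and the sample batches are my own assumptions for illustration, not the actual Newswire response format.

```python
def new_articles(batch, seen_urls):
    """Return the items in batch whose 'url' is not yet in seen_urls.

    seen_urls is updated in place so later batches skip these items.
    """
    fresh = []
    for item in batch:
        url = item["url"]
        if url not in seen_urls:
            seen_urls.add(url)
            fresh.append(item)
    return fresh


# Hypothetical batches, standing in for two successive polls of a feed.
first_poll = [
    {"url": "https://example.com/a", "title": "Story A"},
    {"url": "https://example.com/b", "title": "Story B"},
]
second_poll = [
    {"url": "https://example.com/b", "title": "Story B"},
    {"url": "https://example.com/c", "title": "Story C"},
]

seen = set()
for poll in (first_poll, second_poll):
    for article in new_articles(poll, seen):
        print(article["title"], article["url"])
```

Story B shows up in both batches but is only printed once, which is the whole point of tracking what has been seen.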

Both of these projects remove barriers and will help foster invention, innovation, and discovery. I hope they are part of a larger trend, where simple access to data becomes the norm. Google’s mission might be to organize the world’s information and make it universally accessible and useful, but it’s projects like these that are making that vision a reality. I can’t wait to see what comes next!

Amazon Web Services: Still getting better

I often think back to 2006 when Dickson and I were in the midst of the VenturePrize business plan competition. It was around that time that Amazon.com launched their first web service, the Simple Storage Service (S3). It had a huge impact on our business, and we’ve been extremely happy customers ever since.

Over the last couple of years, Amazon has introduced a number of additional web services, the best known of which might be the Elastic Compute Cloud (EC2). You can think of it as an on-demand computer in the cloud. I had a quick look at it when it launched, but being a Windows shop, we really didn’t have time to invest the extra effort necessary to get it running. Now, Amazon has announced that EC2 will support Windows:

Starting later this Fall, Amazon Elastic Compute Cloud (Amazon EC2) will offer the ability to run Microsoft Windows Server or Microsoft SQL Server. Our goal is to support any and all of the programming models, operating systems and database servers that you need for building applications on our cloud computing platform. The ability to run a Windows environment within Amazon EC2 has been one of our most requested features, and we are excited to be able to provide this capability. We are currently operating a private beta of Amazon EC2 running Windows Server and SQL Server.

Very cool news for Windows developers. It should put some extra pressure on Microsoft too – though apparently they are getting ready to launch something. Watch for more news on that at PDC.

Another interesting new service that Amazon is introducing is a Content Delivery Service:

This new service will provide you a high performance method of distributing content to end users, giving your customers low latency and high data transfer rates when they access your objects. The initial release will help developers and businesses who need to deliver popular, publicly readable content over HTTP connections.

It will run atop S3, so anything that currently exists there can easily be added to the new content delivery network. This is very cool, and will finally bring world-class CDN infrastructure to small businesses. I wish they had introduced this two years ago!

Those are both very important improvements to AWS. Amazon is raising the bar, again. When will Microsoft, Google, and others answer?

Also – I just noticed recently that Amazon has redesigned the AWS website. It looks fantastic, in my opinion, and is much easier to navigate. Keep the positive improvements coming!

What if Twitter had been built by Amazon.com's Web Services team?

I’ve been using Twitter for a long time now, and I can’t remember a period of downtime quite as bad as the current one. Features have been disabled, and there’s no ETA for when everything will be back to normal. Who knows, maybe it won’t ever be. Which got me wondering about why Twitter’s reliability is so terrible. Is it the nature of the application, or is it something to do with the people behind Twitter?

What if Twitter had been built by a different team, a team with a pretty good track record for high-availability services? What if Twitter had been built by the Web Services team at Amazon.com?

I think it’s safe to say that things would be quite different:

  1. Reliable, redundant infrastructure
    Twitter would be run inside Amazon’s high-availability data centers. We would never know (or care) that Twitter’s main database was named db006, nor would we ever wonder whether it has a good backup. We’d just know that if it’s good enough for Amazon, it’s good enough for us.
  2. No wondering, “is Twitter working?”
    Instead of wondering if Twitter is working correctly or waiting for Twitter messages or blog posts that explain what the problem is, Twitter would be part of the AWS Service Health Dashboard. We’d be able to see, at a glance, how Twitter is working now, and how well it has worked for the last month. This is what transparency is all about.
  3. Twitter wouldn’t be free, but we’d be cool with that
    Twitter would have had a business model from day one, and we’d all be cheering about how affordable it is. A pay-as-you-go model like all the other web services from Amazon would work quite well for Twitter. You get what you pay for, right?
  4. Premium Support and SLAs
    Speaking of getting what you pay for, Amazon would likely have realized that there are lots of different types of users, and they’d react accordingly. We’d probably have Premium Support for Twitter, to service support requests more efficiently. We’d also have Service Level Agreements.
  5. We wouldn’t call it Twitter…
    Of course, the service wouldn’t be called Twitter. In keeping with Amazon’s other services it would probably have a name like “Amazon Simple Messaging Service”, or SMS for short. Though I suppose that acronym is already taken!
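
Item 2 imagines seeing, at a glance, how well Twitter has worked for the last month. Here is a rough sketch of how that one number could be computed from a list of outage windows. The outage times are made up, the function name is mine, and it assumes the outages don't overlap each other; it is not how any real dashboard is implemented.

```python
from datetime import datetime, timedelta


def availability(outages, window_start, window_end):
    """Percentage of the window the service was up.

    outages is a list of (start, end) datetime pairs, assumed non-overlapping.
    Outages are clipped to the window before being counted.
    """
    total = (window_end - window_start).total_seconds()
    down = 0.0
    for start, end in outages:
        clipped_start = max(start, window_start)
        clipped_end = min(end, window_end)
        if clipped_end > clipped_start:
            down += (clipped_end - clipped_start).total_seconds()
    return 100.0 * (total - down) / total


# Hypothetical 30-day window with two outages: 6 hours and 18 hours.
window_start = datetime(2008, 5, 1)
window_end = window_start + timedelta(days=30)
outages = [
    (datetime(2008, 5, 3, 9, 0), datetime(2008, 5, 3, 15, 0)),
    (datetime(2008, 5, 20, 0, 0), datetime(2008, 5, 20, 18, 0)),
]

print(f"{availability(outages, window_start, window_end):.2f}% available")
# prints "96.67% available"
```

One full day of downtime in a 30-day month is all it takes to drop below 97%, which is a long way from the figures people usually expect in an SLA.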

I am a huge Twitter fan, and I really do hope that Ev, Biz, Jack, and the rest of the team get things working and fixed. With every passing hour of downtime though, I lose a little bit of faith. I wonder if Twitter would be better off in someone else’s hands.

Of course, if Twitter really had been built by AWS, there would be far more differences than just the items in my list above. The service may not be recognizable as Twitter!

That doesn’t mean that they couldn’t adopt some of these items as improvements, however. I’d love to see an official Twitter health dashboard, for instance. One can hope.