Thoughts On Chrome OS

Chrome OS is an interesting idea, though I still don’t see it as revolutionary like some people. To me it’s still a terminal but unlike the VT100 uses web standards.

Regarding reliability, in my opinion you’ve added new points of failure: your network connection, and the cloud. I’ve see my network connection and web services experience way more problems than my personal computer has.

Regarding security, you’re only as secure as your password to the cloud. Since all your data is synced to the cloud, anyone who can obtain access has it. No longer is physical access necessary. Disk encryption may have saved you when physical access is obtained, but in the cloud you’re often relying on what’s available.

Regarding cost, this becomes a toss up. On the plus side you can have cheap hardware. You don’t need much storage, or CPU. On the downside, your a slave to your network connection for even the most basic tasks. We’ve yet to enter a world of free wireless, and even broadband services are looking to switch to metered service as a replacement to the “all you can eat” plans we’re used to. A change to how bandwidth is priced can ruin this model overnight.

Lets not forget broadband performance in the US is far from stellar. Web UI has improved greatly over the years, but it’s hardly at the level of desktop applications.

Personally I see little value at this time for cheap hardware in exchange for giving up most control. I can replicate all the functionality of Chrome OS using a web browser, and get the added bonuses of a full operating system.

Would I use it? Perhaps as a throw around netbook, but not as a primary computer, or even for serious work. Maybe one day, but not in 2009, and I highly doubt 2010 will close all those gaps.

Google Chrome OS

The big news over the past 24 hours is the announcement of Google Chrome OS. Effectively Google Chrome OS is a stripped down Linux Kernel with just enough to boot Chrome/WebKit as it’s main UI. The exact UI paradigm hasn’t been reveled as of yet. Google claims:

Speed, simplicity and security are the key aspects of Google Chrome OS. We’re designing the OS to be fast and lightweight, to start-up and get you onto the web in a few seconds. The user interface is minimal to stay out of your way, and most of the user experience takes place on the web. And as we did for the Google Chrome browser, we are going back to the basics and completely redesigning the underlying security architecture of the OS so that users don’t have to deal with viruses, malware and security updates. It should just work.

It’s an interesting and somewhat bold statement.

Continue reading

Amazon S3 Outage

The buzz around the web today was the outage of Amazon’s S3. It shows what websites are “doing it right”, and who fails. This is a great follow up to my “Reliability On The Grid” post the other day.

Amazon S3 is cloud based computing. Essentially when you send them a file using their REST or SOAP interface Amazon stores it on multiple nodes in their infrastructure. This provides redundancy and security (in case a data center catches fire for example). Because of this design it’s often though that cloud based computing is invincible to problems. This is hardly the fact. Just like any large system, it’s complicated and full of hazards. It takes only a small software glitch, or an unaccounted for issue to cause the entire thing to grind to a halt. More complexity = more things that can fail.

Amazon S3 is popular because it’s cheap and easy to scale. It’s pay-per-use based on bandwidth, disk storage, and requests. Because that allows for websites to grow without having to make a large infrastructure investment, it’s popular for “Web 2.0” companies trying to keep their budgets tight. Notably sites like Twitter, WordPress.com, SmugMug and Amazon.com themselves all use Amazon S3 to host things like images.

Many sites, notably Twitter, and SmugMug didn’t have a good day today. WordPress.com and Amazon.com operated like normal. The obvious reason for this is WordPress.com and Amazon.com are much better in terms of infrastructure and design.

WordPress.com uses S3, but proxies that with Varnish. There’s a brief description here, and a more detailed breakdown here. According to Barry Abrahamson, WordPress.com does 1500 image requests per second across and 80-100 are served through S3. They have (slower) back up’s in house for when S3 is down and can failover if S3 has a problem. This means they can leverage S3 to their advantage, but aren’t down because of S3. Using Varnish allows them to keep the S3 bill down by using their own bandwidth (likely cheaper since they are a large site and can get better rates on bandwidth). This also and lets them have this have a good level of redundancy. Awesome job.

Amazon.com uses S3 themselves. If you look at images on the site, they are actually served from g-ecx.images-amazon.com. Which is actually:

g-ecx.images-amazon.com. 38     IN      CNAME   ant.mii.instacontent.net.

instacontent.net is actually part of Mirror Image, a CDN. This is essentially outsourcing what WordPress.com is doing in terms of caching. It’s similar to Akamai’s services. A CDN’s biggest advantage is lowering latency by using servers closer to the customer, which are generally going to feel faster. The other benefit is that they cache content for when the origin is having problems. Because Amazon has a layer on top of S3, they have an added level of protection and remained up and images loaded.

Twitter serves most images such as avatars right off of S3. This means when S3 went down, there were thousands of dead images on their pages. No caching, not even a CNAME in place. Image hosting is the least of their concerns. Keeping the service up and running is their #1 concern right now. The service was still usable, just ugly. Many users take advantage of third party clients anyway.

Using a CDN or having the infrastructure in house is obviously more expensive (it makes S3 more of a luxury than a cost savings measure), but it means your not depending on one third party for your uptime.

Reliability On The Grid

There’s been a lot of discussion lately (in particular NYTimes, Data Center Knowledge) regarding both reliability of web applications which users are becoming more and more reliant on, as well as the security of such applications. It’s a pretty interesting topic considering there are so many things that ultimately have an impact on these two metrics. I call them metrics since that’s what they really are.

Continue reading