Google Security Web Development

The Future Of SSL

Google announced the other day that it will now enable HTTPS by default on Gmail. Previously a user had to either manually type in HTTPS or change a setting to default to it, something most people likely never bothered to do. Google says it’s not related, but it seems oddly coincidental that this change coincides with its China announcement.

However Gmail using HTTPS is not the big story here.

The big story is that HTTPS is now being used in places where it was previously considered excessive. Once upon a time only financial information was generally sent over HTTPS. As time went on, so were most website login pages, though the rest of the site was often unencrypted. The reason for being so selective is that it’s more costly to scale HTTPS due to its CPU usage on the server side and its performance on the client side. These days CPU is becoming very cheap.

In the next few years I think we’ll see more and more of the web switch to using HTTPS. If things like network neutrality don’t work this trend could accelerate at an even quicker rate just like it did for P2P using MSE/PE to mask traffic.

Like I said, these days the CPU impact is pretty affordable; however, the performance impact due to HTTPS handshaking can be pretty substantial. Minimizing HTTP requests obviously helps. HTTP Keepalive is a good solution; however, that generally results in more child processes on the server as they aren’t freed as quickly (read: more memory needed).
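To put rough numbers on that, here is a back-of-the-envelope sketch. The round-trip counts and the 100 ms latency are illustrative assumptions, not measurements: one round trip for the TCP handshake, two more for a full (non-resumed) SSL handshake, and one per request, all fetched sequentially from a single host.

```python
# Rough latency model; every number here is an illustrative assumption.
TCP_HANDSHAKE_RTTS = 1   # SYN / SYN-ACK before data can flow
SSL_HANDSHAKE_RTTS = 2   # assumed full (non-resumed) SSL handshake
REQUEST_RTTS = 1         # one request/response exchange


def page_load_rtts(requests, https, keepalive):
    """Round trips needed to fetch `requests` resources sequentially."""
    setup = TCP_HANDSHAKE_RTTS + (SSL_HANDSHAKE_RTTS if https else 0)
    # With keep-alive the connection is set up once; without it, once per request.
    connections = 1 if keepalive else requests
    return connections * setup + requests * REQUEST_RTTS


def page_load_ms(requests, https, keepalive, rtt_ms=100):
    """Same model expressed in milliseconds for a given round-trip time."""
    return page_load_rtts(requests, https, keepalive) * rtt_ms


for https in (False, True):
    for keepalive in (False, True):
        print(https, keepalive, page_load_ms(10, https, keepalive))
```

Under these assumptions the handshake overhead is paid per connection, which is why reusing connections matters more over HTTPS than over plain HTTP.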

Mobile is a whole different ballgame since CPU is still more limited. I’m not aware of any mobile devices that have hardware to specifically handle SSL, which does exist for servers. Add in the extra latency and mobile really suffers. Perhaps it’s time to re-examine how various Crypto libraries are optimized for running on ARM hardware? I think the day will come where performance over SSL will matter as it becomes more ubiquitous.

Google Politics

Google vs. China

Google’s announcement about China is rather stunning in many respects from its candidness to the rather bold decision to potentially leave China over “[t]hese attacks and the surveillance they have uncovered–combined with the attempts over the past year to further limit free speech on the web…”.

Some may remember a few years ago that Yahoo! controversially provided information to the Chinese government that resulted in the arrest of Shi Tao and Li Zhi. There’s no evidence this impacted the decision but I would be shocked if it didn’t play any role.

It sounds like within the next few weeks we’ll know if Google and the Chinese government have come to an agreement regarding the censorship of search results. I suspect this is only a tiny part of the full story regarding


Google Nexus One Shaking Things Up?

Google’s Nexus One is now out. Given that they distributed a phone to employees a few weeks ago, this isn’t surprising and we all pretty much knew what was coming for a long time now. Mike Pinkerton (Google Employee, Apple fanboy) has a great and rather candid review of his experience with the device.

Based on everyone’s reviews and looking at the specs it’s pretty obvious: it’s evolutionary rather than revolutionary. The big advantages the Nexus One has on the hardware side are CPU and a camera with flash. Apple is almost at the end of an upgrade cycle so it’s expected to be beaten at this time. Apple’s next revision should catch up or beat it in most respects. On the software side Apple could even things out quickly if it were to loosen its tight grip on the App Store and allow things like duplicate functionality. Generally speaking Apple already wins thanks to a more consistent and polished UI.

It’s pretty well-known Google isn’t looking to make money off of hardware; they want to make it easier for people to use Google services anywhere/everywhere. That roughly translates to: “we want you to view more ads”. Google is the King Gillette of the web. I’m pretty sure Google wouldn’t mind putting more apps on the iPhone and getting more eyeballs on ads. Google tried via the Google Voice App but was met with resistance.

The most revolutionary thing about it is how it’s sold directly from Google and will be pretty much feature equal across providers. You can either get it subsidized by a mobile provider (T-Mobile for now, Verizon later) or unlocked at a higher price. I’m surprised they aren’t providing their own subsidy on an unlocked phone to try to rattle the mobile market. Right now the vast majority of Americans buy phones subsidized by a provider essentially locking them into an expensive plan. People go for this because the thought of spending several hundred dollars on a phone is scary. If Google were to make an affordable phone that competed with subsidized phones but was unlocked, providers would need to start offering data plans to compete for those customers and essentially break out of the cycle that the iPhone helped strengthen. The main thing keeping people locked into plans is the phone subsidies these days. If the perceived value of that contract diminished the long-term plans would no longer be attractive and competition of hardware and service would be separate.

Of course the downside to this is Google would be throwing a ton of money to create chaos in the mobile market and likely upset mobile providers enough to march to the FCC and demand action (I doubt that would go anywhere though). Google however did make a multi-billion dollar spectrum bid in the past with the goal of keeping it open, something they succeeded at despite losing the bid. That is possibly another win, since the bid may have been a bluff to get the policy. I’m not entirely sure they really wanted the actual spectrum.

If hardware and service competition were separate the mobile market would accelerate quicker since neither could rely on the other to make up for its shortcomings and keep selling. Each would sell or die based on its own merits.

Google’s said to have more phones in the works. I suspect at least one of those is a cheaper model that will at least partially attempt to open up the market and untie the cell phone from the provider.

Google Internet

Who Indexes Tweets

I was curious who is indexing the links that people tweet on Twitter. It’s obvious someone does since links get ‘clicks’ almost immediately after submission. To do this presumably they are tapping into the xmpp firehose.

Let’s take a look:

- - [06/Dec/2009:20:17:43 +0000] "GET /test HTTP/1.1" 301 20 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +"

I guess Google has a deal with Twitter. Googlebot indexed just a few seconds after it was sent. As far as I know nothing is actually announced. This is the first evidence I know of a potential deal of some sort. I’d be shocked if Google is scraping the site this quickly.

Edit: Stephen Duncan pointed out in the comments that this was announced in October. Totally forgot about that.

- - [06/Dec/2009:20:17:47 +0000] "GET /test HTTP/1.0" 301 - "-" "Mozilla/5.0 (compatible; Butterfly/1.0; + Gecko/2009032608 Firefox/3.0.8"

This is Topsy, a Twitter search engine. I had never seen this site before. After a few tests I actually kind of like the output.

- - [06/Dec/2009:20:17:58 +0000] "GET /test HTTP/1.1" 301 - "-" "Mozilla/5.0 (compatible; MSIE 6.0b; Windows NT 5.0) Gecko/2009011913 Firefox/3.0.6 TweetmemeBot"

Tweetmeme mines Twitter links and attempts to build a Digg-like index based on retweets rather than Diggs.

- - [06/Dec/2009:20:18:05 +0000] "GET /test HTTP/1.1" 301 - "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
- - [06/Dec/2009:20:20:25 +0000] "GET /test HTTP/1.1" 301 - "-" "Python-urllib/2.5"

Can’t identify these AWS hosted services.

- - [06/Dec/2009:20:20:53 +0000] "GET /test HTTP/1.1" 301 20 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
- - [06/Dec/2009:20:24:23 +0000] "GET /test HTTP/1.1" 301 20 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"

This is actually Microsoft. Microsoft’s Bing search engine indexes Twitter. I’m not sure why they indexed twice at such close intervals; that seems odd for this day and age.

Mining the logs a little deeper, it looks like when tweets meet certain criteria (such as being retweeted) there are other bots that spider them. It also looks like other search engines may be indexing at a slower rate (Baidu for example).

There are several others from AWS and a few other dedicated providers. These servers are obviously trying to keep a low profile; they don’t even have reverse DNS.

So there you go. Just a matter of seconds after a link hits Twitter this all happens.

Here’s a few more from another Tweet that weren’t in the first set:

Edit: More!:

- - [06/Dec/2009:20:49:42 +0000] "GET /test HTTP/1.1" 301 - "-" "Mozilla/5.0 (compatible; Feedtrace-bot/0.2;"

Feedtrace is some sort of Twitter mining service currently in beta.

- - [06/Dec/2009:20:49:45 +0000] "GET /test HTTP/1.0" 301 - "-" "Mozilla/5.0 (compatible; mxbot/1.0; +"

Chainn is a social data mining service with a few apps that make use of the data it collects.
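For anyone who wants to repeat the experiment, the log entries quoted above can be grouped by user agent with a few lines of Python. This is only a sketch: it assumes entries shaped like the ones in this post (timestamp, request, status, size, referer, user agent) and ignores whatever precedes the timestamp. The `user_agents` helper and the sample text are made up for illustration.

```python
import re
from collections import Counter

# Matches the tail of an access log entry as quoted in this post:
# [timestamp] "request" status size "referer" "user agent"
LOG_ENTRY = re.compile(
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" '
    r'"(?P<agent>[^"]*)"'
)


def user_agents(log_text):
    """Count requests per user agent string found anywhere in log_text."""
    return Counter(m.group("agent") for m in LOG_ENTRY.finditer(log_text))


# Two entries copied from the excerpts above.
sample = (
    '- - [06/Dec/2009:20:18:05 +0000] "GET /test HTTP/1.1" 301 - "-" '
    '"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"\n'
    '- - [06/Dec/2009:20:20:25 +0000] "GET /test HTTP/1.1" 301 - "-" '
    '"Python-urllib/2.5"\n'
)

for agent, hits in user_agents(sample).most_common():
    print(hits, agent)
```

Because the pattern is applied with `finditer` over the whole text, it also copes with several entries ending up on one line.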

Google Networking

Google DNS Privacy Policy

John Gruber, among others, notes that Google’s DNS service is not tied to Google Accounts. That’s not just wording in their privacy statement; it’s technically impossible for them to do otherwise, at least with reasonable accuracy.

Your computer is associated with a Google account via a cookie given to you when you log in. Cookies are sent back to Google’s servers as HTTP headers whenever you fetch something from the host that set the cookie (every request, even images). They can only be sent to that domain, nobody else.
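The domain scoping described above can be sketched roughly like this. It is a simplified version of the domain matching browsers do (it ignores IP addresses, paths, and public suffix rules), and the function names, cookie names, and values are all hypothetical.

```python
def domain_matches(request_host, cookie_domain):
    """Simplified domain match: same host, or a subdomain of the cookie's domain."""
    host = request_host.lower()
    domain = cookie_domain.lower().lstrip(".")
    return host == domain or host.endswith("." + domain)


def cookies_to_send(request_host, jar):
    """Return the cookies a browser would attach to a request for request_host."""
    return {
        name: value
        for name, (domain, value) in jar.items()
        if domain_matches(request_host, domain)
    }


# Hypothetical cookie jar: name -> (domain that set it, value).
jar = {
    "session": ("google.com", "abc123"),
    "tracker": ("example.org", "xyz789"),
}

print(cookies_to_send("mail.google.com", jar))  # only the google.com cookie
print(cookies_to_send("notgoogle.com", jar))    # nothing matches
```

Note the check for a leading dot before the domain: without it, a host like `notgoogle.com` would wrongly match `google.com`.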

DNS doesn’t operate over HTTP, and therefore can’t tell what Google Account you’re using.

Google could, however, take the IP address you used to log in to your Google Account and associate it with your DNS activity, but that would make the statisticians at Google cringe. So many homes and businesses have multiple computers behind a NAT router. Google DNS is unable to distinguish between them. Even one computer can have multiple users.

Before someone jumps up and says “MAC address”, the answer is: NO. To keep it simple a MAC address is part of the “Data Link Layer” of the OSI model (Layer 2) and is used to address adjacent devices. Your MAC address is only transmitted until the first hop which would be the first router on your way to Google. Each time your data makes it to the next device on its way to Google the previous MAC header is stripped off and a new one is added. By the time your bits get to Google that packet of data has only the last hop’s MAC address on it. Many people confuse Layers 2 and 3.
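A toy simulation makes the Layer 2 versus Layer 3 distinction concrete. This is a sketch, not a network stack: every address is made up, NAT is ignored, and each router is modeled with a single MAC address even though real routers have one per interface.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    src_mac: str  # Layer 2 source, rewritten at every hop
    dst_mac: str  # Layer 2 destination, rewritten at every hop
    src_ip: str   # Layer 3 source, carried end to end (NAT ignored here)
    dst_ip: str   # Layer 3 destination


def forward(frame, router_mac, next_hop_mac):
    """Model a router: drop the old MAC header, add one for the next link."""
    return replace(frame, src_mac=router_mac, dst_mac=next_hop_mac)


# Hypothetical addresses, chosen only for illustration.
frame = Frame("aa:aa:aa:00:00:01", "bb:bb:bb:00:00:01",
              "192.0.2.10", "198.51.100.53")
hops = [
    ("bb:bb:bb:00:00:01", "cc:cc:cc:00:00:01"),  # home router -> ISP router
    ("cc:cc:cc:00:00:01", "dd:dd:dd:00:00:01"),  # ISP router -> destination
]
for router_mac, next_hop_mac in hops:
    frame = forward(frame, router_mac, next_hop_mac)

print(frame.src_mac)  # MAC of the last router, not the original computer
print(frame.src_ip)   # still the original IP address
```

By the time the frame arrives, the only thing left that points back toward the sender is the Layer 3 address.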

Google Mozilla

Google Goes HTML5

I just noticed that Google is now serving its homepage with an HTML5 doctype:

<!doctype HTML>

I suspect this might have changed when they launched that new fade effect. I also noticed they are doing so when using the new YouTube “Feather” beta. This shouldn’t be too surprising considering their involvement in the HTML5 specs and developing a web browser and announcing it’s moving away from Google Gears.

Of course the pages don’t validate, and don’t really take advantage of many HTML5 features (that I’ve seen at least). But it’s a step in the right direction. With modern browsers like Firefox, Chrome, and Safari becoming more popular it’s slowly becoming a reality.

Google Networking

Google Public DNS Analysis

Google’s new Public DNS is interesting. They want to lower DNS latency in hopes of speeding up the web.

Awesome IP Address

This is the most interesting thing to me. I view IP addresses similar to the way Steve Wozniak views phone numbers, though I don’t collect them like he does phone numbers.

Level 3 Communications, Inc. LVLT-ORG-8-8 (NET-8-0-0-0-1) 
Google Incorporated LVLT-GOOGL-1-8-8-4 (NET-8-8-4-0-1) 

# ARIN WHOIS database, last updated 2009-12-02 20:00
# Enter ? for additional hints on searching ARIN's WHOIS database.

Looks like Google is working with Level 3 (also their partner for Google Voice I hear) for the purpose of having an easy to remember IP. From what I can tell it’s anycasted to a Google data center.

For what it’s worth, is owned by the US Army. Make of that what you will.


My first thought was that Google would hijack NXDOMAIN for the purpose of showing ads, like many ISPs and third party DNS providers. Instead they explicitly state:

If you issue a query for a domain name that does not exist, Google Public DNS always returns an NXDOMAIN record, as per the DNS protocol standards. The browser should show this response as a DNS error. If, instead, you receive any response other than an error message (for example, you are redirected to another page), this could be the result of the following:

  • A client-side application such as a browser plug-in is displaying an alternate page for a non-existent domain.
  • Some ISPs may intercept and replace all NXDOMAIN responses with responses that lead to their own servers. If you are concerned that your ISP is intercepting Google Public DNS requests or responses, you should contact your ISP.

Good. Nobody should ever hijack NXDOMAIN. DNS should be handled per spec.

Performance Benefits

Google documented what they did to speed things up. Some of it anyway. Good news is they will still be obeying TTL it seems. My paraphrasing:

  • Infrastructure – Tons of hardware/network capacity. No shocker.
  • Shared caching in the cluster – Pretty self explanatory.
  • Prefetching name resolutions – Google is using their web search index and DNS server logs to figure out who to prefetch.
  • Anycast routing – Again obvious. They do note however that this can have negative consequences:

    Note, however, that because nameservers geolocate according to the resolver’s IP address rather than the user’s, Google Public DNS has the same limitations as other open DNS services: that is, the server to which a user is referred might be farther away than one to which a local DNS provider would have referred. This could cause a slower browsing experience for certain sites.
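To illustrate what "obeying TTL" means for a caching resolver, here is a minimal sketch of a cache that forgets an answer once its TTL runs out. It is not a description of Google's implementation; the class name, the injectable clock, and the addresses are all assumptions made for the example.

```python
import time


class TtlCache:
    """Tiny name -> address cache that honors each record's TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (address, expires_at)

    def put(self, name, address, ttl_seconds):
        """Store an answer along with the moment it stops being valid."""
        self._entries[name] = (address, self._clock() + ttl_seconds)

    def get(self, name):
        """Return the cached address, or None if missing or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # expired: force a fresh lookup
            return None
        return address


# Usage with a fake clock so the expiry is easy to see.
now = [0.0]
cache = TtlCache(clock=lambda: now[0])
cache.put("example.com", "192.0.2.1", ttl_seconds=300)
print(cache.get("example.com"))  # cached answer
now[0] = 300.0
print(cache.get("example.com"))  # None, the TTL has expired
```

Prefetching would amount to refreshing popular names shortly before they expire instead of waiting for the next query to miss.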

Google also discusses the security practices to mitigate some common security issues.


Google’s privacy policy says they erase any IP information after 24-48 hours. Assuming you trust Google, that may be better than what your ISP is doing, though your ISP could still log by monitoring DNS traffic over their network. As far as I’m aware there are no US laws governing data retention, though some have been proposed several times.

I am curious how this will be treated in Europe, which does have some data retention laws for ISPs. Does providing DNS, traditionally an ISP activity, make you an ISP? Or do you need to handle transit as well? Does an ISP need to track DNS queries of someone using a 3rd party DNS? Remember, recording IPs alone is not the same thanks to virtual hosting. Many websites can exist on one IP.

OpenDNS and others may have flown under the radar being smaller companies, but Google will attract more attention. I suspect it’s only a matter of time before someone raises this question.

Would I use it?

I haven’t seen any DNS related problems personally. I’ve seen degraded routing from time to time from my ISP. Especially in those cases, my nearby ISP provided DNS would be quicker than Google. I don’t really like how nameservers may geolocate me further away, but that’s not a deal killer. I don’t plan on switching since I don’t see much of a benefit at this time.


Thoughts On Chrome OS

Chrome OS is an interesting idea, though I still don’t see it as revolutionary like some people do. To me it’s still a terminal, but unlike the VT100 it uses web standards.

Regarding reliability, in my opinion you’ve added new points of failure: your network connection, and the cloud. I’ve seen my network connection and web services experience way more problems than my personal computer has.

Regarding security, you’re only as secure as your password to the cloud. Since all your data is synced to the cloud, anyone who can obtain access has it. No longer is physical access necessary. Disk encryption may have saved you when physical access is obtained, but in the cloud you’re often relying on what’s available.

Regarding cost, this becomes a toss up. On the plus side you can have cheap hardware. You don’t need much storage, or CPU. On the downside, you’re a slave to your network connection for even the most basic tasks. We’ve yet to enter a world of free wireless, and even broadband services are looking to switch to metered service as a replacement to the “all you can eat” plans we’re used to. A change to how bandwidth is priced can ruin this model overnight.

Let’s not forget broadband performance in the US is far from stellar. Web UI has improved greatly over the years, but it’s hardly at the level of desktop applications.

Personally I see little value at this time for cheap hardware in exchange for giving up most control. I can replicate all the functionality of Chrome OS using a web browser, and get the added bonuses of a full operating system.

Would I use it? Perhaps as a throw around netbook, but not as a primary computer, or even for serious work. Maybe one day, but not in 2009, and I highly doubt 2010 will close all those gaps.

Google Mozilla

Google Buys On2

Google today announced they are buying On2 Technologies. This is one of their more significant purchases despite the relatively low price tag of $106.5 million since it’s video technology and Google is the largest video source on the web right now.

On2 is really an unknown to most people but their product has an amazing reach thanks to Adobe Flash. VP6 notably was included in Flash 8 and really brought about the age of Flash video (think YouTube). On2 also has VP7, which is considered an H.264 competitor. VP3 was released as open source and lives on as OGG Theora.

Of course by buying On2 Google will not need to pay any licensing for its VP7 technology, so they can bundle it into Chrome, Android and Google Chrome OS (finally giving Linux decent web video support). They could also open source it, as they did with these platforms, in hopes that it will gain ubiquity.

This does however leave me wondering if this pending On2 deal had any bearing on the decision to leave the HTML 5 <video/> codec ambiguous. It’s noteworthy since Google is very involved in the HTML 5 efforts. As I mentioned last month, licensing is really key. If VP7 were open sourced and its licensing were compatible with Apple’s and Mozilla’s needs (which could lead to inclusion in Safari and Firefox respectively), OGG Theora is potentially dead overnight. Given Google’s strategy so far of making technology open source in an effort to encourage adoption, I wouldn’t rule this out, though it would likely take a while to evaluate everything and make sure they legally have that option. Timeline could also come into play here. The web isn’t necessarily going to wait for Google. These reviews can potentially take a long time. There is no guarantee others will incorporate it either, though it’s a pretty good deal should licensing work.

It’s also interesting that now Microsoft has Windows Media Player, Apple has QuickTime, and Google has On2’s codec bundle. It’s not exactly a “player”, but considering its usage it’s just as important.

It’s going to be very interesting to see how this plays out. One thing that seems relatively certain is that Google just made web video more interesting.