Disabling Java In Your Browser

For the past 2 years now I’ve been browsing the web with Java disabled. I’ve had less than 5 situations where I needed to turn it on to do something, and all of those were situations with a limited audience (a very old technical tool, intranet applications). I’m of the opinion you really don’t need it enabled to happily browse the web anymore. I can’t disable Flash yet, but Java I seem to be largely fine without. I still have it on my computer in case I need it, but it’s seldom.

Given the past security issues and the fact that Java is outright annoying UI wise and slow to load, I don’t miss it at all. It served a purpose years ago in a webpage when it was difficult to build apps, but those days are long gone. It’s amazing if you remember Java being used for mouseovers way back when.

On HTML5 And The Future Of Privacy

Today’s alarmist without much research news is “New Web Code Draws Concern Over Risks to Privacy” about HTML5 and its threat to privacy. How evil of HTML5 and its creators.

The Real Deal

Persistent cookies are nothing new. Essentially the strategy works like this: Store data everywhere you can on the users footprint, and if data it deleted in a few locations, you copy it back from another location the next time you can. It’s regenerative by design. A popular example is evercookie which uses:

  • Standard HTTP Cookies
  • Local Shared Objects (Flash Cookies)
  • Storing cookies in RGB values of auto-generated, force-cached PNGs using HTML5 Canvas tag to read pixels (cookies) back out
  • Storing cookies in and reading out Web History
  • Storing cookies in HTTP ETags
  • Internet Explorer userData storage
  • HTML5 Session Storage
  • HTML5 Local Storage
  • HTML5 Global Storage
  • HTML5 Database Storage via SQLite

Note that several of these aren’t HTML5 specific. More than one of which isn’t cleared by just “erasing cookies”.

HTML5 does add a few new possibilities, but they are also by design as easy to control, monitor and restrict as your browser (or third-party add-on) will allow. HTML5 storage mechanisms are bound to the host that created them making them easy to search/sift/manage as HTTP cookies. Much worse are some of the more obscure cookie methods (Flash Cookies, various history hacks). They don’t really provide any more of a privacy risk than what the browser already has been offering for the past decade.

To Shut Up The Geolocaiton Conspiracy Theorists

Before someone even attempts the “Geolocation API lets advertisers know my location” myth, lets get this out of the way. The specification explicitly states:

User agents must not send location information to Web sites without the express permission of the user. User agents must acquire permission through a user interface, unless they have prearranged trust relationships with users, as described below. The user interface must include the URI of the document origin [DOCUMENTORIGIN]. Those permissions that are acquired through the user interface and that are preserved beyond the current browsing session (i.e. beyond the time when the browsing context [BROWSINGCONTEXT] is navigated to another URL) must be revocable and user agents must respect revoked permissions.

Some user agents will have prearranged trust relationships that do not require such user interfaces. For example, while a Web browser will present a user interface when a Web site performs a geolocation request, a VOIP telephone may not present any user interface when using location information to perform an E911 function.

To my knowledge no user agent implements Geolocation without complying with these specifications. None.

No HTML5 Needed For Fingerprinting

Even if you do manage to wipe all the above storage locations, you’re still not untraceable. Browser fingerprinting is the idea that just your system configuration makes you unique enough to be traceable. This includes things like your browser version, platform, flash version, and various other bits of data plugins may additionally leak. The EFF recently did a rather impressive study to learn about the accuracy of this technique. Computers with Flash and Java installed sport 18.8 bits of entropy and result in 94.2% of browsers being unique in the EFF study [cite, pdf]. Of course their data was likely skewing towards more experienced web users who are more likely to have an assortment of customizations to their computer (specific plugins, more variety in web browsers, operating systems, fonts) than the average internet user. I’d wager that their data downplays the effectiveness of this technique.

The idea that HTML5 is a privacy risk is FUD. It doesn’t provide any worse security than anything else already out there. It’s actually easier to counteract than what’s already being used since it’s handled by the browser.

The Future

I still believe all browsers out there can do a much better job of protecting privacy when it comes to local data storage for the purpose of tracking. What I believe what needs to happen is web browsers need to start moving away from the “cookie manager” interfaces that are now a decade+ old and move towards a “my data management” interface that lets users view and delete more than just cookies. It needs to encompass all the storage methods listed above as supported by the browser. Hooks should also exist so that plug-ins that have data storage (like Flash) can also be dealt with using the same UI.

Additionally it needs to be possible to control retention policies per website. For example I should be able to let Google storage persist indefinitely, Facebook for 2 weeks, and Yahoo for the length of my browser session should I wish.

My personal preference would be for a website to denote the longest storage time for any object on a webpage in the UI. Clicking on it would give a breakdown of all hostnames that makeup the page, what they are storing and let the user select their own policy. With 2 clicks I could then control my privacy on a granular level. For example visiting SafePasswd.com would give me a [6] in the UI. Clicking would show me a panel this:

+------------------------------------------------------------------------------+
| My Data Settings for SafePasswd.com:                                         |
|                                                                              |
|  Host                        Longest Requested Lifespan    Your Choice       |
|                                                                              |
| *safepasswd.com              2 years                       [site default]    |
| googleads.g.doubleclick.net  6 years                       [browser session] |
|                                                                              |
|                                                                              |
|                                                       (Done)  (Cancel)       |
+------------------------------------------------------------------------------+

I could then override googleads.g.doubleclick.net to be for the browser session via the drop down if that’s what I wanted. I could optionally forbid it from saving anything if that’s what I wanted. I could optionally click-through for more detail or view the data to help me make my decision. Perhaps this would also be a good place for P3P like data to be available. One of the notable failures of P3P that impeded usage was it was never easy to view so it never caught on.

The browser would then remember I forbid googleads.g.doubleclick.net from storing data beyond my browser session. This would apply to googleads.g.doubleclick.net regardless of what website it was used on.

This model works better than the “click to confirm cookie” model that only a handful of people on earth ever had the patience for. It provides easy access to control and view information with minimal click-throughs.

It also makes a web page much more transparent to an end-user who could then easily see who they are interacting with when they visit one webpage with several ads, widgets, social media integration points etc.

One click to view data policies, two clicks to customize, three to save.

HTML5 is not a risk here. The web moving to HTML5 is like going from the lawless land to a civilized society where structure and order rule.

Is IE8 Trident’s Last Stand?

Randall C. Kennedy at InfoWorld wrote:

IE8 is the last version of the Internet Explorer Web browser. At least, that’s what I’m hearing through the grapevine. It seems that Microsoft is preparing to throw in the towel on its Internet Explorer engine once and for all.

There were rumors earlier this year that the IE team was looking at WebKit a few months ago. I said then and I still think that’s a real perilous approach considering the legacy they need to somehow support. The other approach is to start over, something that’s possibly on the works.

Any truth to these claims? I don’t know. Though I’d be curious to see how Microsoft handles it’s customers who expect old applications to keep working and others who want Microsoft to catch up with progress. I doubt they can go either way 100%. Which way will they lean? I think that’s anyone’s guess.

How To Be More Secure With Your Data & Identity

It’s amazing how on a daily basis there’s a story about someone’s identity or data being stolen, personal info being misused, or just getting screwed via the Internet. Most of the time it’s due to a complete lack of standards regarding how people treat their digital property and identity. It’s the electronic equivalent of leaving your home and not locking the door. Anyone can come in and take what they want.
Continue reading

Tab Impact On Total Time Spent

As everyone in the industry knows, Nielsen/NetRatings no longer relies on page views instead preferring total time spent. This makes sense since ajax applications can have 1 page view, but keep a user for an hour. Not to mention other things like video or Flash. The use of time spent is likely much more accurate. In my mind “time spent” is time actually spent on the site (I’m a literal guy).

This of course raises an interesting question. How do tabs influence this metric? Take the following situation as an example. A user visits a home page, and opens a link in a new tab. Then finds another link and opens it in a new [background] tab. That’s 3 tabs in 1 visit (assume visit to be 30 minutes).

Before tabs, most browser sessions would look like this:
Linear Pathing

There’s now an increasing number that will look like this (gray is a tab not in view):
Tabbed Pathing

If we assume total time on the site is time between the first and last page, we potentially undercount the total time on sites that list information (for example Digg). The total time to make those clicks could be < 10 seconds, but the time spent reading those two page alone might be > 10 minutes. Many tab power-users from what I’ve read around the web over the years essentially use them as a way to bookmark their “to read” list (including myself). It also undercounts sites like Gmail which are ajax based (1 page) but can be used for several minutes.

If we use javascript to “ping” (call back by placing a tracker gif) the analytics service every x seconds to see if the page is still open, we potentially double count since a user can’t be in 3 tabs at once. The clock would be counting 3 seconds for every 1 second the user is actually looking at the page.

This raises the question: are sites that are heavily used by Firefox, Safari, Opera and IE7 site underestimated or overestimated because of the way users browse the site? How do you accurately tell how long a view is when a user can have multiple tabs?

Another example is someone who keeps their webmail open in a tab all afternoon for easy access. They may only check it 1x measuring no more than 1 minute in actual attention. But it’s open for 5 hrs. What is the real time on the page? You can measure my interaction (opening/closing mail). But what if I’m reading an email for an hour (it’s a really complicated one)? How does that compare to just leaving it open in the background?

This is really no different than using new windows, the difference being that most people seem to have found windows to be annoyance, while tabs are a “feature”. The increase in usage and popularity in a time where visit length matters raises an interesting question. How do you measure it?

One assumption is that it’s just a small percentage of the population, which is likely true. The problem with this assumption is that it’s one subject to change as the browserscape matures and users learn about new features. Another assumption is to just account for all time a page is open, even if it’s not visible. The downside I see here is that it’s pretty inaccurate. As a content producer I’d like to know if my content is used, or just loaded on a users computer. If I were an advertiser I’d care even more.

I’m not sure how analytics firms approach this. In a sense it’s similar to the “hotel problem“. Perhaps just something you need to decide upon and live with.

Firefox Tip: Don’t Let Websites Resize Your Browser Window

Have a favorite website that still thinks its 1999? Resizes your browser window into a small awkward space? It can be annoying. You have that big display, and you should be able to use it. Thankfully you can prevent this. Just go into Tools-> Options and select the “Content” tab. Then click on the “Advanced” button across from “Enable JavaScript”. Uncheck the “Move or resize existing windows” checkbox. Now you don’t have to deal with this.

Camino 1.5

Camino 1.5 is out. It’s a great product for Mac users. Lets face it, the best browsers are on the Mac right now. Camino, Firefox, Safari, Shiira, and OmniWeb. All provide an excellent user experience. Camino is a great balance between the Gecko rendering engine (which has the benefit of extra market share thanks to it’s cross platform nature and sibling Firefox’s efforts) and a smooth UI. The obvious downside being the lack of extensions.

Confusing Cross Browser UI Design

Most have heard by now that Internet Explorer is adopting the Firefox RSS icon to standardize and help users who hate having to remember what equivalent icons are. Of course this is great for users. Though I wish they were a bit more consistent with their practices. UI design cross browsers is important simply for security purposes (as I will demonstrate). IE has apparently made some great strides in combating Phishing. What I disagree with, is how they implemented the UI. I think it’s confusing, and could easily be fixed, should they decide to do so.

Their scheme essentially works by coloring the URL bar based on how suspicious the website is. Known scammers get red, suspected get yellow, and a potential good site would be green. This is obviously modeled after a traffic light.

What I dislike is how that can be confusing to the end user. Right now, the colored URL bar technique is used by Firefox and Opera to distinguish a secure website (since it’s more obvious than the little lock). Take a look at the little demo I have here:

Good Site Opera 9

Opera 7

Good Site Firefox 1.5

Firefox 1.5

Bad Site Internet Explorer 7

Internet Explorer 7

Screenshot from IE Blog.

For an end user, who doesn’t follow browser changes, and perhaps first encounters IE 7 at work, or in a public terminal. Seeing the yellow bar is familiar. We know that as being safe. I think many wouldn’t even notice the “Suspicious Website” text on the right side. The shield even looks a bit like the Lock icon in Firefox. Very confusing.

My suggestion is to use another color, in particular, one that I call “orange”. I release the color “orange” under a Public Domain License. Anyone may use it, however they may wish, no need to credit me ;-) (though I’d appreciate it).

Bad Site Internet Explorer 7 + My Solution

Internet Explorer 7

This would distinguish the site as a possible fraudulent website, but still avoid using Yellow, which many users now view as “secure” aka “safe”. This solution solves the problem of conflicting UI design between browsers.