Adventures With document.documentElement.firstChild

Here’s an interesting DOM test-case I ran across inadvertently yesterday.

For the purpose of this post assume the following markup:

<!DOCTYPE html>
<html>
<!-- i broke the dom -->
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
    <title>Testcase</title>
</head>
<body>
<p>Something</p>
</body>
</html>

If I use document.documentElement.firstChild, I don’t get consistent behavior. In Firefox and IE I get the <head> element, which is what I was initially expecting. In WebKit (Safari/Chrome) and Opera I get the HTML comment, which I wasn’t expecting.

I think WebKit and Opera are technically correct on this as the DOM Level 2 specs state:

firstChild of type Node, read only
The first child of this node. If there is no such node, this
returns null.

A COMMENT_NODE is a node and therefore should be returned first. As for the position of the comment, the document is valid HTML5 and is also valid as XHTML 1.0 Strict and HTML 4 Strict. My interpretation is that the comment is indeed the first valid node in the documentElement.
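
One way to sidestep the difference is to skip non-element nodes explicitly instead of trusting firstChild. Here is a minimal sketch of that idea, assuming only the standard nodeType, firstChild and nextSibling properties. The helper name firstElementOf and the hand-built stand-in tree are my own illustration (hypothetical, not from any library); the stand-in exists only so the snippet also runs outside a browser.

```javascript
// Hypothetical helper: return the first child that is an element
// (nodeType 1), skipping comment (8) and text (3) nodes.
function firstElementOf(parent) {
  for (var node = parent.firstChild; node; node = node.nextSibling) {
    if (node.nodeType === 1) {
      return node;
    }
  }
  return null;
}

// Stand-in for the test markup above, built from plain objects:
// <html> contains a comment, then <head>, then <body>.
var body = { nodeType: 1, nodeName: 'BODY', firstChild: null, nextSibling: null };
var head = { nodeType: 1, nodeName: 'HEAD', firstChild: null, nextSibling: body };
var comment = { nodeType: 8, nodeName: '#comment', firstChild: null, nextSibling: head };
var html = { nodeType: 1, nodeName: 'HTML', firstChild: comment, nextSibling: null };

// firstChild is the comment, but the helper skips ahead to <head>.
var found = firstElementOf(html);
```

In a browser the equivalent call would be firstElementOf(document.documentElement).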

One of the reasons why I even thought to use document.documentElement.firstChild is that I saw Google doing it the other day for the new asynchronous tracking code for Google Analytics (currently in beta). Originally the code was:

var _gaq = _gaq || [];
  _gaq.push(['_setAccount', 'UA-XXXXX-X']);
  _gaq.push(['_trackPageview']);

  (function() {
    var ga = document.createElement('script');
    ga.src = ('https:' == document.location.protocol ? 'https://ssl' :
        'http://www') + '.google-analytics.com/ga.js';
    ga.setAttribute('async', 'true');
    document.documentElement.firstChild.appendChild(ga);
  })();

It has now been updated to prevent this problem. I don’t know if I was the first to report it or if it was already known by the Google engineers. The code, still in beta, is now:

var _gaq = _gaq || [];
  _gaq.push(['_setAccount', 'UA-XXXXX-X']);
  _gaq.push(['_trackPageview']);

  (function() {
    var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
    (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(ga);
  })();

The new code seems a bit more resilient. They also got rid of the longhand ga.setAttribute in favor of just ga.async and added the type attribute.
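
The fallback chain in the new snippet can be pulled out into a small function. This is only a sketch of the same idea, not Google's code: the names findScriptParent and makeFakeDocument are hypothetical, the final fallback to documentElement is my own addition, and the fake document exists only so the example runs outside a browser.

```javascript
// Hypothetical helper mirroring the fallback in the updated snippet:
// prefer <head>, then <body>, then (my own addition) the root element.
function findScriptParent(doc) {
  return doc.getElementsByTagName('head')[0] ||
         doc.getElementsByTagName('body')[0] ||
         doc.documentElement;
}

// Minimal stand-in document: maps a tag name to a list of fake elements.
function makeFakeDocument(elementsByTag) {
  return {
    documentElement: { nodeName: 'HTML' },
    getElementsByTagName: function (name) {
      return elementsByTag[name] || [];
    }
  };
}

var withHead = makeFakeDocument({
  head: [{ nodeName: 'HEAD' }],
  body: [{ nodeName: 'BODY' }]
});
var bodyOnly = makeFakeDocument({ body: [{ nodeName: 'BODY' }] });
```

Because the lookup is by tag name rather than by position, a comment before the head no longer matters. In a browser the call would be findScriptParent(document).appendChild(ga).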

There is a test case for anyone who wants to try it. I haven’t found a relevant Mozilla bug.


10 Responses to “Adventures With document.documentElement.firstChild”

  1. google analytics says:

    Thanks for writing this! It’s good to get this documented. We were aware of the issue…we just underestimated the number of sites with comments above the head. We made the wrong tradeoff. The new code isn’t as terse, but it should cover pretty much any page you throw at it.

  2. Boris says:

    The DOM spec is the wrong place to look for this, since your question is really what the DOM should look like. The relevant spec there would be the one that covers how to convert HTML source into a DOM: the HTML parsing spec. There isn’t one at the moment, though HTML5 is working on it. So currently behavior is undefined. Not sure what the HTML5 draft proposes for the behavior.

    In particular, whitespace before is treated magically in various UAs and in the HTML5 draft last I checked; comments may or may not be depending. It’s interesting that you didn’t expect firstChild to be the textnode coming before the comment; why not?

  3. Matt says:

    FYI document.head was added to HTML5[1] to simplify this case. The issue you found was discussed[2] on the whatwg list as one of the reasons for a better way to access the head of the document. Hopefully it will be one less browser compat. issue to worry about once implemented.

    [1] http://www.whatwg.org/specs/we.....ument-head
    [2] http://lists.whatwg.org/htdig......23105.html

  4. Robert says:

    @google analytics: The new code I think is pretty solid since it leaves little to chance. I’m pretty sure even I won’t manage to break it, at least from the implementer’s standpoint.

    @Boris: My expectations were admittedly based on previous experience more than specs. Not sure if I’ve ever run across a situation exactly like this before.

    @Matt: I’m aware of document.head, though I don’t foresee using it for quite some time, not until 99% of the world runs a web browser it will work on. At the rate we’re going that’s 2029.

  5. Sean Hogan says:

    If you need to access document.head more than once it is probably worth adding it (if it doesn’t already exist):

    if (!document.head) document.head = document.getElementsByTagName("head")[0];

    This should always work in HTML pages, even if the script occurs before the tag.

  6. Robert says:

    @Sean Hogan: Virtually all js libraries make it easy too; for example, $("head") will work in jQuery.

  7. Ms2ger says:

    According to HTML5, you should get the comment. (Relevant part of the spec: The “before head” insertion mode.) Mozilla’s HTML5 parser matches the spec here.

  8. James says:

    This post is a bit confusing on planet mozilla, as the HTML is partly parsed instead of displayed, so the comment is missing.

  9. Robert says:

    @James: Thanks, that’s an old bug I totally forgot about. It’s now fixed so next time it should show correctly.

  10. […] document.documentElement is the HTML element and its first child must be the head. Not necessarily, as it turns out. If there’s a comment following the HTML element, WebKits will give you the comment as the first child. Here’s an investigation with a test case. […]
