Quote your angle brackets

Here’s a neat fact: both IE and Firefox will find a </script> tag in the middle of a Javascript string constant. Now that’s probably not too surprising; the browser sort-of has to do that, or else a missing quote in a script block would eat the whole page. This came up in my universe however when I was doing some testing on a page I suspected wasn’t quoting stuff properly.

Any web-oriented templating mechanism (or, more generally, anything spewing out HTML programmatically) will have to worry about what to do with external data that needs to be dropped into the HTML. When it’s going into HTML code, like this:

  <label>Name:</label> ${whatever.name}

then you’ve got to translate angle brackets and ampersands to HTML entities “&lt;”, “&gt;”, and “&amp;”. However, when you’re dropping stuff into portions of the HTML document that are actually Javascript blocks (for example, when populating a data structure), you don’t do that. Instead, you have to massage the value so that it works inside a Javascript string constant (well, that’s what you do if you want to put it in a string constant, at least).

Our library has an “escapeJS” routine that does that sort of quoting. What it worried about (up until about 30 minutes ago) was quote characters, backslashes, and characters outside the old 7-bit printable ASCII range. Of note, it did not worry about less-than characters (left angle bracket, that is). I stumbled over this because I was getting a weird complaint from Firefox:

  whatever = "${mylib:escapeJS(whatever)}";

The error seemed strange because it was about the string being unterminated. When I brought up the source, however, the whole thing was there (including an embedded “</script>” in the string – remember, I was doing XSS testing). “Durrr,” I thought to myself. “What’s Firefox doing?” Embarrassingly it took me some time to get it. I thought that maybe it was a Firefox 3 thing, but no. When I saw that IE did it too, the feeble 20-watt light bulb went on.

I updated the “escapeJS” routine so that it treats less-than characters the same as control characters, encoding them with the Javascript “\u” notation. Probably everybody else in the world is less dumb than me, but nevertheless I figured I’d write this up.

PS: Ha ha I just realized that this is another “spring bug.”

9,832 Responses to “Quote your angle brackets”