image/jpeg    Signature
IE 7          DATA[0:1] == 0xffd8
Firefox 3     DATA[0:2] == 0xffd8ff
Safari 3.1    DATA[0:3] == 0xffd8ffe0
Chrome        DATA[0:2] == 0xffd8ff

image/gif     Signature
IE 7          (strncasecmp(DATA,"GIF87",5) == 0) ||
              (strncasecmp(DATA,"GIF89",5) == 0)
Firefox 3     strncmp(DATA,"GIF8",4) == 0
Safari 3.1    N/A
Chrome        (strncmp(DATA,"GIF87a",6) == 0) ||
              (strncmp(DATA,"GIF89a",6) == 0)

image/png     Signature
IE 7          (DATA[0:3] == 0x89504e47) &&
              (DATA[4:7] == 0x0d0a1a0a)
Firefox 3     DATA[0:3] == 0x89504e47
Safari 3.1    N/A
Chrome        (DATA[0:3] == 0x89504e47) &&
              (DATA[4:7] == 0x0d0a1a0a)

image/bmp     Signature
IE 7          (DATA[0:1] == 0x424d) &&
              (DATA[6:9] == 0x00000000)
Firefox 3     DATA[0:1] == 0x424d
Safari 3.1    N/A
Chrome        DATA[0:1] == 0x424d

Table 1. Signatures for four popular image formats.
DATA is the sniffing buffer. The nomenclature is
detailed in the Appendix.
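
As an illustration, the entries of Table 1 can be transcribed into small predicates over the sniffing buffer. The sketch below is not taken from any browser's source; the names SIGNATURES and matches are hypothetical. It assumes that DATA[a:b] denotes the inclusive byte range a through b, so DATA[0:3] becomes the Python slice d[0:4]. N/A cells are represented as None.

```python
from typing import Callable, Dict, Optional

# None stands for an "N/A" cell in Table 1.
Signature = Optional[Callable[[bytes], bool]]

PNG_HEAD = b"\x89PNG"      # 0x89504e47
PNG_TAIL = b"\r\n\x1a\n"   # 0x0d0a1a0a


def _gif_5_nocase(d: bytes) -> bool:
    # strncasecmp(DATA, "GIF87" or "GIF89", 5): case-insensitive, 5 bytes.
    return d[:5].upper() in (b"GIF87", b"GIF89")


def _gif_6_exact(d: bytes) -> bool:
    # strncmp(DATA, "GIF87a" or "GIF89a", 6): case-sensitive, 6 bytes.
    return d[:6] in (b"GIF87a", b"GIF89a")


def _png_full(d: bytes) -> bool:
    return d[0:4] == PNG_HEAD and d[4:8] == PNG_TAIL


def _bmp_short(d: bytes) -> bool:
    return d[0:2] == b"BM"  # 0x424d


SIGNATURES: Dict[str, Dict[str, Signature]] = {
    "image/jpeg": {
        "IE 7": lambda d: d[0:2] == b"\xff\xd8",
        "Firefox 3": lambda d: d[0:3] == b"\xff\xd8\xff",
        "Safari 3.1": lambda d: d[0:4] == b"\xff\xd8\xff\xe0",
        "Chrome": lambda d: d[0:3] == b"\xff\xd8\xff",
    },
    "image/gif": {
        "IE 7": _gif_5_nocase,
        "Firefox 3": lambda d: d[0:4] == b"GIF8",
        "Safari 3.1": None,
        "Chrome": _gif_6_exact,
    },
    "image/png": {
        "IE 7": _png_full,
        "Firefox 3": lambda d: d[0:4] == PNG_HEAD,
        "Safari 3.1": None,
        "Chrome": _png_full,
    },
    "image/bmp": {
        "IE 7": lambda d: _bmp_short(d) and d[6:10] == b"\x00" * 4,
        "Firefox 3": _bmp_short,
        "Safari 3.1": None,
        "Chrome": _bmp_short,
    },
}


def matches(browser: str, mime: str, data: bytes) -> Optional[bool]:
    """Return True/False for a signature check, or None where Table 1 says N/A."""
    signature = SIGNATURES[mime][browser]
    return None if signature is None else signature(data)
```

Under these assumptions, a buffer beginning with GIF88 satisfies the Firefox 3 GIF signature but not the IE 7 or Chrome ones, which shows how the same bytes can be classified differently by different browsers.
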
Signatures. We find that each browser employs different
signatures. Table 1 shows the different signatures for four
popular image types. Understanding the exact signatures
used by browsers, especially the HTML signature, is crucial
in constructing content-sniffing XSS attacks. The HTML
signatures used by browsers differ not only in the set of
HTML tags, but also in how the algorithm searches for
those tags. Internet Explorer 7 and Safari 3.1 use permissive
HTML signatures that search the full sniffing buffer (256
bytes and 1024 bytes, respectively) for predefined HTML
tags. Firefox 3 and Google Chrome, however, use strict
HTML signatures that require the first non-whitespace character
to begin one of the predefined tags. The permissive
HTML signatures in Internet Explorer 7 and Safari 3.1
let attackers construct chameleon documents because a file
that begins GIF89a<html> matches both the GIF and the
HTML signature. Table 2 presents the union of the HTML
signatures used by the four browsers. These browsers will
not treat a file as HTML if it does not match this signature.
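
The difference between permissive and strict HTML signatures can be sketched as follows. This is a simplified model, not browser code: the tag list ILLUSTRATIVE_TAGS is a hypothetical two-tag subset, the whitespace set is an assumption, and the function names are invented for illustration. Only the buffer sizes (256 and 1024 bytes) and the GIF89a<html> example come from the text above.

```python
# Hypothetical subset of tags; real browsers use longer, browser-specific lists.
ILLUSTRATIVE_TAGS = (b"<html", b"<script")

# Assumption: which bytes count as whitespace is not specified in the text.
WHITESPACE = b" \t\r\n"


def permissive_html_match(data: bytes, buffer_size: int) -> bool:
    """Search the whole sniffing buffer for any tag, case-insensitively."""
    window = data[:buffer_size].lower()
    return any(tag in window for tag in ILLUSTRATIVE_TAGS)


def strict_html_match(data: bytes) -> bool:
    """Require a tag to start at the first non-whitespace byte."""
    ptr = data.lstrip(WHITESPACE)
    return ptr.lower().startswith(ILLUSTRATIVE_TAGS)


def looks_like_gif(data: bytes) -> bool:
    """Six-byte GIF signature (the strictest GIF check in Table 1)."""
    return data[:6] in (b"GIF87a", b"GIF89a")


# A chameleon in the sense of the text: valid GIF prefix followed by HTML.
CHAMELEON = b"GIF89a<html><script>alert(1)</script>"

# A tag placed beyond 256 bytes but within 1024 bytes.
DEEP_TAG = b"A" * 300 + b"<html>"
```

In this model, CHAMELEON passes both looks_like_gif and permissive_html_match with a 256-byte buffer, but fails strict_html_match because its first non-whitespace byte is "G". DEEP_TAG is missed by a 256-byte permissive search and found by a 1024-byte one, which illustrates why the buffer size matters to an attacker.
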
Restrictions. We find that some browsers restrict when
certain MIME types can be sniffed. For example, Google
Chrome restricts which Content-Type headers can
be sniffed as HTML to avoid privilege escalation (see
Section 3). Table 5 in the Appendix shows which
Content-Type header values each browser is willing to
sniff as HTML.
text/html Signature
(strncmp(PTR,"<!",2) == 0) ||
(strncmp(PTR,"<?",2) == 0) ||
(strcasestr(DATA,"<HTML") != 0) ||
(strcasestr(DATA,"<SCRIPT") != 0) ||
(strcasestr(DATA,"<TITLE") != 0) ||
(strcasestr(DATA,"<BODY") != 0) ||
(strcasestr(DATA,"<HEAD") != 0) ||
(strcasestr(DATA,"<PLAINTEXT") != 0) ||
(strcasestr(DATA,"<TABLE") != 0) ||
(strcasestr(DATA,"<IMG") != 0) ||
(strcasestr(DATA,"<PRE") != 0) ||
(strcasestr(DATA,"text/html") != 0) ||
(strcasestr(DATA,"<A") != 0) ||
(strncasecmp(PTR,"<FRAMESET",9) == 0) ||
(strncasecmp(PTR,"<IFRAME",7) == 0) ||
(strncasecmp(PTR,"<LINK",5) == 0) ||
(strncasecmp(PTR,"<BASE",5) == 0) ||
(strncasecmp(PTR,"<STYLE",6) == 0) ||
(strncasecmp(PTR,"<DIV",4) == 0) ||
(strncasecmp(PTR,"<P",2) == 0) ||
(strncasecmp(PTR,"<FONT",5) == 0) ||
(strncasecmp(PTR,"<APPLET",7) == 0) ||
(strncasecmp(PTR,"<META",5) == 0) ||
(strncasecmp(PTR,"<CENTER",7) == 0) ||
(strncasecmp(PTR,"<FORM",5) == 0) ||
(strncasecmp(PTR,"<ISINDEX",8) == 0) ||
(strncasecmp(PTR,"<H1",3) == 0) ||
(strncasecmp(PTR,"<H2",3) == 0) ||
(strncasecmp(PTR,"<H3",3) == 0) ||
(strncasecmp(PTR,"<H4",3) == 0) ||
(strncasecmp(PTR,"<H5",3) == 0) ||
(strncasecmp(PTR,"<H6",3) == 0) ||
(strncasecmp(PTR,"<B",2) == 0) ||
(strncasecmp(PTR,"<BR",3) == 0)
Table 2. Union of HTML signatures. PTR is a pointer to
the first non-whitespace byte of DATA.
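
A direct transcription of Table 2 into a single predicate might look like the sketch below. It groups the entries by how they are tested: case-sensitive prefixes at PTR, case-insensitive substrings anywhere in DATA, and case-insensitive prefixes at PTR. The function name matches_union_html and the whitespace set are assumptions. The C functions in the table stop at a NUL byte, whereas this sketch searches the whole byte string. The caller is assumed to pass only the sniffing buffer, whose length is browser-specific.

```python
# Assumption: the set of bytes skipped to find PTR is not given in the table.
WHITESPACE = b" \t\r\n"

# strncmp(PTR, ...): case-sensitive prefix at the first non-whitespace byte.
PTR_EXACT = (b"<!", b"<?")

# strcasestr(DATA, ...): case-insensitive substring anywhere in the buffer.
DATA_ANYWHERE = (
    b"<html", b"<script", b"<title", b"<body", b"<head", b"<plaintext",
    b"<table", b"<img", b"<pre", b"text/html", b"<a",
)

# strncasecmp(PTR, ...): case-insensitive prefix at the first non-whitespace byte.
PTR_CASELESS = (
    b"<frameset", b"<iframe", b"<link", b"<base", b"<style", b"<div", b"<p",
    b"<font", b"<applet", b"<meta", b"<center", b"<form", b"<isindex",
    b"<h1", b"<h2", b"<h3", b"<h4", b"<h5", b"<h6", b"<b", b"<br",
)


def matches_union_html(data: bytes) -> bool:
    """Return True if the sniffing buffer matches the union signature of Table 2."""
    ptr = data.lstrip(WHITESPACE)
    if ptr.startswith(PTR_EXACT):
        return True
    lowered = data.lower()
    if any(token in lowered for token in DATA_ANYWHERE):
        return True
    return ptr.lower().startswith(PTR_CASELESS)
```

Because the union is the most permissive combination of the four browsers' rules, a buffer that fails this predicate should not be treated as HTML by any of them. For example, GIF89a<html> matches through the substring rule for <HTML, while GIF89a<div> does not match, since <DIV is only tested at PTR.
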
Fast path. We find that, unlike other browsers, Internet
Explorer 7 varies the order in which it applies its
signatures according to the Content-Type header. If
the header is text/html, image/gif, image/jpeg,
image/pjpeg, image/png, image/x-png, or
application/pdf and the content matches the
signature for the indicated MIME type, then the algorithm
skips the remaining signatures. Otherwise, the algorithm
checks the signatures in the usual order.
Over time, Microsoft has added MIME types to this
fast path. For example, in April 2008, Microsoft added
application/pdf to the fast path to improve compatibility [28].
Microsoft classified this change as non-security-related [29],
but adding MIME types to the fast path makes
construction of chameleon documents more difficult. If the
chameleon matches the fast-path signature for its Content-Type
header, the browser will not treat the chameleon as HTML. However, if the site’s
upload filter is more permissive than the browser’s signature,
the attacker can craft an exploit as we show in Section 2.5.
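
The fast path, and the way a permissive upload filter can defeat it, can be modeled with a short sketch. Several parts are assumptions made for illustration: the HTML tag subset, the "usual order" (here HTML is checked before the image types), the fallback to the declared type, and the upload filter that accepts any file beginning with GIF8. The image signatures are the IE 7 entries of Table 1, and application/pdf is omitted because its signature is not listed in this section.

```python
SNIFF_BUFFER = 256  # IE 7 sniffing buffer size, as stated in the text


def is_gif(d: bytes) -> bool:
    return d[:5].upper() in (b"GIF87", b"GIF89")


def is_jpeg(d: bytes) -> bool:
    return d[:2] == b"\xff\xd8"


def is_png(d: bytes) -> bool:
    return d[:8] == b"\x89PNG\r\n\x1a\n"


def is_html(d: bytes) -> bool:
    # Permissive search; the two tags are a hypothetical subset.
    window = d[:SNIFF_BUFFER].lower()
    return any(tag in window for tag in (b"<html", b"<script"))


# Fast-path types from the text, minus application/pdf (signature not given here).
FAST_PATH = {
    "text/html": is_html,
    "image/gif": is_gif,
    "image/jpeg": is_jpeg,
    "image/pjpeg": is_jpeg,
    "image/png": is_png,
    "image/x-png": is_png,
}

# Assumption: a stand-in for the "usual order", with HTML checked first.
USUAL_ORDER = [
    ("text/html", is_html),
    ("image/gif", is_gif),
    ("image/jpeg", is_jpeg),
    ("image/png", is_png),
]


def sniff(content_type: str, data: bytes) -> str:
    """Model of the fast path: trust the header if its own signature matches."""
    declared = content_type.lower()
    fast_signature = FAST_PATH.get(declared)
    if fast_signature is not None and fast_signature(data):
        return declared  # remaining signatures are skipped
    for mime, signature in USUAL_ORDER:
        if signature(data):
            return mime
    return declared  # assumption: fall back to the declared type


def permissive_upload_filter(data: bytes) -> bool:
    """Hypothetical site filter: accepts anything that starts with GIF8."""
    return data[:4] == b"GIF8"
```

In this model, a file beginning GIF89a<html> served as image/gif stays image/gif because the fast path fires. A file beginning GIF88<html> passes the hypothetical filter but does not match the IE 7 GIF signature, so the fast path is skipped and the file is sniffed as text/html.
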