This document outlines Cross-Origin Read Blocking, an algorithm by which some dubious cross-origin resource loads may be identified and blocked by web browsers before they reach the web page. CORB offers a way to maintain same-origin protections on user data, even in the presence of side channel attacks.
The same-origin policy generally prevents one origin from reading arbitrary network resources from another origin. In practice, enforcing this policy is not as simple as blocking all cross-origin loads: exceptions must be established for web features, like <img> or <script>, which can target cross-origin resources for historical reasons, and for the CORS mechanism, which allows some resources to be selectively read across origins.
Certain types of content, however, can be shown to be incompatible with all of the historically-allowed permissive contexts. JSON is one such type: a JSON response will result in a decode error when targeted by the <img> tag, either a no-op or a syntax error when targeted by the <script> tag, and so on. The only case where a web page can load JSON with observable consequences is via fetch() or XMLHttpRequest; and in those cases, cross-origin reads are moderated by CORS.
If such resources can be detected and blocked earlier in the progress of the load -- say, before the response makes it to the image decoder or JavaScript parser -- the chances of information leakage via potential side channels can be reduced.
CORB can mitigate the following attack vectors:
- Cross-Site Script Inclusion (XSSI): pointing a <script> tag at a target resource which is not JavaScript, and observing some side effects when the resulting resource is interpreted as JavaScript. An early example of this attack was discovered in 2006: by overwriting the JavaScript Array constructor, the contents of JSON lists could be intercepted as simply as: <script src="https://example.com/secret.json">. While the Array constructor attack vector is fixed in current browsers, numerous similar exploits have been found and fixed in the subsequent decade. For example, see the slides here.
<script> element.
- Speculative Side Channel Attack (e.g. Spectre): an attacker may 1) use an <img src="https://example.com/secret.json"> element to pull a cross-site secret into the process where the attacker's JavaScript runs, and then 2) use a speculative side channel attack (e.g. Spectre) to read the secret.
X-Content-Type-Options: nosniff header.
When CORB decides that a response needs to be CORB-protected, the response is modified as follows:
CORB blocking needs to take place before the response reaches the process hosting the cross-origin initiator of the request. In other words, CORB blocking should prevent CORB-protected response data from ever being present in the memory of the process hosting a cross-origin website (even temporarily or for a short term). This is different from the concept of filtered responses (e.g. CORS filtered response or opaque filtered response) which just provide a limited view into full data that remains stored in an internal response and may be implemented inside the renderer process.
Responses to the following kinds of requests are not eligible for CORB protection:
<iframe>s or <object>s or is outside the scope of CORB (and depends on Site Isolation approach specific to each browser).
[lukasza@chromium.org] TODO: Figure out how Edge's VM-based isolation works (e.g. if some origins are off-limits in particular renderers, then this would greatly increase utility of CORB in Edge).
[lukasza@chromium.org] TODO: Figure out how other browsers approach Site Isolation (e.g. even if there is no active work, maybe there are some bugs we can reference here).
[lukasza@chromium.org] AFAIK, in Chrome a response to a download request never passes through memory of a renderer process. Not sure if this is true in other browsers.
All other kinds of requests may be CORB-eligible. This includes:
- ping, navigator.sendBeacon()
- <link rel="prefetch" ...>
- <img> tag, /favicon.ico, SVG's <image>, CSS' background-image, etc.
- <script>, importScripts(), navigator.serviceWorker.register(), audioWorklet.addModule(), etc.
The CORB algorithm will declare a response as either CORB-protected (and eligible to be blocked) or CORB-exempt (and always allowed through). It will be shown that the following types of content can be CORB-protected:
<iframe> embedding and )
[lukasza@chromium.org] TODO: Rewrite the remainder of this section:
- Change document-vs-resource narrative to CORB-protected VS CORB-allowed (or non-CORB-eligible).
- Plain text = sniffing for HTML or XML or JSON
- Exclude PDF/ZIP/other from CORB and cover how web developers can protect PDF/ZIP/other resources even though they are not CORB-protected
- Cover how images/videos are not protected (mention possibility of an opt-in via header)
CORB decides whether a response is a document or a resource primarily based on the Content-Type header.
CORB considers responses with the following Content-Type headers to be resources (these are the content types that can be used in <img>, <audio>, <video>, <script> and other similar elements which may be embedded cross-origin):
- application/javascript or text/jscript
- text/css
- image/*
- audio/*, video/* or application/ogg
- font/* or one of legacy font types
[lukasza@chromium.org] Some images (and other content types) may contain sensitive data that shouldn't be shared with other origins. To avoid breaking existing websites, images have to be treated by default as cross-origin resources, but maybe we should consider letting websites opt into additional protection. For example, a server might somehow indicate to treat its images as origin-bound documents protected by CORB (e.g. with a new kind of HTTP response header that we might want to consider).
CORB considers HTML, XML and JSON to be documents - this covers responses with one of the following Content-Type headers:
image/svg+xml which is a resource)
Responses marked as multipart/* are conservatively considered resources. This avoids having to parse the content types of the nested parts. We recommend not supporting multipart range requests for sensitive documents.
Responses with a MIME type not explicitly named above (e.g. application/pdf or application/zip) are considered to be documents. Similarly, responses that don't contain a Content-Type header are also considered documents. This helps meet the goal of protecting as many documents as possible.
[lukasza@chromium.org] Maybe this is too aggressive for the initial launch of CORB? See also https://crbug.com/802836. OTOH, it seems that in the long-term this is the right approach (e.g. defining a short list of types and type patterns that don't need protection, rather than trying to define a long and open-ended list of types that need protection today or would need protection in the future).
CORB considers text/plain to be a document. TODO: application/octet-stream.
[lukasza@chromium.org] This seems like a one-off in the current implementation. Maybe text/plain should just be treated as "a MIME type not explicitly named above".
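The Content-Type rules above can be sketched as a small classifier. This is only an illustration of the rules as this draft section states them, not the browser implementation: the function name `classify` and the constant names are hypothetical, the unnamed "legacy font types" are left out because the text does not list them, and sniffing is ignored entirely.

```python
from typing import Optional

# Content types the text above lists as resources (embeddable cross-origin).
# The unnamed "legacy font types" are not enumerated in the text, so they
# are omitted from this sketch.
RESOURCE_EXACT = {
    "application/javascript",
    "text/jscript",
    "text/css",
    "application/ogg",
    "image/svg+xml",  # XML-based, but called out above as a resource
}
RESOURCE_PREFIXES = ("image/", "audio/", "video/", "font/", "multipart/")


def classify(content_type: Optional[str]) -> str:
    """Return "resource" or "document" for a raw Content-Type header value."""
    if not content_type or not content_type.strip():
        return "document"  # a missing header is treated as a document
    # Drop parameters such as "; charset=utf-8" and normalize case.
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime in RESOURCE_EXACT or mime.startswith(RESOURCE_PREFIXES):
        return "resource"
    # HTML, XML, JSON, text/plain and any type not named above.
    return "document"
```

Defaulting to "document" mirrors the stated goal of protecting as many documents as possible: only an explicit, short list of types is exempt.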
Additionally CORB may classify as a document any response that 1) has Content-Type set to a non-empty value different from text/css and 2) starts with a JSON parser breaker (e.g. )]}' or {}&& or for(;;);) - regardless of the presence of X-Content-Type-Options: nosniff header. A JSON parser breaker is highly unlikely to be present in a resource (and therefore highly unlikely to lead to misclassification of a resource as a document) - for example:
application/javascript.
FF D8 FF for image/jpeg).
text/css (and missing and/or empty Content-Type) is an exception, because these Content-Types are allowed for stylesheets and it is possible to construct a file that begins with a JSON parser breaker, but at the same time parses fine as a stylesheet - for example: )]}' {} h1 { color: red; }
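The parser-breaker rule above can be sketched as follows. The function name `starts_with_parser_breaker` is hypothetical, only the three breakers named in the text are checked, and the sketch does a strict prefix match; whether a real implementation tolerates leading whitespace is not stated here.

```python
from typing import Optional

# The JSON parser breakers named in the text above.
JSON_PARSER_BREAKERS = (b")]}'", b"{}&&", b"for(;;);")


def starts_with_parser_breaker(content_type: Optional[str], body: bytes) -> bool:
    """True if the response may be classified as a document by this rule."""
    # Condition 1: Content-Type must be non-empty and different from text/css.
    if not content_type or not content_type.strip():
        return False
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime == "text/css":
        return False
    # Condition 2: the body starts with one of the known breakers.
    return body.startswith(JSON_PARSER_BREAKERS)
```

Note that the nosniff header plays no part in this check, matching the "regardless of the presence of X-Content-Type-Options: nosniff" wording above.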
CORB can't always rely solely on the MIME type of the HTTP response to distinguish documents from resources, since the MIME type on network responses is sometimes wrong. For example, some HTTP servers return a JPEG image with a Content-Type header incorrectly saying text/html.
To avoid breaking existing websites, CORB may attempt to confirm if the response body really matches the CORB-protected Content-Type response header:
CORB will only sniff to confirm the classification based on the Content-Type header (i.e. if the Content-Type header is text/json then CORB will sniff for JSON and will not sniff for HTML and/or XML).
If some syntax elements are shared between CORB-protected and non-CORB-protected MIME types, then these elements have to be ignored by CORB sniffing. For example, when sniffing for (CORB-protected) HTML, CORB ignores and skips HTML comments, because they can also be present in (non-CORB-protected) JavaScript. This is different from the HTML sniffing rules, used in other contexts.
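The comment-skipping behaviour described above might look roughly like the sketch below. The signature list `HTML_SIGNATURES` is a made-up, deliberately short stand-in (the text does not enumerate the patterns CORB actually looks for), and `sniffs_as_html` is a hypothetical name.

```python
# Hypothetical signature list; the real set of patterns is not given above.
HTML_SIGNATURES = (b"<!doctype html", b"<html", b"<head", b"<body")


def sniffs_as_html(body: bytes) -> bool:
    """Confirm an HTML label, ignoring leading whitespace and HTML comments."""
    data = body.lower()
    pos = 0
    while True:
        # Skip leading ASCII whitespace.
        while pos < len(data) and data[pos:pos + 1].isspace():
            pos += 1
        # Comments are shared with JavaScript, so they prove nothing: skip them.
        if data.startswith(b"<!--", pos):
            end = data.find(b"-->", pos + 4)
            if end == -1:
                return False  # unterminated comment: nothing confirmed
            pos = end + 3
            continue
        return data.startswith(HTML_SIGNATURES, pos)
```

A body that is only a comment followed by script text is therefore not confirmed as HTML, while a comment followed by real markup is.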
[lukasza@chromium.org] It is not practical to try teaching CORB about sniffing all possible types of documents like application/pdf, application/zip, etc.
[lukasza@chromium.org] Some MIME types are inherently not sniffable (for example application/octet-stream).
[lukasza@chromium.org] TODO: Explain how text/plain sniffs for any of HTML, XML or JSON. Also discuss whether text/plain+nosniff should be different from text/html+nosniff (the latter should be CORB-protected, not sure about the former, given the still not understood media blocking that happens with CORB).
CORB should trust the Content-Type header in scenarios where sniffing shouldn't or cannot be done:
- When X-Content-Type-Options: nosniff header is present.
- When the response is a partial, 206 response.
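The two conditions above reduce to a short predicate. In this sketch `must_trust_content_type` is a hypothetical name, and headers are assumed to arrive as a plain name-to-value mapping.

```python
from typing import Mapping


def must_trust_content_type(status: int, headers: Mapping[str, str]) -> bool:
    """True when sniffing is skipped and the Content-Type header is final."""
    # Header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    nosniff = (
        normalized.get("x-content-type-options", "").strip().lower() == "nosniff"
    )
    # A 206 response carries only part of the body, so it cannot be sniffed.
    return nosniff or status == 206
```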
[lukasza@chromium.org] An alternative behavior would be to allow (instead of blocking) 206 responses that would be sniffable otherwise (so one of HTML, XML or JSON + not accompanied by a nosniff header). This alternative behavior would decrease the risk of blocking mislabeled resources, but would increase the risk of not blocking documents that need protection (an attacker could just need to issue a range request - protection in this case would depend on whether 1) the response includes a nosniff header and/or 2) the server rejects range requests altogether). Note that the alternative behavior doesn't help with mislabeled text/plain responses (see also https://crbug.com/801709).
[lukasza@chromium.org] We believe that mislabeling as HTML, JSON or XML is most common. TODO: are we able to back this up with some numbers?
[lukasza@chromium.org] Note that range requests are typically not issued when making requests for scripts and/or stylesheets.
Sniffing is a best-effort heuristic and for best security results, we recommend 1) marking responses with the correct Content-Type header and 2) opting out of sniffing by using the X-Content-Type-Options: nosniff header.
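On the server side, following both recommendations amounts to sending two response headers. The sketch below uses Python's standard http.server module; the handler name `SecretHandler`, the helper `protected_headers`, the port and the JSON payload are all placeholders chosen for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def protected_headers(content_type: str) -> dict:
    """Headers for a response that should be labeled correctly and not sniffed."""
    return {
        "Content-Type": content_type,
        "X-Content-Type-Options": "nosniff",
    }


class SecretHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        # Placeholder payload standing in for sensitive JSON data.
        body = json.dumps({"secret": "hypothetical"}).encode("utf-8")
        self.send_response(200)
        for name, value in protected_headers("application/json").items():
            self.send_header(name, value)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Run the demo server until interrupted (not called automatically)."""
    HTTPServer(("127.0.0.1", port), SecretHandler).serve_forever()
```

Calling `serve()` starts the demo server locally; every response it sends carries an accurate Content-Type together with the nosniff opt-out.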
CORB should have no observable impact on <img> tags unless the image resource is both 1) mislabeled with an incorrect, non-image, CORB-protected Content-Type and 2) served with the X-Content-Type-Options: nosniff response header.
Examples:
Correctly-labeled HTML document
<img> tag:
Content-Type: text/html
X-Content-Type-Options header
fetch/corb/img-html-correctly-labeled.sub.html
Mislabeled image (with sniffing)
<img> tag:
Content-Type: text/html
X-Content-Type-Options header
fetch/corb/img-png-mislabeled-as-html.sub.html
Mislabeled image (nosniff)
<img> tag:
Content-Type: text/html
X-Content-Type-Options: nosniff
nosniff header, CORB will have to rely on the Content-Type header. Because this response is mislabeled (the body is an image, but the Content-Type header says that it is an HTML document), CORB will incorrectly classify the response as requiring CORB-protection.
fetch/corb/img-png-mislabeled-as-html-nosniff.tentative.sub.html
In addition to the HTML <img> tag, the examples above should apply to other web features that consume images: /favicon.ico, SVG's <image>, background-image in stylesheets, painting images onto (potentially tainted) HTML's <canvas>, etc.
[lukasza@chromium.org] TODO: Figure out if we can measure and share how many of CORB-blocked responses are 1) for <img> tag, 2) nosniff, 3) 200-or-206 status code, 4) non-zero (or non-available) Content-Length. Cursory manual testing on major websites indicates that most CORB-blocked images are tracking pixels and therefore blocking them won't have any observable effect.
[lukasza@chromium.org] Earlier attempts to block nosniff images with incompatible MIME types failed. We think that CORB will succeed, because
- it will only block a subset of CORB-protected MIME types (e.g. it won't block application/octet-stream quoted in a Firefox bug)
- CORB is an important response to the recent announcement of new side-channel attacks like Spectre.
TODO.
CORB should have no observable impact on <script> tags except for cases where a CORB-protected, non-JavaScript resource labeled with its correct MIME type is loaded as a script - in these cases the resource will usually result in a syntax error, but a CORB-protected response's empty body will result in no error.
Examples:
<script> tag:
Content-Type: text/html
X-Content-Type-Options header
fetch/corb/script-html-correctly-labeled.tentative.sub.html
[lukasza@chromium.org] In theory, using a non-empty response in CORB-blocked responses might reintroduce the lost syntax error. We didn't go down that path, because
- using a non-empty response would be inconsistent with other parts of the Fetch spec (like opaque filtered response).
- retaining the presence of the syntax error might require changing the contents of a CORB-blocked response body depending on whether the original response body would have caused a syntax error or not. This would add extra complexity that seems undesirable both for CORB implementors and for web developers.
Mislabeled script (with sniffing)
<script> tag:
Content-Type: text/html
X-Content-Type-Options header
fetch/corb/script-js-mislabeled-as-html.sub.html
Mislabeled script (nosniff)
<script> tag:
Content-Type: text/html
X-Content-Type-Options: nosniff
nosniff response header will cause the response to be blocked when its MIME type (text/html in the example) is not a JavaScript MIME type (this behavior is required by the Fetch spec).
fetch/corb/script-js-mislabeled-as-html-nosniff.sub.html
In addition to the HTML <script> tag, the examples above should apply to other web features that consume JavaScript including script-like destinations like importScripts(), navigator.serviceWorker.register(), audioWorklet.addModule(), etc.
TODO.
TODO: Correctly-labeled HTML document
<link rel="stylesheet" href="..."> tag: TODO.
TODO: Mislabeled stylesheet (with sniffing)
TODO: Mislabeled stylesheet (nosniff)
nosniff response header will cause the response to be blocked when its MIME type (text/html in the example) is not text/css (this behavior is required by the Fetch spec).
TODO: Correctly-labeled stylesheet with JSON parser breaker
TODO: Polyglot HTML/CSS labeled as text/html.
<!DOCTYPE html> <style> h2 {} h1 { color: blue; } </style>
CORB has no impact on the following scenarios:
Tracking and reporting
img element to an HTTP URI that usually replies either with a 204 or with a short HTML document. In addition to the img tag, websites may use style, script and other tags to track usage.
Service workers
Content scripts and plugins
[lukasza@chromium.org] TODO: Do we need to be more explicit about handling of requests initiated by plugins?
- Should CORB attempt to intercept and parse crossdomain.xml which tells Adobe Flash whether a particular cross-origin request is okay or not (similarly to how CORB needs to understand CORS response headers)?
- If CORB doesn't have knowledge about crossdomain.xml, then it will be forced to allow all responses to Flash-initiated requests. We should clarify why CORB still provides security benefits in this scenario.
- Also - not sure if plugin behavior is in-scope of https://fetch.spec.whatwg.org?
To test CORB, one can turn on the feature in Chrome M63+ by launching it with the --enable-features=CrossSiteDocumentBlockingAlways command-line flag.
CORB demo page: https://anforowicz.github.io/xsdb-demo/index.html
filesystem: and blob: URIs if their nested origin has an HTTP and/or HTTPS scheme).
[lukasza@chromium.org] Should the filesystem/blob part be somehow woven into one of explainer sections above? WPT tests?
Initiator origins
file: origins.
Interoperability with other origin-related policies
document-vs-resource classification
- application/javascript
- text/css
- image/*
- audio/*, video/* or application/ogg
- font/* or one of legacy font types
X-Content-Type-Options: nosniff header is present:
image/svg+xml which is a resource)
text/json, text/json+*, text/x-json, text/x-json+*, application/json, application/json+* or *+json
)]}' or {}&& or for(;;);), regardless of its Content-Type and regardless of the presence of X-Content-Type-Options: nosniff header.
[lukasza@chromium.org] Should the JSON parser breaker sniffing be somehow woven into one of explainer sections above?
X-Content-Type-Options: nosniff is present.
text/html then CORB SHOULD allow the response if it doesn't sniff as text/html.
text/xml.
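The JSON MIME type patterns listed above (text/json, text/json+*, text/x-json, text/x-json+*, application/json, application/json+* and *+json) can be matched with a few string checks. This is a sketch only: `is_json_mime` is a hypothetical name, and the handling of parameters such as charset is an assumption.

```python
# Base types from the pattern list above; each also matches with a "+..." suffix.
JSON_BASE_TYPES = ("text/json", "text/x-json", "application/json")


def is_json_mime(content_type: str) -> bool:
    """True if the Content-Type matches one of the JSON patterns."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime.endswith("+json"):  # the "*+json" pattern
        return True
    return any(
        mime == base or mime.startswith(base + "+") for base in JSON_BASE_TYPES
    )
```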