| # Post-Spectre Threat Model Re-Think |
| |
| Contributors: awhalley, creis, dcheng, jschuh, jyasskin, lukasza, mkwst, nasko, |
| palmer, tsepez. Patches and corrections welcome! |
| |
| Last Updated: 27 April 2021 |
| |
| [TOC] |
| |
| ## Introduction |
| |
| In light of [Spectre/Meltdown](https://spectreattack.com/), we needed to |
| re-think our threat model and defenses for Chrome renderer processes. Spectre is |
| a new class of hardware side-channel attack that affects (among many other |
| targets) web browsers. This document describes the impact of these side-channel |
| attacks and our approach to mitigating them. |
| |
| > The upshot of the latest developments is that the folks working on this from |
| > the V8 side are increasingly convinced that there is no viable alternative to |
| > Site Isolation as a systematic mitigation to SSCAs [speculative side-channel |
| > attacks]. In this new mental model, we have to assume that user code can |
| > reliably gain access to all data within a renderer process through |
| > speculation. This means that we definitely need some sort of ‘privileged/PII |
| > data isolation’ guarantees as well, for example ensuring that password and |
| > credit card info are not speculatively loaded into a renderer process without |
| > user consent. — Daniel Clifford, in private email |
| |
| In fact, any software that both (a) runs (native or interpreted) code from more |
| than one source; and (b) attempts to create a security boundary inside a single |
| address space, is potentially affected. For example, software that processes |
| document formats with scripting capabilities, and which loads multiple documents |
| from different sources into the same process, may need to take defensive |
| measures similar to those described here. |
| |
| ### Problem Statement |
| |
| #### Active Web Content: Renderer Processes |
| |
| We must assume that *active web content* (JavaScript, WebAssembly, Native |
| Client, Flash, PDFium, …) will be able to read any and all data in the address |
| space of the process that hosts it. Multiple independent parties have developed |
| proof-of-concept exploits that illustrate the effectiveness and reliability of |
| Spectre-style attacks. The loss of cross-origin confidentiality inside a single |
| process is thus not merely theoretical. |
| |
| The implications of this are far-reaching: |
| |
| * An attacker that can exploit Spectre can bypass certain native code exploit |
| mitigations, even without an infoleak bug in software. |
| * ASLR |
| * Stack canaries |
| * Heap metadata canaries |
| * Potentially certain forms of control-flow integrity |
| * We must consider any data that gets into a renderer process to have no |
| confidentiality from any origins running in that process, regardless of the |
| same-origin policy. |
| |
| Additionally, attackers may develop ways to read memory from other userland |
| processes (e.g. a renderer reading the browser’s memory). We do not include |
| those attacks in our threat model. The hardware, microcode, and OS must |
| re-establish the process boundary and the userland/kernel boundary. If the |
| underlying platform does not enforce those boundaries, there’s nothing an |
| application (like a web browser) can do. |
| |
| #### GPU Process |
| |
| Chrome’s GPU process handles data from all origins in a single process. It is |
| not currently practical to isolate different sites or origins into their own GPU |
| processes. (At a minimum, there are time and space efficiency concerns; we are |
| still trying to get Site Isolation shipped and are actively resolving issues |
| there.) |
| |
| However, WebGL exposed high-resolution clocks that are useful for exploiting |
| Spectre. It was possible to temporarily remove some of them, and to coarsen |
| another, with minimal breakage of web compatibility, and so [that has been |
| done](https://bugs.chromium.org/p/chromium/issues/detail?id=808744). We do |
| expect to reinstate the clocks on platforms where Site Isolation is on by |
| default. (See [Attenuating Clocks, below](#attenuating-clocks).) |
| |
| We do not currently believe that, short of full code execution, an attacker can |
| control speculative execution inside the GPU process to the extent necessary to |
| exploit Spectre-like vulnerabilities. [As always, evidence to the contrary is |
| welcome!](https://www.google.com/about/appsecurity/chrome-rewards/index.html) |
| |
| #### Nastier Threat Models |
| |
| It is generally safest to assume that an arbitrary read-write (RW) primitive in |
| the renderer process will be available to the attacker. The richness of the |
| attack/API surface available in a rendering engine makes this plausible. |
| However, this capability is not a freebie the way Spectre is: the attacker must |
| actually find one or more bugs that enable the RW primitive. |
| |
| Site Isolation (SI) gets us closer to a place where origins face in-process |
| attacks only from other origins in their `SiteInstance`, and not from any |
| arbitrary origin. (Origins that include script from hostile origins will still |
| be vulnerable, of course.) However, [there may be hostile origins in the same |
| process](#multiple-origins-within-a-siteinstance). |
| |
| Strict origin isolation is not yet being worked on; we must first ship SI on by |
| default. It is an open question whether strict origin isolation will turn out to |
| be feasible. |
| |
| ## Defensive Approaches |
| |
| These are presented in no particular order, with the exception that Site |
| Isolation is currently the best and most direct solution. |
| |
| ### Site Isolation |
| |
| The first-order solution is to simply get cross-origin data out of the Spectre |
| attacker’s address space. [Site |
| Isolation](https://www.chromium.org/Home/chromium-security/site-isolation) (SI) |
| more closely aligns the web security model (the same-origin policy) with the |
| underlying platform’s security model (separate address spaces and privilege |
| reduction). |
| |
| SI still has some bugs that need to be ironed out before we can turn it on by |
| default, both on Desktop and on Android. As of May 2018, we believe we can turn |
| it on by default on Desktop (but not yet on Android) in M67 or M68. |
| |
| On iOS, where Chrome is a WKWebView embedder, we must rely on [the mitigations |
| that Apple is |
| developing](https://webkit.org/blog/8048/what-spectre-and-meltdown-mean-for-webkit/). |
| |
| All major browsers are working on some form of site isolation, and [we are |
| collaborating publicly on a way for sites to opt in to |
| isolation](https://groups.google.com/a/chromium.org/forum/#!forum/isolation-policy), |
| to potentially make implementing and deploying site isolation easier. (Chrome |
| Desktop’s Site Isolation will be on by default, regardless, in the M67 – M68 |
| timeframe.) |
| |
| #### Limitations |
| |
| ##### Incompleteness of CORB |
| |
| Site Isolation depends on [cross-origin read |
| blocking](https://chromium.googlesource.com/chromium/src/+/main/content/browser/loader/cross_origin_read_blocking_explainer.md) |
| (CORB; formerly known as cross-site document blocking or XSDB) to prevent a |
| malicious website from pulling in sensitive cross-origin data. Otherwise, an |
| attacker could use markup like `<img src="http://example.com/secret.json">` to |
| get cross-origin data within reach of Spectre or other OOB-read exploits. |
| |
| As of M65, CORB protects: |
| |
| * HTML, JSON, and XML responses. |
| Protection requires the resource to be served with the correct |
| `Content-Type` header. [We recommend using `X-Content-Type-Options: |
| nosniff`](https://www.chromium.org/Home/chromium-security/ssca). |
| * `text/plain` responses that sniff as HTML, XML, or JSON. |
| |
| Today, CORB doesn’t protect: |
| |
| * Responses without a `Content-Type` header. |
| * Particular content types: |
| * `image/*` |
| * `video/*` |
| * `audio/*` |
| * `text/css` |
| * `font/*` |
| * `application/javascript` |
| * PDFs, ZIPs, and other unrecognized MIME types |
| * Responses to requests initiated from the Flash plugin. |
| |
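The two lists above can be read as a rough decision rule. The following Python sketch is only a simplified model of those lists, not Chromium's actual CORB implementation, which performs additional checks that are not modeled here. The function name `corb_would_protect`, the `sniffs_as_document` flag, and the concrete MIME strings chosen to stand for "HTML, JSON, and XML" are all assumptions made for illustration.

```python
# Simplified model of the CORB coverage lists above (as of M65).
# Illustrative sketch only; this is NOT Chromium's algorithm.

# Assumption: a non-exhaustive mapping of "HTML, JSON, and XML" to MIME types.
ASSUMED_PROTECTED_TYPES = {
    "text/html",
    "application/json",
    "text/xml",
    "application/xml",
}


def corb_would_protect(content_type, sniffs_as_document=False):
    """Return True if, per the lists above, CORB is expected to cover a response.

    content_type: value of the Content-Type header, or None if it is absent.
    sniffs_as_document: hypothetical flag meaning the body sniffs as HTML,
        XML, or JSON (only consulted for text/plain responses).
    """
    if content_type is None:
        # Responses without a Content-Type header are not protected.
        return False
    # Drop parameters such as "; charset=utf-8" and normalize case.
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime in ASSUMED_PROTECTED_TYPES:
        return True
    if mime == "text/plain":
        return sniffs_as_document
    # image/*, video/*, audio/*, text/css, font/*, application/javascript,
    # PDFs, ZIPs, and unrecognized types all fall through to "not protected".
    return False
```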
| Site operators should read and follow, where applicable, [our guidance for |
| maximizing CORB and other defensive |
| features](https://developers.google.com/web/updates/2018/02/meltdown-spectre). |
| (There is [an open bug to add a CORB evaluator to |
| Lighthouse](https://bugs.chromium.org/p/chromium/issues/detail?id=806070).) |
| |
| ##### Multiple Origins Within A `SiteInstance` {#multiple-origins-within-a-siteinstance} |
| |
| A *site* is defined as the effective TLD + 1 DNS label (“eTLD+1”) and the URL |
| scheme. This is a broader category than the origin, which is the scheme, entire |
| hostname, and port number. All of these origins belong to the same site: |
| |
| * https, www.example.com, 443 |
| * https, www.example.com, 8443 |
| * https, goaty-desktop.internal.example.com, 443 |
| * https, compromised-and-hostile.unmaintained.example.com, 8443 |
| |
| Therefore, even once we have shipped SI on all platforms and have shaken out all |
| the bugs, renderers will still not be perfect compartments for origins. So we |
| will still need to take a multi-faceted approach to UXSS, memory corruption, and |
| OOB-read attacks like Spectre. |
| |
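To make the site/origin distinction concrete, here is a minimal Python sketch that computes both for the four example origins listed above. It is not how Chrome computes a site: real implementations consult the Public Suffix List, whereas this sketch uses a tiny hard-coded suffix set as a stand-in. The names `origin_of`, `site_of`, and `ASSUMED_PUBLIC_SUFFIXES` are hypothetical.

```python
from urllib.parse import urlsplit

# Assumption: a tiny stand-in for the Public Suffix List.
ASSUMED_PUBLIC_SUFFIXES = {"com", "org", "co.uk"}
DEFAULT_PORTS = {"https": 443, "http": 80}


def origin_of(url):
    """Return the origin as (scheme, full hostname, port)."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return (parts.scheme, parts.hostname, port)


def site_of(url):
    """Return the site as (scheme, eTLD+1)."""
    parts = urlsplit(url)
    labels = parts.hostname.split(".")
    # Try the longest candidate suffix first, then progressively shorter ones.
    for i in range(1, len(labels)):
        if ".".join(labels[i:]) in ASSUMED_PUBLIC_SUFFIXES:
            # Keep exactly one label in front of the matched suffix.
            return (parts.scheme, ".".join(labels[i - 1:]))
    raise ValueError("no known public suffix in " + parts.hostname)


# The four example origins from the list above, written as URLs.
EXAMPLE_URLS = [
    "https://www.example.com/",
    "https://www.example.com:8443/",
    "https://goaty-desktop.internal.example.com/",
    "https://compromised-and-hostile.unmaintained.example.com:8443/",
]

distinct_origins = {origin_of(u) for u in EXAMPLE_URLS}
distinct_sites = {site_of(u) for u in EXAMPLE_URLS}
```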
| Note that we are looking into the possibility of disabling assignments to |
| `document.domain` (via [origin-wide](https://wicg.github.io/origin-policy) |
| application of [Feature Policy](https://wicg.github.io/feature-policy/) or the |
| like). This would open the possibility that we could isolate at the origin |
| level. |
| |
| ##### Memory Cost |
| |
| With SI, Chrome tends to spawn more renderer processes, which tends to lead to |
| greater overall memory usage (conservative estimates seem to be about 10%). On |
| many Android devices, it is more than 10%, and this additional cost can be |
| prohibitive. However, each renderer is smaller and shorter-lived under Site |
| Isolation. |
| |
| ##### Plug-Ins |
| |
| ###### PDFium |
| |
| Chrome uses different PPAPI processes per origin, for secure origins. (We |
| tracked this as [Issue |
| 809614](https://bugs.chromium.org/p/chromium/issues/detail?id=809614).) |
| |
| ###### Flash |
| |
| Click To Play greatly reduces the risk that Flash-borne Spectre (and other) |
| exploits will be effective at scale. |
| Even so, [we might want to consider teaching CORB about the Flash flavour of |
| CORS](https://crbug.com/816318). |
| |
| ##### Android `WebView` |
| |
| Android `WebView`s run in their own process as of Android O, so the hosting |
| application gets protection from malicious web content. However, all origins are |
| run in the same `WebView` process. |
| |
| ### Ensure User Intent When Sending Data To A Renderer |
| |
| Before copying sensitive data into a renderer process, we should somehow get the |
| person’s affirmative knowledge and consent. This has implications for all types |
| of form auto-filling: normal form data, passwords, payment instruments, and any |
| others. It seems like we are [currently in a pretty good place on that |
| front](https://bugs.chromium.org/p/chromium/issues/detail?id=802993), with one |
| exception: usernames and passwords get auto-filled into the shadow DOM, and then |
| revealed to the real DOM on a (potentially forged?) user gesture. These |
| credentials are origin-bound, however. |
| |
| The [Credential Management |
| API](https://developer.mozilla.org/en-US/docs/Web/API/Credential_Management_API) |
| still poses a risk, exposing usernames/passwords without a gesture for the |
| subset of users who've accepted the auto-sign-in mechanism. |
| |
| What should count as a secure gesture is a gesture on relevant, well-labeled |
| browser chrome, handled in the browser process. Tracking the gesture in the |
| renderer, where it can be forged by web content that compromises the renderer, |
| does not suffice. |
| |
| #### Challenge |
| |
| We must enable a good user experience with autofill, payments, and passwords, |
| while also not ending up with a browser that leaks these super-important classes |
| of data. (A good password management experience is itself a key security goal, |
| after all.) |
| |
| ### Reducing Or Eliminating Speculation Gadgets |
| |
| Exploiting Spectre requires that the attacker can find (in V8, Blink, or Blink |
| bindings), generate, or cause to be generated code ‘gadgets’ that will read out |
| of bounds when speculatively executed. By exerting more control over how we |
| generate machine code from JavaScript, and over where we place objects in memory |
| relative to each other, we can reduce the prevalence and utility of these |
| gadgets. The V8 team has been [landing such code generation |
| changes](https://bugs.chromium.org/p/chromium/issues/detail?id=798964) |
| continually since January 2018. |
| |
| Of the known attacks, we believe it’s currently only feasible to try to mitigate |
| variant 1 with code changes in C++. We will need toolchain and/or platform |
| support to mitigate other types of speculation attacks. We could experiment with |
| inserting `LFENCE` instructions or using |
| [Retpoline](https://support.google.com/faqs/answer/7625886) before calling into |
| Blink. |
| |
| PDFium uses V8 for its JavaScript support. To the extent that we rely on V8 |
| mitigations for Spectre defense, we need to be sure that PDFium uses the latest |
| V8, so that it gets the latest mitigations. In shipping Chrome/ium products, |
| PDFium uses the V8 that is in Chrome/ium. |
| |
| #### Limitations |
| |
| We don’t consider this approach to be a true solution; it’s only a mitigation. |
| We think we can eliminate many of the most obvious gadgets and can buy some time |
| for better defense mechanisms to be developed and deployed (primarily, Site |
| Isolation). |
| |
| It is very likely impossible to eliminate all gadgets. As with [return-oriented |
| programming](https://en.wikipedia.org/wiki/Return-oriented_programming), a large |
| body of object code (like a Chrome renderer) is likely to contain so many |
| gadgets that the attacker has a good chance of crafting a working exploit. At |
| some point, we may decide that we can’t stay ahead of attack research, and will |
| stop trying to eliminate gadgets. |
| |
| Additionally, the mitigations typically come with a performance cost, and we may |
| ultimately roll some or all of them back. Some potential mitigations are so |
| expensive that it is impractical to deploy them. |
| |
| ### Attenuating Clocks {#attenuating-clocks} |
| |
| Exploiting Spectre requires a clock. We don’t believe it’s possible to |
| eliminate, coarsen, or jitter all explicit and implicit clocks in the Open Web |
| Platform (OWP) in a way that is sufficient to fully resolve Spectre. ([Merely |
| enumerating all the |
| clocks](https://bugs.chromium.org/p/chromium/issues/detail?id=798795) is |
| difficult.) Surprisingly coarse clocks are still useful for exploitation. |
| |
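One way to see why coarsening alone buys little: an operation far shorter than the clock's resolution can still be timed by repeating it many times and dividing. The toy simulation below uses made-up numbers and a simulated clock; it does not model any real browser clock and is not an exploit. The names `coarsen` and `estimate_op_ns`, and all constants, are hypothetical.

```python
def coarsen(t_ns, resolution_ns):
    """Round a simulated timestamp down to a multiple of the clock resolution."""
    return (t_ns // resolution_ns) * resolution_ns


def estimate_op_ns(op_ns, resolution_ns, repetitions):
    """Estimate the duration of one operation using only a coarsened clock.

    The operation is (in simulation) repeated back to back, and the coarse
    elapsed time is divided by the repetition count.
    """
    start = coarsen(0, resolution_ns)
    end = coarsen(op_ns * repetitions, resolution_ns)
    return (end - start) / repetitions


# Assumed values for illustration only.
RESOLUTION_NS = 100_000   # a clock coarsened to 100 microseconds
FAST_NS, SLOW_NS = 50, 70  # two operations that differ by only 20 ns

# A single measurement cannot tell the two operations apart...
single_fast = estimate_op_ns(FAST_NS, RESOLUTION_NS, 1)
single_slow = estimate_op_ns(SLOW_NS, RESOLUTION_NS, 1)

# ...but amplifying by repetition recovers the difference.
amplified_fast = estimate_op_ns(FAST_NS, RESOLUTION_NS, 1_000_000)
amplified_slow = estimate_op_ns(SLOW_NS, RESOLUTION_NS, 1_000_000)
```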
| While it sometimes makes sense to deprecate, remove, coarsen, or jitter clocks, |
| we don’t expect that we can get much long-term defensive value from doing so, |
| for several reasons: |
| |
| * There are [many explicit and implicit clocks in the |
| platform](https://gruss.cc/files/fantastictimers.pdf) |
| * It is not always possible to coarsen or jitter them enough to slow or stop |
| exploitation… |
| * …while also maintaining web platform compatibility and utility |
| |
| In particular, [clock jitter is of extremely limited |
| utility](https://rdist.root.org/2009/05/28/timing-attack-in-google-keyczar-library/#comment-5485) |
| when defending against side channel attacks. |
| |
| Many useful and legitimate web applications need access to high-precision |
| clocks, and we want the OWP to be able to support them. |
| |
| ### Gating Access To APIs That Enable Exploitation |
| |
| We want to support a powerful web, though we recognize that some kinds of APIs |
| a powerful web requires are more likely than others to facilitate exploitation, |
| either because they can be used as very high-resolution timers |
| (`SharedArrayBuffer`), or because they provide powerful introspection |
| capabilities (`performance.measureMemory`). We can mitigate the risks these APIs |
| pose by exposing them only in contexts that have opted into a set of |
| restrictions that limits access to cross-origin data. |
| |
| In particular, [`Cross-Origin-Opener-Policy` (COOP) and |
| `Cross-Origin-Embedder-Policy` (COEP)](https://docs.google.com/document/d/1zDlfvfTJ_9e8Jdc8ehuV4zMEu9ySMCiTGMS9y0GU92k/edit) |
| seem promising. Together, these mechanisms change web-facing behavior to enable |
| origin-level process isolation, and ensure that cross-origin subresources will |
| flow into a given process only if they opt into that usage. These properties |
| give us a higher degree of confidence that otherwise dangerous APIs can be |
| exposed safely, as any attack they enable would only gain access to same-origin |
| data, or data that explicitly asserted that it accepted the risk of exposure. |
| |
| Both COOP and COEP are enabled as of M83, and we [plan to require both before |
| enabling APIs like `SharedArrayBuffer`](https://groups.google.com/a/chromium.org/d/msg/blink-dev/_0MEXs6TJhg/F0UduPfpAQAJ). |
| Other browsers seem likely to do the same. |
| |
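As a rough illustration of what opting in looks like from a site's perspective, the sketch below checks a response's headers for the two policies. The header values `same-origin` and `require-corp` are the ones commonly documented for cross-origin isolation; they are not stated in this document and should be read as an assumption here. The helper `opts_into_isolation` is hypothetical and is not how Chrome evaluates these headers.

```python
def opts_into_isolation(headers):
    """Return True if the response headers carry both COOP and COEP opt-ins.

    headers: mapping of header name to header value.
    Assumption: `same-origin` (COOP) and `require-corp` (COEP) are the
    values that signal the opt-in; other accepted values are not modeled.
    """
    normalized = {
        # Header names are case-insensitive; ignore any parameters after ";".
        name.lower(): value.split(";", 1)[0].strip().lower()
        for name, value in headers.items()
    }
    coop = normalized.get("cross-origin-opener-policy")
    coep = normalized.get("cross-origin-embedder-policy")
    return coop == "same-origin" and coep == "require-corp"


# Hypothetical response headers for a page that opts in.
example_headers = {
    "Cross-Origin-Opener-Policy": "same-origin",
    "Cross-Origin-Embedder-Policy": "require-corp",
}
```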
| #### Other Gating Mechanisms |
| |
| **Note:** This section explores ideas but we are not currently planning on |
| implementing anything along these lines. |
| |
| Looking beyond developer opt-ins such as COOP and COEP, we might be able to find |
| other ways of limiting the scope of APIs to reduce their risk. For example, a |
| third-party `iframe` that is trying to exploit Spectre is very different from a |
| WebAssembly game, in the top-level frame, that the person is actively playing |
| (and issuing many gestures to). We could programmatically detect engagement and |
| establish policies for when certain APIs and features will be available to web |
| content. (See e.g. [Feature Policy](https://wicg.github.io/feature-policy/).) |
| |
| *Engagement* could be defined in a variety of complementary ways: |
| |
| * High [site engagement |
| score](https://www.chromium.org/developers/design-documents/site-engagement) |
| * High site popularity, search rank, or similar |
| * Frequent gestures on/interactions with the document |
| * Document is the top-level document |
| * Document is the currently-focused tab |
| * Site is bookmarked or added to the Home screen or Desktop |
| |
| Additionally, we have considered the possibility of prompting the user for |
| permission to run certain exploit-enabling APIs, although there are problems: |
| warning fatigue, and the difficulty of communicating something accurate yet |
| comprehensible to people. |
| |
| ## Conclusion |
| |
| For the reasons above, we now assume any active code can read any data in the |
| same address space. The plan going forward must be to keep sensitive |
| cross-origin data out of address spaces that run untrustworthy code, rather than |
| relying on in-process checks. |