This document summarizes the design guidelines for the APIs exposed through the Chrome DevTools Protocol (hereafter CDP) and provides a brief overview of CDP terminology and the related DevTools backend architecture.
Although the CDP was originally conceived with the Chrome DevTools front-end as the primary client, it is currently used by multiple clients, most of which reside outside of the Chromium source tree. We aim to maintain a reasonably stable and future-proof API for such clients, so we offer certain compatibility terms for the CDP API:
Commands, events and types not marked as experimental are guaranteed to remain backwards compatible until the next version of the protocol after the one where they have been marked as deprecated. This implies that no new mandatory input parameters (or fields in input types) will be added, and no output parameters (or fields in output types) will be removed or will become optional.
Commands, events and types marked as deprecated will remain supported until the protocol version is incremented. We will keep deprecated commands, events and types for at least 3 Chrome releases before they are removed.
The following principles should help contributors to maintain a comprehensive and stable API:
Domains are modules used to logically group related types, events and commands, e.g. Network, Performance or DOM.
Commands (occasionally referred to as “methods”) are requests sent by the client to the backend. Each command eventually produces a response indicating completion, either successful or not. In case of success, the command may return an arbitrary number of output parameters. If the command has failed, it should preferably indicate an error using protocol standard means (i.e. using static methods of the ProtocolResponse class). However, in rare cases where a command requires additional error information, it should indicate success via the protocol while returning additional error details through output parameters. No explicit indication of success through the method's output parameters (e.g. boolean success) should be used.
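The success and failure convention above can be illustrated from the client side. The following is a minimal sketch, assuming the JSON-RPC 2.0 message shape (an id plus either a result or an error member); the helper name unwrap_response, the ProtocolError class and the sample payloads are hypothetical and not part of any real client library.

```python
import json


class ProtocolError(Exception):
    """Raised when the backend reports a failed command."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def unwrap_response(raw):
    """Return the output parameters of a command response, or raise.

    Assumes a JSON-RPC 2.0 style message: success carries "result",
    failure carries "error" with "code" and "message".
    """
    msg = json.loads(raw)
    if "error" in msg:
        err = msg["error"]
        raise ProtocolError(err.get("code"), err.get("message"))
    # Success is implied by the absence of "error"; there is no
    # separate boolean success flag among the output parameters.
    return msg.get("result", {})


# Hypothetical sample payloads, for illustration only.
ok = unwrap_response('{"id": 1, "result": {"cookies": []}}')
try:
    unwrap_response(
        '{"id": 2, "error": {"code": -32601, "message": "Method not found"}}'
    )
    failed = False
except ProtocolError as e:
    failed = True
    error_code = e.code
```

Raising on the error member keeps the happy path free of status checks, which mirrors the guideline that output parameters should not carry a success flag.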
Events are notifications sent by the backend to the client and may carry arbitrary parameters. Events do not require any acknowledgement by the client. The backend should not emit events of a domain before the client has explicitly enabled that domain (typically, through the domain's “enable” command), or after the client has disabled it.
If a command invocation results in sending some events (for example, the enable command of certain domains may result in sending of previously buffered events), these are guaranteed to be emitted before the method returns.
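The ordering guarantee above can be seen from a client's read loop. The sketch below assumes that responses carry an id while event notifications carry a method and no id; the function run_command and the message stream are hypothetical stand-ins for a real transport.

```python
import json


def run_command(messages, command_id):
    """Consume messages until the response to command_id arrives.

    `messages` is any iterable of JSON strings (a stand-in for a real
    transport). Returns (events_seen_before_response, result).
    """
    events = []
    for raw in messages:
        msg = json.loads(raw)
        if "id" in msg:
            if msg["id"] == command_id:
                return events, msg.get("result", {})
            continue  # response to some other command; ignored in this sketch
        # No "id": an event notification, which needs no acknowledgement.
        events.append((msg["method"], msg.get("params", {})))
    raise RuntimeError("stream ended before the response arrived")


# Hypothetical stream: events triggered by a command precede its response.
stream = [
    '{"method": "Network.requestWillBeSent", "params": {"requestId": "1"}}',
    '{"method": "Network.requestWillBeSent", "params": {"requestId": "2"}}',
    '{"id": 7, "result": {}}',
]
events, result = run_command(stream, 7)
```

Because such events arrive before the response, a client can treat the response as a marker that all events caused by the command have already been delivered.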
Types are named using PascalCase (AKA UpperCamelCase), e.g. ResourceTiming. Methods, events, parameters and object properties are named using camelCase.
Methods should follow the <verb>[Object] pattern, e.g. enable, getCookies, captureScreenshot or addScriptToEvaluateOnLoad.
Event names should follow the <object><Verb-in-passive-voice> pattern, e.g. consoleMessageAdded or requestWillBeSent.
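The casing rules above lend themselves to a mechanical check. The following is a rough sketch; the regular expressions and function names are assumptions made for illustration and only approximate the conventions (they cannot verify the verb and object word order).

```python
import re

# Approximations of the casing conventions; letters and digits only.
PASCAL_CASE = re.compile(r"^[A-Z][A-Za-z0-9]*$")
CAMEL_CASE = re.compile(r"^[a-z][A-Za-z0-9]*$")


def is_type_name(name):
    """True if name looks like a PascalCase type name."""
    return bool(PASCAL_CASE.match(name))


def is_member_name(name):
    """True if name looks like a camelCase method, event or parameter name."""
    return bool(CAMEL_CASE.match(name))


checks = {
    "ResourceTiming": is_type_name("ResourceTiming"),
    "getCookies": is_member_name("getCookies"),
    "requestWillBeSent": is_member_name("requestWillBeSent"),
    "get_cookies": is_member_name("get_cookies"),  # underscores are rejected
}
```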
Agents are backend classes that implement individual protocol domains. Some agents are implemented in the renderer process (either in Blink or V8; no agents currently reside in content/renderer) and some in the browser process (implemented either by the content layer or by the embedder). A domain may be implemented by multiple agents spanning several layers and processes, e.g. have instances in chrome/, content/ and blink/renderer/. A single command may be handled by multiple layers in the following sequence: embedder, content browser, renderer. An agent may indicate that it has completed handling the command or let the command fall through to the next layer.
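The layered handling described above resembles a chain of handlers tried in a fixed order. The sketch below models that idea only; real agents are C++ classes, and the FALL_THROUGH sentinel, the dispatch function and the example command names are all hypothetical.

```python
FALL_THROUGH = object()  # sentinel: "this layer did not handle the command"


def dispatch(layers, method, params):
    """Offer a command to each layer in order until one handles it."""
    for name, handler in layers:
        response = handler(method, params)
        if response is not FALL_THROUGH:
            return name, response
    raise LookupError(f"no layer handled {method}")


# Hypothetical handlers standing in for per-layer agents.
def embedder(method, params):
    return FALL_THROUGH


def content_browser(method, params):
    if method == "Page.exampleBrowserCommand":
        return {}
    return FALL_THROUGH


def renderer(method, params):
    if method == "Page.exampleRendererCommand":
        return {}
    return FALL_THROUGH


# Order follows the sequence named in the text.
LAYERS = [
    ("embedder", embedder),
    ("content browser", content_browser),
    ("renderer", renderer),
]

handled_by, _ = dispatch(LAYERS, "Page.exampleRendererCommand", {})
```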
For historical reasons, agents that are implemented in the browser process are also called handlers and the classes that implement them are named as <Domain>Handler.
Targets are entities being inspected or debugged, such as frame subtrees, workers or worklets of different types, or external VMs (e.g. a Node.js instance). Each target is identified by a UUID and has an associated type (e.g. iframe, shared_worker etc.). The browser target handles methods that have global effects on the entire browser (or on a certain profile).
The type of the target defines the set of protocol domains a given target supports. For example, targets that support JavaScript execution (that is, all except browser) would have the Runtime domain, but only iframes would have the DOM domain.
While each worker or worklet corresponds to a target of its own, multiple local frames belonging to the same page are grouped into a single target. A subframe may thus change its target during a navigation, typically when it navigates into or out of the process of its parent.
A session corresponds to a single client connection to a particular target. Agents appropriate for target types are instantiated once per session and per layer -- for example, when a client connects to a frame, a PageHandler from chrome/, a PageHandler from content/ and an InspectorPageAgent (from blink) are instantiated for the given session.
DevTools support multiple sessions with the same target, which implies that multiple agents for the same domain should be designed to co-exist. For example, an agent should avoid overriding browser state modified by another instance of the agent unless the protocol client explicitly requested this.
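One way to picture agents co-existing across sessions is to track overrides per session, so that disabling one session never discards state set by another. This is only a sketch of that idea under an assumed "most recent override wins" policy; the SharedSetting class and its methods are hypothetical and do not describe the actual backend.

```python
class SharedSetting:
    """A piece of browser state that several sessions may override."""

    def __init__(self, default):
        self._default = default
        self._overrides = {}  # session id -> value, in insertion order

    def set_override(self, session_id, value):
        # Re-insert so this session becomes the most recent one.
        self._overrides.pop(session_id, None)
        self._overrides[session_id] = value

    def clear_override(self, session_id):
        # Only drop this session's own override; others stay in effect.
        self._overrides.pop(session_id, None)

    @property
    def effective(self):
        # Assumed policy: the most recently set override wins.
        if self._overrides:
            return next(reversed(self._overrides.values()))
        return self._default


setting = SharedSetting("default")
setting.set_override("session-a", "value-a")
setting.set_override("session-b", "value-b")
setting.clear_override("session-b")
after_b_detached = setting.effective
```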
Among the implementation details hidden from the protocol client is the fact that some frame navigations require render process swaps. This requires renderer-side agents to represent some of their state in a way that may be replicated to a different renderer in the event of navigation. State that was configured by the client and is not associated with the current document should be maintained using the type aliases offered by the InspectorAgentState class.
Protocol clients are typically considered trusted, as they can navigate to arbitrary origins and have access to all origin data. However, since the protocol is also exposed to Chrome extensions through the chrome.debugger API, the backend implements additional access control in some of the methods to prevent extensions from accessing the file system or otherwise escaping the sandbox. These restrictions are not extended to other types of clients.
Protocol clients should be prepared to handle data coming from untrusted sources such as malicious web pages and potentially compromised renderer processes.
String identifiers are preferred to integers even when the underlying implementation currently offers an integer identifier. This leaves us the flexibility to use composite identifiers in the future to avoid identifier collisions, for example, by prepending a process identifier to renderer-issued ids.
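The collision argument above can be made concrete with a small sketch. The function name make_composite_id and the "." separator are assumptions chosen for illustration; clients should treat such identifiers as opaque strings.

```python
def make_composite_id(process_id, local_id):
    """Build a string identifier from a process id and a renderer-issued id.

    The "<process>.<local>" layout is a hypothetical choice for this sketch.
    """
    return f"{process_id}.{local_id}"


# Two renderers may issue the same local integer id...
id_from_renderer_a = make_composite_id(101, 7)
id_from_renderer_b = make_composite_id(202, 7)
# ...but the composite string identifiers remain distinct.
distinct = id_from_renderer_a != id_from_renderer_b
```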
CDP is designed with JSON-RPC 2.0 as the primary wire format (though other representations exist). When exposed outside of the browser, the JSON produced by the CDP bindings is encoded as UTF-8 in accordance with RFC 8259. Some of the strings may come from JavaScript or DOM, where strings are typically represented as UTF-16. This may result in strings containing unpaired UTF-16 surrogates that don't have a matching UTF-8 representation. Such surrogates, as well as control characters that can't appear as-is in JSON, are escaped in accordance with RFC 8259.
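The escaping problem described above can be reproduced with Python's standard json module. This is a sketch of the general technique, not of the CDP bindings themselves: Python's default ensure_ascii mode escapes every non-ASCII character, which is more than strictly required but still valid JSON.

```python
import json

lone = "\ud83d"  # a high surrogate with no matching low surrogate

# An unpaired surrogate has no UTF-8 representation.
try:
    lone.encode("utf-8")
    encodable = True
except UnicodeEncodeError:
    encodable = False

# Escaping the surrogate and the control character as \uXXXX sequences
# yields pure-ASCII JSON text that is trivially valid UTF-8.
escaped = json.dumps({"text": "a\x01" + lone})
wire_bytes = escaped.encode("utf-8")

# A receiver that decodes the escapes recovers the original string.
round_tripped = json.loads(wire_bytes.decode("utf-8"))["text"]
```

A client written in a language with strict Unicode strings may need to decide how to handle such escaped surrogates, since they cannot always be represented after decoding.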
Binary data, such as images or the contents of arbitrary network requests, is designated with the Binary type in the protocol definition, which is mapped to base64-encoded strings when sent over JSON and uses a more efficient representation when the protocol is represented using a binary wire format.
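On the JSON wire format, a client therefore sees Binary values as base64 strings. The round trip can be sketched as follows; the field name data and the sample bytes are chosen only for illustration.

```python
import base64
import json

# First four bytes of the PNG file signature, used as sample binary data.
payload = bytes([0x89, 0x50, 0x4E, 0x47])

# Sender side: binary data becomes a base64 string inside the JSON message.
message = json.dumps({"data": base64.b64encode(payload).decode("ascii")})

# Receiver side: decode the string back into the original bytes.
encoded = json.loads(message)["data"]
decoded = base64.b64decode(encoded)
```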
Data passed over the protocol should be suitable for handling by automated tools as well as by UI clients supporting i18n. Passing messages (such as errors) in English is therefore rarely appropriate; structured types and enums should be used instead.
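The preference for enums over English text can be illustrated with a short sketch. The FailureReason enum, its values and the translation table are hypothetical and do not correspond to any actual protocol type; the point is only that a machine-readable value lets each client choose its own presentation.

```python
import json
from enum import Enum


class FailureReason(str, Enum):
    """Hypothetical enum; not part of the actual protocol."""

    TIMED_OUT = "timedOut"
    ACCESS_DENIED = "accessDenied"


# Client-side table mapping machine-readable values to localized text.
LOCALIZED = {
    ("en", FailureReason.ACCESS_DENIED): "Access denied",
    ("de", FailureReason.ACCESS_DENIED): "Zugriff verweigert",
}

# The wire carries only the enum value, never a human-readable sentence.
wire = json.dumps({"reason": FailureReason.ACCESS_DENIED.value})

# An automated tool can branch on the value; a UI client can localize it.
reason = FailureReason(json.loads(wire)["reason"])
text_de = LOCALIZED[("de", reason)]
```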