The Life of an Input Event in Desktop Chrome UI

Background

The goal of this document is to improve the understanding of the input event system in the desktop Chrome UI.

Overview

High-level overview of Chrome input pipeline

The Chrome UI system handles input events (typically key downs or mouse clicks) in three stages:

  • At the very beginning, the OS generates a native input event and sends it to a Chrome browser window.
  • Then, the Chrome Windowing system receives the event and converts it into a platform-independent ui::Event. It then sends the event to the Views system.
    The interaction with the IME (Input Method Editor, typically used for non-English text input) is also handled at this stage.
  • Lastly, the Views system sends the event to the control that expects to receive it. For example, a character key event inserts one character into the focused text field.

Window Abstraction

Aura

Aura is the window abstraction layer used on Windows, Linux, and ChromeOS. An event goes through several phases in Aura and is eventually passed into views.

Typical event routing in Aura

Phase 0 - DesktopWindowTreeHost

After the user presses a key or clicks the mouse, the OS generates a low-level input event and pumps it into a message loop. After some low-level, OS-specific plumbing, the event is delivered to the DesktopWindowTreeHost that hosts the native window, which handles it in DesktopWindowTreeHost::DispatchEvent().

Phase 1 - EventProcessor pre-dispatch

Event routing in EventProcessor

Next, the event is passed to a WindowEventDispatcher, which is an EventProcessor owned by the DesktopWindowTreeHost. On ChromeOS, some ui::EventRewriters may rewrite the event before it is passed on.

An EventProcessor delivers the event to the right target. It provides a root EventTarget and a default EventTargeter. EventTargeter is responsible for finding the event target. An EventTarget can also provide an EventTargeter and the EventProcessor prefers its root EventTarget’s targeter over the default targeter.

The EventProcessor delivers the event to the first target found by the targeter. If the event is not marked as handled, it will ask the targeter to find the next target and repeat the procedure until the event is handled.
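
The targeting loop described above can be modeled roughly as follows. This is a minimal sketch, not the real ui::EventProcessor: Event, Target, Targeter, FindTarget(), FindNextTarget(), and DispatchToTargets() are simplified stand-ins invented for illustration, and the targeter simply walks a fixed candidate list.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-ins for illustration; not the real ui:: classes.
struct Event {
  std::string name;
  bool handled = false;
};

struct Target {
  std::string name;
  bool consumes;  // Whether this target marks the event as handled.

  void OnEvent(Event& event, std::vector<std::string>& trace) {
    trace.push_back(name);
    if (consumes) event.handled = true;
  }
};

// Hypothetical targeter that walks a fixed, ordered candidate list.
class Targeter {
 public:
  explicit Targeter(std::vector<Target*> candidates)
      : candidates_(std::move(candidates)) {}

  Target* FindTarget() const {
    return candidates_.empty() ? nullptr : candidates_.front();
  }

  // Returns the candidate that follows `previous`, or nullptr if none.
  Target* FindNextTarget(const Target* previous) const {
    for (std::size_t i = 0; i + 1 < candidates_.size(); ++i) {
      if (candidates_[i] == previous) return candidates_[i + 1];
    }
    return nullptr;
  }

 private:
  std::vector<Target*> candidates_;
};

// Delivers the event to successive targets until one handles it.
// Returns the names of the targets visited, in order.
std::vector<std::string> DispatchToTargets(const Targeter& targeter,
                                           Event& event) {
  std::vector<std::string> trace;
  for (Target* target = targeter.FindTarget();
       target != nullptr && !event.handled;
       target = targeter.FindNextTarget(target)) {
    target->OnEvent(event, trace);
  }
  return trace;
}
```

With candidates "child" (does not consume), "parent" (consumes), and "root", the event visits "child" and "parent" and never reaches "root".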

The EventProcessor can also have pre- and post-dispatch phases that happen before and after the event is dispatched to the target.

In the case of WindowEventDispatcher, it has a pre-dispatch phase for different types of events.

  • For mouse move events, it may synthesize and dispatch an ET_MOUSE_EXITED event to notify the previous UI control that the mouse has left it.
  • For key events, it forwards the key to ui::InputMethod::DispatchKeyEvent(), where the event is handled. Depending on IME involvement, later phases of WindowEventDispatcher may be SKIPPED. Details are explained later.
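
To illustrate the mouse-move case, here is a small sketch of how an exit notification could be synthesized. MouseMoveTracker and its string-based events are hypothetical; only the ET_MOUSE_EXITED name comes from the text above, and the "ET_MOUSE_MOVED" label is used purely as a tag for the original move.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical tracker: remembers which control the mouse was last over
// and emits a synthetic exit notification when that control changes.
class MouseMoveTracker {
 public:
  // Returns the events to dispatch for a mouse move over `control`.
  std::vector<std::string> OnMouseMoved(const std::string& control) {
    std::vector<std::string> events;
    if (!last_control_.empty() && last_control_ != control) {
      // The mouse left the previous control; tell it first.
      events.push_back("ET_MOUSE_EXITED -> " + last_control_);
    }
    events.push_back("ET_MOUSE_MOVED -> " + control);
    last_control_ = control;
    return events;
  }

 private:
  std::string last_control_;
};
```

Moving within the same control yields only the move itself; crossing from one control to another yields an exit for the old control followed by the move for the new one.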

If the event is not marked handled in the EventProcessor pre-dispatch phase, it will be passed to the target. For key events, the target is the aura::Window that owns the focus. For mouse events, the target is the aura::Window under the cursor.

Each browser window comes with an aura::Window. It is worth noting that the web content and each dialog bubble live in their own aura::Windows. These different aura::Windows treat accelerators differently; the details are explained later.

Phase 2 - EventTarget pre-target

Like EventProcessor, an EventTarget consumes the event in three phases. It owns one target handler and, optionally, multiple pre-target and post-target handlers. An event is first passed to the pre-target handlers; if they do not consume it, it goes to the default target handler, and lastly to the post-target handlers.
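
The three handler phases can be sketched as below. This is a simplified model under stated assumptions, not the real ui::EventTarget: the Handler type, the AddPreTargetHandler()/AddPostTargetHandler() signatures, and the rule that a handler returning true ends dispatch are all chosen for illustration.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

struct Event {
  std::string name;
};

// Assumption: a handler returns true when it consumes the event.
using Handler = std::function<bool(const Event&)>;

class EventTarget {
 public:
  explicit EventTarget(Handler target_handler)
      : target_handler_(std::move(target_handler)) {}

  void AddPreTargetHandler(Handler handler) {
    pre_target_handlers_.push_back(std::move(handler));
  }
  void AddPostTargetHandler(Handler handler) {
    post_target_handlers_.push_back(std::move(handler));
  }

  // Returns the phase that consumed the event, or "unhandled".
  std::string Dispatch(const Event& event) const {
    for (const Handler& handler : pre_target_handlers_) {
      if (handler(event)) return "pre-target";
    }
    if (target_handler_(event)) return "target";
    for (const Handler& handler : post_target_handlers_) {
      if (handler(event)) return "post-target";
    }
    return "unhandled";
  }

 private:
  Handler target_handler_;
  std::vector<Handler> pre_target_handlers_;
  std::vector<Handler> post_target_handlers_;
};
```

A pre-target handler that recognizes accelerators, for example, would consume the event before the target handler ever sees it.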

A non-content aura::Window uses a pre-target handler to forward key events to the FocusManager in views. If the key is an accelerator, the event is intercepted and later phases are SKIPPED.

Mouse events are currently not processed in pre-target handlers. A content aura::Window does not have any pre-target handlers, either.

Phase 3 - EventTarget regular

At this phase, non-content aura::Window asks (Desktop)NativeWidgetAura::OnEvent() to handle the event.

DesktopNativeWidgetAura is the native implementation of a top-level Widget. Non-top-level widgets, e.g. dialog bubbles, use NativeWidgetAura instead. The native widget then passes the event to Widget::OnMouseEvent(), Widget::OnKeyEvent(), or other Widget methods depending on the event type. The event is then handled in views, as explained in a later section.

A content aura::Window instead asks RenderWidgetHostViewAura::OnEvent() to handle the event. The event is sent to Blink and then, if the web page does not consume it, to BrowserCommandController. Some important shortcuts, e.g. Ctrl+T, are reserved and are not sent to the web page.
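
The routing order for a key in a content window can be pictured with the sketch below. ContentKeyRouter, the Consumer enum, and the shortcut tables are hypothetical; in particular, treating a reserved shortcut as handled directly by the browser is an assumption made for this model, since the text only states that such shortcuts are not sent to the page.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <string>
#include <utility>

enum class Consumer { kBrowserReserved, kWebPage, kBrowserCommand, kNone };

// Hypothetical router modeling the order in which a key is offered.
class ContentKeyRouter {
 public:
  ContentKeyRouter(std::set<std::string> reserved,
                   std::set<std::string> browser_commands)
      : reserved_(std::move(reserved)),
        browser_commands_(std::move(browser_commands)) {}

  Consumer Route(
      const std::string& shortcut,
      const std::function<bool(const std::string&)>& page_consumes) const {
    // Reserved shortcuts never reach the web page.
    if (reserved_.count(shortcut) > 0) return Consumer::kBrowserReserved;
    // Otherwise the page gets the first chance to consume the key.
    if (page_consumes(shortcut)) return Consumer::kWebPage;
    // Unconsumed keys fall through to browser commands.
    if (browser_commands_.count(shortcut) > 0) return Consumer::kBrowserCommand;
    return Consumer::kNone;
  }

 private:
  std::set<std::string> reserved_;
  std::set<std::string> browser_commands_;
};
```

In this model, a page that tries to consume every key still cannot intercept Ctrl+T, but it can take a non-reserved shortcut before the browser sees it.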

Phase 4 - EventTarget post-target

This phase is not used in the window abstraction layer.

Phase 5 - EventProcessor post-dispatch

For touch events, WindowEventDispatcher may recognize the event as a gesture event and dispatch it.

Key event handling and IME interoperability

We mentioned in phase 1 pre-dispatch that a key event may be consumed in that phase and never reach later phases. This is because we need to interact with the IME through InputMethod::DispatchKeyEvent() in pre-dispatch.

If the IME accepts this key event, Chrome stops any further event handling because IMEs have their own interpretation of the event. Instead, Chrome exits phase 1 with a fake VKEY_PROCESSKEY event indicating that the event has been processed by the IME, then waits for new events emitted by the IME and handles them accordingly. For example, Chrome on Linux listens for the GTK preedit-changed event, which indicates a change in the composition text.

If the IME does not accept this key event, WindowEventDispatcher will re-enter phase 1 but with IME explicitly skipped, so that the event can be passed to phase 2 where accelerators are handled.
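
The two outcomes can be summarized in a short sketch. KeyDispatcher, the skip_ime flag, and the trace strings are hypothetical and only model the control flow described above; the real logic lives in WindowEventDispatcher and ui::InputMethod.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

struct KeyEvent {
  std::string key;
  bool skip_ime = false;  // Hypothetical flag marking the re-entrant pass.
};

class KeyDispatcher {
 public:
  // `ime_accepts` stands in for the IME's decision on a key event.
  explicit KeyDispatcher(std::function<bool(const KeyEvent&)> ime_accepts)
      : ime_accepts_(std::move(ime_accepts)) {}

  // Returns a trace of the steps taken for `key`.
  std::vector<std::string> Dispatch(const std::string& key) const {
    KeyEvent event;
    event.key = key;
    std::vector<std::string> trace;
    PreDispatch(event, trace);
    return trace;
  }

 private:
  void PreDispatch(KeyEvent event, std::vector<std::string>& trace) const {
    if (!event.skip_ime) {
      if (ime_accepts_(event)) {
        // The IME owns the key now; later phases are skipped.
        trace.push_back("phase 1: IME accepted, emit VKEY_PROCESSKEY");
        return;
      }
      // Re-enter phase 1 with the IME explicitly skipped.
      trace.push_back("phase 1: IME declined, re-enter without IME");
      event.skip_ime = true;
      PreDispatch(event, trace);
      return;
    }
    trace.push_back("phase 2: accelerators for " + event.key);
  }

  std::function<bool(const KeyEvent&)> ime_accepts_;
};
```

When the IME accepts the key the trace ends in phase 1; when it declines, the second pass reaches phase 2, where accelerators are handled.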

MacViews

MacViews is an umbrella term that covers the broader effort to adopt views in Chrome Mac. Before this, Chrome Mac used native Cocoa controls. In this document, we use MacViews to refer to the window abstraction part of Chrome Mac.

Mac does not use Aura. Its design differs significantly from Aura in that the native NSWindow is hosted in RemoteCocoa, which talks to views through a Mojo interface. This allows RemoteCocoa to live either within the browser process or in a separate process for PWA mode, and is largely driven by the requirements of PWAs on Mac. [ref]

Mac’s event handling borrows heavily from Cocoa’s Event architecture but applies its own handling where appropriate.

During startup, ChromeBrowserMainParts kicks off NSApp’s main run loop, which continues to service Chrome application event messages for the life of the program. These messages are picked up by BrowserCrApplication (an NSApplication subclass) and, for the most part, forwarded to the appropriate NativeWidgetMacNSWindow (an NSWindow subclass).

A key departure from how typical Cocoa applications are architected is that Chrome uses a single root NSView (the BridgedContentView) as the contentView for its NSWindow. This view is largely responsible for adapting native NSEvents and funneling them through to the Views framework.

The two examples below show two important event flows through the Cocoa layers of Chrome to the Views framework.

Event routing on Mac

Right mouse down event (clicking a button in the browser window)

The below diagram demonstrates points of interest during dispatch of a right mouse down event on a Chrome browser window button.

Summary:

  • The Window Server is responsible for determining which NSWindow a mouse event belongs to.
  • Once the NSWindow has been identified the Window Server will place the mouse down event in Chrome’s BrowserCrApplication (NSApplication) event queue.
  • BrowserCrApplication’s main run loop reads from the event queue.
  • BrowserCrApplication delivers the event to the NativeWidgetMacNSWindow (NSWindow) which delivers the mouseDown event to its root NSView contentView.
  • BridgedContentView aggregates all mouse related NSResponder messages (rightMouseDown, mouseMoved, leftMouseUp etc) into the mouseEvent: method.
  • The mouseEvent: method converts the NSEvent into a ui::Event and sends it to the NativeWidgetMacNSWindowHost’s OnMouseEvent() method.
  • BridgedContentView communicates to the NativeWidgetMacNSWindowHost via a bridge.
    • NativeWidgetMacNSWindowHost implements a Mojo remote remote_cocoa::mojom::NativeWidgetNSWindowHost such that the BridgedContentView and the NativeWidgetMacNSWindowHost can communicate via message passing (needed in the case these exist across process boundaries).
  • NativeWidgetMac owns a NativeWidgetMacNSWindowHost instance.
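
The funneling of mouse messages into a single conversion point, and the hand-off across the bridge, can be sketched as follows. ContentView, WindowHost, and RecordingHost are hypothetical stand-ins; in particular, a plain virtual interface replaces the Mojo interface, so no process boundary is modeled here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Platform-independent event; a stand-in for a ui::Event.
struct MouseEvent {
  std::string type;
  int x;
  int y;
};

// Stand-in for the host side of the bridge.
class WindowHost {
 public:
  virtual ~WindowHost() = default;
  virtual void OnMouseEvent(const MouseEvent& event) = 0;
};

// Test double that records what crosses the bridge.
class RecordingHost : public WindowHost {
 public:
  void OnMouseEvent(const MouseEvent& event) override {
    received.push_back(event);
  }
  std::vector<MouseEvent> received;
};

// Stand-in for the content view: every responder-style entry point
// funnels into one method that converts and forwards the event.
class ContentView {
 public:
  explicit ContentView(WindowHost* host) : host_(host) {}

  void RightMouseDown(int x, int y) { HandleMouseEvent("right-mouse-down", x, y); }
  void MouseMoved(int x, int y) { HandleMouseEvent("mouse-moved", x, y); }
  void MouseUp(int x, int y) { HandleMouseEvent("mouse-up", x, y); }

 private:
  void HandleMouseEvent(const std::string& type, int x, int y) {
    host_->OnMouseEvent(MouseEvent{type, x, y});
  }

  WindowHost* host_;  // Not owned.
};
```

Keeping the conversion in one method means each native message type needs only a one-line forwarder, and the host sees a single, uniform event type.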

Key Down event (entering text into the browser’s omnibox)

The following demonstrates key points of interest in the event flow that occurs when a user presses a character key with the intention to enter text into the browser’s omnibox.

Summary:

  • The Window Server delivers key events to the BrowserCrApplication’s (NSApplication) event queue.
  • Provided the keyDown event is not a key equivalent or keyboard interface control, the BrowserCrApplication sends the event to the NativeWidgetMacNSWindow (NSWindow) that is associated with the first responder.
  • The window dispatches the event as a keyDown event to its first responder (in this case the BridgedContentView, which serves as the NSWindow’s contentView).
  • BridgedContentView conforms to NSTextInputClient, which is required for Chrome to interact properly with Cocoa’s text input management system.
  • BridgedContentView forwards the key event to the interpretKeyEvents: method.
    • This invokes Cocoa’s input management system.
    • This checks the pressed key against all key-binding dictionaries.
    • If there is a match in the keybinding dictionary it sends a doCommandBySelector: message back to the view. (commands include insertTab, insertNewline, insertLineBreak, moveLeft etc).
    • If no command matches it sends an insertText: message back to the BridgedContentView.
  • BridgedContentView converts the NSString to UTF-16 and sends it through to its TextInputHost.
  • The TextInputHost calls InsertText() on its ui::TextInputClient.
    • This should be the TextInputClient of the currently focused view.

Views

The Window Abstraction layer will pass the input event to Views. Views is Chrome’s (mostly) platform-independent UI framework that orchestrates UI elements in a tree structure. Every node in the views tree is a View, which is a UI element similar to an HTML DOM element.

A simplified diagram of a views tree

A Widget hosts the views tree and is a window-like surface that draws its content onto a canvas provided by the underlying window abstraction. Every widget can have at most one focused view which is tracked by a FocusManager owned by the widget.

The root of the views tree is a RootView, a special subclass of View that helps bridge between child views and the wrapping Widget.
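
The relationship between Widget, RootView, View, and FocusManager can be sketched as a small ownership tree. The classes below reuse the names from the text but are heavily simplified models written for this document, not the real views:: classes; their members and signatures are assumptions.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// A node in the views tree; owns its children.
class View {
 public:
  explicit View(std::string name) : name_(std::move(name)) {}
  virtual ~View() = default;

  // Takes ownership of `child` and returns a raw pointer to it.
  View* AddChildView(std::unique_ptr<View> child) {
    child->parent_ = this;
    children_.push_back(std::move(child));
    return children_.back().get();
  }

  const std::string& name() const { return name_; }
  View* parent() const { return parent_; }

 private:
  std::string name_;
  View* parent_ = nullptr;
  std::vector<std::unique_ptr<View>> children_;
};

// The root of the tree, sitting between child views and the Widget.
class RootView : public View {
 public:
  RootView() : View("RootView") {}
};

// Tracks the single focused view of a widget (or none).
class FocusManager {
 public:
  void SetFocusedView(View* view) { focused_view_ = view; }
  View* GetFocusedView() const { return focused_view_; }

 private:
  View* focused_view_ = nullptr;  // Not owned.
};

// Hosts the views tree and owns the FocusManager.
class Widget {
 public:
  RootView* GetRootView() { return &root_view_; }
  FocusManager* GetFocusManager() { return &focus_manager_; }

 private:
  RootView root_view_;
  FocusManager focus_manager_;
};
```

Because the FocusManager stores a single pointer, focusing a second view replaces the first, which matches the "at most one focused view" rule.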