This page documents tracing and debugging the Video Acceleration API (VaAPI or VA-API) on ChromeOS. The VA-API is an open-source library and API specification, providing access to graphics hardware acceleration capabilities for video and image processing. The VaAPI is used on ChromeOS on both Intel and AMD platforms.
VaAPI code is developed upstream on the VaAPI GitHub repository, from which ChromeOS is a downstream client via the libva package, with packaged backends for e.g. both Intel and AMD.
A simplified diagram of the buffer circulation is provided below. The “client” is always a Renderer process via a Mojo/IPC communication. Essentially the VaAPI Video Decode Accelerator (VaVDA) receives encoded BitstreamBuffers from the “client”, and sends them to the “va internals”, which eventually produces decoded video in PictureBuffers. The VaVDA may or may not use the Vpp
unit for pixel format adaptation, depending on the codec used, silicon generation and other specifics.
```
        K BitstreamBuffers    +-----+        +--------------------+
 C  --------------------->    |     | -----> |                    |
 L  <---------------------    | Va  | <----- |    va internals    |
 I     (encoded stuff)        | VDA |        |                    |
 E                            |     |        | +-----+     +----+ |
 N  <---------------------    |     | <--------|     |<----| lib| |
 T  --------------------->    |     | -------->| Vpp |---->| va | |
                              +-----+        | +--+--+  M  +----+ |
    N PictureBuffers                         |    |              |
      (decoded stuff)                        |  VASurfaces       |
                                             +--------------------+
```
`K` is unrelated to both `M` and `N`.

## Tracing memory consumption

Tracing memory consumption is done via the MemoryInfra system. Please take a minute to read that document (in particular, the difference between `effective_size` and `size`). The VaAPI lives inside the GPU process (a.k.a. the Viz process), so please familiarize yourself with the GPU Memory Tracing document. The VaVDA provides information by implementing the Memory Dump Provider interface, but the information provided varies with the executing mode, as explained next.
The usage of the Vpp unit is controlled by the member variable |decode_using_client_picture_buffers_|, which, when enabled, is very advantageous in terms of CPU, power and memory consumption (see crbug.com/822346).

### When |decode_using_client_picture_buffers_| is false

libva uses a set of internally allocated VASurfaces, which are accounted for in the `gpu/vaapi/decoder` tracing category (see screenshot below). Each of these VASurfaces is backed by a Buffer Object large enough to hold, at least, the decoded image in YUV semiplanar format. In the diagram above, `M` varies: 4 for VP8, 9 for VP9, 4-12 for H264/AVC1 (see `GetNumReferenceFrames()`).

### When |decode_using_client_picture_buffers_| is true

libva can decode directly onto the client's PictureBuffers, so `M = 0` and the `gpu/vaapi/decoder` category is not present in the GPU MemoryInfra. VaVDA allocates storage for the `N` PictureBuffers provided by the client by means of VaapiPicture{NativePixmapOzone}s, backed by NativePixmaps, themselves backed by DmaBufs (the client only knows about the client Texture IDs). The GPU's TextureManager accounts for these textures, but:
See e.g. the following ToT example for 10 1920x1080p textures (32bpp); finding the desired `context_group` can be tricky.
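As a back-of-the-envelope check on the sizes MemoryInfra should report in each mode, the sketch below estimates them, assuming NV12 (YUV 4:2:0 semiplanar, 1.5 bytes/pixel) VASurfaces and 32bpp client textures. This is an illustration, not the actual Chromium accounting code.

```python
# Rough memory estimates for the two modes; a sketch, not the real
# accounting code. NUM_REFERENCE_FRAMES mirrors the M values quoted in
# the text above (H264/AVC1 uses up to 12).
NUM_REFERENCE_FRAMES = {"VP8": 4, "VP9": 9, "H264": 12}

def vasurface_bytes(width, height):
    """Minimum size of the Buffer Object backing an NV12 VASurface."""
    return width * height * 3 // 2

def texture_bytes(width, height, bpp=32):
    """Size of a client PictureBuffer texture at the given bpp."""
    return width * height * bpp // 8

# |decode_using_client_picture_buffers_| = false: M extra VASurfaces
# show up under gpu/vaapi/decoder.
m = NUM_REFERENCE_FRAMES["VP9"]
print(f"VP9 1080p: ~{m * vasurface_bytes(1920, 1080) / 2**20:.1f} MiB")

# |decode_using_client_picture_buffers_| = true: only the N client
# textures, accounted for by the TextureManager (e.g. N = 10 at 32bpp).
print(f"10 textures: ~{10 * texture_bytes(1920, 1080) / 2**20:.1f} MiB")
```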
## Tracing power consumption

Power consumption is available on ChromeOS test/dev images via the command line binary `dump_intel_rapl_consumption`; this tool averages the power consumption of the four SoC domains over a configurable period of time, usually a few seconds. These domains are, in the order presented by the tool:

* `pkg`: estimated power consumption of the whole SoC; in particular, this is a superset of `pp0` and `pp1`, including all accessory silicon, e.g. video processing.
* `pp0`: CPU set.
* `pp1`/`gfx`: Integrated GPU or GPUs.
* `dram`: estimated power consumption of the DRAM, from the bus activity.

Googlers can read more about this topic under go/power-consumption-meas-in-intel.
`dump_intel_rapl_consumption` is usually run while a given workload is active (e.g. a video playback), with an interval larger than a second to smooth out all kinds of system services that would show up in smaller periods, e.g. WiFi.

```shell
dump_intel_rapl_consumption --interval_ms=2000 --repeat --verbose
```
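Measurements like these boil down to sampling the RAPL cumulative energy counters (exposed on Linux in microjoules under `/sys/class/powercap/intel-rapl*`) and dividing the delta by the elapsed time. A minimal sketch of that computation, under those assumptions (this is not the tool's actual source, and the wrap value used below is illustrative; the real one is read from `max_energy_range_uj`):

```python
def average_watts(energy_start_uj, energy_end_uj, interval_s,
                  max_energy_uj=2**32):
    """Average power from two cumulative RAPL energy samples.

    Counters are in microjoules and wrap around; max_energy_uj is an
    illustrative wrap value, not a hardware constant.
    """
    delta_uj = energy_end_uj - energy_start_uj
    if delta_uj < 0:  # the counter wrapped during the interval
        delta_uj += max_energy_uj
    return delta_uj / interval_s / 1e6  # uJ/s -> W

# E.g. a domain that consumed 5.26 J over a 2 s interval averaged 2.63 W.
print(average_watts(1_000_000, 6_260_000, 2.0))
```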
E.g. on a nocturne main1, the average power consumption in watts while playing back the first minute of a 1080p VP9 video is:

| `pkg` | `pp0` | `pp1`/`gfx` | `dram` |
|-------|-------|-------------|--------|
| 2.63  | 1.44  | 0.29        | 0.87   |
As can be seen, `pkg` ~= `pp0` + `pp1` + 1W; this extra watt is the cost of all the associated silicon, e.g. bridges, bus controllers, caches, and the media processing engine.
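That relationship can be checked directly against the table; a trivial calculation (the `uncore` name below is shorthand for this example, not a label the tool prints):

```python
# Figures in watts, from the table above.
pkg, pp0, pp1, dram = 2.63, 1.44, 0.29, 0.87

# pkg covers pp0, pp1 and everything else on the SoC; the remainder is
# the accessory silicon (bridges, caches, media engine, ...).
uncore = pkg - (pp0 + pp1)
print(f"uncore: {uncore:.2f} W")  # roughly the ~1 W mentioned above
```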
TODO(mcasas): fill in this section.
## vainfo

`vainfo` is a small command line utility used to enumerate the supported operation modes; it's developed in the libva-utils repository, and is readily available on ChromeOS dev images (`media-video/libva-utils` package) and on Debian systems (`vainfo` package). `vainfo` will try to load the appropriate backend driver for the system and/or GPUs, and will fail if it cannot find/load it.
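The exact `vainfo` output depends on the driver, but the profile/entrypoint lines follow a `VAProfile... : VAEntrypoint...` pattern. The snippet below extracts the decodable profiles from such output; the sample text is an illustrative excerpt, not the output of any particular device:

```python
import re

# Illustrative vainfo-style excerpt (not from a real device).
SAMPLE = """\
vainfo: Supported profile and entrypoints
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointVLD
      VAProfileVP8Version0_3          : VAEntrypointVLD
      VAProfileVP9Profile0            : VAEntrypointVLD
"""

def decode_profiles(text):
    """Return the VAProfiles exposing a VLD (decode) entrypoint."""
    pairs = re.findall(r"(VAProfile\w+)\s*:\s*(VAEntrypoint\w+)", text)
    return sorted({p for p, e in pairs if e == "VAEntrypointVLD"})

print(decode_profiles(SAMPLE))
```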
## Verifying VaAPI support and usage

A few steps are customary to verify the support and use of a given codec.

To verify that the build and platform support video acceleration, launch Chromium and navigate to `chrome://gpu`, then:

* Search for the “Video Acceleration Information” section: this should enumerate the available accelerated codecs and resolutions.
* If this section is empty, oftentimes the “Log Messages” section immediately below might indicate an associated error, e.g. `vaInitialize failed: unknown libva error`, which can usually be reproduced with `vainfo`; see the previous section.
To verify that a given video is being played back using the accelerated video decoding backend:

* Open a `chrome://media-internals` tab.
* Open the “Player Properties” and check the “`video_decoder`” entry: it should say “GpuVideoDecoder”.

## VaAPI on Linux

This configuration is unsupported (see docs/linux_hw_video_decode.md); the following instructions are provided only as a reference for developers to test the code paths on a Linux machine.
* Set `use_vaapi = true` in the args.gn file (please refer to the Setting up the build section).
* Add `proprietary_codecs = true` and `ffmpeg_branding = "Chrome"` to the GN args.

At this point you should make sure the appropriate VA driver backend is working correctly; try running `vainfo` from the command line and verify no errors show up.
To run Chromium using VaAPI two arguments are necessary:

* `--ignore-gpu-blacklist`
* `--use-gl=desktop` or `--use-gl=egl`

```shell
./out/gn/chrome --ignore-gpu-blacklist --use-gl=egl
```
Note that you can set the environment variable `MESA_GLSL_CACHE_DISABLE=false` if you want the GPU process to run in sandboxed mode; see crbug.com/264818. To check whether the running GPU process is sandboxed, open `chrome://gpu` and search for `Sandboxed` in the driver information table. In addition, passing `--gpu-sandbox-failures-fatal=yes` will prevent the GPU process from running in non-sandboxed mode.
Refer to the previous section to verify support and use of the VaAPI.