Elaboration of the blob storage system in Chrome.
Please see the FileAPI Spec for the full specification for Blobs, or Mozilla's Blob documentation for a description of how Blobs are used in the Web Platform in general. For the purposes of this document, the important aspects of blobs are:
In Chrome, after blob creation the actual blob ‘data’ gets transported to and lives in the browser process. The renderer just holds a reference - specifically a string UUID - to the blob, which it can use to read the blob or pass it to other processes.
Blobs are created in a renderer process, where their data is temporarily held for the browser (while JavaScript execution can continue). When the browser has enough memory quota for the blob, it requests the data from the renderer. All blob data is transported from the renderer to the browser. Once the transport is complete, any pending reads for the blob are allowed to complete. Blobs can be huge (GBs), so quota is necessary.
If the in-memory space for blobs is getting full, or a new blob is too large to be in-memory, then the blob system uses the disk. This can either be paging old blobs to disk, or saving the new too-large blob straight to disk.
Blob reading goes through the network layer, where the renderer dispatches a network request for the blob and the browser responds with the `BlobURLRequestJob`.
General Chrome terminology:
Blob system terminology:
We calculate the storage limits here.
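In-Memory Storage Limit

* 64-bit desktop: `2GB`
* Chrome OS and other platforms (32-bit desktop, Cast): `total_physical_memory / 5`
* Android: `total_physical_memory / 100`

Disk Storage Limit

* Chrome OS: `disk_size / 2`
* Android: `6 * disk_size / 100`
* Otherwise: `disk_size / 10`
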
Note: Chrome OS's disk is part of the user partition, which is separate from the system partition.
Minimum Disk Availability
We cap the disk limit so that a minimum amount of disk space stays available. The equation we use is:
min_disk_availability = in_memory_limit * 2
Device | RAM | In-Memory Limit | Disk | Disk Limit | Min Disk Availability |
---|---|---|---|---|---|
Cast | 512 MB | 102 MB | 0 | 0 | 0 |
Android Minimal | 512 MB | 5 MB | 8 GB | 491 MB | 10 MB |
Android Fat | 2 GB | 20 MB | 32 GB | 1.9 GB | 40 MB |
CrOS | 2 GB | 409 MB | 8 GB | 4 GB | 0.8 GB |
Desktop 32 | 3 GB | 614 MB | 500 GB | 50 GB | 1.2 GB |
Desktop 64 | 4 GB | 2 GB | 500 GB | 50 GB | 4 GB |
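
The arithmetic behind the example table can be sketched in a few lines. This is a minimal illustration, not Chrome's actual code: the `Platform` enum and the function names are hypothetical, and the mapping from platform to formula is an assumption read off the example rows above.

```cpp
#include <cstdint>

// Hypothetical platform tags. Which platform uses which formula is inferred
// from the example table, not taken from Chrome's source.
enum class Platform { kDesktop64, kChromeOS, kAndroid, kOther };

constexpr std::uint64_t kMB = 1024ull * 1024;
constexpr std::uint64_t kGB = 1024ull * kMB;

// In-memory limit: a fixed 2GB, RAM / 5, or RAM / 100.
constexpr std::uint64_t InMemoryLimit(Platform p, std::uint64_t ram_bytes) {
  return p == Platform::kDesktop64 ? 2 * kGB
         : p == Platform::kAndroid ? ram_bytes / 100
                                   : ram_bytes / 5;
}

// Disk limit: half the disk, 6% of the disk, or a tenth of the disk.
constexpr std::uint64_t DiskLimit(Platform p, std::uint64_t disk_bytes) {
  return p == Platform::kChromeOS  ? disk_bytes / 2
         : p == Platform::kAndroid ? 6 * disk_bytes / 100
                                   : disk_bytes / 10;
}

// min_disk_availability = in_memory_limit * 2
constexpr std::uint64_t MinDiskAvailability(std::uint64_t in_memory_limit) {
  return in_memory_limit * 2;
}
```

For instance, `InMemoryLimit(Platform::kChromeOS, 2 * kGB) / kMB` evaluates to 409, which matches the CrOS row.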
Creating a lot of blobs, especially if they are very large blobs, can cause the renderer memory to grow too fast and result in an OOM on the renderer side. This is because the renderer temporarily stores the blob data while it waits for the browser to request it. Meanwhile, JavaScript can continue executing. Transferring the data can take a lot of time if the blob is large enough to save it directly to a file, as this means we need to wait for disk operations before the renderer can get rid of the data.
If the blob object in JavaScript is kept around, then the data will never be cleaned up in the backend. This will unnecessarily use memory, so make sure to dereference blob objects if they are no longer needed.
Similarly, if a URL is created for a blob, this will keep the blob data around until the URL is revoked (and the blob object is dereferenced). However, the URL is automatically revoked when the browser context is destroyed.
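
The retention rule in the last two paragraphs can be stated as a one-line predicate. This is a toy model only: `BlobRefs` and `BackendDataIsRetained` are hypothetical names, and the real bookkeeping in the browser is more involved.

```cpp
// Toy model: backend data stays alive while either a JavaScript reference
// or a not-yet-revoked blob URL still points at the blob.
struct BlobRefs {
  int js_references;    // live blob objects held by JavaScript
  int registered_urls;  // URLs created for the blob and not yet revoked
};

constexpr bool BackendDataIsRetained(BlobRefs refs) {
  return refs.js_references > 0 || refs.registered_urls > 0;
}
```

Both counts have to reach zero before the data can be cleaned up, which is why dropping the blob object alone is not enough while a URL is still registered.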
All blob interaction should go through the `BlobStorageContext`. Blobs are built using a `BlobDataBuilder` to populate the data and then calling `BlobStorageContext::AddFinishedBlob` or `::BuildBlob`. This returns a `BlobDataHandle`, which manages reading, lifetime, and metadata access for the new blob.
If you know what data the blob will contain but that data is not available yet, you can still create the blob reference. To facilitate this construction, see the documentation for the `BlobDataBuilder::AppendFuture*` and `::Populate*` methods on the builder, the callback usage on `BlobStorageContext::BuildBlob`, and `BlobStorageContext::NotifyTransportComplete`.
All blob information should come from the `BlobDataHandle` returned on construction. This handle is cheap to copy. Once all instances of handles for a blob are destructed, the blob is destroyed.
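
The handle semantics described above resemble shared ownership. The sketch below uses `std::shared_ptr` as an analogy only; `FakeBlobEntry`, `FakeBlobHandle`, and the demo function are hypothetical stand-ins, not the real `BlobDataHandle`.

```cpp
#include <cassert>
#include <memory>

// Stand-in for the blob's backing entry; sets a flag when it is destroyed.
struct FakeBlobEntry {
  explicit FakeBlobEntry(bool* destroyed_flag) : destroyed(destroyed_flag) {}
  ~FakeBlobEntry() { *destroyed = true; }
  bool* destroyed;
};

// Stand-in for a handle: copying it only bumps a shared reference count.
using FakeBlobHandle = std::shared_ptr<FakeBlobEntry>;

// Returns true if the entry outlives the first handle and is destroyed
// once the last remaining copy goes away.
bool EntryLivesUntilLastHandleIsGone() {
  bool destroyed = false;
  bool alive_with_one_handle_left = false;
  {
    FakeBlobHandle copy;
    {
      FakeBlobHandle original = std::make_shared<FakeBlobEntry>(&destroyed);
      copy = original;  // cheap: no blob data is duplicated
      assert(copy.use_count() == 2);
    }  // `original` is destructed here; `copy` keeps the entry alive
    alive_with_one_handle_left = !destroyed;
  }  // `copy` is destructed here, so the entry is destroyed
  return alive_with_one_handle_left && destroyed;
}
```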
`BlobDataHandle::RunOnConstructionComplete` will notify you when the blob is constructed or broken (construction failed due to not enough space, filesystem error, etc).
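
A completion notification of this kind can be modeled with a small callback queue. This is a sketch under assumptions: `FakeBlob`, `FakeStatus`, and `FinishConstruction` are hypothetical, and running the callback immediately when construction has already ended is an assumed behavior, not something the text above states.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

enum class FakeStatus { kPending, kDone, kBroken };

class FakeBlob {
 public:
  using Callback = std::function<void(FakeStatus)>;

  // Runs `cb` right away if construction already ended, otherwise queues it.
  void RunOnConstructionComplete(Callback cb) {
    if (status_ != FakeStatus::kPending) {
      cb(status_);
    } else {
      waiting_.push_back(std::move(cb));
    }
  }

  // Called once by the (hypothetical) builder when construction ends,
  // either successfully (kDone) or not (kBroken).
  void FinishConstruction(FakeStatus final_status) {
    assert(final_status != FakeStatus::kPending);
    status_ = final_status;
    std::vector<Callback> callbacks;
    callbacks.swap(waiting_);
    for (auto& cb : callbacks) cb(status_);
  }

 private:
  FakeStatus status_ = FakeStatus::kPending;
  std::vector<Callback> waiting_;
};
```

The callback receives the final status, so the caller can tell a constructed blob from a broken one.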
The `BlobReader` class is for reading blobs, and is accessible from the `BlobDataHandle` at any time.
Updated Recommendations:

* In `blink::` code you'll probably have a `blink::BlobDataHandle`, so use `FileReaderLoader`/`FileReaderLoaderClient` as an abstraction around the mojom Blob interface.
* Outside `blink` (in both browser and renderer) you'll probably have a `blink::mojom::BlobPtr`, so just call `ReadAll`/`ReadRange` on that directly.
* With a `storage::BlobDataHandle`, use `CreateReader`/`storage::BlobReader`.

This process is outlined with diagrams and illustrations here.
This outlines the renderer-side responsibilities of the blob system. The renderer needs to:
The meat of blob construction starts in the `WebBlobRegistryImpl`'s `createBuilder(uuid, content_type)`.
Since blobs are often constructed from arrays of single bytes, we try to consolidate all adjacent memory blob items into one. This is done in BlobConsolidation. The implementation doesn't actually copy anything or allocate new memory buffers. Instead, it facilitates the transformation between the 'consolidated' blob items and the underlying bytes items. This way we don't waste any memory.
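
One way to picture consolidation without copying is as an offset translation: the consolidated item is a view, and each read is mapped back to the original item that holds the byte. The sketch below is illustrative only; `ItemPosition`, `Locate`, and `kExampleSizes` are hypothetical names, not part of BlobConsolidation.

```cpp
#include <array>
#include <cstddef>

// Position of one byte inside the original (unconsolidated) items.
struct ItemPosition {
  std::size_t item_index;
  std::size_t offset_in_item;
};

// Maps an offset in the consolidated view back to the underlying item.
// Nothing is copied: the consolidated item is only a view over `sizes`.
// The caller must pass an offset smaller than the sum of all sizes.
template <std::size_t N>
constexpr ItemPosition Locate(const std::array<std::size_t, N>& sizes,
                              std::size_t consolidated_offset) {
  std::size_t index = 0;
  while (index + 1 < N && consolidated_offset >= sizes[index]) {
    consolidated_offset -= sizes[index];
    ++index;
  }
  return ItemPosition{index, consolidated_offset};
}

// Three adjacent memory items of 1, 1 and 4 bytes seen as one 6-byte item.
constexpr std::array<std::size_t, 3> kExampleSizes = {1, 1, 4};
```

With these sizes, consolidated offset 3 falls in the third item (index 2) at offset 1.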
After the blob has been ‘consolidated’, it is given to the BlobTransportController. This class:
The transport controller also tries to keep the renderer alive while we are sending blobs, because we would lose any pending blob data if the renderer were closed. It does this by incrementing and decrementing the process reference count, which should prevent fast shutdown.
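
The keep-alive idea can be sketched as a reference count that blocks fast shutdown while any transfer is in flight. Everything here is hypothetical: `FakeProcessRefCount` and `ScopedTransferKeepAlive` are illustrative names, and wrapping the increment and decrement in an RAII guard is a choice made for this sketch rather than a description of the actual controller.

```cpp
#include <cassert>

// Hypothetical stand-in for the renderer process reference count.
class FakeProcessRefCount {
 public:
  void Increment() { ++count_; }
  void Decrement() {
    assert(count_ > 0);
    --count_;
  }
  // Fast shutdown is only allowed when nobody holds a reference.
  bool CanFastShutdown() const { return count_ == 0; }

 private:
  int count_ = 0;
};

// Holds one reference for as long as a blob transfer is in flight.
class ScopedTransferKeepAlive {
 public:
  explicit ScopedTransferKeepAlive(FakeProcessRefCount& refs) : refs_(refs) {
    refs_.Increment();
  }
  ~ScopedTransferKeepAlive() { refs_.Decrement(); }
  ScopedTransferKeepAlive(const ScopedTransferKeepAlive&) = delete;
  ScopedTransferKeepAlive& operator=(const ScopedTransferKeepAlive&) = delete;

 private:
  FakeProcessRefCount& refs_;
};
```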
The browser side is a little more complicated. We are thinking about:
We follow this general flow for constructing a blob on the browser side:
Note: The transportation sections (steps 1, 2, 3) of this process are described (without accounting for blob dependencies) with diagrams and details in this presentation.
The `BlobTransportHost` is in charge of the actual transportation of the data from the renderer to the browser. When the initial description of the blob is sent to the browser, the `BlobTransportHost` asks the `BlobMemoryController` which strategy (IPC, Shared Memory, or File) it should use to transport the data. Based on this strategy it can translate the memory items sent from the renderer into a browser representation to facilitate the transportation. See this slide, which illustrates how the browser might segment or split up the renderer's memory into transportable chunks.
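
A strategy decision of this shape could look like the following sketch. The selection rule and both thresholds are assumptions made for illustration (small blobs over IPC, blobs that fit the remaining memory quota over shared memory, everything else to a file); the text above only names the three strategies. `ChooseStrategy` and `ChunkCount` are hypothetical helpers.

```cpp
#include <cstdint>

enum class TransportStrategy { kIpc, kSharedMemory, kFile };

// Assumed rule: IPC for small blobs, shared memory for blobs that fit in the
// remaining in-memory quota, and a file for anything larger.
constexpr TransportStrategy ChooseStrategy(std::uint64_t blob_bytes,
                                           std::uint64_t max_ipc_bytes,
                                           std::uint64_t free_memory_quota) {
  return blob_bytes <= max_ipc_bytes       ? TransportStrategy::kIpc
         : blob_bytes <= free_memory_quota ? TransportStrategy::kSharedMemory
                                           : TransportStrategy::kFile;
}

// Number of fixed-size chunks needed to move `bytes` (rounded up).
constexpr std::uint64_t ChunkCount(std::uint64_t bytes,
                                   std::uint64_t chunk_bytes) {
  return chunk_bytes == 0 ? 0 : (bytes + chunk_bytes - 1) / chunk_bytes;
}
```

The chunk count is what determines how many transportable segments the renderer's memory is split into for a given chunk size.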
Once the transport host decides its strategy, it will create its own transport state for the blob, including a `BlobDataBuilder` using the transport's data segment representation. Then it will tell the `BlobStorageContext` that it is ready to build the blob.
When the `BlobStorageContext` tells the transport host that it is ready to transport the blob data, the transport host requests all of the data from the renderer, populates the data in the `BlobDataBuilder`, and then signals the storage context that it is done.
The `BlobStorageContext` is the hub of the blob storage system. It is responsible for creating & managing all the state of constructing blobs, as well as all blob handle generation and general blob status access.
When a `BlobDataBuilder` is given to the context, whether from the `BlobTransportHost` or from elsewhere, the context will do the following:
* Ask the `BlobMemoryController` for file or memory quota for the transportation, if necessary.
* Tell the `BlobTransportHost` to begin transporting the data.
* Ask the `BlobMemoryController` for memory quota for any copies necessary for blob slicing.

When all of the following conditions are met:

* the `BlobTransportHost` tells us it has transported all the data (or we don't need to transport data), and
* the `BlobMemoryController` approves our memory quota for slice copies (or we don't need slice copies),

the blob can finish constructing: any pending blob slice copies are performed and we set the status of the blob.
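
The completion check can be written as a single predicate. This sketch covers only the two conditions spelled out above; `BuildState` and `CanFinishConstruction` are hypothetical names.

```cpp
// Hypothetical bookkeeping for one blob under construction.
struct BuildState {
  bool needs_transport;     // data still has to come from the renderer
  bool transport_complete;  // transport host reported all data transported
  bool needs_slice_copies;  // slicing requires copying some bytes
  bool copy_quota_granted;  // memory quota for those copies was approved
};

// Each condition is satisfied either by completing the step or by not
// needing the step at all.
constexpr bool CanFinishConstruction(BuildState s) {
  return (!s.needs_transport || s.transport_complete) &&
         (!s.needs_slice_copies || s.copy_quota_granted);
}
```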
The `BlobStatus` tracks the construction procedure (specifically the transport process). The copy memory quota and dependent blob steps are both covered by `PENDING_REFERENCED_BLOBS`.
Once a blob has finished constructing, the status is set to `DONE` or one of the `ERR_*` values.
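
A status enum along these lines can be classified with a few helpers. This is a sketch, not the real `BlobStatus`: only `DONE`, the `ERR_*` family, and `PENDING_REFERENCED_BLOBS` are named in the text above, while the specific `ERR_*` names and `PENDING_TRANSPORT` are placeholders chosen for illustration.

```cpp
// Sketch only: the individual ERR_* names and PENDING_TRANSPORT are
// placeholders, not confirmed values of the real enum.
enum class SketchBlobStatus {
  ERR_OUT_OF_MEMORY,
  ERR_FILE_WRITE_FAILED,
  DONE,
  PENDING_TRANSPORT,
  PENDING_REFERENCED_BLOBS,
};

constexpr bool IsError(SketchBlobStatus s) {
  return s == SketchBlobStatus::ERR_OUT_OF_MEMORY ||
         s == SketchBlobStatus::ERR_FILE_WRITE_FAILED;
}

constexpr bool IsPending(SketchBlobStatus s) {
  return s == SketchBlobStatus::PENDING_TRANSPORT ||
         s == SketchBlobStatus::PENDING_REFERENCED_BLOBS;
}

// Construction has ended once the status is DONE or an error.
constexpr bool ConstructionEnded(SketchBlobStatus s) {
  return s == SketchBlobStatus::DONE || IsError(s);
}
```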
During construction, slices are created for dependent blobs using the given offset and size of the reference. This slice consists of the relevant blob items, and metadata about possible copies from either end. If blob items can entirely be used by the new blob, then we just share the item between the blobs. But if there is a 'slice' of the first or last item, then our resulting BlobSlice representation will create a new bytes item for the new blob, and store necessary copy data for later.
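
The share-or-copy decision can be sketched by walking the source items and comparing each one against the requested range. This is an illustration under assumptions: it treats every source item as a bytes item, assumes the range lies within the source blob, and uses the hypothetical names `SlicePlan`, `PlanSlice`, and `kExampleItems`.

```cpp
#include <array>
#include <cstddef>

// Summary of how a slice maps onto the source blob's items.
struct SlicePlan {
  std::size_t shared_items;  // items covered entirely, reused as-is
  std::size_t copied_bytes;  // bytes from partially covered end items
};

// Plans the slice [offset, offset + size) over items of the given sizes.
template <std::size_t N>
constexpr SlicePlan PlanSlice(const std::array<std::size_t, N>& item_sizes,
                              std::size_t offset, std::size_t size) {
  SlicePlan plan{0, 0};
  const std::size_t slice_end = offset + size;
  std::size_t item_start = 0;
  for (std::size_t i = 0; i < N; ++i) {
    const std::size_t item_end = item_start + item_sizes[i];
    const std::size_t begin = item_start > offset ? item_start : offset;
    const std::size_t end = item_end < slice_end ? item_end : slice_end;
    if (begin < end) {  // this item overlaps the slice
      if (end - begin == item_sizes[i]) {
        ++plan.shared_items;  // fully covered: share the item
      } else {
        plan.copied_bytes += end - begin;  // partial: needs a new bytes item
      }
    }
    item_start = item_end;
  }
  return plan;
}

// A source blob made of three items of 10, 20 and 30 bytes.
constexpr std::array<std::size_t, 3> kExampleItems = {10, 20, 30};
```

Slicing bytes 5 through 44 of this blob shares the middle item and needs 20 copied bytes: 5 from the end of the first item and 15 from the start of the last.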
The `BlobFlattener` takes the new blob description (including blob references), creates blob slices for all the referenced blobs, and constructs a 'flat' representation of the new blob, where all blob references are replaced with the `BlobSlice` items. It also stores any copy data from the slices.
The `BlobMemoryController` is responsible for: