IndexedDB (IDB) Overview for Database People

This document is intended to help people with some database experience quickly understand IndexedDB. In the interest of brevity, non-essential quirks and edge cases are not covered. The IDB Specification covers all the details, and is good follow-up reading.

IDB's storage hierarchy is as follows:

An origin (e.g., https://localhost:8080) owns multiple databases.
A database owns multiple object stores (similar to SQL tables).
An object store is a collection of entries (similar to SQL records), which are essentially (primary key, value) pairs.
An object store owns multiple indexes.
An index is a collection of index entries, which are essentially (index key, primary key) pairs.

IndexedDB storage hierarchy overview

This image embeds the data needed for editing using draw.io.

IDB keys (both primary keys and index keys) are a fairly restricted subset of JavaScript's supported types. There are no index-level type constraints, so the same index may hold many types of values (i.e. string keys and number keys side by side). The IDB spec defines a single strict ordering relationship over all possible IDB keys. This implies that the relationship covers all key types (e.g., numbers < dates < strings < binary keys < arrays). This also implies no custom orderings (e.g., no collations).

Conceptually, IDB object store entries are sorted according to their primary keys. Like in SQL, a object store's primary keys have to be unique.

IDB indexes are conceptually sorted sets, i.e. index entries are sorted by (index key, primary key). It is possible for an index to have multiple entries with the same index key. Unlike SQL indexes, an IDB index can contain multiple entries with the same primary key - multiEntry indexes map each element inside an IDB value array to an index entry.

IDB values are a fairly large subset of JavaScript's supported types. There are no object store-level constraints on value types, so the values in a store can be pretty heterogeneous. By contrast, SQL tables have schemas which heavily constrain the rows.

From a database perspective, IDB values can mostly be thought of as black boxes. The two caveats are:

IDB values can contain Blobs, which are serialized separately (at least by Chromium). The IDB backend must track the Blobs associated with a value.
IDB primary keys can be extracted from IDB values. IDB index keys are always extracted from IDB values. This means that an IDB backend that only logs object store operations must be able to extract keys from IDB values when the log is replayed. (we don't do that in Chromium) The key extraction algorithm is non-trivial, see keyPath and multiEntry indexes.

IDB has transactions. Each transaction operates on a fixed set of object stores that is declared when the transaction is started. This means IDB implementations can use a simple scheme based on read-write locks, and don't have to worry about deadlocks.

IDB transactions have the following modes:

read, readwrite - fairly obvious
versionchange - The only type that allows schema changes (the IDB equivalent of DDL queries). Conceptually, versionchange transactions get a write-lock on all the object stores in a database, so each versionchange transaction gets exclusive database access.

All IDB queries are scoped to a single object store (i.e., no equivalent of SQL joins). IDB supports single-key CRUD (get, put, delete). IDB also supports range-based retrieval (get, getAll) and iteration (like cursors used by many RDBMS implementations). Iteration follows the implicit ordering of the source (object store or index), but can either go in the forward direction / ascending order, or in the backward direction / descending order. There is no equivalent of SQL's ORDER BY / GROUP BY. LIMIT can be implemented using cursors, but OFFSET cannot.

An IDB transaction can have multiple requests (a rough equivalent of SQL queries) issued against a transaction at the same time. Queries are executed asynchronously. The results must obey sequential consistency based on the order in which queries were issued. The order in which results are delivered to JavaScript must match the query issuing order.

IDB transactions auto-close when all requests have been completed and no more requests are made. The details are connected to the JavaScript event model, are a bit subtle, and don‘t matter too much from a DBMS perspective. Transactions are auto-aborted when a request fails (e.g. due to a constraint violation), but the auto-abort can be prevented (by responding to the request’s error event).

Each Web application has one main JavaScript thread (per browser tab) and possibly multiple worker threads. Each thread can obtain an arbitrary number of connections to the IDB database (open it), so it is possible for a database to be accessed from multiple Web application threads. (which may be hosted by separate renderer processes in Chrome) However, IDB transactions are not cloneable or transferrable (i.e. cannot be passed around using postMessage), so each transaction belongs to exactly one thread for its entire lifetime. So, all the requests in a transaction must be issued by the same JavaScript thread, which means that the query issuing order mentioned above is well-defined.