tree: 73d15da3a06a4bf685fe7ceb7588c0ce76fcd803 [path history] [tgz]
  1. cache_entry_key.cc
  2. cache_entry_key.h
  3. cache_entry_key_unittest.cc
  4. DEPS
  5. eviction_candidate_aggregator.cc
  6. eviction_candidate_aggregator.h
  7. eviction_candidate_aggregator_unittest.cc
  8. exclusive_operation_coordinator.cc
  9. exclusive_operation_coordinator.h
  10. exclusive_operation_coordinator_unittest.cc
  11. indexed_pair_set.h
  12. indexed_pair_set_unittest.cc
  13. README.md
  14. sql_backend_constants.h
  15. sql_backend_impl.cc
  16. sql_backend_impl.h
  17. sql_backend_impl_unittest.cc
  18. sql_entry_impl.cc
  19. sql_entry_impl.h
  20. sql_persistent_store.cc
  21. sql_persistent_store.h
  22. sql_persistent_store_backend.cc
  23. sql_persistent_store_backend.h
  24. sql_persistent_store_backend_shard.cc
  25. sql_persistent_store_backend_shard.h
  26. sql_persistent_store_in_memory_index.cc
  27. sql_persistent_store_in_memory_index.h
  28. sql_persistent_store_in_memory_index_unittest.cc
  29. sql_persistent_store_queries.h
  30. sql_persistent_store_queries_unittest.cc
  31. sql_persistent_store_unittest.cc
net/disk_cache/sql/README.md

SQL disk cache backend

This directory contains an experimental SQL-based implementation of the disk cache (disk_cache::Backend). It uses SQLite to store cache entries and is designed to be more robust and performant than the default cache backends (block file backend on Windows and simple cache backend on other OSes), especially in scenarios with a large number of small entries.

The implementation is sharded to improve concurrency, with each shard managing its own SQLite database file.

Key Classes and Components

Core Backend Logic

  • SqlBackendImpl: This is the main entry point and public-facing class for the SQL-based disk cache. It implements the disk_cache::Backend interface, handling requests from the network stack to open, create, doom, and enumerate cache entries. It owns and coordinates the SqlPersistentStore.

  • SqlEntryImpl: Implements the disk_cache::Entry interface, representing a single entry in the cache. It manages the data streams (header and body) and metadata (last_used time, key, etc.) for an entry. All I/O operations are passed to the SqlBackendImpl, which then delegates them to the persistence layer.

Persistence Layer

  • SqlPersistentStore: This class serves as the primary interface to the persistence layer, abstracting away the details of database sharding and asynchronous operations. It owns multiple BackendShard instances and distributes operations among them based on the cache entry's key.

  • SqlPersistentStore::BackendShard: Manages a single shard of the cache. Each shard has its own SQLite database file, a dedicated background task runner for database operations, and an in-memory index of its entries. It owns a SequenceBound<SqlPersistentStore::Backend>.

  • SqlPersistentStore::Backend: This class encapsulates all direct interactions with a single SQLite database. It runs entirely on a background sequence to avoid blocking the main network thread. Its responsibilities include:

    • Executing all SQL queries (CREATE, READ, UPDATE, DELETE).
    • Managing the database schema and transactions.
    • Handling database initialization and error recovery.

Data Structures and Utilities

  • CacheEntryKey: A memory-efficient wrapper for the cache key string. Since cache keys can be long and are stored in multiple in-memory data structures, this class uses a scoped_refptr<base::RefCountedString> to share the underlying string data, reducing memory overhead.

  • SqlPersistentStoreInMemoryIndex: An in-memory index that maps a hash of a CacheEntryKey to its ResId (the primary key in the database). This allows for fast, synchronous checks to see if an entry is likely to exist in the cache without needing to perform a slow, asynchronous database query. It is highly optimized for memory usage.

  • ExclusiveOperationCoordinator: A synchronization primitive that serializes access to resources. It ensures that “exclusive” operations (like cache-wide eviction or cleanup) do not run concurrently with “normal” operation (like reading or writing a single entry). Normal operations on different cache keys can run in parallel.

  • EvictionCandidateAggregator: A thread-safe helper class used during cache eviction. Each shard independently generates a list of its least recently used entries as eviction candidates. This class aggregates these lists from all shards, performs a final sort, and selects the global set of entries to be evicted to bring the cache size back under its limit.

  • InFlightEntryModification: A mechanism to queue metadata updates (last_used time, headers, body size) for a cache entry that is not currently active (i.e., not held open as a SqlEntryImpl object). When an operation modifies an entry, it records the change as an in-flight modification. If the entry is opened again before the background database write completes, these queued modifications are applied to the entry's data as it is read from disk. This ensures that the in-memory representation of an entry is always consistent with pending operations, even with fully asynchronous database writes.

How It Works

  1. Initialization: SqlBackendImpl creates a SqlPersistentStore, which in turn creates a number of BackendShard instances (e.g., 3), each with its own background task runner. Each shard initializes its SQLite database.

  2. Entry Operations (Create/Open):

    • A request to open or create an entry arrives at SqlBackendImpl.
    • The entry's key is hashed to determine which BackendShard is responsible for it.
    • The operation is posted to the shard's background task runner.
    • The SqlPersistentStore::Backend for that shard executes the necessary SQL commands to find or create the entry in its database.
    • The result (a SqlEntryImpl or an error) is returned to the main thread via a callback.
  3. Data I/O: Reading and writing data to a SqlEntryImpl follows a similar pattern, with operations being posted to the appropriate shard's background task runner.

  4. Eviction:

    • When the cache size exceeds a certain threshold, SqlBackendImpl initiates eviction.
    • It posts an exclusive “start eviction” task to the SqlPersistentStore.
    • Each shard queries its database for a list of its least recently used entries.
    • The EvictionCandidateAggregator collects these lists, selects the entries to be removed, and sends the list of doomed entries back to each shard to be deleted from the database.
  5. Coordination: The ExclusiveOperationCoordinator ensures that operations like eviction, which affect the entire cache, do not conflict with ongoing reads and writes to individual entries. When an exclusive operation is requested, the coordinator waits for all active normal operations to complete, runs the exclusive operation, and then resumes queued normal operations.

Database Schema

Each shard of the SQL disk cache uses a SQLite database with the following schema:

Tables

  • resources: Stores the main metadata for each cache entry.

    • res_id (INTEGER, PRIMARY KEY AUTOINCREMENT): Unique ID for the resource.
    • last_used (INTEGER): Timestamp for LRU.
    • body_end (INTEGER): End offset of the body.
    • bytes_usage (INTEGER): Total bytes consumed by the entry.
    • doomed (INTEGER): Flag for entries pending deletion (0 for live, 1 for doomed).
    • check_sum (INTEGER): The checksum crc32(head + cache_key_hash).
    • cache_key_hash (INTEGER): The hash of cache_key.
    • cache_key (TEXT): The full cache key string.
    • head (BLOB): Serialized response headers.
  • blobs: Stores the data chunks of the cached body.

    • blob_id (INTEGER, PRIMARY KEY AUTOINCREMENT): Unique ID for the blob.
    • res_id (INTEGER): Foreign key to resources.res_id.
    • start (INTEGER): Start offset of this blob chunk.
    • end (INTEGER): End offset of this blob chunk.
    • check_sum (INTEGER): The checksum crc32(blob + cache_key_hash).
    • blob (BLOB): The actual data chunk.

Indexes

  • index_resources_cache_key_hash_doomed: ON resources(cache_key_hash,doomed)

    • Speeds up lookups for live entries (doomed=0) by cache_key_hash. Crucial for OpenEntry and similar operations.
  • index_live_resources_last_used_bytes_usage: ON resources(last_used, bytes_usage) WHERE doomed=0

    • A covering index on last_used and bytes_usage for live entries. Essential for efficient eviction logic, which targets the least recently used entries without needing to access the resources table directly.
  • index_blobs_res_id_start: (UNIQUE) ON blobs(res_id, start)

    • A unique index on (res_id, start) in the blobs table. Ensures quick retrieval of data blobs for a given entry at a specific offset and maintains data integrity by preventing overlapping blobs for the same entry.