
SDCH

“SDCH” stands for “Shared Dictionary Compression over HTTP”. It is a protocol for compressing URL responses that applies when the server and the client share a dictionary that both can refer to for compression/encoding and decompression/decoding. The details of the SDCH protocol are given in the spec (soon to be moved to GitHub), but in brief:

  1. If the client supports SDCH decoding, it advertises “sdch” in the “Accept-Encoding” header.
  2. If the server could have encoded a response with a dictionary (but didn't, because the client didn't have the dictionary), it includes an advisory “Get-Dictionary: ” header in its response.
  3. If the client has a dictionary that the server has previously advertised as being usable for encoding a particular request, it advertises that dictionary as being available via an “Avail-Dictionary: ” header in the request.
  4. If the server chooses to encode a response with a dictionary, it includes “sdch” in a “Content-Encoding” header, in which case the body will reference the dictionary to be used for decoding (which must be one the client advertised in the original request). Encodings may be chained; often responses are SDCH encoded, and then gzip encoded.
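The four steps above can be sketched from the client's side as follows. This is a minimal illustration, not Chromium's code: the `Headers` map, the function names, the exact "Accept-Encoding" value, and the comma-separated form of "Avail-Dictionary" are assumptions of the sketch, and header names are matched with exact case for brevity.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical, simplified header map: header name -> value.
using Headers = std::map<std::string, std::string>;

// Steps 1 and 3: build the SDCH-related request headers. `usable_ids` holds
// identifiers of dictionaries the client already has and that the server has
// advertised as usable for this request (an assumption of this sketch).
Headers BuildRequestHeaders(const std::set<std::string>& usable_ids) {
  Headers headers;
  headers["Accept-Encoding"] = "gzip, sdch";
  std::string avail;
  for (const std::string& id : usable_ids) {
    if (!avail.empty()) avail += ",";
    avail += id;
  }
  // Only advertise dictionaries when there is at least one to advertise.
  if (!avail.empty()) headers["Avail-Dictionary"] = avail;
  return headers;
}

// Step 2: returns the advisory dictionary location from the response, or an
// empty string if the server did not send "Get-Dictionary".
std::string DictionaryToFetch(const Headers& response) {
  auto it = response.find("Get-Dictionary");
  return it == response.end() ? std::string() : it->second;
}

// Step 4: true if "Content-Encoding" mentions sdch, in which case the body
// must be SDCH-decoded (after undoing any outer encoding such as gzip).
// A substring match is a simplification; a real parser would split the list.
bool IsSdchEncoded(const Headers& response) {
  auto it = response.find("Content-Encoding");
  return it != response.end() && it->second.find("sdch") != std::string::npos;
}
```

A request built with no usable dictionaries carries only “Accept-Encoding”; the “Avail-Dictionary” header appears once a dictionary has been fetched in response to an earlier “Get-Dictionary”.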

SDCH in Chromium: Overview

The SDCH implementation in Chromium is spread across several classes in several different directories:

  • SdchManager (in net/base): This class contains all dictionaries currently known to Chromium. Each URLRequestContext points to an SdchManager; at the chrome/ level, there is one SdchManager per profile. URLRequestHttpJob consults the SdchManager for what dictionaries should be advertised with a URLRequest, and notifies the SdchManager whenever it sees a “Get-Dictionary” header. The SdchManager does not mediate fetching of dictionaries; it is conceptually layered underneath URLRequest and has no knowledge of URLRequests. There are several nested classes of SdchManager (Dictionary, DictionarySet) used in the SDCH implementation; see sdch_manager.h for details.
  • SdchObserver (in net/base): This is an abstract base class which other classes may implement if they wish to receive notifications about SDCH events. Such classes should also register as observers with the SdchManager.
  • SdchFilter (in net/filter): This class is derived from net::Filter and is used for decoding SDCH responses; it cooperates with SdchManager and the URLRequestJob to decode SDCH encoded responses.
  • SdchDictionaryFetcher (in net/url_request): This class implements the nuts and bolts of fetching an SDCH dictionary.
  • SdchOwner (in net/sdch): This class is an SdchObserver. It contains policy for the SDCH implementation, including mediation of fetching dictionaries, prioritization and eviction of dictionaries in response to new fetches, and constraints on the amount of memory that is usable by SDCH dictionaries. It initiates dictionary fetches as appropriate when it receives notification of a “Get-Dictionary” header from the SdchManager.

A net/ embedder should instantiate an SdchManager and an SdchOwner, and guarantee that the SdchManager outlive the SdchOwner.
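The observer relationship and the lifetime requirement above can be illustrated with toy stand-ins. The `Toy*` classes and their methods below are hypothetical and much simpler than the real interfaces (see sdch_manager.h and sdch_owner.h for those); the sketch only shows why the manager has to outlive the owner and one way an embedder could arrange that.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for an observer interface (hypothetical).
class ToyObserver {
 public:
  virtual ~ToyObserver() = default;
  virtual void OnGetDictionary(const std::string& dictionary_url) = 0;
};

// Toy stand-in for the manager: it only tracks observers and forwards events.
class ToyManager {
 public:
  void AddObserver(ToyObserver* o) { observers_.push_back(o); }
  void RemoveObserver(ToyObserver* o) {
    observers_.erase(std::remove(observers_.begin(), observers_.end(), o),
                     observers_.end());
  }
  // Called by the request layer when it sees a "Get-Dictionary" header.
  void NotifyGetDictionary(const std::string& url) {
    for (ToyObserver* o : observers_) o->OnGetDictionary(url);
  }
  std::size_t observer_count() const { return observers_.size(); }

 private:
  std::vector<ToyObserver*> observers_;  // Not owned.
};

// Toy stand-in for the owner: registers itself on construction and
// unregisters on destruction, which touches the manager both times.
class ToyOwner : public ToyObserver {
 public:
  explicit ToyOwner(ToyManager* manager) : manager_(manager) {
    manager_->AddObserver(this);
  }
  ~ToyOwner() override { manager_->RemoveObserver(this); }

  void OnGetDictionary(const std::string& url) override {
    pending_fetches_.push_back(url);  // A real owner would apply policy here.
  }
  const std::vector<std::string>& pending_fetches() const {
    return pending_fetches_;
  }

 private:
  ToyManager* manager_;  // Not owned; must outlive this object.
  std::vector<std::string> pending_fetches_;
};

// Members are destroyed in reverse declaration order, so declaring the
// manager first guarantees it is still alive when the owner is destroyed.
struct ToyEmbedder {
  ToyManager manager;
  ToyOwner owner{&manager};
};
```

Because the owner's destructor calls back into the manager, destroying the manager first would leave the owner with a dangling pointer; the member ordering in `ToyEmbedder` avoids that without any explicit teardown code.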

Note the layering of the above classes:

  1. The SdchManager class has no knowledge of URLRequests. URLRequest is dependent on that class, not the reverse.
  2. SdchDictionaryFetcher is dependent on URLRequest, but is still a utility class exported by the net/ library for use by higher levels.
  3. SdchOwner manages the entire system on behalf of the embedder. The intent is that the embedder can change policies through methods on SdchOwner, while letting the SdchOwner class take care of policy implementation.
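As an illustration of the kind of policy that lives in SdchOwner, here is a sketch of a dictionary store with a byte budget. This README does not describe the actual eviction order, so the least-recently-used rule below is purely an assumption for illustration, and `ToyDictionary` and `ToyDictionaryStore` are hypothetical names, not Chromium classes.

```cpp
#include <cstddef>
#include <list>
#include <string>

struct ToyDictionary {
  std::string hash;        // Identifier of the dictionary.
  std::size_t size_bytes;  // Memory it occupies.
};

class ToyDictionaryStore {
 public:
  explicit ToyDictionaryStore(std::size_t max_bytes) : max_bytes_(max_bytes) {}

  // Adds a dictionary, evicting least recently used entries until it fits.
  // Returns false if the dictionary alone exceeds the budget.
  // Duplicate hashes are not checked in this sketch.
  bool Add(const ToyDictionary& dict) {
    if (dict.size_bytes > max_bytes_) return false;
    while (total_bytes_ + dict.size_bytes > max_bytes_) {
      total_bytes_ -= entries_.back().size_bytes;
      entries_.pop_back();
    }
    entries_.push_front(dict);
    total_bytes_ += dict.size_bytes;
    return true;
  }

  // Marks a dictionary as recently used. Returns false if it is not stored.
  bool Touch(const std::string& hash) {
    for (auto it = entries_.begin(); it != entries_.end(); ++it) {
      if (it->hash == hash) {
        entries_.splice(entries_.begin(), entries_, it);
        return true;
      }
    }
    return false;
  }

  bool Contains(const std::string& hash) const {
    for (const ToyDictionary& d : entries_) {
      if (d.hash == hash) return true;
    }
    return false;
  }

  std::size_t total_bytes() const { return total_bytes_; }

 private:
  std::size_t max_bytes_;
  std::size_t total_bytes_ = 0;
  std::list<ToyDictionary> entries_;  // Front = most recently used.
};
```

An embedder-facing policy knob would then amount to something like changing `max_bytes_`, while the store itself decides which dictionaries to drop.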

SDCH in Chromium: Debugging

Data that is useful in debugging SDCH problems:

  • The SDCH UMA prefix is “Sdch3”, and histograms that have been found useful for debugging include
    • ProblemCodes_* (though this requires trawling the source for each bucket).
    • ResponseCorruptionDetection.{Cached,Uncached}: An attempt to make sense of the twisted mess in SdchFilter::ReadFilteredData described under “Gotchas and corner cases” below.
    • BlacklistReason: Why requests avoid using SDCH when they could use it.
  • about:net-internals has an SDCH tab, showing loaded dictionaries and other information. Searching in net-internals for “Get-Dictionary”, the URLRequest that actually fetches that dictionary, and then the hash of that dictionary (often used as the file name) can also be useful.

SDCH in Chromium: Gotchas and corner cases

There are several known issues in SDCH in Chromium that developers in this space should be aware of:

  • As noted in the spec above, there have historically been problems with middleboxes stripping or corrupting SDCH encoded responses. For this reason, the protocol requires that if a server does not SDCH encode a response after previously advertising that it could, it includes an “X-SDCH-Encode: 0” header in that response. Servers don't always do this (especially multi-servers), and that can result in failed decodings and requests being dropped on the floor. The code to handle this is a twisted mess (see SdchFilter::ReadFilteredData()) and problems have often been seen from or associated with it.
  • If the decoding logic trips over a problem, it will often blacklist the server in question, temporarily (if it can recover that request) or permanently (if it can't). This can lead to a mysterious lack of SDCH encoding when it's expected to be present.
  • The network cache currently stores the response precisely as received from the network. This means that requests that don't advertise SDCH may get a cached value that is SDCH encoded, and requests that do advertise SDCH may get a cached value that is not SDCH encoded. The second case is handled transparently, but the first case may lead to request failure.
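The cache behavior in the last item above comes down to two independent facts about a request and the stored response. The sketch below tabulates the combinations; `ToyExchange`, `Outcome`, and `Classify` are hypothetical names used only for illustration and do not correspond to code in the cache or in SdchFilter.

```cpp
// Possible results of serving a stored response to a request.
enum class Outcome {
  kPassThrough,    // Body is not SDCH encoded; nothing to decode.
  kDecodeSdch,     // Body is SDCH encoded and the request advertised SDCH.
  kLikelyFailure,  // Body is SDCH encoded but the request did not advertise it.
};

struct ToyExchange {
  bool request_advertised_sdch;  // Request sent "sdch" in Accept-Encoding.
  bool body_is_sdch_encoded;     // Stored response was SDCH encoded.
};

Outcome Classify(const ToyExchange& e) {
  // An unencoded body is fine whether or not SDCH was advertised; this covers
  // the case described above as being handled transparently.
  if (!e.body_is_sdch_encoded) return Outcome::kPassThrough;
  // An encoded body is only expected when the request advertised SDCH.
  return e.request_advertised_sdch ? Outcome::kDecodeSdch
                                   : Outcome::kLikelyFailure;
}
```

Only one of the four combinations is problematic: an SDCH encoded cache entry served to a request that never advertised SDCH.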