filter/dscache/doc.go - infra/luci/gae - Git at Google

 // Copyright 2015 The LUCI Authors. All rights reserved.
 // Use of this source code is governed under the Apache License, Version 2.0
 // that can be found in the LICENSE file.

 // Package dscache provides a transparent cache for RawDatastore which is
 // backed by Memcache.
 //
 // Inspiration
 //
 // Although this is not a port of any particular implementation, it takes
 // inspiration from these fine libraries:
 //   - https://cloud.google.com/appengine/docs/python/ndb/
 //   - https://github.com/qedus/nds
 //   - https://github.com/mjibson/goon
 //
 // Algorithm
 //
 // Memcache contains cache entries for single datastore entities. The memcache
 // key looks like
 //
 //   "gae:" | vers | ":" | shard | ":" | Base64_std_nopad(SHA1(datastore.Key))
 //
 // Where:
 //   - vers is an ascii-hex-encoded number (currently 1).
 //   - shard is a zero-based ascii-hex-encoded number (depends on shardsForKey).
 //   - SHA1 has been chosen as unlikely (p == 1e-18) to collide, given dedicated
 //     memcache sizes of up to 170 Exabytes (assuming an average entry size of
 //     100KB including the memcache key). This is clearly overkill, but MD5
 //     could start showing collisions at this probability in as small as a 26GB
 //     cache (and also MD5 sucks).
 //
 // The memcache value is a compression byte, indicating the scheme (See
 // CompressionType), followed by the encoded (and possibly compressed) value.
 // Encoding is done with datastore.PropertyMap.Write(). The memcache value
 // may also be the empty byte sequence, indicating that this entity is deleted.
 //
 // The memcache entry may also have a 'flags' value set to one of the following:
 //   - 0 "entity" (cached value)
 //   - 1 "lock"   (someone is mutating this entry)
 //
 // Algorithm - Put and Delete
 //
 // On a Put (or Delete), an empty value is unconditionally written to
 // memcache with a LockTimeSeconds expiration (default 31 seconds), and
 // a memcache flag value of 0x1 (indicating that it's a put-locked key). The
 // random value is to preclude Get operations from believing that they possess
 // the lock.
 //
 // NOTE: If this memcache Set fails, it's a HARD ERROR. See DANGER ZONE.
 //
 // The datastore operation will then occur. Assuming success, Put will then
 // unconditionally delete all of the memcache locks. At some point later, a
 // Get will write its own lock, get the value from datastore, and compare and
 // swap to populate the value (detailed below).
 //
 // Algorithm - Get
 //
 // On a Get, "Add" a lock for it (which only does something if there's no entry
 // in memcache yet) with a nonce value. We immediately Get the memcache entries
 // back (for CAS purposes later).
 //
 // If it doesn't exist (unlikely since we just Add'd it) or if its flag is
 // "lock" and the Value != the nonce we put there, go hit the datastore without
 // trying to update memcache.
 //
 // If its flag is "entity", decode the object and return it. If the Value is
 // the empty byte sequence, return ErrNoSuchEntity.
 //
 // If its flag is "lock" and the Value equals the nonce, go get it from the
 // datastore. If that's successful, then encode the value to bytes, and CAS
 // the object to memcache. The CAS will succeed if nothing else touched the
 // memcache in the meantime (like a Put, a memcache expiration/eviction, etc.).
 //
 // Algorithm - Transactions
 //
 // In a transaction, all Put memcache operations are held until the very end of
 // the transaction. Right before the transaction is committed, all accumulated
 // Put memcache items are unconditionally set into memcache.
 //
 // NOTE: If this memcache Set fails, it's a HARD ERROR. See DANGER ZONE.
 //
 // If the transaction is sucessfully committed (err == nil), then all the locks
 // will be deleted.
 //
 // The assumption here is that get operations apply all outstanding
 // transactions before they return data (https://cloud.google.com/appengine/docs/go/datastore/#Go_Datastore_writes_and_data_visibility),
 // and so it is safe to purge all the locks if the transaction is known-good.
 //
 // If the transaction succeeds, but RunInTransaction returns an error (which can
 // happen), or if the transaction fails, then the lock entries time out
 // naturally. This will mean 31-ish seconds of direct datastore access, but it's
 // the more-correct thing to do.
 //
 // Gets and Queries in a transaction pass right through without reading or
 // writing memcache.
 //
 // Cache control
 //
 // An entity may expose the following metadata (see
 // datastore.PropertyLoadSaver.GetMeta) to control the behavior of its cache.
 //
 //   - `gae:"$dscache.enable,<true|false>"` - whether or not this entity should
 //      be cached at all. If ommitted, dscache defaults to true.
 //   - `gae:"$dscache.expiration,#seconds"` - the number of seconds of
 //     persistance to use when this item is cached. 0 is infinite. If omitted,
 //     defaults to 0.
 //
 // In addition, the application may set a function shardsForKey(key) which
 // returns the number of shards to use for a given datastore key. This function
 // is set with the invocation of FilterRDS.
 //
 // Shards have the effect that all write (Put/Delete) operations clear all
 // memcache entries for the given datastore entry, and all reads read (and
 // possibly populate) one of the memcache entries. So if an entity has 4 shards,
 // a datastore Get for it will pull from one of the 4 possible memcache keys
 // at random. This is good for heavily-read, but infrequently updated, entities.
 // The purpose of sharding is to alleviate hot memcache keys, as recommended by
 // https://cloud.google.com/appengine/articles/best-practices-for-app-engine-memcache#distribute-load .
 //
 // Caveats
 //
 // A couple things to note that may differ from other appengine datastore
 // caching libraries (like goon, nds, or ndb).
 //
 //   - It does NOT provide in-memory ("per-request") caching.
 //   - It's INtolerant of some memcache failures, but in exchange will not return
 //     inconsistent results. See DANGER ZONE for details.
 //   - Queries do not interact with the cache at all.
 //   - Negative lookups (e.g. ErrNoSuchEntity) are cached.
 //
 // DANGER ZONE
 //
 // As mentioned in the Put/Delete/Transactions sections above, if the memcache
 // Set fails, that's a HARD ERROR. The reason for this is that otherwise in the
 // event of transient memcache failures, the memcache may be permanently left in
 // an inconsistent state, since there will be nothing to actually ensure that
 // the bad value is flushed from memcache. As long as the Put is allowed to
 // write the lock, then all will be (eventually) well, and so all other memcache
 // operations are best effort.
 //
 // So, if memcache is DOWN, you will effectively see tons of errors in the logs,
 // and all cached datastore access will be essentially degraded to a slow
 // read-only state. At this point, you have essentially 3 mitigration
 // strategies:
 //   - wait for memcache to come back up.
 //   - dynamically disable all memcache access by writing the datastore entry:
 //       /dscache,1 = {"Enable": false}
 //     in the default namespace. This can be done by invoking the
 //     SetDynamicGlobalEnable method. This can take up to 5 minutes to take
 //     effect. If you have very long-running backend requests, you may need to
 //     cycle them to have it take effect. This dynamic bit is read essentially
 //     once per http request (when FilteRDS is called on the context).
 //   - push a new version of the application disabling the cache filter
 //     by setting InstanceEnabledStatic to false in an init() function.
 //
 // On every dscache.FilterRDS invocation, it takes the opportunity to fetch this
 // datastore value, if it hasn't been fetched in the last
 // GlobalEnabledCheckInterval time (5 minutes). This equates to essentially once
 // per http request, per 5 minutes, per instance.
 //
 // AppEngine's memcache reserves the right to evict keys at any moment. This is
 // especially true for shared memcache, which is subject to pressures outside of
 // your application. When when eviction happens due to memory pressure,
 // least-recently-used values are evicted first.
 //
 // https://cloud.google.com/appengine/docs/go/memcache/#Go_How_cached_data_expires
 //
 // Eviction presents a potential race window, as lock items that were put in
 // memcache may be evicted prior to the locked operations completing (or
 // failing), causing concurrent Get operations to cache bad data indefinitely.
 //
 // In practice, a dedicated memcache will be safe. LRU-based eviction means that
 // that locks recently added will almost certainly not be evicted before their
 // operations are complete, and a dedicated memcache lowers eviction pressure to
 // a single application's operation. Production systems that have data integrity
 // requirements are encouraged to use a dedicated memcache.
 //
 // Note that flusing memcache of a running application may also induce this
 // race. Flushes should be performed with this concern in mind.
 //
 // TODO: A potential mitigation to lock eviction poisoning is to use memcache
 // Statistics to identify the oldest memcache item and use that age to bound
 // the lifetime of cached datastore entries. This would cause dscache items
 // created around the time of a flush to expire quickly (instead of never),
 // bounding the period of time when poisoned data may reside in the cache.
 package dscache
	// Copyright 2015 The LUCI Authors. All rights reserved.
	// Use of this source code is governed under the Apache License, Version 2.0
	// that can be found in the LICENSE file.

	// Package dscache provides a transparent cache for RawDatastore which is
	// backed by Memcache.
	//
	// Inspiration
	//
	// Although this is not a port of any particular implementation, it takes
	// inspiration from these fine libraries:
	// - https://cloud.google.com/appengine/docs/python/ndb/
	// - https://github.com/qedus/nds
	// - https://github.com/mjibson/goon
	//
	// Algorithm
	//
	// Memcache contains cache entries for single datastore entities. The memcache
	// key looks like
	//
	// "gae:" \| vers \| ":" \| shard \| ":" \| Base64_std_nopad(SHA1(datastore.Key))
	//
	// Where:
	// - vers is an ascii-hex-encoded number (currently 1).
	// - shard is a zero-based ascii-hex-encoded number (depends on shardsForKey).
	// - SHA1 has been chosen as unlikely (p == 1e-18) to collide, given dedicated
	// memcache sizes of up to 170 Exabytes (assuming an average entry size of
	// 100KB including the memcache key). This is clearly overkill, but MD5
	// could start showing collisions at this probability in as small as a 26GB
	// cache (and also MD5 sucks).
	//
	// The memcache value is a compression byte, indicating the scheme (See
	// CompressionType), followed by the encoded (and possibly compressed) value.
	// Encoding is done with datastore.PropertyMap.Write(). The memcache value
	// may also be the empty byte sequence, indicating that this entity is deleted.
	//
	// The memcache entry may also have a 'flags' value set to one of the following:
	// - 0 "entity" (cached value)
	// - 1 "lock" (someone is mutating this entry)
	//
	// Algorithm - Put and Delete
	//
	// On a Put (or Delete), an empty value is unconditionally written to
	// memcache with a LockTimeSeconds expiration (default 31 seconds), and
	// a memcache flag value of 0x1 (indicating that it's a put-locked key). The
	// random value is to preclude Get operations from believing that they possess
	// the lock.
	//
	// NOTE: If this memcache Set fails, it's a HARD ERROR. See DANGER ZONE.
	//
	// The datastore operation will then occur. Assuming success, Put will then
	// unconditionally delete all of the memcache locks. At some point later, a
	// Get will write its own lock, get the value from datastore, and compare and
	// swap to populate the value (detailed below).
	//
	// Algorithm - Get
	//
	// On a Get, "Add" a lock for it (which only does something if there's no entry
	// in memcache yet) with a nonce value. We immediately Get the memcache entries
	// back (for CAS purposes later).
	//
	// If it doesn't exist (unlikely since we just Add'd it) or if its flag is
	// "lock" and the Value != the nonce we put there, go hit the datastore without
	// trying to update memcache.
	//
	// If its flag is "entity", decode the object and return it. If the Value is
	// the empty byte sequence, return ErrNoSuchEntity.
	//
	// If its flag is "lock" and the Value equals the nonce, go get it from the
	// datastore. If that's successful, then encode the value to bytes, and CAS
	// the object to memcache. The CAS will succeed if nothing else touched the
	// memcache in the meantime (like a Put, a memcache expiration/eviction, etc.).
	//
	// Algorithm - Transactions
	//
	// In a transaction, all Put memcache operations are held until the very end of
	// the transaction. Right before the transaction is committed, all accumulated
	// Put memcache items are unconditionally set into memcache.
	//
	// NOTE: If this memcache Set fails, it's a HARD ERROR. See DANGER ZONE.
	//
	// If the transaction is sucessfully committed (err == nil), then all the locks
	// will be deleted.
	//
	// The assumption here is that get operations apply all outstanding
	// transactions before they return data (https://cloud.google.com/appengine/docs/go/datastore/#Go_Datastore_writes_and_data_visibility),
	// and so it is safe to purge all the locks if the transaction is known-good.
	//
	// If the transaction succeeds, but RunInTransaction returns an error (which can
	// happen), or if the transaction fails, then the lock entries time out
	// naturally. This will mean 31-ish seconds of direct datastore access, but it's
	// the more-correct thing to do.
	//
	// Gets and Queries in a transaction pass right through without reading or
	// writing memcache.
	//
	// Cache control
	//
	// An entity may expose the following metadata (see
	// datastore.PropertyLoadSaver.GetMeta) to control the behavior of its cache.
	//
	// - `gae:"$dscache.enable,<true\|false>"` - whether or not this entity should
	// be cached at all. If ommitted, dscache defaults to true.
	// - `gae:"$dscache.expiration,#seconds"` - the number of seconds of
	// persistance to use when this item is cached. 0 is infinite. If omitted,
	// defaults to 0.
	//
	// In addition, the application may set a function shardsForKey(key) which
	// returns the number of shards to use for a given datastore key. This function
	// is set with the invocation of FilterRDS.
	//
	// Shards have the effect that all write (Put/Delete) operations clear all
	// memcache entries for the given datastore entry, and all reads read (and
	// possibly populate) one of the memcache entries. So if an entity has 4 shards,
	// a datastore Get for it will pull from one of the 4 possible memcache keys
	// at random. This is good for heavily-read, but infrequently updated, entities.
	// The purpose of sharding is to alleviate hot memcache keys, as recommended by
	// https://cloud.google.com/appengine/articles/best-practices-for-app-engine-memcache#distribute-load .
	//
	// Caveats
	//
	// A couple things to note that may differ from other appengine datastore
	// caching libraries (like goon, nds, or ndb).
	//
	// - It does NOT provide in-memory ("per-request") caching.
	// - It's INtolerant of some memcache failures, but in exchange will not return
	// inconsistent results. See DANGER ZONE for details.
	// - Queries do not interact with the cache at all.
	// - Negative lookups (e.g. ErrNoSuchEntity) are cached.
	//
	// DANGER ZONE
	//
	// As mentioned in the Put/Delete/Transactions sections above, if the memcache
	// Set fails, that's a HARD ERROR. The reason for this is that otherwise in the
	// event of transient memcache failures, the memcache may be permanently left in
	// an inconsistent state, since there will be nothing to actually ensure that
	// the bad value is flushed from memcache. As long as the Put is allowed to
	// write the lock, then all will be (eventually) well, and so all other memcache
	// operations are best effort.
	//
	// So, if memcache is DOWN, you will effectively see tons of errors in the logs,
	// and all cached datastore access will be essentially degraded to a slow
	// read-only state. At this point, you have essentially 3 mitigration
	// strategies:
	// - wait for memcache to come back up.
	// - dynamically disable all memcache access by writing the datastore entry:
	// /dscache,1 = {"Enable": false}
	// in the default namespace. This can be done by invoking the
	// SetDynamicGlobalEnable method. This can take up to 5 minutes to take
	// effect. If you have very long-running backend requests, you may need to
	// cycle them to have it take effect. This dynamic bit is read essentially
	// once per http request (when FilteRDS is called on the context).
	// - push a new version of the application disabling the cache filter
	// by setting InstanceEnabledStatic to false in an init() function.
	//
	// On every dscache.FilterRDS invocation, it takes the opportunity to fetch this
	// datastore value, if it hasn't been fetched in the last
	// GlobalEnabledCheckInterval time (5 minutes). This equates to essentially once
	// per http request, per 5 minutes, per instance.
	//
	// AppEngine's memcache reserves the right to evict keys at any moment. This is
	// especially true for shared memcache, which is subject to pressures outside of
	// your application. When when eviction happens due to memory pressure,
	// least-recently-used values are evicted first.
	//
	// https://cloud.google.com/appengine/docs/go/memcache/#Go_How_cached_data_expires
	//
	// Eviction presents a potential race window, as lock items that were put in
	// memcache may be evicted prior to the locked operations completing (or
	// failing), causing concurrent Get operations to cache bad data indefinitely.
	//
	// In practice, a dedicated memcache will be safe. LRU-based eviction means that
	// that locks recently added will almost certainly not be evicted before their
	// operations are complete, and a dedicated memcache lowers eviction pressure to
	// a single application's operation. Production systems that have data integrity
	// requirements are encouraged to use a dedicated memcache.
	//
	// Note that flusing memcache of a running application may also induce this
	// race. Flushes should be performed with this concern in mind.
	//
	// TODO: A potential mitigation to lock eviction poisoning is to use memcache
	// Statistics to identify the oldest memcache item and use that age to bound
	// the lifetime of cached datastore entries. This would cause dscache items
	// created around the time of a flush to expire quickly (instead of never),
	// bounding the period of time when poisoned data may reside in the cache.
	package dscache