blob: 69bab0272a68231051c01f331f35bdf8dc04beb4 [file] [log] [blame]
  1. dm-bht
  2. ======
  3. dm-bht provides a block hash tree implementation. The use of dm-bht allows
  4. for integrity checking of a given block device without reading the entire
  5. set of blocks into memory before use.
  6. In particular, dm-bht supplies an interface for creating and verifying a tree
  7. of cryptographic digests with any algorithm supported by the kernel crypto API.
  8. The code is meant to be usable from user-space for creation and verification as
  9. well as directly from a Device-Mapper target. The `verity' target is the
  10. motivating example.
  11. Theory of operation
  12. ===================
  13. dm-bht is logically comprised of multiple nodes organized in a tree-like
  14. structure. Each node in the tree is a cryptographic hash. If it is a leaf
  15. node, the hash is of some block data on disk. If it is an intermediary node,
  16. then the hash is of a number of child nodes.
  17. dm-bht has a given depth starting at 1 (ignoring the root node). Each level in
  18. the tree is concretely made up of dm_bht_entry structs. Each entry in the tree
  19. is a collection of neighboring nodes that fit in one page-sized block. The
  20. number is determined based on PAGE_SIZE and the size of the selected
  21. cryptographic digest algorithm. The hashes are linearly ordered in this entry
  22. and any unaligned trailing space is ignored but included when calculating the
  23. parent node.
  24. The tree looks something like:
  25. depth = 2, alg= sha256, num_blocks = 32767
  26. [ root ]
  27. / . . . \
  28. [entry_0] [entry_1]
  29. / . . . \ . . . \
  30. [entry_0_0] . . . [entry_0_127] . . . . [entry_1_127]
  31. / ... \ / . . . \ / \
  32. blk_0 ... blk_127 blk_16256 blk_16383 blk_32640 . . . blk_32767
  33. root is treated independently from the depth and the blocks are expected to
  34. be hashed and supplied to the dm-bht. hash blocks that make up the entry
  35. contents are expected to be read from disk.
  36. dm-bht does not handle I/O directly but instead expects the consumer to
  37. supply callbacks. The read callback will always receive a page-align value
  38. to pass to the block device layer to read in a hash value.
  39. Usage
  40. =====
  41. The API provides mechanisms for reading and verifying a tree as well as
  42. creating and modifying the tree. These two code paths were not meant to be
  43. used in parallel and modify the atomic entry values in incompatible ways.
  44. Where possible, tree creation and modification should be handled independently
  45. from tree verification.
  46. When reading, all required data for the hash tree should be populated for a
  47. block before attempting a verify. This can be done by calling
  48. dm_bht_populate(). When all data is ready, a call to dm_bht_verify_block()
  49. with the expected hash value will perform both the direct block hash check and
  50. the hashes of the parent and neighboring nodes where needed to ensure validity
  51. up to the root hash. Note, dm_bht_set_root_hexdigest() should be called before
  52. any verification attempts occur.
  53. When updating the tree, all block hashes should be stored with
  54. dm_bht_store_block(). Once all hashes are stored, a call to dm_bht_compute()
  55. will initiate a full tree update by walking all of the blocks of hashes
  56. starting at the leaf nodes and computing upward to the root node. On
  57. completion, dm_bht_sync() may be called to write the tree to disk (or wherever
  58. the callback writes to).