| |
| |
| ### Section 7: Boolean Entropy Decoder {#h-07-00} |
| |
| As discussed in the overview above, essentially the entire VP8 data |
| stream is encoded using a boolean entropy coder. |
| |
| An understanding of the `bool_decoder` is critical to the |
| implementation of a VP8 decompressor, so we discuss the `bool_decoder` |
| in detail. It is easier to comprehend the `bool_decoder` in |
| conjunction with the `bool_encoder` used by the compressor to write |
| the compressed data partitions. |
| |
| The `bool_encoder` encodes (and the `bool_decoder` decodes) one bool |
| (zero-or-one boolean value) at a time. Its purpose is to losslessly |
| compress a sequence of bools for which the probability of their being |
| zero or one can be well-estimated (via constant or previously coded |
| information) at the time they are written, using identical |
| corresponding probabilities at the time they are read. |
| |
| As the reader is probably aware, if a bool is much more likely to be |
| zero than one (for instance), it can, on average, be faithfully |
| encoded using much less than one bit per value. The `bool_encoder` |
| exploits this. |
| |
| In the 1940s, [Shannon] proved that there is a lower bound for the |
| average datarate of a faithful encoding of a sequence of bools (whose |
| probability distributions are known and are independent of each |
| other) and also that there are encoding algorithms that approximate |
| this lower bound as closely as one wishes. |
| |
| If we encode a sequence of bools whose probability of being zero is p |
| (and whose probability of being 1 is 1-p), the lowest possible |
| datarate per value is |
| |
| p*log(p) + (1-p)*log(1-p); |
| |
| taking the logarithms to the base 1/2 expresses the datarate in |
| bits/value. |
| |
| We give two simple examples. At one extreme, if p = 1/2, then log(p) |
| = log(1-p) = 1, and the lowest possible datarate per bool is 1/2 + |
| 1/2 = 1; that is, we cannot do any better than simply literally |
| writing out bits. At another extreme, if p is very small, say p = |
| 1/1024, then log(p)=10, log(1-p) is roughly .0014, and the lowest |
| possible datarate is approximately 10/1024 + .0014, roughly 1/100 of |
| a bit per bool. |
| |
| Because most of the bools in the VP8 datastream have zero- |
| probabilities nowhere near 1/2, the compression provided by the |
| `bool_encoder` is critical to the performance of VP8. |
| |
| The boolean coder used by VP8 is a variant of an arithmetic coder. |
| An excellent discussion of arithmetic coding (and other lossless |
| compression techniques) can be found in [Bell]. |
| |