| |
| |
| #### 13.3 Token Probabilities {#h-13-03} |
| |
| |
| The probability specification for the token tree (unlike that for the |
| "extra bits" described above) is rather involved. It uses three |
| pieces of context to index a large probability table, the contents of |
| which may be incrementally modified in the frame header. The full |
| (non-constant) probability table is laid out as follows. |
| |
| |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| Prob coeff_probs [4] [8] [3] [num_dct_tokens-1]; |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| {:lang="c"} |
| |
| |
| Working from the outside in, the outermost dimension is indexed by |
| the type of plane being decoded: |
| |
| * `0` - Y beginning at coefficient 1 (i.e., Y after Y2) |
| |
| * `1` - Y2 |
| |
| * `2` - U or V |
| |
| * `3` - Y beginning at coefficient 0 (i.e., Y in the absence of Y2). |
| |
| The next dimension is selected by the position of the coefficient |
| being decoded. That position, c, steps by ones up to 15, starting |
| from zero for block types 1, 2, or 3 and starting from one for block |
| type 0. The second array index is then |
| |
| |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| coeff_bands [c] |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| {:lang="c"} |
| |
| |
| Where: |
| |
| |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| const int coeff_bands [16] = { |
| 0, 1, 2, 3, 6, 4, 5, 6, 6, 6, 6, 6, 6, 6, 6, 7 |
| }; |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| {:lang="c"} |
| |
| |
| is a fixed mapping of position to "band". |
| |
| The third dimension is the trickiest. Roughly speaking, it measures |
| the "local complexity" or extent to which nearby coefficients are |
| non-zero. |
| |
| For the first coefficient (DC, unless the block type is 0), we |
| consider the (already encoded) blocks within the same plane (Y2, Y, |
| U, or V) above and to the left of the current block. The context |
| index is then the number (0, 1, or 2) of these blocks that had at |
| least one non-zero coefficient in their residue record. Specifically |
| for Y2, because macroblocks above and to the left may or may not have |
| a Y2 block, the block above is determined by the most recent |
| macroblock in the same column that has a Y2 block, and the block to |
| the left is determined by the most recent macroblock in the same row |
| that has a Y2 block. |
| |
| Beyond the first coefficient, the context index is determined by the |
| absolute value of the most recently decoded coefficient (necessarily |
| within the current block) and is `0` if the last coefficient was a |
| zero, `1` if it was plus or minus one, and `2` if its absolute value |
| exceeded one. |
| |
| Note that the intuitive meaning of this measure changes as |
| coefficients are decoded. For example, prior to the first token, a |
| zero means that the neighbors are empty, suggesting that the current |
| block may also be empty. After the first token, because an end-of- |
| block token must have at least one non-zero value before it, a zero |
| means that we just decoded a zero and hence guarantees that a non- |
| zero coefficient will appear later in this block. However, this |
| shift in meaning is perfectly okay because the complete context |
| depends also on the coefficient band (and since band 0 is occupied |
| exclusively by position 0). |
| |
| As with other contexts used by VP8, the "neighboring block" context |
| described here needs a special definition for subblocks lying along |
| the top row or left edge of the frame. These "non-existent" |
| predictors above and to the left of the image are simply taken to be |
| empty -- that is, taken to contain no non-zero coefficients. |
| |
| The residue decoding of each macroblock then requires, in each of two |
| directions (above and to the left), an aggregate coefficient |
| predictor consisting of a single Y2 predictor, two predictors for |
| each of U and V, and four predictors for Y. In accordance with the |
| scan-ordering of macroblocks, a decoder needs to maintain a single |
| "left" aggregate predictor and a row of "above" aggregate predictors. |
| |
| Before decoding any residue, these maintained predictors may simply |
| be cleared, in compliance with the definition of "non-existent" |
| prediction. After each block is decoded, the two predictors |
| referenced by the block are replaced with the (empty or non-empty) |
| state of the block, in preparation for the later decoding of the |
| blocks below and to the right of the block just decoded. |
| |
| The fourth, and final, dimension of the token probability array is of |
| course indexed by (half) the position in the token tree structure, as |
| are all tree probability arrays. |
| |
| The pseudocode below illustrates the decoding process. Note that |
| criteria, functions, etc. delimited with `**` are either dependent on |
| decoder architecture or are elaborated on elsewhere in this document. |
| |
| |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| int block[16] = { 0 }; /* current 4x4 block coeffs */ |
| int firstCoeff = 0; |
| int plane; |
| int ctx2; |
| int ctx3 = 0; /* the 3rd context referred to in above description */ |
| Prob *probTable; |
| int token; |
| int sign; |
| int absValue; |
| int extraBits; |
| bool prevCoeffWasZero = false; |
| bool currentBlockHasCoeffs = false; |
| /* base coeff abs values per each category, elem #0 is |
| DCT_VAL_CATEGORY1, * #1 is DCT_VAL_CATEGORY2, etc. */ |
| int categoryBase[6] = { 5, 7, 11, 19, 35, 67 }; |
| |
| /* Determine plane to use */ |
| if( **current_block_is_Y2_block** ) plane = 0; |
| else if ( **current_block_is_chroma** ) plane = 2; |
| else if ( **current_macroblock_has_Y2** ) plane = 1; |
| else plane = 3; |
| |
| /* For luma blocks of a "Y2 macroblock" we skip coeff index #0 */ |
| if( plane == 1 ) |
| firstCoeff++; |
| |
| /* Determine whether neighbor 4x4 blocks have coefficients. |
| This is dependent on the plane we are currently decoding; |
| i.e., we check only coefficients from the same plane as the |
| current block. */ |
| if( **left_neighbor_block_has_coefficients(plane)** ) |
| ctx3++; |
| if( **above_neighbor_block_has_coefficients(plane)** ) |
| ctx3++; |
| |
| for( i = firstCoeff ; i < 16 ; ++i ) |
| { |
| ctx2 = coeff_bands[i]; |
| probTable = coeff_probs[plane][ctx2][ctx3]; |
| |
| /* skip first code (dct_eob) if previous token was DCT_0 */ |
| if( prevCoeffWasZero ) |
| token = treed_read ( d, **coeff_tree_without_eob**, |
| probTable ); |
| else |
| token = treed_read ( d, coeff_tree, probTable ); |
| |
| if( token == dct_eob ) |
| break; |
| |
| if( token != DCT_0 ) |
| { |
| currentBlockHasCoeffs = true; |
| if( **token_has_extra_bits(token)** ) |
| { |
| extraBits = DCTextra( token ); |
| absValue = |
| categoryBase[**token_to_cat_index(token)**] + |
| extraBits; |
| } |
| else |
| { |
| absValue = **token_to_abs_value(token)**; |
| } |
| |
| sign = read_bool(d, 128); |
| block[i] = sign ? -absValue : absValue; |
| } |
| else |
| { |
| absValue = 0; |
| } |
| |
| /* Set contexts and stuff for next coeff */ |
| if( absValue == 0 ) ctx3 = 0; |
| else if ( absValue == 1 ) ctx3 = 1; |
| else ctx3 = 2; |
| prevCoeffWasZero = true; |
| } |
| |
| /* Store current block status to decoder internals */ |
| **block_has_coefficients[currentMb][currentBlock]** = |
| currentBlockHasCoeffs; |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| {:lang="c"} |
| |
| While we have in fact completely described the coefficient decoding |
| procedure, the reader will probably find it helpful to consult the |
| reference implementation, which can be found in the file `tokens.c` |
| (Section 20.16). |
| |