text_src/09.01__vp8-bitstream__uncompressed-data-chunk.txt - webm/bitstream-guide - Git at Google



 #### 9.1 Uncompressed Data Chunk                           {#h-09-01}

 The uncompressed data chunk comprises a common (for key frames and
 interframes) 3-byte frame tag that contains four fields, as follows:

   1. A 1-bit frame type (`0` for key frames, `1` for interframes).

   2. A 3-bit version number (`0` - `3` are defined as four different
      profiles with different decoding complexity; other values may be
      defined for future variants of the VP8 data format).

   3. A 1-bit `show_frame` flag (`0` when current frame is not for display,
      `1` when current frame is for display).

   4. A 19-bit field containing the size of the first data partition in
      bytes.

 The version number setting enables or disables certain features in
 the bitstream, as follows:


 | Version | Reconstruction Filter   | Loop Filter |
 | ------- | ----------------------- | ----------- |
 | `0`     | Bicubic                 | Normal      |
 | `1`     | Bilinear                | Simple      |
 | `2`     | Bilinear                | None        |
 | `3`     | None                    | None        |
 | Other   | Reserved for future use |             |


 The reference software also adjusts the loop filter based on version
 number, as per the table above.  Version number `1` implies a "simple"
 loop filter, and version numbers `2` and `3` imply no loop filter.
 However, the "simple" filter setting in this context has no effect
 whatsoever on the decoding process, and the "no loop filter" setting
 only forces the reference encoder to set filter level equal to `0`.
 Neither affect the decoding process.  In decoding, the only loop
 filter settings that matter are those in the frame header.

 For key frames, the frame tag is followed by a further 7 bytes of
 uncompressed data, as follows:


 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Start code byte 0     0x9d
 Start code byte 1     0x01
 Start code byte 2     0x2a

 16 bits      :     (2 bits Horizontal Scale << 14) | Width (14 bits)
 16 bits      :     (2 bits Vertical Scale << 14) | Height (14 bits)
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 {:lang="c"}


 The following source code segment illustrates validation of the start
 code and reading the width, height, and scale factors for a key
 frame.


 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 unsigned char *c = pbi->source+3;

 // vet via sync code
 if(c[0]!=0x9d||c[1]!=0x01||c[2]!=0x2a)
     return -1;
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 {:lang="c"}


 Where `pbi->source` points to the beginning of the frame.

 The following code reads the image dimension from the bitstream:


 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 pc->Width      = swap2(*(unsigned short*)(c+3))&0x3fff;
 pc->horiz_scale = swap2(*(unsigned short*)(c+3))>>14;
 pc->Height     = swap2(*(unsigned short*)(c+5))&0x3fff;
 pc->vert_scale  = swap2(*(unsigned short*)(c+5))>>14;
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 {:lang="c"}


 Where the `swap2` macro takes care of the endian on a different
 platform:


 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 #if defined(__ppc__) || defined(__ppc64__)
 # define swap2(d)  \
   ((d&0x000000ff)<<8) |  \
   ((d&0x0000ff00)>>8)
 #else
   # define swap2(d) d
 #endif
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 {:lang="c"}


 While each frame is encoded as a raster scan of 16x16 macroblocks,
 the frame dimensions are not necessarily evenly divisible by 16.  In
 this case, write `ew = 16 - (width & 15)` and `eh = 16 - (height & 15)`
 for the excess width and height, respectively.  Although they are
 encoded, the last `ew` columns and `eh` rows are not actually part of the
 image and should be discarded before final output.  However, these
 "excess pixels" should be maintained in the internal reconstruction
 buffer used to predict ensuing frames.

 The scaling specifications for each dimension are encoded as follows.


 | Value  | Scaling
 | ------ | -------------------------------------
 | `0`    | No upscaling (the most common case).
 | `1`    | Upscale by 5/4.
 | `2`    | Upscale by 5/3.
 | `3`    | Upscale by 2.


 Upscaling does **not** affect the reconstruction buffer, which should be
 maintained at the encoded resolution.  Any reasonable method of
 upsampling (including any that may be supported by video hardware in
 the playback environment) may be used.  Since scaling has no effect
 on decoding, we do not discuss it any further.

 As discussed in Section 5, allocation (or re-allocation) of data
 structures (such as the reconstruction buffer) whose size depends on
 dimension will be triggered here.


	#### 9.1 Uncompressed Data Chunk {#h-09-01}

	The uncompressed data chunk comprises a common (for key frames and
	interframes) 3-byte frame tag that contains four fields, as follows:

	1. A 1-bit frame type (`0` for key frames, `1` for interframes).

	2. A 3-bit version number (`0` - `3` are defined as four different
	profiles with different decoding complexity; other values may be
	defined for future variants of the VP8 data format).

	3. A 1-bit `show_frame` flag (`0` when current frame is not for display,
	`1` when current frame is for display).

	4. A 19-bit field containing the size of the first data partition in
	bytes.

	The version number setting enables or disables certain features in
	the bitstream, as follows:


	\| Version \| Reconstruction Filter \| Loop Filter \|
	\| ------- \| ----------------------- \| ----------- \|
	\| `0` \| Bicubic \| Normal \|
	\| `1` \| Bilinear \| Simple \|
	\| `2` \| Bilinear \| None \|
	\| `3` \| None \| None \|
	\| Other \| Reserved for future use \| \|


	The reference software also adjusts the loop filter based on version
	number, as per the table above. Version number `1` implies a "simple"
	loop filter, and version numbers `2` and `3` imply no loop filter.
	However, the "simple" filter setting in this context has no effect
	whatsoever on the decoding process, and the "no loop filter" setting
	only forces the reference encoder to set filter level equal to `0`.
	Neither affect the decoding process. In decoding, the only loop
	filter settings that matter are those in the frame header.

	For key frames, the frame tag is followed by a further 7 bytes of
	uncompressed data, as follows:


	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
	Start code byte 0 0x9d
	Start code byte 1 0x01
	Start code byte 2 0x2a

	16 bits : (2 bits Horizontal Scale << 14) \| Width (14 bits)
	16 bits : (2 bits Vertical Scale << 14) \| Height (14 bits)
	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
	{:lang="c"}


	The following source code segment illustrates validation of the start
	code and reading the width, height, and scale factors for a key
	frame.


	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
	unsigned char *c = pbi->source+3;

	// vet via sync code
	if(c[0]!=0x9d\|\|c[1]!=0x01\|\|c[2]!=0x2a)
	return -1;
	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
	{:lang="c"}


	Where `pbi->source` points to the beginning of the frame.

	The following code reads the image dimension from the bitstream:


	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
	pc->Width = swap2((unsigned short)(c+3))&0x3fff;
	pc->horiz_scale = swap2((unsigned short)(c+3))>>14;
	pc->Height = swap2((unsigned short)(c+5))&0x3fff;
	pc->vert_scale = swap2((unsigned short)(c+5))>>14;
	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
	{:lang="c"}


	Where the `swap2` macro takes care of the endian on a different
	platform:


	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
	#if defined(__ppc__) \|\| defined(__ppc64__)
	# define swap2(d) \
	((d&0x000000ff)<<8) \| \
	((d&0x0000ff00)>>8)
	#else
	# define swap2(d) d
	#endif
	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
	{:lang="c"}


	While each frame is encoded as a raster scan of 16x16 macroblocks,
	the frame dimensions are not necessarily evenly divisible by 16. In
	this case, write `ew = 16 - (width & 15)` and `eh = 16 - (height & 15)`
	for the excess width and height, respectively. Although they are
	encoded, the last `ew` columns and `eh` rows are not actually part of the
	image and should be discarded before final output. However, these
	"excess pixels" should be maintained in the internal reconstruction
	buffer used to predict ensuing frames.

	The scaling specifications for each dimension are encoded as follows.


	\| Value \| Scaling
	\| ------ \| -------------------------------------
	\| `0` \| No upscaling (the most common case).
	\| `1` \| Upscale by 5/4.
	\| `2` \| Upscale by 5/3.
	\| `3` \| Upscale by 2.


	Upscaling does not affect the reconstruction buffer, which should be
	maintained at the encoded resolution. Any reasonable method of
	upsampling (including any that may be supported by video hardware in
	the playback environment) may be used. Since scaling has no effect
	on decoding, we do not discuss it any further.

	As discussed in Section 5, allocation (or re-allocation) of data
	structures (such as the reconstruction buffer) whose size depends on
	dimension will be triggered here.