document/core/binary/values.rst - external/github.com/WebAssembly/spec - Git at Google

 .. index:: value
    pair: binary format; value
 .. _binary-value:

 Values
 ------


 .. index:: byte
    pair: binary format; byte
 .. _binary-byte:

 Bytes
 ~~~~~

 :ref:`Bytes <syntax-byte>` encode themselves.

 .. math::
    \begin{array}{llcll@{\qquad}l}
    \production{byte} & \Bbyte &::=&
      \hex{00} &\Rightarrow& \hex{00} \\ &&|&&
      \dots \\ &&|&
      \hex{FF} &\Rightarrow& \hex{FF} \\
    \end{array}


 .. index:: integer, unsigned integer, signed integer, uninterpreted integer, LEB128, two's complement
    pair: binary format; integer
    pair: binary format; unsigned integer
    pair: binary format; signed integer
    pair: binary format; uninterpreted integer
 .. _binary-sint:
 .. _binary-uint:
 .. _binary-int:

 Integers
 ~~~~~~~~

 All :ref:`integers <syntax-int>` are encoded using the |LEB128|_ variable-length integer encoding, in either unsigned or signed variant.

 :ref:`Unsigned integers <syntax-uint>` are encoded in |UnsignedLEB128|_ format.
 As an additional constraint, the total number of bytes encoding a value of type :math:`\uN` must not exceed :math:`\F{ceil}(N/7)` bytes.

 .. math::
    \begin{array}{llclll@{\qquad}l}
    \production{unsigned integer} & \BuN &::=&
      n{:}\Bbyte &\Rightarrow& n & (\iff n < 2^7 \wedge n < 2^N) \\ &&|&
      n{:}\Bbyte~~m{:}\BuX{(N\B{-7})} &\Rightarrow&
        2^7\cdot m + (n-2^7) & (\iff n \geq 2^7 \wedge N > 7) \\
    \end{array}

 :ref:`Signed integers <syntax-sint>` are encoded in |SignedLEB128|_ format, which uses a two's complement representation.
 As an additional constraint, the total number of bytes encoding a value of type :math:`\sN` must not exceed :math:`\F{ceil}(N/7)` bytes.

 .. math::
    \begin{array}{llclll@{\qquad}l}
    \production{signed integer} & \BsN &::=&
      n{:}\Bbyte &\Rightarrow& n & (\iff n < 2^6 \wedge n < 2^{N-1}) \\ &&|&
      n{:}\Bbyte &\Rightarrow& n-2^7 & (\iff 2^6 \leq n < 2^7 \wedge n \geq 2^7-2^{N-1}) \\ &&|&
      n{:}\Bbyte~~m{:}\BsX{(N\B{-7})} &\Rightarrow&
        2^7\cdot m + (n-2^7) & (\iff n \geq 2^7 \wedge N > 7) \\
    \end{array}

 :ref:`Uninterpreted integers <syntax-int>` are encoded as signed integers.

 .. math::
    \begin{array}{llclll@{\qquad\qquad}l}
    \production{uninterpreted integer} & \BiN &::=&
      n{:}\BsN &\Rightarrow& i & (\iff n = \signed_{\iN}(i))
    \end{array}

 .. note::
    The side conditions :math:`N > 7` in the productions for non-terminal bytes of the :math:`\uX{}` and :math:`\sX{}` encodings restrict the encoding's length.
    However, "trailing zeros" are still allowed within these bounds.
    For example, :math:`\hex{03}` and :math:`\hex{83}~\hex{00}` are both well-formed encodings for the value :math:`3` as a |u8|.
    Similarly, either of :math:`\hex{7e}` and :math:`\hex{FE}~\hex{7F}` and :math:`\hex{FE}~\hex{FF}~\hex{7F}` are well-formed encodings of the value :math:`-2` as a |s16|.

    The side conditions on the value :math:`n` of terminal bytes further enforce that
    any unused bits in these bytes must be :math:`0` for positive values and :math:`1` for negative ones.
    For example, :math:`\hex{83}~\hex{10}` is malformed as a |u8| encoding.
    Similarly, both :math:`\hex{83}~\hex{3E}` and :math:`\hex{FF}~\hex{7B}` are malformed as |s8| encodings.


 .. index:: floating-point number, little endian
    pair: binary format; floating-point number
 .. _binary-float:

 Floating-Point
 ~~~~~~~~~~~~~~

 :ref:`Floating-point <syntax-float>` values are encoded directly by their |IEEE754|_ (Section 3.4) bit pattern in |LittleEndian|_ byte order:

 .. math::
    \begin{array}{llclll@{\qquad\qquad}l}
    \production{floating-point value} & \BfN &::=&
      b^\ast{:\,}\Bbyte^{N/8} &\Rightarrow& \bytes_{\fN}^{-1}(b^\ast) \\
    \end{array}


 .. index:: name, byte, Unicode, ! UTF-8
    pair: binary format; name
 .. _binary-utf8:
 .. _binary-name:

 Names
 ~~~~~

 :ref:`Names <syntax-name>` are encoded as a :ref:`vector <binary-vec>` of bytes containing the |Unicode|_ (Section 3.9) UTF-8 encoding of the name's code point sequence.

 .. math::
    \begin{array}{llclllll}
    \production{name} & \Bname &::=&
      b^\ast{:}\Bvec(\Bbyte) &\Rightarrow& \name
        && (\iff \utf8(\name) = b^\ast) \\
    \end{array}

 The auxiliary |utf8| function expressing this encoding is defined as follows:

 .. math::
    \begin{array}{@{}l@{}}
    \begin{array}{@{}lcl@{\qquad}l@{}}
    \utf8(c^\ast) &=& (\utf8(c))^\ast \\[1ex]
    \utf8(c) &=& b &
      (\begin{array}[t]{@{}c@{~}l@{}}
       \iff & c < \unicode{80} \\
       \wedge & c = b) \\
       \end{array} \\
    \utf8(c) &=& b_1~b_2 &
      (\begin{array}[t]{@{}c@{~}l@{}}
       \iff & \unicode{80} \leq c < \unicode{800} \\
       \wedge & c = 2^6(b_1-\hex{C0})+(b_2-\hex{80})) \\
       \end{array} \\
    \utf8(c) &=& b_1~b_2~b_3 &
      (\begin{array}[t]{@{}c@{~}l@{}}
       \iff & \unicode{800} \leq c < \unicode{D800} \vee \unicode{E000} \leq c < \unicode{10000} \\
       \wedge & c = 2^{12}(b_1-\hex{E0})+2^6(b_2-\hex{80})+(b_3-\hex{80})) \\
       \end{array} \\
    \utf8(c) &=& b_1~b_2~b_3~b_4 &
      (\begin{array}[t]{@{}c@{~}l@{}}
       \iff & \unicode{10000} \leq c < \unicode{110000} \\
       \wedge & c = 2^{18}(b_1-\hex{F0})+2^{12}(b_2-\hex{80})+2^6(b_3-\hex{80})+(b_4-\hex{80})) \\
       \end{array} \\
    \end{array} \\
    \where b_2, b_3, b_4 < \hex{C0} \\
    \end{array}

 .. note::
    Unlike in some other formats, name strings are not 0-terminated.
	.. index:: value
	pair: binary format; value
	.. _binary-value:

	Values
	------


	.. index:: byte
	pair: binary format; byte
	.. _binary-byte:

	Bytes
	~~~~~

	:ref:`Bytes <syntax-byte>` encode themselves.

	.. math::
	\begin{array}{llcll@{\qquad}l}
	\production{byte} & \Bbyte &::=&
	\hex{00} &\Rightarrow& \hex{00} \\ &&\|&&
	\dots \\ &&\|&
	\hex{FF} &\Rightarrow& \hex{FF} \\
	\end{array}


	.. index:: integer, unsigned integer, signed integer, uninterpreted integer, LEB128, two's complement
	pair: binary format; integer
	pair: binary format; unsigned integer
	pair: binary format; signed integer
	pair: binary format; uninterpreted integer
	.. _binary-sint:
	.. _binary-uint:
	.. _binary-int:

	Integers
	~~~~~~~~

	All :ref:`integers <syntax-int>` are encoded using the \|LEB128\|_ variable-length integer encoding, in either unsigned or signed variant.

	:ref:`Unsigned integers <syntax-uint>` are encoded in \|UnsignedLEB128\|_ format.
	As an additional constraint, the total number of bytes encoding a value of type :math:`\uN` must not exceed :math:`\F{ceil}(N/7)` bytes.

	.. math::
	\begin{array}{llclll@{\qquad}l}
	\production{unsigned integer} & \BuN &::=&
	n{:}\Bbyte &\Rightarrow& n & (\iff n < 2^7 \wedge n < 2^N) \\ &&\|&
	n{:}\Bbyte~~m{:}\BuX{(N\B{-7})} &\Rightarrow&
	2^7\cdot m + (n-2^7) & (\iff n \geq 2^7 \wedge N > 7) \\
	\end{array}

	:ref:`Signed integers <syntax-sint>` are encoded in \|SignedLEB128\|_ format, which uses a two's complement representation.
	As an additional constraint, the total number of bytes encoding a value of type :math:`\sN` must not exceed :math:`\F{ceil}(N/7)` bytes.

	.. math::
	\begin{array}{llclll@{\qquad}l}
	\production{signed integer} & \BsN &::=&
	n{:}\Bbyte &\Rightarrow& n & (\iff n < 2^6 \wedge n < 2^{N-1}) \\ &&\|&
	n{:}\Bbyte &\Rightarrow& n-2^7 & (\iff 2^6 \leq n < 2^7 \wedge n \geq 2^7-2^{N-1}) \\ &&\|&
	n{:}\Bbyte~~m{:}\BsX{(N\B{-7})} &\Rightarrow&
	2^7\cdot m + (n-2^7) & (\iff n \geq 2^7 \wedge N > 7) \\
	\end{array}

	:ref:`Uninterpreted integers <syntax-int>` are encoded as signed integers.

	.. math::
	\begin{array}{llclll@{\qquad\qquad}l}
	\production{uninterpreted integer} & \BiN &::=&
	n{:}\BsN &\Rightarrow& i & (\iff n = \signed_{\iN}(i))
	\end{array}

	.. note::
	The side conditions :math:`N > 7` in the productions for non-terminal bytes of the :math:`\uX{}` and :math:`\sX{}` encodings restrict the encoding's length.
	However, "trailing zeros" are still allowed within these bounds.
	For example, :math:`\hex{03}` and :math:`\hex{83}~\hex{00}` are both well-formed encodings for the value :math:`3` as a \|u8\|.
	Similarly, either of :math:`\hex{7e}` and :math:`\hex{FE}~\hex{7F}` and :math:`\hex{FE}~\hex{FF}~\hex{7F}` are well-formed encodings of the value :math:`-2` as a \|s16\|.

	The side conditions on the value :math:`n` of terminal bytes further enforce that
	any unused bits in these bytes must be :math:`0` for positive values and :math:`1` for negative ones.
	For example, :math:`\hex{83}~\hex{10}` is malformed as a \|u8\| encoding.
	Similarly, both :math:`\hex{83}~\hex{3E}` and :math:`\hex{FF}~\hex{7B}` are malformed as \|s8\| encodings.


	.. index:: floating-point number, little endian
	pair: binary format; floating-point number
	.. _binary-float:

	Floating-Point
	~~~~~~~~~~~~~~

	:ref:`Floating-point <syntax-float>` values are encoded directly by their \|IEEE754\|_ (Section 3.4) bit pattern in \|LittleEndian\|_ byte order:

	.. math::
	\begin{array}{llclll@{\qquad\qquad}l}
	\production{floating-point value} & \BfN &::=&
	b^\ast{:\,}\Bbyte^{N/8} &\Rightarrow& \bytes_{\fN}^{-1}(b^\ast) \\
	\end{array}


	.. index:: name, byte, Unicode, ! UTF-8
	pair: binary format; name
	.. _binary-utf8:
	.. _binary-name:

	Names
	~~~~~

	:ref:`Names <syntax-name>` are encoded as a :ref:`vector <binary-vec>` of bytes containing the \|Unicode\|_ (Section 3.9) UTF-8 encoding of the name's code point sequence.

	.. math::
	\begin{array}{llclllll}
	\production{name} & \Bname &::=&
	b^\ast{:}\Bvec(\Bbyte) &\Rightarrow& \name
	&& (\iff \utf8(\name) = b^\ast) \\
	\end{array}

	The auxiliary \|utf8\| function expressing this encoding is defined as follows:

	.. math::
	\begin{array}{@{}l@{}}
	\begin{array}{@{}lcl@{\qquad}l@{}}
	\utf8(c^\ast) &=& (\utf8(c))^\ast \\[1ex]
	\utf8(c) &=& b &
	(\begin{array}[t]{@{}c@{~}l@{}}
	\iff & c < \unicode{80} \\
	\wedge & c = b) \\
	\end{array} \\
	\utf8(c) &=& b_1~b_2 &
	(\begin{array}[t]{@{}c@{~}l@{}}
	\iff & \unicode{80} \leq c < \unicode{800} \\
	\wedge & c = 2^6(b_1-\hex{C0})+(b_2-\hex{80})) \\
	\end{array} \\
	\utf8(c) &=& b_1~b_2~b_3 &
	(\begin{array}[t]{@{}c@{~}l@{}}
	\iff & \unicode{800} \leq c < \unicode{D800} \vee \unicode{E000} \leq c < \unicode{10000} \\
	\wedge & c = 2^{12}(b_1-\hex{E0})+2^6(b_2-\hex{80})+(b_3-\hex{80})) \\
	\end{array} \\
	\utf8(c) &=& b_1~b_2~b_3~b_4 &
	(\begin{array}[t]{@{}c@{~}l@{}}
	\iff & \unicode{10000} \leq c < \unicode{110000} \\
	\wedge & c = 2^{18}(b_1-\hex{F0})+2^{12}(b_2-\hex{80})+2^6(b_3-\hex{80})+(b_4-\hex{80})) \\
	\end{array} \\
	\end{array} \\
	\where b_2, b_3, b_4 < \hex{C0} \\
	\end{array}

	.. note::
	Unlike in some other formats, name strings are not 0-terminated.