document/core/valid/conventions.rst - external/github.com/WebAssembly/spec - Git at Google

 .. index:: ! validation, ! type system, function type, table type, memory type, globaltype, valtype, resulttype, index space, instantiation. module
 .. _type-system:

 Conventions
 -----------

 Validation checks that a WebAssembly module is well-formed.
 Only valid modules can be :ref:`instantiated <exec-instantiation>`.

 Validity is defined by a *type system* over the :ref:`abstract syntax <syntax>` of a :ref:`module <syntax-module>` and its contents.
 For each piece of abstract syntax, there is a typing rule that specifies the constraints that apply to it.
 All rules are given in two *equivalent* forms:

 1. In *prose*, describing the meaning in intuitive form.
 2. In *formal notation*, describing the rule in mathematical form. [#cite-pldi2017]_

 .. note::
    The prose and formal rules are equivalent,
    so that understanding of the formal notation is *not* required to read this specification.
    The formalism offers a more concise description in notation that is used widely in programming languages semantics and is readily amenable to mathematical proof.

 In both cases, the rules are formulated in a *declarative* manner.
 That is, they only formulate the constraints, they do not define an algorithm.
 The skeleton of a sound and complete algorithm for type-checking instruction sequences according to this specification is provided in the :ref:`appendix <algo-valid>`.


 .. index:: ! context, function type, table type, memory type, global type, value type, result type, index space, module, function
 .. _context:

 Contexts
 ~~~~~~~~

 Validity of an individual definition is specified relative to a *context*,
 which collects relevant information about the surrounding :ref:`module <syntax-module>` and the definitions in scope:

 * *Types*: the list of types defined in the current module.
 * *Functions*: the list of functions declared in the current module, represented by their function type.
 * *Tables*: the list of tables declared in the current module, represented by their table type.
 * *Memories*: the list of memories declared in the current module, represented by their memory type.
 * *Globals*: the list of globals declared in the current module, represented by their global type.
 * *Locals*: the list of locals declared in the current function (including parameters), represented by their value type.
 * *Labels*: the stack of labels accessible from the current position, represented by their result type.
 * *Return*: the return type of the current function, represented as an optional result type that is absent when no return is allowed, as in free-standing expressions.

 In other words, a context contains a sequence of suitable :ref:`types <syntax-type>` for each :ref:`index space <syntax-index>`,
 describing each defined entry in that space.
 Locals, labels and return type are only used for validating :ref:`instructions <syntax-instr>` in :ref:`function bodies <syntax-func>`, and are left empty elsewhere.
 The label stack is the only part of the context that changes as validation of an instruction sequence proceeds.

 More concretely, contexts are defined as :ref:`records <notation-record>` :math:`C` with abstract syntax:

 .. math::
    \begin{array}{llll}
    \production{(context)} & C &::=&
      \begin{array}[t]{l@{~}ll}
      \{ & \CTYPES & \functype^\ast, \\
         & \CFUNCS & \functype^\ast, \\
         & \CTABLES & \tabletype^\ast, \\
         & \CMEMS & \memtype^\ast, \\
         & \CGLOBALS & \globaltype^\ast, \\
         & \CLOCALS & \valtype^\ast, \\
         & \CLABELS & \resulttype^\ast, \\
         & \CRETURN & \resulttype^? ~\} \\
      \end{array}
    \end{array}

 .. _notation-extend:

 In addition to field access written :math:`C.\K{field}` the following notation is adopted for manipulating contexts:

 * When spelling out a context, empty fields are omitted.

 * :math:`C,\K{field}\,A^\ast` denotes the same context as :math:`C` but with the elements :math:`A^\ast` prepended to its :math:`\K{field}` component sequence.

 .. note::
    We use :ref:`indexing notation <notation-index>` like :math:`C.\CLABELS[i]` to look up indices in their respective :ref:`index space <syntax-index>` in the context.
    Context extension notation :math:`C,\K{field}\,A` is primarily used to locally extend *relative* index spaces, such as :ref:`label indices <syntax-labelidx>`.
    Accordingly, the notation is defined to append at the *front* of the respective sequence, introducing a new relative index :math:`0` and shifting the existing ones.


 .. _valid-notation-textual:

 Prose Notation
 ~~~~~~~~~~~~~~

 Validation is specified by stylised rules for each relevant part of the :ref:`abstract syntax <syntax>`.
 The rules not only state constraints defining when a phrase is valid,
 they also classify it with a type.
 The following conventions are adopted in stating these rules.

 * A phrase :math:`A` is said to be "valid with type :math:`T`"
   if and only if all constraints expressed by the respective rules are met.
   The form of :math:`T` depends on what :math:`A` is.

   .. note::
      For example, if :math:`A` is a :ref:`function <syntax-func>`,
      then  :math:`T` is a :ref:`function type <syntax-functype>`;
      for an :math:`A` that is a :ref:`global <syntax-global>`,
      :math:`T` is a :ref:`global type <syntax-globaltype>`;
      and so on.

 * The rules implicitly assume a given :ref:`context <context>` :math:`C`.

 * In some places, this context is locally extended to a context :math:`C'` with additional entries.
   The formulation "Under context :math:`C'`, ... *statement* ..." is adopted to express that the following statement must apply under the assumptions embodied in the extended context.


 .. index:: ! typing rules
 .. _valid-notation:

 Formal Notation
 ~~~~~~~~~~~~~~~

 .. note::
    This section gives a brief explanation of the notation for specifying typing rules formally.
    For the interested reader, a more thorough introduction can be found in respective text books. [#cite-tapl]_

 The proposition that a phrase :math:`A` has a respective type :math:`T` is written :math:`A : T`.
 In general, however, typing is dependent on a context :math:`C`.
 To express this explicitly, the complete form is a *judgement* :math:`C \vdash A : T`,
 which says that :math:`A : T` holds under the assumptions encoded in :math:`C`.

 The formal typing rules use a standard approach for specifying type systems, rendering them into *deduction rules*.
 Every rule has the following general form:

 .. math::
    \frac{
      \X{premise}_1 \qquad \X{premise}_2 \qquad \dots \qquad \X{premise}_n
    }{
      \X{conclusion}
    }

 Such a rule is read as a big implication: if all premises hold, then the conclusion holds.
 Some rules have no premises; they are *axioms* whose conclusion holds unconditionally.
 The conclusion always is a judgment :math:`C \vdash A : T`,
 and there is one respective rule for each relevant construct :math:`A` of the abstract syntax.

 .. note::
    For example, the typing rule for the :math:`\I32.\ADD` instruction can be given as an axiom:

    .. math::
       \frac{
       }{
         C \vdash \I32.\ADD : [\I32~\I32] \to [\I32]
       }

    The instruction is always valid with type :math:`[\I32~\I32] \to [\I32`]
    (saying that it consumes two |I32| values and produces one),
    independent of any side conditions.

    An instruction like |LOCALGET| can be typed as follows:

    .. math::
       \frac{
         C.\CLOCALS[x] = t
       }{
         C \vdash \LOCALGET~x : [] \to [t]
       }

    Here, the premise enforces that the immediate :ref:`local index <syntax-localidx>` :math:`x` exists in the context.
    The instruction produces a value of its respective type :math:`t`
    (and does not consume any values).
    If :math:`C.\CLOCALS[x]` does not exist then the premise does not hold,
    and the instruction is ill-typed.

    Finally, a :ref:`structured <syntax-instr-control>` instruction requires
    a recursive rule, where the premise is itself a typing judgement:

    .. math::
       \frac{
         C,\LABEL\,[t^?] \vdash \instr^\ast : [] \to [t^?]
       }{
         C \vdash \BLOCK~[t^?]~\instr^\ast~\END : [] \to [t^?]
       }

    A |BLOCK| instruction is only valid when the instruction sequence in its body is.
    Moreover, the result type must match the block's annotation :math:`[t^?]`.
    If so, then the |BLOCK| instruction has the same type as the body.
    Inside the body an additional label of the same type is available,
    which is expressed by extending the context :math:`C` with the additional label information for the premise.


 .. [#cite-pldi2017]
    The semantics is derived from the following article:
    Andreas Haas, Andreas Rossberg, Derek Schuff, Ben Titzer, Dan Gohman, Luke Wagner, Alon Zakai, JF Bastien, Michael Holman. |PLDI2017|_. Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2017). ACM 2017.

 .. [#cite-tapl]
    For example: Benjamin Pierce. |TAPL|_. The MIT Press 2002
	.. index:: ! validation, ! type system, function type, table type, memory type, globaltype, valtype, resulttype, index space, instantiation. module
	.. _type-system:

	Conventions
	-----------

	Validation checks that a WebAssembly module is well-formed.
	Only valid modules can be :ref:`instantiated <exec-instantiation>`.

	Validity is defined by a type system over the :ref:`abstract syntax <syntax>` of a :ref:`module <syntax-module>` and its contents.
	For each piece of abstract syntax, there is a typing rule that specifies the constraints that apply to it.
	All rules are given in two equivalent forms:

	1. In prose, describing the meaning in intuitive form.
	2. In formal notation, describing the rule in mathematical form. [#cite-pldi2017]_

	.. note::
	The prose and formal rules are equivalent,
	so that understanding of the formal notation is not required to read this specification.
	The formalism offers a more concise description in notation that is used widely in programming languages semantics and is readily amenable to mathematical proof.

	In both cases, the rules are formulated in a declarative manner.
	That is, they only formulate the constraints, they do not define an algorithm.
	The skeleton of a sound and complete algorithm for type-checking instruction sequences according to this specification is provided in the :ref:`appendix <algo-valid>`.


	.. index:: ! context, function type, table type, memory type, global type, value type, result type, index space, module, function
	.. _context:

	Contexts
	~~~~~~~~

	Validity of an individual definition is specified relative to a context,
	which collects relevant information about the surrounding :ref:`module <syntax-module>` and the definitions in scope:

	* Types: the list of types defined in the current module.
	* Functions: the list of functions declared in the current module, represented by their function type.
	* Tables: the list of tables declared in the current module, represented by their table type.
	* Memories: the list of memories declared in the current module, represented by their memory type.
	* Globals: the list of globals declared in the current module, represented by their global type.
	* Locals: the list of locals declared in the current function (including parameters), represented by their value type.
	* Labels: the stack of labels accessible from the current position, represented by their result type.
	* Return: the return type of the current function, represented as an optional result type that is absent when no return is allowed, as in free-standing expressions.

	In other words, a context contains a sequence of suitable :ref:`types <syntax-type>` for each :ref:`index space <syntax-index>`,
	describing each defined entry in that space.
	Locals, labels and return type are only used for validating :ref:`instructions <syntax-instr>` in :ref:`function bodies <syntax-func>`, and are left empty elsewhere.
	The label stack is the only part of the context that changes as validation of an instruction sequence proceeds.

	More concretely, contexts are defined as :ref:`records <notation-record>` :math:`C` with abstract syntax:

	.. math::
	\begin{array}{llll}
	\production{(context)} & C &::=&
	\begin{array}[t]{l@{~}ll}
	\{ & \CTYPES & \functype^\ast, \\
	& \CFUNCS & \functype^\ast, \\
	& \CTABLES & \tabletype^\ast, \\
	& \CMEMS & \memtype^\ast, \\
	& \CGLOBALS & \globaltype^\ast, \\
	& \CLOCALS & \valtype^\ast, \\
	& \CLABELS & \resulttype^\ast, \\
	& \CRETURN & \resulttype^? ~\} \\
	\end{array}
	\end{array}

	.. _notation-extend:

	In addition to field access written :math:`C.\K{field}` the following notation is adopted for manipulating contexts:

	* When spelling out a context, empty fields are omitted.

	* :math:`C,\K{field}\,A^\ast` denotes the same context as :math:`C` but with the elements :math:`A^\ast` prepended to its :math:`\K{field}` component sequence.

	.. note::
	We use :ref:`indexing notation <notation-index>` like :math:`C.\CLABELS[i]` to look up indices in their respective :ref:`index space <syntax-index>` in the context.
	Context extension notation :math:`C,\K{field}\,A` is primarily used to locally extend relative index spaces, such as :ref:`label indices <syntax-labelidx>`.
	Accordingly, the notation is defined to append at the front of the respective sequence, introducing a new relative index :math:`0` and shifting the existing ones.


	.. _valid-notation-textual:

	Prose Notation
	~~~~~~~~~~~~~~

	Validation is specified by stylised rules for each relevant part of the :ref:`abstract syntax <syntax>`.
	The rules not only state constraints defining when a phrase is valid,
	they also classify it with a type.
	The following conventions are adopted in stating these rules.

	* A phrase :math:`A` is said to be "valid with type :math:`T`"
	if and only if all constraints expressed by the respective rules are met.
	The form of :math:`T` depends on what :math:`A` is.

	.. note::
	For example, if :math:`A` is a :ref:`function <syntax-func>`,
	then :math:`T` is a :ref:`function type <syntax-functype>`;
	for an :math:`A` that is a :ref:`global <syntax-global>`,
	:math:`T` is a :ref:`global type <syntax-globaltype>`;
	and so on.

	* The rules implicitly assume a given :ref:`context <context>` :math:`C`.

	* In some places, this context is locally extended to a context :math:`C'` with additional entries.
	The formulation "Under context :math:`C'`, ... statement ..." is adopted to express that the following statement must apply under the assumptions embodied in the extended context.


	.. index:: ! typing rules
	.. _valid-notation:

	Formal Notation
	~~~~~~~~~~~~~~~

	.. note::
	This section gives a brief explanation of the notation for specifying typing rules formally.
	For the interested reader, a more thorough introduction can be found in respective text books. [#cite-tapl]_

	The proposition that a phrase :math:`A` has a respective type :math:`T` is written :math:`A : T`.
	In general, however, typing is dependent on a context :math:`C`.
	To express this explicitly, the complete form is a judgement :math:`C \vdash A : T`,
	which says that :math:`A : T` holds under the assumptions encoded in :math:`C`.

	The formal typing rules use a standard approach for specifying type systems, rendering them into deduction rules.
	Every rule has the following general form:

	.. math::
	\frac{
	\X{premise}_1 \qquad \X{premise}_2 \qquad \dots \qquad \X{premise}_n
	}{
	\X{conclusion}
	}

	Such a rule is read as a big implication: if all premises hold, then the conclusion holds.
	Some rules have no premises; they are axioms whose conclusion holds unconditionally.
	The conclusion always is a judgment :math:`C \vdash A : T`,
	and there is one respective rule for each relevant construct :math:`A` of the abstract syntax.

	.. note::
	For example, the typing rule for the :math:`\I32.\ADD` instruction can be given as an axiom:

	.. math::
	\frac{
	}{
	C \vdash \I32.\ADD : [\I32~\I32] \to [\I32]
	}

	The instruction is always valid with type :math:`[\I32~\I32] \to [\I32`]
	(saying that it consumes two \|I32\| values and produces one),
	independent of any side conditions.

	An instruction like \|LOCALGET\| can be typed as follows:

	.. math::
	\frac{
	C.\CLOCALS[x] = t
	}{
	C \vdash \LOCALGET~x : [] \to [t]
	}

	Here, the premise enforces that the immediate :ref:`local index <syntax-localidx>` :math:`x` exists in the context.
	The instruction produces a value of its respective type :math:`t`
	(and does not consume any values).
	If :math:`C.\CLOCALS[x]` does not exist then the premise does not hold,
	and the instruction is ill-typed.

	Finally, a :ref:`structured <syntax-instr-control>` instruction requires
	a recursive rule, where the premise is itself a typing judgement:

	.. math::
	\frac{
	C,\LABEL\,[t^?] \vdash \instr^\ast : [] \to [t^?]
	}{
	C \vdash \BLOCK~[t^?]~\instr^\ast~\END : [] \to [t^?]
	}

	A \|BLOCK\| instruction is only valid when the instruction sequence in its body is.
	Moreover, the result type must match the block's annotation :math:`[t^?]`.
	If so, then the \|BLOCK\| instruction has the same type as the body.
	Inside the body an additional label of the same type is available,
	which is expressed by extending the context :math:`C` with the additional label information for the premise.


	.. [#cite-pldi2017]
	The semantics is derived from the following article:
	Andreas Haas, Andreas Rossberg, Derek Schuff, Ben Titzer, Dan Gohman, Luke Wagner, Alon Zakai, JF Bastien, Michael Holman. \|PLDI2017\|_. Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2017). ACM 2017.

	.. [#cite-tapl]
	For example: Benjamin Pierce. \|TAPL\|_. The MIT Press 2002