[spec] More precise Unicode terminology (#1002)

commit: d14d538e5fccdc03a02948963addad10ad45b50d
author: Andreas Rossberg <rossberg@mpi-sws.org> Tue Apr 16 18:32:43 2019
committer: GitHub <noreply@github.com> Tue Apr 16 18:32:43 2019
tree: 699522595e99da687a86748a9af7eeac6e0bba74
parent: 4dce9569f7cf15a15d72e36c224059d06272ebb8
diff --git a/document/core/appendix/embedding.rst b/document/core/appendix/embedding.rst
index 48b601b..9558bf1 100644
--- a/document/core/appendix/embedding.rst
+++ b/document/core/appendix/embedding.rst

@@ -82,10 +82,10 @@
 .. index:: text format
 .. _embed-parse-module:
 
-:math:`\F{parse\_module}(\codepoint^\ast) : \module ~|~ \error`
+:math:`\F{parse\_module}(\char^\ast) : \module ~|~ \error`
 ...............................................................
 
-1. If there exists a derivation for the :ref:`source <text-source>` :math:`\codepoint^\ast` as a :math:`\Tmodule` according to the :ref:`text grammar for modules <text-module>`, yielding a :ref:`module <syntax-module>` :math:`m`, then return :math:`m`.
+1. If there exists a derivation for the :ref:`source <text-source>` :math:`\char^\ast` as a :math:`\Tmodule` according to the :ref:`text grammar for modules <text-module>`, yielding a :ref:`module <syntax-module>` :math:`m`, then return :math:`m`.
 
 2. Else, return :math:`\ERROR`.
 

diff --git a/document/core/appendix/implementation.rst b/document/core/appendix/implementation.rst
index 83d7d87..addd57c 100644
--- a/document/core/appendix/implementation.rst
+++ b/document/core/appendix/implementation.rst

@@ -24,7 +24,7 @@
 Syntactic Limits
 ~~~~~~~~~~~~~~~~
 
-.. index:: abstract syntax, module, type, function, table, memory, global, element, data, import, export, parameter, result, local, structured control instruction, instruction, name, Unicode, code point
+.. index:: abstract syntax, module, type, function, table, memory, global, element, data, import, export, parameter, result, local, structured control instruction, instruction, name, Unicode, character
 .. _impl-syntax:
 
 Structure
@@ -52,7 +52,7 @@
 * the length of an :ref:`element segment <syntax-elem>`
 * the length of a :ref:`data segment <syntax-data>`
 * the length of a :ref:`name <syntax-name>`
-* the range of :ref:`code points <syntax-codepoint>` in a :ref:`name <syntax-name>`
+* the range of :ref:`characters <syntax-char>` in a :ref:`name <syntax-name>`
 
 If the limits of an implementation are exceeded for a given module,
 then the implementation may reject the :ref:`validation <valid>`, compilation, or :ref:`instantiation <exec-instantiation>` of that module with an embedder-specific error.
@@ -91,7 +91,7 @@
 * the size of an individual :ref:`token <text-token>`
 * the nesting depth of :ref:`folded instructions <text-foldedinstr>`
 * the length of symbolic :ref:`identifiers <text-id>`
-* the range of literal :ref:`characters <text-char>` (code points) allowed in the :ref:`source text <source>`
+* the range of literal :ref:`characters <text-char>` allowed in the :ref:`source text <source>`
 
 
 .. index:: validation, function

diff --git a/document/core/binary/values.rst b/document/core/binary/values.rst
index 772309f..c626862 100644
--- a/document/core/binary/values.rst
+++ b/document/core/binary/values.rst

@@ -105,7 +105,7 @@
 Names
 ~~~~~
 
-:ref:`Names <syntax-name>` are encoded as a :ref:`vector <binary-vec>` of bytes containing the |Unicode|_ (Section 3.9) UTF-8 encoding of the name's code point sequence.
+:ref:`Names <syntax-name>` are encoded as a :ref:`vector <binary-vec>` of bytes containing the |Unicode|_ (Section 3.9) UTF-8 encoding of the name's character sequence.
 
 .. math::
    \begin{array}{llclllll}

diff --git a/document/core/intro/introduction.rst b/document/core/intro/introduction.rst
index f952302..c190d3c 100644
--- a/document/core/intro/introduction.rst
+++ b/document/core/intro/introduction.rst

@@ -75,7 +75,7 @@
 These will each define a WebAssembly *application programming interface (API)* suitable for a given environment.
 
 
-.. index:: IEEE 754, floating point, Unicode, name, text format, UTF-8, code point
+.. index:: IEEE 754, floating point, Unicode, name, text format, UTF-8, character
 .. _dependencies:
 
 Dependencies
@@ -88,7 +88,7 @@
 * |Unicode|_, for the representation of import/export :ref:`names <syntax-name>` and the :ref:`text format <text>`.
 
 However, to make this specification self-contained, relevant aspects of the aforementioned standards are defined and formalized as part of this specification,
-such as the :ref:`binary representation <aux-fbits>` and :ref:`rounding <aux-ieee>` of floating-point values, and the :ref:`value range <syntax-codepoint>` and :ref:`UTF-8 encoding <binary-utf8>` of Unicode characters.
+such as the :ref:`binary representation <aux-fbits>` and :ref:`rounding <aux-ieee>` of floating-point values, and the :ref:`value range <syntax-char>` and :ref:`UTF-8 encoding <binary-utf8>` of Unicode characters.
 
 .. note::
    The aforementioned standards are the authoritative source of all respective definitions.

diff --git a/document/core/syntax/values.rst b/document/core/syntax/values.rst
index 69385e6..7443842 100644
--- a/document/core/syntax/values.rst
+++ b/document/core/syntax/values.rst

@@ -146,21 +146,21 @@
 * The meta variable :math:`z` ranges over floating-point values where clear from context.
 
 
-.. index:: ! name, byte, Unicode, UTF-8, code point, binary format
+.. index:: ! name, byte, Unicode, UTF-8, character, binary format
    pair: abstract syntax; name
-.. _syntax-codepoint:
+.. _syntax-char:
 .. _syntax-name:
 
 Names
 ~~~~~
 
-*Names* are sequences of scalar *code points* as defined by |Unicode|_ (Section 2.4).
+*Names* are sequences of *characters*, which are *scalar values* as defined by |Unicode|_ (Section 2.4).
 
 .. math::
    \begin{array}{llclll}
    \production{name} & \name &::=&
-     \codepoint^\ast \qquad\qquad (\iff |\utf8(\codepoint^\ast)| < 2^{32}) \\
-   \production{code point} & \codepoint &::=&
+     \char^\ast \qquad\qquad (\iff |\utf8(\char^\ast)| < 2^{32}) \\
+   \production{character} & \char &::=&
      \unicode{00} ~|~ \dots ~|~ \unicode{D7FF} ~|~
      \unicode{E000} ~|~ \dots ~|~ \unicode{10FFFF} \\
    \end{array}
@@ -172,4 +172,4 @@
 Convention
 ..........
 
-* Code points are sometimes used interchangeably with natural numbers :math:`n < 1114112`.
+* Characters (Unicode scalar values) are sometimes used interchangeably with natural numbers :math:`n < 1114112`.

diff --git a/document/core/text/conventions.rst b/document/core/text/conventions.rst
index 1d87c8c..c73e9c0 100644
--- a/document/core/text/conventions.rst
+++ b/document/core/text/conventions.rst

@@ -32,7 +32,7 @@
 In order to distinguish symbols of the textual syntax from symbols of the abstract syntax, :math:`\mathtt{typewriter}` font is adopted for the former.
 
 * Terminal symbols are either literal strings of characters enclosed in quotes
-  or expressed as |Unicode|_ code points: :math:`\text{module}`, :math:`\unicode{0A}`.
+  or expressed as |Unicode|_ scalar values: :math:`\text{module}`, :math:`\unicode{0A}`.
   (All characters written literally are unambiguously drawn from the 7-bit |ASCII|_ subset of Unicode.)
 
 * Nonterminal symbols are written in typewriter font: :math:`\T{valtype}, \T{instr}`.

diff --git a/document/core/text/lexical.rst b/document/core/text/lexical.rst
index b4df167..a3b4529 100644
--- a/document/core/text/lexical.rst
+++ b/document/core/text/lexical.rst

@@ -5,7 +5,7 @@
 --------------
 
 
-.. index:: ! character, Unicode, ASCII, code point, ! source text
+.. index:: ! character, Unicode, ASCII, character, ! source text
    pair: text format; character
 .. _source:
 .. _text-source:
@@ -15,7 +15,7 @@
 ~~~~~~~~~~
 
 The text format assigns meaning to *source text*, which consists of a sequence of *characters*.
-Characters are assumed to be represented as valid |Unicode|_ (Section 2.4) *code points*.
+Characters are assumed to be represented as valid |Unicode|_ (Section 2.4) *scalar values*.
 
 .. math::
    \begin{array}{llll}

diff --git a/document/core/text/values.rst b/document/core/text/values.rst
index 51be72e..d93dfba 100644
--- a/document/core/text/values.rst
+++ b/document/core/text/values.rst

@@ -179,7 +179,7 @@
    \end{array}
 
 
-.. index:: name, byte, character, code point
+.. index:: name, byte, character, character
    pair: text format; name
 .. _text-name:
 
@@ -187,7 +187,7 @@
 ~~~~~
 
 :ref:`Names <syntax-name>` are strings denoting a literal character sequence. 
-A name string must form a valid UTF-8 encoding as defined by |Unicode|_ (Section 2.5) and is interpreted as a string of Unicode code points.
+A name string must form a valid UTF-8 encoding as defined by |Unicode|_ (Section 2.5) and is interpreted as a string of Unicode scalar values.
 
 .. math::
    \begin{array}{llclll@{\qquad}l}

diff --git a/document/core/util/macros.def b/document/core/util/macros.def
index 67d9b85..34d821c 100644
--- a/document/core/util/macros.def
+++ b/document/core/util/macros.def

@@ -150,7 +150,7 @@
 .. |f64| mathdef:: \xref{syntax/values}{syntax-float}{\fX{\X{64}}}
 
 .. |name| mathdef:: \xref{syntax/values}{syntax-name}{\X{name}}
-.. |codepoint| mathdef:: \xref{syntax/values}{syntax-name}{\X{codepoint}}
+.. |char| mathdef:: \xref{syntax/values}{syntax-name}{\X{char}}
 
 
 .. Values, meta functions
@@ -434,7 +434,7 @@
 .. |Bf64| mathdef:: \xref{binary/values}{binary-float}{\BfX{\B{64}}}
 
 .. |Bname| mathdef:: \xref{binary/values}{binary-name}{\B{name}}
-.. |Bcodepoint| mathdef:: \xref{binary/values}{binary-name}{\B{codepoint}}
+.. |Bchar| mathdef:: \xref{binary/values}{binary-name}{\B{char}}
 
 
 .. Values, meta functions
@@ -593,7 +593,7 @@
 .. |Tstringelem| mathdef:: \xref{text/values}{text-string}{\T{stringelem}}
 .. |Tstringchar| mathdef:: \xref{text/values}{text-string}{\T{stringchar}}
 .. |Tname| mathdef:: \xref{text/values}{text-name}{\T{name}}
-.. |Tcodepoint| mathdef:: \xref{text/values}{text-name}{\T{codepoint}}
+.. |Tchar| mathdef:: \xref{text/values}{text-name}{\T{char}}
 .. |Tcodeval| mathdef:: \xref{text/values}{text-name}{\T{codeval}}
 .. |Tcodecont| mathdef:: \xref{text/values}{text-name}{\T{cont}}
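
Note on the terminology change: the renamed ``char`` production admits exactly the Unicode scalar values, that is, code points outside the surrogate range D800..DFFF, and a ``name`` is a sequence of them whose UTF-8 encoding is shorter than 2^32 bytes. The following is a minimal Python sketch of those two side conditions, not part of the commit or the spec; the helper names ``is_scalar_value`` and ``is_valid_name`` are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def is_scalar_value(n: int) -> bool:
    """Return True if n lies in the ranges of the `char` production:
    U+0000..U+D7FF or U+E000..U+10FFFF (surrogates excluded)."""
    return 0x0000 <= n <= 0xD7FF or 0xE000 <= n <= 0x10FFFF


def is_valid_name(chars: Sequence[int]) -> bool:
    """Check the side condition of the `name` production:
    every element is a scalar value and |utf8(char*)| < 2^32."""
    if not all(is_scalar_value(c) for c in chars):
        return False
    # Safe to encode: surrogates and out-of-range values were rejected above.
    encoded = "".join(chr(c) for c in chars).encode("utf-8")
    return len(encoded) < 2**32
```

Under this sketch, ``is_scalar_value(0xD800)`` is false even though D800 is a code point smaller than 1114112, which is the distinction the new wording makes explicit.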