This proposal adds function references that are typed and can be called directly. Unlike funcref and the existing call_indirect instruction, typed function references do not need to be stored in a table to be called (though they can). A typed function reference can be formed from any function index.
The proposal distinguished regular and nullable function reference. The former cannot be null, and a call through them does not require any runtime check.
The proposal has instructions for producing and consuming (calling) function references. It also includes instruction for testing and converting between regular and nullable references.
Typed references have no canonical default value, because they cannot be null. To enable storing them in locals, which so far depend on default values for initialization, the proposal also tracks the initialization status of locals during validation.
Enable efficient indirect function calls without runtime checks
Represent first-class function pointers without the need for tables
Easier and more efficient exchange of function references between modules and with host environment
Separate independently useful features from GC proposal
This proposal is based on the reference types proposal
Add a new form of typed reference type ref $t and a nullable variant (ref null $t), where $t is a type index; can be used as both a value type or an element type for tables
Add an instruction ref.as_non_null that converts a nullable reference to a non-nullable one or traps if null
Add an instruction br_on_null that converts a nullable reference to a non-nullable one or branches if null, as well as the inverted br_on_non_null
Add an instruction call_ref for calling a function through a ref $t
Refine the instruction ref.func $f to return a typed function reference
Track initialisation status of locals during validation and only allow local.get after a local.set/tee in the same or a surrounding block.
Add an optional initializer expression to table definitions, for element types that do not have an implicit default value.
The function $hof takes a function pointer as parameter, and is invoked by $caller, passing $inc as argument:
(type $i32-i32 (func (param i32) (result i32))) (elem declare funcref (ref.func $inc)) (func $hof (param $f (ref $i32-i32)) (result i32) (i32.add (i32.const 10) (call_ref $i32-i32 (i32.const 42) (local.get $f))) ) (func $inc (param $i i32) (result i32) (i32.add (local.get $i) (i32.const 1)) ) (func $caller (result i32) (call $hof (ref.func $inc)) )
It is also possible to create a typed function table:
(table 1 (ref $i32-i32) (ref.func $inc))
Such a table can neither contain null entries nor functions of another type. Because entries can't be null, tables of concrete type are required to be declared with an explicit initialisation value. Any use of call_indirect on this table does hence avoid all runtime checks beyond the basic bounds check. By using multiple tables, each one can be given a homogeneous type. The table can be initialized with an initializer or by growing it.
Typed function references are a subtype of funcref, so they can also be used as untyped references. All previous uses of ref.func remain valid:
(func $f (param i32)) (func $g) (func $h (result i64)) (table 10 funcref) (func $init (table.set (i32.const 0) (ref.func $f)) (table.set (i32.const 1) (ref.func $g)) (table.set (i32.const 2) (ref.func $h)) )
Based on reference types proposal, which introduces types funcref and externref.
A heap type denotes a user-defined or pre-defined data type that is not a primitive scalar:
heaptype ::= <typeidx> | func | extern
$t ok iff $t is defined in the contextfunc ok and extern ok, alwaysIn the binary encoding,
<typeidx> type is encoded as a (positive) signed LEBfuncref, externref, respectivelyA reference type denotes the type of a reference to some data. It may either include or exclude null:
(ref null? <heaptype>) is a new form of reference type
reftype ::= ref null? <heaptype>ref null? <heaptype> ok iff <heaptype> okReference types now all take the form ref null? <heaptype>
funcref and externref are reinterpreted as abbreviations (in both binary and text format) for (ref null func) and (ref null extern), respectivelyfunc and extern as heap types, which is relevant for future extensions such as type importsIn the binary encoding,
funcref and externref continue to exist as shorthands as described abovefunctype* okfunctype* = epsilonfunctype* = functype'* functype''and functype'* ok and functype'' ok using only type indices up to |functype'*|-1The following rules, now defined in terms of heap types, replace and extend the rules for basic reference types.
Reference types are covariant in the referenced heap type
(ref null <heaptype1>) <: (ref null <heaptype2>)<heaptype1> <: <heaptype2>(ref <heaptype1>) <: (ref <heaptype2>)<heaptype1> <: <heaptype2>Non-null types are subtypes of possibly-null types
(ref <heaptype1>) <: (ref null <heaptype2>)<heaptype1> <: <heaptype2>func$t <: func$t = <functype>$t <: $t'$t = <functype> and $t' = <functype'> and <functype> == <functype'>Any numeric value type is defaultable (to 0)
A reference value type is defaultable (to null) iff it is of the form ref null $t
Function-level locals must have a type that is defaultable.
Table definitions with a type that is not defaultable must have an initializer value. (Imports are not affected.)
localtype ::= set? <valtype>set records that the local has been initializedInstructions and instruction sequences are now typed with types of the following form:
instrtype ::= <functype> <localidx>*There is a natural notion of subtyping on instruction types:
[t1*] -> [t2*] x1* <: [t3*] -> [t4*] x2*t1* = t0* t1'*t2* = t0* t2'*[t1'*] -> [t2'*] <: [t3*] -> [t4*]*{x2*} subset {x1*}Block types are instruction types with empty index set.
Note: Extending block types with index sets to allow initialization status to last beyond a block's end is a possible extension.
ref.func creates a function reference from a function index
ref.func $f : [] -> [(ref $t)]$f : $tcall_ref <typeidx> calls a function through a reference
call_ref $t : [t1* (ref null $t)] -> [t2*]$t = [t1*] -> [t2*]nullreturn_call_ref tail-calls a function through a reference
return_call_ref $t : [t1* (ref null $t)] -> [t2*]$t = [t1*] -> [t2*]t2* <: C.resultnullref.null <heaptype> is generalised to take a <heaptype> immediate
ref.null ht: [] -> [(ref null ht)]ht okref.as_non_null converts a nullable reference to a non-null one
ref.as_non_null : [(ref null ht)] -> [(ref ht)]ht oknullbr_on_null <labelidx> checks for null and branches if present
br_on_null $l : [t* (ref null ht)] -> [t* (ref ht)]$l : [t*]ht ok$l on null, otherwise returns operand as non-nullbr_on_non_null <labelidx> checks for null and branches if not present
br_on_non_null $l : [t* (ref null ht)] -> [t*]$l : [t* (ref ht)]ht ok$l if operand is not null, passing the operand itself under non-null type (along with potential additional operands)Note: ref.is_null already exists via the reference types proposal
Typing of local instructions is updated to account for the initialization status of locals.
local.get <localidx>
local.get $x : [] -> [t]$x : set tlocal.set <localidx>
local.set $x : [t] -> [] $x$x : set? tlocal.tee <localidx>
local.tee $x : [t] -> [t] $x$x : set? tNote: These typing rules do not try to exclude indices for locals that have already been set, but an implementation could.
Typing of instruction sequences is updated to account for initialization of locals.
instr*instr1 instr* : [t1*] -> [t3*] x1* x2*instr1 : [t1*] -> [t2*] x1*instr* : [t2*] -> [t3*] x2* under a context where x1* are changed to setepsilon : [] -> [] epsilonNote: These typing rules do not try to eliminate duplicate indices, but an implementation could.
A subsumption rule allows to go to a supertype for any instruction:
instrinstr : [t1*] -> [t2*] x*instr : [t1'*] -> [t2'*] x'*[t1'*] -> [t2'*] x'* <: [t1*] -> [t2*] x*Table definitions have an initializer value:
(table <tabletype> <constexpr>) is an extended form of table definition
(table <limits> <reftype> <constexpr>) ok iff <limits> <reftype> ok and <constexpr> : <reftype>(table <tabletype>) is shorthand for (table <tabletype> (ref.null <heaptype>)), where <heaptype> is the element heap type contained in <tabletype>
| Opcode | Type | Parameters |
|---|---|---|
| -0x10 | funcref | |
| -0x11 | externref | |
| -0x1c | (ref ht) | $t : heaptype |
| -0x1d | (ref null ht) | $t : heaptype |
The opcode for heap types is encoded as an s33.
| Opcode | Type | Parameters |
|---|---|---|
| i >= 0 | i | |
| -0x10 | func | |
| -0x11 | extern |
| Opcode | Instruction | Immediates |
|---|---|---|
| 0x14 | call_ref $t | $t : u32 |
| 0x15 | return_call_ref $t | $t : u32 |
| 0xd4 | ref.as_non_null | |
| 0xd5 | br_on_null $l | $l : u32 |
| 0xd6 | br_on_non_null $l | $l : u32 |
Entries to the table section are extended as follows:
| Table Definition | Note |
|---|---|
| tabletype | null-initialized table (as before) |
| 0x40 0x00 tabletype constexpr | explicitly initialized table |
The encoding of a table type starts with the encoding of a reference type, which cannot be 0x40 (since this is a pseudo type code otherwise only used in block types). Consequently, both forms can be distinguished by the first byte. The second byte is reserved for possible future extensions.
The group decided to go with the “no-frills” approach for the JS API for the time being. In the context of this proposal, that means that conversions (or type reflection) at concrete reference type throws a TypeError exception.
In addition to the rules for basic reference types:
If the target type of a ToWebAssemblyValue is a concrete reference type, then throw TypeError.
If the source type of a ToJSValue is a concrete reference type, then throw TypeError.
GlobalTypeError is produced if the Global constructor is invoked without a value argument but a type that is not defaultable.TableThe Table constructor gets an additional optional argument init that is used to initialize the table slots. It defaults to null. A TypeError is produced if the argument is omitted and the table's element type is not defaultable.
The Table method grow gets an additional optional argument init that is used to initialize the new table slots. It defaults to null. A TypeError is produced if the argument is omitted and the table's element type is not defaultable.
In the presence of the JS type reflection proposal:
Global.type throws TypeError when encountering a concrete reference type.
Table.type throws TypeError when encountering a concrete reference type as element type.
Function.type throws TypeError when encountering a concrete reference type as parameter or result type.
Module.imports throws TypeError when encountering a concrete reference type in any of the global, table or function types of imports.
Module.exports throws TypeError when encountering a concrete reference type in any of the global, table or function types of exports.
Note: The GC proposal is expected to at least add anyref as a recognised RefType, but possibly throw on other abstract reference types.
Post-MVP, the type reflection abilities of the JS API could be refined based on the JS type reflection proposal.
A RefType can be described by an object of the form {ref: HeapType, ...}
type RefType = ... | {ref: HeapType, nullable: bool}A HeapType can be described by a suitable union type
type HeapType = "func" | "extern" | DefTypeNote: The GC proposal adds additional heap types.
In addition to the rules for basic reference types:
Any function that is an instance of WebAssembly.Function with type <functype> is allowed as ref <functype> or ref null <functype>.
Any non-null external reference is allowed as ref extern.
The null value is allowed as ref null ht.
Note: The GC proposal is expected to allow additional conversions for anyref.
In the future (especially with the GC proposal) it will be desirable to allow the usual subtyping rules on function types. This is relevant, for example, to encode method tables.
While doing so, the semantics of static and dynamic type checks need to remain coherent. That is, both static and dynamic type checking (as well as link-time checks) must use the same subtype relation. Otherwise, values of a subtype would not always be substitutable for values of a supertype, thereby breaking a fundemental property of subtyping and various transformations and optimisations relying on substitutability.
At the same time, care has to be taken to not damage the performance of existing instructions. In particular, call_indirect performs a runtime type check over (structural) function types that is expected to be no more expensive than a single pointer comparison. That effectively rules out allowing any form of subtyping to kick in for this check.
This tension with call_indirect can be resolved by distinguishing function types that have further subtypes and those that don't. The latter are sometimes called exact types.
Exact types might come in handy in a few other circumstances, so we could distinguish the two forms in a generic manner by enriching heap types with a flag as follows:
heaptype ::= exact? <typeidx> | func | externExact types are themselves subtypes of corresponding non-exact types, but the crucial difference is that they do not have further subtypes themselves. That is, the following subtype rules would be defined on heap types:
exact? $t <: $t'
$t = <functype> and $t' = <functype'><functype> <: <functype'>exact $t <: exact $t'
$t = <functype> and $t' = <functype'><functype> == <functype'>in combination with the canonical rules for function subtyping itself:
[t1*]->[t2*] <: [t1'*]->[t2'*](t1' <: t1)*(t2 <: t2')*(and appropriate rules for other types that may be added to the type section in the future).
With this:
The type of each function definition is interpreted as its exact type.
The type annotation on call_indirect is interpreted as an exact type. (It would also be possible to extend call_indirect to allow either choice, leaving it to to the producer to make the trade-off between performance and flexibility.)
For imports and references, every module can decide individually where it requires an exact type.
For any import or reference in a module's interface that flows into a function table on which it invokes call_indirect, the module can thereby enforce exact types that are guaranteed to succeed the signature check.
Consider:
(module $A
(type $proc (func))
(func (export "f") (result funcref) ...)
(func (export "g") (result (ref $proc)) ...)
)
(module $B
(type $th (func (result funcref)))
(func $h (import "..." "h") (type exact $th))
(table $tab funcref (elem $h))
(func $call
...
(call_indirect $th (i32.const 0))
...
)
(func $update (param $f (ref exact $th))
(table.set (local.get $f) (i32.const 0))
)
)
In module $B, the imported function $h requires an exact type match. That ensures that a client cannot supply a function of a subtype, such as A.g, which would break the use of call_indirect in $call. An attempt to do so will fail at link time. Passing A.f would be accepted, however.
Similarly, $update can only be passed a reference with exact type, again ensuring that $call will not fail unexpectedly subsequently.
If a function import (or reference) never interferes with call_indirect, however, then the import can have a more flexible type:
(module $C
(type $th (func (result funcref)))
(func $h (import "..." "h") (type $th))
(func $k (result (ref $th))
(call $h)
)
)
Here, the laxer import type is okay: for $h, any function that returns a subtype of funcref is fine, since all choices will satisfy the result type (ref $th) of function $k performing the call.
More generally, the following imports are well-typed:
(module (type $proc (func)) (type $tf (func (result funcref))) (type $tg (func (result (ref $proc)))) (func $h1 (import "A" "g") (type $tg)) (func $h2 (import "A" "g") (type exact $tg)) (func $h3 (import "A" "g") (type $tf)) ;; (func $h4 (import "A" "g") (type exact $tf)) ;; fails to link! )
Note that the use of exact types for imports or references flowing into a function table is not enforced by the type system. It is the producer's responsibility to protect its interface with suitably narrow types.
For exposition, we assumed above that exact types are a new construct, denoted by an explicit flag. However, such a design would still change the meaning of existing type expressions (by suddenly allowing subtyping) and would hence not be backwards compatible, for the reasons stated above.
To avoid breaking existing modules when operating in new environments that support subtyping, we must preserve the current subtype-less, i.e. exact, meaning of the pre-existing function types -- at least in the binary format. Subtypable function references/imports must be introduced as the new construct.
It‘s a slightly different question what to do for the text format, however. To maintain full backwards compatibility, the flag would likewise need to be inverted: instead of an optional exact flag we’d have an optional sub flag with the opposite meaning:
heaptype ::= sub? <typeidx> | func | externIn the binary format it doesn't really matter which way both alternatives are encoded. But for the text format, this inversion will be rather annoying: the normal use case is to allow subtyping; but to express that, every import or reference type will have to explicitly state sub.
One possibility might be to keep the exact syntax above for the text format, and effectively change the interpretation of the existing syntax. That means that existing text format programs using e.g. function imports would change their meaning, and running wasm/wat converters would produce different results. In particular, rebuilding a binary from a pre-existing text file may produce a binary with weaker interface requirements.
Is the risk involved in that worth taking? How much of a problem do we think this would be?