This document proposes an extended API for SIMD.js which is meant provide access to platforms-specific optimizations. It will sit on top of and complement the base API.
The expectation is that most users will use the base API most of the time. While some compromises are being made to serve portability, most of the base API will still be fast, and it will deliver the most consistent results. The extension API will offer opportunities for performance tuning, will support specialized code sequences, and will aid in porting of code from other platforms.
This proposal splits the problem space into two parts:
Primarily, this will use a new
SIMD.Relaxed.Int32x4.fromFloat32x4 // relaxed on NaN or overflow SIMD.Relaxed.Float32x4.max // relaxed on NaN, 0 and -0 fungible SIMD.Relaxed.Int32x4.shiftLeftByScalar // relaxed on shift count overflow ...
SIMD.Relaxed mimic functions in the base API with corresponding names, and provide weaker portability with greater potential for performance, for example by having unspecified results if NaN appear in any part of the (implied) computation, by treating negative zero as interchangeable with zero, or by having unspecified results if an overflow occurs.
Note that an implementation in which these are all identical to their corresponding functions in the base namespace will be fully conforming.
Accompanying this is a new
SIMD.Checked namespace to help developers find errors:
SIMD.Checked.Int32x4.fromFloat32x4 SIMD.Checked.Float32x4.max SIMD.Checked.Int32x4.shiftLeftByScalar ...
SIMD.Checked all correspond to functions in
SIMD.Relaxed and throw on any value which would produce unspecified results. They may also canonicalize negative zero to positive zero. We'll publish a standard polyfill for these functions which implementations or users can use if they wish.
Operations from all platforms are collected together in a single
SIMD.Universe.Float32x4.fma SIMD.Universe.Int32x4.rotateLeft SIMD.Universe.Int32x4.rotateRight SIMD.Universe.Int32x4.signMask // movmskps on x86 SIMD.Universe.Int32x4.bitInsertIfTrue // vbit on ARM ...
Unlike in the
SIMD.Relaxed namespace, these operations all have fairly strict semantics.
We‘ll publish a standard polyfill that will fill in all functions in the
SIMD.Universe namespace that the JIT doesn’t predefine. This will ensure that programs continue to at least execute across platforms, though of course the performance may vary widely.
Some indication of the performance will be made:
This function takes a single argument, a function in the
SIMD.Universe API, and returns a bool indicating whether the given function is “fast” -- roughly meaning a single operation in the underlying platform.