- 0b66c9f Fix slice bugs by Dillon Sharlet · 3 hours ago upstream/master
- 2d16035 Fix bugs with reduce fusion by Dillon Sharlet · 7 hours ago
- 7829cd6 Move row sum rewrite to after other optimization rewrites. by Quentin Khan · 11 hours ago
- cf96f77 Fix fully-connected DynamicB tests to work with QP8. by Marie White · 18 hours ago
- 1dbb15f Fix fully-connected DynamicB tests to work with QP8. by Marie White · 22 hours ago
- 73c5abb Fix bug in `get_max_concurrency`. by Marie White · 22 hours ago
- 2e6e343 Rewrite reduce kernels to optimize for numerical behavior by Dillon Sharlet · 24 hours ago
- 713c3b7 Require reshape strides to be the shape we need too by Dillon Sharlet · 25 hours ago
- 5d007c4 Make reference int2/int4 convert work with unaligned n. by Volodymyr Kysenko · 26 hours ago
- 25d1560 Fix store in the tail of transpose kernels for sub-byte types. by Volodymyr Kysenko · 28 hours ago
- 016914c [gn] Add pthreadpool for the Chromium config by Richard Townsend · 29 hours ago
- 28ef957 Disable sum(a*b) => dot(a, b) rewrite if there are no broadcast dimensions on either side by Dillon Sharlet · 29 hours ago
- 74daa88 Use internal define_static_expand_dims in define_dot by Dillon Sharlet · 30 hours ago
- 6e50ae9 Fix reshape -> slice pattern by Dillon Sharlet · 30 hours ago
- f43db48 Fix loss of precision for fp64 constants by Dillon Sharlet · 30 hours ago
- 0a27dcf Enable f16 vsin and vcos wasmrelaxedsimd kernel and scalar fallbacks by Frank Barchard · 30 hours ago
- c723a99 Prepare static_reduce test for upcoming fp16 to fp32 rewrite. by Quentin Khan · 35 hours ago
- f919d36 Don't call optimize in fp16 rewrite tests. by Quentin Khan · 35 hours ago
- 99e4485 Add `horizontal_sum` for floating point types by Dillon Sharlet · 2 days ago
- 091b9be Enable f16 vsin and vcos wasmrelaxedsimd kernel and scalar fallbacks by XNNPACK Team · 4 days ago
- 3245ce2 Enable f16 vsin and vcos wasmrelaxedsimd kernel and scalar fallbacks by Frank Barchard · 4 days ago
- 6c8ac56 F16-VTANH for avx512, wasm and scalar by Frank Barchard · 4 days ago
- 5039d21 Fix unsimplified slice extents by Dillon Sharlet · 4 days ago
- dbf0402 Fix handling of sub-byte types in packer. by Volodymyr Kysenko · 5 days ago
- d877e1a Fix warning "unexpected tokens following preprocessor directive - expected a newline" by Dillon Sharlet · 5 days ago
- c5c413d [gn] Update DEPS by Richard Townsend · 5 days ago
- 84aa6a9 Move gemm, conv shapes hardcoded in benchmarks to text files by Dillon Sharlet · 5 days ago
- 4908d19 Fix get_dot_kernel type bug by Marie White · 5 days ago
- d48bc34 Add support for rewriting `sum(a*b, init_c)` => `dot(a, b, init_c)` by Dillon Sharlet · 5 days ago
- be45bb3 Add more test coverage for reduce operators by Dillon Sharlet · 5 days ago
- 48e1d0f Add test coverage of static and dynamic shapes by Dillon Sharlet · 5 days ago
- 445e613 Fix spurious debug messages about sum(a*b) -> dot(a, b) rewrites by Dillon Sharlet · 6 days ago
- d8f5abe Rename svcnt => svcnts by Dillon Sharlet · 6 days ago
- fb15252 Rename QD8F32QC8W benchmark to QD8F32QC8WFullyConnected for consistency. by Volodymyr Kysenko · 6 days ago
- c3d8c27 Disabled bmm rewrite by default as gemma4 fails precision. by Misha Gutman · 6 days ago
- b12ed13 Fix outdated memory planning optimization invalidated by reshapes. by Quentin Khan · 6 days ago
- 8e4e78f Don't use `graph::Tensor` in the XNNPack lowering interface. by Quentin Khan · 7 days ago
- 11e206b Implement `static_expand_dims` using `static_transpose` by XNNPACK Team · 7 days ago
- 5660b4b Implement `static_expand_dims` using `static_transpose` by Dillon Sharlet · 7 days ago
- f6cf463 Disable BatchMatrixMultiplyDequantBmmRewrite test under ynnpack. by Volodymyr Kysenko · 7 days ago
- 62f1d60 Added rewrite `bmm(a:f32, dequant(b:qint8):f32) -> f32` into by Misha Gutman · 7 days ago
- 16c63a3 Add benchmarks for fully connected with QC4W and QC2W weights. by Volodymyr Kysenko · 7 days ago
- 778408a Add exp_fp64 kernels by Dillon Sharlet · 7 days ago
- 58a233a Merge pull request #10167 from JonathanC-ARM:jonclo01/sync_bazel_cmake_defaults by XNNPACK Team · 7 days ago
- fc7f897 Fix alignment-related crash on AVX512 by Frederic Rechtenstein · 7 days ago
- ece55c6 Add rewrite for `sum(a*b)` => `dot(a, b)` where appropriate by Dillon Sharlet · 7 days ago
- 58698bd Add `exp2_round` simd helper by Dillon Sharlet · 7 days ago
- a571a74 Add tolerance for quantized int8 operations that may round differently by Dillon Sharlet · 8 days ago
- 7b1bde3 Remove ternary multiply for purely float types by Dillon Sharlet · 8 days ago
- d830cd1 Rewrite reduce(static_transpose(x)) into reduce(x) by Volodymyr Kysenko · 8 days ago
- 95103d5 Enable f16 vsqrt wasmrelaxedsimd kernel and scalar fallbacks by Frank Barchard · 8 days ago
- a493bbe Add missing benchmark by Dillon Sharlet · 8 days ago
- 8f17e0c Relax tolerances of dequantize_dot more by Dillon Sharlet · 8 days ago
- b3daaef Enable sum(squared(x)) => sum_squared(x) for fp64 by Dillon Sharlet · 8 days ago
- 50b0164 Fix Hexagon HVX build failure 'sf type used as qf32' on Clang 19 by Frank Barchard · 8 days ago
- e0729a7 Add assert to catch infinite loop case by Dillon Sharlet · 8 days ago
- cceae52 Always constant fold pack_b ops by Dillon Sharlet · 8 days ago
- 94ce3bb Refactor extent handling in YNNPACK subgraph. by Volodymyr Kysenko · 8 days ago
- 6bd5049 Fixed the crash due to unaligned read. by Misha Gutman · 8 days ago
- 52d9458 Merge pull request #9851 from mohammadmseet-hue:fix/nchw-reduce-overflow-and-shim-bounds by XNNPACK Team · 8 days ago
- 6dfbf30 Optimize xnn_round_f32 for Hexagon HVX. by Frank Barchard · 8 days ago
- b5bc455 Enable adding and removing dimensions via static_transpose by Dillon Sharlet · 8 days ago
- 0fc9e7e Disable subgraph_matcher_test when use_ynnpack is enabled. by Volodymyr Kysenko · 8 days ago
- a3664b2 Avoid capturing kernel in reduce ops by Dillon Sharlet · 8 days ago
- 689c5c6 Removed convert qint8 to qcint8 tests from ynnpack test set. by Misha Gutman · 8 days ago
- bb6c6a4 Added convert from qint8 to qcint8. by Misha Gutman · 9 days ago
- 8e406b8 Loosen tolerances for dequantize_dot test by Dillon Sharlet · 9 days ago
- 807d9f9 Introduce XNN_NO_SANITIZE_FUNCTION macro. by Alexander Shaposhnikov · 10 days ago
- c8c8639 Add fp64 fma rules to elementwise compiler by Dillon Sharlet · 11 days ago
- 8a3902d Add optimized kernels for fp64 elementwise ops by Dillon Sharlet · 11 days ago
- a3da013 Add benchmark coverage of reference fp64 elementwise ops by Dillon Sharlet · 11 days ago
- a9390e5 Fix hexagon build by Dillon Sharlet · 11 days ago
- e9de268 Add reference kernels for fp64 elementwise ops by Dillon Sharlet · 11 days ago
- 8c2df4d Parallelize reductions in YNNPACK by Dillon Sharlet · 11 days ago
- 04b6775 Refactor tolerance calculations by Dillon Sharlet · 11 days ago
- b0328fc Fix WAsm typo in XNNPACK by renaming to Wasm by Frank Barchard · 11 days ago
- 26c61a7 Merge pull request #9989 from ken-unger:f16-unary-trig-rvv by XNNPACK Team · 11 days ago
- 6833e63 Merge branch 'master' into f16-unary-trig-rvv by Dillon · 11 days ago
- f589c63 Update CMakeLists.txt to match SME defaults from bazel by Jonathan Clohessy · 11 days ago
- 4780ab7 Run generator to create rvv kernels by Frank Barchard · 12 days ago
- 4a318ee Add portable SIMD template for f16-vsqrt by Frank Barchard · 12 days ago
- f81e3ed Support channelwise zero points in YNNPACK quantized dot products. by Volodymyr Kysenko · 12 days ago
- b2f46c0 Add a matcher to check whether two graphs are isomorphic. by Quentin Khan · 13 days ago
- 5aa5d64 Add a parallel lib to `utils:matchers` for internal targets that are only compiled with OSS. by Quentin Khan · 13 days ago
- e2da1ed Add f16_wasmrelaxedsimd SIMD headers by Frank Barchard · 13 days ago
- 834051a Merge pull request #10101 from MarkLee131:fix/qpint8-null-deref by XNNPACK Team · 13 days ago
- 8e4e9d5 Change reduce to make the identity buffer in slinky, instead of in the subgraph by Dillon Sharlet · 13 days ago
- 562e527 Refactor `make_schedule` to allow building just the loop splits, and not a whole `scheduling_info` by Dillon Sharlet · 13 days ago
- 51759bd Merge pull request #10102 from MarkLee131:fix/integer-overflow-tensor-size by XNNPACK Team · 13 days ago
- bbc68d9 Merge pull request #9963 from velonica0:rvv-elementwise by XNNPACK Team · 13 days ago
- 56496fd Add int32 sum kernels by Dillon Sharlet · 13 days ago
- 6408104 Resubmit #10069 by Dillon Sharlet · 13 days ago
- f074438 Split xnn_safe_mul/xnn_safe_add into separate statements by MarkLee131 · 13 days ago
- 1437d94 Use xnn_safe_mul/xnn_safe_add in get_tensor_size by MarkLee131 · 13 days ago
- 7b37580 Clarify qpint8 rejection wording by MarkLee131 · 13 days ago
- c817561 Move declaration of `NativeStorage` and clarify comment of `StorageImpl`. by Quentin Khan · 13 days ago
- 2c52c9f Add a conversion function to be able to specialize buffer copy from a sequence. by Quentin Khan · 13 days ago
- 9c70eb9 Add reverse data type to native type mapping. by Quentin Khan · 13 days ago
- b298557 Add wrappers for storage type of 2/4 bit int and 16 bit floats. by Quentin Khan · 13 days ago
- 4ca5fb8 Merge branch 'master' into f16-unary-trig-rvv by Dillon · 14 days ago