  1. 0b66c9f Fix slice bugs by Dillon Sharlet · 3 hours ago upstream/master
  2. 2d16035 Fix bugs with reduce fusion by Dillon Sharlet · 7 hours ago
  3. 7829cd6 Move row sum rewrite to after other optimization rewrites. by Quentin Khan · 11 hours ago
  4. cf96f77 Fix fully-connected DynamicB tests to work with QP8. by Marie White · 18 hours ago
  5. 1dbb15f Fix fully-connected DynamicB tests to work with QP8. by Marie White · 22 hours ago
  6. 73c5abb Fix bug in `get_max_concurrency`. by Marie White · 22 hours ago
  7. 2e6e343 Rewrite reduce kernels to optimize for numerical behavior by Dillon Sharlet · 24 hours ago
  8. 713c3b7 Require reshape strides to be the shape we need too by Dillon Sharlet · 25 hours ago
  9. 5d007c4 Make reference int2/int4 convert work with unaligned n. by Volodymyr Kysenko · 26 hours ago
  10. 25d1560 Fix store in the tail of transpose kernels for sub-byte types. by Volodymyr Kysenko · 28 hours ago
  11. 016914c [gn] Add pthreadpool for the Chromium config by Richard Townsend · 29 hours ago
  12. 28ef957 Disable sum(a*b) => dot(a, b) rewrite if there are no broadcast dimensions on either side by Dillon Sharlet · 29 hours ago
  13. 74daa88 Use internal define_static_expand_dims in define_dot by Dillon Sharlet · 30 hours ago
  14. 6e50ae9 Fix reshape -> slice pattern by Dillon Sharlet · 30 hours ago
  15. f43db48 Fix loss of precision for fp64 constants by Dillon Sharlet · 30 hours ago
  16. 0a27dcf Enable f16 vsin and vcos wasmrelaxedsimd kernel and scalar fallbacks by Frank Barchard · 30 hours ago
  17. c723a99 Prepare static_reduce test for upcoming fp16 to fp32 rewrite. by Quentin Khan · 35 hours ago
  18. f919d36 Don't call optimize in fp16 rewrite tests. by Quentin Khan · 35 hours ago
  19. 99e4485 Add `horizontal_sum` for floating point types by Dillon Sharlet · 2 days ago
  20. 091b9be Enable f16 vsin and vcos wasmrelaxedsimd kernel and scalar fallbacks by XNNPACK Team · 4 days ago
  21. 3245ce2 Enable f16 vsin and vcos wasmrelaxedsimd kernel and scalar fallbacks by Frank Barchard · 4 days ago
  22. 6c8ac56 F16-VTANH for avx512, wasm and scalar by Frank Barchard · 4 days ago
  23. 5039d21 Fix unsimplified slice extents by Dillon Sharlet · 4 days ago
  24. dbf0402 Fix handling of sub-byte types in packer. by Volodymyr Kysenko · 5 days ago
  25. d877e1a Fix warning "unexpected tokens following preprocessor directive - expected a newline" by Dillon Sharlet · 5 days ago
  26. c5c413d [gn] Update DEPS by Richard Townsend · 5 days ago
  27. 84aa6a9 Move gemm, conv shapes hardcoded in benchmarks to text files by Dillon Sharlet · 5 days ago
  28. 4908d19 Fix get_dot_kernel type bug by Marie White · 5 days ago
  29. d48bc34 Add support for rewriting `sum(a*b, init_c)` => `dot(a, b, init_c)` by Dillon Sharlet · 5 days ago
  30. be45bb3 Add more test coverage for reduce operators by Dillon Sharlet · 5 days ago
  31. 48e1d0f Add test coverage of static and dynamic shapes by Dillon Sharlet · 5 days ago
  32. 445e613 Fix spurious debug messages about sum(a*b) -> dot(a, b) rewrites by Dillon Sharlet · 6 days ago
  33. d8f5abe Rename svcnt => svcnts by Dillon Sharlet · 6 days ago
  34. fb15252 Rename QD8F32QC8W benchmark to QD8F32QC8WFullyConnected for consistency. by Volodymyr Kysenko · 6 days ago
  35. c3d8c27 Disabled bmm rewrite by default as gemma4 fails precision. by Misha Gutman · 6 days ago
  36. b12ed13 Fix memory outdated planning optimization invalidated by reshapes. by Quentin Khan · 6 days ago
  37. 8e4e78f Don't use `graph::Tensor` in the XNNPack lowering interface. by Quentin Khan · 7 days ago
  38. 11e206b Implement `static_expand_dims` using `static_transpose` by XNNPACK Team · 7 days ago
  39. 5660b4b Implement `static_expand_dims` using `static_transpose` by Dillon Sharlet · 7 days ago
  40. f6cf463 Disable BatchMatrixMultiplyDequantBmmRewrite test under ynnpack. by Volodymyr Kysenko · 7 days ago
  41. 62f1d60 Added rewrite `bmm(a:f32, dequant(b:qint8):f32) -> f32` into by Misha Gutman · 7 days ago
  42. 16c63a3 Add benchmarks for fully connected with QC4W and QC2W weights. by Volodymyr Kysenko · 7 days ago
  43. 778408a Add exp_fp64 kernels by Dillon Sharlet · 7 days ago
  44. 58a233a Merge pull request #10167 from JonathanC-ARM:jonclo01/sync_bazel_cmake_defaults by XNNPACK Team · 7 days ago
  45. fc7f897 Fix alignment-related crash on AVX512 by Frederic Rechtenstein · 7 days ago
  46. ece55c6 Add rewrite for `sum(a*b)` => `dot(a, b)` where appropriate by Dillon Sharlet · 7 days ago
  47. 58698bd Add `exp2_round` simd helper by Dillon Sharlet · 7 days ago
  48. a571a74 Add tolerance for quantized int8 operations that may round differently by Dillon Sharlet · 8 days ago
  49. 7b1bde3 Remove ternary multiply for purely float types by Dillon Sharlet · 8 days ago
  50. d830cd1 Rewrite reduce(static_transpose(x)) into reduce(x) by Volodymyr Kysenko · 8 days ago
  51. 95103d5 Enable f16 vsqrt wasmrelaxedsimd kernel and scalar fallbacks by Frank Barchard · 8 days ago
  52. a493bbe Add missing benchmark by Dillon Sharlet · 8 days ago
  53. 8f17e0c Relax tolerances of dequantize_dot more by Dillon Sharlet · 8 days ago
  54. b3daaef Enable sum(squared(x)) => sum_squared(x) for fp64 by Dillon Sharlet · 8 days ago
  55. 50b0164 Fix Hexagon HVX build failure 'sf type used as qf32' on Clang 19 by Frank Barchard · 8 days ago
  56. e0729a7 Add assert to catch infinite loop case by Dillon Sharlet · 8 days ago
  57. cceae52 Always constant fold pack_b ops by Dillon Sharlet · 8 days ago
  58. 94ce3bb Refactor extent handling in YNNPACK subgraph. by Volodymyr Kysenko · 8 days ago
  59. 6bd5049 Fixed the crash due to unaligned read. by Misha Gutman · 8 days ago
  60. 52d9458 Merge pull request #9851 from mohammadmseet-hue:fix/nchw-reduce-overflow-and-shim-bounds by XNNPACK Team · 8 days ago
  61. 6dfbf30 Optimize xnn_round_f32 for Hexagon HVX. by Frank Barchard · 8 days ago
  62. b5bc455 Enable adding and removing dimensions via static_transpose by Dillon Sharlet · 8 days ago
  63. 0fc9e7e Disable subgraph_matcher_test when use_ynnpack is enabled. by Volodymyr Kysenko · 8 days ago
  64. a3664b2 Avoid capturing kernel in reduce ops by Dillon Sharlet · 8 days ago
  65. 689c5c6 Removed convert qint8 to qcint8 tests from ynnpack test set. by Misha Gutman · 8 days ago
  66. bb6c6a4 Added convert from qint8 to qcint8. by Misha Gutman · 9 days ago
  67. 8e406b8 Loosen tolerances for dequantize_dot test by Dillon Sharlet · 9 days ago
  68. 807d9f9 Introduce XNN_NO_SANITIZE_FUNCTION macro. by Alexander Shaposhnikov · 10 days ago
  69. c8c8639 Add fp64 fma rules to elementwise compiler by Dillon Sharlet · 11 days ago
  70. 8a3902d Add optimized kernels for fp64 elementwise ops by Dillon Sharlet · 11 days ago
  71. a3da013 Add benchmark coverage of reference fp64 elementwise ops by Dillon Sharlet · 11 days ago
  72. a9390e5 Fix hexagon build by Dillon Sharlet · 11 days ago
  73. e9de268 Add reference kernels for fp64 elementwise ops by Dillon Sharlet · 11 days ago
  74. 8c2df4d Parallelize reductions in YNNPACK by Dillon Sharlet · 11 days ago
  75. 04b6775 Refactor tolerance calculations by Dillon Sharlet · 11 days ago
  76. b0328fc Fix WAsm typo in XNNPACK by renaming to Wasm by Frank Barchard · 11 days ago
  77. 26c61a7 Merge pull request #9989 from ken-unger:f16-unary-trig-rvv by XNNPACK Team · 11 days ago
  78. 6833e63 Merge branch 'master' into f16-unary-trig-rvv by Dillon · 11 days ago
  79. f589c63 Update CMakeLists.txt to match SME defaults from bazel by Jonathan Clohessy · 11 days ago
  80. 4780ab7 Run generator to create rvv kernels by Frank Barchard · 12 days ago
  81. 4a318ee Add portable SIMD template for f16-vsqrt by Frank Barchard · 12 days ago
  82. f81e3ed Support channelwise zero points in YNNPACK quantized dot products. by Volodymyr Kysenko · 12 days ago
  83. b2f46c0 Add a matcher to to check whether two graph are isomorphic. by Quentin Khan · 13 days ago
  84. 5aa5d64 Add a parallel lib to `utils:matchers` for internal targets that are only compiled with OSS. by Quentin Khan · 13 days ago
  85. e2da1ed Add f16_wasmrelaxedsimd SIMD headers by Frank Barchard · 13 days ago
  86. 834051a Merge pull request #10101 from MarkLee131:fix/qpint8-null-deref by XNNPACK Team · 13 days ago
  87. 8e4e9d5 Change reduce to make the identity buffer in slinky, instead of in the subgraph by Dillon Sharlet · 13 days ago
  88. 562e527 Refactor `make_schedule` to allow building just the loop splits, and not a whole `scheduling_info` by Dillon Sharlet · 13 days ago
  89. 51759bd Merge pull request #10102 from MarkLee131:fix/integer-overflow-tensor-size by XNNPACK Team · 13 days ago
  90. bbc68d9 Merge pull request #9963 from velonica0:rvv-elementwise by XNNPACK Team · 13 days ago
  91. 56496fd Add int32 sum kernels by Dillon Sharlet · 13 days ago
  92. 6408104 Resubmit #10069 by Dillon Sharlet · 13 days ago
  93. f074438 Split xnn_safe_mul/xnn_safe_add into separate statements by MarkLee131 · 13 days ago
  94. 1437d94 Use xnn_safe_mul/xnn_safe_add in get_tensor_size by MarkLee131 · 13 days ago
  95. 7b37580 Clarify qpint8 rejection wording by MarkLee131 · 13 days ago
  96. c817561 Move declaration of `NativeStorage` and clarify comment of `StorageImpl`. by Quentin Khan · 13 days ago
  97. 2c52c9f Add a conversion function to be able to specialize buffer copy from a sequence. by Quentin Khan · 13 days ago
  98. 9c70eb9 Add reverse data type to native type mapping. by Quentin Khan · 13 days ago
  99. b298557 Add wrappers for storage type of 2/4 bit int and 16 bit floats. by Quentin Khan · 13 days ago
  100. 4ca5fb8 Merge branch 'master' into f16-unary-trig-rvv by Dillon · 14 days ago