Log - HEAD - external/github.com/google/XNNPACK

f34fa03 Add simd wrappers for bitwise ops. by Volodymyr Kysenko · 72 minutes ago upstream/master
a94c2ac Update SDE from 10.5 to 10.7 for github testing on Intel by Frank Barchard · 2 hours ago
15e04ff Fix incorrectly validating non-zero zero points for qcint8. by Dillon Sharlet · 6 hours ago
ecb6799 Check that input_b's zero point is 0 before ignoring it by Dillon Sharlet · 7 hours ago
60b526f Use simd wrappers for round, ceil, floor, and sqrt. by Volodymyr Kysenko · 8 hours ago
03aba00 Add floor, ceil, and sqrt SIMD wrappers. by Volodymyr Kysenko · 9 hours ago
6491ba3 Add sse2_fma to emulate fma by Dillon Sharlet · 10 hours ago
a36bd79 Move fp16 rewrite specific fields in `xnn_value` to a dedicated struct. by Quentin Khan · 10 hours ago
12d63c6 Fix incorrect claim that unary_elementwise ops can implicitly broadcast by Dillon Sharlet · 10 hours ago
1b40695 Add `xnn_subgraph_add_internal_values()`. by Quentin Khan · 10 hours ago
fb98246 Refactor `xnn_subgraph_new_node()` to use `xnn_subgraph_add_nodes()`. by Quentin Khan · 11 hours ago
fcd3a3e Move macro comment to the correct place. by Quentin Khan · 13 hours ago
3aacda2 Refactor SIMD wrappers to reduce boilerplate by Dillon Sharlet · 24 hours ago
48c0219 Add u8x4 and u8x8 using uint32_t and uint64_t implementations by Dillon Sharlet · 25 hours ago
156ef8a Test and fix empty reductions in YNNPACK by Dillon Sharlet · 26 hours ago
64e490f Use infix operators if they're defined for simd::vec in elementwise compiler. by Volodymyr Kysenko · 30 hours ago
48d83fe Only test empty reduction inputs, not empty reduction outputs. by Dillon Sharlet · 33 hours ago
1960fd1 Check and get permissions before executing a kernel in schedule_bench tool. by Marie White · 2 days ago
1114532 Refactor emit_op function to remove redundant code. by Volodymyr Kysenko · 3 days ago
bb52ed0 Remove redundant type cast in elementwise kernel compiler. by Volodymyr Kysenko · 3 days ago
d94380b Fix empty reductions by Dillon Sharlet · 4 days ago
f137acb Add a dot kernel benchmark with a manual dot scheduling command line interface by Dillon Sharlet · 5 days ago
34e5cce Enable arm fp32 and fp64 dot kernels to unroll k by more than one vector by Dillon Sharlet · 5 days ago
67dec31 Limit the number of threads used by `dot_bench` by Dillon Sharlet · 5 days ago
f652afe Fixed input size issue found via masan in 2-bit FC. by Misha Gutman · 5 days ago
34f5e8a Tweak scheduling logic for dots by Dillon Sharlet · 6 days ago
2c0677c Rename kAmxTileRowBytes to tile_row_bytes by Marie White · 6 days ago
8da5e67 Add 2x2 AMX BF16 and INT8 kernels by Marie White · 6 days ago
262e13f Add fp64 dot kernels by Dillon Sharlet · 6 days ago
0011854 Added qd8_f16_qc2w and qdu8_f16_qc2w on operator and subgraph levels. by Misha Gutman · 6 days ago
e4b5160 Enable AVXVNNI qd8 qc2w GEMM microkernel by Frank Barchard · 7 days ago
957988f Enable AVXVNNI qs8 qc2w GEMM microkernel by Frank Barchard · 7 days ago
cdd3b23 Change fp16 softmax kernels to compute the sum as fp32 by Dillon Sharlet · 7 days ago
41d9b2d Allow an absolute error of 1 for integer convert outputs. by Dillon Sharlet · 7 days ago
5b4268d Run generators to update pqs8 tests and benchmarks by Frank Barchard · 7 days ago
7734071 AVXVNNI qd8_f16_qc2w and qd8_f32_qc2w GEMM microkernels by Frank Barchard · 7 days ago
9213121 Merge pull request #8880 from qualcomm:sme1/pqs8-qc8w-gemm-igemm by XNNPACK Team · 7 days ago
dc51753 Scalar qd8_f16_qc2w GEMM microkernel by Frank Barchard · 7 days ago
9670025 Reduce log level from info to warning in debug builds by Dillon Sharlet · 7 days ago
b6b9992 Remove `testonly = True` from `benchmark_main` by Dillon Sharlet · 7 days ago
c4c44a7 Skip buffer out of bounds checks when running with msan by Dillon Sharlet · 8 days ago
26feacc Handle quantization with separate ops by Dillon Sharlet · 8 days ago chromium/7718
1b70ef3 AVXVNNI qs8 qc2w GEMM microkernel by Frank Barchard · 8 days ago
7aa9183 Add `ynn_define_tensor` API by Dillon Sharlet · 8 days ago
05e81ea [gn] February update of DEPS by Richard Townsend · 8 days ago
cd2a0d4 Merge pull request #9594 from ken-unger:f16-gemm-spmm-dwconv-rvv by XNNPACK Team · 8 days ago
b7c88d4 Increase timeout for `qd8_f32_qc8w_igemm_minmax_test` by Dillon Sharlet · 9 days ago
406cc2d Unroll the tail loop of SME kernels by Dillon Sharlet · 9 days ago
f870adb Relax tolerance for YNN_NODE_FLAG_F32_DOT_TO_BF16_X3 dots by Dillon Sharlet · 9 days ago
ce7aaae [gn] Link pthreadpool_standalone only against the module that needs it by Richard Townsend · 11 days ago
639decd Use avx512 for small output tiles by Dillon Sharlet · 11 days ago
337dacf Rename avx512f and avx512bw kernels to avx512 by Dillon Sharlet · 12 days ago
77f08c2 Clean up unnecessary checks in generated dot kernels by Dillon Sharlet · 12 days ago
511cc99 Merge remote-tracking branch 'google/master' into sme1/pqs8-qc8w-gemm-igemm by Vaisakh K V · 12 days ago
924bf08 Dot kernel naming and other cleanup by Dillon Sharlet · 12 days ago
cb00a82 Rewrite common subgraphs test to use `ynn_subgraph` by reference by Dillon Sharlet · 12 days ago
c0b9276 s32_mul for simd-hvx.h use mpyieo for upper 16 bits. by Frank Barchard · 12 days ago
035f3dd Add ARM SVE transpose kernels by Dillon Sharlet · 12 days ago
d2ae61d Add SVE partial loads/stores to SIMD wrappers by Dillon Sharlet · 12 days ago
6fb6cad Fix benchmarks attempting to run code without checking the architecture first by Dillon Sharlet · 12 days ago
04581a6 Update SDE from 9.58 to 10.5 for github testing on Intel by Frank Barchard · 13 days ago
6361eab Add `dequantize` ternary kernels and use these to implement convert by Dillon Sharlet · 13 days ago
066e3e7 Add an `operator<<` for `ynn_subgraph` to make test failures easier to read by Dillon Sharlet · 14 days ago
8dc139c add rvv fp16 kernels for f16-gemm, f16-igemm, f16-dwconv, f16-spmm by Ken Unger · 14 days ago
6425d79 Add "scalar" quantize kernels, and use them to implement convert assuming they exist. by Dillon Sharlet · 14 days ago
43feb82 Use slinky::span instead of std::vector when possible by Dillon Sharlet · 14 days ago
726be35 Minor cleanups of fusion by Dillon Sharlet · 2 weeks ago
9ab5966 Randomize arithmetic tests for SIMD wrappers by Dillon Sharlet · 2 weeks ago
d6298c9 [gn] Fixups required for integration into Chromium by Richard Townsend · 2 weeks ago
e3f5ee8 Reorder x86 ISA enum by performance for QD8 to select faster ISA by Frank Barchard · 2 weeks ago
c28ce78 Add basic HVX reduce kernels. by Dillon Sharlet · 2 weeks ago
af8ea33 Use guard bytes on 32-bit arm by Dillon Sharlet · 2 weeks ago
533d396 Clean up f16-raddstoreexpminusmax kernels by Dillon Sharlet · 2 weeks ago
46ac455 Disable our own guard bytes mechanism if we have asan or msan by Dillon Sharlet · 3 weeks ago
f32d329 Fix padding for subconvolution case by Dillon Sharlet · 3 weeks ago
7fd0f79 Use load/store from SIMD wrappers in elementwise compiler. by Volodymyr Kysenko · 3 weeks ago
bd547a5 Add f16x4 load/store operations. by Volodymyr Kysenko · 3 weeks ago
b347035 Allow partial SIMD load/store functions to handle full-size vectors. by Volodymyr Kysenko · 3 weeks ago
2a32558 Add missing rule for `broadcast_like` test by Dillon Sharlet · 3 weeks ago
5121fe0 [gn] Remove unnecessary DEPS by Richard Townsend · 3 weeks ago
5b05286 Use generic simd::min/max for elementwise kernels. by Volodymyr Kysenko · 3 weeks ago
de693ef Replace `TypeGenerator` with `fill_random` by Dillon Sharlet · 3 weeks ago
10b7aa0 Use simd::broadcast in elementwise kernel compiler. by Volodymyr Kysenko · 3 weeks ago
928baa2 Refactor YNNPACK kernel generation to use simd::vec wrappers. (1/N) by Volodymyr Kysenko · 3 weeks ago
226f135 Enable AVX2 and AVX10 5x8 QS8 2 bit GEMM microkernels by Frank Barchard · 3 weeks ago
19e1944 Specialize `TypeGenerator` for all integers by Dillon Sharlet · 3 weeks ago
e607cdd Use shorthand consistent type names for hexagon_hvx simd test by Dillon Sharlet · 3 weeks ago
61ba61d Avoid BENCHMARK_CAPTURE when unnecessary by Dillon Sharlet · 3 weeks ago
e8aad3c Fix 32-bit integer multiply on HVX by Dillon Sharlet · 3 weeks ago
9940dd6 Reduce kernel test improvements by Dillon Sharlet · 3 weeks ago
57b14a8 Better sigmoid reference implementation by Dillon Sharlet · 3 weeks ago
82367b5 Don't replace minimum/maximum operators with clamp when input is broadcast by Reilly Grant · 3 weeks ago
9f84c7a Remove unused generate-spmm-test.py by Dillon Sharlet · 3 weeks ago
236a3e2 Update use of deprecated benchmark::internal::Benchmark by Dillon Sharlet · 3 weeks ago
4a547fe Simplify global construction of GEMM benchmarks by Dillon Sharlet · 3 weeks ago
35b9206 Clean up unnecessary codepaths from operator-run by Dillon Sharlet · 3 weeks ago
4f38bf9 Added qdu8_qc2w to FC for qc2w. This is needed to use fast avx2 kernels for DQ. by Misha Gutman · 3 weeks ago
b83baa8 AVX10 qs8 qc2w GEMM microkernels by Frank Barchard · 4 weeks ago
810b9af [gn] Add experimental integration with KleidiAI by Richard Townsend · 4 weeks ago
661b24e Improve transpose benchmark tile size selection by Dillon Sharlet · 4 weeks ago