chromium / external / github.com / google / XNNPACK / HEAD
35b9206
Clean up unnecessary codepaths from operator-run
by Dillon Sharlet
· 6 hours ago
upstream/master
4f38bf9
Added qdu8_qc2w to FC for qc2w. This is needed to use fast avx2 kernels for DQ.
by Misha Gutman
· 22 hours ago
b83baa8
AVX10 qs8 qc2w GEMM microkernels
by Frank Barchard
· 3 days ago
810b9af
[gn] Add experimental integration with KleidiAI
by Richard Townsend
· 3 days ago
661b24e
Improve transpose benchmark tile size selection
by Dillon Sharlet
· 3 days ago
538ac8f
Merge pull request #9516 from ken-unger:f16-vunary-vbinary-rvv
by XNNPACK Team
· 3 days ago
0c8ef5d
Use memcpy to load partial NEON vectors
by Dillon Sharlet
· 4 days ago
9d47c4a
Use simd::store for partial tiles
by Dillon Sharlet
· 4 days ago
b9fde4f
Only allow f32 to bf16 dot rewrite when B is constant. This is a temporary workaround for an accuracy bug.
by Marie White
· 4 days ago
c0a9b3a
Add support for reduce_window(square) rewriting.
by Alexander Shaposhnikov
· 4 days ago
7d657e4
Fix fma benchmark
by Dillon Sharlet
· 4 days ago
4e3eca3
Strengthen disabling of consistent_reduce_test
by Dillon Sharlet
· 4 days ago
ba53e53
Add avx512 transpose kernels
by Dillon Sharlet
· 4 days ago
f6576aa
Use gmock to test simd vector ops
by Dillon Sharlet
· 4 days ago
27769b3
Emulate fp32 dots with 3 bf16 dots
by Marie White
· 4 days ago
c140cb6
[gn] Add extra GN config for MacOS
by Richard Townsend
· 5 days ago
6e4a91d
Use avx512 instructions for 128- and 256-bit partial loads/stores when available.
by Dillon Sharlet
· 5 days ago
41fda8b
For avx512, combine f, bw, vl, dq into one target
by Dillon Sharlet
· 5 days ago
cb4d18a
[gn] Add initial support for building XNNPACK benchmarks
by Richard Townsend
· 5 days ago
49ad0fa
Fix msan for simd/bench
by Dillon Sharlet
· 6 days ago
ad5f476
Make `xnn_fingerprint_id_to_string` part of the available API.
by Quentin Khan
· 6 days ago
beb38a8
Look up packed weights in the cache before computing them in 2D convolutions.
by Quentin Khan
· 6 days ago
aa15b51
Add Hexagon transpose kernels to YNNPACK
by Dillon Sharlet
· 6 days ago
8748d64
Add `interleave_in_place` and optimize lifetimes of transpose intermediates
by Dillon Sharlet
· 6 days ago
a670397
Add initial Hexagon HVX support to YNNPACK
by Dillon Sharlet
· 6 days ago
6932ee7
AVX qd8 qc2w GEMM microkernels generated for MR=2 to 8
by Frank Barchard
· 6 days ago
eec404f
Disable avx512f on gcc9 too.
by Dillon Sharlet
· 6 days ago
1abd243
Optimize partial loads on x86
by Dillon Sharlet
· 6 days ago
14f12a9
Don't rely on dummy load/store ops to avoid incorrectly computing the min/max
by Dillon Sharlet
· 6 days ago
349e42a
Run generators for binary ops on RISC-V
by Frank Barchard
· 6 days ago
eea59d1
Align mask to avoid crossing cache line boundaries
by Dillon Sharlet
· 7 days ago
d4323cb
[gn] Restrict maximum test output lines in CI
by Richard Townsend
· 7 days ago
44284b7
Remove `YNN_ALIGN` macro, `alignas` is standard C++
by Dillon Sharlet
· 7 days ago
396a8ef
Add `zeros` and `undef` tags for partial loads
by Dillon Sharlet
· 7 days ago
ef5c0b9
[gn] Switch off AVX512, reduce volume of output
by Richard Townsend
· 7 days ago
d171011
Add avx2 kernels for statically quantized 2-bit FC.
by Misha Gutman
· 7 days ago
2ae275e
Refactor `src/operators/convolution-nchw.c` to mirror `convolution-nhwc.c`.
by Quentin Khan
· 7 days ago
8023ac4
Add common subexpression elimination pass
by Marie White
· 7 days ago
869b909
Fix benchmarks getting stripped by the linker
by Dillon Sharlet
· 7 days ago
fefde82
Refactor tests:
by Marie White
· 7 days ago
33bda67
Docker.riscv and Docker.sme2 are multi-stage docker files,
by Alexander Shaposhnikov
· 7 days ago
6b2936d
Update partial load implementation in XNNPACK x86 kernels.
by Volodymyr Kysenko
· 7 days ago
def51dc
Add xnnpack user to sudoers inside the container.
by Alexander Shaposhnikov
· 7 days ago
394b332
Add benchmarks of some SIMD wrapper operations
by Dillon Sharlet
· 7 days ago
0d80f77
Disable all AVX512 on gcc9
by Dillon Sharlet
· 7 days ago
b91c032
Another attempt at fixing Windows build.
by Volodymyr Kysenko
· 7 days ago
83ce1b6
SIMD wrapper cleanups
by Dillon Sharlet
· 8 days ago
fb61513
Make SIMD test names consistent with the vector type names.
by Dillon Sharlet
· 8 days ago
7b0beed
cleanup
by Ken Unger
· 8 days ago
b6f471e
Fix missing declaration of header
by Dillon Sharlet
· 8 days ago
c8c583a
Speed up transpose kernel tests
by Dillon Sharlet
· 8 days ago
abe167a
Fix Windows build.
by Volodymyr Kysenko
· 8 days ago
3a0101d
Add pf32 support for f32_f16 FC node to enable SME acceleration.
by XNNPACK Team
· 8 days ago
b6f1b88
Add pf32 support for f32_f16 CONV_2D to enable SME acceleration.
by XNNPACK Team
· 8 days ago
9bab87b
1. Skip adding 0 padding 2. Fix handling of broadcastable dimensions.
by Alexander Shaposhnikov
· 8 days ago
b497ce5
Fix cleanup command for sme2 build.
by Alexander Shaposhnikov
· 8 days ago
7dc968f
Add xnn_get_fingerprint functions to XNNPACK shim (stubs).
by Alexander Shaposhnikov
· 9 days ago
d7ddd02
Store static_pad buffer at the innermost level.
by Volodymyr Kysenko
· 10 days ago
52ac311
Reorder operations in YNNPACK's grouped convolution definition.
by Volodymyr Kysenko
· 10 days ago
4939810
Add scalar support and test coverage of horizontal_sum
by Dillon Sharlet
· 10 days ago
548e701
Add scalar implementation of simd::vec
by Dillon Sharlet
· 10 days ago
f5f93c4
Make all floating point sum kernels arithmetically consistent on x86.
by Dillon Sharlet
· 11 days ago
3e0e760
Fix target processor inference has logic bug
by Frank Barchard
· 11 days ago
92d786a
Add linkopts and malloc settings to new binaries
by Dillon Sharlet
· 11 days ago
d899254
Add max pooling test case to reduce_window test
by Dillon Sharlet
· 11 days ago
1e15a3e
Set output shape too
by Dillon Sharlet
· 11 days ago
0d1b729
[gn] Add support for most of XNNPACK's tests
by Richard Townsend
· 11 days ago
d08f992
More narrowly disable sme2-qemu build
by Dillon Sharlet
· 11 days ago
d698580
Change simd tests to skip in Test::SetUp instead of a custom main
by Dillon Sharlet
· 12 days ago
313afbb
Added qs8_qc2w neondot kernel.
by Misha Gutman
· 12 days ago
fa19b54
Add a test only implementation of reduce_window.
by Alexander Shaposhnikov
· 12 days ago
023f2bd
Force root for pack_b in YNNPACK dot subgraph when num_k_dims > 1.
by Volodymyr Kysenko
· 12 days ago
4e3a0a6
Add subtract_fp32_bf16 kernel
by Marie White
· 12 days ago
102dac3
Temporarily disable sme2-qemu
by Dillon Sharlet
· 12 days ago
675932d
Fix fingerprinting test compilation for C++11.
by Quentin Khan
· 12 days ago
4cad8c8
Temporarily disable sme2-qemu
by Dillon Sharlet
· 12 days ago
4a5d698
Remove WORKSPACE
by Dillon Sharlet
· 12 days ago
cb7ba0c
Update bazel dependencies
by Dillon Sharlet
· 12 days ago
f837e5b
Speculative fix for:
by Dillon Sharlet
· 13 days ago
7cf53d6
Fix workflow trigger paths
by Dillon Sharlet
· 13 days ago
16c0f7d
Temporarily disable sme2-qemu
by Dillon Sharlet
· 13 days ago
f967def
Merge branch 'google:master' into f16-vunary-vbinary-rvv
by Ken Unger
· 13 days ago
b33278f
Add 2-bit QD8_F32_QC2W GEMM for AVX2 - add template and generate 1x8
by Frank Barchard
· 13 days ago
f95d1d3
Make SME2 docker image multi-arch
by Dillon Sharlet
· 13 days ago
46426be
Fix dot_bench argument parsing
by Marie White
· 13 days ago
fd47a05
Remove explicit bazel version
by Dillon Sharlet
· 13 days ago
2ceace7
Fix usage of `std::uniform_int_distribution` with unsupported types.
by Dillon Sharlet
· 13 days ago
a49f92c
Fix incorrect usages of anonymous namespaces
by Dillon Sharlet
· 13 days ago
1b32c8a
Remove slinky integration in XNNPACK
by Dillon Sharlet
· 14 days ago
199903d
Merge pull request #9393 from ken-unger:f32-rsum-rvv
by XNNPACK Team
· 14 days ago
3caec45
Remove 12x4x4 dot kernel from avx512
by Dillon Sharlet
· 14 days ago
b4deb87
Fix flaky failures due to differences in overflow when converting float to bf16
by Dillon Sharlet
· 2 weeks ago
0a76944
Tweak LUTs so we only need unsigned index kernels.
by Dillon Sharlet
· 2 weeks ago
4b8cfde
Add convert bf16 to fp32 kernels for AVX2 and AVX512
by Marie White
· 2 weeks ago
88634cd
Fix 2D lut use cases
by Dillon Sharlet
· 2 weeks ago
4db8f78
Implement convert f32 to bf16 for AVX2 and AVX512BF
by Marie White
· 2 weeks ago
e7da7bc
Clean up `get_binary_kernel`
by Dillon Sharlet
· 2 weeks ago
7e65961
Update KleidiAI dependency
by Dillon Sharlet
· 2 weeks ago
e9459e3
Update cpuinfo, gtest, KleidiAI dependencies
by Dillon Sharlet
· 2 weeks ago
8ff3752
Add `//ynnpack/subgraph/test:dot_bench`
by Dillon Sharlet
· 2 weeks ago