commit | 1c31461771ed6d21101ea7236496a620ba926863 | |
---|---|---|
author | George Steed <george.steed@arm.com> | Thu Apr 18 09:40:41 2024 |
committer | Frank Barchard <fbarchard@chromium.org> | Mon Sep 16 04:31:11 2024 |
tree | adaeea75061dc01560f5defe3f0dfbdc61f23583 | |
parent | 2d62d8d22a612a51f91d9f90f55c674b77b340e9 |
[AArch64] Add Neon dot-product implementation for ARGBGrayRow

We can use dot product instructions to apply the coefficients without needing to use LD4 deinterleaving load instructions, and then TBL to mix in the original alpha component. This is significantly faster on some micro-architectures where LD4 instructions are known to be slow compared to normal loads.

Reduction in cycle counts observed compared to existing Neon code:

Cortex-A55: -12.6%
Cortex-A510: -48.6%
Cortex-A76: -39.7%
Cortex-A720: -52.3%
Cortex-X1: -63.5%
Cortex-X2: -67.0%

Bug: b/42280946
Change-Id: I3641785e74873438acc00d675f5bc490dfa95b50
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5785972
Reviewed-by: Justin Green <greenjustin@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
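The idea in the commit message can be modelled in portable scalar C. This is only a sketch of the arithmetic, not the Neon code from the change. It assumes pixels are stored as B, G, R, A bytes and that the gray weights are the 8-bit fixed-point values 29, 150 and 77 with rounding; both the weights and the names `kGrayCoeffs`, `Dot4` and `GrayRowDotModel` are illustrative assumptions. Giving alpha a coefficient of 0 lets a single 4-way dot product consume a whole interleaved pixel, so no deinterleaving load is needed. The final per-byte stores stand in for the TBL shuffle that mixes the original alpha back in.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed weights for B, G, R (they sum to 256), plus 0 so alpha is ignored.
 * These values are an illustration, not taken from the commit. */
static const uint8_t kGrayCoeffs[4] = {29, 150, 77, 0};

/* Scalar model of a 4-way unsigned byte dot product: the kind of
 * per-32-bit-lane sum a dot-product instruction produces. */
static uint32_t Dot4(const uint8_t* px, const uint8_t* coeffs) {
  uint32_t sum = 0;
  for (int i = 0; i < 4; ++i) {
    sum += (uint32_t)px[i] * coeffs[i];
  }
  return sum;
}

/* Hypothetical helper: converts `width` interleaved 4-byte pixels to gray,
 * keeping each pixel's original alpha byte. */
static void GrayRowDotModel(const uint8_t* src_argb, uint8_t* dst_argb,
                            int width) {
  assert(width >= 0);
  for (int x = 0; x < width; ++x) {
    const uint8_t* px = src_argb + 4 * x;
    uint8_t* out = dst_argb + 4 * x;
    /* Round and narrow the 16-bit-range sum back to 8 bits. */
    uint8_t gray = (uint8_t)((Dot4(px, kGrayCoeffs) + 128) >> 8);
    uint8_t alpha = px[3];
    /* Replicate gray into B, G, R and keep alpha (the TBL step). */
    out[0] = gray;
    out[1] = gray;
    out[2] = gray;
    out[3] = alpha;
  }
}
```

Because the pixel is read as one contiguous 4-byte group, a vector version of this model can use ordinary loads, which is where the commit message locates the speedup on cores with slow LD4.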
libyuv is an open source project that includes YUV scaling and conversion functionality.
See Getting started for instructions on how to get started developing.
You can also browse the docs directory for more documentation.