Implement vpx_convolve8_avg_horiz_neon using SDOT instruction

Add an alternative AArch64 implementation of
vpx_convolve8_avg_horiz_neon for targets that implement the Armv8.4-A
SDOT (signed dot product) instruction.

The existing MLA-based implementation of vpx_convolve8_avg_horiz_neon
is retained and used on target CPUs that do not implement the SDOT
instruction (or CPUs executing in AArch32 mode). The availability of
the SDOT instruction is indicated by the feature macro
__ARM_FEATURE_DOTPROD.

Bug: b/181236880
Change-Id: Ib435107c47c485f325248da87ba5618d68b0c8ed
1 file changed