commit | c36aa2e9c4a610dd7f5467126c894ac4dcbded02 | |
---|---|---|
author | Jonathan Wright <jonathan.wright@arm.com> | Tue May 30 16:31:18 2023 |
committer | Jonathan Wright <jonathan.wright@arm.com> | Wed May 31 13:34:43 2023 |
tree | 8f3ebabb7de5d4a1eb3af856801ff1b49bb44b94 | |
parent | c738e87f27ef8e12dd28b9052f446a5f69abf3c9 | |

Optimize Neon implementation of vpx_int_pro_row

Double the number of accumulator registers to remove the bottleneck.
Also peel the first loop iteration.

Change-Id: I6a90680369f9c33cdfe14ea547ac1569ec3f50de
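
The two changes named in the message can be illustrated with a portable scalar sketch. This is not the code from this commit and it uses no Neon intrinsics. The function name `column_sums_sketch`, the 16-column width, the even-height requirement, and the omission of any normalization step are all assumptions made for illustration. It shows two independent accumulator sets, so consecutive adds do not depend on one another, and a peeled first iteration that initializes the accumulators from the first rows instead of zeroing them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not the libvpx function): sums each of 16 columns
 * over `height` rows of an 8-bit image. Assumes height is even, at least 2,
 * and small enough that every column sum fits in 16 bits. */
void column_sums_sketch(uint16_t sums[16], const uint8_t *ref, int ref_stride,
                        int height) {
  uint16_t acc0[16], acc1[16]; /* two independent accumulator sets */
  int i, c;

  assert(height >= 2 && height % 2 == 0);

  /* Peeled first iteration: load rows 0 and 1 straight into the
   * accumulators rather than zeroing them and then adding. */
  for (c = 0; c < 16; ++c) {
    acc0[c] = ref[c];
    acc1[c] = ref[ref_stride + c];
  }
  ref += 2 * ref_stride;

  /* Remaining rows, two per iteration: even rows feed acc0 and odd rows
   * feed acc1, so the two add chains are independent of each other. */
  for (i = 2; i < height; i += 2) {
    for (c = 0; c < 16; ++c) {
      acc0[c] = (uint16_t)(acc0[c] + ref[c]);
      acc1[c] = (uint16_t)(acc1[c] + ref[ref_stride + c]);
    }
    ref += 2 * ref_stride;
  }

  /* Combine the partial sums once, after the loop. */
  for (c = 0; c < 16; ++c) sums[c] = (uint16_t)(acc0[c] + acc1[c]);
}
```

In a vectorized version each accumulator set would presumably live in vector registers, and splitting the adds across more registers shortens the dependency chain on any single one.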