vc4: Use NEON to speed up utile stores on Pi2+.

Improves 1024x1024 TexSubImage2D by 41.2371% +/- 3.52799% (n=10).
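
For illustration only, a minimal sketch of what a NEON utile store could
look like. This is not the code from this change. It assumes a 64-byte
utile laid out as 4 rows of 16 bytes (the 32bpp case), and the names
store_utile, UTILE_ROW_BYTES and UTILE_ROWS are hypothetical. When NEON is
unavailable it falls back to one memcpy() per row.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON_SKETCH 1
#else
#define HAVE_NEON_SKETCH 0
#endif

/* Assumption: one utile is 64 bytes, stored as 4 rows of 16 bytes. */
#define UTILE_ROW_BYTES 16
#define UTILE_ROWS 4

/* Hypothetical helper: gather one utile from a linear CPU image, whose
 * rows are cpu_stride bytes apart, into 64 contiguous bytes at gpu.
 */
static void
store_utile(uint8_t *gpu, const uint8_t *cpu, uint32_t cpu_stride)
{
        /* Rows must not overlap in the source image. */
        assert(cpu_stride >= UTILE_ROW_BYTES);

#if HAVE_NEON_SKETCH
        /* Load all four 16-byte rows into q registers first, then write
         * them back to back so the stores hit consecutive addresses.
         */
        uint8x16_t r0 = vld1q_u8(cpu + 0 * cpu_stride);
        uint8x16_t r1 = vld1q_u8(cpu + 1 * cpu_stride);
        uint8x16_t r2 = vld1q_u8(cpu + 2 * cpu_stride);
        uint8x16_t r3 = vld1q_u8(cpu + 3 * cpu_stride);

        vst1q_u8(gpu + 0 * UTILE_ROW_BYTES, r0);
        vst1q_u8(gpu + 1 * UTILE_ROW_BYTES, r1);
        vst1q_u8(gpu + 2 * UTILE_ROW_BYTES, r2);
        vst1q_u8(gpu + 3 * UTILE_ROW_BYTES, r3);
#else
        /* Portable fallback: copy the utile one row at a time. */
        for (uint32_t row = 0; row < UTILE_ROWS; row++) {
                memcpy(gpu + row * UTILE_ROW_BYTES,
                       cpu + row * cpu_stride,
                       UTILE_ROW_BYTES);
        }
#endif
}
```

Both paths produce the same bytes in the destination, so the NEON path can
be checked against the memcpy() fallback.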
1 file changed