Use 4-tap interp filter in speed 1 sub-pel motion search

Added a 4-tap interp filter and used it for speed 1 sub-pel motion
search. Speed 2 motion search still uses the bilinear filter, as before.
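
For illustration only, a minimal sketch of what a 4-tap horizontal
interpolation step can look like. The function name, the 7-bit filter
precision, and the half-pel taps (a Catmull-Rom cubic kernel) are
assumptions made for this example and are not taken from this change.

```c
#include <stdint.h>

/* Assumed precision: taps sum to 1 << FILTER_BITS = 128. */
#define FILTER_BITS 7

/* Hypothetical half-pel taps (Catmull-Rom cubic), not the values
 * used by this change. */
static const int16_t kHalfPel4Tap[4] = { -8, 72, 72, -8 };

static uint8_t clip_pixel(int v) {
  return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Writes w samples, each interpolated between src[x] and src[x + 1].
 * The caller must guarantee that src[-1] .. src[w + 1] are readable.
 * Negative sums rely on arithmetic right shift, which mainstream
 * compilers provide. */
static void interp_4tap_horiz(const uint8_t *src, uint8_t *dst, int w,
                              const int16_t *taps) {
  for (int x = 0; x < w; ++x) {
    int sum = 0;
    for (int k = 0; k < 4; ++k) sum += src[x - 1 + k] * taps[k];
    /* Round to nearest, then drop the filter precision bits. */
    dst[x] = clip_pixel((sum + (1 << (FILTER_BITS - 1))) >> FILTER_BITS);
  }
}
```

A 4-tap kernel reads four source pixels per output sample, against two
for bilinear, so it costs more per sample but can follow local curvature
that a bilinear filter cannot.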

Speed 1 borg test showed good bit savings.
         avg_psnr:  ovr_psnr:  ssim:
lowres:  -1.125     -1.179     -1.021
midres:  -0.717     -0.710     -0.543
hdres:   -0.357     -0.370     -0.342
Speed test at speed 1 showed ~10% encoder time increase, partly
because there is no SIMD version of the 4-tap filter yet.

Change-Id: Ic9b48cdc6a964538c20144108526682d64348301
5 files changed