add LOCAL_CLANG_PREREQ and avoid WORK_AROUND_GCC w/clang 3.8+

this results in a 15-20% speedup for lossy decoding on an N5/S6/CM1
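
for illustration only, a minimal sketch of how a clang version-check macro
of this kind might gate the workaround. PACK_VERSION and UsesWorkaround are
hypothetical names, and the exact condition (which also involves gcc
versions in practice) is an assumption, not the code from this change:

```c
// Hypothetical helper: pack major/minor into one comparable integer.
#define PACK_VERSION(maj, min) (((maj) << 8) | (min))

#ifdef __clang__
# define LOCAL_CLANG_VERSION PACK_VERSION(__clang_major__, __clang_minor__)
# define LOCAL_CLANG_PREREQ(maj, min) \
    (LOCAL_CLANG_VERSION >= PACK_VERSION(maj, min))
#else
# define LOCAL_CLANG_VERSION 0
# define LOCAL_CLANG_PREREQ(maj, min) 0
#endif

// Assumed gating: keep the workaround only for clang older than 3.8.
// The real condition likely also checks gcc versions and the target.
#if defined(__clang__) && !LOCAL_CLANG_PREREQ(3, 8)
# define WORK_AROUND_GCC
#endif

// Hypothetical probe: returns 1 if the workaround path is compiled in.
static int UsesWorkaround(void) {
#ifdef WORK_AROUND_GCC
  return 1;
#else
  return 0;
#endif
}
```

with clang 3.8 or newer, LOCAL_CLANG_PREREQ(3, 8) is nonzero, so
WORK_AROUND_GCC stays undefined and the faster code path can be used.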

BUG=webp:339

Change-Id: Icdeb84c3e0b8908147ac276b4d8f76c3d565b735
(cherry picked from commit f78da3dea6b2e02974a647122e96777667875d21)
2 files changed