zlib: inflate using wider loads and stores

In inflate_fast() the output pointer always has plenty of room to write.
This means that, as long as the target supports them, wide unaligned
loads and stores can be used to transfer several bytes at once.
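
As a rough illustration (a minimal sketch, not the code in this patch),
a copy loop that leans on that spare room might look like the following.
The name wide_copy and the 8-byte chunk size are assumptions made for
the example. memcpy through a temporary is used as the portable way to
express an unaligned load or store.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, for illustration only.
 * Copies len bytes from src to dst, 8 bytes at a time.
 * It may read up to 7 bytes past src + len and write up to 7 bytes
 * past dst + len, so the caller must guarantee that much slack.
 * Assumes dst is at least 8 bytes ahead of src, or the two regions
 * are disjoint. */
static unsigned char *wide_copy(unsigned char *dst,
                                const unsigned char *src,
                                size_t len)
{
    unsigned char *end = dst + len;

    while (dst < end) {
        uint64_t chunk;
        memcpy(&chunk, src, sizeof chunk);  /* unaligned load  */
        memcpy(dst, &chunk, sizeof chunk);  /* unaligned store */
        dst += sizeof chunk;
        src += sizeof chunk;
    }
    return end;  /* logical end of the copy, not the last byte touched */
}
```

Compilers typically turn each fixed-size memcpy into a single load or
store on targets that allow unaligned access.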

When the reference distance is too short, simply unroll the data a
little to increase the distance. Patch by Simon Hosie.
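
The unrolling idea can be sketched as below. This is an illustration
under stated assumptions, not the patch itself: copy_match, copy_chunk
and the 8-byte CHUNK are hypothetical names and sizes. Each pass writes
one chunk and then doubles the distance, because the bytes just written
extend the repeating pattern.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK 8u  /* assumed chunk width for this sketch */

/* Load 8 bytes into a temporary, then store them, so the source and
 * destination may overlap in memory. */
static void copy_chunk(unsigned char *dst, const unsigned char *src)
{
    uint64_t tmp;
    memcpy(&tmp, src, sizeof tmp);
    memcpy(dst, &tmp, sizeof tmp);
}

/* Hypothetical helper: append len bytes at out, copied from dist bytes
 * behind it (an LZ77 match). Requires dist >= 1 and at least CHUNK - 1
 * bytes of writable slack past out + len. */
static unsigned char *copy_match(unsigned char *out, size_t dist, size_t len)
{
    /* out - dist stays equal to from while dist doubles below. */
    const unsigned char *from = out - dist;
    unsigned char *end;

    /* Distance too short for a full chunk: only the first dist bytes
     * of each store are valid, so advance by dist and double it. */
    while (dist < len && dist < CHUNK) {
        copy_chunk(out, from);
        out += dist;
        len -= dist;
        dist += dist;
    }

    /* Now dist >= CHUNK or dist >= len, so whole chunks are safe. */
    end = out + len;
    while (out < end) {
        copy_chunk(out, from);
        out += CHUNK;
        from += CHUNK;
    }
    return end;
}
```

With "abc" already in the buffer, a call with dist 3 and len 10 extends
it to "abcabcabcabca" in three chunk stores.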

PNG decoding performance gains should be around 30-33%.

This also includes the fix reported in madler/zlib#245.

Bug: 697280
Change-Id: I90a9866cc56aa766df5de472cd10c007f4b560d8
Reviewed-on: https://chromium-review.googlesource.com/689961
Reviewed-by: Chris Blume <cblume@chromium.org>
Commit-Queue: Adenilson Cavalcanti <cavalcantii@chromium.org>
Cr-Original-Commit-Position: refs/heads/master@{#505276}
Cr-Mirrored-From: https://chromium.googlesource.com/chromium/src
Cr-Mirrored-Commit: 78104f4d73e3bbb4155fa804d00ed66682180556
5 files changed