[zlib][riscv] Import superior Adler-32 implementation

Replace the SiFive code with an alternative checksum implementation
that works in short 22-iteration batches, thus avoiding overflow of
16-bit counters.
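
A minimal scalar sketch of the batching idea, not the vector code
from this patch. It assumes the 16-bit counters hold per-batch running
sums; under that assumption 22 is the largest batch length for which
255 * n * (n + 1) / 2 still fits in 16 bits (64515 for n = 22, 70380
for n = 23). The name adler32_batched is hypothetical, and the real
code may defer the modulo differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MOD_ADLER 65521u
#define BATCH 22 /* assumed to match the 22-iteration batches above */

/* Worst case (all bytes 0xFF) for the sum of prefix sums in one batch. */
_Static_assert(255u * BATCH * (BATCH + 1) / 2 <= UINT16_MAX,
               "batch too long for a 16-bit counter");

/* Hypothetical scalar illustration; not the RVV implementation. */
uint32_t adler32_batched(uint32_t adler, const uint8_t *buf, size_t len) {
    assert(buf != NULL || len == 0);
    uint32_t a = adler & 0xffffu;
    uint32_t b = (adler >> 16) & 0xffffu;

    while (len > 0) {
        size_t n = len < BATCH ? len : BATCH;
        uint16_t sum = 0; /* sum of bytes in the batch, at most 5610 */
        uint16_t tri = 0; /* sum of prefix sums, at most 64515 */

        for (size_t i = 0; i < n; i++) {
            sum += buf[i];
            tri += sum;
        }

        /* Fold the batch into the 32-bit state, then reduce. */
        b = (b + (uint32_t)n * a + tri) % MOD_ADLER;
        a = (a + sum) % MOD_ADLER;

        buf += n;
        len -= n;
    }
    return (b << 16) | a;
}
```

For example, adler32_batched(1, (const uint8_t *)"Wikipedia", 9)
returns 0x11E60398, the same value the byte-at-a-time definition gives.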

As a result, it has better parallelism in the inner loop, yielding
a 20% faster checksum on a K230 board.

The average *decompression* gain when using the zlib wrapper on the
snappy data corpus was +2.15%, reaching nearly +4% for HTML.

Patch by Simon Hosie, from:

Bug: 329282661
Change-Id: I72e2ce9bb9b3d8626dedb33cf026f1af9b9b4a33
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/5433273
Reviewed-by: Hans Wennborg <hans@chromium.org>
Commit-Queue: Adenilson Cavalcanti <cavalcantii@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1284684}
GitOrigin-RevId: f68eb88e6ac1139355bad9d1f1eff784e9e82afb