[zlib][riscv] Import superior Adler-32 implementation

Replace the SiFive code with an alternative checksum implementation
that works in short 22-iteration batches, thus avoiding overflow of
its 16-bit counters.
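
For illustration only, here is a scalar sketch of the batching idea. It
is not the imported RVV code, and the function name adler32_batched is
hypothetical. It assumes the batch length comes from the weighted sum:
over n bytes that term is at most 255 * n * (n + 1) / 2, which is 64515
for n = 22 and 70380 for n = 23, so 22 would be the longest batch that
still fits in a 16-bit counter.

```c
#include <stddef.h>
#include <stdint.h>

#define ADLER_MOD 65521u /* largest prime below 2^16 */
#define BATCH     22u    /* assumed: longest batch that fits 16 bits */

/* Hypothetical scalar model; the real patch uses RISC-V vector code. */
uint32_t adler32_batched(uint32_t adler, const uint8_t *buf, size_t len)
{
    uint32_t s1 = adler & 0xffffu;
    uint32_t s2 = (adler >> 16) & 0xffffu;

    while (len > 0) {
        size_t n = len < BATCH ? len : BATCH;
        uint16_t sum = 0; /* byte sum, at most 22 * 255 = 5610 */
        uint16_t tri = 0; /* sum of running sums, at most 64515 */

        for (size_t i = 0; i < n; i++) {
            sum = (uint16_t)(sum + buf[i]);
            tri = (uint16_t)(tri + sum);
        }

        /* Fold the batch into the 32-bit state, reducing once per batch. */
        s2 = (s2 + (uint32_t)n * s1 + tri) % ADLER_MOD;
        s1 = (s1 + sum) % ADLER_MOD;

        buf += n;
        len -= n;
    }
    return (s2 << 16) | s1;
}
```

The narrow counters inside the loop never need a modulo or a widening
step; the only reduction happens once per batch in 32-bit arithmetic.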

As a result, it has better parallelism in the inner loop, yielding
a 20% faster checksum on a K230 board.

The average *decompression* gain when using the zlib wrapper on
the snappy data corpus was +2.15%, reaching nearly +4% for HTML.

Patch by Simon Hosie, from:

