Change parallelization strategy in rayon

Intends to address the issue of effectively serialized sort, where all
tasks end up being executed on the main thread instead of being
distributed into other workers.

We had neglected that most work is scheduled synchronously (via
append_row, such as in decoder.rs:903, instead of append_rows). This
meant most tasks were executed with an immediate strategy.

The change pushes all items into a bounded task queue that is emptied
and actively worked on when it reaches its maximum capacity, as well as
when any component result is requested. This is in contrast to
std::multithreading, where items are worked on while decoding is in
progress but task queueing itself has more overhead.
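
The queueing strategy described above can be sketched roughly as follows.
This is a minimal illustration using only the standard library, not the
code from this change: `Task`, `BoundedQueue`, `push`, `flush`,
`get_result`, `process` and `demo` are hypothetical names, the capacity
of 4 is arbitrary, and scoped threads stand in for the rayon thread pool.

```rust
use std::thread;

/// Hypothetical stand-in for one row of component data awaiting processing.
struct Task {
    component: usize,
    input: Vec<u8>,
}

/// Hypothetical bounded queue: tasks accumulate until `capacity` is reached,
/// then the whole batch is processed in parallel.
struct BoundedQueue {
    capacity: usize,
    pending: Vec<Task>,
    results: Vec<Vec<u8>>, // one output buffer per component
    flushes: usize,        // number of batches run, kept for illustration
}

impl BoundedQueue {
    fn new(components: usize, capacity: usize) -> Self {
        BoundedQueue {
            capacity,
            pending: Vec::with_capacity(capacity),
            results: vec![Vec::new(); components],
            flushes: 0,
        }
    }

    /// Queue one task; drain the queue once it is full.
    fn push(&mut self, task: Task) {
        self.pending.push(task);
        if self.pending.len() >= self.capacity {
            self.flush();
        }
    }

    /// Process all pending tasks on scoped threads, then store the outputs
    /// in submission order.
    fn flush(&mut self) {
        if self.pending.is_empty() {
            return;
        }
        let batch = std::mem::take(&mut self.pending);
        let outputs: Vec<(usize, Vec<u8>)> = thread::scope(|s| {
            let handles: Vec<_> = batch
                .iter()
                .map(|task| s.spawn(move || (task.component, process(&task.input))))
                .collect();
            handles
                .into_iter()
                .map(|h| h.join().expect("worker panicked"))
                .collect()
        });
        for (component, out) in outputs {
            self.results[component].extend(out);
        }
        self.flushes += 1;
    }

    /// Requesting a result forces any queued work to complete first.
    fn get_result(&mut self, component: usize) -> Vec<u8> {
        self.flush();
        std::mem::take(&mut self.results[component])
    }
}

/// Placeholder work; a real decoder would do its per-row processing here.
fn process(input: &[u8]) -> Vec<u8> {
    input.iter().map(|b| b.wrapping_add(1)).collect()
}

/// Usage: ten tasks with capacity 4 trigger two automatic flushes, and the
/// result request drains the remaining two tasks.
fn demo() -> (Vec<u8>, usize) {
    let mut queue = BoundedQueue::new(1, 4);
    for row in 0..10u8 {
        queue.push(Task { component: 0, input: vec![row; 2] });
    }
    let result = queue.get_result(0);
    (result, queue.flushes)
}
```

The point of the sketch is that queueing a task is cheap and work only
runs in batches, so the cost of handing work to other threads is paid per
batch rather than per row.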

decode a 512x512 JPEG   time:   [1.7317 ms 1.7352 ms 1.7388 ms]
                        change: [-22.895% -22.646% -22.351%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) low mild
  4 (4.00%) high mild
  1 (1.00%) high severe

decode a 512x512 progressive JPEG
                        time:   [4.7252 ms 4.7364 ms 4.7491 ms]
                        change: [-15.641% -15.349% -15.052%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe

decode a 512x512 grayscale JPEG
                        time:   [873.48 us 877.71 us 882.83 us]
                        change: [-11.470% -10.764% -10.041%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
  2 (2.00%) low mild
  9 (9.00%) high mild
  2 (2.00%) high severe

extract metadata from an image
                        time:   [1.1033 us 1.1066 us 1.1099 us]
                        change: [-11.608% -9.8026% -8.3965%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) low severe
  1 (1.00%) low mild
  1 (1.00%) high mild

Benchmarking decode a 3072x2048 RGB Lossless JPEG: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 36.6s, or reduce sample count to 10.
decode a 3072x2048 RGB Lossless JPEG
                        time:   [363.07 ms 363.66 ms 364.27 ms]
                        change: [+0.0997% +0.3692% +0.6323%] (p = 0.01 < 0.05)
                        Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) high mild
  1 (1.00%) high severe

     Running unittests (target/release/deps/large_image-0e61f2c2f07410bd)
Gnuplot not found, using plotters backend
decode a 2268x1512 JPEG time:   [28.755 ms 28.879 ms 29.021 ms]
                        change: [-5.7714% -4.9308% -4.0969%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) high mild
  4 (4.00%) high severe
1 file changed
tree: adcccb1782f5af54d26906fa3b2e92f1d8582248
  1. .github/
  2. benches/
  3. examples/
  4. fuzz/
  5. fuzz-afl/
  6. src/
  7. tests/
  8. .gitignore
  9. appveyor.yml
  10. Cargo.toml
  11. CHANGELOG.md
  12. LICENSE-APACHE
  13. LICENSE-MIT
  14. README.md
  15. rust-toolchain
README.md

jpeg-decoder

A Rust library for decoding JPEGs.

Documentation

Example

Cargo.toml:

[dependencies]
jpeg-decoder = "0.2"

main.rs:

extern crate jpeg_decoder as jpeg;

use std::fs::File;
use std::io::BufReader;

fn main() {
    let file = File::open("hello_world.jpg").expect("failed to open file");
    let mut decoder = jpeg::Decoder::new(BufReader::new(file));
    let pixels = decoder.decode().expect("failed to decode image");
    let metadata = decoder.info().unwrap();
}

Requirements

This crate compiles with rust >= 1.48. Minimum Supported Rust Version:

  • All releases 0.1.* compile with rust >= 1.36.
  • The releases 0.2.* may bump Rust Version requirements (TBD).