Pickle Chunked Reading Benchmark

This benchmark measures the performance impact of the chunked reading optimization in GH PR #119204 for the pickle module.

What This Tests

The PR adds chunked reading (1MB chunks) to prevent memory exhaustion when unpickling large objects (a sketch of the idea follows this list):

  • BINBYTES8 - Large bytes objects (protocol 4+)
  • BINUNICODE8 - Large strings (protocol 4+)
  • BYTEARRAY8 - Large bytearrays (protocol 5)
  • FRAME - Large frames
  • LONG4 - Large integers
  • An antagonistic mode that tests malicious pickles designed to induce memory denial of service
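
For intuition, here is a minimal Python sketch of the chunked-read idea. It is illustrative only; the PR's actual implementation lives in C in Modules/_pickle.c and differs in detail:

CHUNK = 1024 * 1024  # 1 MiB, the chunk size described by the PR

def read_chunked(stream, claimed_size):
    """Read claimed_size bytes in 1 MiB chunks instead of one upfront allocation."""
    parts = []
    remaining = claimed_size
    while remaining > 0:
        chunk = stream.read(min(CHUNK, remaining))
        if not chunk:
            # Truncated input: fail before ever allocating claimed_size bytes
            raise EOFError("pickle data was truncated")
        parts.append(chunk)
        remaining -= len(chunk)
    return b"".join(parts)

The key property: a pickle that claims a huge payload but delivers little data fails after allocating at most a few chunks, never the full claimed size.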

Quick Start

# Run full benchmark suite (1MiB → 200MiB, takes several minutes)
build/python Tools/picklebench/memory_dos_impact.py

# Test just a few sizes (quick test: 1, 10, 50 MiB)
build/python Tools/picklebench/memory_dos_impact.py --sizes 1 10 50

# Test smaller range for faster results
build/python Tools/picklebench/memory_dos_impact.py --sizes 1 5 10

# Output as markdown for reports
build/python Tools/picklebench/memory_dos_impact.py --format markdown > results.md

# Test with protocol 4 instead of 5
build/python Tools/picklebench/memory_dos_impact.py --protocol 4

Note: Sizes are specified in MiB. Use --sizes 1 2 5 for 1MiB, 2MiB, 5MiB objects.

Antagonistic Mode (DoS Protection Test)

The --antagonistic flag tests malicious pickles that demonstrate the memory DoS protection:

# Quick DoS protection test (claims 10, 50, 100 MB but provides 1KB)
build/python Tools/picklebench/memory_dos_impact.py --antagonistic --sizes 10 50 100

# Full DoS test (default: 10, 50, 100, 500, 1000, 5000 MB claimed)
build/python Tools/picklebench/memory_dos_impact.py --antagonistic

What Antagonistic Mode Tests

Unlike the normal benchmarks, which exercise legitimate pickles, antagonistic mode tests the following (a sketch of how such a pickle is built appears after this list):

  • Truncated BINBYTES8: Claims 100MB but provides only 1KB (will fail to unpickle)
  • Truncated BINUNICODE8: Same for strings
  • Truncated BYTEARRAY8: Same for bytearrays
  • Sparse memo attacks: a PUT at index 1 billion (would have allocated a huge memo array before the PR)
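
For reference, a truncated BINBYTES8 pickle like the ones this mode exercises can be built by hand in a few lines. This is a minimal sketch; the benchmark's own generator may construct its payloads differently:

import pickle
import struct

claimed = 100 * 1024 * 1024  # header claims a 100 MB payload...
actual = b"A" * 1024         # ...but only 1 KB is actually supplied

malicious = (
    b"\x80\x04"                             # PROTO 4
    + b"\x8e" + struct.pack("<Q", claimed)  # BINBYTES8 with 8-byte length
    + actual                                # truncated payload; loading fails
)                                           # while reading it, no STOP needed

try:
    pickle.loads(malicious)
except Exception as exc:  # typically pickle.UnpicklingError on truncation
    print(type(exc).__name__, exc)

Before the PR, loading this pickle would attempt a single 100MB allocation before noticing the truncation; with chunked reading it fails after at most one 1MB chunk.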

Key difference:

  • Normal mode: Tests real data, shows ~5% time overhead
  • Antagonistic mode: Tests malicious data, shows ~99% memory savings

Expected Results

100MB Claimed (actual: 1KB)
  binbytes8_100MB_claim
    Peak memory:     1.00 MB (claimed: 100 MB, saved: 99.00 MB, 99.0%)
    Error: UnpicklingError  ← Expected!

Summary:
  Average claimed: 126.2 MB
  Average peak:    0.54 MB
  Average saved:   125.7 MB (99.6% reduction)
Protection Status: ✓ Memory DoS attacks mitigated by chunked reading

Before PR: Would allocate the full claimed size (100MB+), potentially crashing.
After PR:  Allocates 1MB chunks, fails fast with minimal memory.

This demonstrates the security improvement: protection against memory exhaustion attacks.

Before/After Comparison

The benchmark includes an automatic comparison feature that runs the same tests against both a baseline Python build and the current one.

Option 1: Automatic Comparison (Recommended)

Build both versions, then use --baseline to automatically compare:

# Build the baseline (main branch without PR)
git checkout main
mkdir -p build-main
cd build-main && ../configure && make -j $(nproc) && cd ..

# Build the current version (with PR)
git checkout unpickle-overallocate
mkdir -p build
cd build && ../configure && make -j $(nproc) && cd ..

# Run automatic comparison (quick test with a few sizes)
build/python Tools/picklebench/memory_dos_impact.py \
  --baseline build-main/python \
  --sizes 1 10 50

# Full comparison (all default sizes)
build/python Tools/picklebench/memory_dos_impact.py \
  --baseline build-main/python

The comparison output shows:

  • Side-by-side metrics (Current vs Baseline)
  • Percentage change for time and memory
  • Overall summary statistics

Interpreting Comparison Results

  • Time change: Small positive % is expected (chunking adds overhead, typically 5-10%)
  • Memory change: Negative % is good (chunking saves memory, especially for large objects)
  • Trade-off: Slightly slower but much safer against memory exhaustion attacks

Option 2: Manual Comparison

Save results separately and compare manually:

# Baseline results
build-main/python Tools/picklebench/memory_dos_impact.py --format json > baseline.json

# Current results
build/python Tools/picklebench/memory_dos_impact.py --format json > current.json

# Manual comparison
diff -y <(jq '.' baseline.json) <(jq '.' current.json)
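
If you want a structured comparison rather than a visual diff, a few lines of Python can compute per-test deltas. Note that the field names below (name, time_mean, peak_memory) are assumptions about the JSON schema, not taken from memory_dos_impact.py; adjust them to match the actual --format json output:

import json

# Field names here are hypothetical; check the real JSON schema first.
baseline = {r["name"]: r for r in json.load(open("baseline.json"))}
current = {r["name"]: r for r in json.load(open("current.json"))}

for name in sorted(baseline.keys() & current.keys()):
    b, c = baseline[name], current[name]
    dt = (c["time_mean"] - b["time_mean"]) / b["time_mean"] * 100
    dm = (c["peak_memory"] - b["peak_memory"]) / b["peak_memory"] * 100
    print(f"{name}: time {dt:+.1f}%, peak memory {dm:+.1f}%")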

Understanding the Results

Critical Sizes

The default test suite includes:

  • < 1MiB (999,000 bytes): No chunking, allocates full size upfront
  • = 1MiB (1,048,576 bytes): Threshold, chunking just starts
  • > 1MiB (1,048,577 bytes): Chunked reading engaged
  • 1, 2, 5, 10MiB: Show scaling behavior with chunking
  • 20, 50, 100, 200MiB: Stress test large object handling

Note: The full suite may require more than 16GiB of RAM.

Key Metrics

  • Time (mean): Average unpickling time; should be similar before and after the PR
  • Time (stdev): Timing consistency; lower is better
  • Peak Memory: Maximum memory during unpickling; expected to be LOWER after the PR (see the measurement sketch below)
  • Pickle Size: Size of the serialized data on disk
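
To reproduce a peak-memory number outside the benchmark, the standard library's tracemalloc offers one way to do it (memory_dos_impact.py may measure peak memory differently):

import pickle
import tracemalloc

data = pickle.dumps(b"x" * (10 * 1024 * 1024), protocol=5)  # 10 MiB payload

tracemalloc.start()
obj = pickle.loads(data)
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
print(f"peak: {peak / 2**20:.2f} MiB")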

Test Types

Test                    What It Stresses
bytes_*                 BINBYTES8 opcode, raw binary data
string_ascii_*          BINUNICODE8 with simple ASCII
string_utf8_*           BINUNICODE8 with multibyte UTF-8 (€ chars)
bytearray_*             BYTEARRAY8 opcode (protocol 5)
list_large_items_*      Multiple chunked reads in sequence
dict_large_values_*     Chunking in dict deserialization
nested_*                Realistic mixed data structures
tuple_*                 Immutable structures

Expected Results

Before PR (main branch)

  • Single large allocation per object
  • Risk of memory exhaustion with malicious pickles

After PR (unpickle-overallocate branch)

  • Chunked allocation (1MB at a time)
  • Slightly higher CPU time (multiple allocations + resizing)
  • Significantly lower peak memory (no large pre-allocation)
  • Protection against DoS via memory exhaustion

Advanced Usage

Test Specific Sizes

# Test only 5MiB and 10MiB objects
build/python Tools/picklebench/memory_dos_impact.py --sizes 5 10

# Test large objects: 50, 100, 200 MiB
build/python Tools/picklebench/memory_dos_impact.py --sizes 50 100 200

More Iterations for Stable Timing

# Run 10 iterations per test for better statistics
build/python Tools/picklebench/memory_dos_impact.py --iterations 10 --sizes 1 10

JSON Output for Analysis

# Generate JSON for programmatic analysis
build/python Tools/picklebench/memory_dos_impact.py --format json | python -m json.tool

Interpreting Memory Results

The peak memory metric shows the maximum memory allocated during unpickling (a small simulation of the growth pattern follows this list):

  • Without chunking: allocates the full size immediately
    • 10MB object → 10MB allocation upfront
  • With chunking: allocates in 1MB chunks and grows geometrically
    • 10MB object → starts with 1MB, grows: 2MB, 4MB, 8MB (final: ~10MB total)
    • Peak is lower because allocation is incremental
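
To make the growth pattern concrete, here is a small simulation of the geometric buffer growth described above. The doubling policy is an illustration; the PR's exact resize strategy may differ:

MB = 1024 * 1024

def allocation_steps(total_size, start=1 * MB):
    """Yield successive buffer capacities under a doubling growth policy."""
    capacity = start
    while capacity < total_size:
        yield capacity
        capacity = min(capacity * 2, total_size)
    yield capacity

print([cap // MB for cap in allocation_steps(10 * MB)])
# prints [1, 2, 4, 8, 10], matching the 1 → 2 → 4 → 8 → ~10MB sequence above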

Typical Results

On a system with the PR applied, you should see:

1.00MiB Test Results
  bytes_1.00MiB:     ~0.3ms, 1.00MiB peak  (just at threshold)

2.00MiB Test Results
  bytes_2.00MiB:     ~0.8ms, 2.00MiB peak  (chunked: 1MiB → 2MiB)

10.00MiB Test Results
  bytes_10.00MiB:    ~3-5ms, 10.00MiB peak (chunked: 1→2→4→8→10 MiB)

Time overhead is minimal (~10-20% for very large objects), but memory safety is significantly improved.