Add normalized Levenshtein and Damerau-Levenshtein (#20)

* Add tests for 'normalized_levenshtein'

* Implement 'normalized_levenshtein'

* Implement 'normalized_damerau_levenshtein'

* Add benchmarking of new functions

* Move test cases from integration tests to unit tests

* Use 'is_empty' instead of 'len'

* Count chars instead of bytes

* Update Readme
4 files changed
tree: 9d7e70a603b42d94573e0c7cacae581313b6b182
  1. .editorconfig
  2. .gitattributes
  3. .gitignore
  4. .travis.yml
  5. CHANGELOG.md
  6. Cargo.toml
  7. LICENSE
  8. README.md
  9. appveyor.yml
  10. benches/
  11. dev
  12. src/
  13. tests/
README.md

strsim-rs

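Rust implementations of string similarity metrics: Hamming, Levenshtein (distance and normalized), optimal string alignment, Damerau-Levenshtein (distance and normalized), Jaro, and Jaro-Winkler.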

Installation

# Cargo.toml
[dependencies]
strsim = "0.7.0"

Documentation

You can change the version in the URL to see the documentation for an older version listed in the changelog.

Usage

extern crate strsim;

use strsim::{hamming, levenshtein, normalized_levenshtein, osa_distance, damerau_levenshtein,
             normalized_damerau_levenshtein, jaro, jaro_winkler};

fn main() {
    match hamming("hamming", "hammers") {
        Ok(distance) => assert_eq!(3, distance),
        Err(why) => panic!("{:?}", why)
    }

    assert_eq!(3, levenshtein("kitten", "sitting"));

    assert!((normalized_levenshtein("kitten", "sitting") - 0.57142).abs() < 0.00001);

    assert_eq!(3, osa_distance("ac", "cba"));

    assert_eq!(2, damerau_levenshtein("ac", "cba"));

    assert!((normalized_damerau_levenshtein("levenshtein", "löwenbräu") - 0.27272).abs() < 0.00001);

    assert!((0.392 - jaro("Friedrich Nietzsche", "Jean-Paul Sartre")).abs() <
            0.001);

    assert!((0.911 - jaro_winkler("cheeseburger", "cheese fries")).abs() <
            0.001);
}
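
The normalized values in the example above are consistent with dividing the distance by the character count of the longer string and subtracting the result from 1; for "kitten" and "sitting" that is 1 - 3/7 ≈ 0.57142. The following is a minimal sketch under that assumption, not the crate's actual source. The `levenshtein_sketch` and `normalized_levenshtein_sketch` names are hypothetical, and treating two empty strings as fully similar (1.0) is an assumption made here to avoid dividing by zero. Lengths are counted in chars rather than bytes, in line with the commit message.

```rust
use std::cmp::{max, min};

/// Plain Levenshtein distance over chars (Unicode scalar values), not bytes.
/// Hypothetical helper for illustration only.
fn levenshtein_sketch(a: &str, b: &str) -> usize {
    let b_chars: Vec<char> = b.chars().collect();
    // prev[j] is the distance between the processed prefix of `a` and b[..j].
    let mut prev: Vec<usize> = (0..=b_chars.len()).collect();

    for (i, a_char) in a.chars().enumerate() {
        let mut cur = Vec::with_capacity(b_chars.len() + 1);
        // Turning the first i + 1 chars of `a` into "" takes i + 1 deletions.
        cur.push(i + 1);
        for (j, b_char) in b_chars.iter().enumerate() {
            let substitution = prev[j] + if a_char == *b_char { 0 } else { 1 };
            let deletion = prev[j + 1] + 1;
            let insertion = cur[j] + 1;
            cur.push(min(substitution, min(deletion, insertion)));
        }
        prev = cur;
    }
    prev[b_chars.len()]
}

/// Similarity between 0.0 and 1.0 (inclusive); 1.0 means the strings are equal.
/// Assumption: two empty strings count as identical.
fn normalized_levenshtein_sketch(a: &str, b: &str) -> f64 {
    if a.is_empty() && b.is_empty() {
        return 1.0;
    }
    // Count chars, not bytes, so multi-byte characters are not over-weighted.
    let longest = max(a.chars().count(), b.chars().count());
    1.0 - (levenshtein_sketch(a, b) as f64) / (longest as f64)
}
```

Counting chars matters for inputs such as "löwenbräu", which is 9 chars long but occupies 11 bytes in UTF-8; dividing by the byte length would skew the score for non-ASCII text.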

Development

If you have Docker installed and don't want to install Rust itself, you can run $ ./dev to get a development CLI.

Benchmarks require a nightly toolchain. Run them with cargo +nightly bench.

License

MIT