Skip to content

Conversation

@chingiztob
Copy link
Contributor

The std::collections::HashMap uses a DOS-resistant but relatively slow hashing algorithm. In the case of osm4routing, HashMap is heavily used while reading .pbf files, even though cryptographic security is not required.

So why don't we replace it with much faster and lighter ahash instead?

my small criterion benchmark with osm.pbf of 500.000 people city highways shows around 18% perf improvement:

      Running benches/my_benchmark.rs (target/release/deps/my_benchmark-d792725a0a6f62ee)
Gnuplot not found, using plotters backend
Benchmarking benchmark_main: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 103.4s, or reduce sample count to 10.
Benchmarking benchmark_main: Collecting 100 samples in estimated
fib 20                  time:   [1.0229 s 1.0264 s 1.0300 s]
                        change: [-18.570% -18.211% -17.876%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild

The change does not seem to break any public API. However, since it involves a type change, it may require a major version bump (0.8 ?)

@Tristramg
Copy link
Collaborator

Hi, thank you for your pull request.

That’s an interesting simple and promising performance improvement for such a small change.

I would be in favor, but I didn’t notice an improvement; I simply ran the the binary on the pbf of the netherlands.

Could you share a bit more the benchmark you did (the code, the dataset)?

@chingiztob
Copy link
Contributor Author

Hi, i used function similar to "main" with fixed args. Datasets - Yaroslavl and Barcelona roads (pre-filtered for highway only with osmium).

Also added std::time::Instant:now() ... elapsed() check in main, and ran on full denmark pbf. Time dropped from ~37 seconds to ~30.

use criterion::{criterion_group, criterion_main, Criterion};

fn benchmark_main() {
    let mut reader = osm4routing::Reader::new();
    let file = "/home/chingiz/Rust/osm/roads_yar.pbf";

    match reader.read(file) {
        Ok((nodes, edges)) => osm4routing::writers::csv(nodes, edges, "nodes.csv", "edges.csv"),
        Err(error) => println!("Error: {error}"),
    }
}

fn criterion_benchmark(c: &mut Criterion) {
    c.bench_function("benchmark_main", |b| b.iter(benchmark_main));
}

criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);

@chingiztob
Copy link
Contributor Author

chingiztob commented Jan 22, 2025

Hi, thank you for your pull request.

That’s an interesting simple and promising performance improvement for such a small change.

I would be in favor, but I didn’t notice an improvement; I simply ran the the binary on the pbf of the netherlands.

Could you share a bit more the benchmark you did (the code, the dataset)?

     Running benches/my_benchmark.rs (target/release/deps/my_benchmark-d792725a0a6f62ee)
Gnuplot not found, using plotters backend
Benchmarking sample-size-example/my-benchmark_main: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 304.4s.
Benchmarking sample-size-example/my-benchmark_main: Coll
Benchmarking sample-size-example/my-benchmark_main: Anal
sample-size-example/my-benchmark_main
                        time:   [30.054 s 30.288 s 30.532 s]
                        change: [-15.540% -14.687% -13.785%] (p = 0.00 < 0.05)
                        Performance has improved.

benchmark with unfiltered Denmark

@Tristramg
Copy link
Collaborator

Thank you!
I confirm your findings (maybe I was to tired the first time I tested).
Would you mind squashing the 2 commits and bump the minor (middle) version number?

@chingiztob
Copy link
Contributor Author

@Tristramg can you approve the workflow, please

@chingiztob
Copy link
Contributor Author

@Tristramg Ready to merge, i have no push access

@Tristramg Tristramg merged commit bc78837 into rust-transit:main Jan 27, 2025
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants