Performance

rapid_textrank is designed for speed. Its Rust core extracts keywords roughly 10-100x faster than pure Python implementations; the exact factor depends on document size and tokenization method.

Key Performance Features

Rust core with zero-copy data paths -- most computation happens in compiled Rust code, minimizing Python overhead
CSR graph format -- Compressed Sparse Row storage for cache-friendly PageRank iteration
String interning -- StringPool reduces the number of memory allocations by 10-100x for typical documents
Parallel processing -- Rayon provides data parallelism for internal graph construction
Link-Time Optimization -- full LTO with single codegen unit for maximum inlining
FxHash -- fast non-cryptographic hashing for internal hash maps
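
To make the CSR bullet above concrete, here is a minimal pure-Python sketch of PageRank over a Compressed Sparse Row graph. It is an illustration only, not the library's Rust implementation: the function names `build_csr` and `pagerank`, the damping factor of 0.85, and the fixed iteration count are all assumptions chosen for this example.

```python
from typing import List, Tuple

Edge = Tuple[int, int, float]  # (node_a, node_b, weight); hypothetical edge type


def build_csr(num_nodes: int, edges: List[Edge]):
    """Build CSR arrays (row_ptr, col_idx, weights) for an undirected graph."""
    # Count the neighbours of every node so row offsets can be prefix-summed.
    counts = [0] * num_nodes
    for a, b, _ in edges:
        counts[a] += 1
        counts[b] += 1

    # row_ptr[i]..row_ptr[i + 1] is the slice of col_idx holding node i's neighbours.
    row_ptr = [0] * (num_nodes + 1)
    for i in range(num_nodes):
        row_ptr[i + 1] = row_ptr[i] + counts[i]

    col_idx = [0] * row_ptr[-1]
    weights = [0.0] * row_ptr[-1]
    cursor = row_ptr[:-1]  # slicing copies, so row_ptr itself is not modified
    for a, b, w in edges:
        # Store each undirected edge in both rows.
        for src, dst in ((a, b), (b, a)):
            col_idx[cursor[src]] = dst
            weights[cursor[src]] = w
            cursor[src] += 1
    return row_ptr, col_idx, weights


def pagerank(row_ptr, col_idx, weights, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration (assumes no isolated nodes)."""
    n = len(row_ptr) - 1
    # Total edge weight leaving each node, used to normalise contributions.
    out_weight = [sum(weights[row_ptr[i]:row_ptr[i + 1]]) for i in range(n)]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = [(1.0 - damping) / n] * n
        for i in range(n):
            # Each row is a contiguous slice, which is what makes CSR cache-friendly.
            for k in range(row_ptr[i], row_ptr[i + 1]):
                j = col_idx[k]
                if out_weight[j] > 0.0:
                    new_scores[i] += damping * weights[k] * scores[j] / out_weight[j]
        scores = new_scores
    return scores


# A small star graph: node 0 is linked to nodes 1, 2 and 3.
star_edges = [(0, 1, 1.0), (0, 2, 1.0), (0, 3, 1.0)]
star_csr = build_csr(4, star_edges)
star_scores = pagerank(*star_csr)
```

In this sketch the hub node ends up with the highest score, because every other node passes all of its weight to it. Three flat arrays replace per-node neighbour lists, so the inner loop walks memory sequentially.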

Approximate Speedups

Document Size	rapid_textrank	pytextrank + spaCy	Speedup
Small (~20 words)	~0.1 ms	~5 ms	~50x
Medium (~100 words)	~0.3 ms	~15 ms	~50x
Large (~1000 words)	~2 ms	~80 ms	~40x

Results are approximate and vary by hardware. See the Benchmarks page for a runnable benchmark script.
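
For a quick check on your own hardware, timings like those in the table can be gathered with a small harness such as the sketch below. It is not the script from the Benchmarks page. The helpers `median_ms` and `speedup` are hypothetical names, and `naive_keywords` is a placeholder extractor used only so the example runs without any third-party package; swap in a function that wraps the extractor you want to measure.

```python
import statistics
import time
from typing import Callable, List


def median_ms(func: Callable[[str], object], text: str,
              repeats: int = 20, warmup: int = 3) -> float:
    """Return the median wall-clock time of func(text) in milliseconds."""
    # Warm-up runs keep one-time costs (imports, caches) out of the samples.
    for _ in range(warmup):
        func(text)
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(text)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


def naive_keywords(text: str) -> List[str]:
    """Placeholder extractor (word-frequency count), NOT a real TextRank."""
    counts = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return sorted(counts, key=counts.get, reverse=True)[:5]


sample_text = "graph ranking " * 50  # roughly 100 words, like the medium row
elapsed = median_ms(naive_keywords, sample_text)
print(f"naive_keywords: {elapsed:.3f} ms (median of 20 runs)")

# The Speedup column is the ratio of the two timings, e.g. ~80 ms / ~2 ms.
print(f"large-document speedup from the table: ~{speedup(80.0, 2.0):.0f}x")
```

The median is used rather than the mean so that a single slow run, for example one interrupted by the operating system, does not skew the result.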

Learn More

Benchmarks -- detailed benchmark results and a script to measure performance on your system
Why Rust is Fast -- deep dive into the performance optimizations used in rapid_textrank
Comparison -- how rapid_textrank compares to alternative keyword extraction libraries