v2.5.3 — Indexing Is 5.6x Faster. Here's What We Found.

2026-04-16 · 5 min read
Quick Answer: v2.5.3 cuts indexing time from 171.6 minutes to 30.8 minutes (5.6x faster) for a 6,512-file library. The main bottleneck was a single 45MB Excel file that took over 2 hours to process. We added a 10MB size cap and text-only extraction to eliminate it.

171 minutes → 30.8 minutes. 5.6x faster. Same 6,512 files.

That's the headline number for v2.5.3. But the story behind it is more interesting than the number itself — because the fix wasn't "make everything faster." It was "find the one thing that was making everything slow."

The diagnosis: Excel files were eating 85% of indexing time

We profiled real-world indexing on a library of 6,512 work documents — IPO filings, contracts, financial reports, spreadsheets. The old indexing rate was 37.9 files per minute, which meant a full index took almost 3 hours.
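The headline figures follow directly from the two run times. Here is a quick check that uses only numbers quoted in this post; the variable names are ours, chosen for illustration.

```python
FILES = 6512
OLD_MINUTES = 171.6
NEW_MINUTES = 30.8

old_rate = FILES / OLD_MINUTES       # files per minute before v2.5.3
new_rate = FILES / NEW_MINUTES       # files per minute after v2.5.3
speedup = OLD_MINUTES / NEW_MINUTES  # how many times faster the new run is

print(f"old rate: {old_rate:.1f} files/min")  # 37.9
print(f"new rate: {new_rate:.1f} files/min")  # 211.4
print(f"speedup:  {speedup:.1f}x")            # 5.6
```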

When we broke down the timing by file, the answer was immediately obvious: Excel files (.xlsx) consumed 85.7% of total indexing time. One 45MB spreadsheet alone took over 2 hours to process. That single file was the bottleneck for the entire library.
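A per-file timing breakdown like this one is cheap to reproduce. The sketch below is not LocalSynapse's profiler; it assumes you have already recorded one (path, seconds) pair per file, for example by wrapping each parse call with time.perf_counter(), and it groups the total by file extension. The function name and the sample timings are made up for illustration.

```python
from collections import defaultdict
from pathlib import PurePath

def time_share_by_extension(timings):
    """Return {extension: fraction of total seconds}, largest share first.

    `timings` is an iterable of (path, seconds) pairs, one per indexed file.
    """
    totals = defaultdict(float)
    for path, seconds in timings:
        ext = PurePath(path).suffix.lower() or "(none)"
        totals[ext] += seconds
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return {ext: secs / grand_total for ext, secs in ranked}

# Made-up sample timings, for illustration only.
sample = [
    ("finance/model.xlsx", 90.0),
    ("contracts/nda.pdf", 6.0),
    ("reports/q3.docx", 4.0),
]
shares = time_share_by_extension(sample)
for ext, share in shares.items():
    print(f"{ext}: {share:.1%}")
```

Sorting by share puts the worst offender on the first line, which is usually all you need to spot a single dominant format.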

The root cause was the parser attempting to extract and index every cell from massive spreadsheets, including machine-generated data dumps that no human would ever search through. The fix was surgical: a 10MB size cap and text-only extraction.

Result: the same 6,512-file library now indexes in 30.8 minutes instead of 171.6 minutes.
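To make the 10MB size cap and text-only extraction concrete, here is a minimal sketch of how such a policy could look. It is not the actual LocalSynapse code. Several details are assumptions on our part: that the cap applies to .xlsx files only, that 10MB means 10 × 1024 × 1024 bytes, that oversized files are still indexed by filename, and that "text-only" means keeping string cells while skipping numbers, dates, and blanks. All names here are hypothetical.

```python
MAX_SPREADSHEET_BYTES = 10 * 1024 * 1024  # the 10MB cap; binary MB is an assumption

def plan_extraction(path, size_bytes):
    """Decide how to treat a file before any parsing happens.

    Hypothetical policy: oversized spreadsheets are indexed by name only,
    other spreadsheets get text-only extraction, everything else is parsed in full.
    """
    is_spreadsheet = path.lower().endswith(".xlsx")
    if is_spreadsheet and size_bytes > MAX_SPREADSHEET_BYTES:
        return "filename_only"
    if is_spreadsheet:
        return "text_only"
    return "full"

def text_cells(rows):
    """Yield only non-empty string cells, skipping numbers, dates, and blanks."""
    for row in rows:
        for value in row:
            if isinstance(value, str) and value.strip():
                yield value.strip()

print(plan_extraction("exports/dump.xlsx", 45 * 1024 * 1024))
print(list(text_cells([["Revenue", 1200, None], ["  ", "Q3 total", 3.5]])))
```

Checking the size before opening the file is the point of the design: the expensive parse never starts for a file that is going to be capped anyway.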

Search response: faster and more consistent

We also fixed several issues that made search feel slower than it actually was.

Target: P95 search response ≤ 150ms, down from a 336ms baseline.
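For readers who want to check a P95 target against their own measurements, here is a small sketch that uses the nearest-rank percentile method. We don't know which percentile method the LocalSynapse benchmarks use, so treat the choice as an assumption; the latency samples are made up.

```python
import math

def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank into the sorted list
    return ordered[rank - 1]

# Made-up search latencies in milliseconds, including one slow outlier.
latencies_ms = [42, 55, 61, 48, 70, 66, 51, 58, 49, 63,
                45, 57, 62, 53, 47, 68, 59, 52, 64, 340]

p95 = percentile_nearest_rank(latencies_ms, 95)
meets_target = p95 <= 150  # the 150ms target from this release
print(p95, meets_target)
```

P95 is the value that 95% of searches stay at or below. Here the single 340ms outlier sits in the slowest 5%, so it does not move the result.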

Search ranking: body content finally competes with filenames

Previously, a single keyword match in a filename scored 5x higher than the same keyword appearing dozens of times in a document body. That meant a file named report.docx would always outrank a 50-page document full of the word "report" in its actual content.

v2.5.3 reduces the filename boost from 5.0x to 2.5x and increases the folder path signal. Documents in relevant folders now surface more naturally, and body content gets a fair shot at the top of results.
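The effect of rebalancing these weights is easiest to see with a toy scoring function. The sketch below is not the LocalSynapse ranking formula. Only the filename boosts (5.0 and 2.5) come from this post; the saturating body term, the constant k, and both folder weights (0.5 and 2.0) are invented for illustration.

```python
def score(filename_hit, body_tf, folder_hit, filename_boost, folder_weight, k=1.2):
    """Toy relevance score: saturating body term plus filename and folder bonuses.

    filename_hit and folder_hit are 1 or 0; body_tf is the keyword count in the body.
    """
    body = body_tf / (body_tf + k) if body_tf > 0 else 0.0
    return body + filename_boost * filename_hit + folder_weight * folder_hit

def winner(filename_boost, folder_weight):
    """Compare two hypothetical documents and return the higher-scoring one."""
    # Doc A: keyword appears only in the filename.
    a = score(1, 0, 0, filename_boost, folder_weight)
    # Doc B: keyword appears 40 times in the body, in a folder whose path also matches.
    b = score(0, 40, 1, filename_boost, folder_weight)
    return "A" if a > b else "B"

old_winner = winner(filename_boost=5.0, folder_weight=0.5)  # hypothetical old folder weight
new_winner = winner(filename_boost=2.5, folder_weight=2.0)  # hypothetical new folder weight
print(old_winner, new_winner)
```

Because the body term saturates, repeating a keyword dozens of times can never beat a large fixed filename bonus on its own. Lowering that bonus and strengthening the folder signal is what lets the content-rich document win in this example.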

Parser quality: 8 formats verified on 80 real documents

We tested every supported parser against real work documents — not synthetic test files, but actual IPO filings, contracts, and financial reports. Overall quality score: 4.3 out of 5.

Specific fixes:

Under the hood

Get v2.5.3

Download from localsynapse.com. If you're upgrading from a previous version, re-indexing is recommended to get the full benefit of the parser and speed improvements.

Free. Open source (Apache 2.0). Windows and macOS.

