Python · pandas · Finance · Automation

Spreadsheet Reconciler

Upload two CSV files (e.g. a bank statement and an accounting ledger) and get a reconciliation report: matched rows, unmatched rows from each source, and discrepancies where amounts differ. Handles up to 100,000 rows per file, with fuzzy date matching and configurable tolerance thresholds.

Matching: Exact + fuzzy
Max rows: 100,000 per file
Data stored: Never
CSV injection: Prevented

How it works

1. Source A CSV
2. Normalise
3. Hash Index
4. Source B CSV
5. Normalise
6. Exact Match
7. Fuzzy Match
8. Report
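
The normalise and exact-match steps above can be sketched in pandas as follows. This is a minimal illustration, not the production code: the column names (`date`, `amount`, `reference`), the candidate date formats, and the helper names are all assumptions made for the example.

```python
import io
import pandas as pd

# Assumed candidate formats; a real detector would likely try more.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d-%m-%Y")
KEYS = ["date", "amount_pence", "reference"]


def parse_dates(col: pd.Series) -> pd.Series:
    """Return the column parsed with the first format that fits every value."""
    for fmt in DATE_FORMATS:
        parsed = pd.to_datetime(col, format=fmt, errors="coerce")
        if parsed.notna().all():
            return parsed
    raise ValueError("unrecognised date format")


def parse_pence(col: pd.Series) -> pd.Series:
    """Strip currency symbols and thousands separators, return integer pence."""
    cleaned = col.astype(str).str.replace(r"[£$€,\s]", "", regex=True)
    return (pd.to_numeric(cleaned) * 100).round().astype("int64")


def normalise(df: pd.DataFrame) -> pd.DataFrame:
    """Build a frame containing only the comparable key columns."""
    return pd.DataFrame({
        "date": parse_dates(df["date"]),
        "amount_pence": parse_pence(df["amount"]),
        "reference": df["reference"].str.strip().str.upper(),
    })


def exact_match(a: pd.DataFrame, b: pd.DataFrame):
    """Outer-join on the key columns and split the result by origin.

    Duplicate keys would pair many-to-many here; a real tool needs a
    policy for that, which this sketch leaves out.
    """
    merged = a.merge(b, on=KEYS, how="outer", indicator=True)
    matched = merged[merged["_merge"] == "both"]
    only_a = merged[merged["_merge"] == "left_only"]
    only_b = merged[merged["_merge"] == "right_only"]
    return matched, only_a, only_b


SOURCE_A = 'date,amount,reference\n2024-03-01,"£1,250.00",INV-1001\n2024-03-02,19.99,INV-1002\n'
SOURCE_B = "date,amount,reference\n01/03/2024,1250.00,inv-1001\n05/03/2024,42.00,INV-1003\n"

a = normalise(pd.read_csv(io.StringIO(SOURCE_A), dtype=str))
b = normalise(pd.read_csv(io.StringIO(SOURCE_B), dtype=str))
matched, only_a, only_b = exact_match(a, b)
```

Reading every column as a string first keeps pandas from guessing types before normalisation, so differences in date format or currency formatting between the two sources are resolved in one place.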

Matching Engine

Date formats are detected automatically and currency amounts are normalised before matching. Exact matching uses a hash-based join on key fields, which runs in O(n) time and matches roughly 80% of rows. The remaining rows are matched fuzzily with configurable tolerances: date ±N days, amount ±N pence, and partial reference matching. A fuzzy match is accepted only if its confidence score is above 0.85.
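
To make the fuzzy stage concrete, here is one way a confidence score could be computed. The weights, the default tolerances, the `Row` type, and the greedy pairing are assumptions for illustration; only the 0.85 threshold and the three kinds of tolerance come from the description above.

```python
from dataclasses import dataclass
from datetime import date
from difflib import SequenceMatcher

THRESHOLD = 0.85  # from the description: only matches above this are accepted


@dataclass(frozen=True)
class Row:
    day: date
    amount_pence: int
    reference: str


def confidence(a: Row, b: Row, max_days: int = 3, max_pence: int = 50) -> float:
    """Score a candidate pair in [0, 1]; 0.0 if it falls outside a tolerance."""
    day_gap = abs((a.day - b.day).days)
    pence_gap = abs(a.amount_pence - b.amount_pence)
    if day_gap > max_days or pence_gap > max_pence:
        return 0.0
    date_score = 1 - day_gap / (max_days + 1)
    amount_score = 1 - pence_gap / (max_pence + 1)
    ref_score = SequenceMatcher(None, a.reference, b.reference).ratio()
    # Assumed weighting: references count slightly more than date or amount.
    return 0.3 * date_score + 0.3 * amount_score + 0.4 * ref_score


def fuzzy_match(left: list[Row], right: list[Row]) -> list[tuple[int, int, float]]:
    """Greedily pair each left row with its best unused right row."""
    used: set[int] = set()
    pairs = []
    for i, a in enumerate(left):
        best_j, best_score = -1, 0.0
        for j, b in enumerate(right):
            if j in used:
                continue
            score = confidence(a, b)
            if score > best_score:
                best_j, best_score = j, score
        if best_score > THRESHOLD:
            used.add(best_j)
            pairs.append((i, best_j, best_score))
    return pairs


ledger = [Row(date(2024, 3, 1), 125000, "INV-1001")]
bank = [
    Row(date(2024, 3, 10), 125000, "INV-1001"),  # nine days apart: outside tolerance
    Row(date(2024, 3, 2), 125000, "INV1001"),    # one day apart, reference differs slightly
]
pairs = fuzzy_match(ledger, bank)
```

The nested loop is quadratic, which is tolerable here only because it runs over the rows left behind by the exact join. A larger residue would call for narrowing the candidates first, for example by bucketing on amount.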

Performance

1.2 s for 10k rows, 4.8 s for 50k, and 11 s for 100k, with peak memory of about 180 MB at 100k rows. The output includes summary stats, matched rows, unmatched rows from each source, and discrepancies, and is downloadable as JSON or CSV.
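
The shape of the report can be illustrated with a short pandas sketch. The section names, the `reference` join key, and the `build_report` helper are assumptions for the example rather than the tool's actual schema.

```python
import json
import pandas as pd


def build_report(a: pd.DataFrame, b: pd.DataFrame, key: str = "reference") -> dict:
    """Join on the key, then split rows into the four report sections."""
    merged = a.merge(b, on=key, how="outer", suffixes=("_a", "_b"), indicator=True)
    both = merged[merged["_merge"] == "both"]
    differs = both["amount_pence_a"] != both["amount_pence_b"]
    sections = {
        "matched": both[~differs],
        "discrepancies": both[differs],
        "unmatched_a": merged[merged["_merge"] == "left_only"],
        "unmatched_b": merged[merged["_merge"] == "right_only"],
    }
    report = {"summary": {name: len(df) for name, df in sections.items()}}
    for name, df in sections.items():
        # Round-trip through to_json so missing values become JSON nulls.
        records = df.drop(columns="_merge").to_json(orient="records")
        report[name] = json.loads(records)
    return report


ledger = pd.DataFrame({
    "reference": ["INV-1001", "INV-1002", "INV-1004"],
    "amount_pence": [125000, 1999, 500],
})
bank = pd.DataFrame({
    "reference": ["INV-1001", "INV-1002", "INV-1003"],
    "amount_pence": [125000, 2099, 4200],
})

report = build_report(ledger, bank)
report_json = json.dumps(report, indent=2)
```

Each section is a plain list of records, so the same structure can be flattened to CSV without extra transformation.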

Security

  • Zero file persistence: files are read into memory, processed, and discarded
  • CSV injection prevention: formula characters (=, +, -, @) escaped
  • Input limits: 20 MB per CSV, 100k rows, 50 columns
  • No external API calls
  • Atomic file handling
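
A rough sketch of the input limits and the CSV injection escaping listed above, using only the standard library. The function names are hypothetical, and three details are assumptions: that 20 MB means 20 × 1024 × 1024 bytes, that the 100k row limit excludes the header, and that cells are escaped by prefixing a single quote.

```python
import csv
import io

FORMULA_PREFIXES = ("=", "+", "-", "@")
MAX_BYTES = 20 * 1024 * 1024  # assumed meaning of "20 MB"
MAX_ROWS = 100_000            # assumed to exclude the header row
MAX_COLUMNS = 50


def check_limits(raw: bytes) -> None:
    """Raise ValueError if an uploaded CSV exceeds any input limit."""
    if len(raw) > MAX_BYTES:
        raise ValueError("file exceeds 20 MB")
    reader = csv.reader(io.StringIO(raw.decode("utf-8-sig"), newline=""))
    header = next(reader, [])
    if len(header) > MAX_COLUMNS:
        raise ValueError("more than 50 columns")
    for count, _ in enumerate(reader, start=1):
        if count > MAX_ROWS:
            raise ValueError("more than 100k rows")


def escape_cell(value):
    """Prefix text cells that a spreadsheet would treat as a formula.

    Only strings are escaped, so numeric cells such as -12.5 stay numeric.
    """
    if isinstance(value, str) and value.startswith(FORMULA_PREFIXES):
        return "'" + value
    return value


def to_safe_csv(rows) -> str:
    """Serialise rows to CSV text with every cell passed through escape_cell."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    for row in rows:
        writer.writerow([escape_cell(cell) for cell in row])
    return buffer.getvalue()


check_limits(b"date,amount,reference\n2024-03-01,1250.00,INV-1001\n")
safe = to_safe_csv([["reference", "note"], ["INV-1001", "=HYPERLINK(A1)"]])
```

Escaping on output rather than rejecting on input means a ledger that legitimately contains references starting with `-` or `+` can still be reconciled.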
