Email Classifier
Paste an email and get it classified as a support request, sales inquiry, spam, or internal message. The tool returns the category, a confidence score, and a suggested priority. It uses a two-stage pipeline: fast rule-based filtering catches obvious spam first, then a trained classifier handles the rest.
- Model: TF-IDF + LinearSVC
- F1 score: 0.91 (weighted)
- Latency: < 50 ms
- API calls: None (local model)
How it works
Classification Pipeline
Stage 1 is a fast rule-based filter that checks for phishing URL patterns, excessive capitalisation (>40% of text), and known spam phrases. Stage 2 runs the cleaned text through a LinearSVC model trained on TF-IDF features from 12k labelled emails. Stage 1 alone catches 30% of spam before the ML model even runs.
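A minimal sketch of what a Stage 1 filter along these lines could look like. The phrase list, the URL pattern (raw-IP hosts or embedded credentials), and the function names are hypothetical placeholders, and the 40% threshold is read here as the share of letters that are upper case. The rules actually shipped are not listed on this page.

```python
import re

# Hypothetical examples; the real spam phrases are not listed in the text.
SPAM_PHRASES = ("act now", "you have won", "wire transfer")

# Hypothetical phishing heuristic: a URL whose host is a raw IPv4 address
# or that embeds credentials before an "@".
SUSPICIOUS_URL = re.compile(
    r"https?://(?:\d{1,3}(?:\.\d{1,3}){3}|[^\s/]*@)[^\s]*",
    re.IGNORECASE,
)

CAPS_THRESHOLD = 0.40  # from the text: more than 40% capitalised


def caps_ratio(text: str) -> float:
    """Share of alphabetic characters that are upper case (assumed metric)."""
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return 0.0
    return sum(c.isupper() for c in letters) / len(letters)


def stage1_is_spam(text: str) -> bool:
    """Return True when any cheap rule fires, so Stage 2 can be skipped."""
    if SUSPICIOUS_URL.search(text):
        return True
    if caps_ratio(text) > CAPS_THRESHOLD:
        return True
    lowered = text.lower()
    return any(phrase in lowered for phrase in SPAM_PHRASES)
```

Anything that returns False here would be handed on to the Stage 2 classifier. Keeping the rules as plain string and regex checks is what makes this stage cheap enough to run on every email.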
Model Architecture & Training
A scikit-learn Pipeline chains text preprocessing, TF-IDF vectorisation, and LinearSVC classification. No deep learning: a deliberate choice for speed (under 50 ms), explainability, and zero GPU dependency. Support precision is 0.93 and spam recall is 0.96.
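The pipeline described above might be assembled roughly as follows. This is a sketch under assumptions: the eight training emails are made up for illustration (the real model uses 12k labelled emails), the vectoriser settings are guesses, and the "confidence" here is simply the raw SVM margin from `decision_function`, since the page does not say how its confidence score is derived.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Made-up toy corpus, for illustration only.
TRAIN_TEXTS = [
    "My account login fails with an error, please help reset my password",
    "The app crashes when I open the billing page, need support",
    "Could you send pricing and a quote for the enterprise plan",
    "We are interested in a demo and volume discount for 50 seats",
    "Congratulations you won a free prize claim your lottery reward",
    "Cheap pills limited offer click to buy cheap watches",
    "Team standup moved to 10am tomorrow, agenda in the shared drive",
    "Reminder to submit your timesheets and expense reports by Friday",
]
TRAIN_LABELS = [
    "support", "support",
    "sales", "sales",
    "spam", "spam",
    "internal", "internal",
]


def train_demo_model() -> Pipeline:
    """Fit a TF-IDF + LinearSVC pipeline on the toy corpus."""
    model = Pipeline([
        # Lowercasing and stop-word removal stand in for "text preprocessing".
        ("tfidf", TfidfVectorizer(lowercase=True, stop_words="english",
                                  ngram_range=(1, 2))),
        ("svm", LinearSVC(random_state=0)),
    ])
    model.fit(TRAIN_TEXTS, TRAIN_LABELS)
    return model


def classify(model: Pipeline, text: str) -> tuple[str, float]:
    """Return (category, margin); the margin is not a calibrated probability."""
    scores = model.decision_function([text])[0]  # one score per class
    best = int(scores.argmax())
    return str(model.classes_[best]), float(scores[best])
```

Calling `classify(train_demo_model(), "Could you send pricing and a quote for the enterprise plan")` returns a category together with its margin. Because the model is linear, the learned weight for each term can be read directly from the fitted `LinearSVC`, which is one concrete form of the explainability mentioned above.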
Security
- All HTML stripped before classification
- Email content processed in memory — never written to disk
- Rate limiting: 10 classifications/min per IP
- Model integrity verified via SHA-256 checksum at startup
- No external API calls — everything runs locally