Python, scikit-learn, NLP, FastAPI

Email Classifier

Paste an email and get it classified as a support request, sales inquiry, spam, or internal mail. The classifier returns the category, a confidence score, and a suggested priority. It uses a two-stage pipeline: fast rule-based filtering catches obvious spam first, then a trained classifier handles the rest.

[Image: industrial mail sorting machine with brass chutes directing envelopes into classified categories using AI]

Model: TF-IDF + LinearSVC
F1 score: 0.91 weighted
Latency: < 50ms
API calls: None (local model)

How it works

1. Raw Email → 2. Sanitise → 3. Rule Filter → 4. TF-IDF Vectorise → 5. LinearSVC Classify → 6. Output

Classification Pipeline

Stage 1 is a fast rule-based filter that checks for phishing URL patterns, excessive capitalisation (more than 40% of the text), and known spam phrases. Stage 2 passes the cleaned text of everything else through a LinearSVC model trained on TF-IDF features from 12k labelled emails. The rule filter catches 30% of spam before the ML model even runs.
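As a rough illustration of Stage 1, here is a minimal sketch of a rule filter built on the three signals named above. The phrase list, the URL pattern, and the function names are hypothetical placeholders; only the 40% capitalisation threshold comes from the description, and measuring it over alphabetic characters is an assumption.

```python
import re
from typing import Optional

# Hypothetical values: the three signals are named above, the actual patterns are not.
SPAM_PHRASES = ("act now", "you have won", "click here to claim")
# Flags URLs whose host is a raw IP address or ends in one of a few example TLDs.
SUSPICIOUS_URL = re.compile(
    r"https?://(?:\d{1,3}(?:\.\d{1,3}){3}|[^\s/]*\.(?:zip|xyz|top))(?:[/:\s]|$)",
    re.IGNORECASE,
)
CAPS_THRESHOLD = 0.40  # "more than 40% of the text"


def caps_ratio(text: str) -> float:
    """Share of alphabetic characters that are upper case (assumed definition)."""
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return 0.0
    return sum(c.isupper() for c in letters) / len(letters)


def rule_filter(text: str) -> Optional[str]:
    """Return the name of the rule that fired, or None to defer to Stage 2."""
    if SUSPICIOUS_URL.search(text):
        return "phishing_url"
    if caps_ratio(text) > CAPS_THRESHOLD:
        return "excessive_caps"
    lowered = text.lower()
    if any(phrase in lowered for phrase in SPAM_PHRASES):
        return "spam_phrase"
    return None
```

Returning the rule name rather than a bare boolean keeps the spam decision explainable: the caller can report which rule matched, and anything that returns None goes on to the classifier.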

Model Architecture & Training

A scikit-learn Pipeline chains text preprocessing, TF-IDF vectorisation, and LinearSVC classification. No deep learning: a deliberate choice for speed (under 50ms), explainability, and zero GPU dependency. Per-class results: support precision 0.93, spam recall 0.96.
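A pipeline of this shape might look like the sketch below. The training data is a tiny made-up set for illustration only, and the hyperparameters (`ngram_range`, `sublinear_tf`, `C`) are assumptions, not the project's actual settings. The `classify` helper is hypothetical as well.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Made-up examples, two per category; the real model is described as
# trained on 12k labelled emails.
texts = [
    "my order has not arrived and I need help",
    "the app crashes when I log in, please help",
    "can you send pricing for the enterprise plan",
    "we would like a quote for 50 licences",
    "win a free prize, claim your reward now",
    "cheap pills, limited offer, buy now",
    "team meeting moved to 3pm on thursday",
    "please submit your timesheets by friday",
]
labels = ["support", "support", "sales", "sales",
          "spam", "spam", "internal", "internal"]

# Assumed hyperparameters, chosen only to make the sketch concrete.
model = Pipeline([
    ("tfidf", TfidfVectorizer(lowercase=True, ngram_range=(1, 2), sublinear_tf=True)),
    ("svc", LinearSVC(C=1.0)),
])
model.fit(texts, labels)


def classify(text):
    """Return (label, margin) for one email; the margin is not a probability."""
    scores = model.decision_function([text])[0]  # one score per class
    best = int(scores.argmax())
    return str(model.classes_[best]), float(scores[best])
```

LinearSVC has no `predict_proba`, so this sketch reports the raw decision margin of the winning class. If a 0 to 1 confidence score is needed, one option is to wrap the classifier in scikit-learn's `CalibratedClassifierCV`; whether the project does this is not stated.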

Security

All HTML stripped before classification
Email content processed in memory - never written to disk
Rate limiting: 10 classifications/min per IP
Model integrity verified via SHA-256 checksum at startup
No external API calls - everything runs locally
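Two of the measures above, HTML stripping and the startup checksum, can be sketched with the Python standard library alone. This is a minimal illustration under assumptions: the function names are hypothetical, and the project's actual sanitiser may handle more cases.

```python
import hashlib
import hmac
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and skips the contents of <script> and <style>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Join chunks with spaces, then collapse runs of whitespace.
        return " ".join(" ".join(self._chunks).split())


def strip_html(raw: str) -> str:
    """Return the visible text of an HTML email body, with tags removed."""
    parser = _TextExtractor()
    parser.feed(raw)
    parser.close()  # flushes any buffered text
    return parser.text()


def verify_model(model_bytes: bytes, expected_sha256: str) -> bool:
    """Compare the SHA-256 hex digest of the model bytes with the expected one."""
    actual = hashlib.sha256(model_bytes).hexdigest()
    return hmac.compare_digest(actual, expected_sha256.lower())
```

At startup, a service could read the serialized model file into memory, call `verify_model` with a digest stored separately from the file, and refuse to start on a mismatch.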

Related service

This project demonstrates the kind of work I do under Customer Support AI.


Want something like this built for your business?

I'll look at your problem, figure out the right approach, and ship working software. No slideshows.
