Python, scikit-learn, NLP, FastAPI

Email Classifier

Paste an email, get it classified: support request, sales inquiry, spam, or internal. Returns the category, confidence score, and suggested priority. Uses a two-stage pipeline: fast rule-based filtering catches obvious spam first, then a trained classifier handles the rest.

Model: TF-IDF + LinearSVC

F1 score: 0.91 (weighted)

Latency: < 50 ms

API calls: None (local model)

How it works

1. Raw Email
2. Sanitise
3. Rule Filter
4. TF-IDF Vectorise
5. LinearSVC Classify
6. Output
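
The flow above can be sketched as a single function that sanitises the input, short-circuits on the rule filter, and only then calls the model. This is a minimal illustration, not the project's actual code: `sanitise`, `classify_email`, the `stage` field, and the regex-based tag removal are all assumptions made for the example, and the rule filter and model are passed in as plain callables.

```python
import html
import re
from typing import Callable


def sanitise(raw_email: str) -> str:
    """Simplified clean-up: drop tags, decode entities, collapse whitespace."""
    no_tags = re.sub(r"<[^>]+>", " ", raw_email)  # placeholder for a real HTML stripper
    return re.sub(r"\s+", " ", html.unescape(no_tags)).strip()


def classify_email(
    raw_email: str,
    is_obvious_spam: Callable[[str], bool],  # stage 1: rule filter (hypothetical name)
    predict: Callable[[str], str],           # stage 2: trained classifier (hypothetical name)
) -> dict:
    text = sanitise(raw_email)
    # Stage 1: obvious spam never reaches the model.
    if is_obvious_spam(text):
        return {"category": "spam", "stage": "rules"}
    # Stage 2: everything else goes through the classifier.
    return {"category": predict(text), "stage": "model"}
```

Passing the two stages in as callables keeps the orchestration testable without loading a model.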

Classification Pipeline

Stage 1 is a fast rule-based filter that checks for phishing URL patterns, excessive capitalisation (>40% of the text), and known spam phrases. Stage 2 runs the cleaned text through a LinearSVC model trained on TF-IDF features from 12k labelled emails. The rule filter catches 30% of spam before the ML model even runs.
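
A rule filter along these lines might look like the sketch below. Only the 40% capitalisation threshold comes from the description above; the URL pattern, the phrase list, and the choice to measure capitals as a share of letters (rather than of all characters) are illustrative assumptions.

```python
import re

# Hypothetical rules: the real pattern and phrase lists are not shown in this write-up.
SUSPICIOUS_URL = re.compile(
    r"https?://(?:\d{1,3}\.){3}\d{1,3}"  # link to a raw IP address
    r"|https?://[^\s/]*@",               # credentials-style '@' in the host part
    re.IGNORECASE,
)
SPAM_PHRASES = ("act now", "you have won", "claim your prize")
CAPS_THRESHOLD = 0.40  # from the text: more than 40% capitals


def caps_ratio(text: str) -> float:
    """Share of alphabetic characters that are uppercase (assumed definition)."""
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return 0.0
    return sum(c.isupper() for c in letters) / len(letters)


def is_obvious_spam(text: str) -> bool:
    """Return True if any single rule fires."""
    if SUSPICIOUS_URL.search(text):
        return True
    if caps_ratio(text) > CAPS_THRESHOLD:
        return True
    lowered = text.lower()
    return any(phrase in lowered for phrase in SPAM_PHRASES)
```

Each rule is cheap string work, which is why this stage can run ahead of vectorisation.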

Model Architecture & Training

A scikit-learn Pipeline chains text preprocessing, TF-IDF vectorisation, and LinearSVC classification. There is no deep learning, a deliberate choice for speed (under 50 ms), explainability, and zero GPU dependency. Per-class results: support precision 0.93, spam recall 0.96.
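
As a rough sketch of that architecture, a TF-IDF plus LinearSVC pipeline can be assembled as follows. The hyperparameters (`ngram_range`, `sublinear_tf`, `C`) and the six toy training emails are assumptions for illustration only; the real model is trained on the 12k labelled emails mentioned above, with settings not given here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC


def build_pipeline() -> Pipeline:
    """TF-IDF features feeding a linear SVM; hyperparameters are illustrative."""
    return Pipeline([
        ("tfidf", TfidfVectorizer(lowercase=True, ngram_range=(1, 2), sublinear_tf=True)),
        ("clf", LinearSVC(C=1.0, random_state=0)),
    ])


# Toy data standing in for the real labelled corpus.
train_texts = [
    "my account is locked and I cannot log in",
    "the app crashes when I upload a file",
    "please send pricing for the enterprise plan",
    "we would like a quote for 50 licences",
    "team lunch is moved to Friday",
    "reminder: submit your timesheets today",
]
train_labels = ["support", "support", "sales", "sales", "internal", "internal"]

model = build_pipeline()
model.fit(train_texts, train_labels)

# The fitted pipeline accepts raw strings; vectorisation happens inside it.
prediction = model.predict(["please send a quote with pricing"])[0]
```

Because the vectoriser and classifier live in one Pipeline object, the whole model can be serialised and loaded as a single artefact. LinearSVC exposes `decision_function` rather than `predict_proba`, so any confidence score has to be derived from the margins or from a calibration step.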

Security

  • All HTML stripped before classification
  • Email content processed in memory — never written to disk
  • Rate limiting: 10 classifications/min per IP
  • Model integrity verified via SHA-256 checksum at startup
  • No external API calls — everything runs locally
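
Two of these measures, HTML stripping and the startup checksum, can be sketched with the standard library alone. The function names, the error type, and how the expected hash is supplied are assumptions for the example; the project's own implementation is not shown here.

```python
import hashlib
import hmac
from html.parser import HTMLParser
from pathlib import Path


class _TextExtractor(HTMLParser):
    """Collects text nodes and ignores tags and attributes."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts: list[str] = []

    def handle_data(self, data: str) -> None:
        # Note: text inside <script>/<style> is also data; a stricter
        # version would track those tags and skip their contents.
        self.parts.append(data)


def strip_html(raw: str) -> str:
    parser = _TextExtractor()
    parser.feed(raw)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def sha256_of(path: Path) -> str:
    """Hash the file in chunks so large model files are not read in one go."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model(path: Path, expected_hex: str) -> None:
    """Raise if the model file does not match the expected SHA-256 digest."""
    if not hmac.compare_digest(sha256_of(path), expected_hex.lower()):
        raise RuntimeError(f"Model checksum mismatch for {path}")
```

Calling `verify_model` before deserialising the model means a tampered or truncated file stops the service at startup instead of being loaded.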

Want something like this built for your business?

I'll look at your problem, figure out the right approach, and ship working software. No slideshows.
