TypeScript, Next.js, Prisma, AI

GlobeScraper

A full-stack content, community, and rental data platform for English teachers relocating to Southeast Asia. Built from scratch with Next.js 14, a 970-line Prisma schema, a 7-source scraping pipeline, and an AI content engine powered by Google Gemini.

Glowing purple fiber optic web draped over a Southeast Asian globe with data packets in transit, representing multi-source web scraping.

API routes

DB models: 30+

Scraper sources: 7

Deploy: Vercel + Hetzner

How it works

1. 7 Scraper Sources
2. Crawl & Discover
3. ScrapeQueue
4. Parallel Workers
5. Content Fingerprint
6. DB Upsert
7. AI Review
8. Analytics Index

Rental Pipeline

The discover phase crawls category pages and enqueues URLs. Workers then atomically claim queue items (SQL UPDATE…LIMIT, no explicit locks), fetch and parse the content, and upsert it with content fingerprinting to prevent duplicates. Finally, the AI review stage uses Gemini to classify listings as residential or non-residential, correct property types, and rewrite descriptions.
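The claiming step can be illustrated with a small in-memory model. This is only a sketch of the semantics, not the project's SQL: the `QueueItem` shape, the status values, and the `claimBatch` name are all assumptions made for illustration. The point it shows is that selecting pending items and marking them claimed happen in one step, so two workers never receive the same item.

```typescript
// Hypothetical queue row; field names are assumptions, not the real schema.
type Status = "pending" | "claimed" | "done";

interface QueueItem {
  id: number;
  url: string;
  status: Status;
  claimedBy: string | null;
}

// Claims up to `limit` pending items for `workerId` in a single synchronous step.
// Selection and the status flip happen together, so a later caller can never
// see an item as pending once an earlier caller has claimed it.
function claimBatch(queue: QueueItem[], workerId: string, limit: number): QueueItem[] {
  const claimed: QueueItem[] = [];
  for (const item of queue) {
    if (claimed.length >= limit) break;
    if (item.status !== "pending") continue;
    item.status = "claimed";
    item.claimedBy = workerId;
    claimed.push(item);
  }
  return claimed;
}
```

In a database the same guarantee would come from doing the select and the update in one statement, which is what the single UPDATE described above provides.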

Platform Features

The rental marketplace offers search filters, pagination, and image carousels. Community features include profiles, connections, DMs, meetups, trust panels, and moderation. The AI content engine researches competitors via Serper.dev, generates articles with Gemini, creates images with Imagen 4.0, and auto-publishes with SEO scoring. An analytics heatmap covers 300+ Cambodian districts.

Key Decisions

No Tailwind: vanilla CSS + BEM for full control
Playwright for Cloudflare bypass (Khmer24 blocks HTTP scrapers)
Human-like pacing with jittered delays (1.2 to 2 s)
Atomic queue claiming prevents race conditions
Gemini over GPT for speed, cost, and reliable JSON output
Hetzner VPS for browser automation (can't run Playwright on Vercel serverless)
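The jittered pacing mentioned above could look roughly like the following. It is a minimal sketch: the helper names (`jitteredDelayMs`, `sleep`, `fetchPaced`) are hypothetical, and the uniform distribution is an assumption, since the source only gives the 1.2 to 2 second range.

```typescript
// Returns a delay in milliseconds drawn uniformly from [minMs, maxMs).
// The `random` parameter exists only so the function can be tested deterministically.
function jitteredDelayMs(
  minMs = 1200,
  maxMs = 2000,
  random: () => number = Math.random,
): number {
  return minMs + random() * (maxMs - minMs);
}

function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Visits URLs one at a time, pausing for a jittered delay between requests.
// `fetchOne` is supplied by the caller (for example, a Playwright page visit).
async function fetchPaced<T>(
  urls: string[],
  fetchOne: (url: string) => Promise<T>,
  delayMs: () => number = jitteredDelayMs,
): Promise<T[]> {
  const results: T[] = [];
  let first = true;
  for (const url of urls) {
    if (!first) await sleep(delayMs());
    first = false;
    results.push(await fetchOne(url));
  }
  return results;
}
```

Keeping the delay source injectable means the pacing can be tightened or loosened per site without touching the fetch loop.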

Security

Human-like pacing prevents scraper detection and bans
Content fingerprinting prevents duplicate records
Authentication via Auth.js v5
Atomic queue claiming prevents race conditions
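As an illustration of the content fingerprinting listed above, here is a minimal sketch. The `Listing` shape, the choice of fields that go into the hash, and the use of SHA-256 are all assumptions, and the `Map` is an in-memory stand-in for a database upsert keyed on the fingerprint.

```typescript
import { createHash } from "node:crypto";

// Hypothetical listing shape; field names are assumptions, not the real schema.
interface Listing {
  source: string;
  title: string;
  priceUsd: number | null;
  district: string;
}

// Normalize text so cosmetic differences (case, extra whitespace) do not change the hash.
function normalize(text: string): string {
  return text.toLowerCase().replace(/\s+/g, " ").trim();
}

// SHA-256 over the fields assumed to identify a listing.
// `source` is deliberately left out so the same listing seen on two sites collapses to one key.
function fingerprint(listing: Listing): string {
  const parts = [
    normalize(listing.title),
    String(listing.priceUsd ?? ""),
    normalize(listing.district),
  ];
  return createHash("sha256").update(parts.join("\u001f")).digest("hex");
}

// In-memory stand-in for a DB upsert keyed on the fingerprint.
function upsert(store: Map<string, Listing>, listing: Listing): "inserted" | "updated" {
  const key = fingerprint(listing);
  const existed = store.has(key);
  store.set(key, listing);
  return existed ? "updated" : "inserted";
}
```

Which fields belong in the fingerprint is a trade-off: fewer fields merge more aggressively across sources, while more fields risk treating a lightly edited repost as new.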

Related service

This project demonstrates the kind of work I do under Custom AI Solutions.

