# Database Architecture
A showcase of the database design behind GlobeScraper: 30+ models, relationship mapping, indexing strategy, query optimization, migration management, and transaction safety.
## Entity-Relationship Overview
GlobeScraper's schema models users, content, rentals, and scraping jobs. Below is a simplified view of the core relationships.
```prisma
model User {
  id        String    @id @default(cuid())
  email     String    @unique
  name      String?
  role      Role      @default(USER)
  posts     Post[]
  comments  Comment[]
  createdAt DateTime  @default(now())
  updatedAt DateTime  @updatedAt

  @@index([email])
  @@index([role])
}

model Listing {
  id        String   @id @default(cuid())
  title     String
  priceThb  Float
  city      String
  source    String
  url       String   @unique
  tags      Tag[]
  scrapedAt DateTime
  createdAt DateTime @default(now())

  @@index([city, source])
  @@index([priceThb])
  @@index([scrapedAt])
}
```
- User → Post: one-to-many via `authorId` foreign key
- Post → Comment: one-to-many with cascade delete
- Listing ↔ Tag: many-to-many via implicit join table
- ScrapeJob → RawListing: one-to-many, tracks provenance

- CUIDs over auto-increment: no sequential ID exposure
- Soft deletes: `deletedAt` column, filtered by default
- Audit timestamps: every table has `createdAt` and `updatedAt`
- Enums for roles: `USER`, `ADMIN`, `MODERATOR` enforced at DB level
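The soft-delete convention can be enforced at the query layer by merging a `deletedAt: null` condition into every `where` clause. A minimal sketch of that idea — the `excludeDeleted` helper is hypothetical, not project code; in a real Prisma setup this logic usually lives in a client extension or middleware:

```typescript
// Sketch: merge a soft-delete filter into a Prisma-style `where` object.
// `excludeDeleted` is a hypothetical helper name, shown only to illustrate
// the "filtered by default" convention described above.
type Where = Record<string, unknown>;

function excludeDeleted(where: Where = {}): Where {
  // Only rows whose `deletedAt` column is NULL are considered "live".
  return { ...where, deletedAt: null };
}

// Usage shape: prisma.user.findMany({ where: excludeDeleted({ role: "ADMIN" }) })
const args = excludeDeleted({ role: "ADMIN" });
console.log(JSON.stringify(args)); // {"role":"ADMIN","deletedAt":null}
```

Centralizing the filter in one place means no individual query can forget it and accidentally surface deleted rows.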
## Indexing Strategy
Indexes are designed around the most common query patterns. I profile queries in development and add indexes when a full table scan appears.
| Table | Index | Reason |
|---|---|---|
| User | `email` (unique) | Login lookups: O(log n) index lookup instead of a full scan |
| Listing | `(city, source)` composite | Filter by location and source in one index hit |
| Listing | `priceThb` | Price range queries for the search feature |
| Listing | `scrapedAt` | "Latest listings" sorted by scrape date |
| Post | `(authorId, createdAt)` | User profile: "my posts" sorted by date |
## Query Optimization
I use EXPLAIN ANALYZE to verify index usage and identify slow queries during development.
```sql
EXPLAIN ANALYZE
SELECT id, title, price_thb, city
FROM listings
WHERE city = 'Bangkok'
  AND source = 'hipflat'
  AND price_thb BETWEEN 8000 AND 25000
ORDER BY scraped_at DESC
LIMIT 20;

-- Before index: Seq Scan, 142ms
-- After (city, source) index: Index Scan, 3ms
-- 47x improvement
```
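In application code the same query goes through Prisma's `findMany`. A sketch of the argument shape it would take — `buildListingQuery` is an illustrative helper, not actual project code:

```typescript
// Sketch: build the Prisma `findMany` arguments equivalent to the SQL above.
// `buildListingQuery` is a hypothetical helper shown for illustration.
function buildListingQuery(city: string, source: string, min: number, max: number) {
  return {
    where: {
      city,                              // hits the (city, source) composite index
      source,
      priceThb: { gte: min, lte: max },  // BETWEEN min AND max
    },
    orderBy: { scrapedAt: "desc" as const },
    take: 20,                            // LIMIT 20
    // Select only the columns the caller needs (no SELECT *):
    select: { id: true, title: true, priceThb: true, city: true },
  };
}

// Usage shape:
// prisma.listing.findMany(buildListingQuery("Bangkok", "hipflat", 8000, 25000))
```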
- No N+1 queries: use Prisma `include` or explicit joins
- Paginate everything: cursor-based pagination, never `OFFSET`
- Select only needed columns: avoid `SELECT *` in production code
- Connection pooling: Prisma's built-in pool, max 10 connections
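Cursor-based pagination stays fast at any depth because the database seeks directly to the cursor row via an index, whereas `OFFSET` must skip every preceding row. A minimal sketch of the argument shape Prisma expects (`cursor` plus `skip: 1` to exclude the cursor row itself; the `pageAfter` helper is hypothetical):

```typescript
// Sketch: build Prisma cursor-pagination args for "latest listings".
// `pageAfter` is a hypothetical helper; the real mechanism is the
// `cursor` / `take` / `skip` options on findMany.
function pageAfter(cursorId: string | null, pageSize = 20) {
  return {
    take: pageSize,
    orderBy: { scrapedAt: "desc" as const },
    // First page: no cursor. Subsequent pages: resume after the last-seen
    // row, skipping the cursor row itself so it is not returned twice.
    ...(cursorId ? { cursor: { id: cursorId }, skip: 1 } : {}),
  };
}

// First page: pageAfter(null)
// Next page, resuming after listing "abc123": pageAfter("abc123")
```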
## Migration & Schema Versioning
Prisma Migrate generates SQL migration files from schema changes. Every migration is reviewed, tested, and version-controlled alongside application code.
1. Edit `schema.prisma`, then run `prisma migrate dev`.
2. Prisma generates a timestamped SQL migration file.
3. Review the generated SQL: test against a local database and check for data loss, breaking changes, and performance regressions.
4. `prisma migrate deploy` applies pending migrations in production.
Migrations run inside a transaction: if any statement fails, the whole migration rolls back.
Multi-step operations (e.g., creating a user + profile + initial post) are wrapped in Prisma's `$transaction` API. If any step fails, everything rolls back, preventing partial writes and data inconsistency.
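The all-or-nothing contract can be illustrated without a database: apply each step to a staged copy and commit only if every step succeeds. This is a simulation of the semantics, not Prisma's actual `$transaction` implementation:

```typescript
// Sketch: simulate all-or-nothing transaction semantics in memory.
// Not Prisma's implementation -- just the contract $transaction provides.
type Store = Record<string, unknown>;

function runTransaction(store: Store, steps: Array<(s: Store) => void>): boolean {
  const staged: Store = { ...store }; // work on a shallow copy of the store
  try {
    for (const step of steps) step(staged);
  } catch {
    return false; // any failure: discard staged writes (rollback)
  }
  Object.assign(store, staged); // every step succeeded: commit
  return true;
}

const db: Store = {};
const ok = runTransaction(db, [
  (s) => { s.user = "alice"; },
  (s) => { s.profile = "bio"; },
  () => { throw new Error("initial post failed"); },
]);
console.log(ok, JSON.stringify(db)); // false {} -- no partial write survives
```

The third step fails, so the user and profile written in the first two steps never reach `db` — the same guarantee `$transaction` gives against partial writes.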
Migration files live in `prisma/migrations/`, committed to git.
Each migration is a directory with a SQL file and metadata.
I can trace every schema change back to a specific commit and PR.