Synthesize IO
Production-Ready Synthetic Data, Generated in Seconds, at Any Scale
500+
Monthly Active Users
1M+ rows/min
Generation Speed
CSV, JSON, SQL, Parquet, Excel
Export Formats
About the Project
Synthesize IO is an AI-powered synthetic data platform built as a TypeScript monorepo with two Next.js portals (a user-facing data studio and an admin portal), both styled with Shadcn UI. The Python FastAPI backend orchestrates the open-source Synthcity library and Faker to generate statistically realistic datasets; Celery workers handle asynchronous generation jobs with Redis as the task broker, and the whole stack is Dockerised on a Hostinger VPS behind Nginx. DodoPayments powers pay-as-you-go billing, and Nodemailer delivers job-completion and export notifications. With 500+ monthly active users and generation speeds of 1M+ rows per minute, it supports CSV, JSON, SQL, Parquet, and Excel exports across workflows designed for GDPR, HIPAA, CCPA, and SOC 2 compliance.
How It Works
1. The two-portal Next.js monorepo (user studio and admin portal) shares TypeScript types and a FastAPI client wrapper. Docker Compose orchestrates the Next.js containers, FastAPI backend, Celery worker fleet, Redis broker, and PostgreSQL database on a Hostinger VPS, with Nginx as the reverse proxy.
2. Users describe their dataset in plain English through the Shadcn UI studio interface. The FastAPI backend passes this natural-language schema prompt to an LLM that infers column semantics, data types, distributions, and relational constraints, then stores the resolved schema in PostgreSQL.
3. Generation jobs are dispatched to the Celery worker fleet via Redis. Each worker runs Synthcity's statistical synthesis models for structured relational data and Faker for domain-specific values (names, addresses, IBANs, phone numbers), while the pipeline preserves referential integrity across foreign key relationships.
4. Completed datasets are written to PostgreSQL-backed storage and made available for streaming export in CSV, JSON, SQL, Parquet, or Excel format. Nodemailer sends a job-completion email containing a signed download link that is valid for 24 hours.
5. DodoPayments handles pay-as-you-go billing by row-volume tier, with webhook-driven credit top-ups recorded in PostgreSQL. Google Analytics tracks funnel drop-off from schema definition to first export, and Google Search Console monitors how the platform performs in search for synthetic data generation and GDPR-compliant test data queries.
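Step 2 ends with a resolved schema being stored. The sketch below shows one way such a schema could be validated before it is persisted, using only the Python standard library. The JSON shape, the `ColumnSpec` class, the `parse_schema` function, and the allowed type names are all assumptions made for illustration; the project's real schema format is not described here.

```python
import json
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical set of column types the generator understands.
ALLOWED_TYPES = {"integer", "float", "string", "date", "boolean"}


@dataclass(frozen=True)
class ColumnSpec:
    name: str
    dtype: str
    references: Optional[str] = None  # "table.column" for foreign keys


def parse_schema(raw_json: str) -> Dict[str, List[ColumnSpec]]:
    """Parse LLM output into column specs and reject invalid schemas."""
    payload = json.loads(raw_json)
    tables: Dict[str, List[ColumnSpec]] = {}
    for table_name, columns in payload["tables"].items():
        specs = []
        for col in columns:
            if col["type"] not in ALLOWED_TYPES:
                raise ValueError(f"unsupported type: {col['type']}")
            specs.append(ColumnSpec(col["name"], col["type"], col.get("references")))
        tables[table_name] = specs

    # Every foreign key must point at a column that exists in the schema.
    known = {f"{t}.{c.name}" for t, cols in tables.items() for c in cols}
    for cols in tables.values():
        for c in cols:
            if c.references is not None and c.references not in known:
                raise ValueError(f"unknown reference: {c.references}")
    return tables


# Hypothetical LLM output for a two-table dataset.
EXAMPLE_LLM_OUTPUT = json.dumps({
    "tables": {
        "customers": [{"name": "id", "type": "integer"}],
        "orders": [
            {"name": "id", "type": "integer"},
            {"name": "customer_id", "type": "integer", "references": "customers.id"},
        ],
    }
})
schema = parse_schema(EXAMPLE_LLM_OUTPUT)
```

Validating references up front means a malformed schema fails before any generation job is queued, rather than inside a worker.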
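The referential-integrity requirement in step 3 can be illustrated with a small standard-library sketch: generate the parent table first, then draw every child foreign key from the parent keys that already exist. This is not how Synthcity or Faker work internally; the table names, column names, and helper functions are hypothetical stand-ins.

```python
import random


def generate_customers(n, rng):
    """Parent table: each row gets a unique primary key."""
    return [
        {"customer_id": i + 1, "segment": rng.choice(["retail", "business"])}
        for i in range(n)
    ]


def generate_orders(n, customers, rng):
    """Child table: every customer_id is sampled from existing parent keys."""
    parent_ids = [c["customer_id"] for c in customers]
    return [
        {
            "order_id": i + 1,
            "customer_id": rng.choice(parent_ids),
            "amount_cents": rng.randint(100, 50_000),
        }
        for i in range(n)
    ]


def orphaned_rows(children, parents, fk, pk):
    """Return child rows whose foreign key matches no parent primary key."""
    parent_keys = {p[pk] for p in parents}
    return [row for row in children if row[fk] not in parent_keys]


rng = random.Random(42)  # seeded so repeated runs produce the same dataset
customers = generate_customers(100, rng)
orders = generate_orders(1_000, customers, rng)
```

A check such as `orphaned_rows(orders, customers, "customer_id", "customer_id")` returning an empty list is a cheap post-generation assertion that no order points at a missing customer.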
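For the streaming export in step 4, a generator that yields CSV text in chunks keeps memory use flat regardless of dataset size. The following is a minimal sketch using Python's `csv` module; the function name `stream_csv` and the `chunk_rows` parameter are assumptions, not the project's actual export code.

```python
import csv
import io
from typing import Dict, Iterable, Iterator, List


def stream_csv(
    rows: Iterable[Dict[str, object]],
    columns: List[str],
    chunk_rows: int = 1000,
) -> Iterator[str]:
    """Yield CSV text in chunks of at most chunk_rows data rows."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    pending = 0
    for row in rows:
        writer.writerow(row)
        pending += 1
        if pending >= chunk_rows:
            yield buffer.getvalue()
            # Reset the buffer so the next chunk starts empty.
            buffer.seek(0)
            buffer.truncate(0)
            pending = 0
    if buffer.tell():  # header-only output or leftover rows
        yield buffer.getvalue()


example_rows = ({"id": i, "name": f"user{i}"} for i in range(5))
chunks = list(stream_csv(example_rows, ["id", "name"], chunk_rows=2))
```

In a FastAPI backend, a generator of this shape could plausibly be passed to `StreamingResponse`, though whether the project does so is not stated.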
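The signed 24-hour download link in step 4 can be sketched with an HMAC over the job identifier and an expiry timestamp. The secret, URL layout, and function names below are hypothetical; the source does not say which signing scheme the platform uses.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical; load from config
LINK_TTL_SECONDS = 24 * 60 * 60  # 24-hour validity, as described in step 4


def _signature(job_id: str, expires: int) -> str:
    message = f"{job_id}:{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def sign_download_url(base_url: str, job_id: str, now: Optional[float] = None) -> str:
    """Build a URL carrying the job id, an expiry time, and an HMAC signature."""
    issued = int(time.time() if now is None else now)
    expires = issued + LINK_TTL_SECONDS
    query = urlencode(
        {"job": job_id, "expires": expires, "sig": _signature(job_id, expires)}
    )
    return f"{base_url}?{query}"


def verify_download_url(url: str, now: Optional[float] = None) -> bool:
    """Accept the URL only if it is unexpired and its signature matches."""
    params = parse_qs(urlparse(url).query)
    try:
        job_id = params["job"][0]
        expires = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    current = time.time() if now is None else now
    if current > expires:
        return False
    expected = _signature(job_id, expires)
    # Constant-time comparison avoids leaking signature bytes through timing.
    return hmac.compare_digest(sig.encode(), expected.encode())
```

Because the expiry is part of the signed message, a recipient cannot extend the link's lifetime by editing the `expires` parameter.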
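Step 5 combines tiered pricing with webhook-driven top-ups. The sketch below shows the general pattern: pick a rate from the row-volume tier, and apply each webhook event at most once so a retried delivery cannot credit an account twice. The tier boundaries, prices, and the `CreditLedger` class are invented for illustration; they are not DodoPayments' API or the platform's real pricing.

```python
from dataclasses import dataclass, field
from typing import Set

# Hypothetical tiers: (inclusive upper row bound, cents per 1,000 rows).
# None marks the open-ended top tier.
TIERS = [(100_000, 50), (1_000_000, 30), (None, 20)]


def job_cost_cents(rows: int) -> int:
    """Price a job at the rate of the tier its row count falls into."""
    blocks = -(-rows // 1000)  # ceiling division: partial blocks are billed
    for upper_bound, rate in TIERS:
        if upper_bound is None or rows <= upper_bound:
            return blocks * rate
    raise AssertionError("unreachable: last tier is open-ended")


@dataclass
class CreditLedger:
    balance_cents: int = 0
    seen_events: Set[str] = field(default_factory=set)

    def apply_topup(self, event_id: str, amount_cents: int) -> bool:
        """Credit the account once per webhook event id; ignore replays."""
        if event_id in self.seen_events:
            return False
        self.seen_events.add(event_id)
        self.balance_cents += amount_cents
        return True

    def charge(self, rows: int) -> int:
        """Deduct the job cost, refusing jobs the balance cannot cover."""
        cost = job_cost_cents(rows)
        if cost > self.balance_cents:
            raise ValueError("insufficient credits")
        self.balance_cents -= cost
        return cost
```

In a PostgreSQL-backed version, the `seen_events` set would typically become a table with a unique constraint on the event id, so the idempotency check and the balance update happen in one transaction.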