Django at 10 Million Records: Lessons from the RDM Migration
Eight months, 10M+ records, 26 Django apps, and one legacy ASP.NET system that would not die quietly. Here's what actually mattered for performance.
When we started the RDM Connect rebuild, the legacy ASP.NET system had been running since 2005 against an MSSQL database with 10M+ rows of retail-merchandising data. The brief was simple: rebuild as a multi-tenant Django SaaS, migrate without losing a row, and keep API responses under 200ms.
The single biggest performance lever was not Django at all; it was the data model. The legacy schema had grown organically for two decades and contained dozens of denormalized columns that downstream reports depended on. Re-normalizing aggressively let us replace half a dozen multi-join queries with indexed lookups, and the rest fell into place.
select_related and prefetch_related are the boring answer to most Django performance questions, and that is still true at 10M rows. What changes at scale is that you need to be ruthless about which queries run on the request path. The DRF serializer is where most teams quietly lose this fight: a nested serializer that looks innocent in code can fan out into a hundred queries per response if the underlying relationship isn't prefetched.
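To make the fan-out concrete, here is a minimal sketch that uses the standard-library sqlite3 module rather than Django itself, so it runs anywhere. The store/product schema, the CountingDB wrapper, and both serialize functions are hypothetical names invented for illustration. The second function does by hand roughly what prefetch_related does: one query for the parents, one IN query for all the children, then a join in Python.

```python
import sqlite3
from collections import defaultdict


def make_db():
    """Build an in-memory database with 100 stores and 3 products each."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE store (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE product (
            id INTEGER PRIMARY KEY,
            store_id INTEGER REFERENCES store(id),
            sku TEXT
        );
        CREATE INDEX product_store_idx ON product (store_id);
        """
    )
    conn.executemany(
        "INSERT INTO store VALUES (?, ?)",
        [(i, f"store-{i}") for i in range(1, 101)],
    )
    conn.executemany(
        "INSERT INTO product (store_id, sku) VALUES (?, ?)",
        [(s, f"sku-{s}-{n}") for s in range(1, 101) for n in range(3)],
    )
    conn.commit()
    return conn


class CountingDB:
    """Thin wrapper that counts every query sent to the database."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def query(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


def serialize_naive(db):
    """One query for the stores, then one more per store: the N+1 pattern."""
    stores = db.query("SELECT id, name FROM store ORDER BY id")
    payload = []
    for store_id, name in stores:
        rows = db.query(
            "SELECT sku FROM product WHERE store_id = ? ORDER BY id",
            (store_id,),
        )
        payload.append({"name": name, "skus": [sku for (sku,) in rows]})
    return payload


def serialize_prefetched(db):
    """Two queries total: parents, then all children in a single IN lookup."""
    stores = db.query("SELECT id, name FROM store ORDER BY id")
    ids = [store_id for store_id, _ in stores]
    placeholders = ",".join("?" * len(ids))
    rows = db.query(
        f"SELECT store_id, sku FROM product "
        f"WHERE store_id IN ({placeholders}) ORDER BY id",
        ids,
    )
    skus_by_store = defaultdict(list)
    for store_id, sku in rows:
        skus_by_store[store_id].append(sku)
    return [{"name": name, "skus": skus_by_store[sid]} for sid, name in stores]
```

Both functions return the same payload, but the naive version issues 101 queries for 100 stores while the prefetched version issues 2. In a Django codebase the equivalent check would be to count queries around the serializer and confirm the number does not grow with the number of rows returned.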
For the migration itself, we ran a dual-database setup: legacy MSSQL as the source of truth, new Postgres as the staging target, and a Celery pipeline that incrementally backfilled tables in dependency order. Pandas handled the bulk transforms; bulk_create with batch_size=1000 handled the inserts. Zero data loss came from one habit: every batch wrote a checkpoint row, and a nightly verification job compared row counts and checksums between the two databases.
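The checkpoint-and-verify habit can be sketched in a few lines. This is not the project's actual Celery pipeline: two sqlite3 connections stand in for MSSQL and Postgres, and the item table, the checkpoint table, and the function names are assumptions made for illustration. The point it tries to show is that each batch and its checkpoint row commit in the same transaction, so a crashed run resumes from the last committed batch instead of duplicating or skipping rows.

```python
import hashlib
import sqlite3

BATCH_SIZE = 1000  # mirrors the batch_size=1000 used for the inserts


def table_fingerprint(conn):
    """Return (row_count, sha256 hex digest) over item rows in id order."""
    digest = hashlib.sha256()
    count = 0
    for row in conn.execute("SELECT id, sku, qty FROM item ORDER BY id"):
        digest.update(repr(row).encode("utf-8"))
        count += 1
    return count, digest.hexdigest()


def backfill(source, target, batch_size=BATCH_SIZE):
    """Copy item rows in id order, resuming from the last checkpoint."""
    target.execute(
        "CREATE TABLE IF NOT EXISTS item "
        "(id INTEGER PRIMARY KEY, sku TEXT, qty INTEGER)"
    )
    target.execute(
        "CREATE TABLE IF NOT EXISTS checkpoint "
        "(table_name TEXT PRIMARY KEY, last_id INTEGER)"
    )
    row = target.execute(
        "SELECT last_id FROM checkpoint WHERE table_name = 'item'"
    ).fetchone()
    last_id = row[0] if row else 0

    while True:
        batch = source.execute(
            "SELECT id, sku, qty FROM item WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not batch:
            return
        last_id = batch[-1][0]
        # One transaction per batch: the rows and the checkpoint either
        # both commit or both roll back.
        with target:
            target.executemany("INSERT INTO item VALUES (?, ?, ?)", batch)
            target.execute(
                "INSERT OR REPLACE INTO checkpoint VALUES ('item', ?)",
                (last_id,),
            )
```

A verification job in this sketch is then just a comparison of table_fingerprint(source) with table_fingerprint(target); a mismatch in either the count or the digest means the tables have diverged. Running backfill a second time is a no-op, because the checkpoint already points at the last source id.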
On the Azure side, the unsexy wins compounded. We cut the Docker image from 2GB to 400MB by switching to python:3.12-slim and trimming dev dependencies, which took deploys from 6 minutes to under 90 seconds. Celery on Azure Container Apps with horizontal scaling kept background tasks decoupled from the request path. Redis as a query cache for the heaviest read endpoints picked up the long tail.
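For the query cache, the usual shape is cache-aside with a time-to-live. The sketch below is an assumption about how such a cache could look, not the project's code: InMemoryCache is a hypothetical stand-in so the example runs without a Redis server, and cached_query, the key format, and the 60-second TTL are illustrative. The stand-in mirrors the get(key) and set(key, value, ex=seconds) calls of the redis-py client, so a real client could be passed in its place.

```python
import json
import time


class InMemoryCache:
    """Stand-in for a Redis client: get(key) and set(key, value, ex=ttl)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired entries behave like missing keys.
            del self._data[key]
            return None
        return value

    def set(self, key, value, ex):
        self._data[key] = (value, self._clock() + ex)


def cached_query(cache, key, run_query, ttl_seconds=60):
    """Return the cached result for key, running the query only on a miss."""
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    result = run_query()
    # Results are stored as JSON so the same code works with a real cache
    # that only holds strings or bytes.
    cache.set(key, json.dumps(result), ex=ttl_seconds)
    return result
```

A short TTL trades freshness for load: within the window, repeated requests for the same report never touch the database, and after it expires the next request repopulates the entry.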
If I had to summarize the whole project in one sentence: most production Django performance problems are data-model problems wearing an ORM costume. Fix the model and the framework gets out of your way.