Case Study · Confidential analytics platform, North America

10x Ingest Throughput on a PostgreSQL Analytics Pipeline Using UNLOGGED Tables

How UnlockLive used PostgreSQL UNLOGGED tables, batched COPY ingestion, and a disciplined two-table pattern to push a high-volume analytics pipeline from 8K to 80K events/sec — without losing the data integrity guarantees the audited path required.

  • Industry: Data / Analytics
  • Year: 2024
  • Country: USA
  • Duration: 3 months

At-a-glance results

  • 10x: sustained ingest throughput (8K → 80K+ events/sec)
  • 60%: lower RDS IOPS cost on the same instance class
  • <50ms: p95 stage-1 insert latency under sustained load
  • 0: data-integrity regressions on the audited reporting path

The challenge

An analytics platform was ingesting product events from a growing customer base into a single "raw_events" table. The original schema was correct but slow: every event hit a logged table with seven indexes, the write-ahead log (WAL) was the bottleneck, and ingest throughput had topped out around 8,000 events per second. RDS IOPS were the dominant cost line, and a planned 3x customer expansion was going to break the pipeline before the next contract was signed.

The team had read the usual advice — "use a queue," "shard the table," "move to a dedicated TSDB" — but those answers all meant a multi-quarter migration. The real question was: can PostgreSQL keep up if we stop fighting it? The catch was a hard requirement that anything that reached the audited reporting tables had to be durable, consistent, and indexed. We couldn't sacrifice integrity on the analytics side to save the ingest side.

Our solution

We split the pipeline into two clearly separated stages with two different durability contracts.

Stage 1 (ingest): an UNLOGGED staging table that takes the firehose. UNLOGGED tables in PostgreSQL skip WAL writes entirely, which is exactly the right trade-off for short-lived staging data — we get 5-10x raw insert throughput in exchange for losing the staged rows if the server crashes. Combined with batched COPY (not INSERT) from a Python ingest service and a deliberate "no indexes on the staging table" rule, raw ingest jumped from 8K to 80K+ events/sec on the same RDS instance.
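As a concrete illustration of the batched-COPY approach, here is a minimal sketch of the stage-1 flush path. The `stage_events` table, its columns, and the use of psycopg2's `copy_expert` are illustrative assumptions, not the client's actual code:

```python
import io

# Assumed staging schema: "stage_events" and its columns are illustrative
# names, not the client's production schema.
COPY_SQL = "COPY stage_events (event_id, payload, received_at) FROM STDIN"

def to_copy_payload(rows):
    """Serialize rows into PostgreSQL's text COPY format (tab-delimited,
    backslash-escaped), ready to stream to the server in one round-trip."""
    buf = io.StringIO()
    for row in rows:
        fields = [
            str(v).replace("\\", "\\\\").replace("\t", "\\t").replace("\n", "\\n")
            for v in row
        ]
        buf.write("\t".join(fields) + "\n")
    buf.seek(0)
    return buf

def flush_batch(conn, rows):
    """One COPY per batch instead of one INSERT per row. `conn` is assumed
    to be a psycopg2 connection; copy_expert streams a file-like object."""
    with conn.cursor() as cur:
        cur.copy_expert(COPY_SQL, to_copy_payload(rows))
    conn.commit()
```

A single COPY of a few thousand buffered rows replaces thousands of statement round-trips, which is where most of the ingest win comes from.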

Stage 2 (durable): a Celery worker drains the staging table in micro-batches into the real, fully-logged, indexed `events` table inside a single transaction with idempotent upserts. The audited reporting path only ever reads from the durable table. If the server crashes mid-ingest, we lose at most a few seconds of in-flight events from the staging table — and the upstream producers retry, so the durable table converges back to correct.

We paired this with a small but careful operational layer: a Grafana dashboard for stage-1 fill rate, a hard alert if the drainer falls behind, partition rotation on the durable table, and a written runbook for the three failure modes that matter.

  • UNLOGGED staging table with no indexes — engineered specifically for raw insert throughput
  • Batched COPY ingest from a Python/FastAPI service, not row-at-a-time INSERTs
  • Celery drainer moving micro-batches into the durable, fully-indexed events table
  • Idempotent upserts (ON CONFLICT DO NOTHING) so producer retries are safe
  • Monthly partitioning on the durable table to keep vacuum and index work bounded
  • End-to-end lag dashboards and alerts in Grafana / Datadog
  • Written runbook covering RDS failover, drainer crash, and producer back-pressure
  • Audit sign-off on the durability profile before launch — no surprises
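The schema side of the checklist above can be sketched in DDL. Table names, column names, and types here are assumptions for illustration, not the client's actual schema:

```python
# Illustrative DDL for the two-table pattern, kept as string constants the
# way a migration module might carry them.

STAGING_DDL = """
CREATE UNLOGGED TABLE stage_events (
    event_id    text        NOT NULL,  -- idempotency key from the producer
    payload     jsonb       NOT NULL,
    received_at timestamptz NOT NULL DEFAULT now()
);
-- Deliberately no indexes and no primary key: every index on the staging
-- tier costs raw insert throughput.
"""

DURABLE_DDL = """
CREATE TABLE events (
    event_id    text        NOT NULL,
    payload     jsonb       NOT NULL,
    received_at timestamptz NOT NULL,
    PRIMARY KEY (event_id, received_at)  -- partition key must be in the PK
) PARTITION BY RANGE (received_at);

-- One partition per month keeps vacuum and index maintenance bounded, e.g.:
-- CREATE TABLE events_2024_06 PARTITION OF events
--     FOR VALUES FROM ('2024-06-01') TO ('2024-07-01');
"""
```

Note the asymmetry: the staging table is optimized purely for write speed, while all constraints, indexes, and partitioning live on the durable table the audited queries read.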

How we built it

  1. Profiling the real bottleneck

    Before touching the schema we ran a week of pg_stat_statements, RDS Performance Insights, and a custom WAL throughput tracker. The data was unambiguous: WAL writes and index maintenance on the single fat events table accounted for 78% of write time. The CPU and memory weren't the problem — durability was.

  2. Designing the two-table contract

    We wrote a short design doc that defined exactly what UNLOGGED bought us, what it cost us, and what guarantees the upstream producers had to provide (at-least-once delivery, idempotent event IDs). The contract made the loss profile explicit so the audit team could sign off in advance — "can lose up to N seconds of staged events on RDS failover; durable table is unaffected."

  3. Batched COPY, micro-batched drainer

    We replaced row-at-a-time inserts with a Python ingest service that buffers events in-memory for up to 250ms and flushes via PostgreSQL COPY into the UNLOGGED staging table. A Celery drainer pulls 5,000-row micro-batches into the durable table inside a single transaction with ON CONFLICT DO NOTHING for idempotency.

  4. Operations, partitioning, runbook

    We added monthly partitioning to the durable table (so vacuum and index rebuilds stay cheap), Grafana dashboards for end-to-end lag, alerts on staging fill and drainer lag, and a written runbook covering the three real failure modes: drainer crash, RDS failover, and producer back-pressure. Handed off to the client's on-call team with shadowing.
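The Celery drainer's micro-batch move can be sketched as one atomic SQL statement wrapped in a small Python function. The table and column names are illustrative assumptions, and `conn` is assumed to be a psycopg2-style connection:

```python
# Illustrative sketch of the drainer's move step; "stage_events", "events",
# and the column list are assumed names, not the client's schema.
DRAIN_SQL = """
WITH moved AS (
    DELETE FROM stage_events
    WHERE ctid IN (SELECT ctid FROM stage_events LIMIT %(batch)s)
    RETURNING event_id, payload, received_at
)
INSERT INTO events (event_id, payload, received_at)
SELECT event_id, payload, received_at FROM moved
ON CONFLICT DO NOTHING
"""

def drain_once(conn, batch_size=5000):
    """Move one micro-batch from staging to durable atomically.

    The CTE makes the move a single statement: a row is either still in
    stage_events or committed to events, so a drainer crash mid-batch simply
    rolls back. ON CONFLICT DO NOTHING absorbs duplicate event_ids from
    producer retries.
    """
    with conn.cursor() as cur:
        cur.execute(DRAIN_SQL, {"batch": batch_size})
        moved = cur.rowcount
    conn.commit()
    return moved
```

Running this in a loop with a short sleep gives back-pressure for free: if the drainer falls behind, the staging table grows and the lag alert fires.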

Tech stack

  • PostgreSQL 16
  • Python
  • FastAPI
  • Celery
  • Redis
  • AWS RDS
  • AWS S3
  • Grafana
  • Datadog
  • Backend Performance Engineering
  • Python & FastAPI
  • Cloud Solutions
  • Database Architecture

"We expected to spend a quarter migrating to a dedicated time-series database. Instead UnlockLive showed us how to make Postgres do the job — and our auditors signed off on the new design before we shipped it."

Head of Data Engineering · Analytics platform (name confidential)

Frequently asked questions

Are PostgreSQL UNLOGGED tables safe to use in production?

Yes — for the right job. UNLOGGED tables skip WAL writes, which gives you 5-10x raw insert throughput in exchange for a documented loss profile: the table is truncated during crash recovery, and its contents are never replicated to physical standbys. They're a great fit for short-lived staging data when an upstream system can replay events. They're a terrible fit for anything you need to read back authoritatively. We always pair them with a durable, logged table the audited path actually reads from.

Why not just use a dedicated time-series database like TimescaleDB or ClickHouse?

Because the migration cost and operational surface area are real. If your team already runs PostgreSQL well and your throughput target is in the tens of thousands of events per second, the UNLOGGED + batched COPY pattern often gets you there in weeks instead of quarters. We recommend a dedicated TSDB when the workload genuinely needs columnar storage, time-partitioned indexes, or millisecond range queries at scale.

Can I add indexes to a PostgreSQL UNLOGGED staging table?

You can, but you almost never should. The whole point of the staging tier is raw insert speed; every index you add costs you ingest throughput. We keep the staging table index-free and put all the indexes on the downstream durable table that the analytics queries actually run against.

How do you guarantee no data loss when using UNLOGGED tables in a pipeline?

Two rules. First, the upstream producers must give you at-least-once delivery with idempotent event IDs so retries are safe. Second, the durable side of the pipeline only ever reads from the fully-logged downstream table, never from the UNLOGGED staging table. The worst case becomes a few seconds of replayable staged events on a crash — never a missing report.
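A toy in-memory simulation (a dict standing in for the durable table's unique event_id constraint; none of this is production code) shows why the two rules compose into exactly-once effects:

```python
def deliver(durable, batch):
    """Apply a batch to the durable store. Duplicates are no-ops, mirroring
    INSERT ... ON CONFLICT DO NOTHING keyed on a unique event_id."""
    for event_id, payload in batch:
        durable.setdefault(event_id, payload)

durable = {}
first = [("e1", "signup"), ("e2", "click")]
deliver(durable, first)
deliver(durable, first)  # at-least-once: the producer retries the whole batch
deliver(durable, [("e2", "click"), ("e3", "purchase")])  # overlapping redelivery
assert durable == {"e1": "signup", "e2": "click", "e3": "purchase"}
```

However many times a batch is replayed, the durable state converges to one row per event ID, which is exactly the property the audited reports depend on.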

What ingest throughput can you get out of a single PostgreSQL instance with this pattern?

It depends on row size, network, and instance class, but on a modest AWS RDS db.r6g.xlarge with this exact two-table pattern we routinely see 60K-100K events per second sustained, with p95 stage-1 insert latency under 50ms. Beyond that we'd partition the staging table or scale horizontally with a write router.

Want a result like this?

Talk to the same team that built "10x Ingest Throughput on a PostgreSQL Analytics Pipeline Using UNLOGGED Tables". We’ll scope your project, give you a fixed-price proposal, and show you the closest analog from our portfolio.

Book a strategy call