/ Shayon Mukherjee / blog

Scaling with PostgreSQL without boiling the ocean

February 9, 2025
~9 mins

“Postgres was great when we started but now that our service is being used heavily we are running into a lot of ‘weird’ issues”

This sentiment is frequently echoed by CTOs and senior engineers at high-growth startups when I speak with them.

Scaling PostgreSQL successfully doesn’t always require a full team of DBAs and experts. The beauty of PostgreSQL is that solutions often lie within the database itself - by rethinking your data access patterns from first principles, you can solve many business problems at scale.

In this post, I’ll address some common “weird” issues I’ve encountered and explore solutions that work at scale.

ℹī¸ This is not an exhaustive list of all scaling challenges I have come across and is meant to be fairly high level, so YMMV. If you’d like to chat more, my DMs and email are always open.

1. Lock Contention

Lock contention becomes a significant bottleneck as write operations increase, especially with foreign key relationships. Long-running transactions can block other operations, creating a cascade of waiting processes.

This is likely a no-brainer at this point: Solutions

Considerations

2. Advisory Locks

If you find yourself needing to coordinate read/write access to certain resources in your application, advisory locks can be super handy. Unlike table or row locks that Postgres manages automatically, advisory locks give you explicit control over your locking strategy at the application level.

Solution

Example (Ruby): https://github.com/ClosureTree/with_advisory_lock

Considerations

3. Foreign Keys (đŸŒļī¸)

My simple take is - While Foreign Keys are great at upholding data integrity, their impact on performance, migrations, and system complexity is hard to ignore. So, take a moment to ponder your database schema. Are all those constraints pulling their weight, or could shedding a few lighten your system’s load?

Solutions

Considerations

If you’d like to read more, I wrote more at: Do You Really Need Foreign Keys?

4. Index Bloat

When rows are updated or deleted in PostgreSQL, the old versions aren’t immediately removed from indexes. This “bloat” accumulates over time, causing indexes to grow larger than necessary and potentially impacting query performance.

Solutions

Considerations

5. TOAST (The Oversized-Attribute Storage Technique)

PostgreSQL uses TOAST (The Oversized-Attribute Storage Technique) to handle large field values like JSON blobs or long text by storing them in separate tables. Values larger than 2KB are automatically moved to TOAST storage. While this clever mechanism helps manage large data efficiently, it can introduce unexpected performance characteristics, especially when these values are frequently accessed.

Solutions

Considerations

6. Vertical vs Horizontal Scaling

As your database grows, vertical scaling (bigger machines) eventually becomes cost-prohibitive or hits hardware limits. Before jumping to sharding, consider a staged approach to scaling your PostgreSQL infrastructure.

Solutions

  1. Federation

    • Split into new DB clusters by functionality (auth, billing, analytics, etc)
    • Simpler to implement than sharding
    • Natural service boundaries
    • Each database can be optimized independently
    • Easier to maintain and backup
  2. Partitioning Within Cluster

    • Table partitioning for large tables (by time, id ranges, or hash)
    • Improves query performance and manageability
    • Easier maintenance operations (bulk deletes, selective backups)
    • Stays within single database instance
    • Good for time-series or tenant-based like data
  3. Sharding Last

    • Partition data across multiple instances
    • Complex but necessary for massive scale
    • Requires careful key design
    • Consider data locality and access patterns
    • Plan for rebalancing and migration strategies

Considerations

7. Append-Only vs Update-Heavy Tables

Update-heavy tables often lead to write amplification, increased VACUUM overhead, and index fragmentation. In contrast, append-only tables with time-based access patterns can scale remarkably well, as they avoid these issues and enable efficient partitioning strategies.

Solutions

Considerations

8. DDL Changes Under Load

Schema changes can block other operations, potentially causing application downtime. As your database grows, even simple alterations like adding an index can take hours if not handled carefully.

Solutions

Considerations

9. Zero-Downtime Major Version Upgrades

Major version upgrades traditionally required downtime, which isn’t acceptable for many businesses. Using logical replication for blue/green deployments can help achieve near-zero downtime upgrades, typically reducing downtime from hours to just seconds.

There is a lot of different ways to achieve, so I will just like to some of my content and and content that I have found useful on other places on the internet

10. Counting

Simple COUNT queries become problematic at scale, especially for frequently accessed counts. This remains hard at scale too, however I have found in some cases moving the computational cost from read-time to write-time be valuable.

Solutions

Considerations

11. Multi-tenancy

Multi-tenant databases require careful consideration of data isolation, performance, and scalability. There are several approaches, each with their own tradeoffs.

Solutions

  1. Schema-based Separation

    • Each tenant gets their own schema
    • Strong isolation
    • Can be challenging to manage many schemas
    • Requires connection management
  2. Row-level Separation

    • Single schema, tenant_id column
    • Simpler to manage
    • Requires careful indexing
    • Need to ensure tenant isolation in queries
    • Tools like activerecord-multi-tenant can help
  3. Hybrid Approaches

    • Separate databases for large tenants
    • Row-level for smaller tenants
    • Balance between management overhead and performance
    • More complex application logic

Considerations

I have found activerecord-multi-tenant gem (for Rails apps) provides a nice abstraction for row-level multi-tenancy, handling the tenant context in your queries automatically. At its core, it injects WHERE tenant_id = ? conditions into your queries, ensuring proper data isolation at the application level.

Parting thoughts

As I mentioned, these are by no means an exhaustive list and it’s meant to be fairly high level - if you’d like to chat more, my DMs and email are always open.

last modified February 9, 2025