Executive Summary
This document analyzes a three-month production deployment of PostgreSQL JSONB columns, documenting performance issues encountered, indexing strategies implemented, and architectural patterns that proved effective. The findings are relevant for teams considering JSONB for schema flexibility in high-traffic applications.
Key Metrics:
- Initial table size: 2M records
- Average JSONB column size: 50KB
- Write throughput: 10K updates/minute
- Query degradation: 200ms → 14s (70x regression)
Background and Context
In Q4 2025, our engineering team migrated user preference storage from normalized columns to JSONB format. Primary motivation was reducing schema migration overhead as product requirements evolved.
Initial Schema Design
CREATE TABLE users (
user_id BIGINT PRIMARY KEY,
email VARCHAR(255),
created_at TIMESTAMP,
settings JSONB
);
The settings column consolidated 23 previously discrete preference fields.
Performance Analysis
Phase 1: Initial Deployment (Weeks 1-2)
Initial performance metrics were acceptable:
- SELECT queries: <50ms p95
- INSERT operations: <30ms p95
- Table scan performance: Adequate for 2M records
Phase 2: Query Degradation (Week 3)
Performance degradation occurred when analytics queries began executing against JSONB fields:
SELECT
settings->>'signup_source',
COUNT(*)
FROM users
WHERE settings->>'created_year' = '2025'
GROUP BY settings->>'signup_source';
Observed Metrics:
- Query execution time: 14,000ms
- Sequential scan initiated
- CPU utilization: 85%+
Root Cause: JSONB columns lack automatic indexing. Queries using -> and ->> operators performed full table scans.
Indexing Strategies Implemented
Strategy 1: GIN Indexes
General Inverted (GIN) indexes provide containment query support:
CREATE INDEX idx_user_settings_gin
ON users USING GIN (settings);
Performance Characteristics:
- Optimizes: Containment queries (
@>,@?,?) - Does not optimize: Value extraction (
->>,#>) - Index size overhead: ~15% of column data
Strategy 2: Expression Indexes
Path-specific indexes for frequently queried fields:
CREATE INDEX idx_newsletter_sub
ON users ((settings->>'newsletter_subscribed'));
CREATE INDEX idx_region
ON users ((settings->>'region'));
Results:
- Query time reduction: 14s → 180ms (98% improvement)
- Total indexes created: 8
- Storage overhead: 30% increase
Strategy 3: Partial Indexes
For queries filtering on specific JSON values:
CREATE INDEX idx_premium_users
ON users ((settings->>'premium'))
WHERE (settings->>'premium') = 'true';
Write Amplification Issue
Problem Description
JSONB updates rewrite the entire column value, not individual fields:
UPDATE users
SET settings = jsonb_set(settings, '{last_login}', '"2026-01-08"')
WHERE user_id = 12345;
Impact Analysis:
- Average JSONB size: 50KB
- Update frequency: 10K/minute
- WAL generation: 500MB/minute
- Disk I/O saturation observed
Solution Architecture
Implemented frequency-based column segregation:
ALTER TABLE users
ADD COLUMN session_data JSONB,
ADD COLUMN profile_settings JSONB;
-- Hot data (frequent updates): session_data (~2KB)
-- Cold data (rare updates): profile_settings (~48KB)
Performance Improvement:
- Write amplification: -70%
- WAL generation: 500MB/min → 150MB/min
- Disk I/O utilization: 90% → 35%
Use Case Analysis
Recommended Use Cases
JSONB is appropriate for:
Variable User Preferences
- Schema varies significantly per user
- Read-heavy workload
- Infrequent updates
Audit Logging
- Variable metadata per event
- Write-once, read-rarely pattern
- No complex querying requirements
API Response Caching
- External API responses with variable structure
- TTL-based data lifecycle
- Simple key-value retrieval
Prototype Development
- Schema in active development
- Temporary flexibility requirement
- Migration to normalized schema planned
Not Recommended
JSONB is inappropriate for:
- High-frequency updates to large documents
- Complex aggregations across JSONB fields
- Frequently queried fields that could be normalized
- Critical business data requiring strict schema validation
Final Architecture
Our production system evolved to a hybrid model:
CREATE TABLE users (
-- Core fields (normalized)
user_id BIGINT PRIMARY KEY,
email VARCHAR(255) NOT NULL,
created_at TIMESTAMP NOT NULL,
-- Volatile state (small JSONB, frequent updates)
session_data JSONB,
-- User preferences (indexed JSONB, rare updates)
profile_settings JSONB
);
-- Separate table for historical data
CREATE TABLE user_settings_history (
history_id BIGSERIAL PRIMARY KEY,
user_id BIGINT REFERENCES users(user_id),
settings_snapshot JSONB,
created_at TIMESTAMP NOT NULL
) PARTITION BY RANGE (created_at);
Current Performance Metrics
- Query latency: 45ms p95 (restored to baseline)
- Write throughput: 10K/min (maintained)
- WAL generation: 150MB/min (70% reduction)
- Storage efficiency: Acceptable with indexing overhead
Lessons Learned
- Index Strategy is Critical: JSONB requires explicit indexing strategy from day one
- Write Amplification Real: Update frequency must inform schema design
- Hybrid Approach Optimal: Combine normalized columns with JSONB selectively
- Monitor from Start: Implement comprehensive monitoring before production deployment
- Documentation Essential: PostgreSQL JSONB documentation is comprehensive; invest time upfront
Technical Debt Assessment
Migration cost comparison:
- Original schema migration (5 columns): ~30 minutes with pt-online-schema-change
- JSONB refactoring effort: ~6 weeks of engineering time
- Production incident response: ~40 hours
Recommendation: Evaluate schema flexibility requirements carefully. Traditional normalized schemas remain appropriate for most use cases.
References
- PostgreSQL JSONB Documentation: https://www.postgresql.org/docs/current/datatype-json.html
- GIN Indexes: https://www.postgresql.org/docs/current/gin.html
- JSONB Indexing Best Practices: PostgreSQL Wiki
Document Version: 1.0
Last Updated: 2026-01-08
Author: Database Engineering Team
Review Status: Production Validated