Database · Performance · PostgreSQL
Database Optimization: From Slow to Lightning Fast
Essential database optimization techniques that every developer should know to improve application performance.
ContentHub Team
Slow database queries are one of the most common performance bottlenecks in web applications. In this guide, we'll explore practical techniques to optimize your database performance.
Identifying Slow Queries
Before optimizing, you need to find the bottlenecks:
Enable Query Logging
-- PostgreSQL: Log slow queries
ALTER SYSTEM SET log_min_duration_statement = '100ms';
SELECT pg_reload_conf();
Analyze Query Plans
EXPLAIN ANALYZE SELECT * FROM posts WHERE author_id = 123;
Look for:
- Seq Scan on large tables (often bad)
- Index Scan (usually good)
- Nested Loop with high row counts (can be problematic)
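If you capture plans with `EXPLAIN (FORMAT JSON)`, you can scan them programmatically for the patterns above. A minimal sketch of walking a plan tree and flagging sequential scans — the `samplePlan` object here is hand-written stand-in data, not real EXPLAIN output:

```typescript
// Walk an EXPLAIN (FORMAT JSON) plan tree and collect node types so
// suspicious ones (e.g. Seq Scan) can be flagged for review.
interface PlanNode {
  "Node Type": string;
  Plans?: PlanNode[];
}

function collectNodeTypes(node: PlanNode, out: string[] = []): string[] {
  out.push(node["Node Type"]);
  for (const child of node.Plans ?? []) collectNodeTypes(child, out);
  return out;
}

// Hand-written sample standing in for parsed EXPLAIN (FORMAT JSON) output
const samplePlan: PlanNode = {
  "Node Type": "Nested Loop",
  Plans: [
    { "Node Type": "Seq Scan" },
    { "Node Type": "Index Scan" },
  ],
};

const nodeTypes = collectNodeTypes(samplePlan);
const seqScans = nodeTypes.filter((t) => t === "Seq Scan");
```

The same walk works on real output: `EXPLAIN (FORMAT JSON)` returns an array whose first element has a `Plan` key shaped like `PlanNode` above.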
Indexing Strategies
Basic Index Types
-- B-tree (default, good for most queries)
CREATE INDEX idx_posts_author ON posts(author_id);
-- Composite index (for multi-column queries)
CREATE INDEX idx_posts_author_date ON posts(author_id, created_at DESC);
-- Partial index (for filtered queries)
CREATE INDEX idx_published_posts ON posts(created_at) WHERE published = true;
When to Create Indexes
Create indexes for:
- Foreign keys - Almost always beneficial
- Columns in WHERE clauses - Especially with high selectivity
- Columns in ORDER BY - For sorted queries
- Columns in JOIN conditions - Critical for join performance
Index Anti-Patterns
-- ❌ Too many indexes slow down writes
-- ❌ Indexes on low-cardinality columns (e.g., boolean)
-- ❌ Unused indexes waste space
-- Check for unused indexes in PostgreSQL
SELECT
  schemaname || '.' || relname AS table_name,
  indexrelname AS index_name,
  pg_size_pretty(pg_relation_size(i.indexrelid)) AS index_size,
  idx_scan AS scans
FROM pg_stat_user_indexes ui
JOIN pg_index i ON ui.indexrelid = i.indexrelid
WHERE idx_scan < 50
  AND NOT i.indisunique  -- keep indexes that back unique constraints
ORDER BY pg_relation_size(i.indexrelid) DESC;
Query Optimization
Avoid SELECT *
-- ❌ Fetches unnecessary data
SELECT * FROM users WHERE id = 1;
-- ✅ Only fetch what you need
SELECT id, name, email FROM users WHERE id = 1;
Use Proper JOINs
-- ❌ Correlated subquery in SELECT (runs once per row)
SELECT
  p.*,
  (SELECT name FROM users WHERE id = p.author_id) AS author_name
FROM posts p;
-- ✅ Use JOIN
SELECT p.*, u.name as author_name
FROM posts p
JOIN users u ON p.author_id = u.id;
Pagination Done Right
-- ❌ OFFSET is slow for large values
SELECT * FROM posts ORDER BY id LIMIT 10 OFFSET 10000;
-- ✅ Use cursor (keyset) pagination
SELECT * FROM posts
WHERE id > 10000  -- last id seen on the previous page
ORDER BY id
LIMIT 10;
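The same keyset logic can be sketched in application code. Here an in-memory `posts` array stands in for the table; the filter/sort/slice mirror the `WHERE`, `ORDER BY`, and `LIMIT` clauses:

```typescript
// Keyset (cursor) pagination over an in-memory list, mirroring
// "WHERE id > $cursor ORDER BY id LIMIT $n".
interface Post {
  id: number;
  title: string;
}

// Stand-in for the posts table
const posts: Post[] = Array.from({ length: 25 }, (_, i) => ({
  id: i + 1,
  title: `Post ${i + 1}`,
}));

function nextPage(cursor: number, limit: number): Post[] {
  return posts
    .filter((p) => p.id > cursor) // WHERE id > cursor
    .sort((a, b) => a.id - b.id)  // ORDER BY id
    .slice(0, limit);             // LIMIT n
}

const page1 = nextPage(0, 10);                          // ids 1..10
const page2 = nextPage(page1[page1.length - 1].id, 10); // ids 11..20
```

Prisma exposes the same idea directly via the `cursor` and `take` options on `findMany`.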
ORM Best Practices
Avoiding N+1 Queries
// ❌ N+1 problem: one query per post
const posts = await prisma.post.findMany();
for (const post of posts) {
  const author = await prisma.user.findUnique({
    where: { id: post.authorId },
  });
}
// ✅ Use include/eager loading: a single query
const posts = await prisma.post.findMany({
  include: { author: true },
});
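When eager loading isn't available, the N+1 pattern can also be replaced by one batched lookup: collect the foreign keys, fetch them in a single `WHERE id IN (...)` query, and join in memory. A sketch where in-memory arrays stand in for the two tables:

```typescript
// Manual batching: one lookup for all authors instead of one per post.
interface User {
  id: number;
  name: string;
}
interface PostRow {
  id: number;
  authorId: number;
}

// Stand-ins for the users and posts tables
const users: User[] = [
  { id: 1, name: "Ada" },
  { id: 2, name: "Grace" },
];
const postRows: PostRow[] = [
  { id: 10, authorId: 1 },
  { id: 11, authorId: 2 },
  { id: 12, authorId: 1 },
];

function withAuthors(rows: PostRow[]) {
  // One "query": SELECT * FROM users WHERE id IN (...)
  const ids = [...new Set(rows.map((p) => p.authorId))];
  const byId = new Map(
    users.filter((u) => ids.includes(u.id)).map((u) => [u.id, u]),
  );
  // Join in memory
  return rows.map((p) => ({ ...p, authorName: byId.get(p.authorId)?.name }));
}

const joined = withAuthors(postRows);
```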
Selecting Only Needed Fields
// ✅ Select specific fields
const users = await prisma.user.findMany({
  select: {
    id: true,
    name: true,
    email: true,
  },
});
Connection Pooling
Why It Matters
Database connections are expensive to create. Pooling reuses connections:
// Using pg's built-in pool
import { Pool } from 'pg';

const pool = new Pool({
  max: 20,                       // maximum connections in the pool
  idleTimeoutMillis: 30000,      // close idle connections after 30s
  connectionTimeoutMillis: 2000, // fail fast if no connection is free
});
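The payoff is that a connection is created once and then handed back out on later requests. A toy pool makes the reuse visible — `FakeConnection` and `TinyPool` are illustrative stand-ins, not a real driver:

```typescript
// A toy pool: connections are created once (expensive) and reused (cheap).
class FakeConnection {
  constructor(public readonly id: number) {}
}

class TinyPool {
  private idle: FakeConnection[] = [];
  private created = 0;

  constructor(private max: number) {}

  acquire(): FakeConnection {
    if (this.idle.length > 0) return this.idle.pop()!; // reuse path
    if (this.created >= this.max) throw new Error("pool exhausted");
    return new FakeConnection(++this.created);         // expensive path
  }

  release(conn: FakeConnection): void {
    this.idle.push(conn); // back to the idle set for the next caller
  }

  get totalCreated(): number {
    return this.created;
  }
}

const pool = new TinyPool(20);
const c1 = pool.acquire();
pool.release(c1);
const c2 = pool.acquire(); // same connection object, nothing new created
```

Real pools add waiting queues, health checks, and idle timeouts on top of this acquire/release cycle.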
Prisma Connection Pool
# .env
DATABASE_URL="postgresql://user:pass@host:5432/db?connection_limit=10"
Caching Strategies
Query Result Caching
import { Redis } from 'ioredis';
const redis = new Redis();
async function getUser(id: string) {
  const cacheKey = `user:${id}`;

  // Check cache first
  const cached = await redis.get(cacheKey);
  if (cached) return JSON.parse(cached);

  // Query database
  const user = await prisma.user.findUnique({ where: { id } });

  // Cache for 5 minutes (skip caching missing users)
  if (user) await redis.setex(cacheKey, 300, JSON.stringify(user));
  return user;
}
Invalidation Strategy
async function updateUser(id: string, data: UserData) {
  // Update database first
  const user = await prisma.user.update({
    where: { id },
    data,
  });

  // Then invalidate the cache so the next read repopulates it
  await redis.del(`user:${id}`);
  return user;
}
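The full cache-aside cycle — miss, hit, invalidate, miss again — can be demonstrated without Redis using an in-memory map. `loadUser` below is a hypothetical stand-in for the database query:

```typescript
// Cache-aside with TTL and explicit invalidation, sketched with a Map.
interface CacheEntry {
  value: string;
  expiresAt: number;
}

const cache = new Map<string, CacheEntry>();
let dbReads = 0;

// Stand-in for the real database query
function loadUser(id: string): string {
  dbReads++;
  return `user-${id}`;
}

function getUserCached(id: string, ttlMs = 300_000): string {
  const key = `user:${id}`;
  const hit = cache.get(key);
  if (hit && hit.expiresAt > Date.now()) return hit.value; // cache hit
  const value = loadUser(id);                              // cache miss
  cache.set(key, { value, expiresAt: Date.now() + ttlMs });
  return value;
}

function invalidateUser(id: string): void {
  cache.delete(`user:${id}`); // next read goes back to the database
}

getUserCached("1"); // miss: 1 db read
getUserCached("1"); // hit: still 1 db read
invalidateUser("1");
getUserCached("1"); // miss again: 2 db reads
```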
Performance Monitoring
Key Metrics to Track
- Query execution time - Average and p95/p99
- Connection pool utilization - Are you running out?
- Lock waits - Contention issues
- Cache hit rate - Is caching working?
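Averages hide outliers, which is why the list above calls out p95/p99. A small sketch of computing them from recorded durations — the `durationsMs` array is sample data; in practice you would record a timing around each query:

```typescript
// Percentiles from recorded query durations (nearest-rank method).
function percentile(values: number[], p: number): number {
  const sorted = [...values].sort((a, b) => a - b);
  const idx = Math.min(
    sorted.length - 1,
    Math.ceil((p / 100) * sorted.length) - 1,
  );
  return sorted[idx];
}

// Sample data: mostly fast queries with two slow outliers
const durationsMs = [12, 15, 11, 14, 250, 13, 16, 12, 300, 14];

const avg = durationsMs.reduce((a, b) => a + b, 0) / durationsMs.length;
const p95 = percentile(durationsMs, 95);
// avg ≈ 66ms looks fine; p95 = 300ms reveals the outliers
```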
PostgreSQL Stats
-- Most time-consuming queries (requires the pg_stat_statements extension;
-- these column names are for PostgreSQL 13+)
SELECT
  substring(query, 1, 50) AS query,
  calls,
  round(total_exec_time::numeric, 2) AS total_ms,
  round(mean_exec_time::numeric, 2) AS avg_ms
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
Quick Wins Checklist
- [ ] Add indexes to foreign keys
- [ ] Enable query logging for slow queries
- [ ] Use eager loading to avoid N+1
- [ ] Implement connection pooling
- [ ] Add caching for frequently accessed data
- [ ] Use cursor pagination instead of OFFSET
- [ ] Select only needed columns
- [ ] Monitor and review query performance regularly
Conclusion
Database optimization is an ongoing process. Start by measuring, identify the worst offenders, and optimize incrementally. Remember: premature optimization is the root of all evil, but slow databases are the root of user frustration!