Database · Performance · PostgreSQL
Database Optimization: From Slow to Lightning Fast
Essential database optimization techniques that every developer should know to improve application performance.
ContentHub Team
Slow database queries are one of the most common performance bottlenecks in web applications. In this guide, we'll explore practical techniques to optimize your database performance.
Identifying Slow Queries
Before optimizing, you need to find the bottlenecks:
Enable Query Logging
-- PostgreSQL: Log slow queries
ALTER SYSTEM SET log_min_duration_statement = '100ms';
SELECT pg_reload_conf();
Analyze Query Plans
EXPLAIN ANALYZE SELECT * FROM posts WHERE author_id = 123;
Look for:
- Seq Scan on large tables (often bad)
- Index Scan (usually good)
- Nested Loop with high row counts (can be problematic)
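If you capture plans with `EXPLAIN (FORMAT JSON)`, you can scan them programmatically for the patterns above. A minimal sketch of walking a plan tree and flagging sequential scans — the `samplePlan` object here is hand-written stand-in data, not real EXPLAIN output:

```typescript
// Walk an EXPLAIN (FORMAT JSON) plan tree and collect node types so
// suspicious ones (e.g. Seq Scan) can be flagged for review.
interface PlanNode {
  "Node Type": string;
  Plans?: PlanNode[];
}

function collectNodeTypes(node: PlanNode, out: string[] = []): string[] {
  out.push(node["Node Type"]);
  for (const child of node.Plans ?? []) collectNodeTypes(child, out);
  return out;
}

// Hand-written sample standing in for parsed EXPLAIN (FORMAT JSON) output
const samplePlan: PlanNode = {
  "Node Type": "Nested Loop",
  Plans: [
    { "Node Type": "Seq Scan" },
    { "Node Type": "Index Scan" },
  ],
};

const nodeTypes = collectNodeTypes(samplePlan);
const seqScans = nodeTypes.filter((t) => t === "Seq Scan");
```

The same walk works on real output: `EXPLAIN (FORMAT JSON)` returns an array whose first element has a `Plan` key shaped like `PlanNode` above.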
Indexing Strategies
Basic Index Types
-- B-tree (default, good for most queries)
CREATE INDEX idx_posts_author ON posts(author_id);
-- Composite index (for multi-column queries)
CREATE INDEX idx_posts_author_date ON posts(author_id, created_at DESC);
-- Partial index (for filtered queries)
CREATE INDEX idx_published_posts ON posts(created_at) WHERE published = true;
When to Create Indexes
Create indexes for:
- Foreign keys - Almost always beneficial
- Columns in WHERE clauses - Especially with high selectivity
- Columns in ORDER BY - For sorted queries
- Columns in JOIN conditions - Critical for join performance
Index Anti-Patterns
-- ❌ Too many indexes slow down writes
-- ❌ Indexes on low-cardinality columns (e.g., boolean)
-- ❌ Unused indexes waste space
-- Check for unused indexes in PostgreSQL
SELECT
  schemaname || '.' || relname AS table_name,
  indexrelname AS index_name,
  pg_size_pretty(pg_relation_size(i.indexrelid)) AS index_size,
  idx_scan AS scans
FROM pg_stat_user_indexes ui
JOIN pg_index i ON ui.indexrelid = i.indexrelid
WHERE idx_scan < 50
  AND NOT i.indisunique  -- keep indexes that back unique constraints
ORDER BY pg_relation_size(i.indexrelid) DESC;
Query Optimization
Avoid SELECT *
-- ❌ Fetches unnecessary data
SELECT * FROM users WHERE id = 1;
-- ✅ Only fetch what you need
SELECT id, name, email FROM users WHERE id = 1;
Use Proper JOINs
-- ❌ Correlated subquery in SELECT (runs once per row)
SELECT
  p.*,
  (SELECT name FROM users WHERE id = p.author_id) AS author_name
FROM posts p;
-- ✅ Use JOIN
SELECT p.*, u.name as author_name
FROM posts p
JOIN users u ON p.author_id = u.id;
Pagination Done Right
-- ❌ OFFSET is slow for large values
SELECT * FROM posts ORDER BY id LIMIT 10 OFFSET 10000;
-- ✅ Use cursor (keyset) pagination
SELECT * FROM posts
WHERE id > 10000  -- last id seen on the previous page
ORDER BY id
LIMIT 10;
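The same keyset logic can be sketched in application code. Here an in-memory `posts` array stands in for the table; the filter/sort/slice mirror the `WHERE`, `ORDER BY`, and `LIMIT` clauses:

```typescript
// Keyset (cursor) pagination over an in-memory list, mirroring
// "WHERE id > $cursor ORDER BY id LIMIT $n".
interface Post {
  id: number;
  title: string;
}

// Stand-in for the posts table
const posts: Post[] = Array.from({ length: 25 }, (_, i) => ({
  id: i + 1,
  title: `Post ${i + 1}`,
}));

function nextPage(cursor: number, limit: number): Post[] {
  return posts
    .filter((p) => p.id > cursor) // WHERE id > cursor
    .sort((a, b) => a.id - b.id)  // ORDER BY id
    .slice(0, limit);             // LIMIT n
}

const page1 = nextPage(0, 10);                          // ids 1..10
const page2 = nextPage(page1[page1.length - 1].id, 10); // ids 11..20
```

Prisma exposes the same idea directly via the `cursor` and `take` options on `findMany`.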
ORM Best Practices
Avoiding N+1 Queries
// ❌ N+1 problem: one query per post
const posts = await prisma.post.findMany();
for (const post of posts) {
  const author = await prisma.user.findUnique({
    where: { id: post.authorId },
  });
}
// ✅ Use include/eager loading: a single query
const posts = await prisma.post.findMany({
  include: { author: true },
});
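When eager loading isn't available, the N+1 pattern can also be replaced by one batched lookup: collect the foreign keys, fetch them in a single `WHERE id IN (...)` query, and join in memory. A sketch where in-memory arrays stand in for the two tables:

```typescript
// Manual batching: one lookup for all authors instead of one per post.
interface User {
  id: number;
  name: string;
}
interface PostRow {
  id: number;
  authorId: number;
}

// Stand-ins for the users and posts tables
const users: User[] = [
  { id: 1, name: "Ada" },
  { id: 2, name: "Grace" },
];
const postRows: PostRow[] = [
  { id: 10, authorId: 1 },
  { id: 11, authorId: 2 },
  { id: 12, authorId: 1 },
];

function withAuthors(rows: PostRow[]) {
  // One "query": SELECT * FROM users WHERE id IN (...)
  const ids = [...new Set(rows.map((p) => p.authorId))];
  const byId = new Map(
    users.filter((u) => ids.includes(u.id)).map((u) => [u.id, u]),
  );
  // Join in memory
  return rows.map((p) => ({ ...p, authorName: byId.get(p.authorId)?.name }));
}

const joined = withAuthors(postRows);
```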
Selecting Only Needed Fields
// ✅ Select specific fields
const users = await prisma.user.findMany({
  select: {
    id: true,
    name: true,
    email: true,
  },
});
Connection Pooling
Why It Matters
Database connections are expensive to create. Pooling reuses connections:
// Using pg's built-in pool
import { Pool } from 'pg';

const pool = new Pool({
  max: 20,                       // maximum connections in the pool
  idleTimeoutMillis: 30000,      // close idle connections after 30s
  connectionTimeoutMillis: 2000, // fail fast if no connection is free
});
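The payoff is that a connection is created once and then handed back out on later requests. A toy pool makes the reuse visible — `FakeConnection` and `TinyPool` are illustrative stand-ins, not a real driver:

```typescript
// A toy pool: connections are created once (expensive) and reused (cheap).
class FakeConnection {
  constructor(public readonly id: number) {}
}

class TinyPool {
  private idle: FakeConnection[] = [];
  private created = 0;

  constructor(private max: number) {}

  acquire(): FakeConnection {
    if (this.idle.length > 0) return this.idle.pop()!; // reuse path
    if (this.created >= this.max) throw new Error("pool exhausted");
    return new FakeConnection(++this.created);         // expensive path
  }

  release(conn: FakeConnection): void {
    this.idle.push(conn); // back to the idle set for the next caller
  }

  get totalCreated(): number {
    return this.created;
  }
}

const pool = new TinyPool(20);
const c1 = pool.acquire();
pool.release(c1);
const c2 = pool.acquire(); // same connection object, nothing new created
```

Real pools add waiting queues, health checks, and idle timeouts on top of this acquire/release cycle.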
Prisma Connection Pool
# .env
DATABASE_URL="postgresql://user:pass@host:5432/db?connection_limit=10"
Caching Strategies
Query Result Caching
import { Redis } from 'ioredis';
const redis = new Redis();
async function getUser(id: string) {
  const cacheKey = `user:${id}`;

  // Check cache first
  const cached = await redis.get(cacheKey);
  if (cached) return JSON.parse(cached);

  // Query database
  const user = await prisma.user.findUnique({ where: { id } });

  // Cache for 5 minutes (skip caching missing users)
  if (user) await redis.setex(cacheKey, 300, JSON.stringify(user));
  return user;
}
Invalidation Strategy
async function updateUser(id: string, data: UserData) {
  // Update database first
  const user = await prisma.user.update({
    where: { id },
    data,
  });

  // Then invalidate the cache so the next read repopulates it
  await redis.del(`user:${id}`);
  return user;
}
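The full cache-aside cycle — miss, hit, invalidate, miss again — can be demonstrated without Redis using an in-memory map. `loadUser` below is a hypothetical stand-in for the database query:

```typescript
// Cache-aside with TTL and explicit invalidation, sketched with a Map.
interface CacheEntry {
  value: string;
  expiresAt: number;
}

const cache = new Map<string, CacheEntry>();
let dbReads = 0;

// Stand-in for the real database query
function loadUser(id: string): string {
  dbReads++;
  return `user-${id}`;
}

function getUserCached(id: string, ttlMs = 300_000): string {
  const key = `user:${id}`;
  const hit = cache.get(key);
  if (hit && hit.expiresAt > Date.now()) return hit.value; // cache hit
  const value = loadUser(id);                              // cache miss
  cache.set(key, { value, expiresAt: Date.now() + ttlMs });
  return value;
}

function invalidateUser(id: string): void {
  cache.delete(`user:${id}`); // next read goes back to the database
}

getUserCached("1"); // miss: 1 db read
getUserCached("1"); // hit: still 1 db read
invalidateUser("1");
getUserCached("1"); // miss again: 2 db reads
```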
Performance Monitoring
Key Metrics to Track
- Query execution time - Average and p95/p99
- Connection pool utilization - Are you running out?
- Lock waits - Contention issues
- Cache hit rate - Is caching working?
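Averages hide outliers, which is why the list above calls out p95/p99. A small sketch of computing them from recorded durations — the `durationsMs` array is sample data; in practice you would record a timing around each query:

```typescript
// Percentiles from recorded query durations (nearest-rank method).
function percentile(values: number[], p: number): number {
  const sorted = [...values].sort((a, b) => a - b);
  const idx = Math.min(
    sorted.length - 1,
    Math.ceil((p / 100) * sorted.length) - 1,
  );
  return sorted[idx];
}

// Sample data: mostly fast queries with two slow outliers
const durationsMs = [12, 15, 11, 14, 250, 13, 16, 12, 300, 14];

const avg = durationsMs.reduce((a, b) => a + b, 0) / durationsMs.length;
const p95 = percentile(durationsMs, 95);
// avg ≈ 66ms looks fine; p95 = 300ms reveals the outliers
```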
PostgreSQL Stats
-- Most time-consuming queries (requires the pg_stat_statements extension;
-- these column names are for PostgreSQL 13+)
SELECT
  substring(query, 1, 50) AS query,
  calls,
  round(total_exec_time::numeric, 2) AS total_ms,
  round(mean_exec_time::numeric, 2) AS avg_ms
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
Quick Wins Checklist
- [ ] Add indexes to foreign keys
- [ ] Enable query logging for slow queries
- [ ] Use eager loading to avoid N+1
- [ ] Implement connection pooling
- [ ] Add caching for frequently accessed data
- [ ] Use cursor pagination instead of OFFSET
- [ ] Select only needed columns
- [ ] Monitor and review query performance regularly
Conclusion
Database optimization is an ongoing process. Start by measuring, identify the worst offenders, and optimize incrementally. Remember: premature optimization is the root of all evil, but slow databases are the root of user frustration!