GraphQL makes it stupidly easy to write queries that murder your database. Our homepage was hitting the database 2,031 times per page load. Yeah.
Here’s how we got it down to 3 queries without changing the API.
The query that looked innocent
```graphql
query Homepage {
  posts {
    id
    title
    author {
      id
      name
      avatar
    }
    comments {
      id
      text
      author {
        name
      }
    }
  }
}
```
Looks fine. Perfectly reasonable GraphQL query. Returns 10 posts with their authors and comments.
The database explosion
Our naive resolver implementation:
```javascript
const resolvers = {
  Query: {
    posts: () => db.query("SELECT * FROM posts LIMIT 10")
  },
  Post: {
    // db.query returns an array of rows, so single-row lookups take the first row
    author: async (post) =>
      (await db.query("SELECT * FROM users WHERE id = ?", [post.author_id]))[0],
    comments: (post) => db.query("SELECT * FROM comments WHERE post_id = ?", [post.id])
  },
  Comment: {
    author: async (comment) =>
      (await db.query("SELECT * FROM users WHERE id = ?", [comment.author_id]))[0]
  }
};
```
For 10 posts with an average of 20 comments each:

- 1 query: SELECT * FROM posts LIMIT 10
- 10 queries: SELECT * FROM users WHERE id = ? (one per post author)
- 10 queries: SELECT * FROM comments WHERE post_id = ? (one per post)
- 200 queries: SELECT * FROM users WHERE id = ? (one per comment author)

That's 221 queries total.
And that's the N+1 problem. Strictly it's 1 + N + N + N×M: one query for the posts, N for post authors, N for comment lists, and N×M for comment authors. Brutal.
Page load time: 3.2 seconds. Most of it waiting on database queries.
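The arithmetic generalizes. A quick sanity check in plain JavaScript (a throwaway helper for illustration, not part of our codebase):

```javascript
// Total queries for the naive resolvers: 1 for the posts page,
// N for post authors, N for per-post comment lists, and N*M for comment authors.
function naiveQueryCount(posts, commentsPerPost) {
  return 1 + posts + posts + posts * commentsPerPost;
}

console.log(naiveQueryCount(10, 20)); // 221
```

Double the comment count and you're at 421 queries. The comment-author term dominates everything else.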
DataLoader: The solution everyone mentions
Facebook’s DataLoader batches and caches database calls:
```javascript
import DataLoader from 'dataloader';

const userLoader = new DataLoader(async (userIds) => {
  const users = await db.query(
    "SELECT * FROM users WHERE id IN (?)",
    [userIds]
  );
  // Return users in the same order as the requested IDs
  return userIds.map(id => users.find(u => u.id === id));
});

const resolvers = {
  Post: {
    author: (post) => userLoader.load(post.author_id) // Batched!
  },
  Comment: {
    author: (comment) => userLoader.load(comment.author_id) // Batched!
  }
};
```
Now instead of 210 individual SELECT * FROM users WHERE id = ? queries, we get:

```sql
-- One batched query for all unique user IDs
SELECT * FROM users WHERE id IN (1, 2, 3, 4, 5, 6, 7...)
```
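One subtlety worth calling out: DataLoader's batch function must return results in exactly the order of the requested IDs, with a placeholder for any missing row. The `userIds.map(...)` line above does this; here is the same reordering pulled out as a standalone helper (hypothetical name, for illustration):

```javascript
// Re-order a batch of DB rows to match the requested key order.
// DataLoader's contract: result[i] must correspond to keys[i],
// and missing keys should map to an Error (or null), never be dropped.
function orderByKeys(keys, rows) {
  const byId = new Map(rows.map((row) => [row.id, row]));
  return keys.map((id) => byId.get(id) ?? new Error(`No user with id ${id}`));
}

// Rows can come back from SQL in any order:
orderByKeys([3, 1], [{ id: 1, name: 'Ada' }, { id: 3, name: 'Lin' }]);
// → [{ id: 3, name: 'Lin' }, { id: 1, name: 'Ada' }]
```

Using a Map also avoids the O(n²) `users.find` lookup when batches get large.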
Down to 13 queries total: 1 for posts, 10 for comments (still one per post), and 2 batched user queries (post authors, comment authors). Big improvement.
Page load: 800ms.
But DataLoader has gotchas
DataLoader caches per request. This bit us:
```javascript
// User updates their profile
await db.query("UPDATE users SET name = ? WHERE id = ?", ["New Name", userId]);

// Immediately query for that user
const user = await userLoader.load(userId);
// Still returns the old name! Cached from earlier in the request
```
Had to manually clear the cache:
```javascript
await db.query("UPDATE users SET name = ? WHERE id = ?", ["New Name", userId]);
userLoader.clear(userId); // Clear the cache for this ID
```
Easy to forget. Caused subtle bugs where data appeared stale.
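To see the failure mode in isolation, here's a stripped-down, hypothetical loader with DataLoader-style per-request memoization (no batching), plus a fake in-memory "users table" standing in for the database:

```javascript
// Minimal DataLoader-style cache: load() memoizes by id until clear()ed.
class TinyLoader {
  constructor(fetchFn) {
    this.fetchFn = fetchFn;
    this.cache = new Map();
  }
  async load(id) {
    if (!this.cache.has(id)) this.cache.set(id, this.fetchFn(id));
    return this.cache.get(id);
  }
  clear(id) {
    this.cache.delete(id);
    return this;
  }
}

// Fake table standing in for the users table:
const users = { 1: { id: 1, name: 'Old Name' } };
const loader = new TinyLoader(async (id) => ({ ...users[id] }));

(async () => {
  await loader.load(1);                      // caches "Old Name"
  users[1].name = 'New Name';                // the UPDATE
  console.log((await loader.load(1)).name);  // "Old Name" (stale)
  loader.clear(1);
  console.log((await loader.load(1)).name);  // "New Name"
})();
```

The real DataLoader behaves the same way within a request, which is why the recommended pattern is to create fresh loaders per request and to `clear()` (or `prime()`) after any write.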
The better solution: SQL joins
Why are we even doing separate queries? SQL can join this data:
```javascript
const Query = {
  posts: async () => {
    const postsWithAuthors = await db.query(`
      SELECT
        p.*,
        u.id as author_id,
        u.name as author_name,
        u.avatar as author_avatar
      FROM posts p
      LEFT JOIN users u ON p.author_id = u.id
      LIMIT 10
    `);
    return postsWithAuthors.map(row => ({
      id: row.id,
      title: row.title,
      author: {
        id: row.author_id,
        name: row.author_name,
        avatar: row.author_avatar
      }
    }));
  }
};
```
One query instead of 11. But comments are still separate queries. Can we join those too?
```javascript
const Query = {
  posts: async () => {
    const rows = await db.query(`
      SELECT
        p.*,
        author.id as author_id,
        author.name as author_name,
        c.id as comment_id,
        c.text as comment_text,
        commenter.id as commenter_id,
        commenter.name as commenter_name
      FROM posts p
      LEFT JOIN users author ON p.author_id = author.id
      LEFT JOIN comments c ON c.post_id = p.id
      LEFT JOIN users commenter ON c.author_id = commenter.id
      -- MySQL rejects LIMIT directly inside an IN subquery,
      -- so wrap it in a derived table
      WHERE p.id IN (SELECT id FROM (SELECT id FROM posts LIMIT 10) AS first_page)
    `);

    // Group rows by post
    const postsMap = new Map();
    rows.forEach(row => {
      if (!postsMap.has(row.id)) {
        postsMap.set(row.id, {
          id: row.id,
          title: row.title,
          author: {
            id: row.author_id,
            name: row.author_name
          },
          comments: []
        });
      }
      if (row.comment_id) {
        postsMap.get(row.id).comments.push({
          id: row.comment_id,
          text: row.comment_text,
          author: {
            id: row.commenter_id,
            name: row.commenter_name
          }
        });
      }
    });
    return Array.from(postsMap.values());
  }
};
```
Down to 1 query. But this approach has problems:

- Cartesian explosion: 10 posts × 20 comments means 200 rows returned for 10 posts
- Overfetching: we return all comment data even if the client only asked for comments { id }
- Maintenance nightmare: add a field to the schema and you have to update the SQL join and the mapping logic
The actual solution: Prisma
Switched to Prisma ORM which handles this automatically:
```javascript
const posts = await prisma.post.findMany({
  take: 10,
  include: {
    author: true,
    comments: {
      include: {
        author: true
      }
    }
  }
});
```
Prisma generates SQL roughly like this:

```sql
-- Query 1: Get posts with authors
SELECT p.*, u.*
FROM posts p
LEFT JOIN users u ON p.author_id = u.id
LIMIT 10;

-- Query 2: Get all comments for these posts
SELECT c.*
FROM comments c
WHERE c.post_id IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

-- Query 3: Get all comment authors
SELECT u.*
FROM users u
WHERE u.id IN (SELECT DISTINCT author_id FROM comments WHERE post_id IN (...));
```
Three queries. No N+1 problem. No manual batching. Prisma figures it out.
Handling dynamic field selection
What if the client doesn’t request comments?
```graphql
query Homepage {
  posts {
    id
    title
    # No comments field!
  }
}
```
Prisma still fetches comments. Overfetching.
Solution: Use GraphQL field info:
```typescript
import { GraphQLResolveInfo } from 'graphql';
import graphqlFields from 'graphql-fields';

const Query = {
  posts: async (parent, args, context, info: GraphQLResolveInfo) => {
    const fields = graphqlFields(info);
    return prisma.post.findMany({
      take: 10,
      include: {
        author: 'author' in fields,
        comments: 'comments' in fields ? {
          include: {
            author: fields.comments?.author !== undefined
          }
        } : false
      }
    });
  }
};
```
Now Prisma only fetches data the client actually requested. No overfetching.
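The mapping from field map to include object is easy to factor out. Here's a small helper (hypothetical name, assuming the graphql-fields output shape used above, where each requested field appears as a key):

```javascript
// Translate a graphql-fields map (e.g. { id: {}, comments: { author: {} } })
// into the Prisma-style include object used above.
function buildInclude(fields) {
  return {
    author: 'author' in fields,
    comments:
      'comments' in fields
        ? { include: { author: 'author' in fields.comments } }
        : false
  };
}

buildInclude({ id: {}, title: {} });
// → { author: false, comments: false }
buildInclude({ title: {}, comments: { id: {}, author: {} } });
// → { author: false, comments: { include: { author: true } } }
```

Extracting it also makes the selection logic unit-testable without spinning up a GraphQL server.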
Query complexity limits
Even with perfect resolvers, clients can still abuse GraphQL:
```graphql
query Evil {
  posts {
    comments {
      author {
        posts {
          comments {
            author {
              posts {
                # Infinite nesting!
              }
            }
          }
        }
      }
    }
  }
}
```
This crashes our server. Solution: Query complexity limits.
```javascript
import { createComplexityLimitRule } from 'graphql-validation-complexity';

const server = new ApolloServer({
  typeDefs,
  resolvers,
  validationRules: [
    createComplexityLimitRule(1000, {
      onCost: (cost) => console.log('Query cost:', cost),
      scalarCost: 1,
      objectCost: 2,
      listFactor: 10
    })
  ]
});
```
Complexity calculation:
- Each scalar field: 1 point
- Each object field: 2 points
- Each list: × 10 multiplier
Our posts query with authors and comments: ~250 points. Acceptable.
The evil nested query: ~50,000 points. Rejected.
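As a back-of-the-envelope model of how those settings compound (the real library walks the parsed query AST; this is just the arithmetic, with a made-up field shape for illustration):

```javascript
// Approximate cost: scalar leaves are 1, objects are 2 plus their children,
// and list-valued fields multiply their whole subtree by listFactor (10).
function cost(field) {
  if (!field.fields) return 1; // scalar leaf
  const children = field.fields.reduce((sum, f) => sum + cost(f), 0);
  const subtree = 2 + children;
  return field.list ? subtree * 10 : subtree;
}

// posts { id title } where posts is a list:
cost({ list: true, fields: [{}, {}] }); // (2 + 1 + 1) * 10 = 40
```

The multiplier is what makes nesting explode: every extra list level multiplies everything beneath it by 10, which is why the evil query's cost lands in the tens of thousands.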
Query depth limits
Also limit nesting depth:
```javascript
import depthLimit from 'graphql-depth-limit';

const server = new ApolloServer({
  validationRules: [
    depthLimit(5) // Max 5 levels of nesting
  ]
});
```
Now the infinite nesting query fails validation before execution.
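Depth is much cheaper to check than cost. Conceptually it's just the deepest selection-set nesting, sketched here over the same made-up field shape as above:

```javascript
// Depth of a selection: 1 for the field itself plus its deepest child.
function maxDepth(field) {
  if (!field.fields || field.fields.length === 0) return 1;
  return 1 + Math.max(...field.fields.map(maxDepth));
}

// posts { comments { author { name } } }
maxDepth({ fields: [{ fields: [{ fields: [{}] }] }] }); // 4
```

We run both rules: depth catches pathological nesting outright, and the complexity rule catches wide-but-shallow queries that depth alone would allow.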
Monitoring query performance
Added Apollo Studio integration:
```javascript
import { ApolloServerPluginUsageReporting } from 'apollo-server-core';

const server = new ApolloServer({
  apollo: {
    key: process.env.APOLLO_KEY,
    graphRef: process.env.APOLLO_GRAPH_REF
  },
  plugins: [
    ApolloServerPluginUsageReporting({
      // Careful: this reports all variable values and headers to Apollo.
      // Scrub anything sensitive before enabling it in production.
      sendVariableValues: { all: true },
      sendHeaders: { all: true }
    })
  ]
});
```
Now we see:
- Which queries are slowest
- Which resolvers take the most time
- Query distribution (how often each query is called)
Discovered our searchPosts query was called 1000× more than expected. Client was polling it every second. Fixed by adding a subscription instead.
Caching with Redis
For expensive queries that don’t change often:
```javascript
import Keyv from 'keyv';
import { KeyvAdapter } from '@apollo/utils.keyvadapter';

const server = new ApolloServer({
  cache: new KeyvAdapter(new Keyv('redis://redis:6379'))
});
```
```javascript
const Query = {
  // Assumes the server's cache instance is exposed on the resolver context
  popularPosts: async (_, __, { cache }) => {
    const cacheKey = 'popularPosts';
    const cached = await cache.get(cacheKey);
    if (cached) {
      return JSON.parse(cached);
    }
    const posts = await prisma.post.findMany({
      orderBy: { views: 'desc' },
      take: 10
    });
    await cache.set(cacheKey, JSON.stringify(posts), { ttl: 300 }); // 5 min
    return posts;
  }
};
```
Popular posts change slowly. Cache for 5 minutes, save database load.
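The get-parse-set dance repeats for every cached resolver, so it's worth factoring into a generic cache-aside helper. A sketch, assuming `cache` is anything with async `get`/`set` methods like Apollo's KeyValueCache interface:

```javascript
// Cache-aside: return the cached value if present; otherwise compute,
// store with a TTL (in seconds), and return the fresh value.
async function cachedQuery(cache, key, ttlSeconds, compute) {
  const hit = await cache.get(key);
  if (hit !== undefined && hit !== null) return JSON.parse(hit);
  const value = await compute();
  await cache.set(key, JSON.stringify(value), { ttl: ttlSeconds });
  return value;
}
```

The popularPosts resolver then collapses to one line: `cachedQuery(cache, 'popularPosts', 300, () => prisma.post.findMany({ orderBy: { views: 'desc' }, take: 10 }))`.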
What actually fixed our performance
- Prisma - Handles N+1 automatically with smart query batching
- Field selection - Only fetch data the client requested
- Query limits - Prevent abusive queries from reaching resolvers
- Monitoring - Apollo Studio showed us which queries needed optimization
- Caching - Redis for expensive, cacheable queries
Current stats:
- Homepage: 3 database queries (down from 2,031)
- Page load: 180ms (down from 3.2s)
- Database CPU: 5% (down from 85%)
The key insight: GraphQL’s flexibility is great for clients, terrible for databases. You need an ORM or dataloading layer that understands GraphQL’s execution model and can batch queries intelligently.
DataLoader works but requires manual setup for every relationship. Prisma just handles it. We stopped fighting the framework and let the tools do the work.
Also turns out some of our slow queries were slow because of missing indexes, not N+1 problems. But that’s a different article.