Django’s select_related Didn’t Fix My N+1 Queries

Django’s select_related Didn’t Fix My N+1 Queries

Last month I had a slow page loading posts with their authors and comments. Django Debug Toolbar showed 847 database queries for 50 posts.

Classic N+1 problem. I knew the fix: use select_related() .

# Added select_related
posts = Post.objects.select_related('author', 'comments').all()

Ran the page again. Still 800+ queries. The optimization didn’t work.

Turns out, I used the wrong optimization. select_related() only works for foreign keys and one-to-one relationships. For reverse foreign keys (like comments on a post), you need prefetch_related() .

# The correct fix
posts = Post.objects.select_related('author').prefetch_related('comments')

Queries dropped from 847 to 3. Page load time went from 2.3 seconds to 0.15 seconds.

The problem? I didn’t understand when to use which optimization.

Understanding The Difference

Django has two query optimizations that look similar but work completely differently:

select_related: SQL JOIN

# One query with JOIN
posts = Post.objects.select_related('author')

# SQL: SELECT * FROM posts JOIN users ON posts.author_id = users.id
How it works:
Creates a SQL JOIN
Gets related data in a single query
Returns one big result set
Use for:
ForeignKey relationships (post → author)
OneToOneField relationships (user → profile)
Following relationships “forward” (from child to parent)

prefetch_related: Separate Queries

# Two queries total
posts = Post.objects.prefetch_related('comments')

# Query 1: SELECT * FROM posts
# Query 2: SELECT * FROM comments WHERE post_id IN (1, 2, 3, ...)
How it works:
Runs separate queries
Gets related data in batches
Joins data in Python
Use for:
ManyToManyField relationships (post ↔ tags)
Reverse ForeignKey relationships (post ← comments)
Following relationships “backward” (from parent to children)

Mistake 1: Using It For Reverse Relationships

The problem:

# Get posts with their comments
posts = Post.objects.select_related('comments').all()  # ❌ Doesn't work

for post in posts:
    for comment in post.comments.all():  # Still N+1 queries!
        print(comment.text)

What happens: select_related silently ignores 'comments' because it's a reverse relationship. You still get N+1 queries.

the fix:

# Use prefetch_related for reverse relationships
posts = Post.objects.prefetch_related('comments').all()  # ✅ Correct

for post in posts:
    for comment in post.comments.all():  # No extra queries
        print(comment.text)

Mistake 2: Using It For ManyToMany

The problem:

# Get posts with their tags
posts = Post.objects.select_related('tags').all()  # ❌ Doesn't work

for post in posts:
    for tag in post.tags.all():  # N+1 queries!
        print(tag.name)

What happens: select_related doesn't work with ManyToMany fields. Silently ignored.

the fix:

posts = Post.objects.prefetch_related('tags').all()  # ✅ Correct

Mistake 3: Forgetting Nested Relationships

The problem:

# Get posts with comments, but forget comment authors
posts = Post.objects.prefetch_related('comments').all()  # ❌ Incomplete

for post in posts:
    for comment in post.comments.all():
        print(comment.author.username)  # N+1 on comment authors!

What happens: You fixed post → comments, but not comments → authors.

the fix:

# Use double underscore for nested relationships
posts = Post.objects.prefetch_related(
    'comments',
    'comments__author'  # ✅ Fetch comment authors too
).all()

# Or chain them:
posts = Post.objects.prefetch_related('comments__author').all()

Mistake 4: Mixing Them Wrong

The problem:

# Post has author (FK) and tags (M2M)
posts = Post.objects.prefetch_related('author', 'tags').all()  # ❌ Wrong

# This works, but uses 3 queries when 2 would do:
# Query 1: Get posts
# Query 2: Get authors (should use JOIN instead)
# Query 3: Get tags

the fix:

# Use select_related for FK, prefetch_related for M2M
posts = Post.objects.select_related('author').prefetch_related('tags').all()  # ✅

# Query 1: Get posts with authors (JOIN)
# Query 2: Get tags

Mistake 5: Filtering After Prefetch

The problem:

posts = Post.objects.prefetch_related('comments').all()

for post in posts:
    # This creates new queries!
    recent_comments = post.comments.filter(created_at__gte=yesterday)  # ❌ N queries

What happens: Filtering on a prefetched queryset makes Django ignore the prefetch and query again.

the fix:

from django.db.models import Prefetch

# Filter in the Prefetch object
posts = Post.objects.prefetch_related(
    Prefetch(
        'comments',
        queryset=Comment.objects.filter(created_at__gte=yesterday),
        to_attr='recent_comments'
    )
).all()

for post in posts:
    for comment in post.recent_comments:  # ✅ Uses prefetched data
        print(comment.text)

Real-World Examples

Example 1: Blog Post List

Models:

class Author(models.Model):
    name = models.CharField(max_length=100)
    email = models.EmailField()

class Category(models.Model):
    name = models.CharField(max_length=50)

class Post(models.Model):
    title = models.CharField(max_length=200)
    author = models.ForeignKey(Author, on_delete=models.CASCADE)
    category = models.ForeignKey(Category, on_delete=models.CASCADE)
    tags = models.ManyToManyField('Tag', related_name='posts')

class Tag(models.Model):
    name = models.CharField(max_length=30)

class Comment(models.Model):
    post = models.ForeignKey(Post, on_delete=models.CASCADE, related_name='comments')
    author = models.ForeignKey(Author, on_delete=models.CASCADE)
    text = models.TextField()

Bad query (100+ queries):

def post_list(request):
    posts = Post.objects.all()  # ❌ No optimization
    
    # In template or loop:
    for post in posts:
        print(post.author.name)       # Query per post
        print(post.category.name)     # Query per post
        for tag in post.tags.all():   # Query per post
            print(tag.name)
        for comment in post.comments.all():  # Query per post
            print(comment.author.name)       # Query per comment

Good query (5 queries total):

def post_list(request):
    posts = Post.objects.select_related(
        'author',      # ✅ FK: Use JOIN
        'category'     # ✅ FK: Use JOIN
    ).prefetch_related(
        'tags',                    # ✅ M2M: Separate query
        'comments__author'         # ✅ Reverse FK + nested FK
    ).all()
    
    # Now all data is loaded:
    # Query 1: Posts + authors + categories (JOIN)
    # Query 2: Tags
    # Query 3: Comments
    # Query 4: Comment authors

Example 2: E-commerce Order Page

Models:

class Customer(models.Model):
    name = models.CharField(max_length=100)
    email = models.EmailField()

class Order(models.Model):
    customer = models.ForeignKey(Customer, on_delete=models.CASCADE)
    created_at = models.DateTimeField(auto_now_add=True)

class OrderItem(models.Model):
    order = models.ForeignKey(Order, on_delete=models.CASCADE, related_name='items')
    product = models.ForeignKey('Product', on_delete=models.CASCADE)
    quantity = models.IntegerField()
    price = models.DecimalField(max_digits=10, decimal_places=2)

class Product(models.Model):
    name = models.CharField(max_length=200)
    category = models.ForeignKey('Category', on_delete=models.CASCADE)

Optimized query:

def order_detail(request, order_id):
    order = Order.objects.select_related(
        'customer'  # ✅ Get customer with order (FK)
    ).prefetch_related(
        'items__product__category'  # ✅ Get items → products → categories
    ).get(id=order_id)
    
    # Renders complete order with all data in 3 queries:
    # Query 1: Order + customer (JOIN)
    # Query 2: Order items
    # Query 3: Products + categories (JOIN)
    
    return render(request, 'order_detail.html', {'order': order})

Example 3: Social Media Feed

Models:

class User(models.Model):
    username = models.CharField(max_length=50)
    followers = models.ManyToManyField('self', symmetrical=False, related_name='following')

class Post(models.Model):
    user = models.ForeignKey(User, on_delete=models.CASCADE, related_name='posts')
    content = models.TextField()
    likes = models.ManyToManyField(User, related_name='liked_posts')

class Comment(models.Model):
    post = models.ForeignKey(Post, on_delete=models.CASCADE, related_name='comments')
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    text = models.TextField()

Optimized feed query:

def user_feed(request):
    user = request.user
    
    # Get posts from people user follows
    posts = Post.objects.filter(
        user__in=user.following.all()
    ).select_related(
        'user'  # ✅ Post author (FK)
    ).prefetch_related(
        'likes',           # ✅ Users who liked (M2M)
        'comments__user'   # ✅ Comments + comment authors
    ).order_by('-created_at')[:20]
    
    return render(request, 'feed.html', {'posts': posts})

Advanced: Custom Prefetch Queries

Filtering Prefetched Data

from django.db.models import Prefetch
from datetime import timedelta
from django.utils import timezone

# Only prefetch approved comments from last 7 days
recent_date = timezone.now() - timedelta(days=7)

posts = Post.objects.prefetch_related(
    Prefetch(
        'comments',
        queryset=Comment.objects.filter(
            approved=True,
            created_at__gte=recent_date
        ).select_related('author'),  # Nested optimization
        to_attr='recent_approved_comments'
    )
).all()

for post in posts:
    for comment in post.recent_approved_comments:  # Uses filtered prefetch
        print(comment.text)

Multiple Prefetch Queries

# Get both all comments AND top comments separately
posts = Post.objects.prefetch_related(
    'comments',  # All comments
    Prefetch(
        'comments',
        queryset=Comment.objects.filter(likes__gte=10).order_by('-likes')[:5],
        to_attr='top_comments'
    )
).all()

for post in posts:
    print(f"Total: {post.comments.count()}")  # All comments
    print("Top comments:")
    for comment in post.top_comments:  # Only top 5
        print(comment.text)

Annotating Prefetched Data

from django.db.models import Count

# Prefetch comments with like counts
posts = Post.objects.prefetch_related(
    Prefetch(
        'comments',
        queryset=Comment.objects.annotate(
            like_count=Count('likes')
        ).order_by('-like_count')
    )
).all()

for post in posts:
    for comment in post.comments.all():
        print(f"{comment.text} ({comment.like_count} likes)")

Performance Comparison

Let me show you real numbers from a test with 100 posts:

No optimization:
posts = Post.objects.all()
# 1 + 100 + 100 + 500 = 701 queries
# Load time: 3.2 seconds
Only select_related:
posts = Post.objects.select_related('author', 'category')
# 1 + 100 + 500 = 601 queries (only saved 100)
# Load time: 2.8 seconds
Only prefetch_related:
posts = Post.objects.prefetch_related('tags', 'comments')
# 1 + 100 + 2 = 103 queries (better, but still N queries for authors)
# Load time: 0.5 seconds
Both optimizations:
posts = Post.objects.select_related(
    'author', 'category'
).prefetch_related(
    'tags', 'comments__author'
)
# 1 + 2 + 1 = 4 queries (optimal!)
# Load time: 0.08 seconds

Debugging Query Optimization

Use Django Debug Toolbar

pip install django-debug-toolbar
# settings.py
INSTALLED_APPS += ['debug_toolbar']
MIDDLEWARE += ['debug_toolbar.middleware.DebugToolbarMiddleware']

# Shows:
# - Number of queries
# - Duplicate queries
# - Slow queries
# - Exact SQL executed

Use QuerySet.explain()

model:

class Publisher(models.Model):
    name = models.CharField(max_length=300)

    def __str__(self):
        return self.name

class Book(models.Model):
    name = models.CharField(max_length=300)
    price = models.IntegerField(default=0)
    publisher = models.ForeignKey(Publisher, on_delete=models.CASCADE)

    class Meta:
        default_related_name = 'books'

    def __str__(self):
        return self.name

class Store(models.Model):
    name = models.CharField(max_length=300)
    books = models.ManyToManyField(Book)

    class Meta:
        default_related_name = 'stores'

    def __str__(self):
        return self.name
books = Book.objects.select_related('publisher')
print(books.explain())
# Output shows:
# - Join operations
# - Index usage
# - Estimated rows

# Output:
3 0 216 SCAN app_bookstore_book
5 0 45 SEARCH app_bookstore_publisher USING INTEGER PRIMARY KEY (rowid=?)

Count Queries Manually

from django.test.utils import override_settings
from django.db import connection
from django.db import reset_queries

@override_settings(DEBUG=True)
def test_queries():
    reset_queries()
    
    # Your code here
    posts = Post.objects.select_related('author').all()
    list(posts)  # Force evaluation
    
    print(f"Queries: {len(connection.queries)}")
    for query in connection.queries:
        print(query['sql'])

Quick Decision Guide

Use select_related when:
Accessing a ForeignKey: post.author
Accessing a OneToOneField: user.profile
You control the relationship (not a reverse relation)
You want one query with JOIN
Use prefetch_related when:
Accessing reverse ForeignKey: post.comments
Accessing ManyToManyField: post.tags
You need to filter the related data
You want separate queries
Use both when:
You have mixed relationship types
Example: select_related('author').prefetch_related('tags')

Common Patterns

Pattern 1: List View

# List of items with related data
items = Item.objects.select_related(
    'owner',
    'category'
).prefetch_related(
    'tags',
    'images'
)

Pattern 2: Detail View

# Single item with all related data
item = Item.objects.select_related(
    'owner__profile',  # Nested: owner and their profile
    'category__parent'  # Nested: category and parent category
).prefetch_related(
    'tags',
    'images',
    'reviews__author'  # Reviews and review authors
).get(id=item_id)

Pattern 3: API Serialization

# Optimize for JSON serialization
def get_posts_json(request):
    posts = Post.objects.select_related(
        'author'
    ).prefetch_related(
        'tags',
        'comments__author'
    ).all()
    
    data = [{
        'title': post.title,
        'author': post.author.name,
        'tags': [tag.name for tag in post.tags.all()],
        'comments': [{
            'text': c.text,
            'author': c.author.name
        } for c in post.comments.all()]
    } for post in posts]
    
    return JsonResponse({'posts': data})

Testing Query Counts

from django.test import TestCase
from django.test.utils import override_settings

class QueryOptimizationTest(TestCase):
    @override_settings(DEBUG=True)
    def test_post_list_queries(self):
        # Create test data
        author = Author.objects.create(name='Test')
        post = Post.objects.create(author=author, title='Test')
        Comment.objects.create(post=post, author=author, text='Test')
        
        from django.db import connection
        from django.db import reset_queries
        
        reset_queries()
        
        # Test optimized query
        posts = Post.objects.select_related('author').prefetch_related('comments')
        list(posts)  # Force evaluation
        
        # Should be 2 queries: posts+authors, comments
        self.assertEqual(len(connection.queries), 2)

Key Takeaways

  • select_related = SQL JOIN = ForeignKey and OneToOne = Forward relationships
  • prefetch_related = Separate queries = ManyToMany and reverse FK = Backward relationships
  • Use double underscore for nested relationships: comments__author
  • Combine both for mixed relationships: select_related('author').prefetch_related('tags')
  • Use Prefetch object when you need to filter or customize related data
  • Always test with Django Debug Toolbar to verify optimization worked
  • Profile in production to find real bottlenecks

The Bottom Line

Query optimization isn’t optional. A page with N+1 queries will get slower as your data grows.

The rules are simple:
ForeignKey going forward? select_related()
Reverse relationships or ManyToMany? prefetch_related()
Mixed? Use both
Need filtering? Use Prefetch()

Learn these patterns once. Apply them everywhere. Your database will thank you.

SUBSCRIBE FOR NEW ARTICLES

@
comments powered by Disqus