How to use select_related and prefetch_related to optimize performance in Django Rest Framework (DRF)

November 23, 2020 Saumil Patel 4 min read

If you've ever used Django, you're probably aware of how many queries it can issue, especially when you access related objects: by default, each access to a related object that has not been loaded yet triggers its own query.

Thankfully, Django has an answer for that in the form of prefetch_related and select_related.

So what do these functions do?

As described in the Django documentation:

select_related works by creating an SQL join and including the fields of the related object in the SELECT statement. For this reason, select_related gets the related objects in the same database query. However, to avoid the much larger result set that would result from joining across a ‘many’ relationship, select_related is limited to single-valued relationships - foreign key and one-to-one.

prefetch_related, on the other hand, does a separate lookup for each relationship, and does the ‘joining’ in Python. This allows it to prefetch many-to-many and many-to-one objects, which cannot be done using select_related, in addition to the foreign key and one-to-one relationships that are supported by select_related. It also supports prefetching of GenericRelation and GenericForeignKey, however, it must be restricted to a homogeneous set of results. For example, prefetching objects referenced by a GenericForeignKey is only supported if the query is restricted to one ContentType.
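To make the difference concrete, here is a rough illustration outside of Django, using only Python's built-in sqlite3 module. It assumes two small made-up tables, musician and album, linked by a foreign key, and compares three strategies: the naive one-query-per-row approach, a single JOIN (roughly what select_related generates), and one query per table with the matching done in Python (roughly what prefetch_related does). The helper names (run, naive, joined, prefetched, count_queries) and the sample rows are invented for this sketch, and the SQL Django actually emits will differ in detail.

```python
import sqlite3

# In-memory database with two simplified, made-up tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE musician (id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE album (
        id INTEGER PRIMARY KEY,
        artist_id INTEGER REFERENCES musician(id),
        name TEXT
    );
    INSERT INTO musician VALUES (1, 'Ada'), (2, 'Ben');
    INSERT INTO album VALUES (1, 1, 'First'), (2, 1, 'Second'), (3, 2, 'Third');
""")

query_count = 0


def run(sql, params=()):
    """Execute one statement and count it, so the strategies can be compared."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def naive():
    # One query for the albums, then one more per album for its artist (N+1).
    albums = run("SELECT artist_id, name FROM album ORDER BY id")
    return [
        (name, run("SELECT first_name FROM musician WHERE id = ?", (artist_id,))[0][0])
        for artist_id, name in albums
    ]


def joined():
    # select_related-style: a single JOIN returns album and artist columns together.
    return run(
        "SELECT album.name, musician.first_name FROM album "
        "JOIN musician ON musician.id = album.artist_id ORDER BY album.id"
    )


def prefetched():
    # prefetch_related-style: one query per table, then match rows up in Python.
    musicians = run("SELECT id, first_name FROM musician ORDER BY id")
    ids = [musician_id for musician_id, _ in musicians]
    placeholders = ",".join("?" * len(ids))
    albums = run(
        f"SELECT artist_id, name FROM album WHERE artist_id IN ({placeholders}) ORDER BY id",
        ids,
    )
    by_artist = {musician_id: [] for musician_id in ids}
    for artist_id, name in albums:
        by_artist[artist_id].append(name)
    return [(first_name, by_artist[musician_id]) for musician_id, first_name in musicians]


def count_queries(fn):
    """Run fn and return its result together with the number of queries it issued."""
    global query_count
    query_count = 0
    result = fn()
    return result, query_count


for strategy in (naive, joined, prefetched):
    _, used = count_queries(strategy)
    print(f"{strategy.__name__}: {used} queries")
```

With three albums, the naive version issues four queries and grows by one with every extra album, while the joined and prefetched versions stay at one and two queries respectively no matter how many rows there are.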

How do we use these functions?

These functions can, and usually should, be applied to any QuerySet whose related objects you go on to access. Let's take some simple models for example:

from django.db import models

class Musician(models.Model):
    first_name = models.CharField(max_length=50)
    last_name = models.CharField(max_length=50)
    instrument = models.CharField(max_length=100)

class Album(models.Model):
    artist = models.ForeignKey(Musician, on_delete=models.CASCADE)
    name = models.CharField(max_length=100)
    release_date = models.DateField()
    num_stars = models.IntegerField()

With the above models of Musician and Album you'd be using select_related when retrieving the artist in an Album QuerySet: Album.objects.select_related("artist")

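And prefetch_related when retrieving album_set (the default reverse accessor for the artist foreign key) in a Musician QuerySet: Musician.objects.prefetch_related("album_set")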

Implementation in Django Rest Framework

A common practice I've seen used in Django Rest Framework is to inherit from an EagerLoadingMixin as defined below.

class EagerLoadingMixin:
    @classmethod
    def setup_eager_loading(cls, queryset):
        """
        Dynamically add the related objects to the provided queryset.

        :param queryset: the QuerySet to optimize
        """
        if hasattr(cls, "select_related_fields"):
            queryset = queryset.select_related(*cls.select_related_fields)
        if hasattr(cls, "prefetch_related_fields"):
            queryset = queryset.prefetch_related(*cls.prefetch_related_fields)
        return queryset

Once inherited into your Serializer you would define the necessary prefetch_related_fields and select_related_fields.

from rest_framework import serializers

# MusicianSerializer is assumed to be defined elsewhere in your project.
class AlbumSerializer(serializers.ModelSerializer, EagerLoadingMixin):
    artist = MusicianSerializer(many=False)
    select_related_fields = ('artist',)
    prefetch_related_fields = ()  # Only necessary if you have fields to prefetch

    class Meta:
        model = Album
        fields = ('id', 'artist', 'name', 'release_date', 'num_stars',)

With this in place, the get_queryset method in your View can apply every required select_related and prefetch_related call in a single line, like so.

def get_queryset(self):
    queryset = Album.objects.all()
    return self.get_serializer_class().setup_eager_loading(queryset)
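If you want to see what setup_eager_loading does with those class attributes without touching a database, the following is a minimal sketch that runs without Django installed. RecordingQuerySet, AlbumLoader and MusicianLoader are hypothetical stand-ins invented for this example: the first only records which calls were made, and the other two play the role of serializers. The mixin is written as a classmethod here, because the view calls it on the serializer class rather than on an instance.

```python
class RecordingQuerySet:
    """Hypothetical stand-in for a Django QuerySet; it only records the calls made."""

    def __init__(self, calls=()):
        self.calls = tuple(calls)

    def select_related(self, *fields):
        # Like a real QuerySet, return a new object instead of mutating this one.
        return RecordingQuerySet(self.calls + (("select_related", fields),))

    def prefetch_related(self, *lookups):
        return RecordingQuerySet(self.calls + (("prefetch_related", lookups),))


class EagerLoadingMixin:
    @classmethod
    def setup_eager_loading(cls, queryset):
        """Apply whichever eager-loading attributes the class declares."""
        if hasattr(cls, "select_related_fields"):
            queryset = queryset.select_related(*cls.select_related_fields)
        if hasattr(cls, "prefetch_related_fields"):
            queryset = queryset.prefetch_related(*cls.prefetch_related_fields)
        return queryset


class AlbumLoader(EagerLoadingMixin):
    # Forward foreign key: a candidate for select_related.
    select_related_fields = ("artist",)


class MusicianLoader(EagerLoadingMixin):
    # Reverse foreign key: a candidate for prefetch_related.
    prefetch_related_fields = ("album_set",)


album_calls = AlbumLoader.setup_eager_loading(RecordingQuerySet()).calls
musician_calls = MusicianLoader.setup_eager_loading(RecordingQuerySet()).calls
print(album_calls)
print(musician_calls)
```

Because the mixin only looks at class attributes, each serializer declares its own eager-loading needs and the view never has to know which relations are involved.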

Congratulations, you've just implemented one of the simplest ways to improve performance in Django.

If you have any further questions, feel free to reach out to me on Twitter @RealSaumilP.