How to use select_related and prefetch_related to optimize performance in Django Rest Framework (DRF)
November 23, 2020 Saumil Patel 4 min read
If you've ever used Django, you know how many queries it can issue, especially when you access related objects in a loop.
Thankfully, Django has an answer for that in the form of prefetch_related and select_related.
So what do these functions do?
As described by the Django documentation:
select_related works by creating an SQL join and including the fields of the related object in the SELECT statement. For this reason, select_related gets the related objects in the same database query. However, to avoid the much larger result set that would result from joining across a ‘many’ relationship, select_related is limited to single-valued relationships - foreign key and one-to-one.
prefetch_related, on the other hand, does a separate lookup for each relationship, and does the ‘joining’ in Python. This allows it to prefetch many-to-many and many-to-one objects, which cannot be done using select_related, in addition to the foreign key and one-to-one relationships that are supported by select_related. It also supports prefetching of GenericRelation and GenericForeignKey, however, it must be restricted to a homogeneous set of results. For example, prefetching objects referenced by a GenericForeignKey is only supported if the query is restricted to one ContentType.
https://docs.djangoproject.com/en/3.1/ref/models/querysets/#select-related
https://docs.djangoproject.com/en/3.1/ref/models/querysets/#prefetch-related
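To make the two strategies concrete without a Django project, here is a minimal sketch using only Python's built-in sqlite3 module. It is not what Django runs internally; the table names, sample rows, and helper functions (run, count_queries, naive, join_style, prefetch_style) are all hypothetical and exist only to show the shape of the queries: one lookup per row, a single JOIN, and one query per relationship stitched together in Python.

```python
import sqlite3

# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE musician (id INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE album (
        id INTEGER PRIMARY KEY,
        name TEXT,
        artist_id INTEGER REFERENCES musician(id)
    );
    INSERT INTO musician VALUES (1, 'Smith'), (2, 'Jones');
    INSERT INTO album VALUES (1, 'First', 1), (2, 'Second', 2), (3, 'Third', 1);
""")

query_count = 0


def run(sql, params=()):
    """Execute one statement and count it as one database round trip."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def count_queries(fn):
    """Call fn and return (result, number of queries it issued)."""
    global query_count
    query_count = 0
    result = fn()
    return result, query_count


def naive():
    # One query for the albums, then one more per album for its artist.
    albums = run("SELECT id, name, artist_id FROM album ORDER BY id")
    return [
        (name, run("SELECT last_name FROM musician WHERE id = ?", (artist_id,))[0][0])
        for _, name, artist_id in albums
    ]


def join_style():
    # The select_related idea: fetch both tables in a single JOIN.
    return run(
        "SELECT album.name, musician.last_name FROM album "
        "JOIN musician ON musician.id = album.artist_id ORDER BY album.id"
    )


def prefetch_style():
    # The prefetch_related idea: one query per relationship, joined in Python.
    musicians = run("SELECT id, last_name FROM musician ORDER BY id")
    ids = [musician_id for musician_id, _ in musicians]
    placeholders = ", ".join("?" for _ in ids)
    albums = run(
        f"SELECT name, artist_id FROM album "
        f"WHERE artist_id IN ({placeholders}) ORDER BY id",
        ids,
    )
    by_artist = {musician_id: [] for musician_id in ids}
    for name, artist_id in albums:
        by_artist[artist_id].append(name)
    return [(last_name, by_artist[musician_id]) for musician_id, last_name in musicians]


if __name__ == "__main__":
    for fn in (naive, join_style, prefetch_style):
        rows, n = count_queries(fn)
        print(fn.__name__, n, rows)
```

With three albums, the naive version issues four queries, the JOIN version one, and the prefetch version two. The gap widens as the number of rows grows, since only the naive version scales with the row count.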
How do we use these functions?
These functions can, and usually should, be applied to any QuerySet whose results you will use to reach related objects. Let's take some simple models as an example:
from django.db import models


class Musician(models.Model):
    first_name = models.CharField(max_length=50)
    last_name = models.CharField(max_length=50)
    instrument = models.CharField(max_length=100)


class Album(models.Model):
    artist = models.ForeignKey(Musician, on_delete=models.CASCADE)
    name = models.CharField(max_length=100)
    release_date = models.DateField()
    num_stars = models.IntegerField()
With the Musician and Album models above, you'd use select_related when retrieving the artist in an Album QuerySet: Album.objects.select_related("artist")
And you'd use prefetch_related when retrieving album_set (the default reverse accessor created by the artist foreign key) in a Musician QuerySet: Musician.objects.prefetch_related("album_set")
Implementation in Django Rest Framework
A common practice I've seen in Django Rest Framework is to have serializers inherit from an EagerLoadingMixin, as defined below.
class EagerLoadingMixin:
    @classmethod
    def setup_eager_loading(cls, queryset):
        """
        Add the related-object lookups declared on the class to the
        provided queryset and return the resulting queryset.
        """
        # Skip empty tuples: select_related() with no arguments would
        # follow every non-null foreign key.
        select_fields = getattr(cls, "select_related_fields", ())
        if select_fields:
            queryset = queryset.select_related(*select_fields)
        prefetch_fields = getattr(cls, "prefetch_related_fields", ())
        if prefetch_fields:
            queryset = queryset.prefetch_related(*prefetch_fields)
        return queryset
Once your serializer inherits from the mixin, define the prefetch_related_fields and select_related_fields it needs.
from rest_framework import serializers


class AlbumSerializer(serializers.ModelSerializer, EagerLoadingMixin):
    # MusicianSerializer is assumed to be defined elsewhere.
    artist = MusicianSerializer(many=False)

    select_related_fields = ('artist',)
    prefetch_related_fields = ()  # Only necessary if you have fields to prefetch

    class Meta:
        model = Album
        fields = ('id', 'artist', 'name', 'release_date', 'num_stars',)
With this in place, you can pull in all the required related fields from the get_queryset method of your View, like so:
def get_queryset(self):
    queryset = Album.objects.all()
    return self.get_serializer_class().setup_eager_loading(queryset)
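To see what setup_eager_loading hands back without a database, the sketch below swaps the real QuerySet for a hypothetical RecordingQuerySet that only records which lookups were requested. AlbumLoading and MusicianLoading are made-up stand-ins for serializers, not DRF classes. The mixin here is a variant that skips empty tuples, a choice made for this sketch because Django's select_related() called with no arguments follows every non-null foreign key.

```python
class RecordingQuerySet:
    """Hypothetical stand-in for a Django QuerySet; it only records calls."""

    def __init__(self, selected=(), prefetched=()):
        self.selected = tuple(selected)
        self.prefetched = tuple(prefetched)

    def select_related(self, *fields):
        # Like a real QuerySet, return a new object instead of mutating.
        return RecordingQuerySet(self.selected + fields, self.prefetched)

    def prefetch_related(self, *lookups):
        return RecordingQuerySet(self.selected, self.prefetched + lookups)


class EagerLoadingMixin:
    @classmethod
    def setup_eager_loading(cls, queryset):
        # Only call each method when the class declares at least one field.
        select_fields = getattr(cls, "select_related_fields", ())
        if select_fields:
            queryset = queryset.select_related(*select_fields)
        prefetch_fields = getattr(cls, "prefetch_related_fields", ())
        if prefetch_fields:
            queryset = queryset.prefetch_related(*prefetch_fields)
        return queryset


class AlbumLoading(EagerLoadingMixin):
    # Forward foreign key: fetched with a JOIN.
    select_related_fields = ("artist",)


class MusicianLoading(EagerLoadingMixin):
    # Reverse foreign key: fetched with a separate query.
    prefetch_related_fields = ("album_set",)


class NothingDeclared(EagerLoadingMixin):
    # Empty tuples leave the queryset untouched.
    select_related_fields = ()
    prefetch_related_fields = ()


album_qs = AlbumLoading.setup_eager_loading(RecordingQuerySet())
musician_qs = MusicianLoading.setup_eager_loading(RecordingQuerySet())

if __name__ == "__main__":
    print(album_qs.selected, album_qs.prefetched)
    print(musician_qs.selected, musician_qs.prefetched)
```

Because the lookups live on the serializer class, the view never has to know which relations a given serializer touches; changing the serializer's fields and its eager-loading tuples in the same place keeps the two in step.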
Congratulations, you've just implemented one of the simplest ways to improve performance in Django.
If you have any further questions, feel free to reach out to me on Twitter @RealSaumilP.