Performance EngineeringDjangoSQLPerformanceBackend

Optimizing a Critical API from 4s to 0.5s

How I investigated Django ORM bottlenecks, refactor query, and cut API latency by 87% at a fintech platform.

Impact: Reduced average latency from ~4s to 0.5s

2026-05-01

The Up And Running List API

At a fintech payment gateway platform, I found a list API when I explpore and learn more about the system. Looking at the log, it was being used many time, and pay more attention on it, I found that it take around 4 to 5 seconds process. That might be a potental bottleneck that we can improve. I called that API multiple times, with different set of query params, and the result is the same, approximately 4 seconds.

Carefully look at the code for that API, it is 184 lines long View and the present of the "bad old" friend, the N+1 query issue.

Sample ORM models and query

For demonstration purpose, we will use the following Django ORMs. You can expand to view the full definition.

We will have a sample View to list books, for each book, we need to fetch its publisher, authors, genres and the number of orders for that book. Below is the serializer definition. As you can see, we implement sub-serializer for publisher, authors, genres, and a custom field for total_orders.

Serializer
text
from rest_framework import serializers
 
from book.models import Author, Book, Genre, Publisher
 
 
class AuthorSerializer(serializers.ModelSerializer):
    class Meta:
        model = Author
        fields = ["id", "name", "email"]
 
 
class GenreSerializer(serializers.ModelSerializer):
    class Meta:
        model = Genre
        fields = ["id", "name", "description"]
 
 
class PublisherSerializer(serializers.ModelSerializer):
    class Meta:
        model = Publisher
        fields = ["id", "name", "address", "website"]
 
 
class BookSerializer(serializers.ModelSerializer):
    authors = AuthorSerializer(many=True, read_only=True)
    genres = GenreSerializer(many=True, read_only=True)
    publisher = PublisherSerializer(read_only=True)
    total_orders = serializers.IntegerField(read_only=True)
 
    class Meta:
        model = Book
        fields = [
            "id",
            "title",
            "published_date",
            "isbn",
            "authors",
            "genres",
            "publisher",
            "total_orders",
        ]

For the view, we implement the _attach_total_orders method on purpose to show N+1 query issue in a clearer context.

View
text
from django.db.models import Count
from rest_framework.generics import ListAPIView
from rest_framework.pagination import PageNumberPagination
from rest_framework.response import Response
from book.models import Book
from book.serializers import BookSerializer
 
 
class BookPagination(PageNumberPagination):
    page_size = 5
    page_size_query_param = "page_size"
    max_page_size = 50
 
 
class BookListView(ListAPIView):
    queryset = Book.objects.all().order_by("id")
    serializer_class = BookSerializer
    pagination_class = BookPagination
 
    def _attach_total_orders(self, books):
        for book in books:
            book.total_orders = book.order_items.aggregate(total=Count("id"))["total"]
 
    def list(self, request, *args, **kwargs):
        queryset = self.filter_queryset(self.get_queryset())
        page = self.paginate_queryset(queryset)
        if page is not None:
            self._attach_total_orders(page)
            serializer = self.get_serializer(page, many=True)
            return self.get_paginated_response(serializer.data)
 
        self._attach_total_orders(queryset)
        serializer = self.get_serializer(queryset, many=True)
        return Response(serializer.data)
 

Finally, we make a request to the endpoint with the page size of 5, and get the following log:

Code
text
--- First part
(0.000) SELECT COUNT(*) AS "__count" FROM "book_book"; args=(); alias=default
(0.000) SELECT "book_book"."id", "book_book"."title", "book_book"."published_date", "book_book"."isbn", "book_book"."publisher_id" FROM "book_book" ORDER BY "book_book"."id" ASC LIMIT 5; args=(); alias=default
 
--- Second part
(0.000) SELECT COUNT("book_orderitem"."id") AS "total" FROM "book_orderitem" WHERE "book_orderitem"."book_id" = 1; args=(1,); alias=default
(0.000) SELECT COUNT("book_orderitem"."id") AS "total" FROM "book_orderitem" WHERE "book_orderitem"."book_id" = 2; args=(2,); alias=default
(0.000) SELECT COUNT("book_orderitem"."id") AS "total" FROM "book_orderitem" WHERE "book_orderitem"."book_id" = 3; args=(3,); alias=default
(0.000) SELECT COUNT("book_orderitem"."id") AS "total" FROM "book_orderitem" WHERE "book_orderitem"."book_id" = 4; args=(4,); alias=default
(0.000) SELECT COUNT("book_orderitem"."id") AS "total" FROM "book_orderitem" WHERE "book_orderitem"."book_id" = 5; args=(5,); alias=default
 
--- Third part
(0.001) SELECT "book_author"."id", "book_author"."name", "book_author"."email" FROM "book_author" INNER JOIN "book_book_authors" ON ("book_author"."id" = "book_book_authors"."author_id") WHERE "book_book_authors"."book_id" = 1; args=(1,); alias=default
(0.000) SELECT "book_genre"."id", "book_genre"."name", "book_genre"."description" FROM "book_genre" INNER JOIN "book_book_genres" ON ("book_genre"."id" = "book_book_genres"."genre_id") WHERE "book_book_genres"."book_id" = 1; args=(1,); alias=default
(0.000) SELECT "book_publisher"."id", "book_publisher"."name", "book_publisher"."address", "book_publisher"."website" FROM "book_publisher" WHERE "book_publisher"."id" = 2 LIMIT 21; args=(2,); alias=default
 
(0.000) SELECT "book_author"."id", "book_author"."name", "book_author"."email" FROM "book_author" INNER JOIN "book_book_authors" ON ("book_author"."id" = "book_book_authors"."author_id") WHERE "book_book_authors"."book_id" = 2; args=(2,); alias=default
(0.000) SELECT "book_genre"."id", "book_genre"."name", "book_genre"."description" FROM "book_genre" INNER JOIN "book_book_genres" ON ("book_genre"."id" = "book_book_genres"."genre_id") WHERE "book_book_genres"."book_id" = 2; args=(2,); alias=default
(0.000) SELECT "book_publisher"."id", "book_publisher"."name", "book_publisher"."address", "book_publisher"."website" FROM "book_publisher" WHERE "book_publisher"."id" = 3 LIMIT 21; args=(3,); alias=default
 
(0.000) SELECT "book_author"."id", "book_author"."name", "book_author"."email" FROM "book_author" INNER JOIN "book_book_authors" ON ("book_author"."id" = "book_book_authors"."author_id") WHERE "book_book_authors"."book_id" = 3; args=(3,); alias=default
(0.000) SELECT "book_genre"."id", "book_genre"."name", "book_genre"."description" FROM "book_genre" INNER JOIN "book_book_genres" ON ("book_genre"."id" = "book_book_genres"."genre_id") WHERE "book_book_genres"."book_id" = 3; args=(3,); alias=default
(0.000) SELECT "book_publisher"."id", "book_publisher"."name", "book_publisher"."address", "book_publisher"."website" FROM "book_publisher" WHERE "book_publisher"."id" = 4 LIMIT 21; args=(4,); alias=default
 
(0.000) SELECT "book_author"."id", "book_author"."name", "book_author"."email" FROM "book_author" INNER JOIN "book_book_authors" ON ("book_author"."id" = "book_book_authors"."author_id") WHERE "book_book_authors"."book_id" = 4; args=(4,); alias=default
(0.000) SELECT "book_genre"."id", "book_genre"."name", "book_genre"."description" FROM "book_genre" INNER JOIN "book_book_genres" ON ("book_genre"."id" = "book_book_genres"."genre_id") WHERE "book_book_genres"."book_id" = 4; args=(4,); alias=default
(0.000) SELECT "book_publisher"."id", "book_publisher"."name", "book_publisher"."address", "book_publisher"."website" FROM "book_publisher" WHERE "book_publisher"."id" = 5 LIMIT 21; args=(5,); alias=default
 
(0.000) SELECT "book_author"."id", "book_author"."name", "book_author"."email" FROM "book_author" INNER JOIN "book_book_authors" ON ("book_author"."id" = "book_book_authors"."author_id") WHERE "book_book_authors"."book_id" = 5; args=(5,); alias=default
(0.000) SELECT "book_genre"."id", "book_genre"."name", "book_genre"."description" FROM "book_genre" INNER JOIN "book_book_genres" ON ("book_genre"."id" = "book_book_genres"."genre_id") WHERE "book_book_genres"."book_id" = 5; args=(5,); alias=default
(0.000) SELECT "book_publisher"."id", "book_publisher"."name", "book_publisher"."address", "book_publisher"."website" FROM "book_publisher" WHERE "book_publisher"."id" = 1 LIMIT 21; args=(1,); alias=default

Looking at this log for queries that are being executed, we can see that:

  • The first part contains two queries, one for counting total number of books, and fetch all the actual books. These two queries is neccesary, and we can not reduce them.
  • The second part is 5 queries come from the _attach_total_orders method. As we are fetching 5 books, we loop through them, each time we make an extra query to database.
  • In the third part, we can see that we have a group of queries to author, genre and publisher table. This group appear 5 times. The reason behind this is that the serializer class will access to these relation to have data using in the nested-serializer class, and because we don't fetch these data in advance, each time we access a relation, an addtional query will be executed. Note that the number of queries will grow as the number of items fetched in the first part. If there are N items in the main query, there will be N queries correspondly.

Given this pattern, we can see that the number of queries will correspond with the number of records in the main query. The more instances fetched in the main query, the more auxilary queries will be made.

N+1 Query

N+1 Query issue happen when we have a main query to fetch a list of records, then for each record, we make subsequent query. Given 1 main query to fetch N records, we will have N other queries for each records. All of that sum up equal to N+1 queries have been executed.

There are many reasons behind this issue. In the context of Django, it mostly happen because of accessing relationship column, such as foreign key or many-to-many relationship. It can also happen on intention, such as when developer do aggregation for each record within the list. We will discuss about how to avoid this issue from Django perspective.

Tackle this problem

select_related

The first solution is for one-to-many (ForeignKey) and one-to-one (OneToOneField) relationship. In our example, it is the relationship between Book and Publisher. It is very common to access the publisher relation from book object without reload it, making addtional queries to be executed.

In our example where the ForeignKey has null=False, select_related will apply an inner join operation from the main model to the target model within the select_related clause. If the ForeignKey or OneToOneField has null=True, using select_related will generate a left outer join instead.

Here is the modification to apply select_related:

View
text
...
class BookListView(ListAPIView):
    queryset = Book.objects.select_related("publisher").all().order_by("id")
    serializer_class = BookSerializer
...

The new query for query all the book will change to:

Code
text
SELECT
    "book_book"."id",
    "book_book"."title",
    "book_book"."published_date",
    "book_book"."isbn",
    "book_book"."publisher_id",
    "book_publisher"."id",
    "book_publisher"."name",
    "book_publisher"."address",
    "book_publisher"."website"
FROM
    "book_book"
    INNER JOIN "book_publisher" ON (
        "book_book"."publisher_id" = "book_publisher"."id"
    )
ORDER BY
    "book_book"."id" ASC
LIMIT
    5;

prefetch_related

The second approach is prefetch_related, which is use for many-to-many relationships. In this approach, a new query will be generated for each relation within the prefetch_related function, filter by all the records that fetched from the main list query.

We will use prefetch_related for genres and authors in our example. By apply this function, we can fix the number of query to 2, one for genres and another for authors. It will not grow correspond with the number of record in the main list anymore.

Here is how we apply prefetch_related:

View
text
...
class BookListView(ListAPIView):
    queryset = (
        Book.objects.select_related("publisher")
        .prefetch_related("authors", "genres")
        .all()
        .order_by("id")
    )
    serializer_class = BookSerializer
...

Here are our two new queries for fetching genres and authors in advance:

Code
text
SELECT
    ("book_book_authors"."book_id") AS "_prefetch_related_val_book_id",
    "book_author"."id",
    "book_author"."name",
    "book_author"."email"
FROM
    "book_author"
    INNER JOIN "book_book_authors" ON (
        "book_author"."id" = "book_book_authors"."author_id"
    )
WHERE
    "book_book_authors"."book_id" IN (1, 2, 3, 4, 5);
 
SELECT
    ("book_book_genres"."book_id") AS "_prefetch_related_val_book_id",
    "book_genre"."id",
    "book_genre"."name",
    "book_genre"."description"
FROM
    "book_genre"
    INNER JOIN "book_book_genres" ON ("book_genre"."id" = "book_book_genres"."genre_id")
WHERE
    "book_book_genres"."book_id" IN (1, 2, 3, 4, 5);

Subquery and annotations

The final approach will resolve the issue within _attach_total_orders method. In this case, annotations can be used to aggregate data from other relation into the main query. In our example, it is the number of orders for that book. Here is the revise version:

Code
text
...
class BookListView(ListAPIView):
    queryset = (
        Book.objects.select_related("publisher")
        .prefetch_related("authors", "genres")
        .annotate(total_orders=Count("order_items"))
        .all()
        .order_by("id")
    )
    serializer_class = BookSerializer
...

And the generated query will be as follow:

Code
text
SELECT
    "book_book"."id",
    "book_book"."title",
    "book_book"."published_date",
    "book_book"."isbn",
    "book_book"."publisher_id",
    COUNT("book_orderitem"."id") AS "total_orders",
    "book_publisher"."id",
    "book_publisher"."name",
    "book_publisher"."address",
    "book_publisher"."website"
FROM
    "book_book"
    LEFT OUTER JOIN "book_orderitem" ON ("book_book"."id" = "book_orderitem"."book_id")
    INNER JOIN "book_publisher" ON (
        "book_book"."publisher_id" = "book_publisher"."id"
    )
GROUP BY
    "book_book"."id",
    "book_book"."title",
    "book_book"."published_date",
    "book_book"."isbn",
    "book_book"."publisher_id",
    "book_publisher"."id",
    "book_publisher"."name",
    "book_publisher"."address",
    "book_publisher"."website"
ORDER BY
    "book_book"."id" ASC
LIMIT
    5;

As you can see, the above annotation fix the N+1 issue for the aggregation logic, but let extend it further. The new requirements come:

  1. The total orders within our example will only count number of orders from one year ago until now.
  2. We need to include the latest purchase date for each book, also in the one year window.

With this new requirements, we need a way to include logic within the annotation clause. As you can see from the above example, Count and its neighor such as Sum, Avg is good for aggregation. But for now, we need a way to get data for related record, this is where Subquery shine. We will define a query to fetch the latest OrderItem for each book, retrieve the first created_at value as the latest purchase date, and return null if there is no order at all.

View
text
...
class BookListView(ListAPIView):
    serializer_class = BookSerializer
    pagination_class = BookPagination
 
    def get_queryset(self):
        one_year_ago = timezone.now() - timedelta(days=365)
        queryset = (
            Book.objects.select_related("publisher")
            .prefetch_related("authors", "genres")
            .annotate(
                total_orders=Subquery(
                    OrderItem.objects.filter(book=OuterRef("pk"))
                    .values("book")
                    .filter(created_at__gte=one_year_ago)
                    .annotate(count=Count("id"))
                    .values("count")
                )
            )  # noqa: F821
            .all()
            .order_by("id")
        )
        return queryset
...

And the generated query will be as follow:

sql
sql
 

annotation equips us with a lot of tools for query and retrieve extra data. Depend on the requirements, we can use Count, Sum, Avg, ... for aggregation, Subquery for retrieving data from related record, or even related object using Prefetch class.

Unit test for query count

The solutions dicussed so far will let us

An intesting finding about composite index

When troubleshoot the issue further, I found an interesting finding about composite index and its usage. As in our example, books table has a composite index of (id, isbn, publisher_id).

Assume that we will have a query against every subset of this index in that order, the index will be in used. For example, a query with where clause using (id + isbn + publisher_id), or (isbn + publisher_id) or

Final thought