Many small things can slow your backend server; And while we would love to cover them all, that would be impossible in one single article.
So, today we will focus on the most usual suspect: the n+1 queries problem.
Let’s start with the beginning, what’s an n+1 query and why is it an actual problem?
The n+1 issue is a pretty common mistake that can slow down your queries pretty hard. The problem happens when your code executes N extra statements to fetch data that was available and could’ve been fetched through the first statement.
Obviously, this means that the larger the N, the slower your machine will get.
To give you an example, when having a one-to-many relationship and we want to get all parent entries with the child field(s), ORMs and poorly designed queries will create one query to retrieve all the parent entries, and then for each entry a different query to retrieve child data.
This issue will generate very many queries, which will increase latency time and slow down performance a lot. Besides that, they are hard to point to, as each individual query will run fast enough and won’t be detected by benchmarking tools.
Without considering a specific technology, ORM, or platform, the general idea to avoid this issue would be to use
JOINS will force the lazy-loaded object (child) to be fetched in the initial query, instead of needing a new query for each entry for that.
Many frameworks already have libraries that can either detect n+1 queries or override ORM relationships to be lazy.
For the next few paragraphs we are going to use a very simple example in Rails, that is prone to n+1 queries, but can also be easily handled. We created a
Parent class and a
Child class, the association between those 2 being pretty obvious: a one-to-many relationship.
class Parent < ApplicationRecord
class Child < ApplicationRecord
Just as mentioned above,
JOINS is a good fix for avoiding n+1 queries.
Fortunately, Rails and ActiveRecord have a great method for this:
includes. The core functionality of this method is to eager-load the mentioned associations. So in our Rails serializers and controllers, it would be natural for us to do something like:
class ParentSerializer < ActiveModel::Serializer
attributes :id, :string_field
render json: Parent.all
Our serializer’s goal is to also render the list of
Parent has. When doing this, because of how ActiveRecord works, a new query will be executed for each
Parent to serialize the
For this example, our database was seeded with 21
Parent instances, and as we can see this logic will create a new query for each one of them. Imagine having 10k entries in the database
The simplest fix of them all will be to add just a few more characters in our method:
# old: render json: Parent.all
render json: Parent.includes(:children).all
This code will now have the log output:
Now, this is an obvious improvement. Just when watching the queries executed in the logs, we can see that we have only 1 instead of N (number of
Parent entries) and the initial one to get all Parents.
How does includes work?
The includes method works in 2 ways, depending on the query that will be executed.
Most of the time it will use preload, but in certain cases, it can also use eager_load. ActiveRecord will automatically use preload unless the associations are also used in another clause (like where).
The nice thing about includes is that you can use it to preload multiple relations, like
Parent.includes(:child1, :child2, :child3). It can also load nested associations, using
Parent.includes(child: :grandchild), and even
Parent.includes(child: [grandchild: :great_grandchild])
There are 3 main methods that Rails and ActiveRecord provide for us when we are handling references: * includes * references * joins
We already mentioned
includes, the wonder method that helps us avoid n+1 queries by preloading the associated models. It is mostly used when wanting to retrieve these models, like in the example above, where the parent serializer also includes the children.
The references method is used together with includes, as it triggers the tables to be JOINed rather than loaded separately. The joins method is used when the associated table is needed in a where or order clause, but they don’t need to be preloaded as they won’t be rendered.
Catching n+1 queries
I get it, we aren’t thinking about n+1 queries with each line of code written. And we shouldn’t.. we have third-party libraries for that.
Obviously, we can also do an eye test: just like in the screenshots above, debugging a slow controller method to see how many queries are done might be a good start, they will usually be very obvious.
For the rest of us that want an easy way to catch n+1 queries, many libraries can help with that.
The one that I use the most is Bullet, which is also the most popular Rails option.
The setup is very easy, as the basic setup only requires these few lines:
Bullet.enable = true
Bullet.bullet_logger = true
Bullet.raise = true # raise an error & fail test if an n+1 query occurs end
We want to add this only to the test environment, so we can find them when running the tests locally and in our CI workflow.
Obviously, Bullet does allow many other options to be added, even whitelisting associations to avoid tests failing.
Does it matter? (benchmarks)
There are many benchmarking resources on n+1 queries already available.
For this article I will just try to locally benchmark a few queries, so we can see how much this issue can slow even when talking about just a few entries.
Examples & benchmarking
For these benchmarks, we will use the same example as above: a Parent and a Child models.
The tests that I plan to do are having 20, 100 and 1000 Parent entries and compare only the implementation that doesn’t use any of the methods mentioned above (we will see the n+1 queries issue) and the one that uses includes.
For each of the examples, I used FactoryBot and Faker like this:
FactoryBot.create_list(:child, number * 10, parent_id: Parent.all.pluck(:id).sample)
This way, we will have a random distribution of children to parents and we will be able to see if any n+1 queries are executed.
In order to make sure that n+1 queries reflect a real situation and we will be able to see performance issues, the testing includes 10 times more children than parents.
20 Parents, 200 Children - ~4 times slower
includes : 540ms
includes : 136ms
100 Parents, 1000 Children - ~2 times slower
includes : 716ms
includes : 339ms 1000 Parents, 10000 Children - ~2 times slower Without includes : 3.38s With includes : 1.85s