Less Is More: A PostgreSQL Query Optimized from Minutes to Milliseconds

Dongqing Hu
Flexport Engineering
10 min read · Mar 21, 2022


Introduction

Using more indexes does not always improve database performance; the “Less Is More” principle applies to database performance too. In this post, we’ll discuss our experience optimizing a PostgreSQL query to make it thousands of times faster, taking it from a few minutes to a few milliseconds, by using fewer indexes.

Problem

The Flexport platform lets Flexport customers, operators, and other users send messages within our applications. The messages table, which acts as basic content storage for this feature, has millions of rows in a PostgreSQL database on AWS Aurora.

The main query on this table is usually very fast, but recently it started timing out intermittently. The slow query was not only hurting the user experience of one particular page; as a side effect, it was also impacting other database operations and other parts of the application.

The query looks like this:
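Roughly, it has the following shape (this is an illustrative sketch only: the topic values, the LIMIT, and the exact set of conditions are assumptions, not the real production query):

```sql
-- Illustrative reconstruction; literal values and conditions are assumed.
SELECT *
FROM messages
WHERE context @> '{"involved_parties":[{"id":1,"type":1}]}'::jsonb
  AND (jsonb_extract_path_text(context, 'topic') = 'topic_a'
       OR jsonb_extract_path_text(context, 'topic') = 'topic_b')
LIMIT 100;
```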

The query uses some PostgreSQL-specific syntax. So, what does this query mean?

The @> symbol is PostgreSQL’s containment operator. The official documentation provides a great explanation, but in short it tests if its left-hand side “contains” its right-hand side. The operator can be applied to structural types like array, range, or JSONB (a binary representation of JSON in PostgreSQL).
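A few minimal examples of the containment operator (each of these evaluates to true in PostgreSQL):

```sql
SELECT ARRAY[1,2,3] @> ARRAY[2];                              -- array contains element
SELECT '{"a":1,"b":2}'::jsonb @> '{"a":1}'::jsonb;            -- object contains key/value pair
SELECT '[{"id":1},{"id":2}]'::jsonb @> '[{"id":1}]'::jsonb;   -- array contains matching object
SELECT int4range(1,10) @> 5;                                  -- range contains value
```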

The messages table has a number-typed “id” column, a JSONB-typed “context” column and some other columns. The JSONB object in the “context” column contains properties to describe a message, e.g., the “topic” property describes the topic of the message and the “involved_parties” property describes the list of legal entities allowed to view the message.

There are two indexes related to the “context” column:

  1. A GIN index on the “context” column.
  2. A BTREE expression index on jsonb_extract_path_text(context, 'topic').
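In DDL terms, the two indexes look roughly like this (the index names here are assumed):

```sql
-- 1. GIN index over the whole JSONB document (default jsonb_ops operator class)
CREATE INDEX index_messages_on_context
  ON messages USING GIN (context);

-- 2. BTREE expression index on the extracted "topic" string
CREATE INDEX index_messages_on_context_topic
  ON messages (jsonb_extract_path_text(context, 'topic'));
```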

GIN (Generalized Inverted Index) is an indexing engine provided by PostgreSQL for composite values, generally used for data structures like arrays, JSON, or full-text documents, where you need to search for values within the structure. A good rule of thumb is that GIN is intended for data that is subdivisible, when you want to retrieve individual component values. BTREE is the default indexing engine provided by PostgreSQL, capable of handling equality and range queries on sortable data.

An expression index is a powerful PostgreSQL feature that indexes the result of an expression instead of a raw column. Usually a JSONB column can only use an indexing engine like GIN, because BTREE supports only scalar types. But since the expression jsonb_extract_path_text(context, 'topic') returns a string, it can use a BTREE index.

A GIN index can have very different contents depending on the data type and operator class in use, unlike the uniform representation of data in a default BTREE index. Because of this, and depending on the selectivity of the requested data, a GIN index can serve specifically formed queries better, whereas a BTREE index works for generalized equality and range retrieval.

Initial investigation

A query usually performs table reads after an index scan (the exception is when the index scan covers all the data to retrieve, so no table read is needed). For best performance, indexes should be selective enough to filter the data well before table reads, so that fewer (or no) table reads are needed. The condition context @> '{"involved_parties":[{"id":1,"type":1}]}'::jsonb can use the GIN index on the “context” column, but it is not a great fit, because the value {"id":1,"type":1} is present in most records. As a result, the GIN index has very poor selectivity for this condition. The indexes for the other conditions in the query already provide good selectivity, so we should never need the GIN index for this condition. The next step was to check whether the GIN index was actually being used in this query.

We can get query plans by executing “EXPLAIN ANALYZE {the query statement}” in the SQL console. This executes the query and returns the chosen query plan annotated with actual time costs.
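For example, against the messages table it looks like this (plain EXPLAIN, without ANALYZE, shows the estimated plan without executing the query, which is useful when the query is too slow to run):

```sql
EXPLAIN ANALYZE
SELECT *
FROM messages
WHERE context @> '{"involved_parties":[{"id":1,"type":1}]}'::jsonb
LIMIT 100;
```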

When we tried this, we only got a fast query plan that did not use the suspected GIN index. Our investigation had stalled.

Technical deep dive

First, let’s learn to read the fast query plan with a visual representation.

Fast query plan

Fast query plan visual representation

A query plan reads like a call stack: the query is expanded top-down into a tree, and the tree is evaluated bottom-up. For example, the two leaf nodes (Bitmap Index Scan) are evaluated first, then their results are merged by their parent node (BitmapOr) and returned to their grandparent node (Bitmap Heap Scan), and so on. Note that index 1 and index 2 are the same BTREE index on “topic”, scanned with different conditions.

A PostgreSQL query can scan multiple indexes and merge the results, with the slowest index determining the total execution time. This is usually more efficient than scanning only one index because more indexes usually mean better reduction of the filtered data set and fewer related disk reads (the data set filtered by index scan must be loaded from the table space, which may require disk reads if not in the in-memory cache). Index scan results are merged using a Bitmap Heap Scan, so you can see “Bitmap Heap Scan” in the query plan.

Further investigation

Based on the available information, we formed a theory: data changes and statistics updates occasionally make the query planner choose a slow query plan that uses the GIN index.

Regardless of the theory, the best course of action was to improve observability. Did we lack some data? Okay, collect it! We added code to time query executions and capture slow query plans only when the query was slow. This proved the cause was unstable query planning: the query planner usually chose a fast query plan, but occasionally chose a slower plan.

Slow query plan

This query plan is from EXPLAIN because EXPLAIN ANALYZE timed out.

Slow query plan visual representation

As the diagram shows, the slow plan is more complex than the fast one. It has additional BitmapAnd and Bitmap Index Scan nodes to scan index 3 (the GIN index on the “context” column) and merge the results. Since index 3 is inefficient for this condition, it degrades the total performance.

The query planner works well when cost estimation is accurate, but the cost estimation for the GIN index on JSONB was inaccurate. As observed, the planner assumed a selectivity of 0.001 (a hard-coded value), meaning any such query is expected to select 0.1% of all rows; in our case, up to 90% of the rows could match. This false assumption makes the query planner underestimate the cost of the suboptimal plan. The database did keep some statistics for JSONB-typed columns, but they seemed to be of no use here.

Solution

We couldn’t remove the GIN index because other queries rely on it. But there was a simple way to stop this query from using the GIN index. Generally, if the left-hand side of an operator is not a simple column name but an expression, no index can be used unless there is an expression index for that exact expression. For example, if a table has an indexed column “id”, the condition id = ? will use the index, but the condition id + 0 = ? will not. In our case, the “@>” operator’s left-hand side is the “context” column name, so we can change it to the path access expression context->'involved_parties' (and update the parameter value accordingly).

The original query condition is

context @> '{"involved_parties":[{"id":1,"type":1}]}'::jsonb

The changed query condition is

context->'involved_parties' @> '[{"id":1,"type":1}]'::jsonb

Experiment

We experimented in a simplified case to filter by the problematic condition. The experiment reproduced the difference and made us confident to ship the optimization.

Compare these 2 simple queries:

1. (Statement 1, unoptimized)

SELECT * FROM messages WHERE context @> '{"involved_parties":[{"id":1,"type":1}]}'::jsonb LIMIT 100

2. (Statement 2, optimized)

SELECT * FROM messages WHERE context->'involved_parties' @> '[{"id":1,"type":1}]'::jsonb LIMIT 100

Because Statement 2 performs a sequential table scan with no index scan, we use LIMIT 100 to ensure it stops scanning the table as soon as the first 100 matching rows are found. In Statement 1, a GIN index scan is always a bitmap index scan, which is non-sequential and cannot take advantage of LIMIT 100. Note that a small limit value like 10 falls under the threshold at which Statement 1 gives up the index scan, so a larger limit value like 100 is required (the threshold depends on the cost estimation and seemed to be around 20 in our tests).

The query plans were:

  1. (Statement 1, unoptimized)

2. (Statement 2, optimized)

In this case, the optimized query is 50K times faster.

Real production improvement

The following diagrams show great performance improvements in the production environment.

There are similar API request numbers for each day (5~10k requests per 6-hour time window).
There is a significant reduction of errors as of Dec 19.
There is a significant reduction of latency as of Dec 19.

Some Useful Principles

We summarize some principles for database performance (especially for PostgreSQL).

Principle 1: Less Is More

Manage indexes

More indexes do not mean better performance. In fact, each additional index degrades write operation performance. Queries can still be slow if the query planner picks an inefficient index instead of an efficient index.

Do not pile up indexes: remove them when possible, and monitor the performance impact of each index change.

Prefer a simple database design

An RDBMS (Relational Database Management System) prefers a normalized form. JSON or JSONB is a NoSQL-style denormalized form of data.

Normalization or denormalization, which is better? From a business perspective, it depends. From the RDBMS perspective, normalization is usually simpler and better, while denormalization can be supplementary in some cases.

Tip 1: Consider designing your data model from a Domain-Driven-Design perspective.

  • Entities are usually modeled as tables while value objects are usually embedded in entities (sometimes large value objects can be modeled as tables for performance).
  • The target entities of associations should not be embedded if they are themselves aggregate roots. Since a source aggregate root is self-contained, its target entities can be embedded if they are not aggregate roots.

Tip 2: Nullable columns are efficient in modern RDBMSs; do not hesitate to use them when they are the most straightforward choice, and do not use JSONB as an “optimization” to avoid nullable columns.

Principle 2: Make statistics accurate

PostgreSQL maintains statistics for each table, including (but not limited to) tuple counts, page counts, most common values, histogram bounds, and distinct-value counts. Some statistics are sampled and not very accurate. The query planner chooses a query plan by generating multiple possible plans, estimating their costs using the statistics and some rules, and picking the most cost-effective one. The quality of a query plan therefore depends on the accuracy of the statistics. Accurate data leads to good execution (also a good principle in data science and data-driven businesses).
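These statistics can be refreshed and inspected directly; for example (table and column names taken from this post):

```sql
-- Recompute the sampled statistics for one table
ANALYZE messages;

-- Inspect what the planner believes about a column
SELECT most_common_vals, most_common_freqs, n_distinct
FROM pg_stats
WHERE tablename = 'messages' AND attname = 'context';
```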

As mentioned, the cost estimation of GIN on JSONB is inaccurate. The cost estimation of BTREE on scalar types is much more accurate. JSONB is not suitable for some cases. As a workaround, you can create an expression index using BTREE on a scalar type entry of JSONB. This article from ScaleGrid also provides a good introduction on how to use JSONB and GIN efficiently.

Tip: PostgreSQL features like expression indexes and partial indexes are powerful and cost-effective. Analyze your data and use them if possible.
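For instance, a partial index covers only the rows matching its predicate, which keeps it small and cheap to maintain (the “archived” property below is a hypothetical example, not from the real schema):

```sql
-- Index only non-archived messages; queries that include the same
-- predicate can use this much smaller index.
CREATE INDEX index_messages_active
  ON messages (id)
  WHERE jsonb_extract_path_text(context, 'archived') IS NULL;
```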

Principle 3: Improve observability

No matter whether we have a theory about the potential root cause or not, the best option is often to improve observability. Query logs can prove that the slow requests are caused by slow queries, not by slow application code or connection waiting. Automatic query EXPLAINing can capture real query plans used by slow queries.
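PostgreSQL’s auto_explain module provides this kind of automatic query EXPLAINing; a minimal session-level sketch (the 500 ms threshold is an arbitrary example):

```sql
-- Load the module for this session (or preload it server-wide via
-- session_preload_libraries in postgresql.conf)
LOAD 'auto_explain';

-- Log the plan of any statement slower than 500 ms
SET auto_explain.log_min_duration = '500ms';

-- Include actual row counts and timings in the logged plan
SET auto_explain.log_analyze = true;
```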

APM (Application Performance Management) like Datadog is also an important tool. It gives us many insights:

  • Is the issue caused by lack of resources? No, lack of resources should impact any SQL CRUD statement, but we only observed slow SELECTs.
  • Does the issue happen at the same time every day? No, it can happen anytime.
  • Is each occurrence of the issue independent? No, occurrences tend to accumulate in small time windows. Something must have happened at that time to cause the issue.

Acknowledgement

Thanks to reviewers Vinod Kumar, Dylan Irlbeck, David Goussev, BJ Terry, Kyle Kinsey and other Flexport colleagues.

References

Understanding Postgres GIN Indexes: The Good and the Bad

Postgres Planner not using GIN index Occasionally

Gitlab once faced a GIN related issue

Understanding Postgres query planner behaviour on GIN index

Statistics used by the query planner

When To Avoid JSONB In A PostgreSQL Schema

Using JSONB in PostgreSQL: How to Effectively Store & Index JSON Data in PostgreSQL
