Blog: Co se nevešlo na Twitter - Tag Performance

Optimizing PostgreSQL queries with Multicolumn and Partial Indexes

I have an application that does asynchronous data processing, and at its core are simulated queues stored in a PostgreSQL table. Each row in that table represents a task and also holds the result of that task. You can think of the table as a sort of multi-tenant table, where every row belongs to a data_source and a queue. There are multiple DataSources, and each can have multiple queues. Some of these combinations contain very few rows, and some contain several million.

Because of this uneven distribution, some of the queues can be queried rather quickly, while the largest queue has slowly grown to the point where the job iterating over it took around 9 hours.
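To make the table layout described above more concrete, here is a minimal sketch of what such a queue table and the two kinds of index named in the title might look like. It is an illustration, not the schema from my application: the table, column, and index names are all hypothetical, the row counts are tiny, and SQLite (through Python's built-in sqlite3 module) stands in for PostgreSQL so that the example runs without a database server. The two CREATE INDEX statements use syntax that PostgreSQL accepts as well.

```python
import sqlite3

# SQLite stands in for PostgreSQL so the sketch runs anywhere.
# All table, column, and index names here are hypothetical.
SCHEMA = """
CREATE TABLE tasks (
    id          INTEGER PRIMARY KEY,
    data_source TEXT NOT NULL,
    queue       TEXT NOT NULL,
    payload     TEXT NOT NULL,
    result      TEXT            -- NULL until the task has been processed
);
-- Multicolumn index: matches lookups by (data_source, queue).
CREATE INDEX tasks_source_queue_idx ON tasks (data_source, queue);
-- Partial index: indexes only the rows still waiting for a result.
CREATE INDEX tasks_pending_idx ON tasks (data_source, queue)
    WHERE result IS NULL;
"""


def build_demo_db():
    """Create an in-memory database with a deliberately uneven row distribution."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # One tiny queue with three unprocessed tasks.
    rows = [("source_a", "small", f"task {i}", None) for i in range(3)]
    # One much larger queue where every tenth task is still unprocessed.
    rows += [
        ("source_b", "large", f"task {i}", "done" if i % 10 else None)
        for i in range(1000)
    ]
    conn.executemany(
        "INSERT INTO tasks (data_source, queue, payload, result) "
        "VALUES (?, ?, ?, ?)",
        rows,
    )
    conn.commit()
    return conn


def rows_per_queue(conn):
    """Return (data_source, queue, row_count) tuples, one per combination."""
    cur = conn.execute(
        "SELECT data_source, queue, COUNT(*) FROM tasks "
        "GROUP BY data_source, queue ORDER BY data_source, queue"
    )
    return cur.fetchall()


def pending_count(conn, data_source, queue):
    """Count tasks in one queue that do not have a result yet."""
    cur = conn.execute(
        "SELECT COUNT(*) FROM tasks "
        "WHERE data_source = ? AND queue = ? AND result IS NULL",
        (data_source, queue),
    )
    return cur.fetchone()[0]
```

Calling rows_per_queue(build_demo_db()) lists each data_source and queue combination with its row count, which is a quick way to see the kind of skew described above. Whether a given query actually uses either index depends on the planner and on the real data, so treat the index definitions as a starting point to check with EXPLAIN rather than as a guaranteed fix.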

Continue reading ...