In the application that I work on, we have a materialized view. It was created from several joined tables and is used to speed up searching, so that every query doesn't have to join seven or eight tables. At least it should have made searching faster, but in practice it didn't, and I will describe why.
Queries to that view were really slow, but only on the production server; on staging everything was fine. The view definition was exactly the same on both servers, but queries were ~50 times slower on production.
The first step was to check execution plans for any query on both servers.
EXPLAIN ANALYZE SELECT COUNT(*) FROM table_name;
On staging, the execution plan involved a parallel scan; on production, an index scan.
When I discovered that, the first thing that came to my mind was a configuration difference that disabled parallel queries on production. I checked that, and it turned out to be false. My next attempts were to check various Postgres views that keep statistics about tables. After looking through many numbers that didn't help, I finally discovered that our table (a materialized view, actually, but in this context it makes no difference) took 7 GB of disk space on the production server and only 300 MB on staging.
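To rule out a configuration difference, the parallel-query settings can be compared directly on both servers. These are standard PostgreSQL settings, not specific to our setup:

```sql
-- Parallel query is effectively disabled when this is 0
SHOW max_parallel_workers_per_gather;

-- Or inspect all parallel-related settings at once
SELECT name, setting
FROM pg_settings
WHERE name LIKE '%parallel%';
```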
SELECT pg_size_pretty( pg_total_relation_size('table_name') );
I expected the row counts to be totally different, but unexpectedly they were ~320k on production and ~310k on staging.
After a really long investigation, I found the reason those sizes were so different: the table on production contained an enormous number of dead rows.
SELECT n_dead_tup FROM pg_stat_all_tables WHERE relname = 'table_name';
At this point I was almost sure that was the cause of our whole problem (or at least I hoped so). I started researching dead rows: what they are and why they appear. The major defect of our approach is that after every single update to a record in one of the joined tables, the whole materialized view is refreshed. Refreshed, i.e. the view's entire contents are discarded and its defining query is run again. PostgreSQL has two types of refresh:
- non-concurrent
This refresh type does not produce any dead rows, but for the duration of the refresh the view is locked and no data can be read from it. That was unacceptable for my project.
- concurrent
It allows reading data during the process. That's possible because the old data is kept while the new version is built: during the refresh, all SELECT queries still see the old data, and once it finishes, all queries see the newly built view, while the replaced rows remain as dead rows. That is how the view gets bloated with tons of unnecessary data.
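For illustration, a minimal sketch of both refresh variants (the view name `search_index` is hypothetical; note that the CONCURRENTLY variant requires a unique index on the view):

```sql
-- Blocking refresh: readers wait until it finishes,
-- but no dead rows are produced
REFRESH MATERIALIZED VIEW search_index;

-- Non-blocking refresh: readers keep seeing the old data,
-- but it requires a unique index and leaves dead rows behind
CREATE UNIQUE INDEX IF NOT EXISTS search_index_id_idx
    ON search_index (id);
REFRESH MATERIALIZED VIEW CONCURRENTLY search_index;
```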
And here comes the VACUUM mechanism, which removes dead rows from a table or materialized view.
VACUUM table_name
This command removes all dead rows from the given table, but it has to be run 'manually', or by some application code. There is also a mechanism called autovacuum: PostgreSQL has a number of workers (set in the configuration) that continuously scan the database and run VACUUM on tables that need it. So what do we know now? We have many dead rows in the materialized view, and we also have a mechanism that should clean them up. Why isn't it doing so? To check that, I added a query that logs the last autovacuum and manual vacuum times along with the dead row count.
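The thresholds that decide when autovacuum kicks in can also be inspected; these are standard settings, and the per-table override shown is just an illustration with a hypothetical table name:

```sql
-- Global autovacuum settings (workers, thresholds, scale factors)
SELECT name, setting
FROM pg_settings
WHERE name LIKE 'autovacuum%';

-- Example: make autovacuum more aggressive for one relation
ALTER TABLE some_table
  SET (autovacuum_vacuum_scale_factor = 0.01);
```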
SELECT n_dead_tup, last_autovacuum, last_vacuum FROM pg_stat_all_tables WHERE relname = 'table_name';
After a few days, I checked those logs. The first thing I noticed was that the autovacuum process was working, but it was triggered only in the evenings. All dead rows were cleaned, and the next day, from around 7 a.m., they appeared again and their count kept growing. I checked our search in the morning and it was indeed as fast as it should be, but over the course of the day it got much slower. After more long research into why, I found that autovacuum needs to acquire a SHARE UPDATE EXCLUSIVE lock, so it cannot run while the table is held by a conflicting lock. What locks the table that way? REFRESH MATERIALIZED VIEW CONCURRENTLY, which takes an EXCLUSIVE lock. When is that view refreshed in our application? More logs were added, and the answer: the view is refreshed almost all the time (during a workday, from morning to evening). As I mentioned, a refresh is triggered by every data update on any of the tables the problematic view is built from. And here is our final answer: dead rows aren't cleaned because autovacuum cannot run during a refresh, and refreshes run continuously from morning to evening.
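One way to see which sessions hold locks on the view while autovacuum is blocked is to join the standard pg_locks and pg_stat_activity catalogs ('table_name' stands in for the view, as in the queries above):

```sql
-- Who holds (or waits for) locks on the view right now?
SELECT l.mode, l.granted, a.pid, a.query
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE l.relation = 'table_name'::regclass;
```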
Possible solutions:
- Architectural changes that would prevent the application from refreshing the whole materialized view on every data update. It is definitely possible to update only the rows that actually changed, without dropping and rebuilding the whole table from scratch.
This is definitely the best solution, and I would have chosen it if I had had much more time to spend on the problem, but it would take too long to refactor every place in the application that can update one of the included tables.
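One possible shape of that refactoring, purely a sketch and not what we implemented: replace the materialized view with a regular table and keep it in sync with triggers on the source tables, touching only the affected rows. All names here are hypothetical, and the SELECT is a stand-in for the view's real defining query restricted to the changed row:

```sql
-- Re-derive only the row that changed (PostgreSQL 11+ syntax)
CREATE FUNCTION sync_search_row() RETURNS trigger AS $$
BEGIN
  DELETE FROM search_index WHERE id = NEW.id;
  INSERT INTO search_index (id, data)
  SELECT p.id, p.data          -- stand-in for the real multi-table join
  FROM products p
  WHERE p.id = NEW.id;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER products_sync
AFTER INSERT OR UPDATE ON products
FOR EACH ROW EXECUTE FUNCTION sync_search_row();
```

Row-level churn like this still produces some dead rows, but only for the rows that changed, so autovacuum can easily keep up.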
- The second possible fix is to add breaks between refreshes so that the autovacuum process can fire on our view. Theoretically it is a fine idea: it would reduce database load and should be quite fast to implement, but I think there is too much room for unforeseen consequences, so I finally decided to give it up. Even though adding breaks would not take much time to implement, testing it carefully would, and as I noted above, there was no more time after that really long investigation.
- The third and final solution I considered was the simplest one. In the job used to refresh the materialized view, I added code that 'manually' vacuums the view of dead rows. During the whole investigation I had run VACUUM manually many times, so I knew it worked. Going this way guarantees that dead rows cannot accumulate in the view, because they are vacuumed as soon as the refresh that creates them ends.
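The fix itself boils down to something like this in the refresh job (the view name is hypothetical; note that VACUUM cannot run inside a transaction block, so the two statements must be issued separately rather than wrapped in one transaction):

```sql
REFRESH MATERIALIZED VIEW CONCURRENTLY search_index;
-- Immediately reclaim the dead rows the refresh just produced
VACUUM search_index;
```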
Weighing up the pros and cons, I finally chose the third solution. Time was a problem, and it was the only option that could be tested immediately. It worked, but I still think refactoring the application so it doesn't refresh the whole view would be the best option.