PostgreSQL triggers and isolation levels

栏目: IT技术 · 发布时间: 5年前

内容简介：In this article, we are going to see how the PostgreSQL isolation levels guarantee read and write consistency when executing database triggers.While relational database systems provide strong data integrity guarantees, it’s very important to understand how

Last modified:

Imagine having a tool that can automatically detect if you are using JPA and Hibernate properly. Hypersistence Optimizer is that tool!

Introduction

In this article, we are going to see how the PostgreSQL isolation levels guarantee read and write consistency when executing database triggers.

While relational database systems provide strong data integrity guarantees, it’s very important to understand how the underlying transactional engine works in order to choose the right design for your data access layer.

Database transactions

In a relational database system, transactions haveACID properties, meaning they are Atomic, Consistent, Isolated, and Durable.

Transactions allow the database to move from one consistent state to another. So, all statements executed during the scope of a given transaction must pass all constraint checks (e.g., NULL, Foreign Key, Unique Key, custom CHECK constraints) in order for the transaction to be successfully committed.

Because all transaction changes happen against the latest state of the underlying data (tables and indexes), the database system must employ a mechanism to ensure that uncommitted changes are not visible to other concurrent transactions.

2PL and MVCC

There are two concurrency control mechanisms employed by relational database systems:

The 2PL mechanism was the first one to be employed, and SQL Server still uses it by default (although it can also use MVCC). The 2PL mechanism is very easy to understand. Reads acquire share locks while writes acquire exclusive locks. Locks are only released at the end of the database transactions, be it a commit or a rollback. So, 2PL is a pessimistic locking concurrency control mechanism since it prevents conflicts by locking database records.

Nowadays, Oracle, PostgreSQL, and the MySQL InnoDB engine use the MVCC mechanism since it provides better performance compared to the 2PL mechanism. When using MVCC, share locks are no longer acquired when reading data, and a record that gets modified does not prevent other transactions from reading its previous state. So, instead of locking tuples, MVCC allows storing multiple versions of a given record.

Database model

In this article, we are going to reuse the same entity-relationship model we used in this article about PostgreSQL custom consistency rules .

PostgreSQL triggers and isolation levels

The department table has a single record:

| id | budget | name |
|----|--------|------|
| 1  | 100000 | IT   |

And, there are three employee rows currently working in the IT department:

| id | name  | salary | department_id |
|----|-------|--------|---------------|
| 1  | Alice | 40000  | 1             |
| 2  | Bob   | 30000  | 1             |
| 3  | Carol | 20000  | 1             |

Over-budget prevention

Now, let’s consider we have two users, Alice and Bob, who both want to change the sum of salaries, as follows:

Alice wants to give a 10% end-of-the-year raise to all employees in the IT department, which should raise the budget from 90000 to 99000
Bob wants to hire Dave with a salary of 9000 , which should also raise the budget from 90000 to 99000

If both Alice and Bob are allowed to commit, then we will risk going over the budget. So, we need to define a check_department_budget trigger-based function that ensures the sum of salaries in a given department does not exceed the pre-defined budget:

CREATE OR REPLACE FUNCTION check_department_budget()
  RETURNS TRIGGER AS $$
DECLARE
  allowed_budget BIGINT;
  new_budget     BIGINT;
BEGIN
  SELECT INTO allowed_budget budget
  FROM department
  WHERE id = NEW.department_id;
 
  SELECT INTO new_budget SUM(salary)
  FROM employee
  WHERE department_id = NEW.department_id;
 
  IF new_budget > allowed_budget
  THEN
    RAISE EXCEPTION 'Overbudget department [id:%] by [%]',
    NEW.department_id,
    (new_budget - allowed_budget);
  END IF;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

The check_department_budget function is executed on every INSERT and UPDATE in the employee table via the check_department_budget_trigger PostgreSQL TRIGGER.

CREATE TRIGGER check_department_budget_trigger
AFTER INSERT OR UPDATE ON employee
FOR EACH ROW EXECUTE PROCEDURE check_department_budget();

READ COMMITTED isolation level

When using the default READ COMMITTED isolation level, we can see that the check_department_budget database function prevents theWrite Skew anomaly that, otherwise, would cause an over-budgeting issue:

PostgreSQL triggers and isolation levels

In READ COMMITTED, the MVCC database engine allows the database transactions to read the latest committed state of records. So, even if our currently running transaction has previously read the version N of a given record if the current version of this record is now N+1 because other concurrent transaction has just changed it and committed, our transaction will read the version N+1 via a subsequent SELECT statement.

When using READ COMMITTED, a query will see the database as of the beginning of the query.

For this reason, the UPDATE statement will fail because the check_department_budget_trigger detected that the UPDATE would go over the budget. Even if Alice read the sum of salaries at the beginning of her transaction, the second SELECT executed by the check_department_budget function will read the latest employee salary sum, meaning that it will take Bob’s INSERT into consideration.

REPEATABLE READ isolation level

When switching to REPEATABLE READ and rerunning our previous example, we can see that the check_department_budget_trigger is no longer able to prevent theWrite Skew anomaly:

PostgreSQL triggers and isolation levels

Unlike READ COMMITTED, when using REPEATABLE READ, a query will see the database as of the beginning of the transaction. So, when the check_department_budget_trigger is executed due to Alice’s UPDATE statement, the sum of salaries will be 90 000 as it were at the beginning of Alice’s transaction.

So, both Bob and Alice are allowed to commit, and the sum of salaries goes over the budget. The REPEATABLE READ isolation level in PostgreSQL is, in reality, the Snapshot Isolation consistency model. While Snapshot Isolation can prevent thePhantom Read anomaly, it cannot prevent the Write Skew phenomenon.

SERIALIZABLE isolation level

When switching to the SERIALIZABLE isolation level and rerunning our example, we can see that Bob’s transaction proceeds while Alice’s transaction is rolled back.

PostgreSQL triggers and isolation levels

Just like REPEATABLE READ, when using the SERIALIZABLE isolation level, a query will see the database as of the beginning of the transaction. However, unlike REPEATABLE READ, Alice’s transaction is rolled back because the transaction engine has detected a dependency cycle between Alice’s read of salaries and Bob’s write.

So, being the first one to commit, Bob’s transaction succeeds. On the other hand, Alice’s transaction fails as Alice assumes a database state that’s stale at the end of her transaction. The SERIALIZABLE isolation level in PostgreSQL uses an enhanced version of the standard Snapshot Isolation algorithm that can detect Write Skew anomalies. This enhanced MVCC Snapshot Isolation mechanism is called Serializable Snapshot Isolation , and it’s based on Michael James Cahill Ph.D. thesis .

If you enjoyed this article, I bet you are going to love my SQL Master Class for Java Developers workshop , as well.

So, if you are in Málaga on the 13th of May , then you should definitely join my SQL Master Class training at J on the Beach .

And, if you are in Oslo on the 25th - 26th of May , then you have the chance to attend my SQL Master Class training .

PostgreSQL triggers and isolation levels

Conclusion

Understanding the isolation level guarantees provided by the underlying database system is very important when designing a data access layer.

When defining a trigger-based function that enforces a certain constraint, it’s better to test it against the isolation level you are going to use in production, as, otherwise, you might end up with data integrity issues that are very hard to spot after the fact.

以上就是本文的全部内容，希望对大家的学习有所帮助，也希望大家多多支持码农网

查看所有标签

猜你喜欢:

PostgreSQL triggers and isolation levels

本站部分资源来源于网络，本站转载出于传递更多信息之目的，版权归原作者或者来源机构所有，如转载稿涉及版权问题，请联系我们。

码农书籍

程序员的数学2

平冈和幸、堀玄 / 陈筱烟 / 人民邮电出版社 / 2015-8-1 / CNY 79.00

本书沿袭《程序员的数学》平易近人的风格，用通俗的语言和具体的图表深入讲解程序员必须掌握的各类概率统计知识，例证丰富，讲解明晰，且提供了大量扩展内容，引导读者进一步深入学习。本书涉及随机变量、贝叶斯公式、离散值和连续值的概率分布、协方差矩阵、多元正态分布、估计与检验理论、伪随机数以及概率论的各类应用，适合程序设计人员与数学爱好者阅读，也可作为高中或大学非数学专业学生的概率论入门读物。一起来看看《程序员的数学2》这本书的介绍吧!

码农工具

PostgreSQL triggers and isolation levels

Introduction

Database transactions

2PL and MVCC

Database model

Over-budget prevention

READ COMMITTED isolation level

REPEATABLE READ isolation level

SERIALIZABLE isolation level

Conclusion

程序员的数学2

在线进制转换器

UNIX 时间戳转换

正则表达式在线测试