Using SKIP LOCK in MySQL For Queue Processing

栏目: IT技术 · 发布时间: 5年前

内容简介:A small thing that brings huge help.The other day I was writing some code to process a very large amount of items coming from a social media API. My items were ending in a queue in MySQL and then needed to be processed and eventually moved.The task was not

A small thing that brings huge help.

The other day I was writing some code to process a very large amount of items coming from a social media API. My items were ending in a queue in MySQL and then needed to be processed and eventually moved.

The task was not so strange,  but what I have to do is to develop a queue processor.  Now when you need to process a queue you have two types of queue: static and dynamic .

The static comes in a batch of N number of items in a given time interval and is normally easier to process given you have a defined number of items that you can split in chunks and process in parallel.

The dynamic is… well... more challenging. One option is to wait to have a predefined number of items, and then process them as if they were a static queue.

But this approach is not very good, given it is possible that it will delay a lot the processing of an item for all the time it has to wait to reach the desired queue’s size.

The other possibility is to have the processing jobs work on a single item and not in chunk/batch. But, this is not optimal when given the chance to process a queue in batch to speed up the processing time.

My incoming queue is a bit unpredictable, a mix of fixed sizes and a few thousand coming sparse, without a clear interval.  So I was there thinking on how to process this and already starting to design a quite complex mechanism to dynamically calculate the size of the possible chunks and the number of jobs, when…

An aside: some colleagues know my habit to read the whole MySQL manual, from A to Z, at least once a year. It's a way for me to review what is going on and sometimes to dig in more in some aspects. This normally also gives me a good level of confidence about new features and other changes on top of reading the release notes.

...When … looking at the documentation for something else, my attention was captured by:

“To avoid waiting for other transactions to release row locks, NOWAIT and SKIP LOCKED options may be used with SELECT ... FOR UPDATE or SELECT ... FOR SHARE locking read statements.”

Wait -  what???

Let me dig in a bit:

“SKIP LOCKED. A locking read that uses SKIP LOCKED never waits to acquire a row lock. The query executes immediately, removing locked rows from the result set.”

Wow, how could I have missed that?

It was also not new but in MySQL 8.0.1, the milestone release. Having experience with Oracle, I knew what SKIP LOCKED does and how to use it. But I was really not aware that it was also available in MySQL.

In short, SKIP LOCKED allows you to lock a row (or set of them), bypassing the rows already locked by other transactions.

The classic example is:

# Session 1:
mysql> CREATE TABLE t (i INT, PRIMARY KEY (i)) ENGINE = InnoDB;
mysql> INSERT INTO t (i) VALUES(1),(2),(3);
mysql> START TRANSACTION;
mysql> SELECT * FROM t WHERE i = 2 FOR UPDATE;
+---+
| i |
+---+
| 2 |
+---+

# Session 2:
mysql> START TRANSACTION;
mysql> SELECT * FROM t FOR UPDATE SKIP LOCKED;
+---+
| i |
+---+
| 1 |
| 3 |
+---+

But what is important for me is that given an N number of jobs running, I can bypass all the effort of calculating the dynamic chunks, given that using SKIP LOCKED that will happen as well, if in a different way.

All good, but what performance will I have using SKIP LOCK in comparison to the other two solutions?

I have run the following tests on my laptop, so not a real server, and used a fake queue processor I wrote on the fly to test the things you can find on GitHub here .  

What I do is to create a table like this:

CREATE TABLE `jobs` (
  `jobid` int unsigned NOT NULL AUTO_INCREMENT,
  `time_in` bigint NOT NULL,
  `time_out` bigint DEFAULT '0',
  `worked_time` bigint DEFAULT '0',
  `processer` int unsigned DEFAULT '0',
  `info` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`jobid`),
  KEY `idx_time_in` (`time_in`,`time_out`),
  KEY `idx_time_out` (`time_out`,`time_in`),
  KEY `processer` (`processer`)
) ENGINE=InnoDB AUTO_INCREMENT=0 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

Then I will do three different methods of processing:

  1. Simple process, reading and writing a row a time
  2. Use chunks to split my queue
  3. Use SKIP LOCKED

To clarify the difference existing between the 3 different way of processing the queue let us use 3 simple images:

Using SKIP LOCK in MySQL For Queue Processing

In simple processing each row represent a possible lock that the other processes must wait for.

Using SKIP LOCK in MySQL For Queue Processing

In chunk processing given each process knows what records to lock they can go in parallel.

Using SKIP LOCK in MySQL For Queue Processing

In SKIP LOCKED also if each process have no idea of what rows they need to lock, is enough to say the size of the chunk, and MySQL will return the records available.

I will repeat the tests for a static queue, and after for a dynamic queue for 20 seconds. Let's see what happens.

Test one - simple processing and static queue:

queue Processor report:
=========================
Start Time                = Tue Jul 28 13:28:25 CEST 2020
End Time                  = Tue Jul 28 13:28:31 CEST 2020
Total Time taken(ms)      =     6325.0
Number of jobs            =          5
Number of Items processed =      15000
Avg records proc time     =     220308

Test 2 use the chunks and static queue

Chunk no loop
--------------
queue Processor report:
=========================
Start Time                = Tue Jul 28 13:30:18 CEST 2020
End Time                  = Tue Jul 28 13:30:22 CEST 2020
Total Time taken(ms)      =     4311.0
Number of jobs            =          5
Number of Items processed =      15000
Avg records proc time     =     391927

Test three - use SKIP LOCKED and static queue:

SKIP LOCK - no loop
------------------------
queue Processor report:
=========================
Start Time                = Tue Jul 28 13:32:07 CEST 2020
End Time                  = Tue Jul 28 13:32:11 CEST 2020
Total Time taken(ms)      =     4311.0
Number of jobs            =          5
Number of Items processed =      15000
Avg records proc time     =     366812

So far, so good.

Time is almost the same (actually in this test, it's exactly the same), normally fluctuating a bit up and down by a few ms.

Using SKIP LOCK in MySQL For Queue Processing

Average execution by commit fluctuates a bit:

Using SKIP LOCK in MySQL For Queue Processing

Here, the base execution is faster for the simple reason that the application is processing one record against a batch of records of the other two.

Now it is time to see what happens if instead of a static batch, I have a process that fills the queue constantly. If you want picture what will happen on each test, just imagine this:

  • Some pipes will put water in a huge tank
  • The base solution will try to empty the tank using a small glass of water but acting very fast at each run
  • With Chunk it will wait for the water to reach a specific level, then will use a fixed-size bucket
  • Using SKIP LOCK, it will constantly look at the tank and will choose the size of the bucket based on the quantity of the water present.

To simulate that, I will use five threads to write new items, five to process the queue, and will run the test for 20 seconds.

Here we will have some leftovers; that is how much water remains in the tank because the application was not emptied with the given bucket. We can say it is a way to measure the efficiency of the processing, where the optimal sees the tank empty.

Test one -  simple processing and static queue:

Basic no loop
--------------
queue Processor report:
=========================
Start Time                = Tue Jul 28 13:42:37 CEST 2020
End Time                  = Tue Jul 28 13:43:25 CEST 2020
Total Time taken(ms)      =    48586.0
Number of jobs            =          5
Number of Items processed =      15000
Total Loops executed (sum)=         85
Avg records proc time     =     243400

Leftover
+----------+------------+
| count(*) | max(jobId) |
+----------+------------+
|   143863 |     225000 |
+----------+------------+

Test 2 use the chunks and static queue:

Chunk no loop
--------------
queue Processor report:
=========================
Start Time                = Tue Jul 28 13:44:56 CEST 2020
End Time                  = Tue Jul 28 13:45:44 CEST 2020
Total Time taken(ms)      =    47946.0
Number of jobs            =          5
Number of Items processed =      15000
Total Loops executed (sum)=         70
Avg records proc time     =     363559

Leftover
+----------+------------+
| count(*) | max(jobId) |
+----------+------------+
|       53 |     195000 |
+----------+------------+

Test 3 use SKIP LOCKED and static queue:

queue Processor report:
=========================
Start Time                = Tue Jul 28 14:46:45 CEST 2020
End Time                  = Tue Jul 28 14:47:32 CEST 2020
Total Time taken(ms)      =    46324.0
Number of jobs            =          5
Number of Items processed =      15000
Total Loops executed (sum)=       1528
Avg records proc time     =     282658

Leftover
+----------+------------+
| count(*) | max(jobId) |
+----------+------------+
|        0 |       NULL |
+----------+------------+

Here, the scenario is a bit different than the one we had with the static queue.

Using SKIP LOCK in MySQL For Queue Processing

Here, the scenario is a bit different than the one we had with the static queue.

Using SKIP LOCK in MySQL For Queue Processing

Record processing when comparing by chunk and SKIP LOCK is again more efficient in the second one. This is because it optimizes the size of the “bucket” and given that it can sometimes process fewer records per loop.

Using SKIP LOCK in MySQL For Queue Processing

As we can see when using SKIP LOCK, the application was able to execute 1528 loops to process the queue against the 70 of the chunk and 85 of the basic approach.

In the end, the only one that was able to empty the tank was the solution with SKIP LOCK.

Conclusion

Processing queues can be simple when we need to process a fixed number of items, but if you need an adaptive approach, then the situation changes. You can find yourself writing quite complex algorithms to optimize the processing.

Using SKIP LOCK helps you in keeping the code/solution simple and move the burden of identifying the record to process onto the RDBMS.

SKIP LOCK is something that other technologies like Oracle-DB and Postgres already implemented, and their developer communities use.

MySQL implementation comes a bit later, and the option is not widely known or used in the developer’s community using MySQL, but it should.

Give it a try and let us know!

NOTE !!

SKIP LOCK is declared unsafe for statement replication, you MUST use ROW based replication if you use it.

References

MySQL 8.0 Hot Rows with NOWAIT and SKIP LOCKED

MySQL 8.0 Reference Manual: Locking Reads


以上所述就是小编给大家介绍的《Using SKIP LOCK in MySQL For Queue Processing》,希望对大家有所帮助,如果大家有任何疑问请给我留言,小编会及时回复大家的。在此也非常感谢大家对 码农网 的支持!

查看所有标签

猜你喜欢:

本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们

金融数量分析

金融数量分析

郑志勇 / 北京航空航天大学出版社 / 2014-7-1 / CNY 58.00

《金融数量分析——基于MATLAB编程(第3版)》一书中的案例均来源于作者的工作实际,并充分体现“案例的实用性、程序的可模仿性”,程序中附有详细的注释。例如,投资组合管理、KMV模型计算、期权定价模型与数值方法、风险价值VaR的计算等案例程序,读者可以直接使用或根据需要在源代码的基础上修改、完善。 本书共23章。前两章分别对金融市场的基本概况与MATLAB的基础知识进行概述;接下来为20个金......一起来看看 《金融数量分析》 这本书的介绍吧!

HTML 压缩/解压工具
HTML 压缩/解压工具

在线压缩/解压 HTML 代码

在线进制转换器
在线进制转换器

各进制数互转换器

RGB CMYK 转换工具
RGB CMYK 转换工具

RGB CMYK 互转工具