Evaluating Group Replication Scaling for I/O Bound Workloads

Category: IT Technology · Published: 4 years ago

In this post, I want to evaluate how Group Replication scales when we increase the number of nodes and the number of user connections. The setup is identical to the one in my post "Evaluating Group Replication Scaling Capabilities in MySQL", but in this case I will use an I/O-bound workload.

For this test, I will deploy several bare metal servers: each node and the client run on a dedicated server, and the servers are connected by a 10Gb network.

I will use both three-node and five-node Group Replication setups. In both cases, the load is directed to only ONE node, but I expect some additional replication overhead with five nodes.

Hardware specifications:

     System | Supermicro; SYS-F619P2-RTN; v0123456789 (Other)
Service Tag | S292592X0110239C
   Platform | Linux
    Release | Ubuntu 18.04.4 LTS (bionic)
     Kernel | 5.3.0-42-generic
Architecture | CPU = 64-bit, OS = 64-bit
  Threading | NPTL 2.27
    SELinux | No SELinux detected
Virtualized | No virtualization detected
# Processor ##################################################
 Processors | physical = 2, cores = 40, virtual = 80, hyperthreading = yes
     Models | 80xIntel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz
     Caches | 80x28160 KB
# Memory #####################################################
      Total | 187.6G

For the benchmark, I use a sysbench-tpcc database with 1000 warehouses (10 tables at scale 100), prepared as follows:

./tpcc.lua --mysql-host=172.16.0.11 --mysql-user=sbtest --mysql-password=sbtest --mysql-db=sbtest --time=300 --threads=64 --report-interval=1 --tables=10 --scale=100 --db-driver=mysql --use_fk=0 --force_pk=1 --trx_level=RC prepare

The configs, scripts, and raw results are available on our GitHub. The workload is I/O-bound, that is, the data (about 100GB) exceeds the innodb_buffer_pool (25GB).

For the MySQL version, I use MySQL 8.0.19.

Results

Let’s review the results I’ve got. First, let’s take a look at how performance changes when we increase user threads from 1 to 256 for three nodes.

It is interesting to see how the results become unstable when we increase the number of threads. For more detail, let’s draw the chart with an individual scale for each thread count.

As we can see, there is a lot of variation starting at 64 threads. Let’s check the 64 and 128 threads with a 1-second resolution.

It looks as if some cyclical process is going on, with periodic drops to 0, which may be related to this bug.
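To make the "periodic drops to 0" observation measurable, here is a minimal Python sketch that scans per-second sysbench report lines and groups consecutive near-zero seconds into stall intervals. It assumes the usual sysbench 1.0 interval format (`[ 12s ] thds: 64 tps: ... lat (ms,95%): ...`); the function names and the threshold are illustrative and are not taken from the scripts used for this post.

```python
import re

# Assumed sysbench interval-report prefix, one line per --report-interval tick:
# [ 12s ] thds: 64 tps: 4100.00 qps: ...
LINE_RE = re.compile(r"\[\s*(\d+)s\s*\]\s+thds:\s*(\d+)\s+tps:\s*([\d.]+)")


def parse_tps(lines):
    """Return (second, tps) tuples for every line that matches the report format."""
    samples = []
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            samples.append((int(m.group(1)), float(m.group(3))))
    return samples


def find_stalls(samples, threshold=1.0):
    """Group consecutive samples with tps below threshold into (start, end) runs."""
    stalls = []
    start = prev = None
    for sec, tps in samples:
        if tps < threshold:
            if start is None:
                start = sec  # a new stall begins
            prev = sec
        elif start is not None:
            stalls.append((start, prev))  # the stall ended on the previous sample
            start = None
    if start is not None:
        stalls.append((start, prev))  # the log ended while still stalled
    return stalls
```

Feeding the 1-second log of a 64-thread run through `find_stalls(parse_tps(lines))` gives the start and end second of each stall, so the spacing between stalls can be compared against a suspected periodic process.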

3 nodes vs. 5 nodes

Now let’s take a look at the performance with five nodes compared to three nodes.

There does not seem to be a huge difference. Only where the results are stable, at 8-16 threads, can we see a decline for five nodes. From 64 to 256 threads, where the variance dominates, it is hard to notice a difference.

Handling Sustained Incoming Rate

As there are occasional drops at high thread counts, I wanted to check how the cluster would handle a sustained incoming rate of about 75% of the average throughput. For this, I will test with 64 threads at an incoming rate of 3100 transactions per second (the average throughput for three nodes under 64 threads is 4100 tps).
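The arithmetic behind the 3100 tps figure can be sketched as follows: 75% of 4100 tps is 3075, which rounds to 3100. The snippet also builds a rate-limited run command using sysbench's `--rate` option. The connection and sizing options are copied from the prepare command above; the exact run command used for this test is not shown in the post, so treat the argument list and the helper names as assumptions.

```python
def target_rate(avg_tps, fraction=0.75, round_to=100):
    """Pick an incoming rate as a fraction of average throughput, rounded to round_to."""
    return int(round(avg_tps * fraction / round_to) * round_to)


def sysbench_run_args(rate, threads=64, seconds=300):
    """Build a rate-limited sysbench-tpcc run command as an argument list (assumed options)."""
    return [
        "./tpcc.lua",
        "--mysql-host=172.16.0.11",
        "--mysql-user=sbtest",
        "--mysql-password=sbtest",
        "--mysql-db=sbtest",
        f"--time={seconds}",
        f"--threads={threads}",
        "--report-interval=1",
        "--tables=10",
        "--scale=100",
        "--db-driver=mysql",
        f"--rate={rate}",  # sysbench: average transaction rate, 0 means unlimited
        "run",
    ]


if __name__ == "__main__":
    rate = target_rate(4100)  # 4100 tps is the 64-thread average from this post
    print(" ".join(sysbench_run_args(rate)))
```

With `--rate` set, sysbench generates transactions at a fixed average pace instead of as fast as the threads allow, so any drop below the target comes from the server stalling rather than from the client saturating it.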

First, let’s see how the system handles throughput.

We can see that most of the time the throughput is 3100 tps, but there are still intermittent drops, as well as jumps before the return to the regular incoming rate. Now, more interesting: how does it affect latency?

Again, most of the time the system shows a 95th percentile response time of about 25ms, but during the drops it jumps up to 1400ms, which will obviously result in a bad experience for both applications and users.
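To put a number on how far the spikes sit from the baseline (1400ms against 25ms is a 56x jump), a small Python sketch can pull the 95th percentile column out of the per-second sysbench lines and compare every sample to the median. The `lat (ms,95%):` field name follows the usual sysbench interval format, which is an assumption here, and the 10x spike factor is an arbitrary illustrative choice.

```python
import re
import statistics

# Assumed sysbench interval-report field: "lat (ms,95%): 25.28"
LAT_RE = re.compile(r"lat \(ms,95%\):\s*([\d.]+)")


def p95_series(lines):
    """Extract the per-interval 95th percentile latency (ms) from report lines."""
    return [float(m.group(1)) for m in map(LAT_RE.search, lines) if m]


def spike_summary(latencies, factor=10.0):
    """Compare each sample to the median and count those above factor * median."""
    baseline = statistics.median(latencies)
    spikes = [x for x in latencies if x > factor * baseline]
    return {
        "baseline_ms": baseline,
        "spike_seconds": len(spikes),
        "worst_ratio": max(latencies) / baseline,
    }
```

Using the median as the baseline keeps a handful of extreme seconds from distorting what counts as "normal" latency, which matters when the spikes are more than an order of magnitude above it.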

Conclusion

From my findings, it seems that with an I/O-bound workload, Group Replication also handles extra nodes quite well, but high thread counts are still problematic. I am open to suggestions on how performance at high thread counts can be improved, so please leave your comments below.
