Evaluating Checkpointing in PostgreSQL

栏目: IT技术 · 发布时间: 5年前

内容简介:To evaluate PostgreSQL I will use a not identical but similar scenario: using sysbench-tpcc with 1000 Warehouses, and as with sysbench you can produce tpcc-like workload for PostgreSQL:

Evaluating Checkpointing in PostgreSQL Continuing with the checkpointing topic I started a month ago with my blog post MongoDB Checkpointing Woes , this time I want to review how PostgreSQL performs in this area. After this, I will be taking a look at MySQL and MariaDB. If anything, it will be fair not only to complain about MongoDB but to review how other databases handle it, as well.

Benchmark

To evaluate PostgreSQL I will use a not identical but similar scenario: using sysbench-tpcc with 1000 Warehouses, and as with sysbench you can produce tpcc-like workload for PostgreSQL:

Sysbench-tpcc Supports PostgreSQL (No, Really This Time)

Tuning PostgreSQL for sysbench-tpcc

The hardware I use is:

System | Supermicro; SYS-F619P2-RTN; v0123456789 (Other)
   Platform | Linux
    Release | Ubuntu 18.04.4 LTS (bionic)
     Kernel | 5.3.0-42-generic
Architecture | CPU = 64-bit, OS = 64-bit
  Threading | NPTL 2.27
    SELinux | No SELinux detected
Virtualized | No virtualization detected
# Processor ##################################################
 Processors | physical = 2, cores = 40, virtual = 80, hyperthreading = yes
     Models | 80xIntel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz
     Caches | 80x28160 KB
# Memory #####################################################
      Total | 187.6G

With the storage on SATA SSD INTEL SSDSC2KB960G8 (Intel Enterprise-grade SSD D3-S4510).

The PostgreSQL config is:

shared_buffers = '140GB'
work_mem = '4MB'
random_page_cost = '1'
maintenance_work_mem = '2GB'
 
wal_level = 'replica'
max_wal_senders = '3'
 
synchronous_commit = 'on'
seq_page_cost = '1'
synchronous_commit = 'on'
 
checkpoint_completion_target = '0.9'
checkpoint_timeout = '900'
 
max_wal_size = '20GB'
min_wal_size = '12GB'
 
autovacuum_vacuum_scale_factor = '0.4'
effective_cache_size = '200GB'
bgwriter_lru_maxpages = '1000'
bgwriter_lru_multiplier = '10.0'
logging_collector = 'ON'
wal_compression = 'ON'
log_checkpoints = 'ON'
archive_mode = 'OFF'
full_page_writes = 'ON'
fsync = 'ON'

The short settings overview:

  • Data will totally fit into memory (The datasize is ~100GB, memory on the server is 188GB, and we allocate 140GB for PostgreSQL shared buffers.)
  • The workload on storage will be mostly write-intensive (reads will be done from memory), with full ACID-compliant and data safe settings on PostgreSQL.
  • I will vary log size from 1GB to 100GB, to see the effect of log sizes on checkpointing.

The benchmark command line is:

./tpcc.lua --pgsql-user=sbtest --pgsql-password=sbtest --pgsql-db=sbtest --time=3600 --threads=56 --report-interval=1 --tables=10 --scale=100 --use_fk=0 --trx_level=RC --db-driver=pgsql --report_csv=yes run 

This means that the benchmark will run for 1 hour, with reporting throughput every 1 sec.

Results

Let’s see what results I’ve got with this setup:

Evaluating Checkpointing in PostgreSQL

That’s an interesting pattern!

Although there are no drops to the floor, we see a saw-like pattern, where throughput raises to ~8000 tps and then drops to ~3000tps (that’s 2.6 times drop!).

It was suggested to check how PostgreSQL would perform with full_page_writes = 'OFF' (this is not a data-safe setting and I would not recommend to use it in production!)

Results with full_page_writes = ‘OFF’

Evaluating Checkpointing in PostgreSQL

This seems to improve the saw-like pattern, but there are micro-drops that are concerning.

If we zoom in only to 50GB WAL size, we can see it in detail:

Evaluating Checkpointing in PostgreSQL

I would be interested to hear ideas on how PostgreSQL results in 1-sec resolution can be improved! If you are interested in the raw results and notebooks, it is available here in GitHub .


以上就是本文的全部内容,希望本文的内容对大家的学习或者工作能带来一定的帮助,也希望大家多多支持 码农网

查看所有标签

猜你喜欢:

本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们

数据挖掘

数据挖掘

(美)Jiawei Han、(加)Micheline Kamber、(加)Jian Pei / 范明、孟小峰 / 机械工业出版社 / 2012-8 / 79.00元

数据挖掘领域最具里程碑意义的经典著作 完整全面阐述该领域的重要知识和技术创新 这是一本数据挖掘和知识发现的优秀教材,结构合理、条理清晰。本书既保留了相当篇幅讲述数据挖掘的基本概念和方法,又增加了若干章节介绍数据挖掘领域最新的技术和发展,因此既适合初学者学习又适合专业人员和实践者参考。本书视角广阔、资料翔实、内容全面,能够为有意深入研究相关技术的读者提供足够的参考和支持。总之, 强烈推荐......一起来看看 《数据挖掘》 这本书的介绍吧!

正则表达式在线测试
正则表达式在线测试

正则表达式在线测试

HSV CMYK 转换工具
HSV CMYK 转换工具

HSV CMYK互换工具