Building and debugging a high-throughput daemon in Rust

栏目: IT技术 · 发布时间: 4年前

内容简介:The

15 Jul 2020 - 4 minute read

Tags:async-std async rust

The async / await keywords in modern Rust make building high-throughput daemons pretty straightforward, but as I learned that doesn’t necessarily mean “easy.” Last month on the Scribd tech blog wrote about a daemon named hotdog which we deployed into production: Ingesting production logs with Rust . In this post, I would like to write about some of the technical challenges I encountered getting the performance tuned for this async-std based Rust application.

My first Bay Area job used a set of libraries with Python that made it similar to Stackless Python . Thanks to that early initiation, I now have a habit of approaching services with “tasks and channels” in mind. With that in mind, I picked up async-std and started work on hotdog:

Hotdog is a syslog-to-Kafka forwarder which aims to get log entries into Apache Kafka as quickly as possible.
It listens for syslog messages over plaintext or TLS-encrupted TCP connection and depending on the defined Rules it will route and even modify messages on their way into a configured Kafka broker.

What this means as far as application design is fairly simple:

  • Listener for syslog connections
  • When those connections are accepted, spawn a connection specific handler task
  • For each received log line which should be forwarded, pass that along to the task which will publish to Kafka.

Refer to the following amazing diagram:

syslog clients
                      |  |  |  |
                      v  v  v  v
                   +---------------+
                   | Listener Task | (spawning connection tasks
                   +------+--------+  with stream descriptor)
                          |
           +--------------------------------+ 
           |              |                 |
+----------v--------+  +--v---+  +----------v--------+
| Connection Task A |  |  ..  |  | Connection Task N |
+---------------+---+  +---+--+  +----+--------------+
                |          |          |
                +----+     |     +----+ (sending prepared
                     |     |     |       messages over channel)
                +----v-----v-----v-----+
                | Kafka Publisher Task |
                +----------------------+
                   |  |  |  |  |  |
                   v  v  v  v  v  v
                  Apache Kafka Brokers

This basic formula is what I followed with the initial versions (released as 0.2.x) and that was fast . I used perf to profile with real-world traffic from Scribd, all of which were JSON logs, and the vast majority of time spent by the daemon was in parsing and serialization overhead. Since Rust does not have garbage collection or a “runtime” per se, there is very little standing between the code and the work it has to do. When using an async runtime such as async-std, smol, or tokio, there is however some other code running around behind the scenes to be aware of. My profiling against async-std 1.5.0 did indicate some CPU time being burned while polling on Futures, but I considered that more than likely to be “my bug” rather than “async-std’s bug.”

I was shocked at how little there was to actually optimize!

Since I was curious and had time to kill, I rapidly adopted async-std 1.6.0 when it was published. The 1.6.0 release removed the entire async-std runtime in favor of smol a new (small) async runtime for Rust. Having followed the development of smol, I was giddy at the potential of seeing an even faster hotdog after the upgrade was complete. After my rudimentary local testing, I cut a release and deployed it to production.

Performance dropped like a rock.

I was stumped. Stuck, I moved on to other projects and kept an eye on smol’s issue tracker to see if anybody else might replicate the issue.

Sometimes it is however more important to be lucky than smart.

I managed to catch the author of smol on Discord, and between them and the author of async-std I was able to solicit some help in understanding how performance had cratered so dramatically. @stjepang shared this gist which allowed for wrapping futures in Inspect() , printing out debug information when futures take far too long to execute. With their help, we were able to identify a couple of tightly bound CPU-only loops within hotdog that through some fluke had not caused issues under async-std 1.5.0. The more efficient smol under the hood exposed the performance problems, those very same ones which I had observed before as seemingly excessive CPU time spent on polling Futures.

The “trick” to resolving the issues is seemingly as old as the cooperative multi-tasking world itself: throw some yield s on it. In async-std this means task::yield_now().await; , which gives the async reactor a breather to let another task run.

Once the fix ( 0.3.3 ) had been deployed to production, I finally got the giddy excitement I had originally hoped for: the cluster scaled down to levels not previously seen.

In my opinion, Rust’s async runtimes make building high performance applications relatively straightforward. Unfortunately I don’t think they’re easily debuggable at this point. When everything goes great, they’re wonderful. When you find yourself inexplicably blocking somewhere it’s tricky to isolate. I’m sure there will be improvements in this area as time goes on, but the sharp edges can be a little sharper than somebody coming from Node or Java might expect.

hotdog is happily running along processing hundreds of thousands of log lines per minute.

That said, the number of times in my career where a simple yield addressed pesky performance issues is incalculable. :scream_cat:


以上就是本文的全部内容,希望本文的内容对大家的学习或者工作能带来一定的帮助,也希望大家多多支持 码农网

查看所有标签

猜你喜欢:

本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们

C++程序设计原理与实践

C++程序设计原理与实践

(美)Bjarne Stroustrup / 王刚 等 / 机械工业出版社 / 2010.7 / 108.00元

本书是经典程序设计思想与C++开发实践的完美结合,是C++之父回归校园后对C++编程原理和技巧的全新阐述。书中全面地介绍了程序设计基本原理,包括基本概念、设计和编程技术、语言特性以及标准库等,教你学会如何编写具有输入、输出、计算以及简单图形显示等功能的程序。此外,本书通过对C++思想和历史的讨论、对经典实例(如矩阵运算、文本处理、测试以及嵌入式系统程序设计)的展示,以及对C语言的简单描述,为你呈现......一起来看看 《C++程序设计原理与实践》 这本书的介绍吧!

图片转BASE64编码
图片转BASE64编码

在线图片转Base64编码工具

XML 在线格式化
XML 在线格式化

在线 XML 格式化压缩工具

Markdown 在线编辑器
Markdown 在线编辑器

Markdown 在线编辑器