Cheap tricks for high-performance Rust

Category: IT technology · Published: 4 years ago

So you’re writing Rust but it’s not fast enough? Even though you’re using cargo build --release? Here are some small things you can do to increase the runtime speed of a Rust project – practically without changing any code!

Please remember that the following suggestions do not replace actual profiling and optimizations! I also think it goes without saying that the only way to detect if any of this helps is having benchmarks that represent how your application behaves under real usage.
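To make the "have benchmarks" point a bit more concrete, here is a minimal timing sketch that uses only the standard library. The helper best_of and the workload sum_of_squares are made-up names for illustration; a real project would measure its own hot paths, probably with a proper benchmarking harness, so treat this as a rough starting point only.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Runs `f` `iterations` times and returns the fastest observed run.
/// (Hypothetical helper, not part of any library.)
fn best_of<T>(iterations: u32, mut f: impl FnMut() -> T) -> Duration {
    let mut best = Duration::MAX;
    for _ in 0..iterations {
        let start = Instant::now();
        // black_box keeps the optimizer from discarding the result.
        black_box(f());
        best = best.min(start.elapsed());
    }
    best
}

/// Hypothetical workload standing in for a real hot path.
fn sum_of_squares(n: u64) -> u64 {
    (0..n).map(|x| x.wrapping_mul(x)).fold(0, u64::wrapping_add)
}

fn main() {
    let fastest = best_of(20, || sum_of_squares(black_box(1_000_000)));
    println!("best of 20 runs: {fastest:?}");
}
```

Run it once before and once after changing a setting (always with --release) and compare the numbers on the same machine.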

Tweaking our `release` profile

Let’s first of all enable some more optimizations for when we do cargo build --release. The deal is pretty simple: we enable some features that make release builds even slower to compile but get more thorough optimizations as a reward.

We add the flags described below to our main Cargo.toml file, i.e., the topmost manifest file in case you are using a Cargo workspace. If you don’t already have a section called profile.release, add it:

[profile.release]

Link-time optimization

The first thing we’ll do is enable link-time optimization (LTO). It’s a kind of whole-program or inter-module optimization, as it runs as the very last step when linking the different parts of your binary together. You can think of it as allowing better inlining across dependency boundaries (but it’s of course more complicated than that).

Rust supports several LTO modes, and the one we want is “optimize across all crates”, which is called “fat”. To set this, add the lto flag to your profile:

lto = "fat"

Code generation units

Next up is a similar topic. To speed up compile times, Rust tries to split your crates into small chunks and compile as many in parallel as possible. The downside is that there are fewer opportunities for the compiler to optimize code across these chunks. So, let’s tell it to do one chunk per crate:

codegen-units = 1

Setting a specific target CPU

By default, Rust wants to build a binary that works on as many machines of the target architecture as possible. However, you might actually have a pretty new CPU with cool new features! To enable those, we add

-C target-cpu=native

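as a “Rust flag”, i.e. via the environment variable RUSTFLAGS or the target’s rustflags field in your .cargo/config (named .cargo/config.toml in newer Cargo versions). Keep in mind that a binary built this way may not run on CPUs that lack the features of the build machine.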
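If you want to check whether the flag actually changed anything, one option is to ask the compiler which CPU features it assumed at compile time. The following is a small sketch; the function name compiled_in_features and the choice of these four x86 features are just assumptions for illustration, and on other architectures the list will simply be empty.

```rust
/// Returns the subset of a few well-known x86 features that the compiler
/// was allowed to assume when this binary was built.
/// (Hypothetical helper; the feature selection is arbitrary.)
fn compiled_in_features() -> Vec<&'static str> {
    let mut features = Vec::new();
    // cfg! is evaluated at compile time, so this reflects the build
    // settings, not the CPU the program happens to run on.
    if cfg!(target_feature = "sse2") {
        features.push("sse2");
    }
    if cfg!(target_feature = "avx") {
        features.push("avx");
    }
    if cfg!(target_feature = "avx2") {
        features.push("avx2");
    }
    if cfg!(target_feature = "fma") {
        features.push("fma");
    }
    features
}

fn main() {
    println!("compiled-in features: {:?}", compiled_in_features());
}
```

Building this once with and once without the flag and comparing the output shows what the compiler now takes for granted.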

Aborting

Now we get into some of the riskier options. Remember how Rust by default uses stack unwinding (on the most common platforms)? That costs performance! Let’s give up unwinding, and with it the ability to catch panics, for reduced code size and better cache usage:

panic = "abort"

Please note that some libraries might depend on unwinding and will explode horribly if you enable this!
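To see what kind of code depends on unwinding, here is a minimal sketch built around std::panic::catch_unwind. The helper survives is a made-up name for illustration. With the default unwinding strategy it reports whether a closure panicked; with panic = "abort" the process ends at the panic and the helper never gets to return.

```rust
use std::panic;

/// Runs `f` and reports whether it finished without panicking.
/// This only works when panics unwind; under `panic = "abort"` the
/// process terminates at the panic instead. (Hypothetical helper.)
fn survives<F: FnOnce() + panic::UnwindSafe>(f: F) -> bool {
    panic::catch_unwind(f).is_ok()
}

fn main() {
    // Keep the demo output quiet by replacing the default panic message printer.
    panic::set_hook(Box::new(|_| {}));

    println!("well-behaved closure survived: {}", survives(|| {}));

    // cfg!(panic = "...") tells us which strategy this build uses.
    if cfg!(panic = "unwind") {
        println!(
            "panicking closure survived: {}",
            survives(|| {
                panic!("boom");
            })
        );
    } else {
        println!("built with panic = \"abort\": a panic would end the process here");
    }
}
```

Libraries that isolate failures this way (for example around plugin or request handlers) are the ones to watch out for before flipping the switch.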

Using a different allocator

One thing many Rust programs do is allocate memory. And they don’t just do this themselves but actually use an (external) library for that: an allocator. Current Rust binaries use the system allocator by default; previously, they included their own with the standard library. (This change has led to smaller binaries and better debuggability, which made some people quite happy.)

Sometimes your system’s allocator is not the best pick, though. Not to worry, we can change it! I suggest giving both jemalloc and mimalloc a try.
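Before swapping allocators, it can be useful to get a feeling for how allocation-heavy your program is. The sketch below uses only the standard library: it wraps the system allocator in a thin counting layer and installs it with the same #[global_allocator] attribute that the alternative allocators use. CountingAlloc, ALLOCATIONS and allocations_during are made-up names for illustration, not part of any crate.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::hint::black_box;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper: forwards everything to the system allocator
/// and counts how often `alloc` is called.
struct CountingAlloc;

static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Only one global allocator can be installed per binary.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Returns how many allocations happened (process-wide) while `f` ran.
fn allocations_during<F: FnOnce()>(f: F) -> usize {
    let before = ALLOCATIONS.load(Ordering::Relaxed);
    f();
    ALLOCATIONS.load(Ordering::Relaxed) - before
}

fn main() {
    let count = allocations_during(|| {
        let strings: Vec<String> = (0..1000).map(|i| i.to_string()).collect();
        black_box(strings);
    });
    println!("allocations while building 1000 strings: {count}");
}
```

If the hot paths barely allocate, a different allocator is unlikely to change much; if the count is high, the alternatives are more likely to be worth benchmarking.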

jemalloc

jemalloc is the allocator that Rust previously shipped with and that the Rust compiler still uses itself. Its focus is to reduce memory fragmentation and support high concurrency. It’s also the default allocator on FreeBSD. If this sounds interesting to you, let’s give it a try!

First off, add the jemallocator crate as a dependency:

[dependencies]
jemallocator = "0.3.2"

Then, in your application’s entry point (main.rs), set it as the global allocator like this:

#[global_allocator]
static GLOBAL: jemallocator::Jemalloc = jemallocator::Jemalloc;

Please note that jemalloc doesn’t support all platforms.

mimalloc

Another interesting alternative allocator is mimalloc. It was developed by Microsoft, has quite a small footprint, and has some innovative ideas for free lists.

It also features configurable security features (have a look at its Cargo.toml), which means we can turn them off for more performance! Add the mimalloc crate as a dependency like this:

[dependencies]
mimalloc = { version = "0.1.17", default-features = false }

and, same as above, add this to your entry point file:

#[global_allocator]
static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;

Profile Guided Optimization

This is a neat feature of LLVM, but I’ve never used it. Please read the docs.

Actual profiling and optimizing your code

Now this is where you need to actually adjust your code and fix all those clone() calls. Sadly, this is a topic for another post! (While you wait another year for me to write it, you can read about cows!)
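As a small taste of what "cows" means here: std::borrow::Cow lets a function hand back a borrowed value when nothing had to change and only allocate when it really needs to. A minimal sketch, with expand_tabs being a made-up example function:

```rust
use std::borrow::Cow;

/// Replaces each tab with four spaces.
/// Borrows the input unchanged when there is nothing to replace, so the
/// common case does not allocate. (Hypothetical example function.)
fn expand_tabs(input: &str) -> Cow<'_, str> {
    if input.contains('\t') {
        Cow::Owned(input.replace('\t', "    "))
    } else {
        Cow::Borrowed(input)
    }
}

fn main() {
    for line in ["no tabs here", "one\ttab"] {
        match expand_tabs(line) {
            Cow::Borrowed(s) => println!("borrowed: {s:?}"),
            Cow::Owned(s) => println!("owned:    {s:?}"),
        }
    }
}
```

Callers can treat the result like a &str either way, and only call into_owned() if they really need a String.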

Edit: People keep asking for those actual tips on how to optimize Rust code. And luckily ~~I tricked them~~ they had some good material for me to link to:

The very convenient cargo flamegraph (also works as a standalone tool)
Christopher Sebastian recently published How To Write Fast Rust Code
Robin Freyler’s Fastware Workshop from RustFest 2018
