Cheap tricks for high-performance Rust


So you’re writing Rust but it’s not fast enough? Even though you’re using cargo build --release? Here are some small things you can do to increase the runtime speed of a Rust project – practically without changing any code!

Please remember that the following suggestions do not replace actual profiling and optimizations! I also think it goes without saying that the only way to detect if any of this helps is having benchmarks that represent how your application behaves under real usage.

Tweaking our release profile

Let’s first of all enable some more optimizations for when we do cargo build --release. The deal is pretty simple: we enable some features that make release builds even slower to compile, but get more thorough optimizations as a reward.

We add the flags described below to our main Cargo.toml file, i.e., the topmost manifest file in case you are using a Cargo workspace. If you don’t already have a section called profile.release, add it:

[profile.release]

Link-time optimization

The first thing we’ll do is enable link-time optimization (LTO). It’s a kind of whole-program or inter-module optimization, as it runs as the very last step when linking the different parts of your binary together. You can think of it as allowing better inlining across dependency boundaries (but it’s of course more complicated than that).

Rust supports multiple LTO modes, and the one we want is “optimize across all crates”, which is called “fat”. To set this, add the lto flag to your profile:

lto = "fat"
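If “fat” LTO makes your release builds unbearably slow, there is also a cheaper variant worth knowing about. This is a sketch of that alternative setting, not something the post above prescribes:

```toml
[profile.release]
# "thin" LTO optimizes across smaller units in parallel;
# it links much faster than "fat" and often recovers most of the benefit.
lto = "thin"
```

It can be worth benchmarking both modes before committing to the slower one.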

Code generation units

Next up is a similar topic. To speed up compile times, Rust tries to split your crates into small chunks and compile as many in parallel as possible. The downside is that there are fewer opportunities for the compiler to optimize code across these chunks. So, let’s tell it to use one chunk per crate:

codegen-units = 1

Setting a specific target CPU

By default, Rust wants to build a binary that works on as many machines of the target architecture as possible. However, you might actually have a pretty new CPU with cool new features! To enable those, we add

-C target-cpu=native

as a “Rust flag”, i.e. via the environment variable RUSTFLAGS or the target’s rustflags field in your .cargo/config.
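For example, using a project-local Cargo config file (newer Cargo versions also accept the name .cargo/config.toml), this could look like the following sketch:

```toml
# .cargo/config.toml
[build]
# Optimize for the CPU of the machine we compile on.
# Caveat: the resulting binary may not run on older CPUs!
rustflags = ["-C", "target-cpu=native"]
```

Alternatively, set it just for one build: RUSTFLAGS="-C target-cpu=native" cargo build --release.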

Aborting

Now we get into some of the more unsafe options. Remember how Rust by default uses stack unwinding (on the most common platforms)? That costs performance! Let’s give up unwinding (and with it the ability to catch panics) in exchange for reduced code size and better cache usage:

panic = "abort"

Please note that some libraries might depend on unwinding and will explode horribly if you enable this!
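Putting the flags from this section together, the release profile in the top-level Cargo.toml would look like this (leave out panic = "abort" if anything in your dependency tree needs unwinding):

```toml
[profile.release]
lto = "fat"          # optimize across all crates at link time
codegen-units = 1    # one codegen unit per crate for maximum optimization
panic = "abort"      # skip unwinding; panics terminate the process
```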

Using a different allocator

One thing many Rust programs do is allocate memory. And they don’t just do this themselves but actually use an (external) library for that: an allocator. Current Rust binaries use the system allocator by default; previously, they shipped their own with the standard library. (This change has led to smaller binaries and better debuggability, which made some people quite happy.)

Sometimes your system’s allocator is not the best pick, though. Not to worry, we can change it! I suggest giving both jemalloc and mimalloc a try.

jemalloc

jemalloc is the allocator that Rust previously shipped with and that the Rust compiler still uses itself. Its focus is to reduce memory fragmentation and support high concurrency. It’s also the default allocator on FreeBSD. If this sounds interesting to you, let’s give it a try!

First off, add the jemallocator crate as a dependency:

[dependencies]
jemallocator = "0.3.2"

Then, in your application’s entry point (main.rs), set it as the global allocator like this:

#[global_allocator]
static GLOBAL: jemallocator::Jemalloc = jemallocator::Jemalloc;

Please note that jemalloc doesn’t support all platforms.

mimalloc

Another interesting alternative allocator is mimalloc . It was developed by Microsoft, has quite a small footprint, and some innovative ideas for free lists.

It also features configurable security measures (have a look at its Cargo.toml), which means we can turn them off for more performance! Add the mimalloc crate as a dependency like this:

[dependencies]
mimalloc = { version = "0.1.17", default-features = false }

and, same as above, add this to your entry point file:

#[global_allocator]
static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;

Profile Guided Optimization

This is a neat feature of LLVM but I’ve never used it. Please read the docs .

Actual profiling and optimizing your code

Now this is where you need to actually adjust your code and fix all those clone() calls. Sadly, this is a topic for another post! (While you wait another year for me to write it, you can read about cows!)
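As a small teaser: one common way to avoid unnecessary clone() calls is std::borrow::Cow, which lets a function allocate only when it actually has to change something. Here is a minimal sketch (replace_spaces is a made-up example function, not from the post):

```rust
use std::borrow::Cow;

// Returns the input as-is (borrowed, no allocation) unless it
// contains spaces, in which case we allocate a new String once.
fn replace_spaces(input: &str) -> Cow<'_, str> {
    if input.contains(' ') {
        Cow::Owned(input.replace(' ', "_"))
    } else {
        Cow::Borrowed(input)
    }
}

fn main() {
    // The common case borrows and allocates nothing.
    assert!(matches!(replace_spaces("fast"), Cow::Borrowed(_)));
    // Only the case that needs modification pays for an allocation.
    assert_eq!(replace_spaces("hello world"), "hello_world");
    println!("ok");
}
```

The caller never needs to know which case happened; Cow dereferences to &str either way.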

Edit: People keep asking for those actual tips on how to optimize Rust code. And luckily (I tricked them) they had some good material for me to link to:

