(A Few) Advanced Variable Types in Rust

Category: IT Technology · Published: 4 years ago


(A Few) Advanced Variable Types in Rust — Keep one eye on your code at all times!

“I haven’t seen Evil Dead II yet”. Much is made of this simple line in the movie adaptation of High Fidelity. Does “yet” mean the person does, indeed, intend to see the film? Jack Black’s character is having real trouble with the concept: not only does he know that the speaker, John Cusack’s character, has seen Evil Dead II, but what idiot wouldn’t see it, “because it’s a brilliant film. It’s so funny, and violent, and the soundtrack kicks so much ass.” I love this exchange, but I’m a fan of the film anyway. It is not always clear to me how to handle advanced variable types in Rust, yet.

I think of these as wrappers that add abilities (and restrictions) to a variable. They give a variable super powers since the Rust compiler is so strict about what you can and can’t do with variables.

Box<T>

PROVIDES:

Smart pointer that forces your variable’s value to be stored on the heap instead of the stack. The Box<> variable itself is just a pointer so its size is obvious and can, itself, be stored on the stack.

RESTRICTIONS:

USEFUL WHEN:

If the size of an item cannot be determined at compile time, the compiler will complain if the default is to store it on the stack (where a calculable size is necessary). Using Box<> will force the storage onto the heap, where a fixed size is not needed. For example, a recursive data structure, including an enum, will not work on the stack because a concrete size cannot be calculated. Turning the recursive field into a Box<> means it stores a pointer, which CAN be sized. The example in the docs being:

enum List<T> {
    Cons(T, Box<List<T>>),
    Nil,
}

Also useful if you have a very large T and want to transfer ownership of that variable without the data itself being copied each time; only the pointer moves.
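To see the recursive list in action, here is a minimal sketch. It assumes a version of the docs' List specialised to i32 for brevity; the helpers sum() and sample() are made up for this illustration and are not part of the standard library.

```rust
// The recursive list from the docs, specialised to i32 for brevity.
enum List {
    Cons(i32, Box<List>),
    Nil,
}

use List::{Cons, Nil};

// Hypothetical helper: walk the list recursively, adding up every element.
fn sum(list: &List) -> i32 {
    match list {
        Cons(value, rest) => value + sum(rest),
        Nil => 0,
    }
}

// Hypothetical helper: build 1 -> 2 -> 3 -> Nil; each tail lives on the heap.
fn sample() -> List {
    Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))))
}

fn main() {
    let list = sample();
    println!("Sum of the list is : {}", sum(&list));
}
```

Each Cons only holds an i32 and a pointer, so the compiler can work out its size no matter how long the list gets.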

NOTABLY PROVIDES:

just see the rust-lang.org docs

EXAMPLES/DISCUSSION:

https://doc.rust-lang.org/stable/rust-by-example/std/box.html

https://www.koderhq.com/tutorial/rust/smart-pointer/

https://manishearth.github.io/blog/2017/01/10/rust-tidbits-box-is-special/

Setting the value of a simple Box<> variable is easy enough and getting the value back looks very normal:

fn main() {
    let answer = Box::new(42);
    println!("The answer is : {}", answer);
}

Cell<T>

PROVIDES:

You can have multiple shared references to the Cell<> (and thus access to the value inside with .get()) and yet still mutate the value inside (with .set()). This is called interior mutability because the value inside can be changed without needing mut on the Cell<> itself. The inner value can only be set by calling a method on the Cell<>.

RESTRICTIONS:

It is not possible to get a shared reference to what is inside the Cell<>, only a copy of the value. Also, Cell<> does not implement Sync, so it cannot be shared between threads, which ensures safety.

USEFUL WHEN:

Usually used for small values, such as counters or flags, where you need multiple shared references to the value AND be allowed to mutate it at the same time, in a guaranteed safe way.
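As a rough sketch of the counter case: the Greeter type and its calls field below are invented for this example. The point is only that greet() takes &self, not &mut self, and can still bump the count.

```rust
use std::cell::Cell;

// Hypothetical type for illustration: counts how often it is used.
struct Greeter {
    calls: Cell<u32>,
}

impl Greeter {
    // Takes &self, not &mut self, yet still updates the counter.
    fn greet(&self) -> String {
        self.calls.set(self.calls.get() + 1);
        format!("Hello number {}", self.calls.get())
    }
}

// Calls greet() through two shared references and the owner itself.
fn count_after_three_greetings() -> u32 {
    let greeter = Greeter { calls: Cell::new(0) };
    let first = &greeter;
    let second = &greeter;
    first.greet();
    second.greet();
    greeter.greet();
    greeter.calls.get()
}

fn main() {
    println!("Greeted {} times", count_after_three_greetings());
}
```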

NOTABLY PROVIDES:

.set() to set the value inside

.get() to get a copy of the value inside (requires T to be Copy)

.take() to move the value out AND reset the value inside to its default.

see the rust-lang.org docs
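A small sketch of .take(), plus .replace(), which is also in the std docs but not listed above. The function names take_twice() and swap_in() are made up for the example. A String is used because it is not Copy, so .get() is not available on it but .take() still is.

```rust
use std::cell::Cell;

// String is not Copy, so .get() is unavailable; .take() still works.
fn take_twice() -> (String, String) {
    let name = Cell::new(String::from("Ash"));
    let first = name.take();  // moves "Ash" out, leaves String::default() behind
    let second = name.take(); // the Cell now only holds the empty default
    (first, second)
}

// .replace() stores a new value and hands back the old one.
fn swap_in(new_value: i32) -> (i32, i32) {
    let answer = Cell::new(42);
    let old = answer.replace(new_value);
    (old, answer.get())
}

fn main() {
    let (first, second) = take_twice();
    println!("First take: {:?}, second take: {:?}", first, second);
    let (old, current) = swap_in(7);
    println!("Replaced {} with {}", old, current);
}
```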

EXAMPLES/DISCUSSION:

https://hub.packtpub.com/shared-pointers-in-rust-challenges-solutions/

https://ricardomartins.cc/2016/06/08/interior-mutability

Setting the inner value of a Cell<> is only possible with a method call which is how it maintains safety:

use std::cell::Cell;
fn main() {
    let answer = Cell::new(0);
    answer.set(42);
    println!("The answer is : {}", answer.get());
}

RefCell<T>

PROVIDES:

RefCell<> is very similar to Cell<> except it adds borrow checking, but at run-time instead of compile time! This means, unlike Cell<>, it is possible to write RefCell<> code which will panic!(). You borrow() a ref to the inner value for read-only access or borrow_mut() in order to change it.

RESTRICTIONS:

borrow() will panic if a borrow_mut() is in place, and borrow_mut() will panic if a borrow of either type is in place.

USEFUL WHEN:

NOTABLY PROVIDES:

.borrow() to get a shared, read-only ref (a Ref<T> guard) to the value inside

.borrow_mut() to get an exclusive ref (a RefMut<T> guard) through which the value can be set

.try_borrow() and .try_borrow_mut() return a Result<> that is an Err instead of a panic!().

see the rust-lang.org docs
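A sketch of the non-panicking route with try_borrow_mut(). The helpers second_write_refused() and set_if_free() are invented names for this example, not anything from std.

```rust
use std::cell::RefCell;

// True when a second write borrow is refused while the first is still alive.
fn second_write_refused() -> bool {
    let answer = RefCell::new(0);
    let _first = answer.borrow_mut(); // holds the only allowed write borrow
    let refused = answer.try_borrow_mut().is_err(); // Err instead of a panic
    refused
}

// Sets the value only if nobody else currently holds a borrow.
fn set_if_free(cell: &RefCell<i32>, value: i32) -> bool {
    match cell.try_borrow_mut() {
        Ok(mut slot) => {
            *slot = value;
            true
        }
        Err(_) => false,
    }
}

fn main() {
    println!("Second write borrow refused: {}", second_write_refused());
    let answer = RefCell::new(0);
    let stored = set_if_free(&answer, 42);
    println!("Stored: {}, the answer is : {}", stored, answer.borrow());
}
```

Whether to panic or to handle the Err is a judgment call; the try_ versions seem the safer pick when the borrow might legitimately be held elsewhere.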

EXAMPLES/DISCUSSION:

https://ricardomartins.cc/2016/06/08/interior-mutability (again)

You must successfully borrow_mut() the RefCell<> in order to set the value (by dereferencing) and then simply borrow() it to retrieve the value:

use std::cell::RefCell;
fn main() {
    let answer = RefCell::new(0);
    *answer.borrow_mut() = 42;
    println!("The answer is : {}", answer.borrow());
}

Whereas something as simple as this compiles but panics at run-time. Imagine how much more obscure this code could be. Remember: any number of read-only references, or exactly 1 read-write reference and nothing else, although for RefCell<> this is enforced at run-time:

use std::cell::RefCell;
fn main() {
    let answer = RefCell::new(0);
    let break_things = answer.borrow_mut();
    println!("The initial value is : {}", *break_things);
    // Panics here: break_things still holds the read-write borrow.
    *answer.borrow_mut() = 42;
    println!("The answer is : {}", answer.borrow());
}

Rc<T>

PROVIDES:

Adds the feature of run-time reference counting to your variable, but this is the simple, lower-cost version – it is not thread safe.

RESTRICTIONS:

Right from the docs: “you cannot generally obtain a mutable reference to something inside an Rc. If you need mutability, put a Cell or RefCell inside the Rc”. So while there is a get_mut() method, it only succeeds when no other pointer to the value exists; it is usually easier to just put a Cell<> inside.
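Both options from that quote can be sketched briefly. The function names shared_update() and get_mut_results() are made up for this illustration; Rc::get_mut and Cell are the real std items.

```rust
use std::cell::Cell;
use std::rc::Rc;

// Two Rc handles share one Cell; either handle can change the value.
fn shared_update() -> i32 {
    let answer = Rc::new(Cell::new(0));
    let other = Rc::clone(&answer);
    other.set(42);
    answer.get()
}

// Rc::get_mut only succeeds while there is exactly one handle.
fn get_mut_results() -> (bool, bool) {
    let mut answer = Rc::new(0);
    let alone = Rc::get_mut(&mut answer).is_some();
    let other = Rc::clone(&answer);
    let shared = Rc::get_mut(&mut answer).is_some();
    drop(other);
    (alone, shared)
}

fn main() {
    println!("Value seen through the first handle : {}", shared_update());
    let (alone, shared) = get_mut_results();
    println!("get_mut worked alone: {}, while shared: {}", alone, shared);
}
```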

USEFUL WHEN:

You need run-time reference counting of a variable so it hangs around until the last reference of it is gone.

NOTABLY PROVIDES:

.clone() to get a new copy of the pointer to the same value, upping the reference count by 1.

see the rust-lang.org docs

EXAMPLES/DISCUSSION:

https://blog.sentry.io/2018/04/05/you-cant-rust-that#refcounts-are-not-dirty

Note that in the example below, my_answer is still pointing to valid memory even when correct_answer is dropped, because the Rc<> had an internal count of “2” and drops it to “1”, leaving the storage of “42” still valid.

use std::rc::Rc;
fn main() {
    let correct_answer = Rc::new(42);
    let my_answer = Rc::clone(&correct_answer);

    println!("The correct answer is : {}", correct_answer);
    drop(correct_answer);

    println!("And you got : {}", my_answer);
}
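The internal count can be made visible with Rc::strong_count(). This is a sketch along the same lines as the example above; the counts() function is just a made-up wrapper so the three readings can be returned together.

```rust
use std::rc::Rc;

// Records the strong count after each step: create, clone, drop.
fn counts() -> (usize, usize, usize) {
    let correct_answer = Rc::new(42);
    let after_new = Rc::strong_count(&correct_answer);

    let my_answer = Rc::clone(&correct_answer);
    let after_clone = Rc::strong_count(&correct_answer);

    drop(correct_answer);
    let after_drop = Rc::strong_count(&my_answer);

    (after_new, after_clone, after_drop)
}

fn main() {
    let (after_new, after_clone, after_drop) = counts();
    println!(
        "Count after new: {}, after clone: {}, after drop: {}",
        after_new, after_clone, after_drop
    );
}
```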

Arc<T>

PROVIDES:

Arc<> is an atomic reference counter, very similar to Rc<> above but thread-safe.

RESTRICTIONS:

More expensive than Rc<>. Also note that to share an Arc<T> across threads, the T you store must implement both Send and Sync. So an Arc<RefCell<T>> will not work across threads, because RefCell<> is not Sync.

USEFUL WHEN:

Same as Rc<>: you need run-time reference counting of a variable so it hangs around until the last reference to it is gone, but safe across threads as long as the inner <T> is.

NOTABLY PROVIDES:

see the rust-lang.org docs

EXAMPLES/DISCUSSION:

https://medium.com/@DylanKerler1/how-arc-works-in-rust-b06192acd0a6

Same idea as with Rc<> , we just show it working across multiple threads (and then sleep for just 10ms to let those threads finish).

use std::sync::Arc;
use std::thread;
use std::time::Duration;
fn main() {
    let answer = Arc::new(42);

    for threadno in 0..5 {
        let answer = Arc::clone(&answer);
        thread::spawn(move || {
            println!("Thread {}, answer is: {}", threadno + 1, answer);
        });
    }
    let ten_ms = Duration::from_millis(10);
    thread::sleep(ten_ms);
}
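As an alternative to sleeping, the threads can be joined, which waits exactly as long as needed. This is a sketch of that variation, not what the example above does; sum_across_threads() is a made-up helper that has each thread hand the shared value back so the results can be added up.

```rust
use std::sync::Arc;
use std::thread;

// Spawns `workers` threads that each read the shared value, then joins them all.
fn sum_across_threads(workers: usize) -> i32 {
    let answer = Arc::new(42);

    let handles: Vec<thread::JoinHandle<i32>> = (0..workers)
        .map(|_| {
            let answer = Arc::clone(&answer);
            // Each thread owns its own Arc handle and returns the value it sees.
            thread::spawn(move || *answer)
        })
        .collect();

    // join() blocks until the thread has finished and yields its return value.
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    println!("Five threads saw a total of : {}", sum_across_threads(5));
}
```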

Mutex<T>

PROVIDES:

Mutual exclusion lock protecting shared data, even across threads.

RESTRICTIONS:

Any thread which panics while holding the lock will “poison” the Mutex<>; after that, lock() returns an Err to every thread (the data can still be recovered from that error). The T stored must be Send, but Sync is not necessary.
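Poisoning is easier to believe once you see it. The sketch below deliberately panics in a thread that holds the lock; poison_and_recover() is an invented name, and recovering through PoisonError::into_inner() is just one way to deal with it, assuming you trust the data is still sane. Expect the panic message on stderr when it runs.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Poisons a mutex by panicking while holding the lock, then recovers the data.
fn poison_and_recover() -> (bool, i32) {
    let answer = Arc::new(Mutex::new(42));

    let worker = Arc::clone(&answer);
    let result = thread::spawn(move || {
        let _guard = worker.lock().unwrap();
        panic!("deliberate panic while holding the lock");
    })
    .join();
    assert!(result.is_err()); // the thread really did panic

    let poisoned = answer.is_poisoned();

    // lock() now returns Err, but the error still carries the guard.
    let value = match answer.lock() {
        Ok(guard) => *guard,
        Err(poison) => *poison.into_inner(),
    };
    (poisoned, value)
}

fn main() {
    let (poisoned, value) = poison_and_recover();
    println!("Poisoned: {}, recovered value: {}", poisoned, value);
}
```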

USEFUL WHEN:

working on it!

NOTABLY PROVIDES:

see the rust-lang.org docs

EXAMPLES/DISCUSSION:

https://doc.rust-lang.org/book/ch16-03-shared-state.html

use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;
fn main() {
    let answer = Arc::new(Mutex::new(42));

    for thread_no in 0..5 {
        let changer = Arc::clone(&answer);
        thread::spawn(move || {
            let mut changer = changer.lock().unwrap();
            println!("Setting answer to thread_no: {}", thread_no + 1,);
            *changer = thread_no + 1;
        });
    }
    let ten_ms = Duration::from_millis(10);
    thread::sleep(ten_ms);

    if answer.is_poisoned() {
        println!("Mutex was poisoned :(");
    } else {
        println!("Mutex survived :)");
        let final_answer = answer.lock().unwrap();
        println!("Ended with answer: {}", final_answer);
    }
}

RwLock<T>

PROVIDES:

Similar to RefCell<>, but thread safe: borrow() becomes read() and borrow_mut() becomes write(). Neither returns an Option; each blocks until it gets the lock and then returns a Result, which is an error only if the lock was poisoned.

RESTRICTIONS:

Any thread which panics while a write lock is in place will “poison” the RwLock<>, after which read() and write() return an Err to all threads. A panic!() during a read lock does not poison the RwLock<>. The T stored must allow both Send and Sync.
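The reader/writer rule can be poked at from a single thread using try_write(), which is in the std docs and returns immediately instead of blocking. A minimal sketch, with reader_blocks_writer() being a made-up function name:

```rust
use std::sync::RwLock;

// Returns (write refused while a reader exists, value after the later write).
fn reader_blocks_writer() -> (bool, i32) {
    let answer = RwLock::new(42);

    let reader = answer.read().unwrap();
    // try_write() does not block; it reports failure while `reader` is alive.
    let refused = answer.try_write().is_err();
    drop(reader);

    // With the read guard gone, the write lock is available.
    *answer.write().unwrap() = 7;
    let value = *answer.read().unwrap();
    (refused, value)
}

fn main() {
    let (refused, value) = reader_blocks_writer();
    println!("Write refused while reading: {}, final value: {}", refused, value);
}
```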

USEFUL WHEN:

working on it!

NOTABLY PROVIDES:

see the rust-lang.org docs

EXAMPLES/DISCUSSION:

Slightly fancier example, that shows getting both read() and write() locks on the value. If nothing panics, we should see the answer at the end.

use std::sync::{Arc, RwLock};
use std::thread;
use std::time::Duration;
fn main() {
    let answer = Arc::new(RwLock::new(42));

    for thread_no in 0..5 {
        if thread_no % 2 == 1 {
            let changer = Arc::clone(&answer);
            thread::spawn(move || {
                let mut changer = changer.write().unwrap();
                println!("Setting answer to thread_no: {}", thread_no + 1,);
                *changer = thread_no + 1;
            });
        } else {
            let reader = Arc::clone(&answer);
            thread::spawn(move || {
                let reader = reader.read().unwrap();
                println!(
                    "Checking  answer in thread_no: {}, value is {}",
                    thread_no + 1,
                    *reader
                );
            });
        }
    }
    let ten_ms = Duration::from_millis(10);
    thread::sleep(ten_ms);

    if answer.is_poisoned() {
        println!("RwLock was poisoned :(");
    } else {
        println!("RwLock survived :)");
        let final_answer = answer.read().unwrap();
        println!("Ended with answer: {}", final_answer);
    }
}

Output from one run (thread scheduling varies, so the order and final value can differ):

Checking answer in thread_no: 1, value is 42
Checking answer in thread_no: 3, value is 42
Setting answer to thread_no: 2
Checking answer in thread_no: 5, value is 2
Setting answer to thread_no: 4
RwLock survived :)
Ended with answer: 4

Summary

There are more, plus many custom types, some of which I’ve even used, like the crate once_cell. I started using that for the web app I was (am?) working on and wrote a little about it. Also, as you saw in the last two examples, you can combine types when you need multiple functionalities. I have included these examples in a GitHub repo, pointers.

I’ll probably hear about or (much more slowly) learn about mistakes I’ve made in wording here or come up with much better examples and excuses for using these various types, so I’ll try to update this post as I do. I see using this myself as a reference until I am really familiar with each of these types. Obviously, any mistakes here are mine alone as I learn Rust and not from any of the links or sources I listed!

Also, lots of help from 3 YouTubers I’ve been watching; the best examples can be seen as they write code and explain why they need something inside an Rc<> or in a Mutex<>. Check out their streams and watch over their shoulder as they code!!

Jon Gjengset especially, in this case, his video on this same topic!
David Pedersen
Ryan Levick who just had a video on once_cell!
