Understanding String and &str in Rust

Category: IT Technology · Published: 4 years ago

Most likely, soon after you started your Rust journey, you ran into a scenario where you tried to work with string types (or should I say, you thought you were?), and the compiler refused to compile your code because something that looks like a string actually isn’t one.

For example, let’s take a look at this super simple function greet(name: String), which takes something of type String and prints it to the screen using the println!() macro:

fn main() {
  let my_name = "Pascal";
  greet(my_name);
}

fn greet(name: String) {
  println!("Hello, {}!", name);
}

Compiling this code will result in a compile error that looks something like this:

error[E0308]: mismatched types
 --> src/main.rs:3:11
  |
3 |     greet(my_name);
  |           ^^^^^^^
  |           |
  |           expected struct `std::string::String`, found `&str`
  |           help: try using a conversion method: `my_name.to_string()`

error: aborting due to previous error

For more information about this error, try `rustc --explain E0308`.

You can see this behaviour in action here. Just hit the “Run” button and look at the compiler output.

Luckily, Rust’s compiler is very good at telling us what the problem is. Clearly, we’re dealing with two different types here: std::string::String, or String for short, and &str. While greet() expects a String, what we’re passing to the function is a &str. The compiler even provides a hint on how to fix it. Changing line 2 to let my_name = "Pascal".to_string(); fixes the issue, as does passing my_name.to_string() directly on line 3.
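
The compiler’s suggestion is one of several equivalent ways to get an owned String out of a &str. Here is a minimal sketch of the common ones; the helper name owned_variants is made up for illustration and is not part of the original example:

```rust
// Hypothetical helper: builds the same owned String in four different ways.
fn owned_variants(text: &str) -> [String; 4] {
    [
        text.to_string(),   // via ToString, as the compiler suggests
        String::from(text), // via From<&str> for String
        text.to_owned(),    // via ToOwned
        text.into(),        // via Into<String>; the target type comes from the return type
    ]
}

fn greet(name: String) {
    println!("Hello, {}!", name);
}

fn main() {
    let my_name = "Pascal"; // still a &str
    for name in owned_variants(my_name) {
        greet(name); // every element is a String, so this compiles
    }
}
```

All four allocate a new heap buffer and copy the text into it, so which one to use is mostly a matter of taste.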

What’s going on here? What is a &str? And why do we have to perform an explicit conversion using to_string()?

Understanding the String type

To answer these questions, it’s beneficial to have a good understanding of how Rust stores data in memory. If you haven’t read our article on Taking a closer look at Ownership in Rust yet, I highly recommend checking it out first.

Let’s take the example from above and look at how my_name is stored in memory, assuming that it’s of type String (i.e. we’ve used .to_string() as the compiler suggested):

                     buffer
                   /   capacity
                 /   /  length
               /   /   /
            +–––+–––+–––+
stack frame │ • │ 8 │ 6 │ <- my_name: String
            +–│–+–––+–––+
              │
            [–│–––––––– capacity –––––––––––]
              │
            +–V–+–––+–––+–––+–––+–––+–––+–––+
       heap │ P │ a │ s │ c │ a │ l │   │   │
            +–––+–––+–––+–––+–––+–––+–––+–––+

            [––––––– length ––––––––]

Rust will store the String object for my_name on the stack. The object consists of a pointer to a heap-allocated buffer that holds the actual data, the buffer’s capacity, and the length of the data currently stored. Given this, the size of the String object itself is always fixed and three words long.
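
The three fields can be inspected from code. This is a minimal sketch; string_parts is a hypothetical helper, and the three-word size is what current compilers produce rather than a documented guarantee. The exact capacity is an implementation detail too, so the 8 in the diagram should be read as illustrative.

```rust
use std::mem::size_of;

// Hypothetical helper: returns (capacity, length) of a String.
// It takes &String rather than &str because capacity() only exists on String.
fn string_parts(s: &String) -> (usize, usize) {
    (s.capacity(), s.len())
}

fn main() {
    let my_name = "Pascal".to_string();

    // Pointer + capacity + length: three machine words on the stack.
    assert_eq!(size_of::<String>(), 3 * size_of::<usize>());

    let (capacity, length) = string_parts(&my_name);
    println!("buffer   = {:p}", my_name.as_ptr());
    println!("capacity = {}", capacity);
    println!("length   = {}", length);

    // The length counts bytes, and the capacity is never smaller than it.
    assert_eq!(length, 6);
    assert!(capacity >= length);
}
```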

One of the things that makes a String a String is its ability to resize its buffer if needed. For example, we could use its .push_str() method to append more text, which potentially causes the underlying buffer to grow (notice that my_name needs to be mutable to make this work):

let mut my_name = "Pascal".to_string();
my_name.push_str(" Precht");

In fact, if you’re familiar with Rust’s Vec<T> type, you already know what a String is because it’s essentially the same in behaviour and characteristics, just with the difference that it comes with guarantees of only holding well-formed UTF-8 text.
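
To make the relationship with Vec<T> concrete, here is a small sketch that moves a String into its underlying byte vector and back again. The helper is_valid_utf8 is a made-up name for illustration; into_bytes and String::from_utf8 are standard library functions.

```rust
// Hypothetical helper: true if the bytes could be turned into a String.
fn is_valid_utf8(bytes: Vec<u8>) -> bool {
    String::from_utf8(bytes).is_ok()
}

fn main() {
    let mut my_name = "Pascal".to_string();
    my_name.push_str(" Precht");

    // The buffer may have been reallocated, but it always fits the text.
    assert!(my_name.capacity() >= my_name.len());

    // A String can hand over its buffer as a plain Vec<u8> ...
    let bytes: Vec<u8> = my_name.into_bytes();
    assert_eq!(bytes.len(), 13);

    // ... and the way back checks that the bytes are well-formed UTF-8.
    let round_trip = String::from_utf8(bytes).expect("bytes came from a String");
    assert_eq!(round_trip, "Pascal Precht");

    // Arbitrary bytes are rejected: 0xff never appears in valid UTF-8.
    assert!(!is_valid_utf8(vec![0xff, 0xfe]));
}
```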

Understanding string slices

String slices (or str) are what we work with when we either reference a range of UTF-8 text that is “owned” by someone else, or when we create them using string literals.

If we’re only interested in the last name stored in my_name, we can get a reference to that part of the string like this:

let mut my_name = "Pascal".to_string();
my_name.push_str(" Precht");

let last_name = &my_name[7..];

By specifying the range from byte offset 7 (offsets start at 0, and the space sits at offset 6) until the end of the buffer (“..”), last_name is now a string slice referencing text owned by my_name. It borrows it. Here’s what it looks like in memory:

            my_name: String   last_name: &str
            [––––––––––––]    [–––––––]
            +–––+––––+––––+–––+–––+–––+
stack frame │ • │ 16 │ 13 │   │ • │ 6 │ 
            +–│–+––––+––––+–––+–│–+–––+
              │                 │
              │                 +–––––––––+
              │                           │
              │                           │
              │                         [–│––––––– str –––––––––]
            +–V–+–––+–––+–––+–––+–––+–––+–V–+–––+–––+–––+–––+–––+–––+–––+–––+
       heap │ P │ a │ s │ c │ a │ l │   │ P │ r │ e │ c │ h │ t │   │   │   │
            +–––+–––+–––+–––+–––+–––+–––+–––+–––+–––+–––+–––+–––+–––+–––+–––+

Notice that last_name does not store capacity information on the stack. This is because it’s just a reference to a slice of another String, which manages its own capacity. The string slice itself, str, is what’s considered “unsized”: its length isn’t known at compile time. Also, in practice string slices are almost always used behind a reference, so the type you’ll see is &str rather than str.
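
The layout in the diagram can be checked from code. In this sketch, offset_within is a hypothetical helper written for the illustration; the two-word size of &str is what current compilers produce.

```rust
use std::mem::size_of;

// Hypothetical helper: byte offset of `slice` inside `owner`'s buffer.
// Only meaningful when `slice` really was borrowed from `owner`.
fn offset_within(owner: &str, slice: &str) -> usize {
    slice.as_ptr() as usize - owner.as_ptr() as usize
}

fn main() {
    let mut my_name = "Pascal".to_string();
    my_name.push_str(" Precht");

    let last_name = &my_name[7..];

    // A &str is a pointer plus a length: two machine words, no capacity.
    assert_eq!(size_of::<&str>(), 2 * size_of::<usize>());

    // The slice points into my_name's buffer, 7 bytes past its start.
    assert_eq!(offset_within(&my_name, last_name), 7);
    assert_eq!(last_name.len(), 6);
    assert_eq!(last_name, "Precht");

    // Ranges are byte offsets and must fall on character boundaries.
    // `get` returns None where indexing with [] would panic.
    let accented = "Zo\u{eb}"; // 'Z', 'o', then a two-byte character
    assert!(accented.get(0..3).is_none());
    assert_eq!(accented.get(0..2), Some("Zo"));
}
```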

Okay, this explains the difference between String, &String, str and &str, but we haven’t actually created such a reference in our original example, have we?

Understanding string literals

As mentioned earlier, there are two cases in which we’re working with string slices: we either create a reference to a substring, or we use string literals.

A string literal is created by surrounding text with double quotes, just like we did earlier:

let my_name = "Pascal Precht"; // This is a `&str` not a `String`

The next question is, if a &str is a slice reference to a String owned by someone else, who is the owner of that value given that the text is created in place?

It turns out that string literals are a bit special. They are string slices that refer to “preallocated text” that is stored in read-only memory as part of the executable. In other words, it’s memory that ships with our program and doesn’t rely on buffers allocated in the heap.

That said, there’s still an entry on the stack that points to that preallocated memory when the program is executed. For the literal "Pascal" from our original example, it looks like this:

            my_name: &str
            [–––––––––––]
            +–––+–––+
stack frame │ • │ 6 │ 
            +–│–+–––+
              │                 
              +––+                
                 │
 preallocated  +–V–+–––+–––+–––+–––+–––+
 read-only     │ P │ a │ s │ c │ a │ l │
 memory        +–––+–––+–––+–––+–––+–––+
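
Because the text ships with the executable, the full type of a string literal is &'static str: the reference stays valid for the entire run of the program. A minimal sketch, in which default_name is a hypothetical function:

```rust
// A literal is a &'static str, so it can be returned from a function
// even though nothing in the program "owns" the text.
fn default_name() -> &'static str {
    "Pascal"
}

fn main() {
    let my_name: &'static str = default_name();
    assert_eq!(my_name.len(), 6);

    // Copying a &str copies only the pointer and the length, never the text.
    let alias = my_name;
    assert_eq!(alias.as_ptr(), my_name.as_ptr());

    println!("{} / {}", my_name, alias);
}
```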

With a better understanding of the difference between String and &str , there’s probably another question that comes up.

Which one should be used?

Obviously, this depends on a number of variables, but generally, it’s safe to say that if the API we’re building doesn’t need to own or mutate the text it’s working with, it should take a &str instead of a String. This means an improved version of the original greet() function would look like this:

fn greet(name: &str) {
  println!("Hello, {}!", name);
}

Wait, but what if the caller of this API only has a String? No problem at all. Rust has a powerful feature called deref coercion, which lets it turn a reference to a String, created with the borrow operator as &String, into a &str before the function is called. This will be covered in more detail in another article.

Our greet() function therefore will work with the following code:

fn main() {
  let first_name = "Pascal";
  let last_name = "Precht".to_string();

  greet(first_name);
  greet(&last_name); // `last_name` is passed by reference
}

fn greet(name: &str) {
  println!("Hello, {}!", name);
}
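
For comparison, here is a sketch of the explicit conversions that deref coercion saves us from writing. The greeting function is a hypothetical variant of greet() that returns the text instead of printing it, so that the results can be compared:

```rust
// Hypothetical variant of greet() that returns the greeting.
fn greeting(name: &str) -> String {
    format!("Hello, {}!", name)
}

fn main() {
    let last_name = "Precht".to_string();

    let coerced = greeting(&last_name);          // &String coerced to &str
    let explicit = greeting(last_name.as_str()); // explicit conversion
    let sliced = greeting(&last_name[..]);       // slice over the full range

    assert_eq!(coerced, explicit);
    assert_eq!(explicit, sliced);

    // The String was only borrowed, so it is still usable here.
    println!("{} ({} bytes in the name)", coerced, last_name.len());
}
```

All three calls pass the same pointer and length to the function; none of them copies the text.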

See it in action here!

That’s it! I hope this article was useful. There’s an interesting discussion on Reddit about this content as well! Let me know what you think or what you would like to learn about next on Twitter, or sign up for the Rust For JavaScript Developers mailing list!

