Problem setup
Recently I've had to write some nom code (nom is a parser combinator library for Rust). To my surprise, I discovered that there is no combinator for creating a parser that always succeeds and returns a given value, at least not without using macros, which are discouraged in nom v5. That combinator would be something like pure in Haskell's Parsec. It's not very useful on its own, but it can be used as part of other combinators, for example to provide a default alternative for alt.
So I decided to add success to nom. After looking at the library code, I realised that it uses closures quite heavily, and I hadn't used them much in Rust, so I had some questions. Here is my version of success, basically copy-pasted from a similar combinator, value:
pub fn success<I: Clone + Slice<RangeTo<usize>>, O: Clone, E: ParseError<I>>(val: O) -> impl Fn(I) -> IResult<I, O, E>
{
    move |input: I| {
        Ok((input, val.clone()))
    }
}
That type signature looks a little scary, eh? However, since we are not going to focus on the I or E arguments here (the input and error types), we can rewrite it like this, omitting irrelevant details:
pub fn success<I, O: Clone, E>(val: O) -> impl Fn(I) -> IResult<I, O, E>
{
    move |input: I| {
        Ok((input, val.clone()))
    }
}
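To experiment with this simplified version without pulling in nom, here is a minimal self-contained sketch. The IResult alias below is only a stand-in that I am assuming has the same shape as nom's type (remaining input plus output on success), and () is used as a placeholder error type.

```rust
// Stand-in for nom's IResult (an assumption for this sketch):
// Ok carries (remaining input, produced value).
type IResult<I, O, E> = Result<(I, O), E>;

// Same shape as the simplified `success` above.
pub fn success<I, O: Clone, E>(val: O) -> impl Fn(I) -> IResult<I, O, E> {
    move |input: I| Ok((input, val.clone()))
}

fn main() {
    // `()` is a placeholder error type; this parser never fails anyway.
    let parser = success::<&str, i32, ()>(42);

    // The parser consumes no input and can be called repeatedly.
    assert_eq!(parser("abc"), Ok(("abc", 42)));
    assert_eq!(parser(""), Ok(("", 42)));
}
```

The second call is the interesting part: it only works because the returned closure is an Fn and hands out a clone of val each time.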
Questions
I had three questions here:
- Why do we need to clone val? After all, it looks like I have a value and just want to pass ownership to the parser, with no need to clone anything.
- Why do we have a move closure, but the return type of the function is impl Fn(something) and not impl FnOnce(something)? I thought that when we use move, we move the captured environment into the closure, and the FnOnce trait matches that behaviour.
- Can we omit move, change the type to FnOnce, or remove Clone, i.e. remove any of the things I didn't understand and still make it work? Are they actually necessary?
TL;DR
move determines how captured variables are moved into the returned closure. The returned impl Fn/FnMut/FnOnce then puts restrictions on how they are used inside that closure (which in turn defines whether the closure can be called once or more). We can move into a closure but still only use the captured values by reference, and return impl Fn to allow multiple calls of the returned closure. And yes, everything in the code above was necessary :)
More detailed answers
I assume here that you know the basics about closures. If not, you can read the corresponding chapter in the Rust book. On top of that, I would recommend reading Steven Donovan's post "Why Rust Closures are (Somewhat) Hard".
That post (and the Rust reference) tells you that a closure basically corresponds to an anonymous structure of some concrete but unknown type, with fields corresponding to the captured variables. The capture mode of those fields (i.e. whether they are &T, &mut T or T) is determined by how the captured variables are used inside the closure. Or it can be forced to be T, i.e. passing ownership to the closure, by using the move keyword.
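The capture modes are easier to see side by side. The sketch below is my own illustration (the function names are made up): each function builds a closure whose body uses its captured variable differently, plus one case where move forces by-value capture.

```rust
// The body only reads `name`, so the closure captures `&String`.
fn by_shared_ref() -> (usize, String) {
    let name = String::from("nom");
    let len = || name.len();
    let n = len();
    (n, name) // `name` was only borrowed, so we can still move it here
}

// The body mutates `count`, so the closure captures `&mut i32`.
fn by_mut_ref() -> i32 {
    let mut count = 0;
    let mut bump = || count += 1;
    bump();
    bump();
    count
}

// The body gives `owned` away, so the closure captures it by value
// (and is therefore only FnOnce), even without `move`.
fn by_value() -> usize {
    let owned = String::from("moved");
    let consume = || owned;
    consume().len()
}

// `move` forces by-value capture although the body only reads `data`,
// so the closure owns `data` and is still callable more than once.
fn forced_by_move() -> (i32, i32) {
    let data = vec![1, 2, 3];
    let sum = move || data.iter().sum::<i32>();
    (sum(), sum())
}

fn main() {
    assert_eq!(by_shared_ref(), (3, String::from("nom")));
    assert_eq!(by_mut_ref(), 2);
    assert_eq!(by_value(), 5);
    assert_eq!(forced_by_move(), (6, 6));
}
```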
I'll repeat the code above for your convenience:
pub fn success<I, O: Clone, E>(val: O) -> impl Fn(I) -> IResult<I, O, E>
{
    move |input: I| {
        Ok((input, val.clone()))
    }
}
So, in our case we implicitly have the following structure for our move closure. It captures only a single variable, val, of generic type O:
struct ClosureType<O> {
    // whether it's O, &O or &mut O is defined by
    // `val` usages in the closure and presence of `move` keyword
    val: O,
}
Then we know that this closure should implement the Fn trait, since this is what is returned from the success function. As described in the documentation, it will look like this (note the helpful comment):
// Simplified pseudocode: the real Fn traits cannot be implemented by hand on stable Rust.
impl<I, O: Clone, E> Fn<I> for ClosureType<O> {
    type Output = IResult<I, O, E>;
    // This `&self` here is because we implement Fn, not FnOnce!
    // In FnOnce it would be `self`, owning the structure fields.
    fn call(&self, input: I) -> Self::Output {
        Ok((input, self.val.clone()))
    }
}
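Since the Fn traits cannot be implemented manually on stable Rust, one way to play with this desugaring is to mimic it with ordinary methods. The sketch below is an approximation under that assumption: SuccessClosure, call and call_once are plain hand-written names that mirror the receiver types of Fn::call and FnOnce::call_once, and IResult is again a stand-in alias.

```rust
// Stand-in for nom's IResult (an assumption for this sketch).
type IResult<I, O, E> = Result<(I, O), E>;

// Hand-written stand-in for the compiler-generated closure type.
struct SuccessClosure<O> {
    val: O, // owned, as if captured by a `move` closure
}

impl<O: Clone> SuccessClosure<O> {
    // Mirrors `Fn::call`: we only get `&self`, so `val` must be cloned.
    fn call<I, E>(&self, input: I) -> IResult<I, O, E> {
        Ok((input, self.val.clone()))
    }

    // Mirrors `FnOnce::call_once`: we get `self`, so `val` can be moved out.
    fn call_once<I, E>(self, input: I) -> IResult<I, O, E> {
        Ok((input, self.val))
    }
}

fn main() {
    let parser = SuccessClosure { val: String::from("default") };

    // Calling through `&self` leaves the struct intact for the next call.
    let first: IResult<&str, String, ()> = parser.call("abc");
    let second: IResult<&str, String, ()> = parser.call("xyz");
    assert_eq!(first, Ok(("abc", String::from("default"))));
    assert_eq!(second, Ok(("xyz", String::from("default"))));

    // Calling through `self` consumes the struct.
    let last: IResult<&str, String, ()> = parser.call_once("end");
    assert_eq!(last, Ok(("end", String::from("default"))));
    // `parser` is gone here; another call would not compile.
}
```

Replacing self.val.clone() with self.val inside call is a quick way to see the "cannot move out of a borrowed value" kind of error that the clone avoids.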
Now we can answer question 1: why do we have to clone? If we didn't clone, we would be moving out of self.val, which is behind a reference. So we can't do that for Fn.
Another way of resolving this issue (i.e. if we don't want to clone) would be to use FnOnce as the result type, which would in effect give us self instead of &self in the call method, so we could pass ownership further with Ok((input, self.val)). However, using FnOnce means that we can use the resulting closure just once, and perhaps that's not what we want in a parser. Why? I don't know for sure, but I suspect that we may want to do certain lookaheads while parsing, which means invoking the closure and then backtracking if it doesn't match. But then we would have used up the parser closure and couldn't use it again for another run. So let's assume that our parsers should be Fns, which is indeed nom's API.
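For comparison, here is a sketch of what the FnOnce alternative could look like. success_once is a hypothetical name, not something nom provides, and IResult is a stand-in alias. Note that the Clone bound disappears, at the price of a parser that can run only once.

```rust
// Stand-in for nom's IResult (an assumption for this sketch).
type IResult<I, O, E> = Result<(I, O), E>;

// Hypothetical FnOnce variant: no `Clone` bound on `O`, because the
// closure body moves `val` out instead of cloning it.
pub fn success_once<I, O, E>(val: O) -> impl FnOnce(I) -> IResult<I, O, E> {
    move |input: I| Ok((input, val))
}

// A type that deliberately does not implement `Clone`.
#[derive(Debug, PartialEq)]
struct Token(u32);

fn main() {
    let parser = success_once::<&str, Token, ()>(Token(1));
    assert_eq!(parser("abc"), Ok(("abc", Token(1))));

    // A second call does not compile, because the first call consumed
    // the closure:
    // parser("abc");
}
```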
Question 2, how move can coexist with returning Fn, is also clear now. move determines how values are captured into that closure structure, i.e. which type they have there (refs, mutable refs or owned values), while the Fn/FnOnce/FnMut trait is determined by the way they are used in that closure (in our call method above). For example, we can move into a closure but still only use the captured values by reference, and return impl Fn to allow multiple calls of the returned closure.
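A small sketch of that distinction, with helper names (call_twice, call_one) that I made up for illustration: both closures below are move closures, yet only the usage of the captured value inside the body decides which trait bound they satisfy.

```rust
// Accepts only closures that can be called through a shared reference.
fn call_twice<F: Fn() -> usize>(f: F) -> (usize, usize) {
    (f(), f())
}

// Accepts any closure, but promises to call it at most once.
fn call_one<F: FnOnce() -> String>(f: F) -> String {
    f()
}

fn reads_only() -> (usize, usize) {
    let s = String::from("hello");
    // `s` is moved into the closure but only read there, so it is `Fn`.
    call_twice(move || s.len())
}

fn gives_away() -> String {
    let s = String::from("hello");
    // `s` is moved in and then moved out again, so this is only `FnOnce`;
    // passing this closure to `call_twice` would not compile.
    call_one(move || s)
}

fn main() {
    assert_eq!(reads_only(), (5, 5));
    assert_eq!(gives_away(), "hello");
}
```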
The remaining part, question 3, is why we want to use a move closure here if we only access the variable by reference anyway. The explanation is that we are returning this closure, but val will be dropped immediately on return, and the closure can't outlive it. In other words, if we didn't use move, we would have val: &'a O in that closure structure field. But that reference would become invalid as soon as we return our closure, because val is dropped at that point, so no references to it are allowed, including inside our returned closure. So that won't work, and we would get a "val does not live long enough" error message.
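The same situation can be reproduced outside of nom. In this sketch make_prefixer is a made-up function that returns a closure capturing one of its own parameters. With move it compiles and the closure happily outlives the call; deleting the move keyword should make the compiler reject it, since the closure would then borrow a local that is dropped on return.

```rust
// Hypothetical example: returns a closure that owns `prefix`.
fn make_prefixer(prefix: String) -> impl Fn(&str) -> String {
    // Without `move` the closure would store `&prefix`, a reference to a
    // local that is dropped when `make_prefixer` returns.
    move |s: &str| format!("{}{}", prefix, s)
}

fn main() {
    let re = make_prefixer(String::from("re"));

    // `prefix` lives inside the closure now, long after the call returned,
    // and since the body only reads it, the closure can be reused.
    assert_eq!(re("move"), "remove");
    assert_eq!(re("turn"), "return");
}
```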
Therefore, it looks like all the bells and whistles in that success combinator were actually needed.
Summary
I think the main conclusion of this investigation is that explicitly writing out implicit structures and Fn/FnMut/FnOnce implementations is very useful for making sense of compiler errors and understanding what's going on under the hood in the world of closures. At least for the first time.
Further reading
- Rust reference: "Closure expressions", "Closure types" (it even has a special note about the combination of move closures and Fn here).
- "Why Rust Closures are (Somewhat) Hard" blog post
Update: as /u/CUViper has suggested on Reddit, there is also a very nice blog post, "Closures: Magic Functions", which meticulously demonstrates the desugaring of various types of closures.