Rust: Fearless Concurrency
Rust was designed to solve two thorny problems:
1. How to do safe systems programming
2. How to make concurrency fearless
Initially, these problems seemed unrelated, but surprisingly, the solution turned out to be the same: the same tools that make Rust safe also help you tackle concurrency problems.
Memory safety bugs and concurrency bugs generally come down to code accessing data when it should not. Rust relies on ownership, a discipline for access control that the compiler checks statically for you.
For memory safety, this means you can program without a garbage collector and without worrying about segmentation faults, because Rust will catch your mistakes.
For concurrency, this means you can choose from a range of paradigms (message passing, shared state, lock-free, purely functional), and Rust helps you avoid the common pitfalls.
Here is a taste of concurrency in Rust:
1. A channel transfers ownership of the messages sent along it, so you can send a pointer from one thread to another without worrying about the threads later racing for access through it. Rust's channels enforce thread isolation.
2. A lock knows what data it protects, and Rust guarantees that the data can be accessed only while the lock is held. State is never shared accidentally. "Lock data, not code" is enforced in Rust.
3. Each data type knows whether it can safely be sent between or accessed by multiple threads, and Rust enforces this, so there are no data races even for lock-free data structures. Thread safety is not just documentation; it is a rule.
4. Stack frames can even be shared between threads, and Rust statically ensures that the frames are still active while other threads use them. Even the boldest forms of sharing are safe in Rust.
These benefits come from Rust's ownership model. In fact, locks, channels, lock-free data structures, and so on are all defined in libraries rather than in the core language.
That is, Rust's approach to concurrency is open-ended: new libraries can adopt new paradigms and catch new errors just by adding APIs that use Rust's ownership features.
Background: Ownership
In Rust, every value has an "owning scope", and passing or returning a value means transferring ownership ("moving" it) from the old owner to a new scope. When a scope ends, any values it still owns are destroyed automatically.
Take a look at a simple example. Suppose you create a vector and push some elements onto it:
fn make_vec() {
    let mut vec = Vec::new(); // owned by the scope of `make_vec`
    vec.push(0);
    vec.push(1);
    // scope ends, `vec` is destroyed
}
The scope that creates a value also initially owns it. Here, the body of make_vec is the owning scope of vec. The owner can do whatever it likes with vec.
When the scope ends, vec is still owned by it, so it is deallocated automatically.
It gets more interesting if the vector is returned or passed around:
fn make_vec() -> Vec<i32> {
    let mut vec = Vec::new();
    vec.push(0);
    vec.push(1);
    vec // transfer ownership to the caller
}

fn print_vec(vec: Vec<i32>) {
    // the `vec` parameter is part of this scope, so it is owned by `print_vec`
    for i in vec.iter() {
        println!("{}", i)
    }
    // now `vec` is deallocated
}

fn use_vec() {
    let vec = make_vec(); // take ownership of the vector
    print_vec(vec);       // pass ownership to `print_vec`
}
Now, just before the scope of make_vec ends, vec is moved out of it by being returned; it is not destroyed. A caller like use_vec then takes ownership of the vector.
On the other hand, the print_vec function takes a vec parameter, and its caller transfers ownership of the vector to it. Because print_vec does not transfer the ownership any further, the vector is destroyed when its scope ends.
Once you give up ownership, you can no longer use the value. For example, consider this variant of use_vec:
fn use_vec() {
    let vec = make_vec(); // take ownership of the vector
    print_vec(vec);       // pass ownership to `print_vec`
    for i in vec.iter() { // continue using `vec`
        println!("{}", i * 2)
    }
}
The compiler reports that vec is no longer available; ownership has been transferred. That is a good thing, because the vector has already been deallocated at this point! Disaster averted.
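One way to keep using a value after passing it to a function, relying on ownership alone, is to have the callee hand the value back. The sketch below is only an illustration: print_and_return and use_vec_round_trip are hypothetical names that do not appear in the original text.

```rust
/// Hypothetical helper: takes ownership of a vector, prints it, and returns it.
fn print_and_return(vec: Vec<i32>) -> Vec<i32> {
    for i in vec.iter() {
        println!("{}", i);
    }
    vec // ownership moves back to the caller
}

/// Hypothetical caller: doubles every element after the round trip.
fn use_vec_round_trip() -> Vec<i32> {
    let vec = vec![0, 1];
    let vec = print_and_return(vec); // rebind the returned vector
    vec.iter().map(|i| i * 2).collect()
}
```

This compiles, but threading ownership in and out of every call quickly becomes tedious.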
Borrowing
So far this is not satisfying, because there was never any intention of letting print_vec destroy the vector. What is really wanted is to grant print_vec temporary access to the vector and then continue using the vector afterwards.
This is where borrowing comes in. If you have access to a value in Rust, you can lend that access to the functions you call. Rust checks that these loans do not outlive the object being borrowed.
To borrow a value, you make a reference to it (a kind of pointer) using the & operator:
fn print_vec(vec: &Vec<i32>) {
    // the `vec` parameter is borrowed for this scope
    for i in vec.iter() {
        println!("{}", i)
    }
    // now, the borrow ends
}

fn use_vec() {
    let vec = make_vec(); // take ownership of the vector
    print_vec(&vec);      // lend access to `print_vec`
    for i in vec.iter() { // continue using `vec`
        println!("{}", i * 2)
    }
    // `vec` is destroyed here
}
Now print_vec takes a reference to a vector, and use_vec lends out the vector by writing &vec. Because the borrow is temporary, use_vec retains ownership of the vector; it can continue using it after the call to print_vec returns.
Each reference is valid for a limited scope, which the compiler determines automatically. References come in two forms:
1. Immutable reference &T, which allows sharing but prohibits mutation. There can be multiple &T references to the same value at the same time, but the value cannot be changed while these references are active.
2. Mutable reference &mut T, which allows mutation but not sharing. If a &mut T reference to a value exists, there can be no other active references to it at that time, but the value can be changed.
Rust checks these rules at compile time; borrowing has no runtime cost.
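The two rules can be seen side by side in a small sketch. The function name share_then_mutate is made up for this illustration; the point is only that several shared borrows may coexist, while a mutable borrow must be the sole active reference.

```rust
/// Reads a vector through two simultaneous shared borrows, then mutates it
/// through a single mutable borrow once the shared borrows are no longer used.
fn share_then_mutate() -> (i32, Vec<i32>) {
    let mut vec = vec![1, 2, 3];

    let a = &vec; // shared borrow
    let b = &vec; // a second shared borrow is allowed
    let total = a.iter().sum::<i32>() + b.len() as i32;
    // `a` and `b` are not used after this point, so their borrows end here

    let m = &mut vec; // mutable borrow; no other reference is active
    m.push(4);

    (total, vec)
}
```

Moving the line that creates m above the line that computes total would be rejected at compile time, because a and b would still be active.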
Why are there two kinds of references? Consider this function:
fn push_all(from: &Vec<i32>, to: &mut Vec<i32>) {
    for i in from.iter() {
        to.push(*i);
    }
}
This function iterates over each element of one vector and pushes it onto another. The iterator keeps pointers into the vector at the current and final positions, stepping from one toward the other.
What if this function were called with the same vector for both arguments?
push_all(&vec, &mut vec)
This would be a disaster! As elements are pushed onto the vector, it will occasionally need to resize, allocating a new, larger block of memory and copying its elements into it. The iterator would be left with a dangling pointer into the old memory, leading to memory unsafety (a segmentation fault, or worse).
Fortunately, Rust ensures that whenever a mutable borrow is active, no other borrows of the same object are active, and it produces the following message:
Error: cannot borrow "vec" as mutable because it is also borrowed as immutable
push_all(&vec, &mut vec);
                    ^~~
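If the intent really is to append a vector to itself, one way to satisfy the borrow checker is to copy the source first so that the two borrows never overlap. This is a sketch under that assumption; double_up is a hypothetical name, and push_all is repeated so the block stands on its own.

```rust
fn push_all(from: &Vec<i32>, to: &mut Vec<i32>) {
    for i in from.iter() {
        to.push(*i);
    }
}

/// Hypothetical helper: appends a vector's current contents to itself.
fn double_up(vec: &mut Vec<i32>) {
    // An independent copy: the iterator inside `push_all` now points into
    // `snapshot`, which is never resized while it is being read.
    let snapshot = vec.clone();
    push_all(&snapshot, vec);
}
```

The copy costs an extra allocation. For this particular self-append, the standard library's Vec::extend_from_within is likely a better fit, since it avoids the temporary vector.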
Message passing
Concurrent programming comes in many styles, and a particularly simple one is message passing, in which threads or actors communicate by sending each other messages.
Do not communicate by sharing memory; instead, share memory by communicating.
Rust's ownership makes it easy to turn this advice into a compiler-checked rule. Consider the following channel API (channels in the Rust standard library are slightly different):
fn send<T: Send>(chan: &Channel<T>, t: T);
fn recv<T: Send>(chan: &Channel<T>) -> T;
Channels are generic over the type of data they transmit (the <T: Send> part of the API). The Send part means that T must be safe to send between threads; a type such as Vec<i32> is Send.
As always in Rust, passing a T to the send function means transferring ownership of it.
This fact has far-reaching consequences: it means that the following code generates a compiler error.
// Assume chan: Channel<Vec<i32>>
let mut vec = Vec::new();
// do some computation
send(&chan, vec);
print_vec(&vec);
Here, the thread creates a vector, sends it to another thread, and then continues to use it. The thread receiving the vector could mutate it while this thread keeps running, so the call to print_vec could lead to a race condition or, for that matter, a use-after-free error.
Instead, the Rust compiler produces an error message at the call to print_vec:
Error: use of moved value: "vec"
Disaster averted.
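For comparison, here is a minimal sketch using the standard library's own channels in std::sync::mpsc. Unlike the simplified API above, send and recv are methods on separate Sender and Receiver halves and return a Result. The function name sum_on_worker is made up for this illustration.

```rust
use std::sync::mpsc;
use std::thread;

/// Sends a vector to a worker thread and returns the sum the worker computes.
fn sum_on_worker(data: Vec<i32>) -> i32 {
    let (tx, rx) = mpsc::channel::<Vec<i32>>();

    // The receiving half is moved into the worker thread.
    let worker = thread::spawn(move || {
        let received = rx.recv().expect("sender was dropped before sending");
        received.iter().sum::<i32>()
    });

    tx.send(data).expect("receiver was dropped");
    // `data` has been moved into the channel; using it here would not compile.

    worker.join().expect("worker thread panicked")
}
```

Calling sum_on_worker(vec![1, 2, 3]) yields 6. Adding a use of data after the send line brings back the "use of moved value" error.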
Locks
Locks are another way for threads to communicate: passively, through shared state.
Shared-state concurrency has a bad reputation. It is easy to forget to acquire a lock, or to mutate the wrong data at the wrong time, with disastrous results.
Rust's take is:
1. Shared-state concurrency is nevertheless a fundamental programming style, needed for systems code, for maximal performance, and for implementing other styles of concurrency.
2. The problem is really about accidentally shared state.
Whether you use locking or lock-free techniques, Rust aims to give you the tools to conquer shared-state concurrency directly.
In Rust, threads are automatically "isolated" from each other because of ownership. A write can happen only when the thread has mutable access, either by owning the data or by holding a mutable borrow of it. Either way, the thread is guaranteed to be the only one with access at that time.
Remember that you cannot have a mutable borrow and other borrows at the same time. Locks provide the same guarantee ("mutual exclusion") through synchronization at run time. This leads to a locking API that hooks directly into Rust's ownership system.
Here is a simplified version:
// create a new mutex
fn mutex<T: Send>(t: T) -> Mutex<T>;

// acquire the lock
fn lock<T: Send>(mutex: &Mutex<T>) -> MutexGuard<T>;

// access the data protected by the lock
fn access<T: Send>(guard: &mut MutexGuard<T>) -> &mut T;
This lock API is unusual in several respects.
1. First, the Mutex type is generic over the type T of the data protected by the lock. When you create a Mutex, you transfer ownership of that data into the mutex, immediately giving up access to it. (Locks are unlocked when first created.)
2. Later, you can call lock to block the thread until the lock is acquired. The lock is released automatically when the MutexGuard is destroyed; there is no separate unlock function.
3. The only way to access the data is through the access function, which turns a mutable borrow of the guard into a mutable borrow of the data (with a shorter lease):
fn use_lock(mutex: &Mutex<Vec<i32>>) {
    // acquire the lock, taking ownership of a guard;
    // the lock is held for the rest of the scope
    let mut guard = lock(mutex);

    // access the data by mutably borrowing the guard
    let vec = access(&mut guard);

    // `vec` has type `&mut Vec<i32>`
    vec.push(3);

    // the lock is released automatically here, when `guard` is destroyed
}
There are two key ingredients here:
1. The mutable reference returned by access cannot outlive the MutexGuard it is borrowed from.
2. The lock is released only when the MutexGuard is destroyed.
The result is that Rust enforces locking discipline: it does not let you access lock-protected data except while holding the lock. Any attempt to do otherwise generates a compiler error. For example, consider the following flawed "refactoring":
fn use_lock(mutex: &Mutex<Vec<i32>>) {
    let vec = {
        // acquire the lock
        let mut guard = lock(mutex);

        // attempt to return a borrow of the data
        access(&mut guard)

        // `guard` is destroyed here, releasing the lock
    };

    // attempt to access the data outside of the lock
    vec.push(3);
}
Rust generates an error pinpointing the problem:
Error: "guard" does not live long enough
access(&mut guard)
            ^~~~~
Disaster averted.
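The standard library's std::sync::Mutex follows the same shape, with two differences worth noting: lock is a method that returns a Result (to report poisoning), and the guard dereferences to the data directly instead of going through a separate access function. A minimal sketch, with push_locked as a made-up name:

```rust
use std::sync::Mutex;

/// Pushes a value while holding the lock and returns the new length.
fn push_locked(mutex: &Mutex<Vec<i32>>, value: i32) -> usize {
    // `lock` blocks until the mutex is free and returns a guard.
    let mut guard = mutex.lock().expect("mutex was poisoned");

    // The guard dereferences to the protected Vec.
    guard.push(value);
    guard.len()
    // `guard` is destroyed here, which releases the lock.
}
```

With a mutex created by Mutex::new(vec![1, 2]), calling push_locked(&m, 3) returns 3 and leaves the lock free for the next caller.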
Thread safety and "Send"
It is typical to distinguish some data types as "thread safe" while others are not. Thread-safe data structures use enough internal synchronization to be used safely by multiple threads at the same time.
For example, Rust ships with two kinds of "smart pointers" for reference counting:
1. Rc provides reference counting through normal reads and writes. It is not thread safe.
2. Arc provides reference counting through atomic operations. It is thread safe.
The hardware atomic operations used by Arc are more expensive than the ordinary operations used by Rc, so it is advantageous to use Rc rather than Arc where possible. On the other hand, it is critical that an Rc never migrates from one thread to another, because that could lead to race conditions that corrupt the reference count.
In Rust, the world is divided into two kinds of data types: those that are Send, meaning they can be safely moved from one thread to another, and those that are !Send (not safe to move).
If all of a type's components are Send, then that type is Send too, which covers most types. Some basic types are not inherently thread safe, however, so types like Arc can also be marked as Send explicitly, telling the compiler: trust me; the necessary synchronization has been verified here.
Of course, Arc is Send (as long as the data it holds is itself thread safe), but Rc is not.
As seen above, the Channel and Mutex APIs work only with Send data. Because they are the points at which data crosses thread boundaries, they are also the points at which Send is enforced.
In summary, Rust programmers can enjoy the benefits of Rc and other thread-unsafe types with confidence, because if you accidentally try to send one to another thread, the Rust compiler will say:
Error: "Rc" cannot be sent between threads safely
Disaster averted.
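The Send bound can be checked directly with a small generic function. In the sketch below, assert_send and shared_len_sum are hypothetical names introduced only for illustration; Arc and thread::spawn are the real standard library items.

```rust
use std::sync::Arc;
use std::thread;

/// Compiles only for types that may be moved to another thread.
fn assert_send<T: Send>() {}

/// Shares one vector between several threads through `Arc` and adds up the
/// length that each thread observes.
fn shared_len_sum(threads: usize) -> usize {
    assert_send::<Arc<Vec<i32>>>();
    // assert_send::<std::rc::Rc<Vec<i32>>>(); // would not compile: Rc is not Send

    let data = Arc::new(vec![1, 2, 3]);
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let local = Arc::clone(&data); // atomic reference-count increment
            thread::spawn(move || local.len())
        })
        .collect();

    handles
        .into_iter()
        .map(|h| h.join().expect("thread panicked"))
        .sum()
}
```

Replacing Arc with Rc in shared_len_sum makes the thread::spawn call fail to compile for the same reason as the commented-out line.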
Sharing the stack: "scoped"
NOTE: The API mentioned here is an old one that has been moved out of the standard library. Equivalent functionality can be found in crossbeam (see the scope() documentation) and scoped_threadpool (see the scoped() documentation).
So far, all the patterns have involved data structures that are created on the heap and shared between threads. But what if you want to start some threads that use data living in the current stack frame? That could be dangerous:
fn parent() {
    let mut vec = Vec::new();
    // fill the vector
    thread::spawn(|| {
        print_vec(&vec)
    });
}
The child thread takes a reference to vec, which lives in the stack frame of the parent thread. When the parent thread exits, the stack frame is popped, but the child thread does not know it. Oops!
To rule out this kind of memory unsafety, Rust's basic thread-spawning API looks roughly like this:
fn spawn<F>(f: F) where F: 'static, ...
The 'static constraint means, roughly, that the closure is not allowed to borrow data. In particular, a function like parent above generates an error:
Error: "vec" does not live long enough
In essence, the possibility of the parent's stack frame being popped has been caught. Disaster averted.
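When the parent does not need the data afterwards, one common way to satisfy the 'static bound is to move the data into the closure instead of borrowing it. A minimal sketch, with parent_moves as a made-up name:

```rust
use std::thread;

/// Moves the vector into the child thread, so the closure owns its data and
/// satisfies the 'static bound on `thread::spawn`.
fn parent_moves() -> usize {
    let vec = vec![1, 2, 3];

    // `move` transfers ownership of `vec` into the closure; nothing is borrowed
    // from this stack frame.
    let child = thread::spawn(move || vec.len());

    child.join().expect("child thread panicked")
}
```

After the spawn line, vec is no longer usable in the parent, exactly as with any other ownership transfer.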
There is another way to guarantee safety: ensure that the parent stack frame stays in place until the child thread is done. This is the fork-join programming pattern, commonly used for divide-and-conquer parallel algorithms.
Rust supports it by providing a "scoped" variant of thread spawning:
fn scoped<'a, F>(f: F) -> JoinGuard<'a> where F: 'a, ...
There are two main differences from the spawn interface above:
1. The use of a parameter 'a in place of 'static.
2. The JoinGuard return value. As its name suggests, JoinGuard ensures that the parent thread joins (waits for) its child thread, by joining implicitly in its destructor (if the join has not already been done explicitly).
Including 'a in JoinGuard ensures that the JoinGuard cannot escape the scope of any data borrowed by the closure. In other words, Rust ensures that the parent thread waits for the child thread to finish before popping any stack frame the child thread might access.
Thus, by adjusting the previous example, the error can be fixed and the compiler satisfied as follows:
fn parent() {
    let mut vec = Vec::new();
    // fill the vector
    let guard = thread::scoped(|| {
        print_vec(&vec)
    });
    // `guard` is destroyed here, implicitly joining
}
So in Rust, you can freely borrow stack data into child threads, and the compiler will check that there is enough synchronization.
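Since thread::scoped is no longer available, here is a sketch of the same fork-join idea using std::thread::scope, which was added to the standard library later (Rust 1.63). It takes a closure rather than returning a guard, and every thread spawned inside it is joined before scope returns. The function name parent_scoped is made up for this illustration.

```rust
use std::thread;

/// Sums the two halves of a stack-owned vector on two scoped threads.
fn parent_scoped() -> i32 {
    let vec = vec![1, 2, 3, 4];
    let (left, right) = vec.split_at(vec.len() / 2);

    // Both closures borrow from this stack frame; `scope` does not return
    // until every thread spawned through `s` has finished.
    let (a, b) = thread::scope(|s| {
        let h1 = s.spawn(|| left.iter().sum::<i32>());
        let h2 = s.spawn(|| right.iter().sum::<i32>());
        (
            h1.join().expect("left thread panicked"),
            h2.join().expect("right thread panicked"),
        )
    });

    a + b
}
```

The borrows of vec are accepted here because the compiler can see that the child threads cannot outlive the call to thread::scope.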
Data races
Rust uses ownership and borrowing to guarantee:
1. Memory safety without garbage collection.
2. Concurrency without data races.
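As a closing illustration, the pieces above can be combined: several threads mutate one stack-owned counter, and the only way to reach the counter is through its lock. This is a sketch using std::sync::Mutex and std::thread::scope; the function name count_in_parallel and its parameters are hypothetical.

```rust
use std::sync::Mutex;
use std::thread;

/// Increments a shared counter from several threads and returns the total.
fn count_in_parallel(threads: usize, per_thread: usize) -> usize {
    let counter = Mutex::new(0usize);

    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..per_thread {
                    // Each increment happens while the lock is held.
                    *counter.lock().expect("mutex was poisoned") += 1;
                }
            });
        }
    }); // all scoped threads are joined here

    // No other thread can hold the lock any more, so take the value out.
    counter.into_inner().expect("mutex was poisoned")
}
```

Replacing the Mutex with a plain usize and incrementing it from the closures is rejected by the compiler, because that would need several mutable borrows of the same value at once.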