Rust: Fearless Concurrency

Rust was designed to solve two thorny problems:
1. How do you do safe systems programming?
2. How do you write concurrent code without fear?

Initially, these problems seemed unrelated, but surprisingly the solution turned out to be the same: the tools that make Rust safe also help you solve concurrency problems.

Memory safety bugs and concurrency bugs usually come down to code accessing data when it should not. Rust relies on ownership, a discipline for access control that the compiler checks statically for you.

For memory safety, this means you can program without a garbage collector and without fear of segmentation faults, because Rust will catch your mistakes.

For concurrency, this means you can choose from a range of paradigms (message passing, shared state, lock-free, purely functional), and Rust will help you avoid the common pitfalls.

Here is a taste of concurrency in Rust:

1. A channel transfers ownership of the messages sent along it, so you can send a pointer from one thread to another without worrying that the threads will later race for access through it. Rust's channels enforce thread isolation.

2. A lock knows what data it protects, and Rust guarantees that the data can only be accessed while the lock is held. State is never shared by accident. "Lock data, not code" is enforced in Rust.

3. Every data type knows whether it can safely be sent between threads or accessed by several threads, and Rust enforces this, so there are no data races even for lock-free data structures. Thread safety is not just documentation; it is a rule.

4. Stack frames can even be shared between threads, and Rust statically ensures that those frames are still active while other threads use them. Even the boldest forms of sharing are safe in Rust.

These benefits all come from Rust's ownership model. In fact, locks, channels, lock-free data structures and so on are defined in libraries rather than in the core language.
That means Rust's approach to concurrency is open-ended: new libraries can embrace new paradigms and catch new bugs just by adding APIs that use Rust's ownership features.

Background: Ownership

In Rust, every value has an "owning scope", and passing or returning a value means transferring ownership ("moving" it) from the old scope to a new one. When a scope ends, any value it still owns is destroyed automatically.
Let's look at a simple example. Suppose you create a vector and push some elements onto it:

fn make_vec() {
    let mut vec = Vec::new(); // owned by the scope of `make_vec`
    vec.push(0);
    vec.push(1);
    // the scope ends here, so `vec` is destroyed
}

The scope that creates a value also initially owns it. Here, the body of make_vec is the owning scope of vec. The owner can do whatever it likes with vec, including mutating it by pushing.
When the scope ends, vec is still owned by it, so the vector is deallocated automatically.
Things get more interesting if the vector is returned or passed around:

fn make_vec() -> Vec<i32> {
    let mut vec = Vec::new();
    vec.push(0);
    vec.push(1);
    vec // transfer ownership to the caller
}
fn print_vec(vec: Vec<i32>) {
    // the `vec` parameter is part of this scope, so it is owned by `print_vec`
    for i in vec.iter() {
        println!("{}", i)
    }
    // now `vec` is deallocated
}
fn use_vec() {
    let vec = make_vec(); // take ownership of the vector
    print_vec(vec);       // pass ownership to `print_vec`
}

Now, just before the scope of make_vec ends, vec is moved out by returning it; it is not destroyed. A caller such as use_vec then takes ownership of the vector.
On the other hand, the print_vec function takes a vec parameter, and its caller transfers ownership of the vector to it. Because print_vec does not transfer ownership any further, the vector is destroyed when its scope ends.
Once you give up ownership, you can no longer use the value. For example, consider the following variant of use_vec:

fn use_vec() {
    let vec = make_vec();  // take ownership of the vector
    print_vec(vec);        // pass ownership to `print_vec`
    for i in vec.iter() {  // continue using `vec` (rejected by the compiler)
        println!("{}", i * 2)
    }
}

The compiler reports that vec is no longer available because ownership has been transferred elsewhere. That is a good thing, because the vector has already been deallocated at that point! Disaster averted.

Borrowing

So far this is not very satisfying, because we never intended print_vec to destroy the vector. What we really want is to grant print_vec temporary access to the vector and then keep using the vector afterwards.

This is where borrowing comes in. If you have access to a value in Rust, you can lend that access to the functions you call. Rust checks that these loans do not outlive the object being borrowed.
To borrow a value, use the & operator to make a reference to it (a kind of pointer):

fn print_vec(vec: &Vec<i32>) {
    // the `vec` parameter is borrowed for this scope
    for i in vec.iter() {
        println!("{}", i)
    }
    // now the borrow ends
}
fn use_vec() {
    let vec = make_vec();  // take ownership of the vector
    print_vec(&vec);       // lend access to `print_vec`
    for i in vec.iter() {  // continue using `vec`
        println!("{}", i * 2)
    }
    // `vec` is destroyed here
}

Now print_vec takes a reference to a vector, and use_vec lends out the vector by writing &vec. Because borrows are temporary, use_vec retains ownership of the vector; it can continue using the vector after the call to print_vec returns.

Each reference is valid for a limited scope, which the compiler determines automatically. References come in two forms:

1. Immutable references &T, which allow sharing but not mutation. There can be several &T references to the same value at the same time, but the value cannot be mutated while those references are active.
2. Mutable references &mut T, which allow mutation but not sharing. If there is a &mut T reference to a value, there can be no other active references to it at that time, but the value can be mutated.
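
The two forms can be seen side by side in a small sketch. The helper names `sum`, `double_all` and `shared_then_mutable` are hypothetical and exist only for this illustration:

```rust
/// Sums the values through a shared (immutable) borrow.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

/// Doubles every element through a mutable (exclusive) borrow.
fn double_all(values: &mut Vec<i32>) {
    for v in values.iter_mut() {
        *v *= 2;
    }
}

fn shared_then_mutable() -> (i32, Vec<i32>) {
    let mut data = vec![1, 2, 3];

    // Any number of shared borrows may be active at the same time.
    let a = &data;
    let b = &data;
    let total = sum(a) + sum(b);
    // `a` and `b` are not used after this point, so both borrows have ended.

    // A mutable borrow is accepted only because no other borrow is active.
    double_all(&mut data);

    (total, data)
}

fn main() {
    let (total, data) = shared_then_mutable();
    println!("total = {}, data = {:?}", total, data);
}
```

Moving the call to `double_all` above the line that computes `total` should make the compiler reject the function, because `a` and `b` would still be in use while the mutable borrow is taken.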

Rust checks these rules at compile time; borrowing has no runtime cost.
Why are there two types of references? Consider this function:

fn push_all(from: &Vec<i32>, to: &mut Vec<i32>) {
    for i in from.iter() {
        to.push(*i);
    }
}

This function iterates over each element of one vector and pushes it onto another. The iterator keeps pointers into the vector at the current and final positions, stepping one toward the other.
What happens if we call this function with the same vector for both arguments?

push_all(&vec, &mut vec)

This would be a disaster! As elements are pushed onto the vector, it will occasionally need to resize, allocating a new block of memory and copying its elements over. The iterator would be left with a dangling pointer into the old memory, leading to memory unsafety (segmentation faults or worse).

Fortunately, Rust ensures that whenever a mutable borrow is active, no other borrow of the same object is active, so it produces the following message:
error: cannot borrow `vec` as mutable because it is also borrowed as immutable

push_all(&vec, &mut vec);
                    ^~~
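
For contrast, here is a minimal sketch of calls the borrow checker accepts. `push_all` is copied from the text; `append_examples` and the `snapshot` variable are hypothetical names used only for this illustration:

```rust
fn push_all(from: &Vec<i32>, to: &mut Vec<i32>) {
    for i in from.iter() {
        to.push(*i);
    }
}

fn append_examples() -> Vec<i32> {
    let source = vec![1, 2, 3];
    let mut target = vec![0];

    // Two distinct vectors: one shared borrow and one mutable borrow
    // never refer to the same value, so this is accepted.
    push_all(&source, &mut target);

    // To append a vector to itself, copy it first so that the iterator
    // walks memory that `push` cannot reallocate.
    let snapshot = target.clone();
    push_all(&snapshot, &mut target);

    target
}

fn main() {
    println!("{:?}", append_examples());
}
```

The clone costs an extra allocation; it is shown only to make the aliasing rule concrete, not as the recommended way to duplicate a vector's contents.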

Message passing

Concurrent programming comes in many styles, but a particularly simple one is message passing, where threads or actors communicate by sending each other messages.
Do not communicate by sharing memory; instead, share memory by communicating.

Rust's ownership makes it easy to turn that advice into a rule the compiler checks. Consider the following channel API (channels in the Rust standard library are slightly different):

fn send<T: Send>(chan: &Channel<T>, t: T);
fn recv<T: Send>(chan: &Channel<T>) -> T;

Channels are generic over the type of data they transmit (the <T: Send> part of the API). The Send part means that T must be safe to send between threads; for now it is enough to know that Vec<i32> is Send.

As always in Rust, passing a T to the send function means transferring ownership of it. This has far-reaching consequences: code like the following generates a compiler error.

// Suppose chan: Channel<Vec<i32>>
let mut vec = Vec::new();
// do some computation
send(&chan, vec);
print_vec(&vec);

Here, the thread creates a vector, sends it to another thread, and then continues using it. The thread that receives the vector could mutate it while this thread keeps running, so the call to print_vec could lead to a race condition or even a use-after-free bug.

Instead, the Rust compiler produces an error message at the call to print_vec:
error: use of moved value: `vec`
Disaster averted.
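
The same ownership transfer can be tried with the standard library's own channel. The sketch below assumes `std::sync::mpsc`, whose API differs from the simplified `send`/`recv` signatures above; the function name `send_vec_to_worker` is hypothetical:

```rust
use std::sync::mpsc;
use std::thread;

fn send_vec_to_worker() -> i32 {
    // A channel that carries whole vectors.
    let (tx, rx) = mpsc::channel::<Vec<i32>>();

    // The worker owns the receiving end and every vector it receives.
    let worker = thread::spawn(move || {
        let received = rx.recv().expect("sender was dropped");
        received.iter().sum::<i32>()
    });

    let vec = vec![1, 2, 3];
    // Sending moves `vec` into the channel.
    tx.send(vec).expect("receiver was dropped");
    // println!("{:?}", vec); // rejected: `vec` was moved by `send`

    worker.join().expect("worker panicked")
}

fn main() {
    println!("sum computed by the worker: {}", send_vec_to_worker());
}
```

Uncommenting the `println!` line should reproduce the "use of moved value" error described above.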

Locks

Locks are a way for threads to communicate through passive, shared state.
Shared-state concurrency has a bad reputation. It is easy to forget to acquire a lock, or to mutate the wrong data at the wrong time, with disastrous results.

Rust's point of view is:
1. Shared-state concurrency is nevertheless a fundamental programming style, needed for systems code, for maximum performance, and for implementing other styles of concurrency.
2. The real problem is accidentally shared state.

Whether you use locking or lock-free techniques, Rust aims to give you the tools to conquer shared-state concurrency directly.

In Rust, ownership automatically "isolates" threads from each other. A write can only happen when the thread has mutable access, either because it owns the data or because it holds a mutable borrow of it.

Either way, the thread is guaranteed to be the only one with access at that time.
Remember that a mutable borrow cannot coexist with any other borrow. Locks provide the same guarantee ("mutual exclusion") through synchronization at runtime. This leads to a locking API that hooks directly into Rust's ownership system.
Here is a simplified version:

// create a new mutex
fn mutex<T: Send>(t: T) -> Mutex<T>;
// acquire the lock
fn lock<T: Send>(mutex: &Mutex<T>) -> MutexGuard<T>;
// access the data protected by the lock
fn access<T: Send>(guard: &mut MutexGuard<T>) -> &mut T;

This lock API is unusual in several respects.
1. First, the Mutex type is generic over the type T of the data the lock protects. When you create a mutex, you transfer ownership of the data into the mutex and immediately give up access to it. (Locks are unlocked when first created.)
2. Later, you can call lock to block the thread until the lock is acquired. The lock is released automatically when the MutexGuard is destroyed; there is no separate unlock function.
3. The protected data can only be reached through the access function, which turns a mutable borrow of the guard into a mutable borrow of the data (with a shorter loan):

fn use_lock(mutex: &Mutex<Vec<i32>>) {
    // acquire the lock, taking ownership of a guard;
    // the lock is held for the rest of the scope
    let mut guard = lock(mutex);
    // access the data by mutably borrowing the guard
    let vec = access(&mut guard);
    // `vec` has type `&mut Vec<i32>`
    vec.push(3);
    // the lock is released automatically here, when `guard` is destroyed
}

There are two key ingredients:
1. The mutable reference returned by access cannot outlive the MutexGuard it borrows from.
2. The lock is released only when the MutexGuard is destroyed.

The result is that Rust enforces locking discipline: it will not let you access lock-protected data unless you hold the lock. Any attempt to do otherwise generates a compiler error. For example, consider the following flawed "refactoring":

fn use_lock(mutex: &Mutex<Vec<i32>>) {
    let vec = {
        // acquire the lock
        let mut guard = lock(mutex);
        // attempt to return a borrow of the data
        access(&mut guard)
        // `guard` is destroyed here, releasing the lock
    };
    // attempt to access the data outside of the lock
    vec.push(3);
}

Rust generates an error that pinpoints the problem:
error: `guard` does not live long enough

access(&mut guard)
            ^~~~~

Disaster averted.
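
The standard library's `std::sync::Mutex` follows the same design, although its API differs from the simplified one above: `lock()` returns a `Result` wrapping the guard, and the guard dereferences to the data instead of going through an `access` function. A minimal sketch, with `push_from_threads` as a hypothetical name:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn push_from_threads() -> Vec<i32> {
    // The mutex takes ownership of the vector it protects.
    let shared = Arc::new(Mutex::new(Vec::<i32>::new()));

    let handles: Vec<_> = (0..4)
        .map(|i| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                // The data is reachable only through the guard.
                let mut guard = shared.lock().unwrap();
                guard.push(i);
                // The guard is dropped here, which releases the lock.
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // Copy the contents out while holding the lock one last time.
    let guard = shared.lock().unwrap();
    let mut result = guard.to_vec();
    result.sort(); // thread scheduling decides the push order
    result
}

fn main() {
    println!("{:?}", push_from_threads());
}
```

`Arc` is used here only so that several threads can hold a handle to the same mutex.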

Thread safety: "Send"

It is common to classify some data types as "thread safe" and others as not. Thread-safe data structures use enough internal synchronization that multiple threads can use them safely at the same time.
For example, Rust comes with two kinds of "smart pointers" for reference counting:
1. Rc provides reference counting through normal reads and writes. It is not thread safe.
2. Arc provides reference counting through atomic operations. It is thread safe.

The hardware atomic operations used by Arc are more expensive than the ordinary operations used by Rc, so it is advantageous to use Rc rather than Arc where possible. On the other hand, it is critical that an Rc never migrates from one thread to another, because that could lead to race conditions that corrupt the reference count.

In Rust, the world is divided into two kinds of data types: those that are Send, which can safely be moved from one thread to another, and the rest, which are !Send (it may not be safe to do so).

If all of a type's components are Send, then that type is Send too, which covers most types. However, some basic types are not inherently thread safe, so it is also possible to mark a type such as Arc as Send explicitly, saying to the compiler: "Trust me; the necessary synchronization has been verified here."

Naturally, Arc is Send, but Rc is not.

We have already seen that the Channel and Mutex APIs work only with Send data. Because they are the points at which data crosses thread boundaries, they are also the points at which Send is enforced.

In summary, Rust programmers can enjoy the benefits of Rc and other thread-unsafe types with confidence, because if they ever accidentally try to send one to another thread, the Rust compiler will say:
`Rc<...>` cannot be sent between threads safely
Disaster averted.
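
A compile-time check makes the distinction tangible. In this sketch, `assert_send` and `sum_on_worker` are hypothetical helpers written for the illustration; only `Arc`, `Rc` and `thread::spawn` come from the standard library:

```rust
use std::sync::Arc;
use std::thread;

// Compiles only when `T` is `Send`; it does nothing at runtime.
fn assert_send<T: Send>() {}

fn sum_on_worker() -> (i32, usize) {
    // `Arc<Vec<i32>>` is `Send`, so this check is accepted.
    assert_send::<Arc<Vec<i32>>>();
    // The same check for `Rc` is rejected by the compiler:
    // assert_send::<std::rc::Rc<Vec<i32>>>();

    let data = Arc::new(vec![1, 2, 3]);
    // Cloning an `Arc` bumps the atomic count; the vector is not copied.
    let for_worker = Arc::clone(&data);
    let worker = thread::spawn(move || for_worker.iter().sum::<i32>());
    let total = worker.join().expect("worker panicked");

    // The parent still holds its own handle to the same vector.
    (total, data.len())
}

fn main() {
    println!("{:?}", sum_on_worker());
}
```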

Sharing the stack: "scoped"

NOTE: The API mentioned here is an old one that has been moved out of the standard library. You can find equivalent functionality in crossbeam (scope() documentation) and scoped_threadpool (scoped() documentation).

So far, all the patterns involve data structures that are created on the heap and shared between threads. But what if we want to start some threads that use data living in our stack frame? That could be dangerous:

fn parent() {
    let mut vec = Vec::new();
    // fill the vector
    thread::spawn(|| {
        print_vec(&vec)
    });
}

The child thread takes a reference to vec, which lives in the stack frame of parent. When parent returns, that stack frame is popped, but the child thread does not know it. Oops!

To rule out this kind of memory unsafety, Rust's basic thread-spawning API looks roughly like this:

fn spawn<F>(f: F) where F: 'static, ...

The 'static constraint means, roughly, that the closure may not contain borrowed data. As a result, a function like parent above generates an error:
error: `vec` does not live long enough

In essence, the compiler has caught the possibility of the parent's stack frame being popped too early. Disaster averted.
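
One way to satisfy the `'static` bound is to give the child thread ownership instead of a borrow. The sketch below uses a `move` closure with the standard `std::thread::spawn`; returning the length from `parent` is an addition made only so that the result can be observed:

```rust
use std::thread;

fn parent() -> usize {
    let mut vec = Vec::new();
    vec.push(0);
    vec.push(1);

    // `move` transfers ownership of `vec` into the closure, so the closure
    // borrows nothing from this stack frame and satisfies `'static`.
    let child = thread::spawn(move || vec.len());

    // `vec` can no longer be used here; it now belongs to the child thread.
    child.join().expect("child panicked")
}

fn main() {
    println!("the child saw {} elements", parent());
}
```

The trade-off is that the parent gives the vector up entirely.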

There is another way to guarantee safety: make sure the parent stack frame stays in place until the child thread is done. This is the fork-join programming pattern, often used for divide-and-conquer parallel algorithms.
Rust supports it by providing a "scoped" variant of thread spawning:

fn scoped<'a, F>(f: F) -> JoinGuard<'a> where F: 'a, ...

There are two main differences from the spawn API above:
1. It uses a parameter 'a instead of 'static.
2. It returns a JoinGuard. The JoinGuard ensures that the parent thread joins (waits for) its child thread by performing an implicit join in its destructor, if no explicit join has happened already.

Including 'a in JoinGuard ensures that the JoinGuard cannot escape the scope of any data borrowed by the closure. In other words, Rust guarantees that the parent thread waits for the child thread to finish before popping any stack frame the child might access.

Therefore, by adjusting the previous example, you can fix the error and satisfy the compiler as follows:

fn parent() {
    let mut vec = Vec::new();
    // fill the vector
    let guard = thread::scoped(|| {
        print_vec(&vec)
    });
    // `guard` is destroyed here, implicitly joining the child thread
}

So in Rust you can freely lend stack data to child threads, confident that the compiler will check for sufficient synchronization.
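
Since `thread::scoped` is no longer available, the same fork-join pattern can be sketched with `std::thread::scope`, assuming Rust 1.63 or later. Its shape differs from the `JoinGuard` API above: the join happens when the scope closure returns, not in a guard's destructor. The return value of `parent` is an addition for illustration:

```rust
use std::thread;

fn parent() -> (i32, usize, usize) {
    let mut vec = vec![1, 2, 3];

    // Every thread spawned on `s` is joined before `scope` returns,
    // so the closures may borrow `vec` from this stack frame.
    let (sum, len) = thread::scope(|s| {
        let summer = s.spawn(|| vec.iter().sum::<i32>());
        let counter = s.spawn(|| vec.len());
        (summer.join().unwrap(), counter.join().unwrap())
    });

    // All borrows have ended, so the parent may mutate the vector again.
    vec.push(4);
    (sum, len, vec.len())
}

fn main() {
    println!("{:?}", parent());
}
```

Both child threads take shared borrows of `vec`, which is allowed because neither of them mutates it.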

Data races

Rust uses ownership and borrowing to statically guarantee:
1. Memory safety without garbage collection.
2. Concurrency without data races.
