Hands-On Data Structures and Algorithms with Rust

上QQ阅读APP看书，第一时间看更新

Borrowing and ownership

Rust is famous for its memory management model, which replaces runtime garbage collection with compile-time checks for memory safety. The reason why Rust can work without a garbage collector and still free the programmer from error-prone memory management is simple (but not easy): borrowing and ownership.

While the particulars are quite complex, the high-level view is that the compiler inserts any "provide x amounts of memory" and "remove x amounts of memory" (somewhat like malloc() and free() for C programmers) statements for the developer. Yet how can it do that?

The rules of ownership are as follows:

The owner of a value is a variable
At any time, only a single owner is allowed
The value is lost once the owner goes out of scope

This is where Rust's declarative syntax comes into play. By declaring a variable, the compiler knows—at compile time—that a certain amount of memory needs to be reserved. The lifetime is clearly defined too, from the beginning to end of a block or function, or as long as the struct instance lives. If the size of this variable is known at compile time, the compiler can provide exactly the necessary amount of memory to the function for the time required. To illustrate, let's consider this snippet, where two variables are allocated and removed in a deterministic order:

fn my_func() {
    // the compiler allocates memory for x 
    let x = LargeObject::new(); 
    x.do_some_computation();
    // allocate memory for y
    let y = call_another_func();
    if y > 10 {
        do_more_things();
    }
} // deallocate (drop) x, y

Is this not what every other compiler does? The answer is yes—and no. At compile time, the "provide x amounts of memory" part is fairly simple; the tricky part is keeping track of how much is still in use when references can be passed around freely. If, during the course of a function, a particular local reference becomes invalid, a static code analysis will tell the compiler about the lifetime of the value behind the reference. However, what if a thread changes that value at an unknown time during the function's execution?

At compile time, this is impossible to know, which is why many languages do these checks at runtime using a garbage collector. Rust forgoes this, with two primary strategies:

Every variable is owned by exactly one scope at any time
Therefore, the developer is forced to pass ownership as required

Especially when working with scopes, the nature of stack variables comes in handy. There are two areas of memory, stack and heap, and, similar to other languages, the developer uses types to decide whether to allocate heap (Box, Rc, and so on) or stack memory.

Stack memory is usually short-lived and smaller, and operates in a first-in, last-out manner. Consequently, a variable's size has to be known before it is put on the stack:

Heap memory is different; it's a large portion of the memory, which makes it easy to allocate more whenever needed. There is no ordering, and memory is accessed by using an addresses. Since the pointer to an address on the heap has a known size at compile time, it fits nicely on the stack:

Stack variables are typically passed by value in other languages, which means that the entire value is copied and placed into the stack frame of the function. Rust does the same, but it also invalidates further use of that variable in the—now parent—scope. Ownership moves into the new scope and can only be transferred back as a return value. When trying to compile this snippet, the compiler will complain:

fn my_function() {
    let x = 10;
    do_something(x); // ownership is moved here
    let y = x;       // x is now invalid!
}

Borrowing is similar but, instead of copying the entire value, a reference to the original value is moved into the new scope. Just like in real life, the value continues to be owned by the original scope; scopes with a reference are just allowed to use it as it was provided. Of course, this comes with drawbacks for mutability, and some functions will require ownership for technical and semantic reasons, but it also has advantages such as a smaller memory footprint.

These are the rules of borrowing:

Owners can have immutable or mutable references, but not both
There can be multiple immutable references, but only one mutable reference
References cannot be invalid

By changing the previous snippet to borrow the variable to do_something() (assuming this is allowed, of course), the compiler will be happy:

fn my_function() {
    let x = 10;
    do_something(&x); // pass a reference to x
    let y = x;        // x is still valid!
}

Borrowed variables rely heavily on lifetimes. The most basic lifetime is the scope it was created in. However, if a reference should go into a struct field, how can the compiler know that the underlying value has not been invalidated? The answer is explicit lifetimes!