D   A   T   A   W   O   K





Creation: December 08 2020
Modified: January 15 2021

Rust for the C Fossil

After some time spent learning and struggling with some of the peculiar features of the Rust programming language, I've decided to collect together all the fancy stuff I've encountered so far and that may look new to any C fossil like me.

This is not meant to be a detailed guide to the Rust programming language; for a detailed journey through the language you may want to read The Book first.


FORMATTING

Display trait is used when the argument type is left unspecified: {}. Binary trait is used when argument type is binary: {:b}

Full list of formatting traits: https://doc.rust-lang.org/std/fmt/#formatting-traits
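
A minimal sketch of how the format specifier picks the trait. The Celsius type and the two helper functions are made up for illustration.

```rust
use std::fmt;

// Hypothetical type, used only to show a manual Display implementation.
struct Celsius(f32);

impl fmt::Display for Celsius {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{:.1} C", self.0)
    }
}

// `{}` goes through Display, `{:b}` through Binary, `{:x}` through LowerHex.
fn show(n: u8) -> String {
    format!("{} {:b} {:x}", n, n, n)
}

// `{}` on a user type calls the Display implementation above.
fn show_temp(t: f32) -> String {
    format!("{}", Celsius(t))
}
```

Since Celsius implements Display, it also gets to_string() for free.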

PrintType trick

use std::any::type_name;

fn print_type<T>(_: &T) {
    println!("{}", type_name::<T>());
}

TYPES

(): "unit" empty type (like void in C). struct E; unit like struct, it has no fields.

Primitives

Standard doc has separate pages for the type itself (e.g. i32) and the module dedicated to that type (std::i32).

Bools can be explicitly converted to numeric types (but not the other way round)

assert_eq!(false as i32, 0);

Characters (char) are 4-byte Unicode scalar values. Strings instead use UTF-8 (variable-length encoding).

We can convert chars to integers (may truncate) but only u8 to chars (using as operator). The bigger integers may contain invalid unicode values.

std::char::from_u32(val: u32) -> Option<char>
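
A small sketch of the conversions described above. The function names are hypothetical, only the casts and char::from_u32 come from the language and std.

```rust
// Checked conversion: returns None for values that are not valid chars.
fn to_char(code: u32) -> Option<char> {
    std::char::from_u32(code)
}

fn char_casts() -> (u32, u8, char) {
    let euro = '\u{20AC}';
    let wide = euro as u32;     // lossless: 0x20AC
    let narrow = euro as u8;    // truncated to the low byte: 0xAC
    let back = 0x41_u8 as char; // only u8 can be cast to char with `as`
    (wide, narrow, back)
}
```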

Tuples

let t = (1,); // one element tuple
let i = (1);  // just an integer

Arrays

Length is known at compile time and is part of type signature [T; len].

Slices can be used to borrow a section of the array. Type signature is &[T].

All slices methods are called on arrays as well, Rust automatically converts a reference to an array to a slice first.
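
As a sketch (sum and slice_demo are illustrative names), the same function can take a whole array or a section of it:

```rust
// Accepts any contiguous sequence of i32 through a slice reference.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn slice_demo() -> (i32, i32, usize) {
    let arr: [i32; 5] = [1, 2, 3, 4, 5];
    let whole = sum(&arr);        // &[i32; 5] is converted to &[i32]
    let middle = sum(&arr[1..4]); // borrows the elements 2, 3 and 4
    (whole, middle, arr.len())    // len() is a slice method called on the array
}
```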

Struct

Tuple structs are basically named tuples.

struct MyPoint(i32, i32);

A unit struct has no elements, useful for generics.

struct Unit;

A struct can be initialized using an instance of another struct of the same type

let a = A {
    name,
    ..b         // a and b have the same type; no trailing comma is allowed here.
};

Struct members may be laid out as in C using the #[repr(C)] attribute.

References

A reference can also be taken using the ref keyword

let c = 3;
let r2 = &c;
let ref r1 = c; // equivalent to the previous statement

The * operator can be used to dereference, but you don't need to do that to access fields or call methods.

Copying the referent out through * (e.g. let v = *p_ref;) is only possible if the type is Copy.

If it is not Copy the error is something like: error: cannot move out of `*p_ref` which is behind a shared reference

Slices

The size of a slice is not known at compile time, thus it cannot be allocated on the stack. Slices are always handled by reference (a fat pointer: addr + len).

We often use the term slice for reference types like &[T] or &str, but that is a bit of shorthand: those are properly called references to slices

String

String is analogous to Vec and &str is analogous to &[T].

Newline and leading spaces are included in the string.

let s = "Hello
    World";

To remove newline and leading spaces:

let s = "Hello \
    World";         // this is equal to s = "Hello World";

String len() is measured in bytes not characters.

To get the chars: s.chars().count()
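
For instance (lengths is just an illustrative helper), a string with one accented letter has more bytes than characters:

```rust
// Returns (length in bytes, number of chars).
fn lengths(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn lengths_demo() -> (usize, usize) {
    // U+00E9 (e with acute accent) takes two bytes in UTF-8.
    lengths("h\u{e9}llo")
}
```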

Raw strings

Escape chars are ignored, everything is included "as-is" (verbatim).

let s = r"C:\path\to\file";
let rex = Regex::new(r"\d+(\.\d+)*");

Pound signs (used for example to include the " character in raw strings)

println!(r###"
    This raw string started with 'r###"'.
    Therefore it does not end until we reach a quote mark ('"')
    followed immediately by three pound signs ('###'):
"###);

Byte strings

Every char is one byte (array of u8)

let s = b"GET"; // equal to &[b'G', b'E', b'T']
// s is a reference to an array of three bytes,
// s type is: &[u8; 3]

Raw byte strings start with: br

Type Alias

type MyType = Vec<i32>;

Casting

Only u8 can be cast to char.

let f = 64.0_f32;
//let c = f as char; // error
let c = f as u8 as char;

CONVERSIONS

From/Into

The From trait specifies how to create a type from another

impl From<i32> for Number {
    // a function
    fn from(val: i32) -> Self {
        Number { val }
    }
}

impl Into<i32> for Number {
    // a method: it consumes self
    fn into(self) -> i32 {
        self.val
    }
}

The following are equivalent and, if both are defined, they conflict:

// Number -> i32
impl From<Number> for i32 {
    fn from(num: Number) -> i32 {
        num.val
    }
}

// Number -> i32
impl Into<i32> for Number {
    fn into(self) -> i32 {
        self.val
    }
}

Into is automatically derived from From, but not vice versa, so prefer implementing From.

Standard library (std::convert) provides a generic Into.

impl<T, U> Into<U> for T
where
    U: From<T>,
{
    fn into(self) -> U {
        U::from(self)
    }
}

TryFrom/TryInto

Fallible conversions, return Result.
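
A minimal sketch of a fallible conversion. The Even type and the convert helper are hypothetical, while TryFrom and TryInto are the std::convert traits.

```rust
use std::convert::TryFrom;
use std::convert::TryInto;

// Hypothetical type that can only hold even numbers.
#[derive(Debug, PartialEq)]
struct Even(i32);

impl TryFrom<i32> for Even {
    type Error = String;

    fn try_from(val: i32) -> Result<Self, Self::Error> {
        if val % 2 == 0 {
            Ok(Even(val))
        } else {
            Err(format!("{} is odd", val))
        }
    }
}

// TryInto comes for free from the TryFrom implementation.
fn convert(val: i32) -> Result<Even, String> {
    val.try_into()
}
```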

ToString

Automatically implemented for any type that implements fmt::Display.


EXPRESSIONS

Expression

If the last expression of a block terminates with ;, the block's value is ().


FLOW OF CONTROL

Loops

Nested loops and labels

fn nested_labels() {
    'outer: loop { // label for the break
        println!("Entered the outer loop");
        loop {
            println!("Entered the inner loop");
            // break; this breaks the inner loop
            break 'outer; // this breaks the outer loop
        }
    }
}

Return from loop

let mut count = 0;
let res = loop {
    count += 1;
    if count == 10 {
        break count * 2 // the loop expression evaluates to count * 2
    }
};
assert_eq!(res, 20);

For and Iterators

By default the for loop applies the into_iter. Other ways to obtain an iterator of a collection are: iter() and iter_mut()

Note that into_iter consumes the collection.
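
The three flavours side by side, as a sketch (iter_demo is an illustrative name):

```rust
fn iter_demo() -> (i32, Vec<i32>, Vec<String>) {
    let mut nums = vec![1, 2, 3];

    // iter() borrows: items are &i32, the vector stays usable.
    let total: i32 = nums.iter().sum();

    // iter_mut() borrows mutably: items are &mut i32.
    for n in nums.iter_mut() {
        *n *= 10;
    }
    let snapshot = nums.clone();

    // A plain `for` calls into_iter(), which consumes the vector.
    let mut labels = Vec::new();
    for n in nums {
        labels.push(format!("#{}", n));
    }
    // `nums` can no longer be used here: it was moved into the loop.
    (total, snapshot, labels)
}
```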

Match

If the value in match is a reference, the values in the patterns are also captured by reference

fn print(opt: &Option<i32>) {
    match opt {
        Some(val) => println!("{}", val),  // val is a reference
        None => (),
    };

    match *opt {
        Some(val) => println!("{}", val),  // val is a value (i32 is Copy)
        None => (),
    };

    match *opt {
        Some(ref val) => println!("{}", val),  // val is a reference
        None => (),
    }
}

Match guard to filter the arm.

match pair {
    (x, y) if x == y => {},
    (x, y) if x + y == 0 => {},
    _ => {},
}

Binding: @ for binding values to names

match age {
    // inclusive range; exclusive range patterns need a recent compiler
    n @ 1..=30 => { println!("You are {} years old", n); },
    _ => (),
}

match option {
    Some(n @ 1..=30) => {},
    _ => (),
}

If Let / While Let

Also works with if let

let c = Foo::Qux(30);
if let Foo::Qux(val @ 1..=100) = c {
    println!("c is {}", val);
}

FUNCTIONS and CLOSURES

Closures

When defined, the closure immediately captures the variables it uses (by default it gets an immutable reference).

Captured environment variables remain captured until the closure is used for the last time.

If the variable is captured as an immutable reference, we can still lend it as an immutable reference to other variables (borrow rules). But if it is captured as a mutable reference, we cannot borrow it anywhere else until the closure is used for the last time.

let s = String::from("Hello");
let consume = || {  // Closure contains a String value, since it moves `s`
    let s2 = s;
    println!("{}", s2);
};
consume();
//consume(); // Error: cannot be called more than once because it moves the
             // variable s out of its environment

let s = String::from("World");
let consume = || {  // Closure contains a String reference
    println!("{}", s);
};
consume();
println!("{}", s);  // OK: consume() borrowed a reference
consume();          // OK: because is an immutable reference

// using `move` before vertical pipes forces closure to take ownership of
// captured variables
let s = String::from("World");
let consume = move || {  // Closure contains a String value (moves s)
    println!("{}", s);
};
consume();
//println!("{}", s); // error: value has been moved into the closure


Fn Traits

From the least restrictive bound to the most restrictive one: FnOnce, FnMut, Fn.

A Fn can be passed where a FnMut is requested (FnMut is less restrictive). A FnMut can be passed where a FnOnce is requested (FnOnce is less restrictive).

If we need to modify a captured value, we need a FnMut closure.

FnMut is a supertrait of Fn: all types that implement Fn also implement FnMut. It follows that a Fn can be passed where a FnMut is required (not the other way round).

FnOnce closures need to be moved in order to be called.

The compiler tries to capture variables in the least restrictive manner possible, depending on how the captured variables are used in the closure.

If a parameter is annotated as Fn, then capturing variables by &mut T or T is not allowed.
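
A sketch of the three bounds at work. The call_* helpers are hypothetical, each one asks for the weakest capability it needs.

```rust
fn call_once<F: FnOnce() -> String>(f: F) -> String { f() }
fn call_twice_mut<F: FnMut()>(mut f: F) { f(); f(); }
fn call_twice<F: Fn() -> usize>(f: F) -> usize { f() + f() }

fn fn_traits_demo() -> (usize, i32, String) {
    let name = String::from("fossil");

    // Only reads `name`: implements Fn (and so FnMut and FnOnce).
    let len = call_twice(|| name.len());

    // Mutates `count`: implements FnMut (and FnOnce), but not Fn.
    let mut count = 0;
    call_twice_mut(|| count += 1);

    // Moves `name` out of its environment: implements FnOnce only.
    let owned = call_once(move || name);

    (len, count, owned)
}
```

Passing the counting closure to call_twice would be rejected at compile time, because it captures count by &mut.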

Functions as parameters

If you declare a function that takes a closure as parameter, then any function that satisfies the trait bound of that closure can be passed as a parameter.

Returning closures

All traits are possible (Fn, FnMut, FnOnce) but the move keyword must be used, because captured references would not outlive the function scope.
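
Two possible shapes for a function returning a closure, as a sketch (make_adder and make_op are made-up names):

```rust
// `move` copies `n` into the closure, so it can outlive make_adder's scope.
fn make_adder(n: i32) -> impl Fn(i32) -> i32 {
    move |x| x + n
}

// Boxed variant, needed when different closures may be returned.
fn make_op(double: bool) -> Box<dyn Fn(i32) -> i32> {
    if double {
        Box::new(|x: i32| x * 2)
    } else {
        Box::new(|x: i32| x + 1)
    }
}
```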

Higher Order Functions

Functions taking functions and returning functions.

Diverging functions

Never return. Marked using ! (empty type).

fn foo() -> ! {
    panic!("Never returns");
}

This type can be coerced to any other one. Thus it can be used in places where an exact type is required (e.g. in match branches).

let val: u32 = match res {
    Ok(v) => v,
    Err(_) => panic!("error"), // this is valid because panic! has type !
};

ITERATORS

// `iter()` for vecs yields `&i32`. Destructure to `i32`.
println!("2 in vec1: {}", vec1.iter()     .any(|&x| x == 2));
// `into_iter()` for vecs yields `i32`. No destructuring required.
println!("2 in vec2: {}", vec2.into_iter().any(| x| x == 2));

into_iter is a generic method to obtain an iterator, whether this iterator yields values, immutable references or mutable references is context dependent and can sometimes be surprising.

For example, for &Vec<i32> the into_iter yields &i32, while for Vec<i32> yields i32.


MODULES

Modules don't automatically inherit names from their parent modules. Each module must import the names it uses.

If module Y has pub use Z and module X has use Y, then module X can use Z. Thus the modules imported with the pub keyword are "re-exported" to whoever is using us.

Libraries sometimes provide modules named "prelude", where the most commonly used stuff is published using the pub use method.

Good style: import only types, traits and modules.

The self keyword refers to the current module scope.

The super keyword refers to the parent module scope.

The super and self keywords can be used in the path to remove ambiguity when accessing items and to prevent unnecessary hardcoding of paths.
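
A small sketch of self, super and a re-export. The module and function names are invented for the example.

```rust
mod app {
    fn helper() -> &'static str { "app" }

    pub mod net {
        fn helper() -> &'static str { "net" }

        pub fn paths() -> (&'static str, &'static str) {
            // self:: is the current module, super:: the parent one.
            (self::helper(), super::helper())
        }
    }

    // Re-export: users of `app` can call app::paths() directly.
    pub use self::net::paths;
}
```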

Public

Functions declared using pub(in path) syntax are only visible within the given path. The path must be a parent or ancestor module.

pub(in crate::my_mod) fn public_function_in_my_mod() { }

Functions declared using pub(self) syntax are only visible within the current module, which is the same as leaving them private

pub(self) fn public_function_in_nested() { }

Functions declared using pub(super) syntax are only visible within the parent module

pub(super) fn public_function_in_super_mod() {  }

Public but only in the current crate

pub(crate) fn public_in_current_crate() {  }

Private parent items will still restrict the visibility of a child item, even if it is declared as visible within a bigger scope.

Submodules can access private items in their parent modules, either by name or through a glob import (use super::* imports every item the submodule is allowed to see, which is why it is common in test modules).

Implicit std

This is implicit in every Rust module:

`use std::prelude::v1::*;`

Structs visibility

Fields visibility only matters when a struct is accessed from outside the module where it is defined.

impl shall not be prefixed with pub.

File Hierarchy

This declaration will look for a file named my.rs or my/mod.rs and will insert its contents inside a module named my under this scope

mod my;

CRATES

You do not need to write extern crate anymore for external dependencies in Rust 2018.

Publishing on crates.io

Make a package containing the sources (and Cargo.toml)

$ cargo package

Authenticate and save the token in ~/.cargo/credentials for future usage:

$ cargo login <token>

Publish

$ cargo publish

The documentation is automatically published and hosted on docs.rs


CARGO

The name field under [package] determines the name of the project. This is used by crates.io.

Dependencies

Cargo.toml dependencies section:

[dependencies]
# from crates.io
clap = "2.27.1"
# from online repo
rand = { git = "https://github.com/rust-lang-nursery/rand", rev="528f19c" }
# from a path in the local filesystem
bar = { path = "../bar" }

Versions sharing the same major number are considered compatible (semver), thus cargo may decide to use a more recent minor or patch version than the one written in Cargo.toml.

The first time cargo builds a project it outputs a Cargo.lock file that records the exact version of every crate in use. This prevents automatic updates and supports consistent, reproducible builds across machines.

To update to new dependencies versions that are compatible with the ones specified in Cargo.toml:

`$ cargo update`

Binaries

The default binary is built from src/main.rs and is named after the package, but you can add additional binaries by placing them in the src/bin/ directory.

$ cargo run --bin my_other_bin

Tests

By convention, we place unit tests in the modules they test and integration tests in their own tests/ directory.

You can also run tests whose name matches a pattern:

$ cargo test test_foo

Build script

It is possible to run a build script (build.rs) before the package is compiled, for example to build or locate native dependencies.

Workspace

Keeps together multiple crates, producing a single Cargo.lock and dependencies download. Create a Cargo.toml in the root specifying all the crates:

[workspace]
# crates residing in foo, bar and baz subdirs
members = ["foo", "bar", "baz"]

To build all crates in the workspace:

$ cargo build --workspace

Cargo will create a shared target directory.


ATTRIBUTES

Metadata applied to some module, crate or item.

Local notation: attribute applied to a single item.

#[attrib]
item

Global notation: attribute applied to the whole crate (usually at the top of main.rs or lib.rs).

#![attrib]

Some attributes may be used only with the global notation, others only with the local one.

Conditional compilation

Using attributes.

#[cfg(target_os = "linux")]
fn you_are_on_linux() { }

#[cfg(not(target_os = "linux"))]
fn you_are_not_on_linux() { }

Or using conditional compilation in the middle of a function:

if cfg!(target_os = "linux") {
    ...
}

Some conditionals (like target_os) are provided by rustc. Custom conditionals shall be passed to rustc via the --cfg flag.

Other conditionals:

if cfg!(unix) {

}

User-Defined Features

#[cfg(feature = "foo")]

Enabled with cargo build --features foo

Inline


GENERICS

Generic functions can be thought of as namespaces, containing an infinity of functions with different concrete types.

Same as with crates, and modules, and types, generic functions can be "explored" (navigated?) using :: (turbofish syntax)

use std::any::type_name;
println!("{}", type_name::<i32>());

Function call with explicitly specified type parameters: fun::<A, B>().

<T> Must precede the type to remain generic:

impl<T> GenVal<T> {  }

Bounding

Generic parameters bounded by traits are allowed to access the methods of the specified traits.

A bound can also be specified using a where clause before the opening {.

trait MyTrait<T> {
    fn destroy_all(self, other: T);
}

impl<T, U> MyTrait<T> for U {
    fn destroy_all(self, _other: T) {  }
}

Implement this for any type T where Option<T> implements Debug

impl<T> PrintInOption for T
where
    Option<T>: Debug,
{
    fn print_in_option(self) {
        println!("{:?}", Some(self));
    }
}

New Type idiom

The newtype idiom gives compile time guarantees that the right type of value is supplied to a program.

struct Years(i64);
struct Days(i64);

// we are sure that the semantic of the param is years.
fn old_enough(age: &Years) -> bool { age.0 >= 18 }

SCOPING

RAII

Rust enforces "Resource Acquisition Is Initialization".

In an object construction, resource acquisition must succeed for initialization to succeed.

RAII ties resources to object lifetime.

The lifetime of the object is bound to the scope of a variable. So when the variable goes out of scope the destructor releases the resources.

This behavior shields against resource leak bugs.

Destructor is provided through the Drop trait.

Mutability

Mutability of data can be changed when ownership is transferred.

Borrowing

Data can be immutably borrowed any number of times. While mutably borrowed, data cannot be borrowed in any other way. Data can be borrowed again after the mutable reference has been used for the last time.

Ref pattern

When doing pattern matching or destructuring via a let binding, the ref keyword can be used to take references to the fields of a struct/tuple.

let Point { x: ref mut ref_to_x, y: _ } = point;
// ref_to_x is a mutable reference to point.x.
*ref_to_x = 10; // updates point.x

LIFETIME

Construct used by the "borrow checker".

foo<'a>    // lifetime of foo may not exceed that of 'a

Any input which is borrowed MUST outlive the borrower.

The lifetime of a reference cannot exceed the lifetime of the variable binding it borrows.

There is a special lifetime, named 'static, which is valid for the entire program's lifetime. String literals are 'static.

Functions

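Ignoring "elision", function signatures with lifetimes have two constraints: any reference must have an annotated lifetime, and any reference being returned must have the same lifetime as an input or be 'static.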

Traits

Lifetime annotations in traits are similar to functions. Note that impl may have lifetime annotations too.

Bounds

Like generics, lifetimes can be bounded.

T: 'a  : All references in T must outlive lifetime 'a
T: Trait + 'a  : Type T must implement trait Trait and all references must
                 outlive 'a

Static

Static reference: the referenced object lives for the entire life of the program. It can still be coerced to a shorter lifetime.

Two ways:

When static string goes out of scope, the reference can no longer be used, but the data remains in the binary.

As a trait bound means that the type doesn't contain any non-static reference.

Elision

When there is a single input lifetime, it doesn't need to be named, and everything has the same lifetime.

If the function returns a reference and has a single reference input, the output lifetime is implicitly set equal to the input.

If it has multiple input references and an output reference, the lifetime shall be explicit.

For methods the output reference lifetime is implicitly set equal to the self lifetime.
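
The elision cases above, sketched with hypothetical functions:

```rust
// One reference in, one out: the lifetime is elided.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

// Two reference inputs: the compiler cannot guess, so 'a must be explicit.
fn longest<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}

struct Parser<'t> {
    text: &'t str,
}

impl<'t> Parser<'t> {
    // Method: the output lifetime is tied to &self by elision.
    // Assumes ASCII text, since it slices by byte offset.
    fn head(&self, n: usize) -> &str {
        &self.text[..n.min(self.text.len())]
    }
}
```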


INTERIOR MUTABILITY

Cell<T> and RefCell<T> allow mutating data through a shared reference (&self).

struct A {
    val: Cell<i32>,
    s: RefCell<String>,
}

impl A {
    fn change_unchangeable(&self) {
        let val = self.val.get(); // get a copy (the content needs to be Copy)
        self.val.set(val + 1);    // replace value in the cell
    }

    fn change_via_reference(&self) {
        let mut r = self.s.borrow_mut(); // get a mutable reference (a RefMut<T>)
        r.push_str("!");                 // directly modify the value in the RefCell
    }
}

RefCell<T>::borrow panics if the value is already mutably borrowed, and borrow_mut panics if it is already borrowed in any way (breaking one of the core Rust rules). With normal references the rule is enforced at compile time instead of at runtime.
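
A sketch of the runtime check, using the non-panicking try_borrow_mut variant so the conflict can be observed (refcell_rules is an illustrative name):

```rust
use std::cell::RefCell;

fn refcell_rules() -> (bool, bool) {
    let cell = RefCell::new(String::from("hi"));

    let reader = cell.borrow();
    // A borrow_mut() here would panic, because `reader` is still alive;
    // try_borrow_mut() reports the same conflict as an Err instead.
    let blocked = cell.try_borrow_mut().is_err();

    drop(reader);
    // No outstanding borrows: the mutable borrow now succeeds.
    let free = cell.try_borrow_mut().is_ok();

    (blocked, free)
}
```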

Cells are not thread safe. Thus Rust will not allow multiple threads to access them at once.


TRAITS

A feature that any given type may or may not support. A capability.

A trait can access methods declared in the same trait.

Self refers to the implementor type.

When a type implements a trait, to use the methods offered by the trait, the trait itself must be in scope. Otherwise the methods are hidden.

E.g. Vec<u8> implements Write but we must explicitly use std::io::Write to call the write method on the vector.

Some traits are markers (no methods). Used only to mark that certain thing can be done with a type.

Orphan rules

You can implement a trait for a type only if the trait or the type (or both) is defined in the current crate.

Derive

Derivable traits:

PartialEq

This trait can be used with #[derive]. When derived on structs, two instances are equal if all fields are equal.

PartialOrd: PartialEq

When derived on structs, it will produce a lexicographic ordering based on the top-to-bottom declaration order of the struct's members

Trait Object

A reference to a trait.

let mut bytes: Vec<u8> = Vec::new();
let writer: &mut dyn Write = &mut bytes;

Rust doesn't support downcasting a trait object to the concrete type directly (the std::any::Any trait is the explicit escape hatch).

A trait that uses the Self type is incompatible with trait objects.

trait Spliceable {
    fn splice(&self, other: &Self) -> Self;
}

// Error!!! splice requires the concrete type (Self is used in `splice`)
// Rust needs to know at compile time if left and right have the same type.
fn splice_anything(left: &dyn Spliceable, right: &dyn Spliceable) {
    left.splice(right);
}

One solution could be to use trait objects in the trait splice function. In this way it is not required that the input type matches the output.

Trait objects are allowed for traits with static methods only if those methods require Self to be Sized.

trait StringSet {
    fn new() -> Self
        where Self: Sized;
}

The static methods are still not callable through the trait object, but we can use all the others.

Return dyn Trait object

Similar to polymorphism. We can only return heap allocated trait objs.

// Animal is a trait
fn animal() -> Box<dyn Animal> { ... }
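
A complete version of the idea, as a sketch. Dog, Cow and the wants_milk parameter are invented for the example.

```rust
trait Animal {
    fn noise(&self) -> &'static str;
}

struct Dog;
struct Cow;

impl Animal for Dog {
    fn noise(&self) -> &'static str { "woof" }
}

impl Animal for Cow {
    fn noise(&self) -> &'static str { "moo" }
}

// The concrete type is chosen at runtime, so the value is boxed.
fn animal(wants_milk: bool) -> Box<dyn Animal> {
    if wants_milk { Box::new(Cow) } else { Box::new(Dog) }
}
```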

Operator overloading

Operators are syntactic sugar for method calls. For example '+' is the add method of the Add trait. The list of operator traits is in core::ops.
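
For example, a sketch of + on a hypothetical Point type:

```rust
use std::ops::Add;

#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

impl Add for Point {
    type Output = Point;

    fn add(self, rhs: Point) -> Point {
        Point { x: self.x + rhs.x, y: self.y + rhs.y }
    }
}

fn add_points(a: Point, b: Point) -> Point {
    a + b // sugar for Add::add(a, b)
}
```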

Drop

One method, drop, invoked when the object goes out of scope.

Drop can be forced by calling std::mem::drop() function.
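
A sketch that records the order in which values are dropped. Guard and its log field are hypothetical, and the RefCell is only there to let the destructors write to a shared log.

```rust
use std::cell::RefCell;

// Hypothetical resource that logs its name when it is released.
struct Guard<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl<'a> Drop for Guard<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _a = Guard { name: "a", log: &log };
        let _b = Guard { name: "b", log: &log };
        let c = Guard { name: "c", log: &log };
        drop(c); // explicit early release
    } // _b and then _a are dropped here, in reverse declaration order
    log.into_inner()
}
```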

Iterators

One method: next.

Automatically defined for arrays and ranges.

The for construct turns a collection into an iterator using the into_iter() method.

The method iter produces an Iterator over an array/slice/vector.
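
Implementing next is enough to get the for loop and all the adapters. A sketch with a made-up Countdown type:

```rust
// Hypothetical iterator counting down to 1.
struct Countdown(u32);

impl Iterator for Countdown {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.0 == 0 {
            None
        } else {
            let current = self.0;
            self.0 -= 1;
            Some(current)
        }
    }
}

fn countdown(from: u32) -> Vec<u32> {
    let mut seen = Vec::new();
    // into_iter() on something that is already an iterator returns itself.
    for n in Countdown(from) {
        seen.push(n);
    }
    seen
}
```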

impl Trait

If a function returns a type that implements MyTrait we can write its return type as -> impl MyTrait.

fn combine_vecs(u: Vec<i32>, v: Vec<i32>) -> impl Iterator<Item = i32> {
    v.into_iter().chain(u.into_iter()).cycle()
}

The alternative is to write the returned type explicitly, that may become complicated.

Copy

Only types for which a simple bit-for-bit copy suffices can be Copy. A tuple or fixed size array of Copy types is itself a Copy type.

Even if composed by only Copy members, by default, struct and enum types are not Copy. If composed only by Copy types, we can make them copy using the following annotation.

#[derive(Copy, Clone)]

Note: Copy: Clone - not the inverse.

Clone

Used to make a deep copy of the data tree.

Supertraits

A trait is a superset of another.

trait Student: Person {}    // Person is Student supertrait

Structures implementing Student trait are also required to implement Person.

Since Clone is more general than Copy, you can automatically make anything Copy be Clone as well.

`pub trait Copy: Clone { }`

Disambiguation

If a struct implements two traits having the same method, we need to disambiguate:

// two ways to disambiguate
let age = <Form as AgeWidget>::get(&form);
let name = NameWidget::get(&form);

In general, fully qualified syntax is able to resolve any ambiguity.

<Type as Trait>::function()

Fully qualified syntax is required when we have an associated function in the trait and we have multiple implementations of the trait. We need to help the compiler figure out what function to call.
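
A self-contained sketch of the Form example. The trait bodies and fields are assumptions, only the names come from the snippet above.

```rust
trait NameWidget {
    fn get(&self) -> String;
}

trait AgeWidget {
    fn get(&self) -> u8;
}

struct Form {
    name: String,
    age: u8,
}

impl NameWidget for Form {
    fn get(&self) -> String { self.name.clone() }
}

impl AgeWidget for Form {
    fn get(&self) -> u8 { self.age }
}

fn read(form: &Form) -> (String, u8) {
    // form.get() alone would be ambiguous and is rejected by the compiler.
    let name = NameWidget::get(form);
    let age = <Form as AgeWidget>::get(form);
    (name, age)
}
```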

SMART POINTERS

Shared Ownership (Rc and Arc)

Rc is used for single threaded applications.

let s: Rc<String> = Rc::new("hello".to_string());
let t = s.clone();

They are smart pointers, like Box, so methods of their content can be called transparently.

The referents are held immutably!!! Because it is not allowed to have multiple mutable references as a Rust general rule.
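
A sketch of the reference counter at work (rc_demo is an illustrative name):

```rust
use std::rc::Rc;

fn rc_demo() -> (usize, usize, usize) {
    let s: Rc<String> = Rc::new("hello".to_string());
    let t = Rc::clone(&s);             // bumps the counter, no string copy
    let during = Rc::strong_count(&s); // two owners at this point
    let len = t.len();                 // String method called through the Rc
    drop(t);                           // one owner gone
    (during, Rc::strong_count(&s), len)
}
```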

Borrowing

References are nonowning pointers.

Unlike Box or Rc, simple references have no effects on their referent's lifetime. They shall not outlive their referent.

As long as a value is read borrowed, even its owner cannot modify it. The value is locked down. Similarly, if the value is write borrowed, nobody can read it, even its owner.

let r = &val;   // borrowed
//val = 19;     // cannot modify it because is still borrowed
println!("{}", r);

Implicit Dereference

Since references are so widely used, the . operator implicitly dereferences its left operand, if needed.

There can be references to references. The . operator follows as many references as it takes to find its target.

Like the . operator, comparison operators "see through" any number of references.

std::ptr::eq // compares references as addresses.

Lvalues must explicitly use the * operator.

Iterations

Iterating over a shared reference to a container (e.g. &Vec) is defined to produce shared references to each entry (&T). This is similar to using the iter() method.


MACROS

Unlike C macros, Rust macros are expanded into abstract syntax trees.

Macros are created using the macro_rules! macro.

All of name!(), name![] or name!{} invoke a macro.

Macro definition

After the macro identifier, follows a list of patterns. The content of the first pattern that matches replaces the macro invocation.

macro_rules! print_expression {
    // arguments list with type (designator)
    ($expression: expr) => {
        // the macro will expand to this block
        println!("{:?} = {:?}", stringify!($expression), $expression);
    };
}

Available designators:

Macros can use + in the argument list to indicate that an argument may repeat at least once, or *, to indicate that the argument may repeat zero or more times.

E.g. $(...),+ will match one or more expression, separated by commas

macro_rules! find_min {
    // Base case:
    ($x:expr) => ($x);
    // `$x` followed by at least one `$y,`
    ($x:expr, $($y:expr),+) => (
        // Call `find_min!` on the tail `$y`
        std::cmp::min($x, find_min!($($y),+))
    )
}

ERRORS

Use panic for unrecoverable errors.

Use Option when the lack of a value is not an error condition.

Use Result when the caller has to deal with the error.

Use unwrap or expect to get the Option<T>/Result<T, E> value only during quick prototyping. These return the value T or panic.

enum Option<T> { Some(T), None }

If x is an Option then x? returns the value if x is Some, otherwise returns None from the function. We can chain multiple ? together.
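
A sketch of ? chained on Options (second_initial is a made-up helper):

```rust
// First character of the second word, uppercased, if there is one.
fn second_initial(s: &str) -> Option<char> {
    let word = s.split_whitespace().nth(1)?; // returns None early
    let c = word.chars().next()?;
    Some(c.to_ascii_uppercase())
}
```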

The map method transforms an Option<T> into an Option<U>

fn process_option(val: Option<i32>) -> Option<f32> {
    let ret = val.map(|i| i as f32); // if None then map returns None but
                                    // function goes on (not like unwrap)
    println!("...");
    ret
}

Method and_then: if option is Some calls its function input with the wrapped value and returns the result; if is None, just returns None.

Instead of:

match have_recipe(food) {
    None       => None,
    Some(food) => match have_ingredients(food) {
        None       => None,
        Some(food) => Some(food),
    },
}

We can write:

have_recipe(food).and_then(have_ingredients)

enum Result<T, E> { Ok(T), Err(E) }

A richer version of Option that describes possible error instead of possible absence.

The Option methods, like map, and, and_then are implemented for Result as well.

type AliasedResult<T> = Result<T, std::num::ParseIntError>;

fn multiply(a: &str, b: &str) -> AliasedResult<i32> {
    // If `Ok(a_val)`, `and_then` calls the closure with `a_val`
    a.parse::<i32>().and_then(|a_val| {
        // If `Ok(b_val)` then transform Ok(b_val) into Ok(a_val * b_val)
        b.parse::<i32>().map(|b_val| {
            a_val * b_val
        })
    })
}

?

? is similar to an unwrap which returns instead of panicking on Err.

In legacy code the try! macro may be found.

let val = val_str.parse::<i32>()?;
let val = try!(val_str.parse::<i32>()); // equivalent to `?` (2015 edition only, `try` is reserved since 2018)
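
As a sketch, the multiply function from the Result section can be written with ? instead of the combinators:

```rust
use std::num::ParseIntError;

fn multiply(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let a_val = a.parse::<i32>()?; // on Err, return it to the caller
    let b_val = b.parse::<i32>()?;
    Ok(a_val * b_val)
}
```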

Using Option with Results


TESTING

Assert

The assert! macros are typically used to test conditions.

Assert macros can also be used in ordinary code, and they are included in release builds as well. Use debug_assert! to write assertions checked only in debug builds.

Test functions are marked with the #[test] attribute.

Tests can be ignored by marking them with the #[ignore] attribute.

Unit Tests

Typically in the same source file as the functionality under test. Wrapped into a tests module under the #[cfg(test)] conditional compilation attribute.

Tests that are expected to panic are tagged with the #[should_panic] attribute. The attribute takes an additional string with the panic text for more granular testing: #[should_panic(expected = "Divide by zero")].
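
A sketch putting the attributes together. The divide function and the test names are invented for the example.

```rust
pub fn divide(a: i32, b: i32) -> i32 {
    if b == 0 {
        panic!("Divide by zero");
    }
    a / b
}

// Compiled only by `cargo test`.
#[cfg(test)]
mod divide_tests {
    use super::*;

    #[test]
    fn divides() {
        assert_eq!(divide(10, 2), 5);
    }

    #[test]
    #[should_panic(expected = "Divide by zero")]
    fn rejects_zero() {
        divide(1, 0);
    }

    #[test]
    #[ignore] // skipped by default, run with `cargo test -- --ignored`
    fn slow_case() {
        assert_eq!(divide(i32::MAX, 1), i32::MAX);
    }
}
```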

Integration Tests

Supposed to use the library as an end-user does. Using the library as an external crate. Thus only the public API can be used.

The integration test files live in the tests directory, alongside the src dir.

Cargo compiles each integration test as a separate, standalone crate.

One way to share code between integration tests is to write a module with public functions, importing and using it within tests (e.g. tests/common.rs).

Documentation Tests

Tests embedded in the documentation.

Blocks of code in a doc comment are compiled as a separate executable crate and linked to the library.

To test only the doc tests: cargo test --doc.

To prevent Rust from running a piece of code in the doc:

```no_run
some code
```

To prevent Rust from also compiling a piece of code in the doc:

```ignore
some code
```

Test that should panic

```rust,should_panic
some code
```

Test dependencies

Dependencies used only by the tests are inserted in ad-hoc Cargo.toml section:

[dev-dependencies]
foo = "0.0.1"

DOCUMENTATION

Comments starting with /// are used to document specific items. Comments starting with //! are used to document a module or a crate.

Markdown syntax can be used.

To generate HTML documentation:

$ cargo doc --no-deps --open

--no-deps: generate documentation only for the crate and not the dependencies.

Documentation is generated only for the pub features.

Relevant sections

Markdown can be used to emphasize important sections:


RAW POINTERS

A raw pointer is like a C/C++ pointer.

Can be obtained by casting a reference to a value.

let val = 3;
let ptr = &val as *const i32; // const shall be explicit

To get a raw pointer from a Box:

let b = Box::new(3);
let ptr = &*b as *const i32;

Raw pointers can be dereferenced only in unsafe blocks/functions.
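
A minimal sketch (raw_demo is an illustrative name): creating the pointers is safe, only the dereference needs unsafe.

```rust
fn raw_demo() -> (i32, i32) {
    let mut val = 3;

    let read_ptr = &val as *const i32;
    // Reading through a raw pointer requires an unsafe block.
    let before = unsafe { *read_ptr };

    let write_ptr = &mut val as *mut i32;
    // Same for writing through a *mut pointer.
    unsafe {
        *write_ptr += 1;
    }

    (before, val)
}
```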

A raw pointer to an unsized type is a fat pointer (like the corresponding reference or Box type). Thus it contains a length along with the address.

let r: &[u8] = b"Hello";
let p = r as *const [u8]; // fat raw pointer: addr + len

A raw pointer to a trait object carries a vtable.

In contrast to safe pointers (as Box), raw pointers dereference shall always be explicit.

(*raw).method();

Raw pointers do not implement Deref, so deref coercions do not apply to them.

Operators like == and < compare raw pointers as addresses.

Unlike references, raw pointers are neither Send nor Sync. As a result, any type that includes raw pointers does not implement these traits by default.


MISC

Unsafe Rust

Primary things requiring unsafe blocks:

Raw identifier

Allows using reserved language "keywords" as identifiers.

foo::r#try(); // `try` is a keyword

Sized trait

Some types are unsized.

The str and [T] (without an &) are unsized. They denote a sequence of values that may have any length.

The same applies to the referent of a "trait object". Since the concrete type can be of any type implementing the trait, the trait referent (i.e. the pointed object) is considered unsized.

Unsized types cannot be stored in variables or be passed as arguments. We can deal with them via pointers (like &str or Box), which themselves are sized.

Pointers to unsized types are always "fat pointers". For example, str pointers contain a length, trait objects contain a vtable pointer.

All sized types implement the std::marker::Sized trait. This trait is implemented automatically by the compiler; it cannot be implemented manually.

For generics, Sized type bound is the implicit default. That is, if you write struct S<T> { ... } Rust understands you to mean struct S<T: Sized> { ... }. If you do not want to constrain T this way, you must explicitly opt out, writing struct S<T: ?Sized> (not necessarily Sized).

struct S<T: ?Sized> {
    b: Box<T>,
}

You can thus write:

let s: S<dyn Write> = ...;

Only the last element of a struct can be declared as an unsized type.

struct Foo<T: ?Sized> {
    count: usize,
    value: T,           // allowed because it is the last element.
}

In this case, the struct is itself unsized.

Send and Sync traits

Types implementing Send are safe to pass by value to another thread. They can be moved across threads.

Types implementing Sync are safe to pass by non-mut reference to another thread. They can be shared across threads.

Most types are both Send and Sync. You don't have to use #[derive] explicitly. A struct or enum is Send/Sync if its fields are Send/Sync.

In the standard library almost all Sync types are also Send (MutexGuard is a notable exception). Examples of non-Sync types: Cell<usize>, Receiver<u8>. Examples of non-Send types: Rc<String>, *mut u8.

Values moved between threads using a closure or using a channel shall be Send.
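
Both cases in a short sketch (send_demo is an illustrative name):

```rust
use std::sync::mpsc;
use std::thread;

fn send_demo() -> (usize, String) {
    let words = vec![String::from("rust"), String::from("fossil")];

    // Vec<String> is Send, so it can be moved into the thread's closure.
    let handle = thread::spawn(move || words.len());

    // Values sent through a channel must be Send as well.
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || tx.send(String::from("done")).unwrap());

    (handle.join().unwrap(), rx.recv().unwrap())
}
```

Replacing the Vec with an Rc<String> would make the first spawn fail to compile, since Rc is not Send.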

Notable Crates

Miscellanea

Random generators

Data encoding

Concurrency

Learning Sources

TODO
