We’re big fans of safety-focused languages at FP Complete. As our previous blog post comparing Rust and Haskell made clear, we think both of these are great languages. A recurring theme we have for both internal and client projects is: which language should we use? The general guidelines in that blog post apply:
A large part of the higher safety of Haskell is its expressive type system. You can simply state more invariants, in general, in Haskell than in Rust.
However, I’d like to point out a situation where Rust’s safety guarantees are in fact stronger than Haskell’s, and how the compiler naturally pushes us towards the correct, safe solution.
My general recommendation about mutable variables when giving Haskell training is:
If you follow that advice, the bug below won’t occur, or at least won’t occur implicitly. That said, the code below is entirely valid Haskell, so I don’t feel that cheeky pointing it out.
Question: Will the following code throw an exception, or will it exit successfully?
main :: IO ()
main = do
ref <- newIORef (0 :: Int)
secret $ do
x <- randomIO
writeIORef ref x
threadDelay 1000
y <- readIORef ref
unless (x == y) $ error "failure!"
putStrLn "All good!"
This code creates a reference. Then, bracketed by some function
secret
, it will generate a random Int
,
write it to the reference, sleep for 1ms, read the reference out
again, and ensure the written and read values are the same.
So will the code throw an exception? The right answer is really
“it depends.” We’ve invoked a function called secret
,
with the type IO () -> IO ()
. Obviously, if
secret
is defined as secret _ = error
"hahaha"
, this code will error out. But I’m talking about
non-trivial failure cases.
One valid definition for secret
is the identity
function:
secret :: IO () -> IO ()
secret = id
And with this definition, our code exits successfully. Hurrah!
However—and you may have seen this coming—it’s fairly easy to
come up with an implementation of secret
which causes
our code to fail:
secret :: IO () -> IO ()
secret = replicateConcurrently_ 1000
We’ve now spawned 1,000 threads running concurrently, all trying to write to the same reference. Assuming one version of how these threads get scheduled, all 1,000 threads first write their values to the shared reference. Then, they all take a nap for 1ms. Finally, they all read the value, and 999 of them are disappointed with the result.
The right answer to this in Haskell is what I said above: avoid
mutable references, and prefer STM. If you use STM, you’ll need to
explicitly bracket your mutable actions with an
atomically
call, and it will be far more obvious when
you’re allowing other threads to mutate your variables.
Click below to learn more about a unique offer
Let’s write an equivalent main
function in
Rust:
fn main() {
let mut var = 0isize;
secret(|| {
let x = rand::random();
var = x;
sleep(Duration::from_millis(1));
assert_eq!(x, var);
});
}
Ignoring silly cases like secret
immediately
panic!
ing, will this code exit successfully? Our gut
reaction is “it depends.” But again, ignoring the silly cases, what
can go wrong?
Firstly, let’s try out the identity case for Rust. Writing higher order functions is a bit more verbose than in Haskell, but not too bad:
fn secret<F: FnMut()>(mut f: F) {
f();
}
Pro tip, especially to those who read
my Rust crash course lesson 5: you can use FnOnce
here instead. I used FnMut
because of the rest of the
code in this post.
If you squint hard enough, you’ll probably agree that this is basically the same as the above.
Alright, that’s all well and good, but how about throwing in a curve ball with multiple threads? Let’s really test out that Rust “fearless concurrency” concept:
fn secret<F: FnMut()>(mut f: F) {
let mut handles = vec![];
for _ in 0..1000 {
handles.push(spawn(f));
}
for handle in handles {
handle.join().unwrap();
}
}
This code spawns 1,000 threads, each running the same
f
closure. It then blocks on each of these completing
(via join
), and will then re-panic!
in
the main thread if any of the child threads panic!
ed.
For our purposes, it’s quite similar to the
replicateConcurrently_
in Haskell-land.
Obviously this code must fail at runtime, right? It has the same kind of logic bug as the Haskell code. However, the Rust code doesn’t even get that far. Instead, we get a compile time error:
error[E0277]: `F` cannot be sent between threads safely
--> src/main.rs:17:22
|
17 | handles.push(spawn(f));
| ^^^^^ `F` cannot be sent between threads safely
|
= help: the trait `std::marker::Send` is not implemented for `F`
= help: consider adding a `where F: std::marker::Send` bound
= note: required by `std::thread::spawn`
Oh, right. We can’t send data between threads unless we have a
Send
trait. Fortunately, the compiler tells us how to
fix that one right away. We change the type signature of
secret
to:
fn secret<F: FnMut() + Send>(mut f: F) {
But now we get a new error:
error[E0310]: the parameter type `F` may not live long enough
--> src/main.rs:17:22
|
14 | fn secret<F: FnMut() + Send>(mut f: F) {
| -- help: consider adding an explicit lifetime bound `F: 'static`...
...
17 | handles.push(spawn(f));
| ^^^^^
|
note: ...so that the type `F` will meet its required lifetime bounds
--> src/main.rs:17:22
|
17 | handles.push(spawn(f));
| ^^^^^
Oh, huh. F
may not live long enough. Perhaps the
closure—or references it has captured—will be dropped before the
other thread completes. Again, the compiler tells us how to fix
this: add the 'static
lifetime parameter:
fn secret<F: FnMut() + Send + 'static>(mut f: F) {
Darn, now we’ve got two problems: one in the
main
function, and one in spawn
:
error[E0597]: `var` does not live long enough
--> src/main.rs:8:9
|
6 | secret(|| {
| -- value captured here
7 | let x = rand::random();
8 | var = x;
| ^^^ borrowed value does not live long enough
...
12 | }
| - `var` dropped here while still borrowed
|
= note: borrowed value must be valid for the static lifetime...
error[E0382]: use of moved value: `f`
--> src/main.rs:17:28
|
17 | handles.push(spawn(f));
| ^ value moved here, in previous iteration of loop
|
= note: move occurs because `f` has type `F`, which does not implement the `Copy` trait
Ugh, this is getting tedious. Fine, let’s try to solve the
problem in secret
. We’ve moving the
f
closure on each iteration through the
for
loop. Let’s clone it instead. To do that, we need
to add a Clone
trait to our secret
function:
fn secret<F: FnMut() + Send + 'static + Clone>(mut f: F) {
let mut handles = vec![];
for _ in 0..1000 {
handles.push(spawn(f.clone()));
}
for handle in handles {
handle.join().unwrap();
}
}
Now, we’ve finally written a valid secret
function. But our main
function is still broken:
error[E0277]: the trait bound `&mut isize: std::clone::Clone` is not satisfied in `[closure@src/main.rs:6:12: 11:6 var:&mut isize]`
--> src/main.rs:6:5
|
6 | secret(|| {
| ^^^^^^ within `[closure@src/main.rs:6:12: 11:6 var:&mut isize]`, the trait `std::clone::Clone` is not implemented for `&mut isize`
|
= help: the following implementations were found:
<isize as std::clone::Clone>
= note: required because it appears within the type `[closure@src/main.rs:6:12: 11:6 var:&mut isize]`
note: required by `secret`
--> src/main.rs:14:1
|
14| fn secret<F: FnMut() + Send + 'static + Clone>(mut f: F) {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The closure captures a mutable reference to a variable. But
mutable references cannot be cloned! Because, if we did that, we’d
end up with a data race. (Kind of like that original bug in the
Haskell program, right?) We’ll also run into a lifetime
problem, but the compiler hasn’t even identified that here since
it’s so hung up on the lack of a Clone
.
Well, we know how to solve this kind of situatuion in Rust: use a reference counted wrapper. Let’s go ahead and throw that at the problem:
use std::rc::Rc;
fn main() {
let mut var = Rc::new(0isize);
secret(|| {
let x = rand::random();
*var = x;
sleep(Duration::from_millis(1));
assert_eq!(x, *var);
});
}
And, once again, Rust doesn’t like this:
error[E0277]: `std::rc::Rc<isize>` cannot be shared between threads safely
--> src/main.rs:7:5
|
7 | secret(|| {
| ^^^^^^ `std::rc::Rc<isize>` cannot be shared between threads safely
|
= help: the trait `std::marker::Sync` is not implemented for `std::rc::Rc<isize>`
= note: required because of the requirements on the impl of `std::marker::Send` for `&std::rc::Rc<isize>`
= note: required because it appears within the type `[closure@src/main.rs:7:12: 12:6 var:&std::rc::Rc<isize>]`
note: required by `secret`
--> src/main.rs:15:1
|
15| fn secret<F: FnMut() + Send + 'static + Clone>(mut f: F) {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
An Rc
is a non-atomic reference counted variable.
It’s slightly more efficient than an Arc
, because it
doesn’t need to use atomic primitives on the reference count
variable. But, we can’t send it to another thread, which is exactly
what we’re trying to do. (It’s also what the Haskell type system
doesn’t even notice it’s letting us do.)
Switching over to an Arc
instead solves that
problem with almost no code change. We also need to dereference the
var
when using it now:
use std::sync::Arc;
fn main() {
let var = Arc::new(0isize);
secret(move || {
let x = rand::random();
*var = x;
sleep(Duration::from_millis(1));
assert_eq!(x, *var);
});
}
But the compiler isn’t happy with this either!
error[E0594]: cannot assign to data in a `&` reference
--> src/main.rs:9:9
|
9 | *var = x;
| ^^^^^^^^ cannot assign
error: aborting due to previous error
An Arc
will dereference into an immutable
reference. So of course we can’t write to it! What we’ve
essentially created with our Arc<isize>
is the
equivalent of an immutable Int
in Haskell, which can
be shared between multiple threads, but never written to. In order
to allow writing, we need some kind of cell wrapper. We could test
the waters with RefCell
, but I’m sure everyone’s
falling asleep reading at this point, so let’s just cut to the
chase:
extern crate rand; // not needed in Rust 2018!
use std::thread::{spawn, sleep};
use std::time::Duration;
use std::sync::{Arc, Mutex};
fn main() {
let var = Arc::new(Mutex::new(0isize));
secret(move || {
let x = rand::random();
let mut var = var.lock().unwrap();
*var = x;
sleep(Duration::from_millis(1));
assert_eq!(x, *var);
});
}
fn secret<F: FnMut() + Send + 'static + Clone>(f: F) {
let mut handles = vec![];
for _ in 0..1000 {
handles.push(spawn(f.clone()));
}
for handle in handles {
handle.join().unwrap();
}
}
We’ve gone ahead and locked a mutex, ensuring that no other thread has the possibility of modifying our variable while we’re working with it. This code will compile successfully (finally!), and will complete successfully at runtime.
While Haskell generally has better safety guarantees than Rust, there are some cases where the explicit tracking of lifetimes and thread safety—versus Haskell’s garbage collection and explicit-mutable-types—produces a safer result. Specifically, in our case, the following features of Rust forced us into using the right solution:
The following features also play in, but overlap strongly with Haskell:
I don’t want to imply that it’s impossible to make a logic error in the Rust code. What I am saying is that Rust naturally pushed us towards the correct solution here. However, it’s possible for us to reduce the scope of the mutex lock and end up in the same situation as the Haskell code:
fn main() {
let var = Arc::new(Mutex::new(0isize));
secret(move || {
let x = rand::random();
{
let mut var = var.lock().unwrap();
*var = x;
}
sleep(Duration::from_millis(1));
{
let var = var.lock().unwrap();
assert_eq!(x, *var);
}
});
}
However, this code is more blatant about its lack of thread safety, just like the Haskell STM implementation is:
main :: IO ()
main = do
var <- newTVarIO (0 :: Int)
secret $ do
x <- randomIO
atomically $ writeTVar var x
threadDelay 1000
y <- atomically $ readTVar var
unless (x == y) $ error "failure!"
putStrLn "All good!"
While both the Haskell and Rust versions presented will fail at runtime, it’s fair to claim that the language and libraries did all they possibly could to help the programmer avoid the problem. In the Rust case, we explicitly delimited the scope and locked the mutex twice. In the Haskell code, we explicitly elected to perform two different atomic transactions.
When dealing with concurrent code—or really any code—it’s impossible to create a language that will prevent the possibility of all logic errors. Instead, our hope is that we avoid most logic errors with the help of our tooling. And I believe both Haskell and Rust assist with this significantly.
Learn more about Haskell and Rust at FP Complete.
Interested in exploring these languages for your team? Find out more about our consulting and training offerings.
Subscribe to our blog via email
Email subscriptions come from our Atom feed and are handled by Blogtrottr. You will only receive notifications of blog posts, and can unsubscribe any time.