Let's say I'm writing some program in Rust that needs to perform an operation every 60 seconds. Easy enough, we can do something like this:
fn background_task() {
loop {
println!("Performing background task.");
std::thread::sleep(std::time::Duration::from_secs(60));
}
}
fn main() {
background_task();
}
The Rust compiler is pretty smart about identifying a non-terminating loop. For example, if I add the following line at the end of the loop:
println!("Loop finished");
The code still compiles, but I get the following warning:
warning: unreachable statement
--> src/main.rs:6:5
|
2 | / loop {
3 | | println!("Performing background task.");
4 | | std::thread::sleep(std::time::Duration::from_secs(60));
5 | | }
| |_____- any code following this expression is unreachable
6 | println!("Loop finished");
| ^^^^^^^^^^^^^^^^^^^^^^^^^ unreachable statement
This infinite loop detection doesn't just generate warnings about unreachable code. It also impacts the types within your code. In our current version of the code, the loop evaluates* to a unit value (), and similarly the background_task function returns unit. We can actually change that around pretty easily:
fn background_task() -> u32 {
loop {
println!("Performing background task.");
std::thread::sleep(std::time::Duration::from_secs(60));
}
}
* Note: we'll see a bit later that this is a simplification, we'll get the details later.
All I did was add a return type to the background_task function, and everything still type checked! It makes sense in the call site within main: sticking a semicolon at the end of the statement means "discard the returned value," so the u32 that we claim is being returned is just thrown out.
But how can the same loop evaluate to both a unit value and a u32 value?
The bottom type
To understand this, we're going to take a quick trip down type theory lane. First, let's understand the relationship between types and values. Types define a set of values. For example, the type u32 represents all integers from 0 through 2^32 - 1. The String type represents all possible sequences of Unicode code points of any length. And so on.
There's a concept in type theory known as the bottom type. The Wikipedia definition is a bit dense:
the bottom type of a type system is the type that is a subtype of all other types
I find it easier to think of the bottom type as: the type that has no values at all. And it's easy to create such a type ourselves in Rust:
enum Bottom {}
Since there are no variants for this enum, it's impossible to create a value for it. Also, since we already saw that an infinite loop can return any type, we can modify our background_task function to return Bottom:
fn background_task() -> Bottom {
loop {
println!("Performing background task.");
std::thread::sleep(std::time::Duration::from_secs(60));
}
}
Logically speaking, we know that there's no way for a Bottom value to come into existence. Therefore, when we see a value of type Bottom, we can be assured that we'll never reach that code. We can somewhat cheekily enforce this with a helper method and a match:
impl Bottom {
fn use_it(self) {
match self {}
}
}
fn main() {
let bottom = background_task();
bottom.use_it();
}
This still isn't terribly useful. And more than that, this code generates some unfortunate warnings (just including the first for brevity):
warning: unreachable expression
--> src/main.rs:18:5
|
17 | let bottom = background_task();
| ----------------- any code following this expression is unreachable
18 | bottom.use_it();
| ^^^^^^ unreachable expression
|
note: this expression has type `Bottom`, which is uninhabited
While not yet useful, it does demonstrate that we can use a bottom type to represent "this can never happen." Let's see another use case for this.
Impossible errors
Sometimes, we have a trait that requires defining some error type. A good example of this is TryFrom. Let's implement a helper MyU32 type and then provide a TryFrom to convert from a u64. (In practice, you should try to reuse the existing TryFrom impl for u32, I'm doing manual testing and casting to be more explicit.)
#[derive(Debug)]
struct MyU32(u32);
#[derive(Debug)]
struct TooBigError;
impl TryFrom<u64> for MyU32 {
type Error = TooBigError;
fn try_from(value: u64) -> Result<MyU32, TooBigError> {
if value > (u32::MAX as u64) {
Err(TooBigError)
} else {
Ok(MyU32(value as u32))
}
}
}
fn main() {
println!("{:?}", MyU32::try_from(42u64));
println!("{:?}", MyU32::try_from(u64::MAX));
}
Now let's try converting from a smaller integer type instead: moving from a u16 to a u32. This is a total conversion: every possible input value from u16 maps to a value u32 output. No errors are possible. But the structure of TryFrom still requires defining an error type. So what can we do? One possibility is to continue using the TooBigError type, but that name is incredibly confusing.
Alternatively, we could change the error name to something like CannotError so that the code at least looks reasonable. But when we try to do more than just print out the results, we'll begin to see the pain points in the code:
#[derive(Debug)]
struct MyU32(u32);
#[derive(Debug)]
struct CannotError;
impl TryFrom<u16> for MyU32 {
type Error = CannotError;
fn try_from(value: u16) -> Result<MyU32, CannotError> {
Ok(MyU32(value as u32))
}
}
fn main() {
match MyU32::try_from(u16::MAX) {
Ok(value) => println!("{}", value.0),
Err(_) => unreachable!(),
}
}
Notice that we needed to explicitly add in an unreachable! macro call. I consider this a code smell. If possible, I want to avoid making assertions that code is unreachable, since my code analysis may be wrong. Instead, I'd rather rely on the compiler to guarantee it for me.
As it turns out, our bottom trick fits in perfectly here too! We need to have a type to represent the errors, but we want to say "there are no errors possible." Solution? Use a bottom type for it! We can simply change our CannotError type to be:
#[derive(Debug)]
enum CannotError {}
And then, in our main function, we no longer need to consider the error case at all. We can instead use an irrefutable pattern match since we know that the returned value will always be an Ok:
fn main() {
let Ok(value) = MyU32::try_from(u16::MAX);
println!("{}", value.0);
}
And because this is such a common use case, it's already provided straight from the standard library. Replace CannotError with std::convert::Infallible, and you get exactly the same behavior.
This is an example where I believe type theory is a huge win. Our code is now:
- More expressive because the return type of
try_fromtells us that an error cannot occur. - More robust because future changes that reintroduce a possible error case will cause the call sites of
try_fromto update themselves to handle that case.
To summarize to this point: bottom types are easy to define in Rust, and Infallible is a great example from the standard library. Bottom types allow us to say "there's no way a value of this type will be produced," which can be used to both represent non-termination (a function that never exits) or code branches that are not triggered (like errors). Now let's get into ergonomics.
What should background_task return?
With our newfound understanding of bottom types in place, let's revisit our initial use case: the non-terminating function.
fn background_task() {
loop {
println!("Performing background task.");
std::thread::sleep(std::time::Duration::from_secs(60));
}
}
What return type should this have ideally? With this implementation (returning unit), we know of at least one downside: the type signature of the function doesn't tell us that the function is non-terminating. That's not necessarily a problem at all, more of a lost perk.
But the actual motivating case for this post is slightly different. In that case, we have multiple background jobs running within a tokio JoinSet, and we need to unify those types together. I could demonstrate that with Tokio, but that would require pulling in async code, and we're not ready for that yet.
So, instead, I've implemented a mini non-async JoinSet to try things out. (Side note: please don't use this in production, I threw it together quickly for this post, it's likely broken in subtle ways.) With that in place, let's have a look at our new main module:
mod join_set;
fn background_task1() {
loop {
println!("Performing background task 1.");
std::thread::sleep(std::time::Duration::from_secs(60));
}
}
fn background_task2() -> Result<(), String> {
let mut counter = 0;
loop {
counter += 1;
println!("Performing background task 2.");
std::thread::sleep(std::time::Duration::from_secs(2));
if counter >= 3 {
return Err("Background task 2 errored".to_owned());
}
}
}
fn main() -> Result<(), String> {
let mut set: join_set::JoinSet<Result<(), String>> = join_set::JoinSet::new();
set.spawn(background_task1);
set.spawn(background_task2);
while let Some(res) = set.join_next() {
match res {
Err(e) => return Err(format!("Background task panicked: {e:?}")),
Ok(Err(e)) => return Err(format!("Background task errored out: {e}")),
Ok(Ok(())) => (),
}
}
Ok(())
}
This code does not compile. We are trying to launch both background_task1 (which returns unit) and background_task2 (which returns a Result), which leads to the compiler complaining about types not matching up:
error[E0271]: expected `background_task2` to return `()`, but it returns `Result<(), String>`
Let's walk through a few ways we could try to fix this.
Return a Result from background_task1
This is probably the easiest to implement. We just change the signature of background_task1 to:
fn background_task1() -> Result<(), String> {
Advantage It works with minimal fuss.
Disadvantage If we want to use this function on its own, we now have a spurious error case we need to handle. And if we wanted to use this function in a JoinSet with other functions that return something besides Result, we'd have to play this dance again.
Wrap while spawning
Instead of changing the function itself, we can simply change the type at the call site:
set.spawn(|| {
background_task1();
Ok(())
});
set.spawn(background_task2);
Advantage The type signatures of each function remain accurate. Or at least as close to accurate as a unit return type is from a non-terminating function.
Disadvantage It's quite a bit of boilerplate to include all over the place.
Non-solution: Infallible
We used Infallible above as a bottom type. Let's use it here too! Minimally, we'd want to change background_task1 to be:
fn background_task1() -> std::convert::Infallible {
And while we're at it, we may get adventurous and modify background_task2 as well:
fn background_task2() -> Result<std::convert::Infallible, String> {
This return type means: a successful return is impossible, but an error return is still possible.
The problem is that this still won't compile. Instead, we get the error message:
expected `background_task2` to return `Infallible`, but it returns `Result<Infallible, String>`
Even though:
- conceptually a bottom type is a subtype of all other types, and
- per type theory, the
Infallibletype is uninhabited and therefore meets the definition of a bottom type, - nonetheless, Rust doesn't have any special handling of
Infallibleto allow its values to unify with or be coerced into other types
I could manually address this by doing explicit pattern matching on the returned value from background_task1:
set.spawn(|| {
let infallible = background_task1();
match infallible {}
});
This has two main problems:
-
It's still verbose and non-ergonomic.
-
As written, the code generates a warning from the compiler:
26 | let infallible = background_task1(); | ------------------ any code following this expression is unreachable 27 | match infallible {} | ^^^^^^^^^^ unreachable expression
However, even more interestingly, since we also changed the return type of background_task2 to return a Result<Infallible, String> (using Infallible instead of unit), the compiler is now able to see that there's no way for one of the background tasks to exit successfully. Therefore, every time join_next returns, it's either because of a thread panic or a task erroring out. Since both of those result in a short-circuiting return, the compiler correctly tells us that this loop never actually loops. We can now replace our while loop with something simpler:
let res = set
.join_next()
.expect("Impossible: received a None from join_next, but there are threads");
match res {
Err(e) => Err(format!("Background task panicked: {e:?}")),
Ok(Err(e)) => Err(format!("Background task errored out: {e}")),
// x here has type Infallible, so we can use a match
// to prove this branch will never be taken.
// We could even leave this case out, the compiler will figure
// this out for us!
Ok(Ok(x)) => match x {},
}
The code is more explicit: we will only ever wait for one thread to exit, and when that happens, we will exit our process with an error.
These are good improvements, but still not ergonomic. Time we finally introduce the "real" solution!
The never type
Surprise: Rust has a built-in bottom type! You may be wondering why I waited this long to mention it. Well, for one reason, I wanted to motivate the understanding of it. For another reason... well, just wait.
The never type is written as !. And just like our other potential return types, setting it as the return type for background_task1 works just fine:
fn background_task1() -> ! {
loop {
println!("Performing background task 1.");
std::thread::sleep(std::time::Duration::from_secs(60));
}
}
Unfortunately, the same cannot be said of background_task2. If I change its signature to:
fn background_task2() -> Result<!, String> {
I get the error message:
error[E0658]: the `!` type is experimental
...
= note: see issue #35121 <https://github.com/rust-lang/rust/issues/35121> for more information
Oops. Exactly the thing we want to use, but it doesn't work! Drat.
But that's OK, for now let's simply put that function back to returning Result<(), String> but keep ! as the return type for background_task1. And to make the errors a bit more informative, I'll spawn background_task2 before background_task1:
set.spawn(background_task2);
set.spawn(background_task1);
The next error shows us an even more critical limitation of the never type in Rust:
expected `background_task1` to return `Result<(), String>`, but it returns `!`
The Rust compiler is not treating ! as a true "bottom type" that is automatically the subtype of every other type. Instead, ! is being treated as its own type which is distinct from Result<(), String>. What we would want instead would be for these two types to unify, or for the ! return value to be coerced into a Result<(), String>.
Fortunately, these features are planned. You can see more information on the upcoming improved never type at:
The never type is used by the compiler today in some cases. In fact, we alluded to one above already: infinite loops:
- When you have an infinite loop, it actually evaluates to
!. - Today, Rust has special coercions from
!in some positions, but doesn’t treat it as a regular, first-class subtype in all type relationships. - Therefore, in some special cases, we can set the final expression within the function body to a loop, and Rust will allow it to automatically coerce to whatever the return type of the function is defined as.
- That’s why our original
loop { .. }could "pretend" to return () oru32depending on the function signature: the loop has type!, and Rust lets!coerce to whatever return type the function declares.
So to summarize up until this point:
- A true bottom type would allow us to have our cake and eat it too: ergonomic code, clarity at the type level about termination and error conditions, and code simplifications like removing unnecessary loops.
- The never type in Rust is slated to provide all this functionality.
- At present, however, it only does that in a few select cases.
For the rest of this post, we're going to try to come up with an ergonomic-enough solution that works in today's stable Rust. And in order to do that, we're going to take a quick detour, and talk about Haskell's Void and absurd, plus universal quantification.
The Haskell detour
Haskell's standard library (base) has a built-in module called Data.Void which provides the Void uninhabited type and some helper functions: absurd (which we'll get to) and vacuous (which I'll leave as an exercise to the reader).
import Data.Void (Void)
import Control.Concurrent (threadDelay)
backgroundTask1 :: IO Void
backgroundTask1 = do
putStrLn "Performing background task 1."
threadDelay (1 * 1000 * 1000) -- given in microseconds
backgroundTask1 -- no explicit loops in Haskell, we use recursion
main :: IO ()
main = do
-- ignore the returned Void value
_ <- backgroundTask1
pure ()
Void is similar to Infallible in Rust: it’s an ordinary, uninhabited type from the standard library, not a magic built-in like !. In this case, we simply ignore the Void value. But we can do better. We can explicitly say how absurd it would be for that operation to return a value!
import Data.Void (Void, absurd)
import Control.Concurrent (threadDelay)
backgroundTask1 :: IO Void
backgroundTask1 = do
putStrLn "Performing background task 1."
threadDelay (1 * 1000 * 1000) -- given in microseconds
backgroundTask1 -- no explicit loops in Haskell, we use recursion
main :: IO ()
main = do
v <- backgroundTask1
absurd v
The type signature and implementation of absurd are, in my opinion, fascinating:
absurd :: Void -> a
absurd x = case x of {}
A case expression in Haskell is equivalent to a match in Rust, so we can see that the implementation is using the same "uninhabited type pattern matching" approach we've used before.
But look at the first line (the type signature). The meaning for those unfamiliar with Haskell is: if you give me a Void value, I'll give you back anything you want. That's the meaning of the a type variable: it's fully unconstrained, so the caller of this function determines what type it should be via type inference and unification.
And, it turns out, we didn't need the Void type or the absurd function here at all. In Haskell, we have a much simpler solution:
import Control.Concurrent (threadDelay)
backgroundTask1 :: IO a
backgroundTask1 = do
putStrLn "Performing background task 1."
threadDelay (1 * 1000 * 1000) -- given in microseconds
backgroundTask1 -- no explicit loops in Haskell, we use recursion
main :: IO ()
main = backgroundTask1
We can skip Void and absurd entirely by giving backgroundTask1 the more general type IO a. Since the function never returns, this is safe: a can be whatever the caller needs, and the code still type-checks.
This is beautiful. And Rust will hopefully have this soon too with the fully loaded never type.
One final point: you may be wondering, "Why bother having a Void type at all?" In many cases, it isn't necessary. However, it does have uses where we need to explicitly say "no, there really isn't anything coming out here, not even if the call site wants to say otherwise." One example is the connect function in my conduit package for streaming data. Using Void is a way to say in the type: this component of a data pipeline never produces any values.
In type-theory terms, absurd has the type forall a. Void -> a; we say it is universally quantified over a — it works for any result type a the caller chooses. Which begs the question: can we use the same trick in Rust?
Returning T in Rust
Let's go back to the drawing board. Let's replace the signature of background_task1 with:
fn background_task1<T>() -> T {
Since we've been making a lot of edits, here's the full code I'm referencing. And this just works! Hurrah! We don't even need that stupid never type that the Rust people are working on. We'll just use unbounded type parameters.
Let's go ahead and update our background_task2 for this:
fn background_task2<T>() -> Result<T, String> {
Unfortunately that didn't work out as expected:
error[E0282]: type annotations needed
--> src/main.rs:26:15
|
26 | set.spawn(background_task2);
| ^^^^^^^^^^^^^^^^ cannot infer type of the type parameter `T` declared on the function `background_task2`
The problem is that we need to help Rust out with type inference. We can see this even without modifying background_task2:
fn main() {
background_task1();
}
This results in:
23 | background_task1();
| ^^^^^^^^^^^^^^^^ cannot infer type of the type parameter `T` declared on the function `background_task1`
|
help: consider specifying the generic argument
|
23 | background_task1::<T>();
| +++++
As the error message recommends, we could address this by constraining T using the turbofish approach. However, I'd argue that this is both non-idiomatic Rust code, and non-ergonomic. As tempting as it is to copy Haskell's approach, this is where the two languages diverge in type-level features.
I'm going to consider the unbounded type variable a non-solution. It works theoretically, but it fights Rust’s ergonomics and inference system, and it forces noise onto every call site. Turns out that "stupid" never type wasn't so stupid after all.
The absurd macro in Rust
So: Haskell has absurd :: Void -> a, and Rust’s never type wants to give us the same expressive power, but isn’t quite there yet.
What can we do today?
Fortunately, we can implement our own version of absurd in Rust with exactly the same semantics as the Haskell version. In fact, we’ve already seen the building blocks:
- an uninhabited type (
Infallibleor a customenum Never {}), and - an exhaustive
matchover that type.
We'll implement this using a macro:
macro_rules! absurd {
($x:expr) => {
match $x {}
};
}
This matches the meaning from Haskell: given a value of an uninhabited type, we can produce any type we want—because the match has zero arms, so the code is never executed.
And then, when spawning our task, we apply the macro to generalize the return type and allow type inference to cause it to match the other spawned job:
set.spawn(|| absurd!(background_task1()));
set.spawn(background_task2);
This still isn’t as magical as the full ! coercion story that Rust is working toward—after all, we are the ones invoking the absurd! macro—but it gives us a nice, explicit, principled way to unify incompatible return types.
It also has two nice benefits:
-
The intent is obvious. We are marking the unreachable code path with something explicitly tied to “this cannot happen.”
-
The compiler keeps us honest. If you ever accidentally make an uninhabited type inhabited, the compiler will force you to update every absurd usage. (This is exactly the kind of type-level honesty we want.)
An AbsurdFuture
And finally, we can get into the real pull request that kicked all this off. The real pull request is part of Kolme. The situation we were trying to resolve was that:
- We have many background jobs that need to be spawned in a tokio
JoinSet - Some of the jobs return
Results (because they can error out), others simply return unit (later modified toInfallible) - We had some really ugly code to enable this, e.g.:
set.spawn(async {
processor.run().await;
#[allow(unreachable_code)]
Err(anyhow::anyhow!("Unexpected exit from processor"))
});
This code is really frustrating. Both we (the coders) and the compiler know that we'll never return from processor.run().await. However, if we leave the code like that, the compiler will complain that the types don't match up. And if we add the explicit (and completely spurious) Err line on its own, we'll get a warning about unreachable code. So not only do we need to add an unnecessary Err, but also an unnecessary allow pragma. Boo!
Initially, I tried implementing an absurd_future macro. That certainly worked, but Emanuel Borsboom discovered an even better one: a combo AbsurdFuture helper type with an absurd_future function.
The idea of AbsurdFuture is to turn a never-returning future into a future yielding any desired type. You can see how it cleaned up our Kolme code. Let's implement the same technique in our running background tasks example, now updated to use tokio, to see how that plays out in practice:
use std::{
convert::Infallible,
marker::PhantomData,
pin::Pin,
task::{Context, Poll},
};
use tokio::task::JoinSet;
/// Turn a never-returning future into a future yielding any desired type.
///
/// Useful for async tasks that logically don't complete but need to satisfy an
/// interface expecting a concrete output type.
#[must_use = "futures do nothing unless polled"]
pub struct AbsurdFuture<F, T> {
inner: Pin<Box<F>>,
_marker: PhantomData<fn() -> T>,
}
impl<F, T> AbsurdFuture<F, T> {
pub fn new(inner: F) -> Self {
Self {
inner: Box::pin(inner),
_marker: PhantomData,
}
}
}
impl<F, T> Future for AbsurdFuture<F, T>
where
F: Future<Output = Infallible>,
{
type Output = T;
fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
let inner = self.get_mut().inner.as_mut();
match Future::poll(inner, cx) {
Poll::Pending => Poll::Pending,
Poll::Ready(never) => match never {},
}
}
}
pub fn absurd_future<F, T>(future: F) -> AbsurdFuture<F, T>
where
F: Future<Output = Infallible>,
{
AbsurdFuture::new(future)
}
async fn background_task1() -> Infallible {
loop {
println!("Performing background task 1.");
tokio::time::sleep(std::time::Duration::from_secs(60)).await;
}
}
async fn background_task2() -> Result<Infallible, String> {
let mut counter = 0;
loop {
counter += 1;
println!("Performing background task 2.");
tokio::time::sleep(std::time::Duration::from_secs(2)).await;
if counter >= 3 {
return Err("Background task 2 errored".to_owned());
}
}
}
#[tokio::main]
async fn main() -> Result<(), String> {
let mut set: JoinSet<Result<Infallible, String>> = JoinSet::new();
set.spawn(absurd_future(background_task1()));
set.spawn(background_task2());
let res = set
.join_next()
.await
.expect("Impossible: received a None from join_next, but there are threads");
match res {
Err(e) => Err(format!("Background task panicked: {e:?}")),
Ok(Err(e)) => Err(format!("Background task errored out: {e}")),
}
}
I really like this: expressive code, minimal ergonomic impact, and no dangling assertions like an unreachable!() macro waiting to be disproven later.
Conclusion
I didn't expect I'd be going down a type theory and Rust unstable features rabbit hole at the beginning of this week. But I'm really glad I did! While the current situation with never isn't great, there's a promising future I can't wait to explore.
What do you think? Is there enough value in the type-level proof of non-termination? Would you take a simpler approach to type unification instead? Let me know!
- Discuss on FP Block's subreddit
- Ping FP Block's or Michael's X account
Subscribe to our blog via email
Email subscriptions come from our Atom feed and are handled by Blogtrottr. You will only receive notifications of blog posts,
and can unsubscribe any time.
Do you like this blog post and need help with Next Generation Software Engineering, Platform Engineering or Blockchain & Smart Contracts? Contact us.