As part of my work on Austral, I did a lot of thinking on how linear type systems interact with error handling strategies. This article serializes those thoughts.
A linear type system guarantees all resources allocated by a terminating program will be freed, and none will be used after being freed. This guarantee is lost with the introduction of exceptions: we can throw an exception before the consumer of a linear resource is called, leaking the resource. This article goes through different strategies for reconciling linear types and exceptions.
A Brief Introduction to Linear Types
In a normal type system, values can be copied arbitrarily. When you do function
(x) { return (x, x); }
, you’re copying a value, even if it’s a reference. This
leads to problems: when you free a chunk of memory, there is no guarantee that
there isn’t another pointer to that same chunk. Reading from or writing to that
chunk leads to a segmentation fault at best, a security vulnerability at
worst.
This is still a problem in langage with garbage collection: file handles and sockets are analogous to memory, in that they have to be explicitly opened and closed. Trying to write to a socket after it’s been closed is an error. Trying to write to a file from multiple threads at once can lead to subtle, impossible to reproduce errors.
In a linear type system, types are divided into two universes: unrestricted types are things like integers and booleans, which fit in a register and can be copied without that leading to security vulnerabilities. Resources like memory, file handles, etc. belong in the linear universe. The compiler checks statically that values whose type belongs to the linear universe can only be used once.
This is simpler than it sounds, the rules basically are:
-
If you have a variable
x
of a linear type, thenx
must appear once and exactly once in the scope in which it is defined (excluding the statement that defines said variable). -
If it appears in a branch of an
if
statement, it must appear in all other branches. -
And, finally, it can’t appear in the body of a loop (if it was defined outside that loop), because then it’d be used once every iteration. When a linear value is used, it is said to be consumed.
So, for example, the API to open, write data to, and close a file handle might be:
open(path: string): file!
write(file: file!, data: string): file!
close(file: file!): unit
The exclamation mark indicates that the type is linear. Everything here is
fairly standard, except that the write
function takes a file object and
returns a file object. This is because files are linear. We can’t write:
let f = open("data.txt")
write(f, "foo")
write(f, "bar")
because f
is used twice (note that let f
doesn’t count as a
usage). Similarly, we can’t write:
let f = open("data.txt")
And forget about f
, the compiler will complain that f
is never used.
But we can write:
let f = open("data.txt")
let f' = write(f, "foo")
close(f')
Because each one of f
and f'
is used once and only once. And this is the
value of linear types: they constrain how we can use resources and ensure we
handle their lifecycle correctly.
So the basic rules are simple. Then on top of the simple rules you dump a truckload of convenience features like borrowing to make it less onerous for programmers to write linear code.
Example
If you’re convinced that linear types and exceptions don’t work together, skip this section. Otherwise, consider:
try {
let f = open("/etc/config");
// `write` consumes `f`, and returns a new linear file object
let f' = write(f, "Hello, world!");
throw Exception("Nope");
close(f');
} catch Exception {
puts("Leak!");
}
(Note: f'
contains the same underlying file handle as f
. It is only a new
linear object from the perspective of the type system. The way you’d implement
this is the file access library would represent file handles as opaque linear
types containing a non-linear file pointer that clients can’t access directly.)
A linear type system will accept this program: f
and f'
are both used
once. But this program has a resource leak: an exception is thrown before f'
is consumed.
If variables defined in a try
block can be used in the scope of the associated
catch
block, we could attempt a fix:
try {
let f = open("/etc/config");
let f' = write(f, "Hello, world!");
throw Exception("Nope");
close(f');
} catch Exception {
close(f');
}
Even if the type system accepts this (dubious, because f'
is potentially used
twice, as far as the type system knows), we can construct a much worse case in
which we break linearity by closing the file handle twice:
try {
let f = open("/etc/config");
let f' = write(f, "Hello, world!");
if (randBool()) {
puts("True! Closing the file...");
close(f');
} else {
puts("False! Closing then throwing");
close(f');
throw Exception("Nope");
}
} catch Exception {
close(f');
}
Assume for the sake of argument that the type system accepts this (the try
block is well-typed). If randBool()
evaluates to true
, the file is closed
correctly. If it evaluates to false
, we close the file twice.
Why Exceptions?
Exceptions involve implicit control flow, and require sophisticated runtime support, so we could decide to build a language without any exception handling mechanisms, and require the programmer to represent errors as values throughout. The problem with this solution is many ordinary functions are partial: integer division when the denominator is zero, or any arithmetic operation when it results in overflow, or out-of-bounds array access. Handling these cases using option types is unnecessarily complicated: simple arithmetic expressions would turn into a tree of nested pattern matching expressions to catch intermediate overflow errors, for example.
Unsound Approaches
These are solutions to the problem of exceptions in a linear type system that break the guarantees linear types provide.
Terminate the Process
Instead of handling the exception, terminate the process.
This is the simplest and coarsest approach. It treats the operating system as a garbage collector: memory not freed, file handles not closed, etc. will be dealt with by the operating system. The obvious drawback is calling a potentially-throwing function without risking termination (e.g., in a unit test framework) requires spawning a separate process, and communicating through RPC.
Terminate the Task
Instead of handling the exception, terminate the task, but not the whole process.
This reduces the cost of safely calling potentially-throwing code: spawning a thread is cheaper than forking a process, and since caller and callee share an address space, inter-thread communication can be fast. Resources allocated by the task, however, will not be freed when the task is terminated. In long-running servers where individual tasks are terminated occasionally, this can lead to resource – memory, file descriptor, socket – exhaustion.
At the same time, “terminate the process” and “terminate the task” are perfectly valid solutions for error cases that are truly impossible, but where the language is not expressive enough to express their impossibility.
Consider a type to represent positive integers:
signature POSITIVE_INT = sig
type pos
val fromInt : int -> pos option
val toInt : pos -> int
end
structure PositiveInt : POSITIVE_INT = struct
type pos = int
fun fromInt i =
if i > 0 then
SOME i
else
NONE
fun toInt p = p
end
Here, the type pos
is opaque, and the function fromInt
takes an integer and
returns an instance of pos
if the integer is positive, and otherwise returns
NONE
. This is fine, but if we want to use it with integer constants:
let p: pos = PositiveInt.fromInt 12
This won’t type check, because the type of p
is not pos
but pos
option
. But we know that 12 is a positive number. A construct to terminte the
task is useful here, so we could write:
let p: pos = case (PositiveInt.fromInt 12) of
SOME p => p
| NONE => abort "Impossible"
The problem with allowing terminate-like constructs is not technical, but social: with exceptions, programmers write empty catch blocks. With termination semantics, programmers will be tempted to continually expand the set of circumstances that are considered “impossible” or “too unlikely” to deal with.
Sound Approaches
These are approaches that preserve the linearity guarantees.
Values, not Exceptions
No exception handling. Instead, all potentially-throwing operations return union types. This can be made less onerous through syntactic sugar. The function (in a vaguely Rust-ish syntax):
T nth(array<T> arr, size_t index) throws OutOfBounds {
return arr[index];
}
Can be desugared to:
Result<T, OutOfBounds> nth(array<T> arr, size_t index) {
case arr[index] {
Some(elem: T) => {
return Result::ok(elem);
}
None => {
return Result::error(OutOfBounds());
}
}
}
This is appealing because much of the hassle of pattern matching Result
types
can be simplified by the compiler. But this approach is immensely limiting,
because as stated above, many fundamental operations have failure modes that
have to be handled explicitly:
add : (Int, Int) -> Result<Int, Overflow>
sub : (Int, Int) -> Result<Int, Overflow>
mul : (Int, Int) -> Result<Int, Overflow>
div : (Int, Int) -> Result<Int, Overflow | DivisionByZero>
nth : (Array<T>, Nat) -> Result<T, OutOfBounds>
As an example, consider a data structure implementation that uses arrays under the hood. The implementation has been thoroughly tested and you can easily convince yourself that it never accesses an array with an invalid index. But if the array indexing primitive returns an option type to indicate out-of-bounds access, the implementation has to handle this explicitly, and the option type will “leak” into client code, up an arbitrarily deep call stack.
The problem is that an ML-style type system considers all cases in a union type to be equiprobable, the normal path and the abnormal path have to be given equal consideration in the code. Exception handling systems let us conveniently differentiate between normal and abnormal cases.
Use Integrated Theorem Provers
No exceptions. Division by zero, overflow, and array index out of bounds errors are prevented at compile time by proving they don’t happen.
A full treatment of abstract interpretation is beyond the scope of this article. The usual tradeoff applies: the tractable static analysis methods prohibit many ordinary constructions, while the methods sophisticated enough to prove most code correct are extremely difficult to implement completely and correctly.
Capture the Linear Environment in the Exception
I call this the “PacLang++ solution”.
PacLang is an interesting language: it’s an imperative linearly-typed programming language specifically designed to write packet-processing algorithms for network procesors. The paper is worth reading.
Its authors describe the language as:
a simple, first order, call by value language, intended for constructing network packet processing programs. It resembles and behaves like C in most respects. The distinctive feature of PacLang is its type system, treating the datatypes that correspond to network packets within the program as linear types. The target platforms are application-specific network processor (NP) architectures such as the Intel IXP range and the IBM PowerNP.
The type system is straightforward: bool
, int
, and a linear packet
type. A
limited form of borrowing is supported, with the usual semantics:
In PacLang, the only linear reference is a
packet
. An alias to a reference of this type, a!packet
, can be created in a limited scope, by casting apacket
into a!packet
if used as a function argument whose signature requires a!packet
. An alias may never exist without an owning reference, and cannot be created from scratch. In the scope of that function, and other functions applied to the same!packet
, the alias can behave as a normal non-linear value, but is not allowed to co-exist in the same scope as the owning referencepacket
. This is enforced with constraints in the type system:
A
!packet
may not be returned from a function, as otherwise it would be possible for it to co-exist inscope with the owningpacket
A
!packet
may not be passed into a function as an argument where the owningpacket
is also being used as an argument, for the same reasonAny function taking a
!packet
cannot presume to “own” the value it aliases, so is not permitted to deallocate it or pass it to another a thread; this is enforced by the signatures of the relevant primitive functions. The constraints on thepacket
and!packet
reference types combined with the primitives for inter-thread communication give a uniqueness guarantee that only one thread will ever have reference to a packet.
An interesting restriction is that much of the language has to be written in A-normal form to simplify type checking. This is sound: extending a linear type system to implement convenience features like borrowing is made simpler by working with variables rather than arbitrary expressions.
The original language has no exception handling system. PacLang++, a successor with exception handling support, is introduced in the paper Memory safety with exceptions and linear types. The paper is difficult to find, so I will quote from it often. The authors first describe their motivation in adding exception handling:
In our experience of practical PacLang programming, an issue commonly arising is that of functions returning error values. The usual solution has been to return an unused integer value (C libraries commonly use -1 for this practice) where the function returns an integer, or to add a boolean to the return tuple signalling the presence of an error or other unusual situation. This quickly becomes awkward and ugly, especially when the error condition needs to be passed up several levels in the call graph. Additionally, it is far easier for a programmer to unintentionally ignore errors using this method, resulting in less obvious errors later in the program, for example a programmer takes the return value as valid data, complacently ignoring the possibility of an error, and using that error value where valid data is expected later in the program.
The linear type system of PacLang:
A linearly typed reference (in PacLang this is known as the owning reference, though other kinds of non-linear packet references will be covered later) is one that can be used only once along each execution path, making subsequent uses a type error; the type system supports this by removing a reference (consuming it) from the environment after use. As copying a linear reference consumes it, only one reference (the owning reference) to the packet may exist at any point in the program’s runtime. Furthermore, a linear reference must be used once and only once, guaranteeing that any linearly referenced value in a type safe program that halts will be consumed eventually.
The authors first discuss the exceptions as values approach, discarding it because it doesn’t support intra-function exception handling, and requires all functions to deallocate live linear values before throwing. The second attempt is described by the authors:
At the time an exception is raised, any packets in scope must be consumed through being used as an argument to the exception constructor, being “carried” by the exception and coming into the scope of the block that catches the exception.
This is also rejected, because:
this method does not account for live packets that are not in scope at the time an exception is raised. An exception can pass arbitrarily far up the call graph through multiple scopes that may contain live packets until it is caught.
The third and final approach:
We create an enhanced version of the original PacLang type system, which brings linear references into the environment implicitly wherever an exception is caught. Our type system ensures that the environment starting a catch block will contain a set of references (that are in the scope of the exception handling block) to the exact same linear references that were live at the instant the exception was thrown.
To illustrate I adapted the following example from the paper, adding a few more comments:
packet x = recv();
packet y = recv();
// At this point, the environment is {x, y}
try {
consume(x);
// At this point, the environment is just {y}
if (check(!y)) {
consume(y);
// Here, the environment is {}
} else {
throw Error; // Consumes y implicitly
// Here, the environment is {}
}
// Both branches end with the same environment
} catch Error(packet y) {
log_error();
consume(y);
}
The authors go on to explain a limitation of this scheme: if two different
throw
sites have a different environment, the program won’t type check. For
example:
packet x = recv();
// Environment is {x}
if (check(!x)) {
packet y = recv();
// Environment is {x, y}
if (checkBoth(!x, !y)) {
consume(x);
consume(y);
// Environment is {}
} else {
throw Error; // Consumes x and y
// Environment is {}
}
} else {
throw Error; // Consumes x
// Enviroment is {}
}
While the code is locally sound, one throw
site captures x
alone while one
captures x
and y
.
Suppose the language we’re working with requires functions to be annotated with
an exception signature, along the lines of checked exceptions. Then,
if all throw sites in a function f
implicitly capture a single linear packet
variable, we can annotate the function this way:
void f() throws Error(packet x)
But in the above code example, the exception annotation is ambiguous, because different throw sites capture different environments:
void f() throws Error(packet x)
// Or
void f() throws Error(packet x, packet y)
Choosing the former leaks y
, and choosing the latter means the value of y
will be undefined in some cases.
The PacLang++ Solution, Extended
The PacLang++ solution can be made more sensible with the use of option types:
because environments form a partially ordered set, we can use option types to
represent bindings which are not available at every throw
site. In the code
example above, we have:
{} < {x} < {x, y}
So the signature for this function is simply:
void f() throws Error(packet x, option<packet> y)
Use Affine Types
Affine types are a weakening of linear types, essentially linear types with implicit destructors. In a linear type system, values of a linear type must be used exactly once. In an affine type system, values of an affine type can be used at most once. If they are unused, the compiler automatically inserts a destructor call.
Rust does this, and there are good reasons to prefer affine types over linear types:
-
Less typing, since there is no need to explicitly call destructors.
-
Often, compilation fails because programmers make trivial mistakes, such as forgetting a semicolon. A similar mistake is forgetting to insert destructor calls. This isn’t possible with affine types, where the compiler handles object destruction for the programmer.
-
Destruction order is consistent and well-defined.
-
Most linear types have a single natural destructor function: pointers are deallocated, file handles are closed, database connections are closed, etc. Affine types formalize this practice: instead of having a constellation of ad-hoc destructor functions (
deallocate
,closeFile
,closeDatabase
,hangUp
), all destructors are presented behind a uniform interface: a generic function of typeT! -> Unit
.
The drawbacks of affine types are the same as those of destructors in languages like C++ and Ada, that is:
-
The use of destructors requires compiler insertion of implicit function calls, which have an invisible cost in runtime and code size, whereas linear types make this cost visible.
-
Destruction order has to be well-specified. For stack-allocated variables, this is straghtforward: destructors are called in inverse declaration order. For temporaries, this is complicated.
Additionally, destroying values we don’t do anything with could lead to bugs if the programmer simply forgets about a value they were supposed to use, and instead of warning them, the compiler cleans it up.
But there is a benefit to using affine types with destructors: exception
handling integrates perfectly well. Again, Rust does this: panic
and catch_unwind
are similar to try
and catch
, and
destructors are called by unwinding the stack and calling drop
on every affine
object. The result is that exceptions are safe: in the happy path, destructors
are called by the compiler. In the throwing path, the compiler ensures the
destructors are called anyways.
The implementation strategy is simple:
-
When the compiler sees a
throw
expression, it emits calls to the destructor of every (live) affine variable in the environment before emitting the unwinding code.That is, given an expression
throw(...)
, where the affine environment up to that expression is{x, y, z}
, the expression is transformed into:free(x); free(y); free(z); throw(...);
-
When the compiler sees a call to a potentially-throwing function (as determined by an effect system), it emits a
try
/catch
block: normal excecution proceeds normally, but if an exception is caught, the the destructors of all live affine variables are called, and the exception is re-thrown.Suppose we have a function call
f()
, where the affine environment up to the point of that call is{x, y}
. If the function potentially throws an exception, the function call gets rewritten to something like this:let result = try { f(); } catch Exception as e { free(x); free(y); throw(e); }
For a more complete example, a program like this:
void f() throws {
let x: string* = allocate("A fresh affine variable");
// Environment is {x}
g(x); // Environment is {}
}
void g(string* ptr) throws {
let y: string* = allocate("Another heap-allocated string");
// Environment is {ptr, y}
h(ptr); // Environment is {y}
}
void h(string* ptr) throws {
let z = allocate(1234);
if (randBool()) {
throw "Halt";
}
}
Would transform to something like this:
void f() {
let x: string* = _allocate_str("A fresh affine variable");
try {
g(x);
} catch {
rethrow;
}
}
void g(string* ptr) {
let y: string* = _allocate_str("Another heap-allocated string");
try {
h(ptr);
} catch {
_free_str(y);
rethrow;
}
_free_str(y);
}
void h(string* ptr) {
let z = allocate(1234);
if (randBool()) {
_free_intptr(z);
_free_str(ptr);
throw "Halt";
}
_free_intptr(z);
_free_str(ptr);
}
Two Disjoint Approaches
We don’t have to use values to represent any kind of error. Pervasive errors,
such as integer overflow, integer division by zero, and array index out of
bounds errors can simply terminate the program, allowing the operating system to
recover. Higher-level errors can be represented as values using a Result
type.
This is essentially the Herbceptions approach:
An alternate result is never an “error” (it is success, so report it using return). This includes “partial suc- cess” such as that a buffer was too small for the entire request but was filled to capacity so more can be read on the next call.
[…]
A programming bug or abstract machine corruption is never an “error” (both are not programmatically re- coverable, so report them to a human, by default using fail-fast). Programming bugs (e.g., out-of-bounds ac- cess, null dereference) and abstract machine corruption (e.g., stack overflow) cause a corrupted state that can- not be recovered from programmatically, and so they should never be reported to the calling code as errors that code could somehow handle.
In the same vein, the Midori approach:
A recoverable error is usually the result of programmatic data validation. Some code has examined the state of the world and deemed the situation unacceptable for progress. Maybe it’s some markup text being parsed, user input from a website, or a transient network connection failure. In these cases, programs are expected to recover. The developer who wrote this code must think about what to do in the event of failure because it will happen in well-constructed programs no matter what you do. The response might be to communicate the situation to an end-user, retry, or abandon the operation entirely, however it is a predictable and, frequently, planned situation, despite being called an “error.”
A bug is a kind of error the programmer didn’t expect. Inputs weren’t validated correctly, logic was written wrong, or any host of problems have arisen. Such problems often aren’t even detected promptly; it takes a while until “secondary effects” are observed indirectly, at which point significant damage to the program’s state might have occurred. Because the developer didn’t expect this to happen, all bets are off. All data structures reachable by this code are now suspect. And because these problems aren’t necessarily detected promptly, in fact, a whole lot more is suspect. Depending on the isolation guarantees of your language, perhaps the entire process is tainted.
Conclusion
There are effectively two solutions:
-
If you absolutely need traditional throw-and-catch exception handling, use affine types like Rust.
-
If you want linear types, take the Herb Sutter approach: divide errors into two categories, and use different approaches for both:
-
Integer overflow, array index out of bounds, and division by zero are terminating errors: the correct approach is to stop the program in its tracks and let the operating system recover.
-
“File not found”, “username is too short” are high-level errors that should be represented as values.
-
Bibliography
Not much, surprisingly:
-
Thrippleton, Richard, and Alan Mycroft. “Memory safety with exceptions and linear types.”
-
Tov, Jesse A., and Riccardo Pucella. “A theory of substructural types and control.” ACM SIGPLAN Notices. Vol. 46. No. 10. ACM, 2011.