By Robert Smith
Coalton is a statically typed functional programming language that lives inside Common Lisp. It has had an exciting few years. It is being used for industrial purposes, being put to its limits as a production language to build good, reliable, efficient, and robust products. Happily, with Coalton, many products shipped with tremendous success. But as we built these products, we noticed gaps in the language. As such, we’re setting the stage for the next tranche of Coalton work, and we’re going to preview some of these improvements here, including how Coalton can prove $\sqrt{2+\sqrt{3}} = \sqrt{2}(\sqrt{3}+1)/2$ exactly.
Table Of Contents
Fixed arity functions
One of the biggest changes coming to the next release of Coalton is fixed arity functions. This unlocks keyword arguments, better type error messages, more natural API design, and more efficient compilation. It replaces what was once almost a part of the language’s identity: Haskell-style curried functions. In Coalton (and Haskell), functions are typed as chains of arrows. This function
(define (f x y z)
(+ x (* y z)))
has the type Num :a => :a -> :a -> :a -> :a. This means we could write (f 1) or (f 1 2) or (f 1 2 3) and all would be completely valid expressions. As Coalton programmers, we advertised this everywhere, from the language’s README to conference presentations. The biggest selling points of this feature were
- Simplicity: Every function takes exactly one input and produces exactly one output. No exceptions. If you didn’t care about the input or output, you used
Unit, and we made special syntax for that. - Cool tricks: You could write very short, elegant code. Want to double all elements of a list?
(map (* 2) list). Want a list-doubler function?(map (* 2)). This was especially helpful for very abstract, higher-order code, where only the last couple arguments of a function were unimportant ports for plumbing large abstractions. (Think parser combinators where the last argument is the “thing to be parsed” input.) It also worked very well with Coalton’s nativepipeandnestmacros.
Fixed arity resolves several long-standing friction points with currying:
- More natural APIs. With currying, APIs were designed around what was most convenient to partially apply, even if it bucked larger programming trends. For example, to get the nth element of a list
l, we’d write(index n l)because currying the index is more useful than currying the list. But every other language writes something likel[n],l.get(n),(elt l n), etc. Fixed arity frees API design from these constraints. - Less implicit allocation. With currying, closure allocation was idiomatic. While Coalton has a lot of machinery to heuristically inline, making heap allocation a built-in feature of every function definition was pushing the language in a direction we didn’t want. Fixed arity doesn’t stop one from allocating closures with
fn, but it becomes explicit. - Better type errors. With currying, writing
(f 1 2)instead of(f 1 2 3)silently produces a partial application. The compiler happily infers a function type like:s -> :tand moves on. The real error only surfaces later, when that unexpected function value finally clashes with an incompatible type, often far from the actual mistake. With fixed arity, a missing argument is caught right where it happens. - A foundation for richer calling conventions. Optional arguments, keyword arguments, and new calling conventions all become straightforward with fixed arity.
With this change, Coalton functions now have fixed input arity, and providing too many or too few arguments is an immediate type error. The above function would instead have the type
Num :t => :t * :t * :t -> :t
which we like to pronounce “tee by tee by tee to tee.” This looks like tuple syntax from ML-family languages like OCaml or Standard ML, but in Coalton the * is syntactically tied to the ->. It’s not a tuple constructor, but rather syntax of the function type. In fact, the type of the actual Tuple constructor is
:a * :b -> Tuple :a :b
With this, the fixed input arity is strict; all arguments must be supplied. Coalton will no longer “auto-curry”:
COALTON-USER> (coalton-toplevel
(define (f x y z)
(+ x (* y z))))
;; F :: ∀ A. NUM A ⇒ (A * A * A → A)
; No value
COALTON-USER> (coalton (f 1.0 2.0 3.0))
7.0
COALTON-USER> (coalton (f 1.0 2.0))
error: Arity mismatch
|
1 | (COALTON (F 1.0 2.0))
| ^^^^^^^^^^^
Function call has 2 positional
arguments but inferred type
'(:A * :A * :A → :A)' takes 3
What if we want to have curried functions? We can still type and write them, but it will be more verbose. We have to curry it ourselves explicitly with either fn:
COALTON-USER> (coalton-toplevel
(define f-curried (fn (x) (fn (y) (fn (z) (f x y z))))))
;; F-CURRIED :: ∀ A. NUM A ⇒ (A → (A → (A → A)))
; No value
(We will see later how we can write this in a more streamlined way.) Moreover, calling it will require nested applications.
COALTON-USER> (coalton (((f-curried 1.0) 2.0) 3.0))
7.0
This is a major change to the way Coalton is written. We think this is a good change for the following simple reason: when we flipped the switch to enable this new function style, 99% of code compiled just fine, only the type signatures broke. What this means is that curried style wasn’t even being used too much, and where it was, it took only a handful of characters to fix.
Keyword arguments
In real-world Coalton code, we ran into a lot of instances where we wanted functions to have lots of options, especially functions which are often bindings or FFI-derived. To support this, we would make all sorts of ad hoc structs containing function options, and it all felt like disposable code goop.
Since we have fixed arity functions, we can now support keyword arguments. For example, Coalton’s standard library now defines file opening as:
(define (open path &key (direction Input) (if-exists EError))
...)
Here, open has the type
(Into :path Pathname) => :path &key (:direction OpenDirection) (:if-exists IfExists)
-> (Result FileError (FileStream :a))
Each keyword argument has a default value and is optional to supply at call sites. So we can call open in several ways:
(open "data.csv") ; all defaults
(open "data.csv" :direction Output) ; override direction
(open "data.csv" :if-exists Supersede) ; override if-exists
(open "data.csv" :direction Output :if-exists Supersede) ; override both
Keyword arguments work for toplevel functions, methods, and fn.
Multiple value returns
Coalton, somewhat experimentally, introduced “tuple unboxing”: If a function f returned a two-element tuple, then internally, two versions of f would be compiled:
f-unboxed: an internal function that computedfbut didn’t construct the tuple (instead passing return values on the stack), andf, as normal, constructing the tuple.
Then, Coalton would analyze where f was called, and decide if it could instead replace the call with f-unboxed if the result was being immediately destructured.
On one hand, this was a very welcome addition to Coalton. Existing APIs could become much more efficient (e.g., quotRem) without having to change anything. On the other hand, it created tremendous bloat in the compiler implementation (all this bookkeeping on tuple-returning functions) and program runtime (now tuple functions had two compiled entry points). Furthermore, this only worked for tuples with two components. Didn’t work for Tuple3, didn’t work with your own custom structs.
As such, this is one of the features we took a step back on in light of the other changes we made. With the change in how we handle input arity, we also now allow functions to natively return multiple values. For example, consider the Coalton function to split a string into two pieces. This is defined in the standard library as:
(declare split-at (UFix -> String -> (Tuple String String)))
(define (split-at n str)
(Tuple (substring str 0 n)
(substring str n (length str))))
In the next release of Coalton, it’s now written as
(declare split-at (UFix * String -> String * String))
(define (split-at n str)
(values (substring str 0 n)
(substring str n (length str))))
If we want to use it, we need to “receive” both values. We can do this with two new let-syntax features:
;; short let
(define ...
(let (values pre post) = (split-at 5 "hello world")
...)
;; regular let
(define ...
(let (((values pre post) (split-at 5 "hello world")))
...))
Unlike Common Lisp, multiple values always need to be handled, even if ignored.
When calling from Lisp, we can receive multiple values, using updated lisp syntax:
(lisp (-> Integer * Integer) (num den)
(cl:floor num den))
This tells Coalton that two Integer values will be returned. The lisp operator now always needs the ->, even if just a single value is returned.
The role of Unit and Void
In Coalton, Unit was the value used to idiomatically indicate a function has no useful input or output. Since the next release of Coalton will have functions with true zero-input or zero-output behavior, this idiom will no longer be used. A function defined as
(define (f)
...)
will have type Void -> ..., and functions that previously returned Unit as an empty value will instead have type ... -> Void.
Note that Void in function type syntax doesn’t refer to the uninhabited type familiar from Haskell or Rust. Rather, it’s syntactic notation indicating zero inputs or zero outputs. Previously, Void was meant to denote functions that don’t return or don’t terminate, but that was never truly meaningful in practice. Now its meaning is exactly “no inputs” or “no outputs”.
Syntax for collections and associations
One of Coalton’s greatest weaknesses right now is syntax for collections. Even lists require arduously calling a make-list macro. In the next release of Coalton, we are introducing two new syntaxes for collections and associations.
A collection is just a composite of elements. It may be a list, set, vector, or something else. We deliberately don’t define anything more than that. Collections can be written with brackets: [1 2 3] or []. What type should it be? Well, we have the whole power of the type system, so it should be any type that we could construct from a collection of objects! Hence, this is an “overloaded literal” that will work for any collection type implementing a new FromCollection type class.
(define-class (FromCollection :COLL :ELT (:COLL -> :ELT))
;; Create an empty collection
(make-empty-collection (Void -> :COLL))
;; Adjoin an element to a collection
;; May be pure or effectful.
(adjoin-to-collection (:COLL * :ELT -> :COLL))
;; Finalize the empty collection
(finalize-collection (:COLL -> :COLL)))
In the event of ambiguity, it defaults to Seq :t, just like numerical variables default to Integer.
An association is a collection of associated elements. It may be a hash table, dictionary, association list, or something else. Like collections, we deliberately don’t define anything more than that. Associations are written with square brackets as well, but related pairs are joined by =>:
["hello" => 1
"world" => 2
"!" => 3]
An empty association is [=>] to distinguish it from an empty collection [].
Like collections, associations are also overloaded; all you need is an instance of the new FromAssociation class:
(define-class (FromAssociation :ASSOC :KEY :VALUE (:ASSOC -> :KEY :VALUE))
;; Create an empty association
(make-empty-association (Void -> :ASSOC))
;; Adjoin a key-value pair to an association
;; May be pure or effectful.
(adjoin-to-association (:ASSOC * :KEY * :VALUE -> :ASSOC))
;; Finalize the association
(finalize-association (:ASSOC -> :ASSOC)))
In the event of ambiguity, it defaults to Seq (Tuple :k :v), just like numerical variables default to Integer.
Collection and association comprehensions
Collections and associations can also be written as comprehensions. Instead of listing the items explicitly, one can generate them. Here’s an example creating a HashMap of number names to their values:
(declare number-name (Integer -> String))
(define (number-name n)
(lisp (-> String) (n)
(cl:format cl:nil "~R" n)))
(declare dictionary (HashMap String Integer))
(define dictionary
[name => i for i in (up-to 10) with name = (number-name i)])
Now we can look up these values in the dictionary.
COALTON-USER> (coalton (lookup dictionary "seven"))
7
As before, if we declared dictionary to any association type, it would still work.
Comprehensions work for both. The grammar looks something like this:
[<elt-expr> <clause>+] ; collection
[<key-expr> => <val-expr> <clause>+] ; association
In both cases, the clauses can be one of three things:
with <var> = <val>: Assign a variable to a value.for <var> in <iter>: Iterate a variable across an iterator.when <expr>: Guard against a Boolean predicate.
Clauses are executed left-to-right like a little nested loop, and the “body” is the expressions for the elements being produced.
Syntax for short lambda functions
We decided early on that Coalton should have easy ways to make new functions. Instead of the traditional lambda, we call our anonymous functions fn, as in (fn (x) (* x x)). But since we got rid of the “currying style” of function definitions, fn has become a lot more prominent, and we think that this is still a little too wordy for very small, short functions. As such, we’re introducing short lambda syntax. What we wrote as
(fn (x) x)
(fn (f x y) (f y x))
(fn () 0)
(fn (_ a _) a)
can now be written as
\x.x
\fxy.(f y x)
\.0
\_a_.a
The “overhead” is just 2 non-shift characters, on American keyboard layouts at least.
This syntax nests just fine too. Our earlier example of the manually curried function:
(define f-curried (fn (x) (fn (y) (fn (z) (f x y z)))))
can be written more succinctly as:
(define f-curried \x.\y.\z.(f x y z))
The purpose of this isn’t to allow code to be golfed, but rather to allow “throwaway” functions to just take less space and ceremony.
Scoped type variables and forall
In previous versions of Coalton, type variables in a declare and the corresponding define were independent. There was no way to refer to a declared type variable inside the function body. This led to awkward workarounds.
Consider a function for numerical casting that checks if the number is in bounds of the destination type. The old source code was:
(inline)
(declare unify (:ty -> :ty -> :ty))
(define (unify _ use)
use)
(declare cast-if-inbounds ((Ord :src) (Bounded :target) =>
:src -> (Result String :target)))
(define (cast-if-inbounds x)
(let max-bound = maxBound)
(let min-bound = minBound)
(let int = (the Integer (unsafe-cast x)))
(if (or (< int (unsafe-cast min-bound))
(> int (unsafe-cast max-bound)))
(Err "value out of range")
(Ok
(unify max-bound (unify min-bound
(unsafe-cast x)))))))
Here, we have a helper function unify whose only purpose is to constrain two type variables to be the same. It’s then used to force (unsafe-cast ...) to have the same type as minBound and maxBound. It’s all very tangled.
In the next release of Coalton, we’re introducing forall, which scopes type variables from a declare over the corresponding define. With it, the same function becomes:
(declare cast-if-inbounds
(forall (:src :target)
((Ord :src) (Bounded :target) => :src -> (Result String :target))))
(define (cast-if-inbounds x)
(let max-bound = (the :target maxBound))
(let min-bound = (the :target minBound))
(let int = (the Integer (unsafe-cast x)))
(if (or (< int (unsafe-cast min-bound))
(> int (unsafe-cast max-bound)))
(Err "value out of range")
(Ok (the :target (unsafe-cast x))))))
Simple and direct: just use :target directly in the function body. No type acrobatics needed.
The forall is required to introduce scope; without it, type variables in declare and define remain independent. Similarly, if method definitions in define-class are typed with forall, then method instances may use those type variables.
Real algebraic numbers and xmath
Coalton has always supported numbers that are not traditionally found in programming language standard libraries, like transfinite and hyperdual numbers. Coalton’s variety of numbers is tremendously important for a wide variety of applications, from CAD to mathematical optimization.
However, even with the numbers we have, we still don’t have the ability to do certain numerical computations exactly. For example, suppose you are making a CAD program, and you want to see if two line segments are equal in length. One line segment, your CAD program calculates
$$a = \sqrt{2+\sqrt{3}}$$
and the other
$$b = \frac{\sqrt{2}}{2}(\sqrt{3}+1).$$
You approximate them both in your CAD engine to check if they might be equal:
COALTON-USER> (coalton-toplevel
(define a (the F32 (sqrt (+ 2 (sqrt 3)))))
(define b (the F32 (* (/ (sqrt 2) 2) (1+ (sqrt 3))))))
;; B :: F32
;; A :: F32
; No value
COALTON-USER> (coalton (make-list a b))
(1.9318516 1.9318516)
So it appears they’re the same. But for CAD, how do we know this isn’t just some catastrophic floating point issue? If we push this design to manufacturing, how do we know these parts will fit together?
If rational numbers can be seen as “exact fractions”, then real algebraic numbers can be seen as “exact roots”. Not just square roots. Any root to any polynomial. Coalton adds support for these. Instead of using F32 to represent our measured lengths, we can use RealAlgebraic:
COALTON-USER> (coalton-toplevel
(define a (the RealAlgebraic (sqrt (+ 2 (sqrt 3)))))
(define b (the RealAlgebraic (* (/ (sqrt 2) 2) (1+ (sqrt 3))))))
;; B :: REALALGEBRAIC
;; A :: REALALGEBRAIC
; No value
Now when we test equality, it is an exact test. No numerical approximation, no epsilons.
COALTON-USER> (coalton (== a b))
COMMON-LISP:T
This tells us that, despite having different forms, $a=b$ exactly.
Like rational numbers are merely real numbers that solve $$ux+v=0$$ for integers $u$ and $v$, real algebraic numbers are merely real numbers that solve
$$u_0 + u_1x + u_2x^2 + \cdots + u_nx^n=0$$
for some nonnegative integer $n$ and integers $u_k$. (A direct consequence of this is that the rational numbers are a subset of the real algebraic numbers.) The square root of 2 is such a number because it’s the positive solution to
$$x^2-2=0.$$
This is a much broader class of numbers than rationals, but it’s not everything. There are many numbers, namely the transcendental numbers like $\pi$, that aren’t represented by these. So they’re not a silver bullet, but substantially broaden the class of things we can do exact arithmetic with.
We realize these number types aren’t useful to everybody. Nor are other advanced math features brought in to Coalton. As such, we are introducing a separate standard library module called coalton/xmath, for “extended math”. It’s built into Coalton, it’s just not loaded as a dependency by default. Here, we’ll find
- Dyadic numbers
- Dual and hyperdual numbers
- Arbitrary precision floating point numbers
- Computable real numbers
- Real algebraic numbers
- Fast Fourier transforms
This will be a place of much extension in the future, since mathematical computation is a core driver for Coalton’s development.
Other improvements
-
Type system soundness. Hindley-Milner is famously unsound in the presence of mutating operators, and Coalton inherited this issue. We’ve now introduced weak type variables that restrict when types are allowed to generalize in circumstances that may involve mutability. Coalton types that use
repr :nativeare automatically marked as mutable and limited accordingly. This change was fully transparent to existing code. -
Standard library renaming. The standard library has been renamed from
coalton-library/footocoalton/foo. Old names will still work for now, but will be removed in the future. -
Versioning. Coalton will begin to be versioned. Versions pre-1.0 will mostly be qualitative indications of major additions, removals, or changes to the language. The next release will be the first official versioned release: Coalton 0.2.
The future of Coalton
We want to have Coalton 0.2 out the door by March 23, 2026. After that, we expect another period of development with this new language shape to see where the rough edges are, and where we need to round things out. We have an inkling of a feeling that linear/affine types, explicit resource management, and quality-of-life improvements will be on the docket for 0.3, but we’ll see.
Thanks to Colin O’Keefe, Jesse Bouwman, Joseph Garvin, and Owen Radcliffe for comments on the draft.