[Scala IO Paris 2024] Calculating Is Funnier Than Guessing
In the ScalaIO Paris 2024 session “Calculating is funnier than guessing”, Regis Kuckaertz, a French developer living in an English-speaking country captivated the audience with a methodical approach to writing compilers for domain-specific languages (DSLs) in Scala. The talk debunked the mystique of compiler construction, emphasizing a principled, calculation-based process over ad-hoc guesswork. Using equational reasoning and structural induction, the speaker derived a compiler and stack machine for a simple boolean expression language, Expr, and extended the approach to the more complex ZPure
datatype from the ZIO Prelude library. The result was a correct-by-construction compiler, offering performance gains over interpreters while remaining accessible to functional programmers.
Laying the Foundation with Equational Reasoning
The talk began by highlighting the limitations of interpreters for DSLs, which, while easy to write via structural induction, incur runtime overhead. The speaker argued that functional programming’s strength lies in embedding DSLs, citing examples like Cats Effect, ZIO, and Kulo for metrics. To achieve “abstraction without remorse,” DSLs must be compiled into efficient machine code. The proposed method, inspired by historical work on calculating compilers, avoids pre-made recipes, instead using a single-step derivation process combining evaluation, continuation-passing style (CPS), and defunctionalization.
For the Expr language, comprising boolean constants, negation, and conjunction, the speaker defined a denotational semantics with an evaluator function. This function maps expressions to boolean values, e.g., evaluating And(Not(B(true)), B(false))
to a boolean result. The evaluator was refined to make implicit behaviors explicit, such as Scala’s left-to-right evaluation of &&
, ensuring the specification aligns with developer expectations. This step underscored the importance of intimate familiarity with execution details, uncovered through the derivation process.
Deriving a Compiler for Expr
The core of the talk was deriving a compiler and stack machine for Expr using equational reasoning. The correctness specification required that compiling an expression and executing it on a stack yields the same result as evaluating the expression and pushing it onto the stack. The compiler was defined with a helper function using symbolic CPS, taking a continuation to guide code generation. For each constructor—B
(boolean), Not
, and And
—the speaker applied the specification, reducing expressions step-by-step.
For B
, a Push
instruction was introduced to place a boolean on the stack. For Not
, a Neg
instruction negated the top stack value, with the subexpression compiled inductively. For And
, the derivation distributed stack operations over conditional branches, introducing an If
instruction to select continuations based on a boolean. The final Compile
function used a Halt
continuation to stop execution. The resulting machine language and stack machine, implemented as an imperative tail-recursive loop, fit on a single slide, achieving orders-of-magnitude performance improvements over the interpreter.
Tackling Complexity with ZPure
To demonstrate real-world applicability, the speaker applied the technique to ZPure
, a datatype from ZIO Prelude for pure computations with state, logging, and error handling. The language includes constructors for pure values, failures, error handling, state management, logging, and flat mapping. The evaluator threads state and logs, succeeding or failing based on the computation. The compiler derivation followed the same process, introducing instructions like Push
, Throw
, Load
, Store
, Log
, Mark
, Unmark
, and Call
to handle values, errors, state, and continuations.
The derivation for ZPure
required careful handling of failures, where a Throw
instruction invokes a failure routine that unwinds the stack until it finds a handler or crashes. For Catch
and FlatMap
, the speaker applied the induction hypothesis, introducing stack markers to manage handlers and continuations. Despite Scala functions in ZPure
requiring runtime compilation, the speaker proposed defunctionalization—using data types like Flow
or lambda calculus encodings—to eliminate this, though this was left as future work. The resulting compiler and machine, again fitting on a slide, were correct by construction, with unreachable cases confidently excluded.
Reflections and Future Directions
The talk emphasized that calculating compilers is a mechanical, repeatable process, not a mysterious art. By deriving machine instructions through equational reasoning, developers ensure correctness without extensive unit testing. The speaker noted a limitation in ZPure
: its evaluator and compiler allow non-terminating expressions, which a partial monad could address. Future work includes defunctionalizing ZPure
to avoid runtime compilation and optimizing machine code into directed acyclic graphs to reduce duplication.
The speaker recommended resources like Philip Wadler’s papers on calculating compilers, encouraging functional programmers to explore this approachable technique. The talk, blending humor with rigor, demonstrated that compiling DSLs is not only feasible but also “funnier” than guessing, offering a path to efficient, correct code.
Links:
- Regis Kuckaertz on LinkedIn
- ScalaIO Paris 2024 Session Page
- ZIO Prelude Documentation (For context on ZPure)
- Philip Wadler’s Papers (For calculating compilers)
Hashtags: #Scala #CompilerDesign #EquationalReasoning #ZPure #ScalaIOParis2024 #FunctionalProgramming