Jonathan Lalou's Blog

Posts Tagged ‘ScalaIO’

[Scala IO Paris 2024] Escaping the False Dichotomy: Sanely Automatic Derivation in Scala

In the ScalaIO Paris 2024 session “Slow Auto, Inconvenient Semi: Escaping False Dichotomy with Sanely Automatic Derivation,” Mateusz Kubuszok delivered a user-focused exploration of typeclass derivation in Scala. With over nine years of Scala experience and as a co-maintainer of the Chimney library, Mateusz examined the trade-offs between automatic and semi-automatic derivation, proposing a “sanely automatic” approach that balances usability, compile-time performance, and runtime efficiency. Using JSON serialization libraries like Circe and Jsoniter-Scala as case studies, the talk highlighted how library design choices impact users and offered a practical solution to common derivation pain points, supported by benchmarks and a public repository.

Understanding Typeclasses and Derivation

Mateusz began by demystifying typeclasses, describing them as parameterized interfaces (e.g., an Encoder[T] for JSON serialization) whose implementations are provided by the compiler via implicits or givens. Typeclass derivation automates creating these implementations for complex types like case classes or sealed traits by combining implementations for their fields or subtypes. For example, encoding a User(name: String, address: Address) requires encoders for String and Address, which the compiler recursively resolves.

Derivation comes in two flavors: automatic and semi-automatic. Automatic derivation, popularized by Circe, uses implicits to generate implementations on-demand, but can lead to slow compilation and runtime performance issues. Semi-automatic derivation requires explicit calls (e.g., deriveEncoder[Address]) to define implicits, ensuring consistency but adding manual overhead. Mateusz emphasized that these choices, made by library authors, significantly affect users through compilation times, error messages, and performance.

The Pain Points of Automatic and Semi-Automatic Derivation

Automatic derivation in Circe recursively generates encoders for nested types, checking for existing implicits before falling back to derivation. This can cause circular dependencies or stack overflows if not managed carefully. Semi-automatic derivation avoids this by always generating new instances, but requires users to define implicits for every intermediate type, increasing code verbosity. Mateusz shared anecdotes of developers banning automatic derivation due to compilation times ballooning to 10 minutes for complex JSON hierarchies.

Benchmarks on a nested case class (five levels deep) revealed stark differences. On Scala 2.13, Circe’s automatic derivation took 14 seconds to compile a single file, versus 12 seconds for semi-automatic. On Scala 3, automatic derivation soared to 46 seconds (cold JVM) or 16 seconds (warm), while semi-automatic took 10 seconds or 1 second, respectively. Runtime performance also suffered: automatic derivation on Scala 3 was up to 10 times slower than semi-automatic, likely due to large inlined methods overwhelming JVM optimization.

Comparing Libraries: Circe vs. Jsoniter-Scala

Mateusz contrasted Circe with Jsoniter-Scala, which prioritizes performance. Jsoniter-Scala uses a single macro to generate recursive codec implementations, avoiding intermediate implicits except for custom overrides. This reduces memory allocation and compilation overhead. Benchmarks showed Jsoniter-Scala compiling faster than Circe (e.g., 2–3 times faster on Scala 3) and running three times faster, even when Circe was given a head start by testing on JSON representations rather than strings.

Jsoniter-Scala’s approach minimizes implicit searches, embedding logic in macros instead of relying on the compiler’s typechecker. For example, encoding a User with an Address field involves one codec handling all nested types, unlike Circe’s recursive implicit resolution. This results in fewer allocations and better error messages, as macros can pinpoint failure causes (e.g., a missing implicit for a specific field).

Sanely Automatic Derivation: A New Approach

Inspired by Jsoniter-Scala, Mateusz proposed “sanely automatic derivation” to combine automatic derivation’s convenience with semi-automatic’s performance. Using his Chimney library as a testbed, he split typeclasses into two traits: one for user-facing APIs and another for automatic derivation, invisible to implicit searches to avoid circular dependencies. This allows recursive derivation with minimal implicits, using macros to handle nested types efficiently.

Mateusz implemented this for a Jsoniter-Scala wrapper on Scala 3, achieving compilation times comparable to Jsoniter-Scala’s single-implicit approach and faster than Circe’s semi-automatic derivation. Runtime performance matched Jsoniter-Scala’s, with negligible overhead. Error messages were improved by logging macro actions (e.g., which field caused a failure) via Scala’s macro settings, viewable in IDEs like VS Code without console output.

A Fair Comparison: Custom Typeclass Benchmark

To ensure fairness, Mateusz created a custom Show typeclass (similar to Circe’s ShowPretty) for pretty-printing case classes using StringBuilder. He implemented it with Shapeless (Scala 2), Mirrors (Scala 3), Magnolia (both), and his sanely automatic approach. Initial results showed his approach outperforming Shapeless and Mirrors but trailing Magnolia’s semi-automatic derivation. By adding caching within macros (reusing derived implementations for repeated types), his approach became the fastest across all platforms, compiling in less time than Shapeless, Mirrors, or Magnolia, with better runtime performance.

This caching, inspired by Jsoniter-Scala, avoids re-deriving the same type multiple times within a macro, reducing method size and enabling JVM optimization. The change required minimal code, demonstrating that library authors can adopt this approach with a single, non-invasive pull request.

Implications for Scala’s Future

Mateusz concluded by addressing Scala’s reputation for slow compilation, citing a Virtus Lab survey where 53% of developers complained about compile times, often tied to typeclass derivation. Libraries like Shapeless and Magnolia prioritize developer convenience over user experience, leading to opaque errors and performance issues. His sanely automatic derivation offers a user-friendly alternative: one import, fast compilation, efficient runtime, and debuggable errors.

By sharing his Chimney Macro Commons library, Mateusz encourages library authors to rethink derivation strategies. While macros require more maintenance than Shapeless or Magnolia, they become viable as libraries scale and prioritize user needs. He urged developers to provide feedback to library maintainers, challenging assumptions that automatic and semi-automatic are the only options, to make Scala more accessible and production-ready.

Links:

Hashtags: #Scala #TypeclassDerivation #ScalaIOParis2024 #Circe #JsoniterScala #Chimney #Performance #Macros #Scala3

Posted in en-US | Tags: Scala, ScalaIO | No Comments »

[Scala IO Paris 2024] Calculating Is Funnier Than Guessing

Author: Jonathan Lalou

In the ScalaIO Paris 2024 session “Calculating is funnier than guessing”, Regis Kuckaertz, a French developer living in an English-speaking country captivated the audience with a methodical approach to writing compilers for domain-specific languages (DSLs) in Scala. The talk debunked the mystique of compiler construction, emphasizing a principled, calculation-based process over ad-hoc guesswork. Using equational reasoning and structural induction, the speaker derived a compiler and stack machine for a simple boolean expression language, Expr, and extended the approach to the more complex ZPure datatype from the ZIO Prelude library. The result was a correct-by-construction compiler, offering performance gains over interpreters while remaining accessible to functional programmers.

Laying the Foundation with Equational Reasoning

The talk began by highlighting the limitations of interpreters for DSLs, which, while easy to write via structural induction, incur runtime overhead. The speaker argued that functional programming’s strength lies in embedding DSLs, citing examples like Cats Effect, ZIO, and Kulo for metrics. To achieve “abstraction without remorse,” DSLs must be compiled into efficient machine code. The proposed method, inspired by historical work on calculating compilers, avoids pre-made recipes, instead using a single-step derivation process combining evaluation, continuation-passing style (CPS), and defunctionalization.

For the Expr language, comprising boolean constants, negation, and conjunction, the speaker defined a denotational semantics with an evaluator function. This function maps expressions to boolean values, e.g., evaluating And(Not(B(true)), B(false)) to a boolean result. The evaluator was refined to make implicit behaviors explicit, such as Scala’s left-to-right evaluation of &&, ensuring the specification aligns with developer expectations. This step underscored the importance of intimate familiarity with execution details, uncovered through the derivation process.

Deriving a Compiler for Expr

The core of the talk was deriving a compiler and stack machine for Expr using equational reasoning. The correctness specification required that compiling an expression and executing it on a stack yields the same result as evaluating the expression and pushing it onto the stack. The compiler was defined with a helper function using symbolic CPS, taking a continuation to guide code generation. For each constructor—B (boolean), Not, and And—the speaker applied the specification, reducing expressions step-by-step.

For B, a Push instruction was introduced to place a boolean on the stack. For Not, a Neg instruction negated the top stack value, with the subexpression compiled inductively. For And, the derivation distributed stack operations over conditional branches, introducing an If instruction to select continuations based on a boolean. The final Compile function used a Halt continuation to stop execution. The resulting machine language and stack machine, implemented as an imperative tail-recursive loop, fit on a single slide, achieving orders-of-magnitude performance improvements over the interpreter.

Tackling Complexity with ZPure

To demonstrate real-world applicability, the speaker applied the technique to ZPure, a datatype from ZIO Prelude for pure computations with state, logging, and error handling. The language includes constructors for pure values, failures, error handling, state management, logging, and flat mapping. The evaluator threads state and logs, succeeding or failing based on the computation. The compiler derivation followed the same process, introducing instructions like Push, Throw, Load, Store, Log, Mark, Unmark, and Call to handle values, errors, state, and continuations.

The derivation for ZPure required careful handling of failures, where a Throw instruction invokes a failure routine that unwinds the stack until it finds a handler or crashes. For Catch and FlatMap, the speaker applied the induction hypothesis, introducing stack markers to manage handlers and continuations. Despite Scala functions in ZPure requiring runtime compilation, the speaker proposed defunctionalization—using data types like Flow or lambda calculus encodings—to eliminate this, though this was left as future work. The resulting compiler and machine, again fitting on a slide, were correct by construction, with unreachable cases confidently excluded.

Reflections and Future Directions

The talk emphasized that calculating compilers is a mechanical, repeatable process, not a mysterious art. By deriving machine instructions through equational reasoning, developers ensure correctness without extensive unit testing. The speaker noted a limitation in ZPure: its evaluator and compiler allow non-terminating expressions, which a partial monad could address. Future work includes defunctionalizing ZPure to avoid runtime compilation and optimizing machine code into directed acyclic graphs to reduce duplication.

The speaker recommended resources like Philip Wadler’s papers on calculating compilers, encouraging functional programmers to explore this approachable technique. The talk, blending humor with rigor, demonstrated that compiling DSLs is not only feasible but also “funnier” than guessing, offering a path to efficient, correct code.

Links:

Regis Kuckaertz on LinkedIn
ScalaIO Paris 2024 Session Page
ZIO Prelude Documentation (For context on ZPure)
Philip Wadler’s Papers (For calculating compilers)

Hashtags: #Scala #CompilerDesign #EquationalReasoning #ZPure #ScalaIOParis2024 #FunctionalProgramming

Posted in en-US | Tags: DSL, Scala, ScalaIO | No Comments »