Jonathan Lalou's Blog

[Scala IO Paris 2024] Escaping the False Dichotomy: Sanely Automatic Derivation in Scala

Author: Jonathan Lalou | December 26, 2024

In the ScalaIO Paris 2024 session “Slow Auto, Inconvenient Semi: Escaping False Dichotomy with Sanely Automatic Derivation,” Mateusz Kubuszok delivered a user-focused exploration of typeclass derivation in Scala. With over nine years of Scala experience and as a co-maintainer of the Chimney library, Mateusz examined the trade-offs between automatic and semi-automatic derivation, proposing a “sanely automatic” approach that balances usability, compile-time performance, and runtime efficiency. Using JSON serialization libraries like Circe and Jsoniter-Scala as case studies, the talk highlighted how library design choices impact users and offered a practical solution to common derivation pain points, supported by benchmarks and a public repository.

Understanding Typeclasses and Derivation

Mateusz began by demystifying typeclasses, describing them as parameterized interfaces (e.g., an Encoder[T] for JSON serialization) whose implementations are provided by the compiler via implicits or givens. Typeclass derivation automates creating these implementations for complex types like case classes or sealed traits by combining implementations for their fields or subtypes. For example, encoding a User(name: String, address: Address) requires encoders for String and Address, which the compiler recursively resolves.

Derivation comes in two flavors: automatic and semi-automatic. Automatic derivation, popularized by Circe, uses implicits to generate implementations on-demand, but can lead to slow compilation and runtime performance issues. Semi-automatic derivation requires explicit calls (e.g., deriveEncoder[Address]) to define implicits, ensuring consistency but adding manual overhead. Mateusz emphasized that these choices, made by library authors, significantly affect users through compilation times, error messages, and performance.

The Pain Points of Automatic and Semi-Automatic Derivation

Automatic derivation in Circe recursively generates encoders for nested types, checking for existing implicits before falling back to derivation. This can cause circular dependencies or stack overflows if not managed carefully. Semi-automatic derivation avoids this by always generating new instances, but requires users to define implicits for every intermediate type, increasing code verbosity. Mateusz shared anecdotes of developers banning automatic derivation due to compilation times ballooning to 10 minutes for complex JSON hierarchies.

Benchmarks on a nested case class (five levels deep) revealed stark differences. On Scala 2.13, Circe’s automatic derivation took 14 seconds to compile a single file, versus 12 seconds for semi-automatic. On Scala 3, automatic derivation soared to 46 seconds (cold JVM) or 16 seconds (warm), while semi-automatic took 10 seconds or 1 second, respectively. Runtime performance also suffered: automatic derivation on Scala 3 was up to 10 times slower than semi-automatic, likely due to large inlined methods overwhelming JVM optimization.

Comparing Libraries: Circe vs. Jsoniter-Scala

Mateusz contrasted Circe with Jsoniter-Scala, which prioritizes performance. Jsoniter-Scala uses a single macro to generate recursive codec implementations, avoiding intermediate implicits except for custom overrides. This reduces memory allocation and compilation overhead. Benchmarks showed Jsoniter-Scala compiling faster than Circe (e.g., 2–3 times faster on Scala 3) and running three times faster, even when Circe was given a head start by testing on JSON representations rather than strings.

Jsoniter-Scala’s approach minimizes implicit searches, embedding logic in macros instead of relying on the compiler’s typechecker. For example, encoding a User with an Address field involves one codec handling all nested types, unlike Circe’s recursive implicit resolution. This results in fewer allocations and better error messages, as macros can pinpoint failure causes (e.g., a missing implicit for a specific field).

Sanely Automatic Derivation: A New Approach

Inspired by Jsoniter-Scala, Mateusz proposed “sanely automatic derivation” to combine automatic derivation’s convenience with semi-automatic’s performance. Using his Chimney library as a testbed, he split typeclasses into two traits: one for user-facing APIs and another for automatic derivation, invisible to implicit searches to avoid circular dependencies. This allows recursive derivation with minimal implicits, using macros to handle nested types efficiently.

Mateusz implemented this for a Jsoniter-Scala wrapper on Scala 3, achieving compilation times comparable to Jsoniter-Scala’s single-implicit approach and faster than Circe’s semi-automatic derivation. Runtime performance matched Jsoniter-Scala’s, with negligible overhead. Error messages were improved by logging macro actions (e.g., which field caused a failure) via Scala’s macro settings, viewable in IDEs like VS Code without console output.

A Fair Comparison: Custom Typeclass Benchmark

To ensure fairness, Mateusz created a custom Show typeclass (similar to Circe’s ShowPretty) for pretty-printing case classes using StringBuilder. He implemented it with Shapeless (Scala 2), Mirrors (Scala 3), Magnolia (both), and his sanely automatic approach. Initial results showed his approach outperforming Shapeless and Mirrors but trailing Magnolia’s semi-automatic derivation. By adding caching within macros (reusing derived implementations for repeated types), his approach became the fastest across all platforms, compiling in less time than Shapeless, Mirrors, or Magnolia, with better runtime performance.

This caching, inspired by Jsoniter-Scala, avoids re-deriving the same type multiple times within a macro, reducing method size and enabling JVM optimization. The change required minimal code, demonstrating that library authors can adopt this approach with a single, non-invasive pull request.

Implications for Scala’s Future

Mateusz concluded by addressing Scala’s reputation for slow compilation, citing a Virtus Lab survey where 53% of developers complained about compile times, often tied to typeclass derivation. Libraries like Shapeless and Magnolia prioritize developer convenience over user experience, leading to opaque errors and performance issues. His sanely automatic derivation offers a user-friendly alternative: one import, fast compilation, efficient runtime, and debuggable errors.

By sharing his Chimney Macro Commons library, Mateusz encourages library authors to rethink derivation strategies. While macros require more maintenance than Shapeless or Magnolia, they become viable as libraries scale and prioritize user needs. He urged developers to provide feedback to library maintainers, challenging assumptions that automatic and semi-automatic are the only options, to make Scala more accessible and production-ready.

Links:

Hashtags: #Scala #TypeclassDerivation #ScalaIOParis2024 #Circe #JsoniterScala #Chimney #Performance #Macros #Scala3

Posted in en-US | Tags: Scala, ScalaIO