Posts Tagged ‘Scala’

[Scala IO Paris 2024] Escaping the False Dichotomy: Sanely Automatic Derivation in Scala

In the ScalaIO Paris 2024 session “Slow Auto, Inconvenient Semi: Escaping False Dichotomy with Sanely Automatic Derivation,” Mateusz Kubuszok delivered a user-focused exploration of typeclass derivation in Scala. With over nine years of Scala experience and as a co-maintainer of the Chimney library, Mateusz examined the trade-offs between automatic and semi-automatic derivation, proposing a “sanely automatic” approach that balances usability, compile-time performance, and runtime efficiency. Using JSON serialization libraries like Circe and Jsoniter-Scala as case studies, the talk highlighted how library design choices impact users and offered a practical solution to common derivation pain points, supported by benchmarks and a public repository.

Understanding Typeclasses and Derivation

Mateusz began by demystifying typeclasses, describing them as parameterized interfaces (e.g., an Encoder[T] for JSON serialization) whose implementations are provided by the compiler via implicits or givens. Typeclass derivation automates creating these implementations for complex types like case classes or sealed traits by combining implementations for their fields or subtypes. For example, encoding a User(name: String, address: Address) requires encoders for String and Address, which the compiler recursively resolves.
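The pattern can be sketched with a minimal, hypothetical Encoder typeclass (simplified for illustration; not Circe's actual API), with hand-written instances standing in for what derivation would generate:

```scala
// A typeclass is a parameterized interface whose instances the compiler
// locates via implicit search.
trait Encoder[A] {
  def encode(value: A): String
}

object Encoder {
  // Base instance for String.
  implicit val stringEncoder: Encoder[String] =
    s => "\"" + s + "\""
}

case class Address(city: String)
case class User(name: String, address: Address)

object Instances {
  import Encoder._
  // What derivation would generate for Address: combine field encoders.
  implicit val addressEncoder: Encoder[Address] =
    a => s"""{"city":${stringEncoder.encode(a.city)}}"""
  // Encoding User recursively reuses the String and Address encoders.
  implicit val userEncoder: Encoder[User] =
    u => s"""{"name":${stringEncoder.encode(u.name)},"address":${addressEncoder.encode(u.address)}}"""
}
```

Here `Instances.userEncoder.encode(User("Ada", Address("Paris")))` produces `{"name":"Ada","address":{"city":"Paris"}}` — derivation automates exactly this wiring.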

Derivation comes in two flavors: automatic and semi-automatic. Automatic derivation, popularized by Circe, uses implicits to generate implementations on demand, but can lead to slow compilation and runtime performance issues. Semi-automatic derivation requires explicit calls (e.g., deriveEncoder[Address]) to define implicits, ensuring consistency but adding manual overhead. Mateusz emphasized that these choices, made by library authors, significantly affect users through compilation times, error messages, and performance.
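The difference in user experience can be illustrated with a toy Show typeclass (the names are illustrative, not any library's real API):

```scala
trait Show[A] { def show(a: A): String }

object Show {
  implicit val intShow: Show[Int] = (a: Int) => a.toString
}

case class Point(x: Int, y: Int)

// Automatic style: an implicit def materializes an instance on demand at
// every use site (real libraries do this with macros, Shapeless, or
// Mirrors) — the user writes nothing, at the cost of repeated derivation.
object auto {
  implicit def derivedPointShow(implicit si: Show[Int]): Show[Point] =
    p => s"Point(${si.show(p.x)}, ${si.show(p.y)})"
}

// Semi-automatic style: the user calls a derivation method once and binds
// the result, so every use site shares the same cached instance.
object semiauto {
  def deriveShow(implicit si: Show[Int]): Show[Point] =
    p => s"Point(${si.show(p.x)}, ${si.show(p.y)})"
}

object SemiInstances {
  implicit val pointShow: Show[Point] = semiauto.deriveShow
}
```

The semi-automatic binding must be repeated for every intermediate type in a nested hierarchy — the verbosity the talk set out to eliminate.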

The Pain Points of Automatic and Semi-Automatic Derivation

Automatic derivation in Circe recursively generates encoders for nested types, checking for existing implicits before falling back to derivation. This can cause circular dependencies or stack overflows if not managed carefully. Semi-automatic derivation avoids this by always generating new instances, but requires users to define implicits for every intermediate type, increasing code verbosity. Mateusz shared anecdotes of developers banning automatic derivation due to compilation times ballooning to 10 minutes for complex JSON hierarchies.

Benchmarks on a nested case class (five levels deep) revealed stark differences. On Scala 2.13, Circe’s automatic derivation took 14 seconds to compile a single file, versus 12 seconds for semi-automatic. On Scala 3, automatic derivation soared to 46 seconds (cold JVM) or 16 seconds (warm), while semi-automatic took 10 seconds or 1 second, respectively. Runtime performance also suffered: automatic derivation on Scala 3 was up to 10 times slower than semi-automatic, likely due to large inlined methods overwhelming JVM optimization.

Comparing Libraries: Circe vs. Jsoniter-Scala

Mateusz contrasted Circe with Jsoniter-Scala, which prioritizes performance. Jsoniter-Scala uses a single macro to generate recursive codec implementations, avoiding intermediate implicits except for custom overrides. This reduces memory allocation and compilation overhead. Benchmarks showed Jsoniter-Scala compiling faster than Circe (e.g., 2–3 times faster on Scala 3) and running three times faster, even when Circe was given a head start by testing on JSON representations rather than strings.

Jsoniter-Scala’s approach minimizes implicit searches, embedding logic in macros instead of relying on the compiler’s typechecker. For example, encoding a User with an Address field involves one codec handling all nested types, unlike Circe’s recursive implicit resolution. This results in fewer allocations and better error messages, as macros can pinpoint failure causes (e.g., a missing implicit for a specific field).
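The contrast can be sketched without the library itself — this is a hand-written stand-in for the kind of code Jsoniter-Scala's macro generates, not its actual API:

```scala
case class Address(city: String)
case class User(name: String, address: Address)

// One value encodes the whole tree: nested types are handled inline in a
// single pass over a StringBuilder, with no intermediate implicits and no
// implicit search per field.
val userCodec: User => String = { u =>
  val sb = new StringBuilder
  sb.append("{\"name\":\"").append(u.name).append("\",")
    .append("\"address\":{\"city\":\"").append(u.address.city).append("\"}}")
  sb.toString
}
```

Because the whole structure is known inside one macro expansion, allocation of intermediate encoder objects disappears along with the implicit searches.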

Sanely Automatic Derivation: A New Approach

Inspired by Jsoniter-Scala, Mateusz proposed “sanely automatic derivation” to combine automatic derivation’s convenience with semi-automatic’s performance. Using his Chimney library as a testbed, he split typeclasses into two traits: one for user-facing APIs and another for automatic derivation, invisible to implicit searches to avoid circular dependencies. This allows recursive derivation with minimal implicits, using macros to handle nested types efficiently.
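In outline, the split looks like this (a simplified sketch, not Chimney's actual internals):

```scala
// User-facing typeclass: participates in implicit search as usual.
trait Encoder[A] {
  def encode(a: A): String
}

// Derivation-side subtype: never offered as an implicit, so when the macro
// recursively derives instances for nested fields it works with this type
// directly and cannot re-trigger (or loop through) implicit search.
trait AutoDerivedEncoder[A] extends Encoder[A]

case class Box(value: Int)

// What the macro would emit for Box: reuse a user-supplied Encoder[Int]
// if one is in scope, or derive the field inline otherwise.
val derivedBoxEncoder: AutoDerivedEncoder[Box] =
  new AutoDerivedEncoder[Box] {
    def encode(b: Box): String = s"""{"value":${b.value}}"""
  }
```

User-supplied overrides remain ordinary `Encoder` implicits, so customization still works; only the recursive machinery is hidden from the typechecker.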

Mateusz implemented this for a Jsoniter-Scala wrapper on Scala 3, achieving compilation times comparable to Jsoniter-Scala’s single-implicit approach and faster than Circe’s semi-automatic derivation. Runtime performance matched Jsoniter-Scala’s, with negligible overhead. Error messages were improved by logging macro actions (e.g., which field caused a failure) via Scala’s macro settings, viewable in IDEs like VS Code without console output.

A Fair Comparison: Custom Typeclass Benchmark

To ensure fairness, Mateusz created a custom Show typeclass (similar to Circe’s ShowPretty) for pretty-printing case classes using StringBuilder. He implemented it with Shapeless (Scala 2), Mirrors (Scala 3), Magnolia (both), and his sanely automatic approach. Initial results showed his approach outperforming Shapeless and Mirrors but trailing Magnolia’s semi-automatic derivation. By adding caching within macros (reusing derived implementations for repeated types), his approach became the fastest across all platforms, compiling in less time than Shapeless, Mirrors, or Magnolia, with better runtime performance.

This caching, inspired by Jsoniter-Scala, avoids re-deriving the same type multiple times within a macro, reducing method size and enabling JVM optimization. The change required minimal code, demonstrating that library authors can adopt this approach with a single, non-invasive pull request.

Implications for Scala’s Future

Mateusz concluded by addressing Scala’s reputation for slow compilation, citing a VirtusLab survey where 53% of developers complained about compile times, often tied to typeclass derivation. Libraries like Shapeless and Magnolia prioritize developer convenience over user experience, leading to opaque errors and performance issues. His sanely automatic derivation offers a user-friendly alternative: one import, fast compilation, efficient runtime, and debuggable errors.

By sharing his Chimney Macro Commons library, Mateusz encourages library authors to rethink derivation strategies. While macros require more maintenance than Shapeless or Magnolia, they become viable as libraries scale and prioritize user needs. He urged developers to provide feedback to library maintainers, challenging assumptions that automatic and semi-automatic are the only options, to make Scala more accessible and production-ready.

Hashtags: #Scala #TypeclassDerivation #ScalaIOParis2024 #Circe #JsoniterScala #Chimney #Performance #Macros #Scala3

[Scala IO Paris 2024] Calculating Is Funnier Than Guessing

In the ScalaIO Paris 2024 session “Calculating is funnier than guessing”, Regis Kuckaertz, a French developer living in an English-speaking country, captivated the audience with a methodical approach to writing compilers for domain-specific languages (DSLs) in Scala. The talk debunked the mystique of compiler construction, emphasizing a principled, calculation-based process over ad-hoc guesswork. Using equational reasoning and structural induction, the speaker derived a compiler and stack machine for a simple boolean expression language, Expr, and extended the approach to the more complex ZPure datatype from the ZIO Prelude library. The result was a correct-by-construction compiler, offering performance gains over interpreters while remaining accessible to functional programmers.

Laying the Foundation with Equational Reasoning

The talk began by highlighting the limitations of interpreters for DSLs, which, while easy to write via structural induction, incur runtime overhead. The speaker argued that functional programming’s strength lies in embedding DSLs, citing examples like Cats Effect, ZIO, and Kulo for metrics. To achieve “abstraction without regret,” DSLs must be compiled into efficient machine code. The proposed method, inspired by historical work on calculating compilers, avoids pre-made recipes, instead using a single-step derivation process combining evaluation, continuation-passing style (CPS), and defunctionalization.

For the Expr language, comprising boolean constants, negation, and conjunction, the speaker defined a denotational semantics with an evaluator function. This function maps expressions to boolean values, e.g., evaluating And(Not(B(true)), B(false)) to a boolean result. The evaluator was refined to make implicit behaviors explicit, such as Scala’s left-to-right evaluation of &&, ensuring the specification aligns with developer expectations. This step underscored the importance of intimate familiarity with execution details, uncovered through the derivation process.
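Reconstructed from the talk's description (the names are illustrative), the language and its evaluator look like:

```scala
sealed trait Expr
case class B(value: Boolean)     extends Expr
case class Not(e: Expr)          extends Expr
case class And(l: Expr, r: Expr) extends Expr

// Denotational semantics: map an expression to the boolean it denotes.
// The And case makes short-circuiting and left-to-right order explicit,
// rather than silently relying on Scala's && behaving the way we assume.
def eval(e: Expr): Boolean = e match {
  case B(b)      => b
  case Not(e1)   => !eval(e1)
  case And(l, r) => if (eval(l)) eval(r) else false
}
```

For example, `eval(And(Not(B(true)), B(false)))` evaluates to `false` without ever inspecting the right operand.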

Deriving a Compiler for Expr

The core of the talk was deriving a compiler and stack machine for Expr using equational reasoning. The correctness specification required that compiling an expression and executing it on a stack yields the same result as evaluating the expression and pushing it onto the stack. The compiler was defined with a helper function using symbolic CPS, taking a continuation to guide code generation. For each constructor—B (boolean), Not, and And—the speaker applied the specification, reducing expressions step-by-step.

For B, a Push instruction was introduced to place a boolean on the stack. For Not, a Neg instruction negated the top stack value, with the subexpression compiled inductively. For And, the derivation distributed stack operations over conditional branches, introducing an If instruction to select continuations based on a boolean. The final Compile function used a Halt continuation to stop execution. The resulting machine language and stack machine, implemented as an imperative tail-recursive loop, fit on a single slide, achieving orders-of-magnitude performance improvements over the interpreter.
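Putting the pieces together, here is one compact reconstruction of the derived compiler and machine (the Expr ADT and evaluator are repeated so the snippet is self-contained; instruction names follow the talk, the rest is an assumption of this sketch):

```scala
sealed trait Expr
case class B(value: Boolean)     extends Expr
case class Not(e: Expr)          extends Expr
case class And(l: Expr, r: Expr) extends Expr

def eval(e: Expr): Boolean = e match {
  case B(b)      => b
  case Not(e1)   => !eval(e1)
  case And(l, r) => if (eval(l)) eval(r) else false
}

// Machine code: each instruction carries its continuation (defunctionalized CPS).
sealed trait Code
case class Push(b: Boolean, next: Code)    extends Code
case class Neg(next: Code)                 extends Code
case class If(ifTrue: Code, ifFalse: Code) extends Code
case object Halt                           extends Code

// comp(e, k): code that pushes the value of e, then continues with k.
def comp(e: Expr, k: Code): Code = e match {
  case B(b)      => Push(b, k)
  case Not(e1)   => comp(e1, Neg(k))
  case And(l, r) => comp(l, If(comp(r, k), Push(false, k)))
}

def compile(e: Expr): Code = comp(e, Halt)

// The stack machine as a tail-recursive loop.
@annotation.tailrec
def exec(c: Code, stack: List[Boolean]): List[Boolean] = c match {
  case Push(b, k) => exec(k, b :: stack)
  case Neg(k)     => exec(k, !stack.head :: stack.tail)
  case If(t, f)   => exec(if (stack.head) t else f, stack.tail)
  case Halt       => stack
}
```

The correctness specification — `exec(compile(e), s) == eval(e) :: s` for every expression and stack — is what each case of `comp` was calculated from, so it holds by construction.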

Tackling Complexity with ZPure

To demonstrate real-world applicability, the speaker applied the technique to ZPure, a datatype from ZIO Prelude for pure computations with state, logging, and error handling. The language includes constructors for pure values, failures, error handling, state management, logging, and flat mapping. The evaluator threads state and logs, succeeding or failing based on the computation. The compiler derivation followed the same process, introducing instructions like Push, Throw, Load, Store, Log, Mark, Unmark, and Call to handle values, errors, state, and continuations.

The derivation for ZPure required careful handling of failures, where a Throw instruction invokes a failure routine that unwinds the stack until it finds a handler or crashes. For Catch and FlatMap, the speaker applied the induction hypothesis, introducing stack markers to manage handlers and continuations. Despite Scala functions in ZPure requiring runtime compilation, the speaker proposed defunctionalization—using data types like Flow or lambda calculus encodings—to eliminate this, though this was left as future work. The resulting compiler and machine, again fitting on a slide, were correct by construction, with unreachable cases confidently excluded.

Reflections and Future Directions

The talk emphasized that calculating compilers is a mechanical, repeatable process, not a mysterious art. By deriving machine instructions through equational reasoning, developers ensure correctness without extensive unit testing. The speaker noted a limitation in ZPure: its evaluator and compiler allow non-terminating expressions, which a partial monad could address. Future work includes defunctionalizing ZPure to avoid runtime compilation and optimizing machine code into directed acyclic graphs to reduce duplication.

The speaker recommended resources like Philip Wadler’s papers on calculating compilers, encouraging functional programmers to explore this approachable technique. The talk, blending humor with rigor, demonstrated that compiling DSLs is not only feasible but also “funnier” than guessing, offering a path to efficient, correct code.

Hashtags: #Scala #CompilerDesign #EquationalReasoning #ZPure #ScalaIOParis2024 #FunctionalProgramming

[ScalaDays 2019] Techniques for Teaching Scala

Noel Welsh, co-founder of Underscore and a respected voice in the Scala community, known for his educational contributions including the books “Creative Scala” and “Essential Scala,” shared his valuable insights on “Techniques for Teaching Scala” at ScalaDays Lausanne 2019. His presentation moved beyond the technical intricacies of the language to explore the pedagogical approaches that make learning Scala more effective, engaging, and accessible to newcomers.

Addressing Scala’s Complexity

Welsh likely began by acknowledging the perceived complexity of Scala. While powerful, its rich feature set, including its blend of object-oriented and functional paradigms, can present a steep learning curve. His talk would have focused on how educators and mentors can break down these complexities into digestible components, fostering a positive learning experience that encourages long-term adoption and mastery of the language.

Structured and Incremental Learning

One of the core themes Welsh might have explored is the importance of a structured and incremental approach to teaching. Instead of overwhelming beginners with advanced concepts like implicits, monads, or higher-kinded types from day one, he probably advocated for starting with Scala’s more familiar object-oriented features, gradually introducing functional concepts as the learner builds confidence and a foundational understanding. He might have shared specific curriculum structures or learning pathways that have proven successful in his experience, perhaps drawing from his work at Underscore Training or the philosophy behind “Creative Scala,” which uses creative coding exercises to make learning fun and tangible.

Practical Application and Real-World Examples

Another key technique Welsh likely emphasized is the power of practical application and real-world examples. Abstract theoretical explanations, he might have argued, are less effective than hands-on coding exercises that allow learners to see Scala in action and understand how its features solve concrete problems. This could involve guiding students through building small, meaningful projects or using Scala for data manipulation, web development, or other relatable tasks. The use of tools like Scala Worksheets or Ammonite scripts for immediate feedback and experimentation could also have been highlighted as beneficial for learning.

Psychological Aspects of Learning

Welsh may have also delved into the psychological aspects of learning a new programming language. Addressing common frustrations, building a supportive learning environment, and celebrating small victories can significantly impact a student’s motivation and persistence. He might have discussed the role of clear, concise explanations, well-chosen analogies, and the importance of providing constructive feedback. Techniques for teaching specific Scala concepts that often trip up beginners, such as pattern matching, futures, or the collections library, could have been illustrated with practical teaching tips.

Investing in Scala Education

Ultimately, Noel Welsh’s presentation would have been a call to the Scala community to invest not just in developing the language and its tools, but also in refining the art and science of teaching it. By adopting more effective pedagogical techniques, the community can lower the barrier to entry, nurture new talent, and ensure that Scala’s power is accessible to a broader audience of developers, fostering a more vibrant and sustainable ecosystem.

Hashtags: #ScalaDays2019 #Scala #Education #Teaching #Pedagogy #NoelWelsh #CreativeScala

[ScalaDays 2019] Scala Best Practices Unveiled

At ScalaDays Lausanne 2019, Nicolas Rinaudo, CTO of Besedo, delivered a candid and insightful talk on Scala best practices, drawing from his experience merging Node, Java, and Scala teams. Speaking at EPFL’s 10th anniversary conference, Nicolas shared pitfalls he wished he’d known as a beginner, from array comparisons to exceptions, offering practical solutions. His interactive approach—challenging the audience to spot code issues—drew laughter, applause, and questions on tools like Scalafix, making his session a highlight for developers seeking to write robust Scala code.

Array Comparisons and Type Annotations

Nicolas began with a deceptive array comparison using ==, which fails because arrays use reference equality on the JVM, unlike other Scala collections. Beginners from Java or JavaScript backgrounds often stumble here, expecting value equality. The fix, sameElements, ensures correct comparisons, while Scala 3’s immutable arrays behave sanely. Another common error is omitting type annotations on public methods. Without explicit types, the compiler may infer overly precise types like Some instead of Option, risking binary compatibility issues in libraries. Nicolas advised always annotating public members, noting that Scala 3 requires explicit type annotations on implicit members, though the core issue persists, so vigilance remains essential for maintainable code.
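Both pitfalls fit in a few lines (the method names below are illustrative):

```scala
val a = Array(1, 2, 3)
val b = Array(1, 2, 3)
val byReference = a == b            // false: arrays compare by reference on the JVM
val byElements  = a.sameElements(b) // true: element-wise comparison
val listEq      = List(1, 2, 3) == List(1, 2, 3) // true: value equality

// Without an annotation the inferred result type is Some[Int] — an
// over-precise public signature that breaks if the body later returns None.
def firstEvenInferred = Some(2)
// Annotating the intended abstraction keeps the API and binary compatibility stable:
def firstEven: Option[Int] = Some(2)
```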

Sealed Traits and Case Classes

Enumerations in Scala, Nicolas warned, are treacherous, allowing non-exhaustive pattern matches that compile but crash at runtime. Sealed traits, with subtypes declared in the same file, enable the compiler to enforce exhaustive matching, preventing such failures. He recommended enabling fatal warnings to catch these issues. Similarly, non-final case classes can be extended, leading to unexpected equality behaviors, like two distinct objects comparing equal due to inherited methods. Making case classes final avoids this, though Scala’s default remains unchanged even in Scala 3. Nicolas also noted that sealed traits aren’t foolproof—subtypes can be extended unless final, a subtlety that surprises learners but is manageable with disciplined design.
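A minimal sketch of both recommendations:

```scala
// Sealed: all subtypes live in the same file, so the compiler can check
// pattern matches for exhaustiveness (fatal with -Xfatal-warnings / -Werror).
sealed trait Status
case object Active   extends Status
case object Inactive extends Status

def label(s: Status): String = s match {
  case Active   => "active"
  case Inactive => "inactive" // deleting this case triggers a compile-time warning
}

// Final: the case class cannot be extended, so its generated equality
// cannot be broken by a subclass adding state.
final case class UserId(value: Long)
```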

Exceptions and Referential Transparency

Exceptions, common in Java, disrupt Scala’s referential transparency, where expressions should be replaceable with their values without altering behavior. Nicolas likened exceptions to goto statements, as their origin is untraceable, and Scala’s unchecked exceptions evade compile-time checks. For instance, parsing a string to an integer may throw a NumberFormatException, yet most possible inputs fail to parse, making failure a common case rather than an exceptional one — a poor fit for exceptions. Instead, use Option for absent values, Either for errors with context, or Try for Java interop. These encode errors in return types, enhancing predictability. Scala 3 offers no changes here, so Nicolas’s advice remains critical for writing transparent, robust code.
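The three encodings side by side (using `toIntOption`, available since Scala 2.13):

```scala
import scala.util.{Try, Success}

// Absence: Option when the caller only needs to know "no value".
def parseInt(s: String): Option[Int] = s.toIntOption

// Errors with context: Either carries a description of the failure.
def parseIntE(s: String): Either[String, Int] =
  s.toIntOption.toRight(s"not a number: '$s'")

// Java interop: Try captures the thrown exception at the boundary.
def parseIntT(s: String): Try[Int] = Try(s.toInt)
```

Callers now see the failure mode in the type — `parseInt("oops")` returns `None` instead of throwing — and the compiler forces them to handle it.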

Enforcing Best Practices

To prevent these pitfalls, Nicolas advocated rigorous code reviews, citing research showing reduced defects. Automated tools like WartRemover, Scalafix, Scapegoat, and Scalastyle detect issues during builds, though some, like WartRemover’s strict defaults, may need tuning. Migrating to Scala 3 addresses many problems, such as implicit conversions and enumeration ambiguities, with features like the enum keyword and explicit import rules. Nicolas invited the audience to share additional practices, pointing to his companion website for detailed articles. His acknowledgment of Scala contributors, especially for Scala 3 insights, underscored the community’s role in evolving best practices, making his talk a call to action for cleaner coding.

Hashtags: #ScalaDays2019 #Scala #BestPractices #FunctionalProgramming

[ScalaDays 2019] Preserving Privacy with Scala

At ScalaDays Lausanne 2019, Manohar Jonnalagedda and Jakob Odersky, researchers turned industry innovators, unveiled a Scala-powered approach to secure multi-party computation (MPC) at EPFL’s 10th anniversary conference. Their talk, moments before lunch, captivated attendees with a protocol to average salaries without revealing individual data, sparking curiosity about privacy-preserving applications. Manohar and Jakob, from Inpher, detailed a compiler transforming high-level code into secure, distributed computations, addressing real-world challenges like GDPR-compliant banking and satellite collision detection, earning applause and probing questions on security and scalability.

A Privacy-Preserving Protocol

Manohar opened with a relatable scenario: wanting to compare salaries with Jakob without disclosing personal figures. Their solution, a privacy-preserving protocol, lets three parties—Manohar, Jakob, and their CTO Dmitar—compute an average securely. Each generates three random numbers summing to zero, sharing them such that each party holds a unique view of partial sums. In Scala, this is modeled with a SecretValue type for private integers and a SharedNumber list, accessible only by the corresponding party. Each sums their shares, publishes the result, and the final sum reveals the average without exposing individual salaries. This protocol, using random shares, ensures no single party can deduce another’s data unless all communications are intercepted, balancing simplicity and security.
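The idea can be sketched with additive secret sharing, an equivalent formulation of the zero-sum-randomness trick (illustrative only, not Inpher's implementation):

```scala
import scala.util.Random

// Each party splits its salary into three random shares that sum back to
// the salary, keeps one, and hands one to each other party. A party then
// publishes only the sum of the three shares it holds, so no individual
// salary is ever revealed — yet the published partial sums add up to the
// true total. (Long arithmetic wraps consistently, so overflow cancels.)
def split(secret: Long, rnd: Random): IndexedSeq[Long] = {
  val r1 = rnd.nextLong()
  val r2 = rnd.nextLong()
  IndexedSeq(r1, r2, secret - r1 - r2) // the three shares sum to secret
}

val rnd      = new Random(2024)
val salaries = Seq(50000L, 60000L, 70000L)           // private inputs
val shares   = salaries.map(split(_, rnd))           // shares(i)(j): party i's j-th share
val partials = (0 until 3).map(j => shares.map(_(j)).sum) // each party publishes one sum
val average  = partials.sum / salaries.size
```

Here `average` comes out to 60000 without any party ever disclosing its own input — an eavesdropper would need all three partial-sum messages plus the private shares to reconstruct a salary.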

Secure Multi-Party Computation

Jakob explained MPC as a cryptographic subfield enabling joint function computation without revealing private inputs. The salary example used addition, but MPC supports multiplication via Beaver triples, precomputed by a trusted dealer for efficiency. With addition and multiplication, MPC handles polynomials, enabling linear and logistic regression or exponential approximations via Taylor polynomials. Manohar highlighted Scala’s role in modeling these operations, with functions for element-wise addition and revealing sums. The protocol achieves information-theoretic security for integers, where masked values are indistinguishable, but floating-point numbers require computational security due to distribution challenges. This flexibility makes MPC suitable for complex computations, from machine learning to statistical analysis, all while preserving privacy.

Real-World Applications

MPC shines in domains where data sensitivity or legal constraints, like GDPR, restrict sharing. Manohar cited ING, a bank building credit-scoring models across European countries without moving user data across borders, complying with GDPR. Another compelling case involved satellite operators—American, Russian, or Chinese—secretly computing collision risks to avoid incidents like the 2009 Iridium-Cosmos crash, which threatened the International Space Station. Jakob emphasized that Inpher’s XOR platform, legally vetted by Baker McKenzie, ensures GDPR compliance by keeping data at its source. These use cases, from finance to defense, underscore MPC’s value in enabling secure collaboration, with Scala providing a robust, type-safe foundation for protocol implementation.

Building a Compiler for MPC

To scale MPC beyond simple embeddings, Manohar and Jakob’s team developed a compiler at Inpher, targeting a high-level language resembling Python or Scala for data scientists. This compiler transforms linear algebra-style code into low-level primitives for distributed execution across parties’ virtual machines, verified to prevent data leaks. It performs static analysis to optimize memory and communication, inferring masking parameters to minimize computational overhead. For example, multiplying masked floating-point numbers risks format explosion, so the compiler uses fixed-point representations and statistical bounds to maintain efficiency. The output, resembling assembly for MPC engines, manages memory allocation and propagates data dimensions. While currently MPC-focused, the compiler’s design could integrate other privacy techniques, offering a versatile platform for secure computation.

Hashtags: #ScalaDays2019 #Scala #Privacy #MPC