Posts Tagged ‘Performance’

[DotJs2024] Embracing Reactivity: Signals Unveiled in Modern Web Frameworks

As web architectures burgeon in intricacy, the quest for fluid state orchestration intensifies, demanding primitives that harmonize intuition with efficiency. Ruby Jane Cabagnot, an Oslo-based full-stack artisan and co-author of Practical Enterprise React, illuminated this quest at dotJS 2024. With a portfolio spanning cloud services and DevOps, Ruby dissected signals’ ascendancy in frameworks like SolidJS and Svelte, tracing their lineage from Knockout’s observables to today’s compile-time elixirs. Her exposition: a clarion call for developers to harness these sentinels, streamlining reactivity while amplifying responsiveness.

Ruby’s odyssey commenced with historical moorings: Knockout’s MVVM pioneered observables, auto-propagating UI tweaks; AngularJS echoed with bidirectional bonds, model-view symphonies. React’s virtual DOM and hooks refined declarative flows, context cascades sans impurity. Yet SolidJS and Svelte champion signals—granular beacons tracking dependencies, updating solely the perturbed loci. In Solid, createSignal births a reactive vessel: name tweaks ripple to inputs and paragraphs—minimal footprint, maximal sync. Svelte compiles reactivity at build time: $: reactive statements weave updates into markup, runtime overhead evaporated.
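
To ground the Solid half of the comparison, here is a minimal sketch of createSignal in action; the component and variable names are illustrative, not from the talk:

```jsx
// Minimal SolidJS sketch: a signal is a getter/setter pair tracked at fine granularity.
import { createSignal } from "solid-js";

function Greeter() {
  const [name, setName] = createSignal("Ada"); // reactive vessel: read via name(), write via setName()
  return (
    <>
      <input value={name()} onInput={(e) => setName(e.currentTarget.value)} />
      <p>Hello, {name()}!</p> {/* only this text node updates when the signal changes */}
    </>
  );
}
```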

Vue’s ref system aligns, signals as breath-easy bindings. Ruby extolled their triad: intuitiveness supplants boilerplate bazaars; performance prunes needless re-renders, DOM diffs distilled; developer delight via declarative purity, codebases crystalline. Signals transcend UIs, infiltrating WebAssembly’s server tides, birthing omnipresent reactivity. Ruby’s entreaty: probe these pillars, propel paradigms where apps pulse as dynamically as their environs.

Evolutionary Echoes of Reactivity

Ruby retraced trails: Knockout’s observables ignited auto-updates; AngularJS’s bonds synchronized realms. React’s hooks democratized context; Solid/Svelte’s signals granularize, compile-time cunning curbing cascades—name flux mends markup sans wholesale refresh.

Signals’ Synergies in Action

Solid’s vessels auto-notify dependents; Svelte’s directives distill runtime to essence. Vue’s refs render reactivity reflexive. Ruby rejoiced: libraries obsolete, renders refined, ergonomics elevated—crafting canvases concise, performant, profound.

Links:

[DevoxxFR2025] Boosting Java Application Startup Time: JVM and Framework Optimizations

In the world of modern application deployment, particularly in cloud-native and microservice architectures, fast startup time is a crucial factor impacting scalability, resilience, and cost efficiency. Slow-starting applications can delay deployments, hinder auto-scaling responsiveness, and consume resources unnecessarily. Olivier Bourgain, in his presentation, delved into strategies for significantly accelerating the startup time of Java applications, focusing on optimizations at both the Java Virtual Machine (JVM) level and within popular frameworks like Spring Boot. He explored techniques ranging from garbage collection tuning to leveraging emerging technologies like OpenJDK’s Project Leyden and Spring AOT (Ahead-of-Time Compilation) to make Java applications lighter, faster, and more efficient from the moment they start.

The Importance of Fast Startup

Olivier began by explaining why fast startup time matters in modern environments. In microservices architectures, applications are frequently started and stopped as part of scaling events, deployments, or rolling updates. A slow startup adds to the time it takes to scale up to handle increased load, potentially leading to performance degradation or service unavailability. In serverless or function-as-a-service environments, cold starts (the time it takes for an idle instance to become ready) are directly impacted by application startup time, affecting latency and user experience. Faster startup also improves developer productivity by reducing the waiting time during local development and testing cycles. Olivier emphasized that optimizing startup time is no longer just a minor optimization but a fundamental requirement for efficient cloud-native deployments.

JVM and Garbage Collection Optimizations

Optimizing the JVM configuration and understanding garbage collection behavior are foundational steps in improving Java application startup. Olivier discussed how different garbage collectors (like G1, Parallel, or ZGC) can impact startup time and memory usage. Tuning JVM arguments related to heap size, garbage collection pauses, and just-in-time (JIT) compilation tiers can influence how quickly the application becomes responsive. While JIT compilation is crucial for long-term performance, it can introduce startup overhead as the JVM analyzes and optimizes code during initial execution. Techniques like Class Data Sharing (CDS) were mentioned as a way to reduce startup time by sharing pre-processed class metadata between multiple JVM instances. Olivier provided practical tips and configurations for optimizing JVM settings specifically for faster startup, balancing it with overall application performance.
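
As a rough illustration of these knobs, a startup-oriented launch might pair a CDS archive with a capped JIT tier. The flags below are standard HotSpot options, but the values and file names are examples, not the talk's prescriptions:

```shell
# Training run: record loaded classes into a CDS archive (JDK 13+)
java -XX:ArchiveClassesAtExit=app.jsa -jar app.jar

# Subsequent runs: reuse the archive, cap JIT at C1 to cut warm-up work, fix the heap size
java -XX:SharedArchiveFile=app.jsa \
     -XX:TieredStopAtLevel=1 \
     -Xms512m -Xmx512m \
     -jar app.jar
```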

Framework Optimizations: Spring Boot and Beyond

Popular frameworks like Spring Boot, while providing immense productivity benefits, can sometimes contribute to longer startup times due to their extensive features and reliance on reflection and classpath scanning during initialization. Olivier explored strategies within the Spring ecosystem and other frameworks to mitigate this. He highlighted Spring AOT (Ahead-of-Time Compilation) as a transformative technology that analyzes the application at build time and generates optimized code and configuration, reducing the work the JVM needs to do at runtime. This can significantly decrease startup time and memory footprint, making Spring Boot applications more suitable for resource-constrained environments and serverless deployments. Project Leyden in OpenJDK, aiming to enable static images and further AOT compilation for Java, was also discussed as a future direction for improving startup performance at the language level. Olivier demonstrated how applying these framework-specific optimizations and leveraging AOT compilation can have a dramatic impact on the startup speed of Java applications, making them competitive with applications written in languages traditionally known for faster startup.
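
For context, a hedged sketch of what enabling Spring AOT looks like with Maven on Spring Boot 3.x; artifact names are placeholders:

```shell
# Build: the process-aot goal analyzes the application and generates optimized sources/config
mvn -DskipTests package spring-boot:process-aot

# Run: tell Spring to use the AOT-generated configuration instead of runtime scanning
java -Dspring.aot.enabled=true -jar target/app.jar
```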

Links:

[OxidizeConf2024] The Basics of Profile-Guided Optimization (PGO) in Rust

Understanding PGO and Its Relevance

In the realm of high-performance software, optimizing applications for speed and efficiency is paramount. At OxidizeConf2024, Alexander Zaitsev, a seasoned developer with a background in C++ and extensive experience in compiler optimizations, delivered a comprehensive exploration of Profile-Guided Optimization (PGO) in Rust. PGO leverages runtime statistics to enhance software performance, a technique long supported by the Rustc compiler but fraught with practical challenges. Alexander shared his insights from applying PGO across diverse domains—compilers, databases, game engines, and CLI tools—offering a roadmap for developers seeking to harness its potential.

PGO operates by collecting runtime data to inform compiler optimizations, enabling tailored code generation that aligns with actual workloads. Alexander emphasized its applicability to CPU-intensive applications with complex branching, such as parsers, databases, and operating systems like OxideOS. By analyzing real-world benchmarks, he demonstrated significant performance gains, with speedups observed in open-source projects. These results, derived from practical workloads rather than synthetic tests, underscore PGO’s value in enhancing Rust applications, making it a vital tool for developers prioritizing performance.

Practical Applications and Performance Gains

Alexander’s presentation highlighted concrete examples of PGO’s impact. For instance, applying PGO to libraries like Serde resulted in notable performance improvements, with benchmarks showing reduced execution times for JSON parsing and other tasks. He showcased results from various applications, including databases and game engines, where PGO optimized critical paths, reducing latency and improving throughput. These gains were achieved by enabling PGO’s instrumentation-based approach, which collects detailed runtime profiles to guide optimizations like inlining and branch prediction.
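
In practice, the instrumentation workflow he described is typically driven with the community cargo-pgo tool; a sketch, with the binary name, target triple, and workload as placeholders:

```shell
cargo install cargo-pgo        # one-time setup (uses llvm-profdata from llvm-tools)
cargo pgo build                # 1. build an instrumented binary
./target/x86_64-unknown-linux-gnu/release/myapp --input bench.json
                               # 2. run a representative workload to collect profiles
cargo pgo optimize build       # 3. rebuild, feeding the gathered profiles to rustc
```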

However, Alexander cautioned that PGO’s effectiveness depends on workload representativeness. For databases, he used analytical workloads to generate profiles, ensuring optimizations aligned with typical usage patterns. This approach contrasts with manual optimization techniques, such as annotating hot paths, which Alexander argued are less adaptable to changing workloads. By automating optimization through PGO, developers can achieve consistent performance improvements without the maintenance burden of manual tweaks, a significant advantage in dynamic environments.

Navigating PGO’s Challenges and Ecosystem

While PGO offers substantial benefits, Alexander detailed several pitfalls. Common traps include profile mismatches, where training data does not reflect production workloads, leading to suboptimal optimizations. He also highlighted issues with link-time optimization (LTO), which, while complementary to PGO, is not universally adopted. To mitigate these, Alexander recommended starting with release mode optimizations (level three) and enabling LTO before applying PGO. For advanced users, sampling-based PGO or continuous profiling, as practiced by Google, can further enhance results.
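
His recommended starting point maps to a release profile along these lines; a sketch, where codegen-units = 1 is a common companion setting rather than his explicit advice:

```toml
# Cargo.toml: enable full optimization and LTO first, then layer PGO on top
[profile.release]
opt-level = 3      # release optimization "level three"
lto = true         # link-time optimization, complementary to PGO
codegen-units = 1  # assumption: often paired with LTO for maximum effect
```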

The Rust ecosystem’s PGO support is evolving, with tools like cargo-pgo and community contributions improving accessibility. Alexander pointed to his “Awesome PGO” repository as a comprehensive resource, offering benchmarks and guidance across ecosystems, including C++. He also noted ongoing efforts to integrate machine learning into PGO workflows, citing tools like LLVM’s BOLT and Intel’s Thin Layout Optimizer. These advancements signal a bright future for PGO in Rust, though challenges like build system integration and community tooling maturity remain.

Links:

[Oracle Dev Days 2025] Optimizing Java Performance: Choosing the Right Garbage Collector

Jean-Philippe Bempel, a seasoned developer at Datadog and a Java Champion, delivered an insightful presentation on selecting and tuning Garbage Collectors (GCs) in OpenJDK to enhance Java application performance. His talk, rooted in practical expertise, unraveled the complexities of GCs, offering a roadmap for developers to align their choices with specific application needs. By dissecting the characteristics of various GCs and their suitability for different workloads, Jean-Philippe provided actionable strategies to optimize memory management, reduce production issues, and boost efficiency.

Understanding Garbage Collectors in OpenJDK

Garbage Collectors are pivotal in Java’s memory management, silently handling memory allocation and reclamation. However, as Jean-Philippe emphasized, a misconfigured GC can lead to significant performance bottlenecks in production environments. OpenJDK offers a suite of GCs—Serial GC, Parallel GC, G1, Shenandoah, and ZGC—each designed with distinct characteristics to cater to diverse application requirements. The challenge lies in selecting the one that best matches the workload, whether it prioritizes throughput or low latency.

Jean-Philippe began by outlining the foundational concepts of GCs, particularly the generational model. Most GCs in OpenJDK are generational, dividing memory into the Young Generation (for short-lived objects) and the Old Generation (for longer-lived objects). The Young Generation is further segmented into the Eden space, where new objects are allocated, and Survivor spaces, which hold objects that survive initial collections before promotion to the Old Generation. Additionally, the Metaspace stores class metadata, a critical but often overlooked component of memory management.

Serial GC: Simplicity for Constrained Environments

The Serial GC, one of the oldest collectors, operates with a single thread and employs a stop-the-world approach, pausing all application threads during collection. Jean-Philippe highlighted its suitability for small-scale applications, particularly those running in containers with less than 2 GB of RAM, where it serves as the default GC. Its simplicity makes it ideal for environments with limited resources, but its stop-the-world nature can introduce noticeable pauses, making it less suitable for latency-sensitive applications.

To illustrate, Jean-Philippe explained the mechanics of the Young Generation’s Survivor spaces. These spaces, S0 and S1, alternate roles as source and destination during minor GC cycles, copying live objects to manage memory efficiently. Objects surviving multiple cycles are promoted to the Old Generation, reducing the overhead of frequent collections. This generational approach leverages the hypothesis that most objects die young, minimizing the cost of memory reclamation.

Parallel GC: Maximizing Throughput

For applications prioritizing throughput, such as batch processing jobs, the Parallel GC offers significant advantages. Unlike the Serial GC, it leverages multiple threads to reclaim memory, making it efficient for systems with ample CPU cores. Jean-Philippe noted that it was the default GC through JDK 8 (G1 took over in JDK 9) and remains a strong choice for throughput-oriented workloads like Spark jobs, Kafka consumers, or ETL processes.

The Parallel GC, also stop-the-world, excels in scenarios where total execution time matters more than individual pause durations. Jean-Philippe shared a benchmark using a JFR (Java Flight Recorder) file parsing application, where Parallel GC outperformed others, achieving a throughput of 97% (time spent in application versus GC). By tuning the Young Generation size to reduce frequent minor GCs, developers can further minimize object copying, enhancing overall performance.

G1 GC: Balancing Throughput and Latency

The G1 (Garbage-First) GC, default since JDK 9 for heaps larger than 2 GB, strikes a balance between throughput and latency. Jean-Philippe described its region-based memory management, dividing the heap into smaller regions (Eden, Survivor, Old, and Humongous for large objects). This structure allows G1 to focus on regions with the most garbage, optimizing memory reclamation with minimal copying.

In his benchmark, G1 showed a throughput of 85%, with average pause times of 76 milliseconds, aligning with its target of 200 milliseconds. However, Jean-Philippe pointed out challenges with Humongous objects, which can increase GC frequency if not managed properly. By adjusting region sizes (up to 32 MB), developers can mitigate these issues, improving throughput for applications like batch jobs while maintaining reasonable pause times.

Shenandoah and ZGC: Prioritizing Low Latency

For latency-sensitive applications, such as HTTP servers or microservices, Shenandoah and ZGC are the go-to choices. These concurrent GCs minimize pause times, often below a millisecond, by performing most operations alongside the running application. Jean-Philippe highlighted Shenandoah’s non-generational approach (though a generational version is in development) and ZGC’s generational support since JDK 21, making the latter particularly efficient for large heaps.

In a latency-focused benchmark using a Spring PetClinic application, Jean-Philippe demonstrated that Shenandoah and ZGC maintained request latencies below 200 milliseconds, significantly outperforming Parallel GC’s 450 milliseconds at the 99th percentile. ZGC’s use of colored pointers and load/store barriers ensures rapid memory reclamation, allowing regions to be freed early in the GC cycle, a key advantage over Shenandoah.

Tuning Strategies for Optimal Performance

Tuning GCs is as critical as selecting the right one. For Parallel GC, Jean-Philippe recommended sizing the Young Generation to reduce the frequency of minor GCs, ideally exceeding 50% of the heap to minimize object copying. For G1, adjusting region sizes can address Humongous object issues, while setting a maximum pause time target (e.g., 50 milliseconds) can shift its behavior toward latency sensitivity, though it may not compete with Shenandoah or ZGC in extreme cases.

For concurrent GCs like Shenandoah and ZGC, ensuring sufficient heap size and CPU cores prevents allocation stalls, where threads wait for memory to be freed. Jean-Philippe emphasized that Shenandoah requires careful heap sizing to avoid full GCs, while ZGC’s rapid region reclamation reduces such risks, making it more forgiving for high-allocation-rate applications.
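
Gathered in one place, these knobs correspond roughly to the following HotSpot flags; the flags are real, but the values are illustrative examples rather than the talk's prescriptions:

```shell
-XX:+UseParallelGC -Xmx8g -Xmn5g        # Young Generation above 50% of the heap for batch jobs
-XX:+UseG1GC -XX:G1HeapRegionSize=32m   # larger regions to tame Humongous objects
-XX:+UseG1GC -XX:MaxGCPauseMillis=50    # pause-time target nudges G1 toward latency sensitivity
-XX:+UseShenandoahGC                    # concurrent, low-pause collection
-XX:+UseZGC -XX:+ZGenerational          # generational ZGC (JDK 21+)
```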

Selecting the Right GC for Your Workload

Jean-Philippe concluded by categorizing workloads into two types: throughput-oriented (SPOT) and latency-sensitive. For SPOT workloads, such as batch jobs or ETL processes, Parallel GC or G1 are optimal, with Parallel GC offering easier tuning for predictable performance. For latency-sensitive applications, like microservices or databases (e.g., Cassandra), ZGC’s generational efficiency and Shenandoah’s low-pause capabilities shine, with ZGC being particularly effective for large heaps.

By analyzing workload characteristics and leveraging tools like GC Easy for log analysis, developers can make informed GC choices. Jean-Philippe’s benchmarks underscored the importance of tailoring GC configurations to specific use cases, ensuring both performance and stability in production environments.

Links:

Hashtags: #Java #GarbageCollector #OpenJDK #Performance #Tuning #Datadog #JeanPhilippeBempel #OracleDevDays2025

[DotJs2025] Node.js Will Use All the Memory Available, and That’s OK!

In the pulsating heart of server-side JavaScript, where applications hum under relentless loads, a persistent myth endures: Node.js’s voracious appetite for RAM signals impending doom. Matteo Collina, co-founder and CTO at Platformatic, dismantled this notion at dotJS 2025, revealing how V8’s sophisticated heap stewardship—far from a liability—empowers resilient, high-throughput services. With over 15 years sculpting performant ecosystems, including Fastify’s lean framework and Pino’s swift logging, Matteo illuminated the elegance of embracing memory as a strategic asset, not an adversary. His revelation: judicious tuning transforms perceived excess into a catalyst for latency gains and stability, urging developers to recalibrate preconceptions for enterprise-grade robustness.

Matteo commenced with a ritual lament: weekly pleas from harried coders convinced their apps hemorrhage resources, only to confess manual terminations at arbitrary thresholds—no crashes, merely preempted panics. This vignette unveiled the crux: Node’s default 1.4GB cap (64-bit) isn’t a leak’s harbinger but a deliberate throttle, safeguarding against unchecked sprawl. True leaks—orphaned closures, eternal event emitters—defy GC’s mercy, accruing via retain cycles. Yet, most “leaks” masquerade as legitimate growth: caches bloating under traffic, buffers queuing async floods. Matteo advocated profiling primacy: Chrome DevTools’ heap snapshots, clinic.js’s flame charts—tools unmasking culprits sans conjecture.
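
A first-pass probe needs no tooling at all; a minimal sketch using Node's built-in process.memoryUsage() to tell steady leak-like growth from ordinary sawtooth churn:

```js
// Log heap usage every 10 s: growth that never plateaus suggests a true leak;
// sawtooth growth that GC reclaims is ordinary V8 behavior.
setInterval(() => {
  const { heapUsed, heapTotal } = process.memoryUsage();
  console.log(`heap: ${(heapUsed / 1e6).toFixed(1)} MB used / ${(heapTotal / 1e6).toFixed(1)} MB total`);
}, 10_000);
```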

Delving into V8’s bowels, Matteo traced the Orinoco collector’s cadence: minor sweeps scavenging new-space detritus, majors consolidating old-space survivors. Latency lurks in these pauses; unchecked heaps amplify them, stalling event loops. His panacea: hoist the ceiling via --max-old-space-size=4096, bartering RAM for elongated intervals between majors. Benchmarks corroborated: a 4GB tweak on a Fastify benchmark slashed P99 latency by 8-10%, throughput surging analogously—thinner GC curves yielding smoother sails. This alchemy, Matteo posited, flips economics: memory’s abundance (cloud’s elastic reservoirs) trumps compute’s scarcity, especially as SSDs eclipse HDDs in I/O velocity.
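
The tuning itself is a one-liner; the 4096 MB value mirrors the talk's example, and the NODE_OPTIONS form is a container-friendly equivalent:

```shell
node --max-old-space-size=4096 server.js
# or, without changing the start command:
NODE_OPTIONS="--max-old-space-size=4096" node server.js
```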

Enterprise vignettes abounded. Platformatic’s observability suite, Pino’s zero-allocation streams—testaments to lean design—thrive sans austerity. Matteo cautioned: leaks persist, demanding vigilance—nullify globals, prune listeners, wield weak maps for caches. Yet, fear not the fullness; it’s V8’s vote of confidence in your workload’s vitality. As Kubernetes autoscalers and monitoring recipes (his forthcoming tome’s bounty) democratize, Node’s memory ethos evolves from taboo to triumph.

Demystifying Heaps and Collectors

Matteo dissected V8’s realms: new-space for ephemeral allocations, old-space for tenured stalwarts—Orinoco’s incremental majors mitigating stalls. Defaults constrain; elevations liberate, as 2025’s guides affirm: monitor via --inspect, profile with heapdump.js, tuning for 10% latency dividends sans leaks.

Trading Bytes for Bandwidth

Empirical edges: Fastify’s trials evince heap hikes yielding throughput boons, GC pauses pruned. Platformatic’s ethos—frictionless backends—embodies this: Pino’s streams, Fastify’s routers, all memory-savvy. Matteo’s gift: enterprise blueprints, from K8s scaling to on-prem Next.js, in his 296-page manifesto.

Links:

[DevoxxBE2024] Performance-Oriented Spring Data JPA & Hibernate by Maciej Walkowiak

At Devoxx Belgium 2024, Maciej Walkowiak delivered a compelling session on optimizing Spring Data JPA and Hibernate for performance, a critical topic given Hibernate’s ubiquity and polarizing reputation in Java development. With a focus on practical solutions, Maciej shared insights from his extensive consulting experience, addressing common performance pitfalls such as poor connection management, excessive queries, and the notorious N+1 problem. Through live demos and code puzzles, he demonstrated how to configure Hibernate and Spring Data JPA effectively, ensuring applications remain responsive and scalable. His talk emphasized proactive performance tuning during development to avoid production bottlenecks.

Why Applications Slow Down

Maciej opened by debunking myths about why applications lag, dismissing outdated notions that Java or databases are inherently slow. Instead, he pinpointed the root cause: misuse of technologies like Hibernate. Common issues include poor database connection management, which can halt applications, and issuing excessive or slow queries due to improper JPA mappings or over-fetching data. Maciej stressed the importance of monitoring tools like DataDog APM, which revealed thousands of queries in a single HTTP request in one of his projects, taking over 7 seconds. He urged developers to avoid guessing and use tracing tools or SQL logging to identify issues early, ideally during testing with tools like Digma’s IntelliJ plugin.

Optimizing Database Connection Management

Effective connection management is crucial for performance. Maciej explained that establishing database connections is costly due to network latency and authentication overhead, especially in PostgreSQL, where each connection spawns a new OS process. Connection pools, standardized in Spring Boot, mitigate this by creating a fixed number of connections (default: 10) at startup. However, developers must ensure connections are released promptly to avoid exhaustion. Using FlexyPool and Spring Boot Data Source Decorator, Maciej demonstrated logging connection acquisition and release times. In one demo, a transactional method unnecessarily held a connection for 273 milliseconds due to an external HTTP call within the transaction. Disabling spring.jpa.open-in-view reduced this to 61 milliseconds, freeing the connection after the transaction completed.
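
In configuration terms, the settings discussed above look like this; values are Spring Boot defaults or the talk's suggestion:

```properties
# application.properties
spring.jpa.open-in-view=false                  # release the connection when the transaction ends
spring.datasource.hikari.maximum-pool-size=10  # HikariCP's default pool size, made explicit
```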

Transaction Management for Efficiency

Maciej highlighted the pitfalls of default transaction settings and nested transactions. By default, Spring Boot’s auto-commit mode triggers commits after each database interaction, but disabling it (spring.datasource.hikari.auto-commit=false) delays connection acquisition until the first database interaction, reducing connection hold times. For complex workflows, he showcased the TransactionTemplate for programmatic transaction management, allowing developers to define transaction boundaries within a method without creating artificial service layers. This approach avoids issues with @Transactional(propagation = Propagation.REQUIRES_NEW), which can occupy multiple connections unnecessarily, as seen in a demo where nested transactions doubled connection usage, risking pool exhaustion.
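
A hedged sketch of the TransactionTemplate pattern he showed; the service, request, and helper names are illustrative, not taken from the demo:

```java
import org.springframework.stereotype.Service;
import org.springframework.transaction.PlatformTransactionManager;
import org.springframework.transaction.support.TransactionTemplate;

@Service
class TransferService {

    private final TransactionTemplate tx;

    TransferService(PlatformTransactionManager transactionManager) {
        this.tx = new TransactionTemplate(transactionManager);
    }

    void registerTransfer(TransferRequest request) {
        // The slow external call runs before the transaction: no connection is held here.
        ExchangeRate rate = fetchExchangeRate(request);

        // Only this block acquires a connection, and it is released when the block exits.
        tx.executeWithoutResult(status -> saveTransfer(request, rate));
    }

    // Hypothetical collaborators, stubbed for completeness.
    private ExchangeRate fetchExchangeRate(TransferRequest r) { return new ExchangeRate(); }
    private void saveTransfer(TransferRequest r, ExchangeRate rate) { /* repository.save(...) */ }
}

record TransferRequest() {}
record ExchangeRate() {}
```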

Solving the N+1 Problem and Over-Fetching

The N+1 query problem, a common Hibernate performance killer, occurs when lazy-loaded relationships trigger additional queries per entity. In a banking application demo, Maciej showed a use case where fetching bank transfers by sender ID resulted in multiple queries due to eager fetching of related accounts. By switching @ManyToOne mappings to FetchType.LAZY and using explicit JOIN FETCH in custom JPQL queries, he reduced queries to a single, efficient one. Additionally, he addressed over-fetching by using getReferenceById() instead of findById(), avoiding unnecessary queries when only entity references are needed, and introduced the @DynamicUpdate annotation to update only changed fields, optimizing updates for large tables.
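
A sketch of the mapping and query shapes described above; entity and repository names are illustrative:

```java
import jakarta.persistence.*;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Query;
import org.springframework.data.repository.query.Param;
import java.util.List;

@Entity
class Account {
    @Id @GeneratedValue Long id;
}

@Entity
class BankTransfer {
    @Id @GeneratedValue Long id;

    @ManyToOne(fetch = FetchType.LAZY) // @ManyToOne is EAGER by default: the usual N+1 trigger
    Account sender;
}

interface BankTransferRepository extends JpaRepository<BankTransfer, Long> {

    // JOIN FETCH loads transfers and their senders in one SQL statement: no per-row queries.
    @Query("select t from BankTransfer t join fetch t.sender s where s.id = :senderId")
    List<BankTransfer> findAllBySenderId(@Param("senderId") Long senderId);
}

// When only a reference is needed (e.g., assigning a foreign key), skip the SELECT entirely:
// transfer.sender = accountRepository.getReferenceById(senderId);
```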

Projections and Tools for Long-Term Performance

For read-heavy operations, Maciej advocated using projections to fetch only necessary data, avoiding the overhead of full entity loading. Spring Data JPA supports projections via records or interfaces, automatically generating queries based on method names or custom JPQL. Dynamic projections further simplify repositories by allowing runtime specification of return types. To maintain performance, he recommended tools like Hypersistence Optimizer (a commercial tool by Vlad Mihalcea) and QuickPerf (an open-source library, though unmaintained) to enforce query expectations in tests. These tools help prevent regressions, ensuring optimizations persist despite team changes or project evolution.
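
A minimal sketch of both projection styles, assuming the BankTransfer entity from the previous sketch has an amount column:

```java
import org.springframework.data.jpa.repository.JpaRepository;
import java.math.BigDecimal;
import java.util.List;

// Record projection: Spring Data generates a query selecting only these columns.
record TransferSummary(Long id, BigDecimal amount) {}

interface BankTransferRepository extends JpaRepository<BankTransfer, Long> {

    List<TransferSummary> findBySenderId(Long senderId);      // fixed projection

    <T> List<T> findBySenderId(Long senderId, Class<T> type); // dynamic projection, chosen at call site
}
```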

Links:

[DevoxxBE2024] Project Panama in Action: Building a File System by David Vlijmincx

At Devoxx Belgium 2024, David Vlijmincx delivered an engaging session on Project Panama, demonstrating its power by building a custom file system in Java. This practical, hands-on talk showcased how Project Panama simplifies integrating C libraries into Java applications, replacing the cumbersome JNI with a more developer-friendly approach. By leveraging Fuse, virtual threads, and Panama’s memory management capabilities, David walked attendees through creating a functional file system, highlighting real-world applications and performance benefits. His talk emphasized the ease of integrating C libraries and the potential to build high-performance, innovative solutions.

Why Project Panama Matters

David began by addressing the challenges of JNI, which many developers find frustrating due to its complexity. Project Panama, part of OpenJDK, offers a modern alternative for interoperating with native C libraries. With a vast ecosystem of specialized C libraries—such as io_uring for asynchronous file operations or libraries for AI and keyboard communication—Panama enables Java developers to access functionality unavailable in pure Java. David demonstrated this by comparing file reading performance: using io_uring with Panama, he read files faster than Java’s standard APIs (e.g., BufferedReader or Channels) in just two nights of work, showcasing Panama’s potential for performance-critical applications.

Building a File System with Fuse

The core of David’s demo was integrating the Fuse (Filesystem in Userspace) library to create a custom file system. Fuse acts as a middle layer, intercepting commands like ls from the terminal and passing them to a Java application via Panama. David explained how Fuse provides a C struct that Java developers can populate with pointers to Java methods, enabling seamless communication between C and Java. This struct, filled with method pointers, is mounted to a directory (e.g., ~/test), allowing the Java application to handle file system operations transparently to the user, who sees only the terminal output.

Memory Management with Arenas

A key component of Panama is its memory management via arenas, which David used to allocate memory for passing strings to Fuse. He demonstrated using Arena.ofShared(), which allows memory sharing across threads and explicit lifetime control via try-with-resources. Other arena types, like Arena.ofConfined() (single-threaded) or Arena.global() (unbounded lifetime), were mentioned for context. David allocated a memory segment to store pointers to a string array (e.g., ["-f", "-d", "~/test"]) and used Arena.allocateFrom() to create C-compatible strings. This ensured safe memory handling when interacting with Fuse, preventing leaks and simplifying resource management.
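
A condensed sketch of that argv setup, assuming the finalized Java 22+ FFM API; the mount path is illustrative:

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

class ArgvDemo {
    public static void main(String[] cliArgs) {
        try (Arena arena = Arena.ofShared()) {
            String[] args = {"-f", "-d", "/home/user/test"};
            MemorySegment argv = arena.allocate(ValueLayout.ADDRESS, args.length);
            for (int i = 0; i < args.length; i++) {
                // allocateFrom copies the string into native memory as a NUL-terminated C string
                argv.setAtIndex(ValueLayout.ADDRESS, i, arena.allocateFrom(args[i]));
            }
            // ... argv is now ready to hand to a downcall such as fuse_main_real ...
        } // arena closed: every segment allocated from it is freed deterministically
    }
}
```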

Downcalls and Upcalls: Bridging Java and C

David detailed the process of making downcalls (Java to C) and upcalls (C to Java). For downcalls, he created a function descriptor mirroring the C method’s signature (e.g., fuse_main_real, returning an int and taking parameters like string arrays and structs). Using Linker.nativeLinker(), he generated a platform-specific linker to invoke the C method. For upcalls, he recreated Fuse’s struct in Java using MemoryLayout.structLayout, populating it with pointers to Java methods like getattr. Tools like JExtract simplified this by generating bindings automatically, reducing boilerplate code. David showed how JExtract creates Java classes from C headers, though it requires an additional abstraction layer for user-friendly APIs.
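
A condensed sketch of the downcall wiring, again assuming the Java 22+ FFM API; the library path is illustrative and the argv/operations segments are stubbed out:

```java
import java.lang.foreign.*;
import java.lang.invoke.MethodHandle;

class FuseDowncall {
    public static void main(String[] args) throws Throwable {
        try (Arena arena = Arena.ofShared()) {
            Linker linker = Linker.nativeLinker();
            // Library name is illustrative; the real path depends on the platform.
            SymbolLookup fuse = SymbolLookup.libraryLookup("libfuse3.so.3", arena);

            // C signature: int fuse_main_real(int argc, char *argv[],
            //     const struct fuse_operations *op, size_t op_size, void *private_data)
            MethodHandle fuseMain = linker.downcallHandle(
                    fuse.find("fuse_main_real").orElseThrow(),
                    FunctionDescriptor.of(ValueLayout.JAVA_INT,
                            ValueLayout.JAVA_INT,   // argc
                            ValueLayout.ADDRESS,    // argv
                            ValueLayout.ADDRESS,    // struct fuse_operations*
                            ValueLayout.JAVA_LONG,  // op_size (size_t on a 64-bit platform)
                            ValueLayout.ADDRESS));  // private_data

            // argv and ops would be built as in the arena sketch above; stubbed here.
            MemorySegment argv = MemorySegment.NULL;
            MemorySegment ops = MemorySegment.NULL;
            int rc = (int) fuseMain.invokeExact(0, argv, ops, 0L, MemorySegment.NULL);
            System.out.println("fuse_main_real returned " + rc);
        }
    }
}
```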

Implementing File System Operations

David implemented two file system operations: reading files and creating directories. For reading, he extracted the file path from a memory segment using MemorySegment.getString(), checked if it was a valid file, and copied file contents into a buffer with MemorySegment.reinterpret() to handle size constraints. For directory creation, he added paths to a map, demonstrating simplicity. Running the application mounted the file system to ~/test, where commands like mkdir and echo worked seamlessly, with Fuse calling Java methods via upcalls. David unmounted the file system, showing its clean integration. Performance tips included reusing method handles and memory segments to avoid overhead, emphasizing careful memory management.

Links:

[DotJs2024] Thinking About Your Code: Push vs Pull

Navigating the currents of performant code demands a lens attuned to flow dynamics, where producers and consumers dance in tandem—or discord. Ben Lesh, a veteran of high-stakes web apps from Netflix’s infrastructure dashboards to RxJS stewardship, shared this paradigm at dotJS 2024. With roots in rendering millions of devices across North America’s bandwidth, Lesh distilled decades of collaboration with elite engineers into a quartet of concepts: producers, consumers, push, pull. These primitives illuminate code’s underbelly, spotlighting concurrency pitfalls, backpressure woes, and optimal primitives for JavaScript’s asynchronous tapestry.

Lesh’s entrée was a bespoke live demo: enlisting audience volunteer Jessica Sachs to juggle M&Ms, embodying production-consumption. Pull—Jessica grabbing at will—affords control but falters asynchronously; absent timely M&Ms, hands empty. Push—Lesh feeding sequentially—frees producers for factories but risks overload, manifesting backpressure as frantic consumption. Code mirrors this: a getValue() invocation pulls synchronously, assigning to a consumer like console.log; for loops iterate pulls from arrays. Yet, actors abound: functions produce, variables consume; callbacks push events, observables compose them.

JavaScript’s arsenal spans quadrants. Pure pull: functions and iterables yield values on demand. Push: callbacks for one-offs, observables for streams—RxJS’s forte, enabling operators like map or mergeMap for event orchestration. Pull-then-push hybrids: promises (a function call returning a deferred push) and async iterables (yielding promise-wrapped results), ideal for paced delivery via for await...of, mitigating backpressure in slow consumers. Push-then-pull inverts: signals—Ember computeds, Solid signals, Angular signals—notify changes, deferring reads until render. Lesh previewed TC39 signals: subscribe for pushes, get for pulls, birthing dependency graphs that lazy-compute, tracking granular ties for efficient diffing.
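
To make the push-then-pull quadrant concrete, here is a hand-rolled miniature in the spirit of signals; a toy, not the TC39 API:

```js
// A toy signal: setters push a notification, readers pull the value lazily.
function createSignal(initialValue) {
  let value = initialValue;
  const subscribers = new Set();
  return {
    get: () => value,                            // pull: read on demand, e.g. during render
    set: (next) => {                             // push: announce the change...
      value = next;
      subscribers.forEach((notify) => notify()); // ...but compute nothing yet
    },
    subscribe: (notify) => subscribers.add(notify),
  };
}

const count = createSignal(0);
count.subscribe(() => console.log("dirty; re-read at render:", count.get()));
count.set(1); // the push happens here; the value is pulled inside the subscriber
```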

This framework unveils pathologies: thread lockups from unchecked pushes, concurrency clashes in nested callbacks. Lesh advocated scanning code for actors—spotting producers hidden in APIs—and matching primitives to intent. Pull suits sync simplicity; push excels in async firehoses; hybrids temper throughput; signals orchestrate reactive UIs. As frameworks like React lean on signals for controlled reads pre-render, developers gain foresight into bottlenecks, fostering resilient, scalable architectures.

Decoding Flow Primitives in JavaScript

Lesh partitioned primitives into a revealing matrix: pull for immediacy (functions pulling values), push for autonomy (observables dispatching relentlessly). Hybrids like promises bridge, returning handles for eventual pushes; async iterables extend, pacing via awaits. Signals, the push-pull hybrid, notify sans immediate computation—perfect for UI graphs where effects propagate selectively, as in Solid’s fine-grained reactivity or Angular’s move away from zone-based change detection.

Navigating Backpressure and Optimization

Backpressure—producers overwhelming consumers—Lesh dramatized via M&M deluge, solvable by hybrids throttling intake. Signals mitigate via lazy evals: update signals, compute only on get, weaving dependency webs that prune cascades. Lesh urged: interrogate code’s flows—who pushes/pulls?—to preempt issues, leveraging RxJS for composition, signals for reactivity, ensuring apps hum under load.

Links:

[PHPForumParis2022] FrankenPHP: Diving into PHP’s Interpreter, Virtual Machines, and More – Kévin Dunglas

Kévin Dunglas, a seasoned developer at Les-Tilleuls.coop and creator of API Platform, presented an innovative exploration of FrankenPHP at PHP Forum Paris 2022. Blending PHP with Go, Kévin introduced a groundbreaking server solution that pushes PHP’s boundaries. His talk delved into the technical intricacies of integrating Go’s threading model with PHP’s interpreter, offering a glimpse into a future where PHP applications achieve unprecedented performance and flexibility.

Introducing FrankenPHP

Kévin opened with the origins of FrankenPHP, a project born from his passion for both PHP and Go. Inspired by Les-Tilleuls’ developer Loris Sorio, who designed its logo, FrankenPHP aims to combine PHP’s ease of use with Go’s performance capabilities. Kévin explained how it leverages Go’s threading to overcome PHP-FPM’s limitations, enabling features like concurrent request handling. This fusion, he argued, unlocks new possibilities for PHP applications, particularly in high-performance scenarios.

Overcoming Technical Challenges

Delving into the technical core, Kévin described the complexities of integrating PHP’s Zend Thread Safe (ZTS) mode with Go’s threading model. He highlighted challenges like signal conflicts and the lack of OPcache support, which required custom modifications to PHP’s source code. By isolating PHP processes within Go threads, Kévin’s team achieved stable communication, though he noted the solution remains experimental. His transparency about these hurdles provided valuable insights for developers exploring similar integrations.

Performance and Future Directions

Kévin showcased FrankenPHP’s performance potential, demonstrating how enabling OPcache by modifying PHP’s SAPI list significantly reduced compilation overhead. He outlined future goals, including support for Laravel Octane and Symfony’s CLI, while acknowledging Windows compatibility challenges. Kévin’s call for community contributions to refine FrankenPHP underscored its open-source ethos, inviting developers to explore its code and report issues to enhance its stability.

Community Engagement and Collaboration

Concluding, Kévin emphasized the collaborative spirit driving FrankenPHP’s development. He encouraged developers to contribute via GitHub, highlighting the project’s experimental nature and potential for growth. By sharing Les-Tilleuls’ vision, Kévin inspired attendees to experiment with FrankenPHP, fostering a community-driven effort to redefine PHP’s role in modern web development.

Links:

[KotlinConf2018] Performant Multiplatform Serialization in Kotlin: Eric Cochran’s Approach to Code Sharing

Lecturer

Eric Cochran is an Android developer at Pinterest, focusing on performance across the app stack. He contributes to open-source projects, notably the Moshi JSON library. Relevant links: Pinterest Engineering Blog (publications); LinkedIn Profile (professional page).

Abstract

This article analyzes Eric Cochran’s exploration of Kotlin Serialization for multiplatform projects, emphasizing its role in enhancing code reuse across platforms. Set in the context of Pinterest’s performance-driven Android development, it examines methodologies for integrating serialization with data formats and frameworks. The analysis highlights innovations in type safety and performance, with implications for cross-platform scalability and library evolution.

Introduction and Context

Eric Cochran presented at KotlinConf 2018, focusing on Kotlin Serialization’s potential to unify code in multiplatform environments. As an Android developer at Pinterest, Cochran’s work on serialization formats like Moshi informed his advocacy for Kotlin’s experimental library. The context is the growing need for shared logic in apps targeting JVM, JS, and Native, where serialization ensures seamless data handling across diverse runtimes.

Methodological Approaches to Serialization

Cochran outlined Kotlin Serialization’s setup: Annotate data classes with @Serializable to generate compile-time adapters, supporting JSON, Protobuf, and CBOR. Integration with frameworks like OkHttp or Ktor involves custom serializers for complex types. He demonstrated parsing dynamic JSON structures, emphasizing compile-time safety over Moshi’s runtime reflection. Performance optimizations included minimizing allocations and leveraging inline classes. Cochran compared Moshi’s factory-based API, noting its JVM-centric limitations versus Kotlin Serialization’s multiplatform readiness.
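A minimal sketch of that setup using the current kotlinx.serialization API; class and field names are illustrative, and the API shown postdates the 2018 talk:

```kotlin
import kotlinx.serialization.Serializable
import kotlinx.serialization.encodeToString
import kotlinx.serialization.decodeFromString
import kotlinx.serialization.json.Json

@Serializable // the compiler plugin generates the serializer at compile time: no reflection
data class Pin(val id: Long, val title: String)

fun main() {
    val json = Json.encodeToString(Pin(42, "Kotlin everywhere"))
    println(json) // {"id":42,"title":"Kotlin everywhere"}

    val pin = Json.decodeFromString<Pin>(json)
    println(pin)  // Pin(id=42, title=Kotlin everywhere)
}
```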

Analysis of Innovations and Features

Kotlin Serialization innovates with compile-time code generation, avoiding reflection’s overhead, unlike Moshi’s Java type reliance. It supports multiple formats, enhancing flexibility compared to JSON-centric libraries. Inline classes reduce boxing, boosting performance. Limitations include poor dynamic type handling and manual serializer implementation for custom cases. Compared to Moshi, it offers broader platform support but lacks mature metadata APIs.

Implications and Consequences

The library implies greater code sharing in multiplatform apps, reducing duplication and maintenance. Its performance focus suits high-throughput systems like Pinterest’s. Consequences include a shift toward compile-time solutions, though experimental status requires caution. Future integration with Okio’s multiplatform efforts could resolve reflection issues, broadening adoption.

Conclusion

Cochran’s insights position Kotlin Serialization as a cornerstone for multiplatform data handling, offering a performant, type-safe alternative that promises to reshape cross-platform development.

Links