Archive for the ‘en-US’ Category
[NodeCongress2024] Deep Dive into Undici: Architecture, Performance, and the Future of HTTP in Node.js
Lecturer: Matteo Collina
Matteo Collina is an internationally recognized expert in Node.js and open-source architecture, serving as the Co-Founder and CTO of Platformatic. He holds a Ph.D. in “Application Platforms for the Internet of Things”. A member of the Node.js Technical Steering Committee (TSC), he is a major contributor to the platform’s core, with a focus on streams, diagnostics, and the HTTP stack. He is the original author of the highly successful, high-performance web framework Fastify and the ultra-fast JSON logger Pino. His open-source modules are downloaded billions of times annually.
- Institutional Profile: Matteo Collina
- Publications: Matteo Collina’s talks, articles, workshops, certificates – GitNation
Abstract
This article presents a technical analysis of Undici, the high-performance, standards-compliant HTTP/1.1 client that serves as the foundation for the native fetch() API in Node.js. It explains the motivation for Undici’s creation—addressing critical performance and protocol deficiencies in the legacy Node.js stack. The article details the core architectural components, particularly the Client and Dispatcher abstractions, and explains how Undici achieves superior efficiency through advanced connection management and HTTP/1.1 pipelining. The final analysis explores the methodological implications of Undici’s modularity, including enabling zero-overhead internal testing and powering highly efficient modular monolith and microservice runtimes.
Context: Limitations of the Legacy Node.js HTTP Stack
The legacy Node.js HTTP client suffered from several long-standing limitations, primarily in performance and compliance with modern standards. Specifically, it lacked proper support for HTTP/1.1 pipelining—the ability to send multiple requests sequentially over a single connection without waiting for the first response. Furthermore, its connection pool management was inefficient, often failing to enforce proper limits, leading to potential resource exhaustion and performance bottlenecks. Undici was developed to resolve these architectural deficiencies, becoming the native engine for fetch() within Node.js core.
Architecture and Methodology of Undici
Undici’s design is centered around optimizing connection usage and abstracting the request lifecycle:
- The Client and Connection Pools: The core component is the Client, which is scoped to a single origin (protocol, hostname, and port). The Client manages a pool of TCP connections and is responsible for implementing the efficiency of the HTTP protocol.
- Pipelining for Performance: Undici explicitly implements HTTP/1.1 pipelining. This methodology permits the efficient use of the network and is essential for maximum HTTP/1.1 performance, particularly when connecting to modern servers that support the feature.
- The Dispatcher Abstraction: Undici utilizes a pluggable Dispatcher interface. This abstraction governs the process of taking a request, managing connection logic, and writing the request to a socket. Key Dispatcher implementations include the standard Client (for a single origin) and the Agent (for multiple origins).
- Connection Management: The pooling mechanism employs a strategy to retire connections gracefully to allow DNS changes and resource rotation, contrasting with legacy systems that often held connections indefinitely. A minimal usage sketch follows this list.
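To make the Client and pipelining ideas concrete, here is a minimal sketch using undici’s public API; the origin, paths, and pipelining factor are illustrative assumptions rather than code from the talk.

```ts
// Minimal sketch: an undici Client scoped to one origin, with HTTP/1.1
// pipelining enabled so several requests can share a single connection.
import { Client } from "undici";

const client = new Client("http://localhost:3000", {
  pipelining: 4, // illustrative value: up to 4 in-flight requests per connection
});

const paths = ["/users", "/orders", "/inventory", "/health"];

// Fire the requests together; with pipelining they reuse the same socket
// instead of each waiting for a fresh connection.
const responses = await Promise.all(
  paths.map((path) => client.request({ method: "GET", path })),
);

for (const { statusCode, body } of responses) {
  console.log(statusCode, await body.text());
}

await client.close();
```

When requests span multiple origins, an Agent can stand in for the Client, and setGlobalDispatcher() lets Node’s built-in fetch() reuse the same configuration.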
Consequences and Architectural Innovations
Undici’s modular and abstracted architecture has led to significant innovations beyond core HTTP performance:
- In-Process Request Testing: The Dispatcher model allows for the implementation of a MockClient (and companion tools such as the light-my-request module), which completely bypasses the network stack. This permits the injection of HTTP requests directly into a running Node.js server within the same process, enabling zero-overhead, high-speed unit and integration testing without opening any actual sockets (see the sketch after this list).
- Internal Mesh Networking: The architecture enables a unique pattern for running multiple microservices within a single process. Using a custom dispatcher (fastify-undici-dispatcher), internal HTTP requests can be routed directly to other services (e.g., Fastify instances) running in the same process via an in-memory mesh network, completely bypassing the network layer for inter-service communication. This methodology, employed in the Platformatic runtime, allows developers to transition from a modular monolith to a microservice architecture with minimal code changes, retaining maximum performance for inter-service calls.
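As an illustration of the mocked-dispatcher idea, the sketch below uses undici’s MockAgent (a close relative of the MockClient mentioned above) to intercept requests before they ever reach a socket; the origin, path, and payload are invented for the example.

```ts
// Minimal sketch: route undici/fetch traffic through a MockAgent so tests
// never open a real socket.
import { MockAgent, setGlobalDispatcher, fetch } from "undici";

const mockAgent = new MockAgent();
mockAgent.disableNetConnect(); // any un-mocked request now fails loudly
setGlobalDispatcher(mockAgent);

// Declare the canned response for one origin + path.
mockAgent
  .get("http://orders.internal")
  .intercept({ path: "/orders/42", method: "GET" })
  .reply(200, { id: 42, status: "shipped" });

const res = await fetch("http://orders.internal/orders/42");
console.log(res.status, await res.json()); // 200 { id: 42, status: 'shipped' }
```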
Links
- Lecture Video: Deep Dive into Undici – Matteo Collina, Node Congress 2024
- Lecturer’s X/Twitter: https://twitter.com/matteocollina
- Organization: https://platformatic.dev/
Hashtags: #Undici #NodeJS #HTTPClient #Fastify #Microservices #PerformanceEngineering #Platformatic
[DotJs2024] Thinking About Your Code: Push vs Pull
Navigating the currents of performant code demands a lens attuned to flow dynamics, where producers and consumers dance in tandem—or discord. Ben Lesh, a veteran of high-stakes web apps from Netflix’s infrastructure dashboards to RxJS stewardship, shared this paradigm at dotJS 2024. With roots in rendering millions of devices across North America’s bandwidth, Lesh distilled decades of collaboration with elite engineers into a quartet of concepts: producers, consumers, push, pull. These primitives illuminate code’s underbelly, spotlighting concurrency pitfalls, backpressure woes, and optimal primitives for JavaScript’s asynchronous tapestry.
Lesh’s entrée was a bespoke live demo: enlisting audience volunteer Jessica Sachs to juggle M&Ms, embodying production-consumption. Pull—Jessica grabbing at will—affords control but falters asynchronously; absent timely M&Ms, hands empty. Push—Lesh feeding sequentially—frees producers for factories but risks overload, manifesting backpressure as frantic consumption. Code mirrors this: a getValue() invocation pulls synchronously, assigning to a consumer like console.log; for loops iterate pulls from arrays. Yet, actors abound: functions produce, variables consume; callbacks push events, observables compose them.
JavaScript’s arsenal spans quadrants. Pure pull: functions and intervals yield eager values. Push: callbacks for one-offs, observables for streams—RxJS’s forte, enabling operators like map or mergeMap for event orchestration. Pull-then-push hybrids: promises (function returning deferred push) and async iterables (yielding promise-wrapped results), ideal for paced delivery via for await...of, mitigating backpressure in slow consumers. Push-then-pull inverts: signals—Ember computeds, Solid observables, Angular runes—notify changes, deferring reads until render. Lesh previewed TC39 signals: subscribe for pushes, get for pulls, birthing dependency graphs that lazy-compute, tracking granular ties for efficient diffing.
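The quadrants are easier to see side by side. The sketch below is my own condensation of the idea rather than Lesh’s slide code; all names (getValue, onTick, fetchName, numbers) are invented for illustration.

```ts
// Pull: the consumer asks, the producer answers right now.
const getValue = (): number => 42;
console.log(getValue());

// Push: the producer decides when the consumer runs (callback / observable style).
const onTick = (consumer: (t: number) => void) =>
  setInterval(() => consumer(Date.now()), 1000);
onTick((t) => console.log("tick", t));

// Pull then push: ask once, the value is pushed later (a promise).
const fetchName = (): Promise<string> => Promise.resolve("Ada");
fetchName().then(console.log);

// Pull then push, repeatedly: an async iterable lets the consumer pace delivery,
// which is what tames backpressure with slow consumers.
async function* numbers(): AsyncGenerator<number> {
  for (let i = 0; i < 3; i++) yield i;
}
for await (const n of numbers()) console.log(n);
```

The push-then-pull quadrant (signals) is sketched separately below, under backpressure and optimization.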
This framework unveils pathologies: thread lockups from unchecked pushes, concurrency clashes in nested callbacks. Lesh advocated scanning code for actors—spotting producers hidden in APIs—and matching primitives to intent. Pull suits sync simplicity; push excels in async firehoses; hybrids temper throughput; signals orchestrate reactive UIs. As frameworks like React lean on signals for controlled reads pre-render, developers gain foresight into bottlenecks, fostering resilient, scalable architectures.
Decoding Flow Primitives in JavaScript
Lesh partitioned primitives into a revealing matrix: pull for immediacy (functions pulling values), push for autonomy (observables dispatching relentlessly). Hybrids like promises bridge, returning handles for eventual pushes; async iterables extend, pacing via awaits. Signals, the push-pull hybrid, notify sans immediate computation—perfect for UI graphs where effects propagate selectively, as in Solid’s fine-grained reactivity or Angular’s zoned eschewal.
Navigating Backpressure and Optimization
Backpressure—producers overwhelming consumers—was dramatized by Lesh with an M&M deluge, and is solvable by hybrids that throttle intake. Signals mitigate it via lazy evaluation: update signals freely, compute only on get, weaving dependency webs that prune cascades. Lesh urged developers to interrogate their code’s flows—who pushes, who pulls?—to preempt issues, leveraging RxJS for composition and signals for reactivity, ensuring apps hum under load.
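To show what “update signals, compute only on get” means in code, here is a toy push-then-pull signal; it is a deliberately simplified sketch, not the TC39 proposal’s API or any framework’s implementation.

```ts
// Toy signal: writes push a "dirty" notification, reads pull the value lazily.
function createSignal<T>(initial: T) {
  let value = initial;
  const listeners = new Set<() => void>();
  return {
    get: () => value,
    set: (next: T) => {
      value = next;
      listeners.forEach((notify) => notify()); // push: "something changed"
    },
    subscribe: (notify: () => void) => {
      listeners.add(notify);
    },
  };
}

function computed<T>(compute: () => T, deps: { subscribe(fn: () => void): void }[]) {
  let dirty = true;
  let cached!: T;
  deps.forEach((dep) => dep.subscribe(() => (dirty = true)));
  return {
    get: () => {
      if (dirty) {
        cached = compute(); // pull: recompute only when someone actually reads
        dirty = false;
      }
      return cached;
    },
  };
}

const price = createSignal(10);
const quantity = createSignal(2);
const total = computed(() => price.get() * quantity.get(), [price, quantity]);

price.set(12);
price.set(15);            // two writes, zero recomputation so far
console.log(total.get()); // 30 — computed once, here
```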
Links:
[SpringIO2024] Text-to-SQL: Chat with a Database Using Generative AI by Victor Martin & Corrado De Bari @ Spring I/O 2024
At Spring I/O 2024 in Barcelona, Victor Martin, a product manager for Oracle Database, delivered a compelling session on Text-to-SQL, a transformative approach to querying databases using natural language, powered by generative AI. Stepping in for his colleague Corrado De Bari, who was unable to attend, Victor explored how Large Language Models (LLMs) and Oracle’s innovative tools, including Spring AI and Select AI, enable business users with no SQL expertise to interact seamlessly with databases. The talk highlighted practical implementations, security considerations, and emerging technologies like AI Vector Search, offering a glimpse into the future of database interaction.
The Promise of Text-to-SQL
Text-to-SQL leverages LLMs to translate natural language queries into executable SQL, democratizing data access for non-technical users. Victor began by posing a challenge: how long would it take to build a REST endpoint for a business user to query a database using plain text? Traditionally, this task required manual SQL construction, schema validation, and error handling. With modern frameworks like Spring Boot and Oracle’s Select AI, this process is streamlined. Select AI, integrated into Oracle Database 19c and enhanced in 23ai, supports features like RUN_SQL to execute generated queries, NARRATE to return results as human-readable text, and EXPLAIN_SQL to detail query reasoning. Victor emphasized that these tools reduce development time, enabling rapid deployment of user-friendly database interfaces.
Configuring Oracle Database for Text-to-SQL
Implementing Text-to-SQL requires minimal configuration within Oracle Database. Victor outlined the steps: first, set up an Access Control List (ACL) to allow external LLM calls, specifying the host and port. Next, create credentials for the LLM service (e.g., Oracle Cloud Infrastructure Generative AI, OpenAI, or Azure OpenAI) using the DBMS_CLOUD_AI package. Finally, define a profile linking the schema, tables, and chosen LLM. This profile is selected per session to ensure queries use the correct context. Victor demonstrated this with a Spring Boot application, where the profile is set before invoking Select AI. The simplicity of this setup, combined with Spring AI’s abstraction, makes it accessible even for developers new to AI-driven database interactions.
Enhancing Queries with Schema Annotations
A key challenge in Text-to-SQL is ensuring LLMs interpret ambiguous schemas correctly. Victor highlighted that table and column names like “C1” or “Table1” can confuse models. To address this, Oracle Database supports annotations—comments on tables and columns that provide business context. For example, annotating a column as “process status” with possible values clarifies its purpose, aiding the LLM in generating accurate joins and filters. These annotations, which don’t affect production queries, are created collaboratively by DBAs and business stakeholders. Victor shared a real-world example from Oracle’s telecom applications, where annotated schemas improved query precision, enabling complex queries without manual intervention.
AI Vector Search: Querying Unstructured Data
Victor introduced AI Vector Search, a cutting-edge feature in Oracle Database 23ai, which extends Text-to-SQL to unstructured data. Unlike traditional SQL, which queries structured data, vector search encodes text, images, or audio into high-dimensional vectors representing semantic meaning. These vectors, stored as a new VECTOR data type, enable similarity-based queries. For instance, a job search query for “software engineer positions in New York” can combine structured filters (e.g., location) with vector-based matching of job descriptions and resumes. Victor explained how embedding models, deployed via Oracle’s DBMS_DATA_MINING package, generate these vectors, with metrics like cosine similarity determining relevance. This capability opens new use cases, from document search to personalized recommendations.
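To make the similarity-metric part tangible, here is a back-of-the-envelope sketch of ranking by cosine similarity; the job titles and embedding values are invented, and a real system would obtain vectors from an embedding model and store them in the database’s VECTOR column rather than in application memory.

```ts
// Cosine similarity between two embedding vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical pre-computed embeddings for job descriptions.
const jobs = [
  { title: "Software Engineer, New York", embedding: [0.9, 0.1, 0.3] },
  { title: "Pastry Chef, Paris", embedding: [0.1, 0.8, 0.2] },
];

// Hypothetical embedding of the query "software engineer positions in New York".
const queryEmbedding = [0.85, 0.15, 0.35];

const ranked = jobs
  .map((job) => ({ ...job, score: cosineSimilarity(queryEmbedding, job.embedding) }))
  .sort((a, b) => b.score - a.score);

console.log(ranked[0].title); // the semantically closest match
```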
Links:
[OxidizeConf2024] Continuous Compliance with Rust in Automotive Software
Introduction to Automotive Compliance
The automotive industry, with its intricate blend of mechanical and electronic systems, demands rigorous standards to ensure safety and reliability. Vignesh Radhakrishnan from Thoughtworks delivered an insightful presentation at OxidizeConf2024, exploring the concept of continuous compliance in automotive software development using Rust. He elucidated how the shift from mechanical to software-driven vehicles has amplified the need for robust compliance processes, particularly in adhering to standards like ISO 26262 and Automotive SPICE (ASPICE). These standards are pivotal in ensuring that automotive software meets stringent safety and quality requirements, safeguarding drivers and passengers alike.
Vignesh highlighted the transformation in the automotive landscape, where modern vehicles integrate complex software for features like adaptive headlights and reverse assist cameras. Unlike mechanical components with predictable failure patterns, software introduces variability that necessitates standardized compliance to maintain quality. The presentation underscored the challenges of traditional compliance methods, which are often manual, disconnected from development workflows, and conducted at the end of the development cycle, leading to inefficiencies and delayed feedback.
Continuous Compliance: A Paradigm Shift
Continuous compliance represents a transformative approach to integrating safety and quality assessments into the software development lifecycle. Vignesh emphasized that this practice involves embedding compliance checks within the development pipeline, allowing for immediate feedback on non-compliance issues. By maintaining documentation close to the code, such as requirements and test cases, developers can ensure traceability and accountability. This method not only streamlines the audit process but also reduces the mean-time-to-recovery when issues arise, enhancing overall efficiency.
The use of open-source tools like Sphinx, a Python documentation generator, was a focal point of Vignesh’s talk. Sphinx facilitates bidirectional traceability by linking requirements to code components, enabling automated generation of audit-ready documentation in HTML and PDF formats. Vignesh demonstrated a proof-of-concept telemetry project, showcasing how Rust’s cohesive toolchain, including Cargo and Clippy, integrates seamlessly with these tools to produce compliant software artifacts. This approach minimizes manual effort and ensures that compliance is maintained iteratively with every code commit.
Rust’s Role in Simplifying Compliance
Rust’s inherent features make it an ideal choice for automotive software development, particularly in achieving continuous compliance. Vignesh highlighted Rust’s robust toolchain, which includes tools like Cargo for building, testing, and formatting code. Unlike C or C++, where developers rely on disparate tools from multiple vendors, Rust offers a unified, developer-friendly environment. This cohesiveness simplifies the integration of compliance processes into continuous integration (CI) pipelines, as demonstrated in Vignesh’s example using CircleCI to automate compliance checks.
Moreover, Rust’s emphasis on safety and ownership models reduces common programming errors, aligning well with the stringent requirements of automotive standards. By leveraging Rust’s capabilities, developers can produce cleaner, more maintainable code that inherently supports compliance efforts. Vignesh’s example of generating traceability matrices and architectural diagrams using open-source tools like PlantUML further illustrated how Rust can enhance the compliance process, making it more accessible and cost-effective.
Practical Implementation and Benefits
In his demonstration, Vignesh showcased a practical implementation of continuous compliance using a telemetry project that streams data to AWS. By integrating Sphinx with Rust code, he illustrated how requirements, test cases, and architectural designs could be documented and linked automatically. This setup allows for real-time compliance assessments, ensuring that software remains audit-ready at all times. The use of open-source plugins and tools provides flexibility, enabling adaptation to various input sources like Jira, further streamlining the process.
The benefits of this approach are manifold. Continuous compliance fosters greater accountability within development teams, as non-compliance issues are identified early. It also enhances flexibility by allowing integration with existing project tools, reducing dependency on proprietary solutions. Vignesh cited the Ferrocene compiler as a real-world example, where similar open-source tools have been used to generate compliance artifacts, demonstrating the feasibility of this approach in large-scale projects.
Links:
[DevoxxUK2024] Game, Set, Match: Transforming Live Sports with AI-Driven Commentary by Mark Needham & Dunith Danushka
Mark Needham, from ClickHouse’s product team, and Dunith Danushka, a Senior Developer Advocate at Redpanda, presented an innovative experiment at DevoxxUK2024, showcasing an AI-driven co-pilot for live sports commentary. Inspired by the BBC’s live text commentary for sports like tennis and football, their solution automates repetitive summarization tasks, freeing human commentators to focus on nuanced insights. By integrating Redpanda for streaming, ClickHouse for analytics, and a large language model (LLM) for text generation, they demonstrate a scalable architecture for real-time commentary. Their talk details the technical blueprint, practical implementation, and broader applications, offering a compelling pattern for generative AI in streaming data contexts.
Real-Time Data Streaming with Redpanda
Dunith introduces Redpanda, a Kafka-compatible streaming platform written in C++ to maximize modern hardware efficiency. Unlike Kafka, Redpanda consolidates components like the broker, schema registry, and HTTP proxy into a single binary, simplifying deployment and management. Its web-based console and CLI (rpk) facilitate debugging and administration, such as creating topics and inspecting payloads. In their demo, Mark and Dunith simulate a tennis match by feeding JSON-formatted events into a Redpanda topic named “points.” These events, capturing match details like scores and players, are published at 20x speed using a Python script with the Twisted library. Redpanda’s ability to handle high-throughput streams—hundreds of thousands of messages per second—ensures robust real-time data ingestion, setting the stage for downstream processing.
Analytics with ClickHouse
Mark explains ClickHouse’s role as a column-oriented analytics database optimized for aggregation queries. Unlike row-oriented databases like PostgreSQL, ClickHouse stores columns contiguously, enabling rapid processing of operations like counts or averages. Its vectorized query execution processes column chunks in parallel, enhancing performance for analytics tasks. In the demo, events from Redpanda are ingested into ClickHouse via a Kafka engine table, which mirrors the “points” topic. A materialized view transforms incoming JSON data into a structured table, converting timestamps and storing match metadata. Mark also creates a “matches” table for historical context, demonstrating ClickHouse’s ability to ingest streaming data in real time without batch processing, a key feature for dynamic applications.
Generating Commentary with AI
The core innovation lies in generating human-like commentary using an LLM, specifically OpenAI’s model. Mark and Dunith design a Streamlit-based web application, dubbed the “Live Text Commentary Admin Center,” where commentators can manually input text or trigger AI-generated summaries. The application queries ClickHouse for recent events (e.g., the last minute or game) using SQL, converts results to JSON, and feeds them into the LLM with a prompt instructing it to write concise, present-tense summaries for tennis fans. For example, a query retrieving the last game’s events might yield, “Zverev and Alcaraz slug it out in an epic five-set showdown.” The approach proved effective with frontier models like GPT-4, while smaller models like Llama 3 struggled, highlighting the need for robust LLMs. The generated text is published to a Redpanda “live_text” topic, enabling flexible consumption.
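The speakers’ implementation was in Python with Streamlit; purely as an illustration of the prompting step, here is a sketch using the OpenAI Node SDK, where the model name, prompt wording, and function name are assumptions rather than the demo’s actual code.

```ts
import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Turn the last game's point-by-point events (already fetched from ClickHouse
// and serialized to JSON) into a short, present-tense summary.
async function summarizeLastGame(events: unknown[]): Promise<string> {
  const completion = await openai.chat.completions.create({
    model: "gpt-4o", // assumed model name; the talk used a frontier OpenAI model
    messages: [
      {
        role: "system",
        content: "You write concise, present-tense tennis commentary for fans.",
      },
      {
        role: "user",
        content: `Summarize the following point-by-point events in two sentences:\n${JSON.stringify(events)}`,
      },
    ],
  });
  return completion.choices[0].message.content ?? "";
}
```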
Broadcasting and Future Applications
To deliver commentary to end users, Mark and Dunith employ Server-Sent Events (SSE) via a FastAPI server, streaming Redpanda’s “live_text” topic to a Streamlit web app. This setup mirrors real-world applications like Wikipedia’s recent changes feed, ensuring low-latency updates. The demo showcases commentary appearing in real time, with potential extensions like tweeting updates or storing them in a data warehouse. Beyond sports, Dunith highlights the architecture’s versatility for domains like live auctions, traffic updates, or food delivery tracking (e.g., Uber Eats notifications). Future enhancements include fine-tuning smaller LLMs, integrating fine-grained statistics via text-to-SQL, or summarizing multiple matches for comprehensive coverage, demonstrating the pattern’s adaptability for real-time generative applications.
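On the consuming side, the talk did not dwell on the web app’s internals; as a rough sketch, a browser client could subscribe to the FastAPI SSE endpoint along these lines (the endpoint path and payload shape are assumptions).

```ts
// Subscribe to the server-sent events stream that relays the "live_text" topic.
const source = new EventSource("/live-text");

source.onmessage = (event) => {
  // Assumed payload shape: { timestamp, comment }
  const { timestamp, comment } = JSON.parse(event.data);
  console.log(`[${timestamp}] ${comment}`);
};

source.onerror = () => {
  // EventSource reconnects automatically; log so dropped connections are visible.
  console.warn("live text stream interrupted, retrying…");
};
```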
Links:
[DotAI2024] DotAI 2024: Ines Montani – Crafting Resilient NLP Systems in the Generative Era
Ines Montani, co-founder and CEO of Explosion AI, illuminated the pitfalls and potentials of natural language processing pipelines at DotAI 2024. As a core contributor to spaCy—an open-source NLP powerhouse—and Prodigy, a data annotation suite, Montani champions modular tools that blend human intuition with computational might. Her address critiqued the “prompts suffice” ethos, advocating hybrid architectures that fuse rules, examples, and generative flair for robust, production-viable solutions.
Harmonizing Paradigms for Enduring Intelligence
Montani traced instruction evolution: from rigid rules yielding brittleness to supervised learning’s nuanced exemplars, now augmented by in-context prompts’ linguistic alchemy. Rules shine in clarity for novices, yet crumble under data flux; examples infuse domain savvy but demand curation toil; prompts democratize prototyping, yet hallucinate sans anchors.
The synergy? Layered pipelines where rules scaffold prompts, examples calibrate outputs, and LLMs infuse creativity. Montani showcased spaCy’s evolution: rule-based tokenizers ensure consistency, while generative components handle ambiguity, like entity resolution in noisy texts. This modularity mitigates drift, preserving fidelity across model swaps.
In industrial extraction—parsing resumes or contracts—Montani stressed data’s primacy: raw inputs reveal logic gaps, prompting refactorings that unearth “window-knocking machines”—flawed proxies mistaking correlation for causation. A chatbot querying calendars, she analogized, falters if oblivious to time zones; true utility demands holistic orchestration.
Fostering Modularity Amid Generative Hype
Montani cautioned against abstraction overload: leaky layers spawn brittle facades, where one-liners unravel on edge cases. Instead, embrace transparency—Prodigy’s active learning loops refine datasets iteratively, blending human oversight with AI proposals to curb over-reliance.
Retrieval-augmented generation (RAG) exemplifies balanced integration: LLMs query structured stores, yielding chat interfaces atop databases, supplanting clunky GUIs. Yet, Montani warned, context dictates efficacy; for analytical dives, raw views trump conversational veils.
Her ethos: interrogate intent—who wields the tool, what risks lurk? Surprise greets data dives, unveiling bespoke logics that generative magic alone can’t conjure. Efficiency, privacy, and modularity—spaCy’s hallmarks—thwart big-tech monoliths, empowering bespoke ingenuity.
In sum, Montani’s blueprint rejects compromise: generative AI amplifies, not supplants, principled engineering, birthing interfaces that endure and elevate.
Links:
[PHPForumParis2023] Experience Report: Building Two Open-Source Personal AIs with OpenAI – Maxime Thoonsen
Maxime Thoonsen, CTO at Theodo, shared an exhilarating session at Forum PHP 2023, detailing his experience building two open-source personal AI applications using OpenAI’s technologies. As an organizer of the Generative AI Paris Meetup, Maxime’s passion for the PHP community and innovative AI solutions shone through. His step-by-step approach demystified AI development, encouraging PHP developers to explore generative AI by demonstrating its simplicity and potential through practical examples.
Understanding Generative AI’s Potential
Maxime began by introducing the capabilities of generative AI, emphasizing its accessibility for PHP developers. He explained how OpenAI’s APIs enable the creation of applications that process and generate human-like text. Drawing from his work at Theodo, Maxime showcased two personal AI projects, illustrating how they leverage semantic search and embeddings to deliver tailored responses. His enthusiasm for the community, where he began his speaking career, underscored the collaborative spirit driving AI innovation.
Practical AI Development with OpenAI
Delving into the technical details, Maxime walked the audience through building AI applications using OpenAI’s APIs. He highlighted the simplicity of implementing semantic search to retrieve relevant data from documents, advising against premature fine-tuning in favor of straightforward similarity searches. Responding to an audience question, Maxime noted the availability of open-source alternatives like Llama and Mistral, though he acknowledged OpenAI’s GPT-4 as a leader in embedding accuracy. His examples empowered developers to start building AI-driven features in their PHP projects.
Navigating the AI Ecosystem
Maxime concluded by addressing the rapidly evolving AI landscape, likening it to the proliferation of JavaScript frameworks. He emphasized the cost-effectiveness of smaller open-source models for specific use cases, while noting OpenAI’s edge in precision. His talk inspired developers to join communities like the Generative AI Paris Meetup to explore AI further, fostering a sense of curiosity and experimentation within the PHP ecosystem.
Links:
[SpringIO2024] Serverless Java with Spring by Maximilian Schellhorn & Dennis Kieselhorst @ Spring I/O 2024
Serverless computing has transformed application development by abstracting infrastructure management, offering fine-grained scaling, and billing only for execution time. At Spring I/O 2024 in Barcelona, Maximilian Schellhorn and Dennis Kieselhorst, AWS Solutions Architects, shared their expertise on building serverless Java applications with Spring. Their session explored running existing Spring Boot applications in serverless environments and developing event-driven applications using Spring Cloud Function, with a focus on performance optimizations and practical tooling.
The Serverless Paradigm
Maximilian began by contrasting traditional containerized applications with serverless architectures. Containers, while resource-efficient, require developers to manage orchestration, networking, and scaling. Serverless computing, exemplified by AWS Lambda, eliminates these responsibilities, allowing developers to focus on code. Maximilian highlighted four key promises: reduced operational overhead, automatic granular scaling, pay-per-use billing, and high availability. Unlike containers, which remain active and incur costs even when idle, serverless functions scale to zero, executing only in response to events like API requests or queue messages, optimizing cost and resource utilization.
Spring Cloud Function for Event-Driven Development
Spring Cloud Function simplifies serverless development by enabling developers to write event-driven applications as Java functions. Maximilian demonstrated how it leverages Spring Boot’s familiar features—autoconfiguration, dependency injection, and testing—while abstracting cloud-specific details. Functions receive event payloads (e.g., JSON from API Gateway or Kafka) and can convert them into POJOs, streamlining business logic implementation. The framework’s generic invoker supports function routing, allowing multiple functions within a single codebase, and enables local testing via HTTP endpoints. This portability ensures applications can target various serverless platforms without vendor lock-in, enhancing flexibility.
Adapting Existing Spring Applications
For teams with existing Spring Boot applications, Dennis introduced the AWS Serverless Java Container, an open-source library acting as an adapter to translate serverless events into Java Servlet requests. This allows REST controllers to function unchanged in a serverless environment. Version 2.0.2, released during the conference, supports Spring Boot 3 and integrates with Spring Cloud Function. Dennis emphasized its ease of use: add the library, configure a handler, and deploy. While this approach incurs some overhead compared to native functions, it enables rapid migration of legacy applications, preserving existing investments without requiring extensive rewrites.
Optimizing Performance with SnapStart and GraalVM
Performance, particularly cold start times, is a critical concern in serverless Java applications. Dennis addressed this by detailing AWS Lambda SnapStart, which snapshots the initialized JVM and micro-VM, reducing startup times by up to 80% without additional costs. SnapStart, integrated with Spring Boot 3.2’s CRaC (Coordinated Restore at Checkpoint) support, manages initialization hooks to handle resources like database connections. For further optimization, Maximilian discussed GraalVM native images, which compile Java code into binaries for faster startups and lower memory usage. However, GraalVM’s complexity and framework limitations make SnapStart the preferred starting point, with GraalVM reserved for extreme performance needs.
Practical Considerations and Tooling
Maximilian and Dennis stressed practical considerations, such as database connection management and observability. Serverless scaling can overwhelm traditional databases, necessitating connection pooling adjustments or proxies like AWS RDS Proxy. Observability in Lambda relies on a push model, integrating with tools like CloudWatch, X-Ray, or OpenTelemetry, though additional layers may impact performance. To aid adoption, they offered a Lambda Workshop and a Serverless Java Replatforming Guide, providing hands-on learning and written guidance. These resources, accessible via AWS accounts, empower developers to experiment and apply serverless principles effectively.
Links:
[DevoxxUK2024] Enter The Parallel Universe of the Vector API by Simon Ritter
Simon Ritter, Deputy CTO at Azul Systems, delivered a captivating session at DevoxxUK2024, exploring the transformative potential of Java’s Vector API. This innovative API, introduced as an incubator module in JDK 16 and now in its eighth iteration in JDK 23, empowers developers to harness Single Instruction Multiple Data (SIMD) instructions for parallel processing. By leveraging Advanced Vector Extensions (AVX) in modern processors, the Vector API enables efficient execution of numerically intensive operations, significantly boosting application performance. Simon’s talk navigates the intricacies of vector computations, contrasts them with traditional concurrency models, and demonstrates practical applications, offering developers a powerful tool to optimize Java applications.
Understanding Concurrency and Parallelism
Simon begins by clarifying the distinction between concurrency and parallelism, a common source of confusion. Concurrency involves tasks that overlap in execution time but may not run simultaneously, as the operating system may time-share a single CPU. Parallelism, however, ensures tasks execute simultaneously, leveraging multiple CPUs or cores. For instance, two users editing documents on separate machines achieve parallelism, while a single-core CPU running multiple tasks creates the illusion of parallelism through time-sharing. Java’s threading model, introduced in JDK 1.0, facilitates concurrency via the Thread class, but coordinating data sharing across threads remains challenging. Simon highlights how Java evolved with the concurrency utilities in JDK 5, the Fork/Join framework in JDK 7, and parallel streams in JDK 8, each simplifying concurrent programming while introducing trade-offs, such as non-deterministic results in parallel streams.
The Essence of Vector Processing
The Vector API, distinct from the legacy java.util.Vector class, enables true parallel processing within a single execution unit using SIMD instructions. Simon explains that vectors in mathematics represent sets of values, unlike scalars, and the Vector API applies this concept by storing multiple values in wide registers (e.g., 256-bit AVX2 registers). These registers, divided into lanes (e.g., eight 32-bit integers), allow a single operation, such as adding a constant, to process all lanes in one clock cycle. This contrasts with iterative loops, which process elements sequentially. Historical context reveals SIMD’s roots in 1960s supercomputers like the ILLIAC IV and Cray-1, with modern implementations in Intel’s MMX, SSE, and AVX instructions, culminating in AVX-512 with 512-bit registers. The Vector API abstracts these complexities, enabling developers to write cross-platform code without targeting specific microarchitectures.
Leveraging the Vector API
Simon illustrates the Vector API’s practical application through its core components: Vector, VectorSpecies, and VectorShape. The Vector class, parameterized by type (e.g., Integer), supports operations like addition and multiplication across all lanes. Subclasses like IntVector handle primitive types, offering methods like fromArray to populate vectors from arrays. VectorShape defines register sizes (64 to 512 bits or S_MAX for the largest available), ensuring portability across architectures like Intel and ARM. VectorSpecies combines type and shape, specifying, for example, an IntVector with eight lanes in a 256-bit register. Simon demonstrates a loop processing a million-element array, using VectorSpecies to calculate iterations based on lane count, and employs VectorMask to handle partial arrays, ensuring no side effects from unused lanes. This approach optimizes performance for numerically intensive tasks, such as matrix computations or data transformations.
Performance Insights and Trade-offs
The Vector API’s performance benefits shine in specific scenarios, particularly when autovectorization by the JIT compiler is insufficient. Simon references benchmarks from Tomas Zezula, showing that explicit Vector API usage outperforms autovectorization for small arrays (e.g., 64 elements) due to better register utilization. However, for larger arrays (e.g., 2 million elements), memory access latency—100+ cycles for RAM versus 3-5 for L1 cache—diminishes gains. Conditional operations, like adding only even-valued elements, further highlight the API’s value, as the C2 JIT compiler often fails to autovectorize such cases. Azul’s Falcon JIT compiler, based on LLVM, improves autovectorization, but explicit Vector API usage remains superior for complex operations. Simon emphasizes that while the API offers significant flexibility through masks and shuffles, its benefits wane with large datasets due to memory bottlenecks.
Links:
Predictive Modeling and the Illusion of Signal
Introduction
Vincent Warmerdam delves into the illusions often encountered in predictive modeling, highlighting the cognitive traps and statistical misconceptions that lead to overconfidence in model performance.
The Seduction of Spurious Correlations
Models often perform well on training data by exploiting noise rather than genuine signal. Vincent emphasizes critical thinking and statistical rigor to avoid being misled by deceptively strong results.
Building Robust Models
Using robust cross-validation, considering domain knowledge, and testing against out-of-sample data are vital strategies to counteract the illusion of predictive prowess.
Conclusion
Data science is not just coding and modeling — it requires constant skepticism, critical evaluation, and humility. Vincent reminds us to stay vigilant against the comforting but dangerous mirage of false predictability.