Jonathan Lalou's Blog

Archive for the ‘en-US’ Category

[DevoxxPL2022] Before It’s Too Late: Finding Real-Time Holes in Data • Chayim Kirshen

Chayim Kirshen, a veteran of the startup ecosystem and client manager at Redis, captivated audiences at Devoxx Poland 2022 with a dynamic exploration of real-time data pipeline challenges. Drawing from his experience with high-stakes environments, including a 2010 stock exchange meltdown, Chayim outlined strategies to ensure data integrity and performance in large-scale systems. His talk provided actionable insights for developers, emphasizing the importance of storing raw data, parsing in real time, and leveraging technologies like Redis to address data inconsistencies.

The Perils of Unclean Data

Chayim began with a stark reality: data is rarely clean. Recounting a 2010 incident where hackers compromised a major stock exchange’s API, he highlighted the cascading effects of unreliable data on real-time markets. Data pipelines face issues like inconsistent formats (CSV, JSON, XML), changing sources (e.g., shifting API endpoints), and service reliability, with modern systems often tolerating over a thousand minutes of downtime annually. These challenges disrupt real-time processing, critical for applications like stock exchanges or ad bidding networks requiring sub-100ms responses. Chayim advocated treating data as programmable code, enabling developers to address issues systematically rather than reactively.

Building Robust Data Pipelines

To tackle these issues, Chayim proposed a structured approach to data pipeline design. Storing raw data indefinitely—whether in S3, Redis, or other storage—ensures a fallback for reprocessing. Parsing data in real time, using defined schemas, allows immediate usability while preserving raw inputs. Bulk changes, such as SQL bulk inserts or Redis pipelines, reduce network overhead, critical for high-throughput systems. Chayim emphasized scheduling regular backfills to re-import historical data, ensuring consistency despite source changes. For example, a stock exchange’s ticker symbol updates (e.g., Fitbit to Google) require ongoing reprocessing to maintain accuracy. Horizontal scaling, using disposable nodes, enhances availability and resilience, avoiding single points of failure.

Real-Time Enrichment and Redis Integration

Data enrichment, such as calculating stock bid-ask spreads or market cap changes, should occur post-ingestion to avoid slowing the pipeline. Chayim showcased Redis, particularly its Gears and JSON modules, for real-time data processing. Redis acts as a buffer, storing raw JSON and replicating it to traditional databases like PostgreSQL or MySQL. Using Redis Gears, developers can execute functions within the database, minimizing network costs and enabling rapid enrichment. For instance, calculating a stock’s daily percentage change can run directly in Redis, streamlining analytics. Chayim highlighted Python-based tools like Celery for scheduling backfills and enrichments, allowing asynchronous processing and failure retries without disrupting the main pipeline.

Scaling and Future-Proofing

Chayim stressed horizontal scaling to distribute workloads geographically, placing data closer to users for low-latency access, as seen in ad networks. By using Redis for real-time writes and offloading to workers via Celery, developers can manage millions of daily entries, such as stock ticks, without performance bottlenecks. Scheduled backfills address data gaps, like API schema changes (e.g., integer to string conversions), by reprocessing raw data. This approach, combined with infrastructure-as-code tools like Terraform, ensures scalability and adaptability, allowing organizations to focus on business logic rather than data management overhead.

Links:

Redis company website

Posted in en-US | Tags: BigData, ChayimKirshen, DataPipelines, DevoxxPL2022, DevoxxPoland, RealTimeData, Redis, StockExchange | No Comments »

[DevoxxPL2022] From Private Through Hybrid to Public Cloud – Product Migration • Paweł Piekut

Author: Jonathan Lalou

At Devoxx Poland 2022, Paweł Piekut, a seasoned software developer at Bosch, delivered an insightful presentation on the migration of their e-bike cloud platform from a private cloud to a public cloud environment. Drawing from his expertise in Java, Kotlin, and .NET, Paweł narrated the intricate journey of transitioning a complex IoT ecosystem, highlighting the technical challenges, strategic decisions, and lessons learned. His talk offered a practical roadmap for organizations navigating the complexities of cloud migration, emphasizing the balance between innovation, scalability, and compliance.

Navigating the Private Cloud Landscape

Paweł began by outlining the initial deployment of Bosch’s e-bike cloud on a private cloud developed internally by the company’s IT group. This proprietary platform, designed to support the e-bike ecosystem, facilitated communication between hardware components—such as drive units, batteries, and controllers—and the mobile app, which interfaced with the cloud. The cloud served multiple stakeholders, including factories for device flashing, manufacturers for configuration, authorized services for diagnostics, and end-users for features like activity tracking and bike locking. However, the private cloud faced significant limitations. Scalability was constrained, requiring manual capacity requests and investments, which hindered agility. Downtimes were frequent, acceptable for development but untenable for production. Additionally, the platform’s bespoke nature made it challenging to hire experienced talent and limited developer engagement due to its lack of market-standard tools.

Despite these drawbacks, the private cloud offered advantages. Its deployment within Bosch’s secure network ensured high performance and simplified compliance with data privacy regulations, critical for an international product subject to data localization laws. Costs were predictable, and the absence of vendor lock-in, thanks to open-source frameworks, provided flexibility. However, the need for modern scalability and developer-friendly tools drove the decision to explore public cloud solutions, with Amazon Web Services (AWS) selected for its robust support.

The Hybrid Cloud Conundrum

Transitioning to a hybrid cloud model introduced a blend of private and public cloud environments, creating new challenges. Bosch’s internal policy of “on-transit data” required data processed in the public cloud to be returned to the private cloud, necessitating complex and secure data transfers. While AWS Direct Connect facilitated this, the hybrid setup led to operational complexities. Only select services ran on AWS, causing a divide among developers eager to work with widely recognized public cloud tools. Technical issues, such as Kafka’s inaccessibility from the private cloud, required significant effort to resolve. Error tracing across clouds was cumbersome, with Splunk used in the private cloud and Elasticsearch in the public cloud, complicating root-cause analysis. The simultaneous migration of Jenkins added further complexity, with duplicated jobs and confusing configurations.

Despite these hurdles, the hybrid model offered benefits. It allowed Bosch to leverage the private cloud’s security for sensitive data while tapping into the public cloud’s scalability for peak loads. This setup supported disaster recovery and compliance with data localization requirements. However, the on-transit data concept proved overly complex, leading to dissatisfaction and prompting a strategic shift toward a cloud-first approach, prioritizing public cloud deployment unless justified otherwise.

Embracing the Public Cloud

The full migration to AWS marked a pivotal phase, divided into three stages. First, the team focused on exploration and training to master AWS products and the pay-as-you-go pricing model, which made every developer accountable for costs. This stage emphasized understanding managed versus unmanaged services, such as Kubernetes and Kafka, and ensuring backup compatibility across clouds. The second stage involved building new applications on AWS, addressing unknowns and ensuring secure communication with external systems. Finally, existing services were migrated from private to public cloud, starting with development and progressing to production. Throughout, the team maintained services in both environments, managing separate repositories and addressing critical bugs, such as Log4j vulnerabilities, across both.

To mitigate vendor lock-in, Bosch adopted a cloud-agnostic approach, using Terraform for infrastructure-as-code instead of AWS-specific CloudFormation. While tools like S3 and DynamoDB were embraced for their market-leading performance, backups were standardized to ensure portability. The public cloud’s vast community, extensive documentation, and readily available resources reduced knowledge silos and enhanced developer satisfaction, making the migration a transformative step for innovation and agility.

Lessons for Cloud Migration

Paweł’s experience underscores the importance of aligning cloud strategy with organizational needs. The public cloud’s immediate resource availability and developer-friendly tools accelerated development, but required careful cost management. Hybrid cloud offered flexibility but introduced complexity, particularly with data transfers. Private cloud provided security and control but lacked scalability. Paweł emphasized defining precise requirements—budget, priorities, and compliance—before choosing a cloud model. Startups may favor public clouds for agility, while regulated industries might opt for private or hybrid solutions to prioritize data security and network performance. This strategic clarity ensures a successful migration tailored to business goals.

Links:

Bosch company website

Posted in en-US | Tags: AWS, Bosch, CloudMigration, DevoxxPL2022, DevoxxPoland, HybridCloud, Kubernetes, PawełPiekut, PrivateCloud, PublicCloud | No Comments »

[DevoxxPL2022] Did Anyone Say SemVer? • Philipp Krenn

Author: Jonathan Lalou

Philipp Krenn, a developer advocate at Elastic, captivated audiences at Devoxx Poland 2022 with a witty and incisive exploration of semantic versioning (SemVer). Drawing from Elastic’s experiences with Elasticsearch, Philipp dissected the nuances of versioning, revealing why SemVer often ignites passionate debates. His talk navigated the ambiguities of defining APIs, the complexities of breaking changes, and the cultural dynamics of open-source versioning, offering a pragmatic lens for developers grappling with version management.

Decoding Semantic versioning

Philipp introduced SemVer, as formalized on semver.org, with its major version structure, where patch fixes bugs, minor adds features, and major introduces breaking changes. This simplicity, however, belies complexity in practice. He posed a sorting challenge with version strings like alpha.-, 2.-, and 11.-, illustrating SemVer’s arcane precedence rules, humorously cautioning against such obfuscation unless “trolling users.” Philipp noted that SemVer’s focus on APIs raises fundamental questions: what constitutes an API? For Elasticsearch, the REST API is sacrosanct, warranting major version bumps for changes, whereas plugin APIs, exposing internal Java packages, tolerate frequent breaks, sparking user frustration when plugins fail.

The Ambiguity of Breaking Changes

The definition of a breaking change varies by perspective, Philipp argued. Upgrading a supported JDK version, for instance, divides opinions—some view it as a system-altering break, others as an implementation detail. Security fixes further muddy the waters, as seen in Elastic’s handling of unintended insecure usage, where API “fixes” disrupted user workflows. Philipp cited the Log4j2 vulnerability, where maintainers supported multiple JDK versions across minor releases, avoiding major version increments. Accidental breaks, common in open-source projects, and asymmetric feature additions—easy to add, hard to remove—compound SemVer’s challenges, often leading to user distrust when expectations misalign.

Cultural and Practical Dilemmas

Philipp explored why SemVer debates are so heated, attributing it to differing interpretations of “correct” versioning. He critiqued version ranges, prevalent in npm but rare in Java, for introducing instability due to transitive dependency updates, advocating for tools like Dependabot to manage updates explicitly. Experimental APIs, marked as unstable, offer an escape hatch for breaking changes without major version bumps, though they demand diligent release note scrutiny. Pre-1.0 versions, dubbed the “Wild West,” lack SemVer guarantees, enabling unfettered changes but risking user confusion. Philipp contrasted SemVer with alternatives like calendar versioning, used by Ubuntu, noting its decline as SemVer dominates modern ecosystems.

Links:

Posted in en-US | Tags: APIManagement, DevoxxPL2022, DevoxxPoland, Elastic, Elasticsearch, PhilippKrenn, SemanticVersioning | No Comments »

[DevoxxPL2022] Challenges Running Planet-Wide Computer: Efficiency • Jacek Bzdak, Beata Strack

Author: Jonathan Lalou

Jacek Bzdak and Beata Strack, software engineers at Google Poland, delivered an engaging session at Devoxx Poland 2022, exploring the intricacies of optimizing Google’s planet-scale computing infrastructure. Their talk focused on achieving efficiency in a distributed system spanning global data centers, emphasizing resource utilization, auto-scaling, and operational strategies. By sharing insights from Google’s internal cloud and Autopilot system, Jacek and Beata provided a blueprint for enhancing service performance while navigating the complexities of large-scale computing.

Defining Efficiency in a Global Fleet

Beata opened by framing Google’s data centers as a singular “planet-wide computer,” where efficiency translates to minimizing operational costs—servers, CPU, memory, data centers, and electricity. Key metrics like fleet-wide utilization, CPU/RAM allocation, and growth rate serve as proxies for these costs, though they are imperfect, often masking quality issues like inflated memory usage. Beata stressed that efficiency begins at the service level, where individual jobs must optimize resource consumption, and extends to the fleet through an ecosystem that maximizes resource sharing. This dual approach ensures that savings at the micro level scale globally, a principle applicable even to smaller organizations.

Auto-Scaling: Balancing Utilization and Reliability

Jacek, a member of Google’s Autopilot team, delved into auto-scaling, a critical mechanism for achieving high utilization without compromising reliability. Autopilot’s vertical scaling adjusts resource limits (CPU/memory) for fixed replicas, while horizontal scaling modifies replica counts. Jacek presented data from an Autopilot paper, showing that auto-scaled services maintain memory slack below 20% for median cases, compared to over 60% for manually managed services. Crucially, automation reduces outage risks by dynamically adjusting limits, as demonstrated in a real-world case where Autopilot preempted a memory-induced crash. However, auto-scaling introduces complexity, particularly feedback loops, where overzealous caching or load shedding can destabilize resource allocation, requiring careful integration with application-specific metrics.

Java-Specific Challenges in Auto-Scaling

The talk transitioned to language-specific hurdles, with Jacek highlighting Java’s unique challenges in auto-scaling environments. Just-in-Time (JIT) compilation during application startup spikes CPU usage, complicating horizontal scaling decisions. Memory management poses further issues, as Java’s heap size is static, and out-of-memory errors may be masked by garbage collection (GC) thrashing, where excessive CPU is devoted to GC rather than request handling. To address this, Google sets static heap sizes and auto-scales non-heap memory, though Jacek envisioned a future where Java aligns with other languages, eliminating heap-specific configurations. These insights underscore the need for language-aware auto-scaling strategies in heterogeneous environments.

Operational Strategies for Resource Reclamation

Beata concluded by discussing operational techniques like overcommit and workload colocation to reclaim unused resources. Overcommit leverages the low probability of simultaneous resource spikes across unrelated services, allowing Google to pack more workloads onto machines. Colocating high-priority serving jobs with lower-priority batch workloads enables resource reclamation, with batch tasks evicted when serving jobs demand capacity. A 2015 experiment demonstrated significant machine savings through colocation, a concept influencing Kubernetes’ design. These strategies, combined with auto-scaling, create a robust framework for efficiency, though they demand rigorous isolation to prevent interference between workloads.

Links:

Google company website

Posted in en-US | Tags: AutoScaling, BeataStrack, CloudComputing, DevoxxPL2022, DevoxxPoland, Efficiency, Google, JacekBzdak, Kubernetes | No Comments »

[NodeCongress2021] From 1 to 101 Lambda Functions in Production: Evolving a Serverless Architecture – Slobodan Stojanovic

Author: Jonathan Lalou

Charting a server’s demise unearths tales of unchecked escalation, yet Slobodan Stojanovic’s chronicle of Vacation Tracker—from solitary Lambda to century-strong ensemble—illuminates adaptive mastery. As co-founder and CTO at Cloud Horizon, Slobodan recounts bootstrapping a PTO sentinel for Slack, evolving through GraphQL mazes to serve millions, all while curbing costs under $2K since 2018.

Slobodan’s saga ignites in 2017: hackathon sparks, landing page lures 100+ waitlisters. 2018’s MVP—single Lambda parses Slack commands, DynamoDB persists—morphs via Serverless Framework, then Claudia.js for API orchestration.

Navigating Architectural Metamorphoses

Hexagonal tenets decouple: ports/adapters insulate cores, easing mocks for units. Early monolith yields to CQRS—separate read/write Lambdas—bolstering scalability. GraphQL unifies: Apollo resolvers dispatch to specialists, DynamoDB queries aggregate.

Migrations pivot: Mongo to Dynamo via interface swaps, data shuttles offline. Integrations? LocalStack emulates AWS; CI spins ephemeral tables, asserts via before/after hooks.

Monitoring, Costs, and Team Triumphs

Datadog dashboards query errs; alerts ping anomalies. Bugs bite—Dynamo scans balloon bills to $300/month, fixed via queries slashing RPS. Onboarding thrives: hexagonal clarity, workshops demystify.

Slobodan’s axioms: evolve with scale, hexagonal/CQRS affinity, integration rigor, vigilant oversight. Free webinars beckon, perpetuating serverless lore.

Links:

Posted in en-US | Tags: ArchitectureEvolution, GraphQL, Lambda, Node, NodeCongress, NodeCongress2021, NodeJS, Serverless, SlobodanStojanovic, VacationTracker | No Comments »

[SpringIO2022] Ahead Of Time and Native in Spring Boot 3.0

Author: Jonathan Lalou

At Spring I/O 2022 in Barcelona, Brian Clozel and Stéphane Nicoll, both engineers at VMware, delivered a comprehensive session on Ahead Of Time (AOT) processing and native compilation in Spring Boot 3.0 and Spring Framework 6.0. Their talk explored the integration of GraalVM native capabilities, detailing the AOT engine’s design, its use by libraries, and practical steps for developers. Through a live demo, they showcased how to transform a Spring application into a native binary, highlighting performance gains and configuration challenges.

GraalVM Native Compilation: Core Concepts

Brian opened by introducing GraalVM, a versatile JVM supporting multiple languages and optimized Just-In-Time (JIT) compilation. The talk focused on its native compilation feature, which transforms Java applications into standalone binaries for specific CPU architectures. This process involves static analysis at build time, processing all classes on a fixed classpath, and determining reachable code. Benefits include memory efficiency (megabytes instead of gigabytes), millisecond startup times, and suitability for CLI tools, serverless functions, and high-density container deployments.

However, challenges exist. Static analysis may require additional reachability metadata for reflection or resources, as GraalVM cannot always infer runtime behavior. Brian demonstrated a case where reflection-based method invocation fails without metadata, as the native image excludes unreachable code. Debugging is less straightforward than with traditional JVMs, and Java agents, like OpenTelemetry, are unsupported. The speakers emphasized that AOT aims to bridge these gaps, making native compilation accessible for Spring applications.

Spring’s AOT Engine: Design and Integration

Stéphane detailed the AOT engine, a core component of Spring Framework 6.0-M4 and Spring Boot 3.0-M3, designed to preprocess application configurations at build time. Unlike annotation processors, it operates post-compilation, analyzing the bean factory and generating Java code to replace dynamic configuration parsing. This code, viewable in modern IDEs like IntelliJ, mimics hand-written configurations but is automatically generated, preserving package visibility and including Javadoc for clarity.

The engine supports two approaches: contributing reachability metadata for reflection or resources, or generating code to simplify static analysis. For example, a demo CLI application used Spring’s RuntimeHints API to register reflection for a SimpleHelloService class and include a classpath resource. The native build tools Gradle plugin, provided by the GraalVM team, integrates with Spring Boot’s plugin to trigger AOT processing and native compilation. Stéphane showed how the generated binary achieved rapid startup and low memory usage, with configuration classes handled automatically by the AOT engine.

Developer Cookbook: Making Applications Native-Ready

The speakers introduced a developer cookbook to guide Spring users toward native compatibility. The first step is running the application in AOT mode on the JVM, validating the engine’s understanding of the configuration without native compilation. This mode pre-processes the bean factory, reducing startup time and exposing issues early. Next, developers should reuse existing test suites, adapting them for AOT using generated sources and JUnit support. This identifies missing metadata, such as reflection or resource hints.

For third-party libraries or custom code, developers can contribute hints via the RuntimeHints API or validate them using a forthcoming Java agent. The GraalVM team is developing a reachability metadata repository, where the Spring team is contributing hints for popular libraries, reducing manual configuration. For advanced cases, developers can hook into the AOT engine to generate custom code, supported by a test compiler API to verify outcomes. Brian emphasized balancing hints and code generation, favoring simplicity unless performance demands otherwise.

Future Directions and Community Collaboration

The talk concluded with a roadmap for Spring Boot 3.0 and Spring Framework 6.0, targeting general availability by late 2022. The current milestones provide robust AOT infrastructure, with future releases expanding support for Spring libraries. The speakers highlighted collaboration with the GraalVM team to simplify native adoption and plans to align with Project Leyden for JVM optimizations. They encouraged feedback via the Spring I/O app and invited developers to explore the demo repository, which includes Maven and Gradle configurations.

This session equipped developers with tools to leverage AOT and native compilation, unlocking new use cases like serverless and high-density deployments while maintaining Spring’s developer-friendly ethos.

Links:

Posted in en-US | Tags: AOT, BrianClozel, GraalVM, NativeCompilation, SpringBoot3, SpringIO2022, StéphaneNicoll, VMware | No Comments »

[DevoxxPL2022] How We Migrate Customers and Internal Teams to Kubernetes • Piotr Bochyński

Author: Jonathan Lalou

At Devoxx Poland 2022, Piotr Bochyński, a seasoned cloud native expert at SAP, shared a compelling narrative on transitioning customers and internal teams from a Cloud Foundry-based platform to Kubernetes. His presentation illuminated the strategic imperatives, technical challenges, and practical solutions that defined SAP’s journey toward a multi-cloud Kubernetes ecosystem. By leveraging open-source projects like Kyma and Gardener, Piotr’s team addressed the limitations of their legacy platform, fostering developer productivity and operational scalability. His insights offer valuable lessons for organizations contemplating a similar migration.

Understanding Platform as a Service

Piotr began by contextualizing Platform as a Service (PaaS), a model that abstracts infrastructure complexities, allowing developers to focus on application development. Unlike Infrastructure as a Service (IaaS), which provides raw virtual machines, PaaS delivers managed runtimes, middleware, and automation, accelerating time-to-market. However, this convenience comes with trade-offs, such as reduced control and potential vendor lock-in, often tied to opinionated frameworks like the 12-factor application methodology. Piotr highlighted SAP’s initial adoption of Cloud Foundry, an open-source PaaS, to avoid vendor dependency while meeting multi-cloud requirements driven by legal and business needs, particularly in sectors like banking. Yet, Cloud Foundry’s constraints, such as single HTTP port exposure and reliance on outdated technologies like BOSH, prompted SAP to explore Kubernetes as a more flexible alternative.

Kubernetes: A Platform for Platforms

Kubernetes, as Piotr elucidated, is not a traditional PaaS but a container orchestration framework that serves as a foundation for building custom platforms. Its declarative API and extensibility distinguish it from predecessors, enabling consistent management of diverse resources like deployments, namespaces, and custom objects. Piotr illustrated this with the thermostat analogy: developers declare a desired state (e.g., 22 degrees), and Kubernetes controllers reconcile the actual state to match it. This pattern, applied uniformly across resources, empowers developers to extend Kubernetes with custom controllers, such as a hypothetical thermostat resource. The Kyma project, an open-source initiative led by SAP, builds on this extensibility, providing opinionated building blocks like Istio-based API gateways, NATS eventing, and serverless functions to bridge the gap between raw Kubernetes and a developer-friendly PaaS.

Overcoming Migration Challenges

The migration to Kubernetes presented multifaceted challenges, from technical complexity to cultural adoption. Piotr emphasized the steep learning curve associated with Kubernetes’ vast resource set, compounded by additional components like Prometheus and Istio. To mitigate this, SAP employed Kyma to abstract complexities, offering simplified resources like API rules that encapsulate Istio configurations for secure service exposure. Another hurdle was ensuring multi-cloud compatibility. SAP’s Gardener project, a managed Kubernetes solution, addressed this by providing a consistent, Kubernetes-compliant layer across providers like AWS, Azure, and Google Cloud. Piotr also discussed operational scalability, managing thousands of clusters for hundreds of teams. By applying the Kubernetes controller pattern, SAP automated cluster provisioning, upgrades, and security patching, reducing manual intervention and ensuring reliability.

Lessons from the Journey

Reflecting on the migration, Piotr candidly shared missteps that shaped SAP’s approach. Early attempts to shield users from Kubernetes’ complexity by mimicking Cloud Foundry’s API failed, as developers craved direct control over Kubernetes resources. Similarly, restricting cluster admin roles to prevent misconfigurations stifled innovation, leading SAP to grant greater flexibility. Some technology choices, like the Service Catalog project, proved inefficient, underscoring the importance of aligning with Kubernetes’ operator pattern. License changes in tools like Grafana also necessitated pivots, highlighting the need for vigilance in open-source dependencies. Piotr’s takeaways resonate broadly: Kubernetes is a long-term investment, requiring a balance of opinionated tooling and developer freedom, with automation as a cornerstone for scalability.

Links:

Posted in en-US | Tags: CloudNative, DevoxxPL2022, DevoxxPoland, Gardener, Kubernetes, Kyma, MultiCloud, PaaS, PiotrBochynski, SAP | No Comments »

[SpringIO2022] Major Migrations Made Easy with OpenRewrite

Author: Jonathan Lalou

Tim te Beek’s Spring I/O 2022 session introduced OpenRewrite, a powerful tool for automating large-scale Java migrations. As a Java consultant at JDriven, Tim shared his passion for updating outdated technology stacks, using OpenRewrite to streamline upgrades across frameworks, libraries, and languages. His talk, delivered on his birthday, combined a compelling narrative with a live demo, showcasing how OpenRewrite transforms tedious migrations into quick, safe operations.

The Migration Challenge: Keeping Up with Open Source

Tim opened with a decade-long perspective on Java and Spring evolution, from Spring Framework 2.5 in 2009 to Java 17 and Spring Boot 2 in 2022. Each release—Java 8’s lambdas, Spring Boot’s reduced boilerplate, JUnit 5, or Java 11’s JAX-B dependencies—introduced valuable features but required manual upgrades across multiple services. Vulnerabilities like Log4Shell further necessitate rapid migrations, often under pressure. For large organizations with thousands of services, manual updates are impractical, making automation essential.

OpenRewrite addresses this by leveraging an abstract syntax tree (AST) to perform precise, safe refactorings. Unlike simple search-and-replace, it understands code context, preserving formatting and ensuring functional integrity. Tim emphasized its ability to handle migrations like JUnit 4 to 5, Log4j to SLF4J, or Spring Boot 1 to 2, reducing technical debt in minutes.

How OpenRewrite Works: Recipes and AST Magic

OpenRewrite’s core strength lies in its recipe-based approach. Recipes are modular, reusable transformations—implemented as Java visitors—that modify the AST. Tim explained how recipes range from simple (changing imports) to complex (converting JUnit 4’s expected exceptions to JUnit 5’s assertThrows). These can be combined into modules for tasks like framework upgrades or style enforcement. The tool supports Java, Groovy, YAML, and XML, enabling changes to Maven/Gradle builds and Spring configurations.

A key differentiator is OpenRewrite’s type attribution and format preservation, ensuring changes blend seamlessly with existing code. Tim’s demo illustrated this by migrating a Spring Pet Clinic project from Spring Boot 1.5 (Java 8) to Spring Boot 2.5 (Java 17). Using Maven’s OpenRewrite plugin, he applied recipes to update dependencies, imports, annotations, and properties, completing the migration in under 15 seconds per step, with only two minor test failures requiring manual fixes.

Spring Boot Migrator: Enhancing OpenRewrite

Tim introduced Spring Boot Migrator, an experimental Spring project built on OpenRewrite, designed to simplify migrations to Spring Boot. Initiated by VMware Labs in 2020 and led by Fabian Krüger, it offers an opinionated API for Spring-specific migrations, such as Java EE to Spring or NetWeaver to Spring Integration. Unlike OpenRewrite’s fully automated recipes, Spring Boot Migrator provides an interactive workflow, generating HTML reports to guide developers through component identification and transformation steps.

Looking ahead, Spring Boot Migrator aims to support Spring Framework 6 and Spring Boot 3, expected in November 2022, and facilitate cloud migrations to GraalVM. Tim encouraged community contributions, noting its role in easing enterprise migrations for VMware customers.

Impact and Community: Scaling Automation

OpenRewrite’s open-source model, backed by Moderne, ensures all recipes are Apache-licensed, fostering community-driven development. Tim highlighted its use in fixing static analysis issues (e.g., Checkstyle, Sonar), enforcing code style, and contributing to open-source projects like WireMock and Apache Maven. He shared his experience migrating thousands of unit tests, urging attendees to explore OpenRewrite’s web interface (app.moderne.io) and contribute recipes.

Tim’s talk inspired developers to embrace automation, reducing migration pain and enabling focus on innovation. His enthusiasm for OpenRewrite’s potential to transform development workflows resonated strongly with the audience.

Links:

Posted in en-US | Tags: JavaMigration, JDriven, OpenRewrite, SpringBoot, SpringBootMigrator, SpringIO2022, TimTeBeek | No Comments »

[DevoxxPL2022] Java 17 & 18: What’s New and Noteworthy • Piotr Przybył

Author: Jonathan Lalou

Piotr Przybył, a seasoned software gardener at AtomicJar, captivated the audience at Devoxx Poland 2022 with a comprehensive deep dive into the new features and enhancements introduced in Java 17 and 18. His presentation, rich with technical insights and practical demonstrations, explored key updates that empower developers to write more robust, maintainable, and efficient code. Piotr’s engaging style, peppered with humor and real-world examples, provided a clear roadmap for leveraging these advancements in modern Java development.

Sealed Classes for Controlled Inheritance

One of the standout features of Java 17 is sealed classes, introduced as JEP 409. Piotr explained how sealed classes allow developers to restrict which classes or interfaces can extend or implement a given type, offering fine-grained control over inheritance. This is particularly useful for library maintainers who want to prevent unintended code reuse while allowing specific extensions. By using the sealed keyword and a permits clause, developers can define a closed set of subclasses, with options to mark them as final, sealed, or non-sealed. Piotr’s demo illustrated this with a library type hierarchy, showing how sealed classes enhance code maintainability and prevent misuse through inheritance.

Enhanced Encapsulation and UTF-8 by Default

Java 17’s JEP 403 strengthens encapsulation by removing illegal reflective access, a change Piotr humorously likened to “closing the gates to reflection demons.” Previously, developers could bypass encapsulation using setAccessible(true), but Java 17 enforces stricter access controls, requiring code fixes or the use of --add-opens flags for legacy systems. Additionally, Java 18’s JEP 400 sets UTF-8 as the default charset for I/O operations, resolving discrepancies across platforms. Piotr demonstrated how to handle encoding issues, advising developers to explicitly specify charsets to ensure compatibility, especially for Windows users.

Deprecating Finalization and Introducing Simple Web Server

Java 18’s JEP 421 marks the deprecation of the finalize method for removal, signaling the end of a problematic mechanism for resource cleanup. Piotr’s demo highlighted the non-deterministic nature of finalization, advocating for try-with-resources as a modern alternative. He also showcased Java 18’s simple web server (JEP 408), a lightweight tool for serving static files during development or testing. Through a programmatic example, Piotr demonstrated how to start a server on port 9000 and dynamically modify CSS files, emphasizing its utility for quick prototyping.

Pattern Matching for Switch and Foreign Function API

Piotr explored Java 18’s pattern matching for switch (JEP 420), a preview feature that enhances switch statements and expressions. This feature supports null handling, guarded patterns, and type-based switching, eliminating the need for cumbersome if-else checks. His demo showed how to switch over objects, handle null cases, and use guards to refine conditions, making code more concise and readable. Additionally, Piotr introduced the Foreign Function and Memory API (JEP 419), an incubator module for safe, efficient interoperation with native code. He demonstrated allocating off-heap memory and calling C functions, highlighting the API’s thread-safety and scope-bound memory management.

Random Generators and Deserialization Filters

Java 17’s JEP 356 introduces enhanced pseudo-random number generators, offering a unified interface for various random number implementations. Piotr’s demo showcased switching between generators like Random, SecureRandom, and ThreadLocalRandom, simplifying random number generation for diverse use cases. Java 17 also improves deserialization filters (JEP 415), allowing per-stream customization to enhance security against malicious data. These updates, combined with other enhancements like macOS Metal rendering and larger G1 heap regions, underscore Java’s commitment to performance and security.

Links:

Posted in en-US | Tags: AtomicJar, Development, DevoxxPL2022, DevoxxPoland, ForeignFunctionAPI, Java17, Java18, PatternMatching, PiotrPrzybył, SealedClasses, UTF8 | No Comments »

[DevoxxPL2022] Integrate Hibernate with Your Elasticsearch Database • Bartosz de Boulange

Author: Jonathan Lalou

At Devoxx Poland 2022, Bartosz de Boulange, a Java developer at BGŻ BNP Paribas, Poland’s national development bank, delivered an insightful presentation on Hibernate Search, a powerful tool that seamlessly integrates traditional Object-Relational Mapping (ORM) with NoSQL databases like Elasticsearch. Bartosz’s talk focused on enabling full-text search capabilities within SQL-based applications, offering a practical solution for developers seeking to enhance search functionality without migrating entirely to a NoSQL ecosystem. Through a blend of theoretical insights and hands-on coding demonstrations, he illustrated how Hibernate Search can address complex search requirements in modern applications.

The Power of Full-Text Search

Bartosz began by addressing the challenge of implementing robust search functionality in applications backed by SQL databases. For instance, in a bookstore application, users might need to search for specific phrases within thousands of reviews. Traditional SQL queries, such as LIKE statements, are often inadequate for such tasks due to their limited ability to handle complex text analysis. Hibernate Search solves this by enabling full-text search, which includes character filtering, tokenization, and normalization. These features allow developers to remove irrelevant characters, break text into searchable tokens, and standardize data for efficient querying. Unlike native SQL full-text search capabilities, Hibernate Search offers a more streamlined and scalable approach, making it ideal for applications requiring sophisticated search features.

Integrating Hibernate with Elasticsearch

The core of Bartosz’s presentation was a step-by-step guide to integrating Hibernate Search with Elasticsearch. He outlined five key steps: creating JPA entities, adding Hibernate Search dependencies, annotating entities for indexing, configuring fields for NoSQL storage, and performing initial indexing. By annotating entities with @Indexed, developers can create indexes in Elasticsearch at application startup. Fields are annotated as @FullTextField for tokenization and search, @KeywordField for sorting, or @GenericField for basic querying. Bartosz emphasized the importance of the @FullTextField, which enables advanced search capabilities like fuzzy matching and phrase queries. His live coding demo showcased how to set up a Docker Compose file with MySQL and Elasticsearch, configure the application, and index a bookstore’s data, demonstrating the ease of integrating these technologies.

Scalability and Synchronization Challenges

A significant advantage of using Elasticsearch with Hibernate Search is its scalability. Unlike Apache Lucene, which is limited to a single node and suited for smaller projects, Elasticsearch supports distributed data across multiple nodes, making it ideal for enterprise applications. However, Bartosz highlighted a key challenge: synchronization between SQL and NoSQL databases. Changes in the SQL database may not immediately reflect in Elasticsearch due to communication overhead. To address this, he introduced an experimental outbox polling coordination strategy, which uses additional SQL tables to maintain update order. While still in development, this feature promises to improve data consistency, a critical aspect for production environments.

Practical Applications and Benefits

Bartosz demonstrated practical applications of Hibernate Search through a bookstore example, where users could search for books by title, description, or reviews. His demo showed how to query Elasticsearch for terms like “Hibernate” or “programming,” retrieving relevant results ranked by relevance. Additionally, Hibernate Search supports advanced features like sorting by distance for geolocation-based queries and projections for retrieving partial documents, reducing reliance on the SQL database for certain operations. These capabilities make Hibernate Search a versatile tool for developers aiming to enhance search performance while maintaining their existing SQL infrastructure.

Links:

BGŻ BNP Paribas company website

Posted in en-US | Tags: BartoszDeBoulange, BGŻBNPParibas, Development, DevoxxPL2022, DevoxxPoland, Elasticsearch, FullTextSearch, HibernateSearch, Java, NoSQL, SQL | No Comments »