Archive for the ‘General’ Category

[DevoxxFR 2023] Hexagonal Architecture in 15 Minutes: Simplifying Complex Systems

Introduction

Julien Topçu, a tech lead at LesFurets, delivers a concise yet powerful Devoxx France 2023 quickie titled “L’architecture hexagonale en 15 minutes.” In this 17-minute talk, Topçu introduces hexagonal architecture (also known as ports and adapters) as a solution for building maintainable, testable systems. Drawing from his experience at LesFurets, a French insurance comparison platform, he provides a practical guide for developers navigating complex codebases.

Key Insights

Topçu explains hexagonal architecture as a way to decouple business logic from external systems, like databases or APIs. At LesFurets, where rapid feature delivery is critical, this approach reduced technical debt and improved testing. The architecture organizes code into:

  • Core Business Logic: Pure functions or classes that handle the application’s rules.

  • Ports: Interfaces defining interactions with the outside world.

  • Adapters: Implementations of ports, such as database connectors or HTTP clients.

Topçu shares a refactoring example, where a tightly coupled insurance quote system was restructured. By isolating business rules in a core module, the team simplified unit testing and swapped out a legacy database without changing the core logic. He highlights tools like Java’s interfaces and Spring’s dependency injection to implement ports and adapters efficiently. The talk also addresses trade-offs, such as the initial overhead of defining ports, balanced by long-term flexibility.
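
To make the core, port, and adapter roles concrete, here is a minimal sketch in plain Java. The names (`TariffRepository`, `QuoteService`, `InMemoryTariffRepository`) and the pricing rule are hypothetical and do not come from the talk or from LesFurets' codebase; they only illustrate how a core class can depend on an interface while an adapter supplies the implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Port: the core declares what it needs from the outside world.
interface TariffRepository {
    Optional<Double> baseRateFor(String vehicleType);
}

// Core business logic: depends only on the port, never on a database driver.
final class QuoteService {
    private final TariffRepository tariffs;

    QuoteService(TariffRepository tariffs) {
        this.tariffs = tariffs;
    }

    // Hypothetical rule: drivers under 25 pay a 50% surcharge on the base rate.
    double quote(String vehicleType, int driverAge) {
        double base = tariffs.baseRateFor(vehicleType)
                .orElseThrow(() -> new IllegalArgumentException("Unknown vehicle type: " + vehicleType));
        return driverAge < 25 ? base * 1.5 : base;
    }
}

// Adapter: one implementation of the port. A JDBC- or HTTP-backed adapter
// could replace it without touching QuoteService.
final class InMemoryTariffRepository implements TariffRepository {
    private final Map<String, Double> rates = new HashMap<>();

    InMemoryTariffRepository with(String vehicleType, double rate) {
        rates.put(vehicleType, rate);
        return this;
    }

    @Override
    public Optional<Double> baseRateFor(String vehicleType) {
        return Optional.ofNullable(rates.get(vehicleType));
    }
}

final class HexagonDemo {
    public static void main(String[] args) {
        QuoteService service = new QuoteService(new InMemoryTariffRepository().with("car", 400.0));
        System.out.println(service.quote("car", 30));
        System.out.println(service.quote("car", 22));
    }
}
```

In a Spring application the adapter would typically be a bean injected into the core service's constructor. The in-memory adapter shown here can also serve as a test double for unit tests of the core.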

Lessons Learned

Topçu’s insights are actionable:

  • Decouple Early: Separating business logic prevents future refactoring pain.

  • Testability First: Hexagonal architecture enables comprehensive unit tests of the core without mocking frameworks, since ports can be backed by simple in-memory fakes.

  • Start Small: Apply the pattern incrementally to avoid overwhelming teams.

These lessons resonate with developers maintaining evolving systems or adopting Domain-Driven Design. Topçu’s clear explanations make hexagonal architecture accessible even to newcomers.

Conclusion

Julien Topçu’s quickie offers a masterclass in hexagonal architecture, proving its value in real-world applications. His LesFurets example shows how to build systems that are robust yet adaptable. This talk is essential for developers aiming to create clean, maintainable codebases.

Event Sourcing Without a Framework: A Practical Approach

Introduction

In his Devoxx France 2023 quickie, “Et si on faisait du Event Sourcing sans framework ?”, Jonathan Lermitage, a developer at Worldline, challenges the reliance on complex frameworks for event sourcing. This 17-minute talk explores how his team implemented event sourcing from scratch to meet the needs of a payment processing system. Lermitage’s practical approach, grounded in Worldline’s high-stakes environment, offers developers a clear path to adopting event sourcing without overwhelming dependencies.

Key Insights

Lermitage begins by explaining event sourcing, where application state is derived by replaying a sequence of events rather than read from a stored snapshot of the current state. At Worldline, which processes millions of transactions daily, event sourcing ensures auditability and resilience. However, tools like Axon or EventStore introduced complexity that clashed with the team’s need for simplicity and control.

Instead, Lermitage’s team built a custom solution using:

  • PostgreSQL for Event Storage: Storing events as JSON objects in a single table, with indexes for performance.

  • Kafka for Event Streaming: Ensuring scalability and real-time processing.

  • Java for Business Logic: Simple classes to handle event creation, storage, and replay.

He shares a case study of tracking payment statuses, where events like PaymentInitiated or PaymentConfirmed formed an auditable trail. Lermitage emphasizes minimalism, avoiding over-engineered patterns and focusing on readable code. The talk also covers challenges, such as managing event schema evolution and ensuring idempotency during replays, solved with versioned events and unique identifiers.
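
The ideas in this paragraph (an append-only event trail, replay, versioned events, and idempotency through unique identifiers) can be sketched in a few dozen lines of plain Java. This is an illustration under stated assumptions, not Worldline's implementation: an in-memory list stands in for the PostgreSQL table, Kafka is left out entirely, and the class names, the `schemaVersion` field, and the status strings are hypothetical. Only the event names `PaymentInitiated` and `PaymentConfirmed` come from the talk summary.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.UUID;

// An immutable event. schemaVersion lets readers handle older payload layouts.
record PaymentEvent(UUID eventId, String paymentId, String type, int schemaVersion) {}

// Append-only store; an in-memory list stands in for the PostgreSQL table.
final class EventStore {
    private final List<PaymentEvent> events = new ArrayList<>();
    private final Set<UUID> seen = new HashSet<>();

    // Idempotent append: an event delivered twice is stored once.
    boolean append(PaymentEvent event) {
        if (!seen.add(event.eventId())) {
            return false;
        }
        events.add(event);
        return true;
    }

    List<PaymentEvent> eventsFor(String paymentId) {
        return events.stream().filter(e -> e.paymentId().equals(paymentId)).toList();
    }
}

final class PaymentProjection {
    // Current state is never stored; it is recomputed by replaying events in order.
    static String statusOf(List<PaymentEvent> history) {
        String status = "UNKNOWN";
        for (PaymentEvent event : history) {
            switch (event.type()) {
                case "PaymentInitiated" -> status = "INITIATED";
                case "PaymentConfirmed" -> status = "CONFIRMED";
                default -> { /* ignore event types this projection does not know */ }
            }
        }
        return status;
    }
}

final class EventSourcingDemo {
    public static void main(String[] args) {
        EventStore store = new EventStore();
        PaymentEvent initiated = new PaymentEvent(UUID.randomUUID(), "pay-1", "PaymentInitiated", 1);
        store.append(initiated);
        store.append(initiated); // duplicate delivery, ignored
        store.append(new PaymentEvent(UUID.randomUUID(), "pay-1", "PaymentConfirmed", 1));
        System.out.println(PaymentProjection.statusOf(store.eventsFor("pay-1")));
    }
}
```

In a database-backed version, a unique constraint on the event identifier column would be one way to get the same deduplication that the `seen` set provides here.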

Lessons Learned

Lermitage’s experience offers key takeaways:

  • Keep It Simple: Avoid frameworks if your use case demands lightweight solutions.

  • Prioritize Auditability: Event sourcing shines in systems requiring traceability, like payments.

  • Plan for Evolution: Design events with versioning in mind to handle future changes.

These insights are valuable for developers in regulated industries or those wary of framework lock-in. Lermitage’s focus on practicality makes event sourcing approachable for teams of varying expertise.

Conclusion

Jonathan Lermitage’s talk demystifies event sourcing by showing how to implement it without heavy frameworks. His Worldline case study proves that simplicity and control can coexist in complex systems. This quickie is a must-watch for developers seeking flexible, auditable architectures.

“A monolith, or nothing!”: Embracing the Monolith at Ornikar

Introduction

In “Un monolithe sinon rien,” presented at Devoxx France 2023, Nicolas Demengel, a tech lead at Ornikar, makes a bold case for sticking with a monolithic architecture. In this 14-minute quickie, Demengel challenges the microservices trend, arguing that a well-structured monolith can be a powerful choice for startups like Ornikar, a French online driving school platform. His talk offers a refreshing perspective for developers weighing architectural trade-offs.

Key Insights

Demengel begins by acknowledging the allure of microservices: scalability, independence, and modern appeal. However, he argues that for Ornikar, a monolith provided simplicity and speed during rapid growth. The talk details Ornikar’s architecture, where a single Ruby on Rails application handles everything from user onboarding to payment processing. This centralized approach reduced complexity for a small team, enabling faster feature delivery.

Demengel shares how Ornikar maintains its monolith’s health through rigorous testing and modular design. He highlights practices like domain-driven boundaries within the codebase to prevent spaghetti code. The talk also addresses scaling challenges, such as handling increased traffic during peak enrollment periods, which Ornikar solved with database optimizations rather than a microservices overhaul.

Lessons Learned

Demengel’s talk offers practical takeaways:

  • Simplicity First: A monolith can accelerate development for startups with limited resources.

  • Discipline Matters: Modular design and testing keep a monolith maintainable.

  • Context is Key: Architectural choices should align with team size, expertise, and business goals.

These insights are valuable for startups and small teams evaluating whether to follow industry trends or stick with simpler solutions. Demengel’s pragmatic approach encourages developers to prioritize outcomes over dogma.

Conclusion

Nicolas Demengel’s “Un monolithe sinon rien” is a thought-provoking defense of the monolith in an era dominated by microservices hype. By sharing Ornikar’s success story, Demengel inspires developers to make context-driven architectural decisions. This talk is a must-watch for teams navigating the monolith vs. microservices debate.

Navigating the Challenges of Legacy Systems

Introduction

In her Devoxx France 2023 quickie, “Votre pire cauchemar : être responsable du legacy,” Camille Pillot, a consultant at Takima, tackles the daunting reality of managing legacy code. With humor and pragmatism, Pillot shares strategies for transforming legacy systems from a developer’s nightmare into an opportunity for growth. This 14-minute talk, rooted in her experience at Takima, a consultancy specializing in software modernization, offers actionable advice for developers tasked with maintaining aging codebases.

Key Insights

Pillot opens by defining legacy code as software that’s critical yet outdated, often poorly documented and resistant to change. She draws from her work at Takima, where teams frequently inherit complex systems. The talk outlines a three-step approach to managing legacy:

  1. Assessment: Understand the system’s architecture and dependencies, using tools like code audits and dependency graphs.

  2. Stabilization: Implement tests and monitoring to prevent regressions, even if the code remains brittle.

  3. Modernization: Gradually refactor or rewrite components, prioritizing high-impact areas.

Pillot shares a case study from a Takima project, where a legacy e-commerce platform was stabilized by introducing unit tests, then partially refactored to improve performance. She emphasizes the importance of stakeholder buy-in, as modernization efforts often require time and budget. The talk also addresses the emotional toll of legacy work, encouraging developers to find value in incremental improvements.
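
One common way to carry out the stabilization step is a characterization test, which records what the code does today before anyone changes it. The sketch below is a generic illustration of that technique, not code from the Takima project; `LegacyShipping` and its pricing rules are invented for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical legacy routine whose rules nobody fully remembers.
final class LegacyShipping {
    static int costInCents(int weightGrams, boolean express) {
        int cost = 499;
        if (weightGrams > 2000) {
            cost += (weightGrams - 2000) / 500 * 100;
        }
        if (express) {
            cost = cost * 2 - 1; // odd, but production may depend on it
        }
        return cost;
    }
}

// Characterization check: pins what the code does today, not what it should do.
final class ShippingCharacterization {
    static Map<String, Integer> snapshot() {
        Map<String, Integer> observed = new LinkedHashMap<>();
        observed.put("light/standard", LegacyShipping.costInCents(500, false));
        observed.put("heavy/standard", LegacyShipping.costInCents(3200, false));
        observed.put("heavy/express", LegacyShipping.costInCents(3200, true));
        return observed;
    }
}

final class CharacterizationDemo {
    public static void main(String[] args) {
        // Values recorded from a first run against the untouched code.
        Map<String, Integer> recorded = Map.of(
                "light/standard", 499,
                "heavy/standard", 699,
                "heavy/express", 1397);
        boolean unchanged = recorded.equals(ShippingCharacterization.snapshot());
        System.out.println(unchanged ? "behaviour unchanged" : "REGRESSION");
    }
}
```

Once such a snapshot is in place, a refactoring that changes any recorded value is flagged immediately, even when nobody can say whether the old value was correct.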

Lessons Learned

Pillot’s insights are a lifeline for developers facing legacy challenges:

  • Start Small: Small, targeted improvements build momentum and trust.

  • Communicate Value: Articulate the business benefits of modernization to secure resources.

  • Embrace Patience: Legacy work is a marathon, not a sprint, requiring resilience.

These strategies are particularly relevant for consultancy roles, where developers must balance technical debt with client expectations. Pillot’s empathetic approach makes the talk relatable and inspiring.

Conclusion

Camille Pillot’s talk transforms the fear of legacy code into a call to action. By offering a clear framework and real-world examples, she empowers developers to tackle legacy systems with confidence. This quickie is essential viewing for anyone navigating the complexities of maintaining critical but outdated software.

“All Architects!”: Empowering Every Developer as an Architect

Introduction

In the Devoxx France 2023 quickie “Tous architectes !”, Simon Maurin, Lead Architect at Leboncoin, delivers a compelling case for democratizing software architecture. Drawing from his decade-long experience at France’s leading classified ads platform, Maurin argues that architecture isn’t the sole domain of designated architects but a shared responsibility across development teams. This 15-minute talk explores how Leboncoin evolved its architectural practices to scale with growth, offering practical insights for developers and tech leads navigating large organizations.

Key Insights

Maurin begins by reflecting on Leboncoin’s early days, where small teams naturally collaborated on architecture through organic discussions. As the company grew to serve 30 million monthly users, this informal approach became unsustainable. The introduction of formal architects risked creating bottlenecks and disconnects. Maurin highlights the pivotal shift to empowering all developers as architects, fostering a culture where everyone contributes to design decisions. This approach aligns with Domain-Driven Design principles, which Maurin champions as a tool for maintaining clarity in complex systems.

A key mechanism introduced at Leboncoin was Architecture Decision Records (ADRs). These lightweight documents capture the rationale behind architectural choices, ensuring transparency and continuity. Maurin shares a case study where ADRs helped Leboncoin transition from a monolith to microservices, reducing coupling and enabling faster iterations. The talk also touches on data engineering challenges, such as scaling to handle 10 million daily events, underscoring the need for shared ownership in high-traffic environments.

Lessons Learned

Maurin’s talk offers several takeaways for developers:

  • Shared Responsibility: Architecture thrives when all team members, not just architects, engage in decision-making.

  • ADRs as a Tool: Documenting decisions prevents knowledge silos and aids onboarding.

  • Cultural Shift: Scaling architecture requires fostering a mindset where developers feel empowered to challenge and contribute.

These lessons are particularly relevant for growing tech organizations facing the tension between agility and structure. Maurin’s emphasis on collaboration over hierarchy resonates with modern software engineering trends.

Conclusion

Simon Maurin’s “Tous architectes !” is a rallying cry for developers to embrace their role in shaping software architecture. By sharing Leboncoin’s journey, Maurin provides a roadmap for balancing freedom and formality in large teams. This talk is a must-watch for developers and architects seeking to foster inclusive, scalable practices in their organizations.

[PyConUS 2023] Fixing Legacy Code, One Pull Request at a Time

At PyCon US 2023, Guillaume Dequenne from Sonar presented a compelling workshop on modernizing legacy codebases through incremental improvements. Sponsored by Sonar, this session focused on integrating code quality tools into development workflows to enhance maintainability and sustainability, using a Flask application as a practical example. Guillaume’s approach, dubbed “Clean as You Code,” offers a scalable strategy for tackling technical debt without overwhelming developers.

The Legacy Code Conundrum

Legacy codebases often pose significant challenges, accumulating technical debt that hinders development efficiency and developer morale. Guillaume illustrated this with a vivid metaphor: analyzing a legacy project for the first time can feel like drowning in a sea of issues. Traditional approaches to fixing all issues at once are unscalable, risking functional regressions and requiring substantial resources. Instead, Sonar advocates for a pragmatic methodology that focuses on ensuring new code adheres to high-quality standards, gradually reducing technical debt over time.

Clean as You Code Methodology

The “Clean as You Code” approach hinges on two principles: ownership of new code and incremental improvement. Guillaume explained that developers naturally understand and take responsibility for code they write today, making it easier to enforce quality standards. By ensuring that each pull request introduces clean code, teams can progressively refurbish their codebase. Over time, as new code replaces outdated sections, the overall quality improves without requiring a massive upfront investment. This method aligns with continuous integration and delivery (CI/CD) practices, allowing teams to maintain high standards while delivering features systematically.

Leveraging SonarCloud for Quality Assurance

Guillaume demonstrated the practical application of this methodology using SonarCloud, a cloud-based static analysis tool. By integrating SonarCloud into a Flask application’s CI/CD pipeline, developers can automatically analyze pull requests for issues like bugs, security vulnerabilities, and code smells. He showcased how SonarCloud’s quality gates enforce standards on new code, ensuring that only clean contributions are merged. For instance, Guillaume highlighted a detected SQL injection vulnerability due to unsanitized user input, emphasizing the tool’s ability to provide contextual data flow analysis to pinpoint and resolve issues efficiently.

Enhancing Developer Workflow with SonarLint

To catch issues early, Guillaume introduced SonarLint, an IDE extension for PyCharm and VS Code that performs real-time static analysis. This tool allows developers to address issues before committing code, streamlining the review process. He demonstrated how SonarLint highlights issues such as exceptions that are created but never raised, and offers quick fixes, enhancing productivity. Additionally, the connected mode between SonarLint and SonarCloud synchronizes issue statuses, ensuring consistency across development and review stages. This integration empowers developers to maintain high-quality code from the outset, reducing the burden of post-commit fixes.

Sustaining Codebase Health

The workshop underscored the long-term benefits of the “Clean as You Code” approach, illustrated by a real-world project where issue counts decreased over time as new rules were introduced. By focusing on new code and leveraging tools like SonarCloud and SonarLint, teams can achieve sustainable codebases that are maintainable, reliable, and secure. Guillaume’s presentation offered a roadmap for developers to modernize legacy systems incrementally, fostering a culture of continuous improvement.

Hashtags: #LegacyCode #CleanCode #StaticAnalysis #SonarCloud #SonarLint #Python #Flask #GuillaumeDequenne #PyConUS2023

[Spring I/O 2023] Do You Really Need Hibernate?

Simon Martinelli’s thought-provoking session at Spring I/O 2023 challenges the default adoption of Hibernate in Java applications. With decades of experience, Simon advocates for jOOQ as a performant, SQL-centric alternative for database-centric projects. Using a track-and-field event management system as a case study, he illustrates how jOOQ simplifies data access, avoids common ORM pitfalls, and complements Hibernate when needed. This presentation is a masterclass in rethinking persistence strategies for modern Java development.

Questioning the ORM Paradigm

Simon begins by questioning the reflexive use of Hibernate and JPA in Java projects. While powerful for complex domain models, ORMs introduce overhead—such as dirty checking or persistence by reachability—that may not suit all applications. For CRUD-heavy systems, like his 25-year-old event management application, a simpler approach is often sufficient. By focusing on database tables rather than object graphs, developers can streamline data operations, avoiding the complexity of managing entity state transitions.

jOOQ: A Database-First Approach

jOOQ’s database-first philosophy is central to Simon’s argument. By generating type-safe Java code from database schemas, jOOQ enables developers to write SQL-like queries using a fluent DSL. This approach, as Simon demonstrates, moves many errors to compile time: a renamed column or a mismatched type breaks the build instead of surfacing at runtime from a hand-written SQL string. The tool supports a wide range of databases, including legacy systems with stored procedures, making it versatile for both modern and enterprise environments. Integration with Flyway and Testcontainers further simplifies schema migrations and code generation.

Efficient Data Retrieval with Nested Structures

A highlight of Simon’s talk is jOOQ’s ability to handle nested data structures efficiently. Using the event management system’s ranking list—a tree of competitions, categories, athletes, and results—he showcases jOOQ’s MULTISET feature. This leverages JSON functionality in modern databases to fetch hierarchical data in a single SQL statement, avoiding the redundancy of JPA’s join fetches. This capability is particularly valuable for REST APIs and reporting, where nested data is common, eliminating the need for DTO mapping.

Combining jOOQ and Hibernate for Flexibility

Rather than advocating for jOOQ as a complete replacement, Simon proposes a hybrid approach. jOOQ excels in querying, bulk operations, and legacy database integration, while Hibernate shines in entity state management and cascading operations. By combining both in a single application, developers can leverage their respective strengths. Simon warns against using JPA for raw SQL, as it lacks jOOQ’s type safety, reinforcing the value of choosing the right tool for each task.

Practical Insights and Tooling

Simon’s demo, backed by a GitHub repository, illustrates jOOQ’s integration with Maven, Testcontainers, and a new Testcontainers Flyway plugin. He highlights practical considerations, such as whether to version generated code and jOOQ’s licensing model for commercial databases. The talk also addresses limitations, like MULTISET’s incompatibility with MariaDB, offering candid advice for database selection. These insights ground the presentation in real-world applicability, making it accessible to developers of varying experience levels.

A Call to Rethink Persistence

Simon’s presentation is a compelling call to reassess persistence strategies. By showcasing jOOQ’s performance and flexibility, he encourages developers to align their tools with application needs. His track-and-field application, evolved over decades, serves as a testament to the enduring value of SQL-driven development. For Java developers seeking to optimize data access, this talk offers a clear, actionable path forward, blending modern tooling with pragmatic wisdom.

Hashtags: #SpringIO2023 #jOOQ #Hibernate #Java #SQL #Database #SimonMartinelli #Testcontainers #Flyway

[Spring I/O 2023] Multitenant Mystery: Only Rockers in the Building by Thomas Vitale

In the vibrant atmosphere of Spring I/O 2023, Thomas Vitale, a seasoned software engineer and cloud architect at Systematic in Denmark, captivated the audience with his exploration of multitenant architectures in Spring Boot applications. Through a compelling narrative involving a stolen guitar in a building inhabited by rock bands, Thomas unraveled the complexities of ensuring data isolation, security, and observability in multi-tenant systems. His presentation, rich with practical insights and live coding, offered a masterclass in building robust SaaS solutions using Java, Spring, and related technologies.

Understanding Multitenancy

Thomas began by defining multitenancy as an architecture where a single application instance serves multiple clients, or tenants, simultaneously. This approach, prevalent in software-as-a-service (SaaS) solutions, optimizes operational costs by sharing infrastructure across customers. He illustrated this with an analogy of a building housing rock bands, where each band (tenant) shares common facilities like staircases but maintains private storage for their instruments. This setup underscores the need for meticulous data isolation to prevent cross-tenant data leakage, a critical concern in industries like healthcare where regulatory compliance is paramount.

Implementing Tenant Resolution

A cornerstone of Thomas’s approach was establishing a tenant context within a Spring Boot application. He demonstrated how to resolve tenant information from HTTP requests using a custom header, X-Tenant-ID. By implementing a tenant resolver and interceptor, Thomas ensured that each request’s tenant identifier is stored in a thread-local context, accessible throughout the request lifecycle. His live coding showcased the integration of Spring MVC’s HandlerInterceptor to seamlessly extract and manage tenant data, setting the stage for further customization. This mechanism allows developers to process requests in a tenant-specific manner, enhancing the application’s flexibility.
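
The resolver-plus-context mechanism can be sketched without any framework. The code below is a simplified stand-in, not Thomas's demo code: a plain `Map` plays the role of the HTTP request, `TenantInterceptorSketch` plays the role that a Spring MVC `HandlerInterceptor` has in the talk, and all class names are hypothetical. Only the `X-Tenant-ID` header name comes from the presentation.

```java
import java.util.Map;
import java.util.Optional;

// Holds the tenant for the current thread for the duration of one request.
final class TenantContext {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    static void set(String tenantId) {
        CURRENT.set(tenantId);
    }

    static Optional<String> get() {
        return Optional.ofNullable(CURRENT.get());
    }

    // Must run when the request ends: pooled threads are reused across requests.
    static void clear() {
        CURRENT.remove();
    }
}

// Resolves the tenant from request headers. A plain Map stands in for the
// HTTP request, so unlike real HTTP headers the lookup here is case-sensitive.
final class HeaderTenantResolver {
    static final String HEADER = "X-Tenant-ID";

    static Optional<String> resolve(Map<String, String> headers) {
        return Optional.ofNullable(headers.get(HEADER)).filter(value -> !value.isBlank());
    }
}

// Plays the role of an interceptor: set the context before handling, clear it afterwards.
final class TenantInterceptorSketch {
    static String handle(Map<String, String> headers) {
        String tenant = HeaderTenantResolver.resolve(headers)
                .orElseThrow(() -> new IllegalArgumentException("Missing " + HeaderTenantResolver.HEADER));
        TenantContext.set(tenant);
        try {
            // Code called from here can read the tenant without it being passed explicitly.
            return "Handled request for tenant " + TenantContext.get().orElseThrow();
        } finally {
            TenantContext.clear();
        }
    }
}

final class TenantResolutionDemo {
    public static void main(String[] args) {
        System.out.println(TenantInterceptorSketch.handle(Map.of("X-Tenant-ID", "dukes")));
    }
}
```

The `finally` block matters because servlet containers reuse threads. Without it, a request that carries no tenant header could inherit the tenant of the previous request handled on the same thread.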

Data Isolation Strategies

Data isolation emerged as the most critical aspect of multitenancy. Thomas outlined three strategies: discriminator-based partitioning, separate schemas, and separate databases. He focused on the separate schema approach, leveraging Hibernate and Spring Data JPA to manage tenant-specific schemas within a single PostgreSQL database. By configuring Hibernate’s CurrentTenantIdentifierResolver and MultiTenantConnectionProvider, Thomas ensured that database connections dynamically switch schemas based on the tenant context. His demo highlighted the effectiveness of this strategy, showing how instruments stored for one tenant (e.g., “Dukes”) remained isolated from another (“Beans”), thus safeguarding data integrity.

Security and Observability

Security and observability were pivotal in Thomas’s narrative. He addressed the challenge of dynamic authentication by integrating Keycloak, allowing tenant-specific identity providers to be resolved at runtime. This approach avoids hardcoding configurations, enabling seamless onboarding of new tenants. For observability, Thomas emphasized the importance of tenant-specific logging, metrics, and tracing. Using Micrometer and OpenTelemetry, he enriched logs and traces with tenant identifiers, facilitating debugging and monitoring. A critical lesson emerged during his demo: a caching oversight led to data leakage across tenants, underscoring the need for tenant-specific cache keys. Thomas resolved this by implementing a custom key generator, restoring data isolation.
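
The caching pitfall is easy to reproduce and to fix in miniature. The following sketch is not the custom key generator from the demo (which plugs into Spring's cache abstraction); it is a framework-free illustration with hypothetical names, showing why the tenant identifier has to be part of the cache key.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Composite key: the same logical key under two tenants never collides.
record TenantCacheKey(String tenantId, String key) {}

final class TenantAwareCache {
    private final Map<TenantCacheKey, Object> entries = new ConcurrentHashMap<>();

    // Loads the value once per (tenant, key) pair and reuses it afterwards.
    @SuppressWarnings("unchecked")
    <T> T get(String tenantId, String key, Supplier<T> loader) {
        return (T) entries.computeIfAbsent(new TenantCacheKey(tenantId, key), k -> loader.get());
    }
}

final class TenantCacheDemo {
    public static void main(String[] args) {
        TenantAwareCache cache = new TenantAwareCache();
        List<String> dukes = cache.get("dukes", "instruments", () -> List.of("guitar"));
        // With a key of just "instruments", this call would return the Dukes' list.
        List<String> beans = cache.get("beans", "instruments", () -> List.of("drums"));
        System.out.println(dukes + " " + beans);
    }
}
```

Because `TenantCacheKey` is a record, `equals` and `hashCode` cover both fields, so two tenants asking for the same logical key get separate cache entries.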

Solving the Mystery

The stolen guitar mystery served as a metaphor for real-world multitenancy pitfalls. By tracing the issue to a caching flaw, Thomas illustrated how seemingly minor oversights can have significant consequences. His resolution—ensuring tenant-specific caching—reinforced the importance of vigilance in multi-tenant systems. The presentation concluded with a call to prioritize data isolation, offering attendees a blueprint for building scalable, secure SaaS applications with Spring Boot.

Hashtags: #Multitenancy #SpringBoot #Java #SaaS #DataIsolation #Security #Observability #ThomasVitale #Systematic #Keycloak #Hibernate #SpringIO2023

[Spring I/O 2023] Managing Spring Boot Application Secrets: Badr Nass Lahsen

In a compelling session at Spring I/O 2023, Badr Nass Lahsen, a DevSecOps expert at CyberArk, tackled the critical challenge of securing secrets in Spring Boot applications. With the rise of cloud-native architectures and Kubernetes, secrets like database credentials or API keys have become prime targets for attackers. Badr’s talk, enriched with demos and real-world insights, introduced CyberArk’s Conjur solution and various patterns to eliminate hard-coded credentials, enhance authentication, and streamline secrets management, fostering collaboration between developers and security teams.

The Growing Threat to Application Secrets

Badr opened with alarming statistics: in 2021, software supply chain attacks surged by 650%, with 71% of organizations experiencing such breaches. He cited the 2022 Uber attack, where a PowerShell script with hard-coded credentials enabled attackers to escalate privileges across AWS, Google Workspace, and other systems. Using the SLSA threat model (Supply-chain Levels for Software Artifacts, pronounced “salsa”), Badr highlighted vulnerabilities like compromised source code (e.g., Okta’s leaked access token) and build processes (e.g., SolarWinds). These examples underscored the need to eliminate hard-coded secrets, which are difficult to rotate, track, or audit, and often exposed inadvertently. Badr advocated for “shifting security left,” integrating security from the design phase to mitigate risks early.

Introducing Application Identity Security

Badr introduced the concept of non-human identities, noting that machine identities (e.g., SSH keys, database credentials) outnumber human identities 45 to 1 in enterprises. These secrets, if compromised, grant attackers access to critical resources. To address this, Badr presented CyberArk’s Conjur, an open-source secrets management solution that authenticates workloads, enforces policies, and rotates credentials. He emphasized the “secret zero problem”—the initial secret needed at application startup—and proposed authenticators like JWT or certificate-based authentication to solve it. Conjur’s attribute-based access control (ABAC) ensures least privilege, enabling scalable, auditable workflows that balance developer autonomy and security requirements.

Patterns for Securing Spring Boot Applications

Through a series of demos using the Spring PetClinic application, Badr showcased five patterns for secrets management in Kubernetes. The API pattern integrates Conjur’s SDK, using Spring’s @Value annotations to inject secrets without changing developer workflows. The Secrets Provider pattern updates Kubernetes secrets from Conjur, minimizing code changes but offering less security. The Push-to-File pattern stores secrets in shared memory, updating application YAML files securely. The Summon pattern uses a process wrapper to inject secrets as environment variables, ideal for apps relying on such variables. Finally, the Secretless Broker pattern proxies connections to resources like MySQL, hiding secrets entirely from applications and developers. Badr demonstrated credential rotation with zero downtime using Spring Cloud Kubernetes, ensuring resilience for critical applications.
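
From the application's point of view, the environment-variable approach used by the Summon pattern needs no vendor SDK at all: the process wrapper sets the variables and the application simply reads them. The sketch below shows only that application side, under stated assumptions. `EnvSecrets` and the variable name `DB_PASSWORD` are hypothetical, and nothing here calls Conjur or Summon.

```java
import java.util.Map;

// Reads secrets that a process wrapper has placed in the environment.
final class EnvSecrets {
    private final Map<String, String> env;

    // Accepting a Map keeps the class testable without touching the real environment.
    EnvSecrets(Map<String, String> env) {
        this.env = env;
    }

    // Production entry point: reads the environment of the current process.
    static EnvSecrets fromProcess() {
        return new EnvSecrets(System.getenv());
    }

    // Fails fast at startup instead of failing later on the first database call.
    String require(String name) {
        String value = env.get(name);
        if (value == null || value.isEmpty()) {
            // Name the variable, but never log or echo a secret's value.
            throw new IllegalStateException("Required secret not set: " + name);
        }
        return value;
    }
}

final class EnvSecretsDemo {
    public static void main(String[] args) {
        // DB_PASSWORD is a hypothetical variable name used for illustration.
        EnvSecrets secrets = new EnvSecrets(Map.of("DB_PASSWORD", "example-only"));
        System.out.println("Password length: " + secrets.require("DB_PASSWORD").length());
    }
}
```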

Enhancing Kubernetes Security and Auditing

Badr cautioned that Kubernetes secrets, being base64-encoded and unencrypted by default, are insecure without etcd encryption. He introduced KubiScan, an open-source tool to identify risky roles and permissions in clusters. His demos highlighted Conjur’s auditing capabilities, logging access to secrets and enabling security teams to track usage. By centralizing secrets management, Conjur eliminates “security islands” created by disparate tools like AWS Secrets Manager or Azure Key Vault, ensuring compliance and visibility. Badr stressed the need for a federated governance model to manage secrets across diverse technologies, empowering developers while maintaining robust security controls.
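
The point about base64 is worth seeing in code: it is an encoding, not encryption, so anyone who can read the stored value can recover the secret with no key. A minimal demonstration using only the JDK follows; the sample value `s3cr3t` is made up for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class Base64IsNotEncryption {
    // Encoding needs no key; it only changes the representation of the bytes.
    static String encode(String plain) {
        return Base64.getEncoder().encodeToString(plain.getBytes(StandardCharsets.UTF_8));
    }

    // Decoding needs no key either, which is why this offers no confidentiality.
    static String decode(String encoded) {
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String stored = encode("s3cr3t");
        System.out.println(stored);         // the form in which a value appears in a Secret manifest
        System.out.println(decode(stored)); // recovered without any key
    }
}
```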

Hashtags: #SecretsManagement #SpringIO2023 #SpringBoot #CyberArk #BadrNassLahsen

[DevoxxPL2022] Accelerating Big Data: Modern Trends Enable Product Analytics • Boris Trofimov

Boris Trofimov, a big data expert from Sigma Software, delivered an insightful presentation at Devoxx Poland 2022, exploring modern trends in big data that enhance product analytics. With experience building high-load systems like the AOL data platform for Verizon Media, Boris provided a comprehensive overview of how data platforms are evolving. His talk covered architectural innovations, data governance, and the shift toward serverless and ELT (Extract, Load, Transform) paradigms, offering actionable insights for developers navigating the complexities of big data.

The Evolving Role of Data Platforms

Boris began by demystifying big data, often misconstrued as a magical solution for business success. He clarified that big data resides within data platforms, which handle ingestion, processing, and analytics. These platforms typically include data sources, ETL (Extract, Transform, Load) pipelines, data lakes, and data warehouses. Boris highlighted the growing visibility of big data beyond its traditional boundaries, with data engineers playing increasingly critical roles. He noted the rise of cross-functional teams, inspired by Martin Fowler’s ideas, where subdomains drive team composition, fostering collaboration between data and backend engineers.

The convergence of big data and backend practices was a key theme. Boris pointed to technologies like Apache Kafka and Spark, which are now shared across both domains, enabling mutual learning. He emphasized that modern data platforms must balance complexity with efficiency, requiring specialized expertise to avoid pitfalls like project failures due to inadequate practices.

Architectural Innovations: From Lambda to Delta

Boris delved into big data architectures, starting with the Lambda architecture, which separates data processing into speed (real-time) and batch layers for high availability. While effective, Lambda’s complexity increases development and maintenance costs. As an alternative, he introduced the Kappa architecture, which simplifies processing by using a single streaming layer, reducing latency but potentially sacrificing availability. Boris then highlighted the emerging Delta architecture, which leverages data lakehouses—hybrid systems combining data lakes and warehouses. Technologies like Snowflake and Databricks support Delta, minimizing data hops and enabling both batch and streaming workloads with a single storage layer.
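
The structural difference between Lambda and Kappa can be reduced to a toy example. In the sketch below, which is an illustration with hypothetical class names and not code from the talk, plain collections stand in for the batch store, the real-time store, and the event log. Lambda answers a query by merging two separately computed views, while Kappa has a single code path that is rerun over the log.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class LambdaServingLayer {
    // Batch view: recomputed periodically from the full history.
    // Speed view: incremental counts for events that arrived since the last batch run.
    static Map<String, Long> merge(Map<String, Long> batchView, Map<String, Long> speedView) {
        Map<String, Long> merged = new HashMap<>(batchView);
        speedView.forEach((key, count) -> merged.merge(key, count, Long::sum));
        return merged;
    }
}

final class KappaProcessor {
    // Kappa: one code path; a full recompute is just a replay of the log.
    static Map<String, Long> countByKey(List<String> eventLog) {
        Map<String, Long> counts = new HashMap<>();
        for (String key : eventLog) {
            counts.merge(key, 1L, Long::sum);
        }
        return counts;
    }
}

final class ArchitectureDemo {
    public static void main(String[] args) {
        System.out.println(LambdaServingLayer.merge(
                Map.of("signup", 100L), Map.of("signup", 3L, "login", 7L)));
        System.out.println(KappaProcessor.countByKey(List.of("signup", "login", "signup")));
    }
}
```

The maintenance cost attributed to Lambda shows up even at this scale: the counting logic exists twice, once in whatever produced the batch view and once in whatever produced the speed view, and the two must stay consistent.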

The Delta architecture’s rise reflects the growing popularity of data lakehouses, which Boris praised for their ability to handle raw, processed, and aggregated data efficiently. By reducing technological complexity, Delta enables faster development and lower maintenance, making it a compelling choice for modern data platforms.

Data Mesh and Governance

Boris introduced data mesh as a response to monolithic data architectures, drawing parallels with domain-driven design. Data mesh advocates for breaking down data platforms into bounded contexts, each owned by a dedicated team responsible for its pipelines and decisions. This approach avoids the pitfalls of monolithic pipelines, such as chaotic dependencies and scalability issues. Boris outlined four “temptations” to avoid: building monolithic pipelines, combining all pipelines into one application, creating chaotic pipeline networks, and mixing domains in data tables. Data mesh, he argued, promotes modularity and ownership, treating data as a product.

Data governance, or “data excellence,” was another critical focus. Boris stressed the importance of practices like data monitoring, quality validation, and retention policies. He advocated for a proactive approach, where engineers address these concerns early to ensure platform reliability and cost-efficiency. By treating data governance as a checklist, teams can mitigate risks and enhance platform maturity.

Serverless and ELT: Simplifying Big Data

Boris highlighted the shift toward serverless technologies and ELT paradigms. Serverless solutions, available across transformation, storage, and analytics tiers, reduce infrastructure management burdens, allowing faster time-to-market. He cited AWS and other cloud providers as enablers, noting that while not always cost-effective, serverless minimizes maintenance efforts. Similarly, ELT—where transformation occurs after loading data into a warehouse—leverages modern databases like Snowflake and BigQuery. Unlike traditional ETL, ELT reduces latency and complexity by using database capabilities for transformations, making it ideal for early-stage projects.

Boris also noted the resurgence of SQL as a domain-specific language across big data tiers, from transformation to governance. By building frameworks that express business logic in SQL, developers can accelerate feature delivery, despite SQL’s perceived limitations. He emphasized that well-designed SQL queries can be powerful, provided engineers avoid poorly structured code.

Productizing Big Data and Business Intelligence

The final trend Boris explored was the productization of big data solutions. He likened this to Intel’s microprocessor revolution, where standardized components accelerated hardware development. Companies like Absorber offer “data platform as a service,” enabling rapid construction of data pipelines through drag-and-drop interfaces. While limited for complex use cases, such solutions cater to organizations seeking quick deployment. Boris also discussed the rise of serverless business intelligence (BI) tools, which support ELT and allow cross-cloud data queries. These tools, like Mode and Tableau, enable self-service analytics, reducing the need for custom platforms in early stages.
