Posts Tagged ‘Kubernetes’
Understanding Kubernetes for Docker and Docker Compose Users
TL;DR
Kubernetes may look like an overly complicated version of Docker Compose, but it operates on a different level entirely. Where Compose excels at quick, local orchestration of containers, Kubernetes is a robust, distributed platform designed for automated scaling, fault-tolerance, and production-grade deployments across multi-node clusters. This article provides a comprehensive comparison and shows how ArgoCD enhances GitOps-based Kubernetes workflows.
Docker Compose vs Kubernetes – Similarities and First Impressions
At a high level, Docker Compose and Kubernetes share similar concepts: containers, services, configuration, and volumes. This often leads to the assumption that Kubernetes is just a verbose, harder-to-write Compose replacement. However, Kubernetes is more than a runtime. It’s a control plane, a state manager, and a policy enforcer.
| Concept | Docker Compose | Kubernetes |
|---|---|---|
| Service definition | `docker-compose.yml` | `Deployment`, `Service`, etc. YAML manifests |
| Networking | Shared bridge network, service discovery by name | DNS, internal IPs, ClusterIP, NodePort, Ingress |
| Volume management | `volumes:` | `PersistentVolume`, `PersistentVolumeClaim`, `StorageClass` |
| Secrets and configs | `.env`, `environment:` | `ConfigMap`, `Secret`, `ServiceAccount` |
| Dependency management | `depends_on` | `initContainers`, `readinessProbe`, `livenessProbe` |
| Scaling | Manual (scale flag or duplicate services) | Declarative (`replicas`), automatic via HPA |
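To make the last row concrete, here is a minimal sketch of a HorizontalPodAutoscaler; the `web` Deployment name and the 70% CPU target are illustrative assumptions, not part of the stacks discussed below.

```yaml
# Minimal HPA sketch: keep average CPU around 70% across
# 2-10 replicas of a hypothetical "web" Deployment.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```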
Real-Life Use Cases – Docker Compose vs Kubernetes Examples
Tomcat + Oracle + MongoDB + NGINX Stack
Docker Compose
```yaml
version: '3'
services:
  nginx:
    image: nginx:latest
    ports:
      - "80:80"
    depends_on:
      - tomcat
  tomcat:
    image: tomcat:9
    ports:
      - "8080:8080"
    environment:
      DB_URL: jdbc:oracle:thin:@oracle:1521:orcl
  oracle:
    image: oracle/database:19.3.0-ee
    environment:
      ORACLE_PWD: secretpass
    volumes:
      - oracle-data:/opt/oracle/oradata
  mongo:
    image: mongo:5
    volumes:
      - mongo-data:/data/db
volumes:
  oracle-data:
  mongo-data:
```
Kubernetes Equivalent
- Each service becomes a `Deployment` and a `Service`.
- Environment variables and passwords are stored in `Secret` objects.
- Volumes are defined with a `PersistentVolumeClaim` and a `StorageClass`.
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: oracle-secret
type: Opaque
data:
  ORACLE_PWD: c2VjcmV0cGFzcw==
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tomcat
spec:
  replicas: 2
  selector:
    matchLabels:
      app: tomcat
  template:
    metadata:
      labels:
        app: tomcat
    spec:
      containers:
        - name: tomcat
          image: tomcat:9
          ports:
            - containerPort: 8080
          env:
            - name: DB_URL
              value: jdbc:oracle:thin:@oracle:1521:orcl
```
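The bullet list above also calls for a `Service` and a `PersistentVolumeClaim`, which the manifest omits. A minimal sketch for the Oracle service follows; the `standard` StorageClass name and the 50Gi size are assumptions that depend on the cluster.

```yaml
# Service exposing the Oracle pod inside the cluster, plus a PVC
# replacing the Compose named volume. StorageClass name and size
# are cluster-specific assumptions.
apiVersion: v1
kind: Service
metadata:
  name: oracle
spec:
  selector:
    app: oracle
  ports:
    - port: 1521
      targetPort: 1521
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: oracle-data
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: standard
  resources:
    requests:
      storage: 50Gi
```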
NodeJS + Express + MySQL + NGINX
Docker Compose
```yaml
services:
  mysql:
    image: mysql:8
    environment:
      MYSQL_ROOT_PASSWORD: rootpass
    volumes:
      - mysql-data:/var/lib/mysql
  api:
    build: ./api
    environment:
      DB_USER: root
      DB_PASS: rootpass
      DB_HOST: mysql
  nginx:
    image: nginx:latest
    ports:
      - "80:80"
volumes:
  mysql-data:
```
Kubernetes Equivalent
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: mysql-secret
type: Opaque
data:
  MYSQL_ROOT_PASSWORD: cm9vdHBhc3M=   # base64 for "rootpass"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: node-app:latest
          env:
            - name: DB_PASS
              valueFrom:
                secretKeyRef:
                  name: mysql-secret
                  key: MYSQL_ROOT_PASSWORD
```
⚙️ Docker Compose vs kubectl – Command Mapping
| Task | Docker Compose | Kubernetes |
|---|---|---|
| Start services | `docker-compose up -d` | `kubectl apply -f .` |
| Stop/cleanup | `docker-compose down` | `kubectl delete -f .` |
| View logs | `docker-compose logs -f` | `kubectl logs -f pod-name` |
| Scale a service | `docker-compose up --scale web=3` | `kubectl scale deployment web --replicas=3` |
| Shell into container | `docker-compose exec app sh` | `kubectl exec -it pod-name -- /bin/sh` |
ArgoCD – GitOps Made Practical
ArgoCD is a Kubernetes-native continuous deployment tool. It uses Git as the single source of truth, enabling declarative infrastructure and GitOps workflows.
✨ Key Features
- Declarative sync of Git and cluster state
- Drift detection and automatic repair
- Multi-environment and multi-namespace support
- CLI and Web UI available
Example ArgoCD Commands
```sh
argocd login argocd.myorg.com

argocd app create my-app \
  --repo https://github.com/org/app.git \
  --path k8s \
  --dest-server https://kubernetes.default.svc \
  --dest-namespace production

argocd app sync my-app
argocd app get my-app
argocd app diff my-app
```
Sample ArgoCD Application Manifest
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-api
spec:
  destination:
    namespace: default
    server: https://kubernetes.default.svc
  project: default
  source:
    path: k8s/app
    repoURL: https://github.com/org/api.git
    targetRevision: HEAD
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```
✅ Conclusion
Docker Compose is perfect for prototyping and local dev. Kubernetes is built for cloud-native workloads, distributed systems, and high availability. ArgoCD makes declarative, Git-based continuous deployment simple, scalable, and observable.
[DevoxxUK2025] Platform Engineering: Shaping the Future of Software Delivery
Paula Kennedy, co-founder and COO of Cintaso, delivered a compelling lightning talk at DevoxxUK2025, tracing the evolution of platform engineering and its impact on software delivery. Drawing from over a decade of experience, Paula explored how platforms have shifted from siloed operations to force multipliers for developer productivity. Referencing the journey from DevOps to PaaS to Kubernetes, she highlighted current trends like inner sourcing and offered practical strategies for assessing platform maturity. Her narrative, infused with lessons from the past and present, underscored the importance of a user-centered approach to avoid the pitfalls of hype and ensure platforms drive innovation.
The Evolution of Platforms
Paula began by framing platforms as foundations that elevate development, drawing on Gregor Hohpe’s analogy of a Volkswagen chassis enabling diverse car models. She recounted her career, starting in 2002 at Acturus, a SaaS provider with rigid silos between developers and operations. The DevOps movement, sparked in 2009, sought to bridge these divides, but its “you build it, you run it” mantra often overwhelmed teams. The rise of Platform-as-a-Service (PaaS), exemplified by Cloud Foundry, simplified infrastructure management, allowing developers to focus on code. However, Paula noted, the complexity of Kubernetes led organizations to build custom internal platforms, sometimes losing sight of the original value proposition.
Current Trends and Challenges
Today, platform engineering is at a crossroads, with Gartner predicting that by 2026, 80% of large organizations will have dedicated teams. Paula highlighted principles like self-service APIs, internal developer portals (e.g., Backstage), and golden paths that guide developers to best practices. She emphasized treating platforms as products, applying product management practices to align with user needs. However, the 2024 DORA report reveals challenges: while platforms boost organizational performance, they often fail to improve software reliability or delivery throughput. Paula attributed this to automation complacency and “platform complacency,” where trust in internal platforms leads to reduced scrutiny, urging teams to prioritize observability and guardrails.
[DevoxxGR2025] Optimized Kubernetes Scaling with Karpenter
Alex König, an AWS expert, delivered a 39-minute talk at Devoxx Greece 2025, exploring how Karpenter enhances Kubernetes cluster autoscaling for speed, cost-efficiency, and availability.
Karpenter’s Dynamic Autoscaling
König introduced Karpenter as an open-source, Kubernetes-native autoscaling solution, contrasting it with the traditional Cluster Autoscaler. Unlike the latter, which relies on uniform node groups (e.g., nodes with four CPUs and 16GB RAM), Karpenter uses the EC2 Fleet API to dynamically provision nodes tailored to workload needs. For instance, if a pod requires one CPU, Karpenter allocates a node with minimal excess capacity, avoiding resource waste. This right-sizing, combined with groupless scaling, enables faster and more cost-effective scaling, especially in dynamic environments.
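As a rough sketch of this groupless model, a Karpenter `NodePool` declares constraints rather than fixed instance shapes, and Karpenter picks nodes that fit pending pods; the names and limits below are illustrative assumptions, and the schema follows Karpenter's v1 API.

```yaml
# Sketch of a Karpenter NodePool: no uniform node group, just
# constraints; Karpenter right-sizes instances for pending pods.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default          # assumes a matching EC2NodeClass exists
  limits:
    cpu: "100"                 # cap on total provisioned CPU
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
```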
Ensuring Availability with Constraints
König addressed availability challenges reported by users, emphasizing Kubernetes-native scheduling constraints to mitigate disruptions. Topology spread constraints distribute pods across availability zones, reducing the risk of downtime if a node fails. Pod disruption budgets, affinity/anti-affinity rules, and priority classes further ensure critical workloads are scheduled appropriately. For stateful workloads using EBS, König recommended setting the volume binding mode to “wait for first consumer” to avoid pod-volume mismatches across zones, preventing crashes and ensuring reliability.
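A hedged sketch of two of those safeguards: spreading replicas across zones with a topology spread constraint, and delaying EBS volume binding until the pod is placed (`ebs.csi.aws.com` is the standard EBS CSI provisioner; the `web` workload is hypothetical).

```yaml
# Delay volume binding until the pod is scheduled, so the EBS
# volume is created in the same zone as the chosen node.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-wait
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
---
# Spread "web" replicas evenly across availability zones.
apiVersion: v1
kind: Pod
metadata:
  name: web
  labels:
    app: web
spec:
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          app: web
  containers:
    - name: web
      image: nginx:1.27
```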
Integrating with KEDA for Application Scaling
For advanced scaling, König highlighted combining Karpenter with KEDA for event-driven, application-specific scaling. KEDA scales pods based on metrics like Kafka topic sizes or SQS queues, beyond CPU/memory. Karpenter then provisions nodes for pending pods, enabling seamless scaling for workloads like flash sales. König outlined a four-step migration from Cluster Autoscaler to Karpenter, emphasizing its simplicity and open-source documentation.
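A minimal sketch of the KEDA half of that pairing, assuming an SQS-driven consumer (queue URL, names, and thresholds are hypothetical): KEDA scales the pods on queue depth, and Karpenter provisions nodes for whatever ends up Pending.

```yaml
# KEDA scales the consumer Deployment on SQS queue depth;
# Karpenter then adds nodes for any pods that no longer fit.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: orders-consumer
spec:
  scaleTargetRef:
    name: orders-consumer       # hypothetical Deployment name
  minReplicaCount: 1
  maxReplicaCount: 50
  triggers:
    - type: aws-sqs-queue
      metadata:
        queueURL: https://sqs.eu-west-1.amazonaws.com/123456789012/orders
        queueLength: "20"       # target messages per replica
        awsRegion: eu-west-1
```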
[DevoxxGR2024] The Art of Debugging Inside K8s Environment at Devoxx Greece 2024 by Andrii Soldatenko
At Devoxx Greece 2024, Andrii Soldatenko, a seasoned software engineer and tech evangelist at Dynatrace, delivered an engaging presentation on mastering the art of debugging within Kubernetes (K8s) environments. With a blend of humor, practical insights, and real-world strategies, Andrii illuminated the complexities of troubleshooting cloud-native applications. Drawing from his extensive experience, he provided actionable techniques to enhance debugging efficiency, making the session a valuable resource for developers navigating the intricacies of Kubernetes. His talk emphasized proactive design, robust tooling, and a systematic approach to resolving issues in distributed systems.
The Challenges of Debugging in Kubernetes
Andrii began by acknowledging the inherent difficulties of debugging in modern cloud-native environments. Unlike traditional development, where a local debugger suffices, Kubernetes introduces layers of complexity with containers, pods, and distributed architectures. He humorously outlined his “eight stages of debugging,” from denial (“this can’t happen”) to self-realization (“I wrote this code”), resonating with developers who face similar emotional journeys. These stages underscore the psychological and technical hurdles of troubleshooting in K8s, where issues often stem from accidental complexities like misconfigured resources or network policies.
The dynamic nature of Kubernetes, with its orchestration of pods, nodes, and services, demands a shift in debugging mindset. Andrii emphasized that while writing YAML manifests for K8s is straightforward, ensuring they function as intended is not. He highlighted the absence of comprehensive debugging guides, noting that most literature focuses on deployment rather than troubleshooting. This gap inspired his talk, which aimed to equip developers with practical strategies to diagnose and resolve issues effectively.
Strategies for Effective Debugging
To tackle Kubernetes debugging, Andrii proposed a structured approach, starting with a high-level mind map for assessing pod states. For instance, a pod in a “Pending” state might indicate resource shortages or port conflicts, while a “Crashing” pod could signal health probe failures. He focused on scenarios where pods are running but behaving unexpectedly, a common yet challenging issue. Andrii advocated revisiting init containers, which perform setup tasks like data migrations. By temporarily replacing their commands with a sleep directive, developers can use kubectl exec to inspect the container’s state, checking volumes, permissions, or network access.
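A sketch of that init-container trick; the deployment and container names are hypothetical.

```sh
# Swap the init container's command for a long sleep so the pod
# pauses during setup ("my-app" and "init-setup" are placeholders).
kubectl patch deployment my-app --type=json \
  -p='[{"op":"replace","path":"/spec/template/spec/initContainers/0/command","value":["sleep","3600"]}]'

# Shell into the paused init container to inspect volumes,
# permissions, and network access.
kubectl exec -it deploy/my-app -c init-setup -- sh
```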
For containers lacking debugging tools, Andrii introduced ephemeral containers, a Kubernetes feature designed for interactive troubleshooting (introduced as alpha in version 1.16 and stable since 1.25). By launching an ephemeral container with tools like netcat or a debugger, developers can inspect a pod’s state without altering its primary container. He shared a practical example of debugging a Go application by sharing process namespaces, allowing access to the application’s processes. This approach enables setting breakpoints and navigating code, even in minimal, distroless containers.
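A minimal example of attaching such an ephemeral container; the pod name, container name, and debug image are assumptions.

```sh
# Attach an ephemeral container that shares the app container's
# process namespace; "netshoot" bundles common network tools.
kubectl debug -it my-pod --image=nicolaka/netshoot --target=app -- sh

# With shared process namespaces, the app's processes are visible
# from here, so a debugger such as dlv can attach to the Go binary.
```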
Leveraging Tools for Enhanced Debugging
Andrii showcased several tools to streamline Kubernetes debugging. He recommended building custom debug containers tailored to specific needs, such as including sqlite, python, or network utilities, and shared his own debug container on GitHub. For network-related issues, he highlighted a pre-existing container with tools like tcpdump, which simplifies packet inspection without requiring manual installations. Andrii also praised Stern, a CLI tool for tailing logs across multiple pods in a replica set, making it easier to trace requests and identify exceptions.
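Stern takes a pod-name regex, so a single command can follow an entire replica set; the names below are illustrative.

```sh
# Tail logs from all pods matching "api-.*", with each line
# prefixed by the pod it came from.
stern "api-.*" --namespace production --since 15m
```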
For developers using Visual Studio Code, Andrii demonstrated remote debugging by configuring a launch.json file to connect to a Kubernetes pod. By exposing a debug port and using tools like Telepresence, developers can intercept cluster traffic and test changes locally, bypassing slow CI/CD cycles. He also highlighted K9s, a terminal-based UI for Kubernetes, with a custom plugin for initiating debug sessions via kubectl debug. These tools collectively enhance efficiency, allowing developers to focus on problem-solving rather than manual configuration.
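The remote-debugging setup ultimately reduces to exposing the pod’s debug port locally before attaching the IDE; the port and pod name below are hypothetical.

```sh
# Forward the application's debug port to localhost, then point
# the IDE's remote-attach configuration at 127.0.0.1:40000.
kubectl port-forward pod/my-app-6f7b9 40000:40000
```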
Best Practices for Proactive Debugging
Andrii concluded with actionable best practices to prevent and address debugging challenges. He stressed embedding version information, like Git commit SHAs, into container images to synchronize codebases during remote debugging. Scaling down traffic to a single pod ensures consistent debugging sessions, avoiding request distribution across replicas. He also advocated for a blameless culture, where developers use debuggers to slow down and analyze issues methodically rather than rushing to fix symptoms.
By sharing his GitHub repository and additional resources, Andrii encouraged attendees to experiment with these techniques. His talk was a compelling call to action for developers to embrace robust debugging practices, ensuring resilience and reliability in Kubernetes environments. Through practical demonstrations and a lighthearted approach, he demystified the complexities of cloud-native debugging, empowering developers to tackle issues with confidence.
[KCD UK 2024] Deep Dive into Kubernetes Runtime Security
Saeid Bostandoust, founder of CubeDemi.io, delivered an in-depth presentation at KCDUK2024 on Kubernetes runtime security, focusing on tools and techniques to secure containers during execution. As a CNCF project contributor, Saeid explored Linux security features like Linux Capabilities, SELinux, AppArmor, Seccomp-BPF, and KubeArmor, providing a comprehensive overview of how these can be applied in Kubernetes to mitigate zero-day attacks and enforce policies. His talk emphasized practical implementation, observability, and policy enforcement, aligning with KCDUK2024’s focus on securing cloud-native environments.
Understanding Runtime Security
Saeid defined runtime security as protecting applications during execution, contrasting it with pre-runtime measures like static code analysis. Runtime security focuses on mitigating zero-day attacks and malicious behavior through real-time intrusion detection, process isolation, policy enforcement, and monitoring. Linux offers over 30 security mechanisms, including SELinux, AppArmor, Linux Capabilities, Seccomp-BPF, and namespaces, alongside kernel drivers to counter threats like Meltdown and Spectre. Saeid focused on well-known features, explaining their roles and Kubernetes integration.
Linux Capabilities: Historically, processes were either privileged (with full root permissions) or unprivileged, leading to vulnerabilities like privilege escalation via commands like ping. Linux Capabilities, introduced to granularly assign permissions, allow processes to perform specific actions (e.g., opening raw sockets for ping) without full root privileges. In Kubernetes, capabilities can be configured in pod manifests to drop unnecessary permissions, enhancing security even for root-run containers.
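A minimal sketch of that configuration: drop everything, then add back only what the workload needs (here `NET_RAW` for raw sockets, as used by ping). The pod and image are illustrative.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ping-pod
spec:
  containers:
    - name: app
      image: alpine:3.19
      command: ["sleep", "infinity"]
      securityContext:
        capabilities:
          drop: ["ALL"]        # start from zero privileges
          add: ["NET_RAW"]     # re-add only what ping needs
```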
Seccomp-BPF: Seccomp (Secure Computing) restricts system calls a process can make. Originally limited to basic calls (read, write, exit), Seccomp-BPF extends this with customizable profiles. In Kubernetes, a Seccomp-BPF profile can be defined in a JSON file and applied via a pod’s security context, terminating processes that attempt unauthorized system calls. Saeid demonstrated a restrictive profile that limits a container to basic operations, preventing it from running if additional system calls are needed.
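Applying such a profile looks roughly like this; the JSON file must already exist under the kubelet’s seccomp directory, and the file name below is hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: restricted
spec:
  securityContext:
    seccompProfile:
      type: Localhost
      # Path relative to the kubelet's seccomp root (typically
      # /var/lib/kubelet/seccomp); the file name is a placeholder.
      localhostProfile: profiles/restrictive.json
  containers:
    - name: app
      image: nginx:1.27
```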
LSM Modules (AppArmor, SELinux, BPF-LSM): Linux Security Modules (LSM) provide hooks to intercept operations, such as file access or network communication. AppArmor uses path-based profiles, while SELinux employs label-based policies. BPF-LSM, a newer option, allows dynamic policy injection via eBPF code, offering flexibility without requiring application restarts. Saeid noted that BPF-LSM, available in kernel versions 5.7 and later, supports stacking with other LSMs, enhancing Kubernetes security.
KubeArmor: Simplifying Policy Enforcement
KubeArmor, a CNCF project, simplifies runtime security by allowing users to define policies via Kubernetes Custom Resource Definitions (CRDs). These policies are translated into AppArmor, SELinux, or BPF-LSM profiles, depending on the worker node’s LSM. KubeArmor addresses the challenge of syncing profiles across cluster nodes, automating deployment and updates. It uses eBPF for observability, monitoring system calls and generating telemetry for tools like Prometheus and Elasticsearch. Saeid showcased KubeArmor’s architecture, including a daemon set with an init container for compiling eBPF code and a relay server for aggregating logs and alerts.
An example policy demonstrated KubeArmor denying access to sensitive files (e.g., /etc/passwd, /etc/shadow) and commands (e.g., apt, apt-get), with logs showing enforcement details. Unlike manual AppArmor or SELinux profiles, which are complex and hard to scale, KubeArmor’s declarative approach and default deny policies simplify securing containers, preventing access to dangerous assets like /proc mounts.
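A sketch of such a policy using KubeArmor’s CRD; the selector label and the exact paths are illustrative.

```yaml
apiVersion: security.kubearmor.com/v1
kind: KubeArmorPolicy
metadata:
  name: block-sensitive-assets
  namespace: default
spec:
  selector:
    matchLabels:
      app: web               # hypothetical workload label
  file:
    matchPaths:
      - path: /etc/shadow    # deny access to sensitive files
  process:
    matchPaths:
      - path: /usr/bin/apt   # deny package-manager execution
  action: Block
```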
Practical Implementation and Community Engagement
Saeid provided practical examples, such as configuring a pod to drop all capabilities except those needed, applying a Seccomp-BPF profile to restrict system calls, and using KubeArmor to enforce file access policies. He highlighted KubeArmor’s integration with tools like OPA Gatekeeper to block unauthorized commands (e.g., kubectl exec) when BPF-LSM is unavailable. For further learning, Saeid offered a 50% discount on CubeDemi.io’s container security workshop, encouraging KCDUK2024 attendees to deepen their Kubernetes security expertise.
[DevoxxGR2024] Devoxx Greece 2024 Sustainability Chronicles: Innovate Through Green Technology With Kepler and KEDA
At Devoxx Greece 2024, Katie Gamanji, a senior field engineer at Apple and a technical oversight committee member for the Cloud Native Computing Foundation (CNCF), delivered a compelling presentation on advancing environmental sustainability within the cloud-native ecosystem. With Kubernetes celebrating its tenth anniversary, Katie emphasized the urgent need for technologists to integrate green practices into their infrastructure strategies. Her talk explored how tools like Kepler and KEDA’s carbon-aware operator enable practitioners to measure and mitigate carbon emissions, while fostering a vibrant, inclusive community to drive these efforts forward. Drawing from her extensive experience and leadership in the CNCF, Katie provided a roadmap for aligning technological innovation with climate responsibility.
The Imperative of Cloud Sustainability
Katie began by underscoring the critical role of sustainability in the tech sector, particularly given the industry’s contribution to global greenhouse gas emissions. She highlighted that the tech sector accounts for 1.4% of global emissions, a figure that could soar to 10% within a decade without intervention. However, by leveraging renewable energy, emissions could be reduced by up to 80%. International agreements like COP21 and the United Nations’ Sustainable Development Goals (SDGs) have spurred national regulations, compelling organizations to assess and report their carbon footprints. Major cloud providers, such as Google Cloud Platform (GCP), have set ambitious net-zero targets, with GCP already operating on renewable energy since 2022. Yet, Katie stressed that sustainability cannot be outsourced solely to cloud providers; organizations must embed these principles internally.
The emergence of “GreenOps,” inspired by FinOps, encapsulates the processes, tools, and cultural shifts needed to achieve digital sustainability. By optimizing infrastructure—through strategies like using spot instances or serverless architectures—organizations can reduce both costs and emissions. Katie introduced a four-phase strategy proposed by the FinOps Foundation’s Environmental Sustainability Working Group: awareness, discovery, roadmap, and execution. This framework encourages organizations to educate stakeholders, benchmark emissions, implement automated tools, and iteratively pursue ambitious sustainability goals.
Measuring Emissions with Kepler
To address emissions within Kubernetes clusters, Katie introduced Kepler, a CNCF sandbox project developed by Red Hat and IBM. Kepler (the Kubernetes-based Efficient Power Level Exporter) utilizes eBPF to probe system statistics and export power consumption metrics to Prometheus for visualization in tools like Grafana. Deployed as a daemon set, Kepler collects node- and container-level metrics, focusing on power usage and resource utilization. By tracing CPU performance counters and Linux kernel trace points, it calculates energy consumption in joules, converting this to kilowatt-hours and multiplying by region-specific emission factors for energy sources like coal, petroleum, and natural gas.
Katie demonstrated Kepler’s practical application using a Grafana dashboard, which displayed emissions per gas and allowed granular analysis by container, day, or namespace. This visibility enables organizations to identify high-emission components, such as during traffic spikes, and optimize accordingly. As a sandbox project, Kepler is gaining momentum, and Katie encouraged attendees to explore it, provide feedback, or contribute to its development, reinforcing its potential to establish a baseline for carbon accounting in cloud-native environments.
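For readers who want to try it, the install is a short Helm sequence; the chart location below is taken from the project’s documentation and may change.

```sh
# Install Kepler as a daemon set (chart URL assumed from the
# project's docs at the time of writing).
helm repo add kepler https://sustainable-computing-io.github.io/kepler-helm-chart
helm install kepler kepler/kepler --namespace kepler --create-namespace

# Energy metrics such as kepler_container_joules_total then appear
# in Prometheus and can be charted per container or namespace.
```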
Scaling Sustainably with KEDA’s Carbon-Aware Operator
Complementing Kepler’s observational capabilities, Katie introduced KEDA (Kubernetes Event-Driven Autoscaler), a graduated CNCF project, and its carbon-aware operator. KEDA, created by Microsoft and Red Hat, scales applications based on external events, offering a rich catalog of triggers. The carbon-aware operator optimizes emissions by scaling applications according to carbon intensity—grams of CO2 equivalent emitted per kilowatt-hour consumed. In scenarios where infrastructure is powered by renewable sources like solar or wind, carbon intensity approaches zero, allowing for maximum application replicas. Conversely, high carbon intensity, such as from coal-based energy, prompts scaling down to minimize emissions.
Katie illustrated this with a custom resource definition (CRD) that configures scaling behavior based on carbon intensity forecasts from providers like WattTime or Electricity Maps. In her demo, a Grafana dashboard showed an application scaling from 15 replicas at a carbon intensity of 530 to a single replica at 580, dynamically responding to grid data. This proactive approach ensures sustainability is embedded in scheduling decisions, aligning resource usage with environmental impact.
Nurturing a Sustainable Community
Beyond technology, Katie emphasized the pivotal role of the Kubernetes community in driving sustainability. Operating on principles of inclusivity, open governance, and transparency, the community fosters innovation through technical advisory groups (TAGs) focused on domains like observability, security, and environmental sustainability. The TAG Environmental Sustainability, established just over a year ago, aims to benchmark emissions across graduated CNCF projects, raising awareness and encouraging greener practices.
To sustain this momentum, Katie highlighted the need for education and upskilling. Resources like the Kubernetes and Cloud Native Associate (KCNA) certification and her own Cloud Native Fundamentals course on Udacity lower entry barriers for newcomers. By diversifying technical and governing boards, the community can continue to evolve, ensuring it scales alongside technological advancements. Katie’s vision is a cloud-native ecosystem where innovation and sustainability coexist, supported by a nurturing, inclusive community.
Conclusion
Katie Gamanji’s presentation at Devoxx Greece 2024 was a clarion call for technologists to prioritize environmental sustainability. By leveraging tools like Kepler and KEDA’s carbon-aware operator, practitioners can measure and mitigate emissions within Kubernetes clusters, aligning infrastructure with climate goals. Equally important is the community’s role in fostering education, inclusivity, and collaboration to sustain these efforts. Katie’s insights, grounded in her leadership at Apple and the CNCF, offer a blueprint for innovating through green technology while building a resilient, forward-thinking ecosystem.
[DevoxxBE2023] Securing the Supply Chain for Your Java Applications by Thomas Vitale
At Devoxx Belgium 2023, Thomas Vitale, a software engineer and architect at Systematic, delivered an authoritative session on securing the software supply chain for Java applications. As the author of Cloud Native Spring in Action and a passionate advocate for cloud-native technologies, Thomas provided a comprehensive exploration of securing every stage of the software lifecycle, from source code to deployment. Drawing on the SLSA framework and CNCF research, he demonstrated practical techniques for ensuring integrity, authenticity, and resilience using open-source tools like Gradle, Sigstore, and Kyverno. Through a blend of theoretical insights and live demonstrations, Thomas illuminated the critical importance of supply chain security in today’s threat landscape.
Safeguarding Source Code with Git Signatures
Thomas began by defining the software supply chain as the end-to-end process of delivering software, encompassing code, dependencies, tools, practices, and people. He emphasized the risks at each stage, starting with source code. Using Git as an example, Thomas highlighted its audit trail capabilities but cautioned that commit authorship can be manipulated. In a live demo, he showed how he could impersonate a colleague by altering Git’s username and email, underscoring the need for signed commits. By enforcing signed commits with GPG or SSH keys—or preferably a keyless approach via GitHub’s single sign-on—developers can ensure commit authenticity, establishing a verifiable provenance trail critical for supply chain security.
Managing Dependencies with Software Bills of Materials (SBOMs)
Moving to dependencies, Thomas stressed the importance of knowing exactly what libraries are included in a project, especially given vulnerabilities like Log4j. He introduced Software Bills of Materials (SBOMs) as a standardized inventory of software components, akin to a list of ingredients. Using the CycloneDX plugin for Gradle, Thomas demonstrated generating an SBOM during the build process, which provides precise dependency details, including versions, licenses, and hashes for integrity verification. This approach, integrated into Maven or Gradle, ensures accuracy over post-build scanning tools like Snyk, enabling developers to identify vulnerabilities, check license compliance, and verify component integrity before production.
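With the CycloneDX Gradle plugin applied, generating the SBOM is a single task invocation; the default output location may vary by plugin version.

```sh
# Produce a CycloneDX SBOM as part of the build; the report lands
# under build/reports/ by default and can feed Dependency-Track.
./gradlew cyclonedxBom
```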
Thomas further showcased Dependency-Track, an OWASP project, to analyze SBOMs and flag vulnerabilities, such as a critical issue in SnakeYAML. He introduced the Vulnerability Exploitability Exchange (VEX) standard, which complements SBOMs by documenting whether vulnerabilities affect an application. In his demo, Thomas marked a SnakeYAML vulnerability as a false positive due to Spring Boot’s safe deserialization, demonstrating how VEX communicates security decisions to stakeholders, reducing unnecessary alerts and ensuring compliance with emerging regulations.
Building Secure Artifacts with Reproducible Builds
The build phase, Thomas explained, is another critical juncture for security. Using Spring Boot as an example, he outlined three packaging methods: JAR files, native executables, and container images. He critiqued Dockerfiles for introducing non-determinism and maintenance overhead, advocating for Cloud Native Buildpacks as a reproducible, secure alternative. In a demo, Thomas built a container image with Buildpacks, highlighting its fixed creation timestamp (January 1, 1980) to ensure identical outputs for unchanged inputs, enhancing security by eliminating variability. This reproducibility, coupled with SBOM generation during the build, ensures artifacts are both secure and traceable.
Signing and Verifying Artifacts with SLSA
To ensure artifact integrity, Thomas introduced the SLSA framework, which provides guidelines for securing software artifacts across the supply chain. He demonstrated signing container images with Sigstore’s Cosign tool, using a keyless approach to avoid managing private keys. This process, integrated into a GitHub Actions pipeline, ensures that artifacts are authentically linked to their creator. Thomas further showcased SLSA’s provenance generation, which documents the artifact’s origin, including the Git commit hash and build steps. By achieving SLSA Level 3, his pipeline provided non-falsifiable provenance, ensuring traceability from source code to deployment.
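The keyless flow looks roughly like this; the image name and identity pattern are placeholders.

```sh
# Sign with an ephemeral key bound to the CI job's OIDC identity.
cosign sign --yes ghcr.io/org/app:1.0.0

# Verify the signature was produced by the expected workflow.
cosign verify ghcr.io/org/app:1.0.0 \
  --certificate-identity-regexp 'https://github.com/org/.*' \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com
```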
Securing Deployments with Policy Enforcement
The final stage, deployment, requires validating artifacts to ensure they meet security standards. Thomas demonstrated using Cosign and the SLSA Verifier to validate signatures and provenance, ensuring only trusted artifacts are deployed. On Kubernetes, he introduced Kyverno, a policy engine that enforces signature and provenance checks, automatically rejecting non-compliant deployments. This approach ensures that production environments remain secure, aligning with the principle of validating metadata to prevent unauthorized or tampered artifacts from running.
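A hedged sketch of such a Kyverno rule, assuming keyless Cosign signatures; the registry path and identity values are placeholders.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-image-signatures
spec:
  validationFailureAction: Enforce   # reject, don't just audit
  rules:
    - name: check-signature
      match:
        any:
          - resources:
              kinds: ["Pod"]
      verifyImages:
        - imageReferences:
            - "ghcr.io/org/*"        # placeholder registry path
          attestors:
            - entries:
                - keyless:
                    subject: "https://github.com/org/*"
                    issuer: "https://token.actions.githubusercontent.com"
```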
Conclusion: A Holistic Approach to Supply Chain Security
Thomas’s session at Devoxx Belgium 2023 provided a robust framework for securing Java application supply chains. By addressing source code integrity, dependency management, build reproducibility, artifact signing, and deployment validation, he offered a comprehensive strategy to mitigate risks. His practical demonstrations, grounded in open-source tools and standards like SLSA and VEX, empowered developers to adopt these practices without overwhelming complexity. Thomas’s emphasis on asking “why” at each step encouraged attendees to tailor security measures to their context, ensuring both compliance and resilience in an increasingly regulated landscape.
[DevoxxPL2022] From Private Through Hybrid to Public Cloud – Product Migration • Paweł Piekut
At Devoxx Poland 2022, Paweł Piekut, a seasoned software developer at Bosch, delivered an insightful presentation on the migration of their e-bike cloud platform from a private cloud to a public cloud environment. Drawing from his expertise in Java, Kotlin, and .NET, Paweł narrated the intricate journey of transitioning a complex IoT ecosystem, highlighting the technical challenges, strategic decisions, and lessons learned. His talk offered a practical roadmap for organizations navigating the complexities of cloud migration, emphasizing the balance between innovation, scalability, and compliance.
Navigating the Private Cloud Landscape
Paweł began by outlining the initial deployment of Bosch’s e-bike cloud on a private cloud developed internally by the company’s IT group. This proprietary platform, designed to support the e-bike ecosystem, facilitated communication between hardware components—such as drive units, batteries, and controllers—and the mobile app, which interfaced with the cloud. The cloud served multiple stakeholders, including factories for device flashing, manufacturers for configuration, authorized services for diagnostics, and end-users for features like activity tracking and bike locking. However, the private cloud faced significant limitations. Scalability was constrained, requiring manual capacity requests and investments, which hindered agility. Downtimes were frequent, acceptable for development but untenable for production. Additionally, the platform’s bespoke nature made it challenging to hire experienced talent and limited developer engagement due to its lack of market-standard tools.
Despite these drawbacks, the private cloud offered advantages. Its deployment within Bosch’s secure network ensured high performance and simplified compliance with data privacy regulations, critical for an international product subject to data localization laws. Costs were predictable, and the absence of vendor lock-in, thanks to open-source frameworks, provided flexibility. However, the need for modern scalability and developer-friendly tools drove the decision to explore public cloud solutions, with Amazon Web Services (AWS) selected for its robust support.
The Hybrid Cloud Conundrum
Transitioning to a hybrid cloud model introduced a blend of private and public cloud environments, creating new challenges. Bosch’s internal policy of “on-transit data” required data processed in the public cloud to be returned to the private cloud, necessitating complex and secure data transfers. While AWS Direct Connect facilitated this, the hybrid setup led to operational complexities. Only select services ran on AWS, causing a divide among developers eager to work with widely recognized public cloud tools. Technical issues, such as Kafka’s inaccessibility from the private cloud, required significant effort to resolve. Error tracing across clouds was cumbersome, with Splunk used in the private cloud and Elasticsearch in the public cloud, complicating root-cause analysis. The simultaneous migration of Jenkins added further complexity, with duplicated jobs and confusing configurations.
Despite these hurdles, the hybrid model offered benefits. It allowed Bosch to leverage the private cloud’s security for sensitive data while tapping into the public cloud’s scalability for peak loads. This setup supported disaster recovery and compliance with data localization requirements. However, the on-transit data concept proved overly complex, leading to dissatisfaction and prompting a strategic shift toward a cloud-first approach, prioritizing public cloud deployment unless justified otherwise.
Embracing the Public Cloud
The full migration to AWS marked a pivotal phase, divided into three stages. First, the team focused on exploration and training to master AWS products and the pay-as-you-go pricing model, which made every developer accountable for costs. This stage emphasized understanding managed versus unmanaged services, such as Kubernetes and Kafka, and ensuring backup compatibility across clouds. The second stage involved building new applications on AWS, addressing unknowns and ensuring secure communication with external systems. Finally, existing services were migrated from private to public cloud, starting with development and progressing to production. Throughout, the team maintained services in both environments, managing separate repositories and addressing critical bugs, such as Log4j vulnerabilities, across both.
To mitigate vendor lock-in, Bosch adopted a cloud-agnostic approach, using Terraform for infrastructure-as-code instead of AWS-specific CloudFormation. While tools like S3 and DynamoDB were embraced for their market-leading performance, backups were standardized to ensure portability. The public cloud’s vast community, extensive documentation, and readily available resources reduced knowledge silos and enhanced developer satisfaction, making the migration a transformative step for innovation and agility.
Lessons for Cloud Migration
Paweł’s experience underscores the importance of aligning cloud strategy with organizational needs. The public cloud’s immediate resource availability and developer-friendly tools accelerated development, but required careful cost management. Hybrid cloud offered flexibility but introduced complexity, particularly with data transfers. Private cloud provided security and control but lacked scalability. Paweł emphasized defining precise requirements—budget, priorities, and compliance—before choosing a cloud model. Startups may favor public clouds for agility, while regulated industries might opt for private or hybrid solutions to prioritize data security and network performance. This strategic clarity ensures a successful migration tailored to business goals.
[DevoxxPL2022] Challenges Running Planet-Wide Computer: Efficiency • Jacek Bzdak, Beata Strack
Jacek Bzdak and Beata Strack, software engineers at Google Poland, delivered an engaging session at Devoxx Poland 2022, exploring the intricacies of optimizing Google’s planet-scale computing infrastructure. Their talk focused on achieving efficiency in a distributed system spanning global data centers, emphasizing resource utilization, auto-scaling, and operational strategies. By sharing insights from Google’s internal cloud and Autopilot system, Jacek and Beata provided a blueprint for enhancing service performance while navigating the complexities of large-scale computing.
Defining Efficiency in a Global Fleet
Beata opened by framing Google’s data centers as a singular “planet-wide computer,” where efficiency translates to minimizing operational costs—servers, CPU, memory, data centers, and electricity. Key metrics like fleet-wide utilization, CPU/RAM allocation, and growth rate serve as proxies for these costs, though they are imperfect, often masking quality issues like inflated memory usage. Beata stressed that efficiency begins at the service level, where individual jobs must optimize resource consumption, and extends to the fleet through an ecosystem that maximizes resource sharing. This dual approach ensures that savings at the micro level scale globally, a principle applicable even to smaller organizations.
Auto-Scaling: Balancing Utilization and Reliability
Jacek, a member of Google’s Autopilot team, delved into auto-scaling, a critical mechanism for achieving high utilization without compromising reliability. Autopilot’s vertical scaling adjusts resource limits (CPU/memory) for fixed replicas, while horizontal scaling modifies replica counts. Jacek presented data from an Autopilot paper, showing that auto-scaled services maintain memory slack below 20% for median cases, compared to over 60% for manually managed services. Crucially, automation reduces outage risks by dynamically adjusting limits, as demonstrated in a real-world case where Autopilot preempted a memory-induced crash. However, auto-scaling introduces complexity, particularly feedback loops, where overzealous caching or load shedding can destabilize resource allocation, requiring careful integration with application-specific metrics.
Java-Specific Challenges in Auto-Scaling
The talk transitioned to language-specific hurdles, with Jacek highlighting Java’s unique challenges in auto-scaling environments. Just-in-Time (JIT) compilation during application startup spikes CPU usage, complicating horizontal scaling decisions. Memory management poses further issues, as Java’s heap size is static, and out-of-memory errors may be masked by garbage collection (GC) thrashing, where excessive CPU is devoted to GC rather than request handling. To address this, Google sets static heap sizes and auto-scales non-heap memory, though Jacek envisioned a future where Java aligns with other languages, eliminating heap-specific configurations. These insights underscore the need for language-aware auto-scaling strategies in heterogeneous environments.
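In practice, that policy reduces to pinning the heap and letting the autoscaler manage everything else; the flag values below are illustrative, not Google’s actual configuration.

```sh
# Fixed heap: min equals max, so only non-heap memory (metaspace,
# threads, native buffers) varies and can be auto-scaled.
java -Xms2g -Xmx2g -XX:MaxMetaspaceSize=256m -jar app.jar
```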
Operational Strategies for Resource Reclamation
Beata concluded by discussing operational techniques like overcommit and workload colocation to reclaim unused resources. Overcommit leverages the low probability of simultaneous resource spikes across unrelated services, allowing Google to pack more workloads onto machines. Colocating high-priority serving jobs with lower-priority batch workloads enables resource reclamation, with batch tasks evicted when serving jobs demand capacity. A 2015 experiment demonstrated significant machine savings through colocation, a concept influencing Kubernetes’ design. These strategies, combined with auto-scaling, create a robust framework for efficiency, though they demand rigorous isolation to prevent interference between workloads.
[DevoxxPL2022] How We Migrate Customers and Internal Teams to Kubernetes • Piotr Bochyński
At Devoxx Poland 2022, Piotr Bochyński, a seasoned cloud native expert at SAP, shared a compelling narrative on transitioning customers and internal teams from a Cloud Foundry-based platform to Kubernetes. His presentation illuminated the strategic imperatives, technical challenges, and practical solutions that defined SAP’s journey toward a multi-cloud Kubernetes ecosystem. By leveraging open-source projects like Kyma and Gardener, Piotr’s team addressed the limitations of their legacy platform, fostering developer productivity and operational scalability. His insights offer valuable lessons for organizations contemplating a similar migration.
Understanding Platform as a Service
Piotr began by contextualizing Platform as a Service (PaaS), a model that abstracts infrastructure complexities, allowing developers to focus on application development. Unlike Infrastructure as a Service (IaaS), which provides raw virtual machines, PaaS delivers managed runtimes, middleware, and automation, accelerating time-to-market. However, this convenience comes with trade-offs, such as reduced control and potential vendor lock-in, often tied to opinionated frameworks like the 12-factor application methodology. Piotr highlighted SAP’s initial adoption of Cloud Foundry, an open-source PaaS, to avoid vendor dependency while meeting multi-cloud requirements driven by legal and business needs, particularly in sectors like banking. Yet, Cloud Foundry’s constraints, such as single HTTP port exposure and reliance on outdated technologies like BOSH, prompted SAP to explore Kubernetes as a more flexible alternative.
Kubernetes: A Platform for Platforms
Kubernetes, as Piotr elucidated, is not a traditional PaaS but a container orchestration framework that serves as a foundation for building custom platforms. Its declarative API and extensibility distinguish it from predecessors, enabling consistent management of diverse resources like deployments, namespaces, and custom objects. Piotr illustrated this with the thermostat analogy: developers declare a desired state (e.g., 22 degrees), and Kubernetes controllers reconcile the actual state to match it. This pattern, applied uniformly across resources, empowers developers to extend Kubernetes with custom controllers, such as a hypothetical thermostat resource. The Kyma project, an open-source initiative led by SAP, builds on this extensibility, providing opinionated building blocks like Istio-based API gateways, NATS eventing, and serverless functions to bridge the gap between raw Kubernetes and a developer-friendly PaaS.
Overcoming Migration Challenges
The migration to Kubernetes presented multifaceted challenges, from technical complexity to cultural adoption. Piotr emphasized the steep learning curve associated with Kubernetes’ vast resource set, compounded by additional components like Prometheus and Istio. To mitigate this, SAP employed Kyma to abstract complexities, offering simplified resources like API rules that encapsulate Istio configurations for secure service exposure. Another hurdle was ensuring multi-cloud compatibility. SAP’s Gardener project, a managed Kubernetes solution, addressed this by providing a consistent, Kubernetes-compliant layer across providers like AWS, Azure, and Google Cloud. Piotr also discussed operational scalability, managing thousands of clusters for hundreds of teams. By applying the Kubernetes controller pattern, SAP automated cluster provisioning, upgrades, and security patching, reducing manual intervention and ensuring reliability.
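As an illustration of that abstraction, a Kyma `APIRule` of roughly this shape replaces several raw Istio objects; the host and service names are hypothetical, and the schema sketched here follows the v1beta1 API.

```yaml
apiVersion: gateway.kyma-project.io/v1beta1
kind: APIRule
metadata:
  name: my-service
spec:
  host: my-service.example.com       # hypothetical host
  gateway: kyma-system/kyma-gateway
  service:
    name: my-service
    port: 8080
  rules:
    - path: /.*
      methods: ["GET"]
      accessStrategies:
        - handler: allow             # expose without authentication
```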
Lessons from the Journey
Reflecting on the migration, Piotr candidly shared missteps that shaped SAP’s approach. Early attempts to shield users from Kubernetes’ complexity by mimicking Cloud Foundry’s API failed, as developers craved direct control over Kubernetes resources. Similarly, restricting cluster admin roles to prevent misconfigurations stifled innovation, leading SAP to grant greater flexibility. Some technology choices, like the Service Catalog project, proved inefficient, underscoring the importance of aligning with Kubernetes’ operator pattern. License changes in tools like Grafana also necessitated pivots, highlighting the need for vigilance in open-source dependencies. Piotr’s takeaways resonate broadly: Kubernetes is a long-term investment, requiring a balance of opinionated tooling and developer freedom, with automation as a cornerstone for scalability.