Jonathan Lalou's Blog

Posts Tagged ‘GenerativeAI’

[AWSReInvent2025] Supercharging DevOps with AI-Driven Observability: The Next Frontier in SRE

Lecturer

Elizabeth Fuentes is a Senior Developer Advocate at Amazon Web Services (AWS), specializing in the intersection of Artificial Intelligence and DevOps practices. With extensive experience in cloud architecture and software engineering, Elizabeth focuses on how Generative AI can streamline complex CI/CD pipelines and enhance Site Reliability Engineering (SRE). She is a key contributor to AWS educational initiatives, having co-developed advanced courses on AI-driven automation. Joining her is Laas Alina, a software architect and open-source enthusiast who focuses on implementing multi-agent systems and the Model Context Protocol (MCP) to solve observability challenges at scale.

Abstract

As software systems grow increasingly distributed and complex, traditional observability—centered on manual log analysis and reactive dashboards—is becoming insufficient. This article explores the paradigm shift toward AI-driven observability, where Generative AI serves not just as a query tool, but as an active participant in failure detection, correlation, and resolution. By leveraging Amazon Bedrock and Amazon Q, organizations can transition from “reactive” to “predictive” DevOps. The discussion analyzes the methodology of building AI agents that simulate architectural stress, automatically explain multi-layered failures, and provide traceable, actionable recommendations. We examine the implementation of the Model Context Protocol (MCP) in establishing sophisticated multi-agent systems (MAS) that transform raw data into contextual understanding, ultimately reducing the Mean Time to Resolution (MTTR) and enhancing systemic resilience.

The Evolution of Observability: From Metrics to Contextual Understanding

The traditional pillars of observability—metrics, logs, and traces—provide the “what” of a system’s state but often fail to provide the “why” in real-time. In high-velocity DevOps environments, the sheer volume of telemetry data can overwhelm human operators, leading to “alert fatigue” and delayed responses to critical incidents. Elizabeth posits that the integration of Generative AI marks the fourth pillar of observability: Contextual Intelligence. This evolution moves the industry beyond simple threshold-based monitoring toward systems that understand the semantic relationship between a failed deployment, a spike in latency, and a specific line of code.

By utilizing Large Language Models (LLMs) through Amazon Bedrock, DevOps teams can ingest vast amounts of unstructured log data and receive summaries that highlight anomalies that might be missed by traditional regex-based filters. The methodology involves training the AI to recognize “normal” operational patterns and identifying deviations not just by value, but by the intent of the system’s behavior. This contextual layer allows for a more nuanced interpretation of system health, where the AI can distinguish between a benign resource spike and a precursor to a cascading failure.

Architecting AI Agents for Predictive Troubleshooting

The transition to AI-driven observability is characterized by the deployment of “Micro-agents”—specialized AI entities designed to handle specific segments of the DevOps lifecycle. These agents operate within a Multi-Agent System (MAS), where they collaborate to solve complex incidents. For instance, a “Monitoring Agent” might detect a performance degradation and immediately trigger a “Diagnosis Agent” to correlate the event with recent CI/CD pipeline changes.

Elizabeth and Laas Alina emphasize the importance of the Model Context Protocol (MCP) in this architecture. MCP acts as the communication backbone, allowing agents to share context without losing the “lineage” of a decision. When an AI agent recommends a specific architectural change or a rollback, it must provide clear traceability. This is crucial for maintaining trust in automated systems. The agents do not operate in a vacuum; they interact with tools like Amazon Q to provide developers with instant explanations of failures directly within their Integrated Development Environment (IDE) or chat interface.

// Example of an AI-driven Observability Agent Configuration
agent:
  name: "IncidentDiagnosticAgent"
  provider: "AmazonBedrock"
  model: "claude-3-sonnet"
  capabilities:
    - log_analysis
    - metric_correlation
    - trace_summarization
  mcp_config:
    protocol_version: "1.0"
    shared_context: "deployment_metadata"
  safety_guardrails:
    - max_token_usage: 4000
    - human_in_the_loop_required: true

Transforming CI/CD through Generative AI and Simulation

Beyond reactive troubleshooting, AI-driven observability empowers proactive system design. One of the most innovative concepts discussed is the use of AI agents to simulate “stress-test” scenarios within a digital twin of the production environment. These agents can intentionally inject failures—similar to Chaos Engineering—and then observe how the observability stack responds. This creates a feedback loop where the AI helps engineers identify “blind spots” in their monitoring before a real incident occurs.

Furthermore, Generative AI transforms the CI/CD pipeline by automatically generating “failure explanations.” Instead of a developer sifting through a 5,000-line build log, Amazon Q can provide a concise summary: “The build failed because the new database schema in commit X is incompatible with the connection pool settings in environment Y.” This level of automated insight accelerates the “inner loop” of development, allowing engineers to focus on innovation rather than infrastructure archeology.

The Human-AI Partnership: Strategic Implications

A common concern in the industry is the replacement of human engineers by AI. However, Elizabeth argues that the future belongs to the “augmented engineer.” AI is a force multiplier that automates the repetitive, “drudge work” of observability—log parsing and initial triage—allowing human experts to focus on high-level strategy and complex architectural decisions. The goal is to transform teams from being “reactive” (fighting fires) to “proactive” (preventing fires).

Implementing these systems requires a cultural shift toward AI-literacy within DevOps teams. Organizations must establish safety guardrails to ensure that AI-driven recommendations are validated and that automated actions (like auto-remediation) have clear rollback paths. By embracing AI as a strategic tool, DevOps and SRE teams can achieve a level of operational excellence that was previously unattainable, ensuring that as systems grow in scale, their reliability grows in parallel.

Links:

Posted in en-US | Tags: AI, AmazonBedrock, AmazonQ, Automation, AWS, AWSReInvent2025, CloudComputing, devops, ElizabethFuentes, GenerativeAI, LaasAlina, Observability, SRE | No Comments »

[AWSReInvent2025] Accelerating Enterprise Modernization: The Architecture of Composable AI Agents

Author: Jonathan Lalou

Lecturer

Mortaza Chowri is the Head of Product Management for the AWS Transform team, where he leads the development of next-generation tools for complex workload migration. He is an expert in leveraging generative AI to automate technical debt reduction for large-scale enterprises. Joining him are Alexi and Ravi, who serve as senior architects within the AWS Transform division, specializing in agentic AI implementation and the creation of composable system frameworks. The session also features strategic insights from the leadership team at Capgemini, who collaborate with AWS to deliver industry-specific modernization solutions for global banking and automotive clients.

Abstract

Enterprise modernization is frequently paralyzed by the extreme complexity of legacy systems, particularly decades-old mainframes and aging Windows-bound .NET applications. This article explores the innovative framework of AWS Transform, a centralized service that utilizes “Agentic AI” to automate and streamline the migration process. The methodology centers on the concept of composability, which allows AWS partners to integrate their proprietary industry knowledge and specialized tools with foundational AI agents. By utilizing a sophisticated chat-based interface and automated business rule extraction, the platform enables a seamless transition from legacy COBOL and .NET Framework 4.x to modern, cloud-native architectures. The analysis demonstrates how these composable agents create a continuous feedback loop that significantly reduces manual effort, improves documentation, and ensures business logic remains intact during high-risk migrations.

Context: The Burden of Technical Debt and Knowledge Atrophy

Many of the world’s most critical systems, particularly in finance and manufacturing, are still dependent on infrastructure built in the late 20th century. These legacy environments present three primary obstacles that prevent organizations from achieving modern agility. First, knowledge atrophy has become a critical risk, as the original architects of these mainframe systems have often retired, leaving behind “black box” applications that lack contemporary documentation. Second, the technical debt associated with older languages like COBOL is immense, as these systems were never designed to leverage modern cloud features such as serverless compute or elastic auto-scaling.

Third, the mission-critical nature of these systems creates a state of risk aversion, where the fear of breaking a core business process during a manual rewrite often leads to stagnation. AWS Transform was specifically developed to break this cycle of inertia. By providing a unified experience that integrates discovery, assessment, and modernization into a single platform, AWS allows enterprises to view their legacy code as an asset to be reimagined rather than a liability to be feared.

Methodology: Agentic AI and the Composable Framework

The core technical innovation of AWS Transform is the transition from static point solutions to a dynamic, “unified experience” powered by specialized AI agents. These agents are designed to perform complex technical tasks with a level of autonomy that far exceeds traditional automation scripts. The methodology is built upon several key pillars of agentic behavior. Discovery agents are tasked with automatically mapping technical artifacts, such as physical servers and complex database schemas, to their optimal cloud-native equivalents.

Modernization agents, specifically those tuned for mainframe environments, perform the difficult work of extracting business rules from legacy code. This process generates comprehensive documentation that allows current engineers to “comprehend” the underlying logic of systems they did not build. The most transformative aspect of this methodology is its composability for partners. AWS provides the foundational intelligence and large language models, while partners such as Capgemini can “compose” these with their own specialized knowledge bases and custom transformation rules. This enables the creation of industry-specific agents, such as a modernization assistant specifically optimized for banking regulations or complex automotive production logic.

Technical Analysis of Mainframe Rule Extraction

The implementation of these agents in real-world scenarios, particularly through the collaboration with Capgemini, highlights a sophisticated “forward engineering” approach. In this workflow, the AI agents first scan the legacy code to identify core business logic and immutable rules. This extraction phase is critical because it ensures that while the code is updated, the essential business functions remain perfectly intact. Following extraction, the reimagination phase begins, where these rules are integrated into a modern architecture that meets cloud-native standards for security and performance.

Practitioners interact with these systems through a chat experience within the AWS Transform interface, allowing them to query both the AI agents and integrated domain experts directly. This interaction model democratizes the modernization process, making it accessible to developers who may not have expertise in COBOL but are proficient in modern languages like Java or Python. The platform serves as a bridge, translating the “what” of legacy business logic into the “how” of modern cloud execution.

Outcomes: Efficiency, Consistency, and Continuous Learning

The deployment of composable AI agents has fundamentally altered the economics and speed of enterprise modernization. By automating the most labor-intensive parts of code comprehension and translation, organizations have reported a reduction in manual effort by as much as 80%. This allows teams to focus on high-value innovation rather than the repetitive task of line-by-line code migration. Furthermore, the platform ensures architectural consistency across a large organization, preventing the fragmentation that often occurs when different teams use varying migration tools.

One of the most significant consequences of this approach is the continuous improvement of the agents themselves. Every modernization task performed through the platform provides feedback data that enhances the underlying AI models. As these agents encounter more diverse enterprise environments, their ability to handle edge cases and complex business rules grows exponentially. This creates a virtuous cycle where each successful migration makes the next one faster and more reliable, effectively solving the problem of knowledge atrophy for the long term.

Conclusion

The shift toward agentic AI and composable architectures represents a milestone in the evolution of enterprise IT. AWS Transform provides a robust framework that allows organizations to tackle their most daunting legacy challenges with a level of confidence and speed that was previously impossible. By allowing partners to integrate their unique industry expertise into a centralized AI system, AWS has created a scalable ecosystem that transforms modernization from a risky, multi-year endeavor into a manageable and continuous strategic process.

Links:

Posted in en-US | Tags: AgenticAI, AWSReInvent2025, AWSTransform, Capgemini, CloudMigration, ComposableArchitecture, EnterpriseIT, GenerativeAI, MainframeModernization, Modernization, MortazaChowri | No Comments »

[AWSReInvent2025] Scaling Customer Support, Compliance, and Productivity with Conversational AI at Coinbase

Author: Jonathan Lalou

Lecturer

Joshua Smith is a Senior Solutions Architect at Amazon Web Services (AWS), specializing in financial services. He collaborates closely with major institutions to design scalable, secure cloud architectures.
Vara Maharivan serves as Director of Machine Learning and Artificial Intelligence at Coinbase, leading the company’s efforts to integrate advanced AI and machine learning capabilities across its cryptocurrency platform.

Abstract

This session examines how Coinbase, a leading cryptocurrency exchange, has deployed a unified generative AI platform built on Amazon Bedrock to transform three critical operational domains: customer support, regulatory compliance, and internal developer productivity. The presentation details the architectural approach, key AWS services leveraged, real-world performance metrics, and the strategic roadmap ahead. By combining retrieval-augmented generation (RAG), tool execution, and domain-specific agents, Coinbase has achieved substantial automation, cost efficiencies, and enhanced user experiences while maintaining rigorous security and compliance standards.

The Evolution of Generative AI in Financial Services

Joshua Smith opened the discussion by contextualizing the rapid maturation of generative AI within financial services. In 2023, early adoption centered on foundational concerns such as data trust and secure retrieval mechanisms. By 2024, the introduction of Amazon Bedrock enabled broader experimentation in areas like customer support, with focus shifting toward scalability, granular access controls, and integration with existing enterprise tools. Entering 2025, the landscape has progressed toward fully agentic, multi-agent systems capable of autonomously orchestrating complex workflows.

Smith emphasized that the primary challenge is no longer prototyping conversational interfaces but rather re-engineering entire business processes to deliver measurable impact on key performance indicators. This shift demands robust infrastructure, advanced security primitives, and operational frameworks tailored for agentic workloads.

AWS Services Enabling Production-Grade Agentic AI

Central to the discussion was Amazon Bedrock, a fully managed service providing access to leading foundation models through a unified API. Bedrock supports private model customization, guardrails for safety, cost-latency optimization, and, notably, Agent Core—a suite of capabilities designed to operationalize agents at scale.

Agent Core addresses critical production gaps: a serverless runtime supporting long-running multimodal agents (up to eight hours), checkpointing and recovery, identity management compatible with existing providers, secure token vaults, shared and private memory, tool discovery with fine-grained controls, and centralized observability combining logs, traces, and metrics. These components collectively mitigate risks highlighted in industry reports, such as escalating costs, unclear value, and insufficient security, which threaten the viability of agentic initiatives.

Coinbase’s Strategic Vision for AI Integration

Vara Maharivan outlined Coinbase’s mission to increase economic freedom through a trusted global cryptocurrency platform. The company rests on three pillars: building trust via top-tier security, enhancing accessibility through intuitive experiences, and scaling operations efficiently across more than 100 countries.

AI and machine learning have long underpinned fraud detection, risk assessment, personalization, and infrastructure scaling at Coinbase. Recent innovations include graph neural network-based risk scoring for blockchain addresses, ERC-20 scam token detection combining smart contract auditing with ML, and predictive scaling models to handle market volatility.

With the advent of large language models, Coinbase identified three high-impact generative AI domains: customer support automation, compliance process acceleration, and developer productivity enhancement.

Transforming Customer Support with Agentic Workflows

Crypto markets exhibit extreme volatility, driving unpredictable spikes in user inquiries that challenge traditional human-staffed support models. Coinbase addressed this through a unified generative AI platform granting fluid access to models and internal data via standardized interfaces.

The architecture features a virtual assistant handling routine interactions autonomously and an agent-assist tool empowering human representatives. The virtual assistant resolves straightforward cases end-to-end, while the assistive tool synthesizes real-time information from knowledge bases and tools, providing agents with contextual summaries, suggested responses, and multilingual capabilities.

Results demonstrate significant impact: approximately 65% of customer contacts are now automated, yielding nearly five million annualized employee-hour savings. Automated cases resolve in under ten minutes—contrasting sharply with up to forty minutes for human-handled escalations—dramatically improving customer satisfaction and operational efficiency.

Streamlining Compliance through AI-Augmented Investigations

Regulatory compliance in financial services demands rigorous processes such as KYC, KYB, and transaction monitoring. These workflows are labor-intensive, require exhaustive explainability, and must adapt to diverse jurisdictional requirements.

Coinbase augmented traditional ML-based risk detection models (deployed via Anyscale on AWS EKS) with generative AI. A compliance-assist tool aggregates data from internal systems and open-source intelligence, producing narrative summaries and risk signals for human reviewers.

At the core lies an autoresolution engine orchestrating holistic reviews. Upon a high-risk alert, the engine coordinates data synthesis, automated actions, human-in-the-loop feedback, and customer information requests. Final decisions—such as filing Suspicious Activity Reports—remain with human compliance officers, preserving accountability while accelerating throughput and consistency.

Boosting Developer Productivity across the SDLC

Developer efficiency emerged as another strategic priority. Coinbase provides multiple best-in-class coding assistants (e.g., Claude Code, Cursor) powered by Anthropic models via Bedrock, allowing engineers to select preferred tools.

A custom GitHub Action automates pull-request reviews: summarizing changes, generating natural-language comments, enforcing conventions, identifying testing gaps, and offering debugging guidance for CI failures. This shifts human review toward higher-value architectural concerns.

For quality assurance, an in-house UI testing tool translates natural-language test descriptions into autonomous browser actions across form factors, achieving parity with human accuracy, triple the bug-detection rate, and 86% cost reduction versus manual testing.

Quantifiable outcomes include nearly 40% of daily code being AI-generated or influenced (targeting 50%), 75,000 annual hours saved via automated PR reviews, and dramatically faster test introduction.

Future Directions and Platform Modernization

Coinbase aims to democratize agentic AI across the organization, enabling every employee to experiment and innovate. Ongoing efforts focus on modernizing existing tools and scaling enterprise-wide impact.

Agent Core features—secure deployment, robust identity management, advanced memory, and interoperability—are viewed as pivotal for the next phase of expansion.

Conclusion

The Coinbase case illustrates a mature approach to generative AI deployment: leveraging a unified platform on Amazon Bedrock to address volatility-driven operational challenges while upholding security and regulatory standards. By combining autonomous agents, human augmentation, and rigorous evaluation, the company has realized substantial automation, cost savings, and quality improvements across support, compliance, and engineering functions. As agentic systems evolve, such integrated architectures offer a blueprint for financial institutions seeking transformative efficiency without compromising trust.

Links:

Lecture video

Posted in en-US | Tags: AgenticAI, AmazonBedrock, AWSreInvent, AWSReInvent2025, Coinbase, Compliance, Crypto, CustomerSupport, DeveloperProductivity, FinancialServices, GenerativeAI, JoshuaSmith, MachineLearning, VaraMaharivan | No Comments »

[AWSReInventPartnerSessions2024] Usage

Author: Jonathan Lalou

spec = “Sort a list of numbers”
code = generate_code(spec)
tests = [([3, 1, 2], [1, 2, 3]), ([5, 4], [4, 5])]
if test_code(code, tests):
print(“Code passes tests”)
“`

This exemplifies the iterative process of generation and validation central to the platform.

Analytical Implications for Efficiency and Innovation

The deployment of GenWizard reveals profound implications for operational efficiency. By automating repetitive tasks, it allows teams to focus on high-value activities, reducing project timelines by up to seventy percent in some cases. This efficiency stems from the platform’s ability to handle complex correlations and predictions, as seen in incident management where noise reduction leads to faster resolutions.

Innovation is fostered through enhanced decision-making. The system’s knowledge base, enriched with historical data and AI insights, supports proactive strategies like predictive maintenance and application rationalization. For instance, analyzing application portfolios identifies redundancies, enabling cost savings and streamlined operations.

Collaboration with technology partners like AWS amplifies these benefits. Amazon Q’s integration ensures seamless natural language interactions, democratizing access to advanced tools and promoting a culture of continuous improvement.

Consequences for Enterprise Adoption and Future Directions

Enterprise adoption of such platforms mitigates risks associated with legacy systems, facilitating smoother migrations and modernizations. However, challenges include ensuring data privacy and model accuracy, addressed through robust governance frameworks.

Future directions involve expanding agentic capabilities to encompass more lifecycle stages, potentially incorporating multimodal AI for broader applications. This could revolutionize industries by enabling autonomous operations, where systems self-optimize based on real-time data.

In conclusion, the fusion of generative AI with service delivery platforms like GenWizard, powered by AWS, represents a paradigm shift toward intelligent, efficient technology management, promising sustained competitive advantages.

Links:

Posted in en-US | Tags: Accenture, AmazonQ, AWS, AWSReInventPartnerSessions2024, GenerativeAI, Innovation, KishorPanth, LukeHiggins, ServiceDelivery, TechnologyLifecycle | No Comments »

[GoogleIO2025] What’s new in Go

Author: Jonathan Lalou

Keynote Speakers

Cameron Balahan serves as the Group Product Manager and lead for the Go programming language at Google, overseeing its strategic development and integration within cloud ecosystems. With a background from The George Washington University, he focuses on enhancing developer productivity and scaling tools for mission-critical applications.

Marc Dougherty functions as the lead for Developer Relations in Go at Google, bridging the community with advancements in the language. His expertise lies in site reliability engineering turned developer advocacy, emphasizing practical implementations for reliable software systems.

Abstract

This scholarly examination probes the recent evolutions in the Go programming language, particularly version 1.24, spotlighting enhancements in cryptography, type systems, and runtime efficiency. It dissects foundational principles guiding Go’s design, methodologies for AI infrastructure integration, and forward-looking initiatives like SIMD optimizations. Through code demonstrations and contextual analyses, the narrative evaluates implications for scalable, secure software engineering, underscoring Go’s role in contemporary cloud and generative AI landscapes.

Foundational Principles and Historical Context

Cameron Balahan and Marc Dougherty commence by delineating Go’s origins, conceived over 15 years ago at Google to reconcile productivity in dynamic languages with the robustness of compiled ones. Balahan articulates Go’s ethos: a language engineered for scalability from inception, addressing modern software architectures, operational environments, and collaborative teams. This premise manifests in three pillars: productivity through simplicity and readability; a holistic developer ecosystem spanning IDE to deployment; and production readiness emphasizing reliability, efficiency, and security.

Contextually, Go emerged amid Google’s challenges in maintaining vast systems, evolving into a cornerstone of cloud infrastructure. Dougherty highlights its adoption in pivotal technologies like Kubernetes and Docker, attributing this to inherent cloud-native features rather than retrofits. User satisfaction metrics, exceptionally high, reflect this alignment, with Go’s growth surpassing developer population trends.

The discourse transitions to version 1.24’s innovations, building on 1.23’s iterator additions and runtime telemetry. Balahan explains post-quantum cryptography integration, fortifying against quantum threats via hybrid key exchanges in TLS. This methodology combines classical and quantum-resistant algorithms, ensuring forward compatibility without immediate overhauls.

Type alias generics, now fully supported, enhance code modularity by permitting aliases with type parameters, facilitating incremental migrations in large codebases. Runtime optimizations, including profile-guided enhancements, reduce CPU overhead by 2-3%, optimizing garbage collection and scheduling for high-throughput scenarios.

Implications extend to enterprise adoption, where Go’s backward compatibility—unchanged since version 1.0—assures long-term stability, contrasting with languages prone to breaking changes.

AI Infrastructure and Generative Applications

Dougherty pivots to Go’s burgeoning role in AI, leveraging its concurrency model and efficiency for infrastructure like vector databases and serving frameworks. He posits Go’s simplicity as ideal for AI’s rapid evolution, where readable code withstands complexity.

Methodologies for AI workloads involve embedding models and vector stores, demonstrated via integrations with Gemini and Weaviate. Code samples illustrate query handling:

func handleQuery(query string) {
    // Embed query using Gemini
    embedding := gemini.Embed(query)

    // Query Weaviate via GraphQL
    docs := weaviate.Query(embedding)

    // Generate response
    response := gemini.Generate(docs)
}

Frameworks like LangChain Go and Firebase Genkit abstract LLM and database interactions, promoting modularity. Genkit’s observability tools enhance debugging in production.

Contextually, Go’s provenance in cloud-native tools positions it for AI’s distributed nature, implying reduced latency in inference pipelines. Implications include seamless migrations amid technological shifts, bolstered by interfaces and embedding.

Future Directions and Community Ecosystem

Balahan outlines forthcoming enhancements in Go 1.25, emphasizing SIMD for vectorized operations crucial to AI optimizations. Multi-core advancements target non-uniform memory access, refining garbage collection for modern hardware.

Language polish focuses on generic flexibility, with community discussions on GitHub informing iterations. Compatibility remains sacrosanct, ensuring legacy code viability.

The ecosystem’s vitality—robust libraries for AI, vibrant meetups—underscores collaborative growth. Dougherty credits community contributions for Go’s relevance, implying sustained innovation through open-source synergy.

Analytically, these trajectories affirm Go’s adaptability, with implications for AI-driven economies where efficient, secure languages predominate.

Links:

Posted in en-US | Tags: CameronBalahan, GenerativeAI, GoLanguage, Google, GoogleIO2025, MarcDougherty, PostQuantumCryptography, VectorDatabases | No Comments »

[AWSReInvent2025] Revolutionizing DevSecOps: How Cathay Pacific Achieved 75% Faster Security with Agentic AI

Author: Jonathan Lalou

Lecturer

Mike Markell is a Practice Manager for AWS Professional Services in Hong Kong, where he leads digital transformation and security initiatives for major enterprises across Asia. Naresh Sharma is a senior technology leader at Cathay Pacific Airways, overseeing the airline’s global application security and DevSecOps strategy. Tony Leong is a Senior Security Architect at Cathay, specialized in building AI-powered security tooling and integrating AppSec-as-Code into high-velocity deployment pipelines.

Abstract

In the highly regulated and high-stakes environment of global aviation, managing security across more than 4,000 annual deployments presents a massive operational challenge. This article details how Cathay Pacific Airways revolutionized its “security-first” culture by moving beyond traditional security scanning to a comprehensive DevSecOps model. The core methodology centers on the implementation of Agentic AI and a RAG-based (Retrieval-Augmented Generation) assistant to solve the industry’s “false positive crisis.” By deploying “AI-powered security champions” and customized scanning rules, Cathay achieved a 75% reduction in vulnerability remediation time and a 50% reduction in security operations costs. The analysis explores the technical and cultural shifts required to empower over 1,000 developers to become proactive security practitioners while maintaining the airline’s rapid pace of innovation.

Context: The Bottleneck of Manual Security Reviews

For a global leader like Cathay Pacific, the pace of digital innovation is essential for maintaining a competitive edge in the aviation industry. However, this speed was being severely hindered by the limitations of traditional security scanning tools. The primary conflict centered on a high noise-to-signal ratio, where approximately 78% of the vulnerabilities identified by standard tools were determined to be false positives. This created a crisis where security teams were overwhelmed by alerts, leading to significant delays in the deployment of features for the airline’s fleet.

Furthermore, the manual review process required to validate these alerts created significant friction between the security and development teams. Developers often viewed security requirements as a hurdle that slowed down their ability to deliver value, while security professionals struggled to keep up with the volume of code being produced. To overcome these challenges, Cathay needed a solution that could scale with their deployment frequency—which covers everything from customer-facing apps to critical flight operation systems—without compromising on the rigorous safety standards that define the brand.

Methodology: Implementing Shift-Left Security with AI

The solution implemented by Cathay Pacific and AWS Professional Services involved a comprehensive “shift-left” strategy, which integrates security at the very beginning of the software development lifecycle. The cornerstone of this methodology is the use of Agentic AI. Unlike traditional static scanners, these AI agents act as “security champions” that provide real-time, context-aware guidance to developers as they write code. This allows for the identification of security anti-patterns and the suggestion of defensive coding practices before the code is even committed to a repository.

Another critical component of the methodology is the AppSec-as-Code library. This centralized knowledge base translates complex security policies into programmatic requirements that can be automatically enforced within CI/CD pipelines. To make this information accessible to developers, the team developed a RAG-based (Retrieval-Augmented Generation) assistant. This tool allows developers to query internal security standards using natural language, receiving accurate and context-specific advice instantly. Finally, the team moved away from “out of the box” tool configurations in favor of highly customized scanning rules. This technical fine-tuning was essential for drastically reducing the false-positive rate and ensuring that the security team only focused on legitimate threats.

Technical Analysis of Operational Gains

The implementation of AI-driven DevSecOps has yielded remarkable quantitative results for Cathay Pacific. The most significant outcome is a 75% reduction in the time required to remediate vulnerabilities. Because the AI agents filter out the vast majority of false positives and provide developers with clear, actionable fix suggestions, the entire security lifecycle has been compressed. Qualitatively, this has led to a 70% improvement in developer security capability, as the tools effectively serve as an automated, on-the-job training system that reinforces secure coding habits.

From a financial perspective, the automation of manual reviews and the reduction in wasted engineering time have led to a 50% cost reduction in security operations. The airline is now able to manage over 4,000 deployments annually with a higher level of confidence and lower overhead than was previously possible. A critical technical lesson learned during the journey was that “by default, no tool is perfect.” Success required a commitment to continuous customization and a willingness to collaborate with product vendors to tune their tools to the specific needs of the aviation industry. This iterative feedback loop was the key to moving from “human-in-the-loop” automation to a more efficient “AI-informed” model.

Consequences: A Cultural and Technical Transformation

The transformation at Cathay Pacific extended far beyond the technical architecture; it required a fundamental shift in the organization’s culture. The success of the project was predicated on a “can-do” spirit and the setting of ambitious targets that challenged the status quo. By providing developers with the tools to take ownership of security, the organization has fostered a culture where security is seen as a shared responsibility rather than an external constraint.

The implications for the global aviation and enterprise sectors are significant. Cathay has proven that it is possible to maintain a high-velocity deployment schedule in a safety-critical environment by leveraging the power of generative AI. Looking forward, the organization plans to develop even more insightful dashboards to provide security leaders with real-time visibility into the health of the application portfolio. The journey serves as a powerful testament to how Agentic AI can bridge the gap between agility and security, turning a potential bottleneck into a powerful competitive advantage.

Links:

Posted in en-US | Tags: AgenticAI, Automation, AWS, AWSReInvent2025, CathayPacific, Cybersecurity, DevSecOps, GenerativeAI, MikeMarkell, NareshSharma, ShiftLeft, TonyLeong | No Comments »

[NDCMelbourne2025] How to Work with Generative AI in JavaScript – Phil Nash

Author: Jonathan Lalou

Phil Nash, a developer relations engineer at DataStax, delivers a comprehensive guide to leveraging generative AI in JavaScript at NDC Melbourne 2025. His talk demystifies the process of building AI-powered applications, emphasizing that JavaScript developers can harness existing skills to create sophisticated solutions without needing deep machine learning expertise. Through practical examples and insights into tools like Gemini and retrieval-augmented generation (RAG), Phil empowers developers to explore this rapidly evolving field.

Understanding Generative AI Fundamentals

Phil begins by addressing the excitement surrounding generative AI, noting its accessibility since the release of the GPT-3.5 API two years ago. He emphasizes that JavaScript developers are well-positioned to engage with AI due to robust tooling and APIs, despite the field’s Python-centric origins. Using Google’s Gemini model as an example, Phil demonstrates how to generate content with minimal code, highlighting the importance of understanding core concepts like token generation and model behavior.

He explains tokenization, using OpenAI’s byte pair encoding as an example, where text is broken into probabilistic tokens. Parameters like top-k, top-p, and temperature allow developers to control output randomness, with Phil cautioning against overly high settings that produce nonsensical results, humorously illustrated by a chaotic AI-generated story about a gnome.

Enhancing AI with Prompt Engineering

Prompt engineering emerges as a critical skill for refining AI outputs. Phil contrasts zero-shot prompting, which offers minimal context, with techniques like providing examples or system prompts to guide model behavior. For instance, a system prompt defining a “capital city assistant” ensures concise, accurate responses. He also explores chain-of-thought prompting, where instructing the model to think step-by-step improves its ability to solve complex problems, such as a modified river-crossing riddle.

Phil underscores the need for evaluation to ensure prompt reliability, as slight changes can significantly alter outcomes. This structured approach transforms prompt engineering from guesswork into a disciplined practice, enabling developers to tailor AI responses effectively.

Retrieval-Augmented Generation for Contextual Awareness

To address AI models’ limitations, such as outdated or private data, Phil introduces retrieval-augmented generation (RAG). RAG enhances models by integrating external data, like conference talk descriptions, into prompts. He explains how vector embeddings—multidimensional representations of text—enable semantic searches, using cosine similarity to find relevant content. With DataStax’s Astra DB, developers can store and query vectorized data efficiently, as demonstrated in a demo where Phil’s bot retrieves details about NDC Melbourne talks.

This approach allows AI to provide contextually relevant answers, such as identifying AI-related talks or conference events, making it a powerful tool for building intelligent applications.

Streaming Responses and Building Agents

Phil highlights the importance of user experience, noting that AI responses can be slow. Streaming, supported by APIs like Gemini’s generateContentStream, delivers tokens incrementally, improving perceived performance. He demonstrates streaming results to a webpage using JavaScript’s fetch and text decoder streams, showcasing how to create responsive front-end experiences.

The talk culminates with AI agents, which Phil describes as systems that perceive, reason, plan, and act using tools. By defining functions in JSON schema, developers can enable models to perform tasks like arithmetic or fetching web content. A demo bot uses tools to troubleshoot a keyboard issue and query GitHub, illustrating agents’ potential to solve complex problems dynamically.

Conclusion: Empowering JavaScript Developers

Phil concludes by encouraging developers to experiment with generative AI, leveraging tools like Langflow for visual prototyping and exploring browser-based models like Gemini Nano. His talk is a call to action, urging JavaScript developers to build innovative applications by combining AI capabilities with their existing expertise. By mastering prompt engineering, RAG, streaming, and agents, developers can create powerful, user-centric solutions.

Links:

Posted in en-US | Tags: AI, DataStax, GenerativeAI, JavaScript, NDCConferences, NDCMelbourne2025, PhilNash, PromptEngineering, RAG | No Comments »

[DevoxxUK2025] Concerto for Java and AI: Building Production-Ready LLM Applications

Author: Jonathan Lalou

At DevoxxUK2025, Thomas Vitale, a software engineer at Systematic, delivered an inspiring session on integrating generative AI into Java applications to enhance his music composition process. Combining his passion for music and software engineering, Thomas showcased a “composer assistant” application built with Spring AI, addressing real-world use cases like text classification, semantic search, and structured data extraction. Through live coding and a musical performance, he demonstrated how Java developers can leverage large language models (LLMs) for production-ready applications, emphasizing security, observability, and developer experience. His talk culminated in a live composition for an audience-chosen action movie scene, blending AI-driven suggestions with human creativity.

The Why Factor for AI Integration

Thomas introduced his “Why Factor” to evaluate hype technologies like generative AI. First, identify the problem: for his composer assistant, he needed to organize and access musical data efficiently. Second, assess production readiness: LLMs must be secure and reliable for real-world use. Third, prioritize developer experience: tools like Spring AI simplify integration without disrupting workflows. By focusing on these principles, Thomas avoided blindly adopting AI, ensuring it solved specific issues, such as automating data classification to free up time for creative tasks like composing music.

Enhancing Applications with Spring AI

Using a Spring Boot application with a Thymeleaf frontend, Thomas integrated Spring AI to connect to LLMs like those from Ollama (local) and Mistral AI (cloud). He demonstrated text classification by creating a POST endpoint to categorize musical data (e.g., “Irish tin whistle” as an instrument) using a chat client API. To mitigate risks like prompt injection attacks, he employed Java enumerations to enforce structured outputs, converting free text into JSON-parsed Java objects. This approach ensured security and usability, allowing developers to swap models without code changes, enhancing flexibility for production environments.

Semantic Search and Retrieval-Augmented Generation

Thomas addressed the challenge of searching musical data by meaning, not just keywords, using semantic search. By leveraging embedding models in Spring AI, he converted text (e.g., “melancholic”) into numerical vectors stored in a PostgreSQL database, enabling searches for related terms like “sad.” He extended this with retrieval-augmented generation (RAG), where a chat client advisor retrieves relevant data before querying the LLM. For instance, asking, “What instruments for a melancholic scene?” returned suggestions like cello, based on his dataset, improving search accuracy and user experience.

Structured Data Extraction and Human Oversight

To streamline data entry, Thomas implemented structured data extraction, converting unstructured director notes (e.g., from audio recordings) into JSON objects for database storage. Spring AI facilitated this by defining a JSON schema for the LLM to follow, ensuring structured outputs. Recognizing LLMs’ potential for errors, he emphasized keeping humans in the loop, requiring users to review extracted data before saving. This approach, applied to his composer assistant, reduced manual effort while maintaining accuracy, applicable to scenarios like customer support ticket processing.

Tools and MCP for Enhanced Functionality

Thomas enhanced his application with tools, enabling LLMs to call internal APIs, such as saving composition notes. Using Spring Data, he annotated methods to make them accessible to the model, allowing automated actions like data storage. He also introduced the Model Context Protocol (MCP), implemented in Quarkus, to integrate with external music software via MIDI signals. This allowed the LLM to play chord progressions (e.g., in A minor) through his piano software, demonstrating how MCP extends AI capabilities across local processes, though he cautioned it’s not yet production-ready.

Observability and Live Composition

To ensure production readiness, Thomas integrated OpenTelemetry for observability, tracking LLM operations like token usage and prompt augmentation. During the session, he invited the audience to choose a movie scene (action won) and used his application to generate a composition plan, suggesting chord progressions (e.g., I-VI-III-VII) and instruments like percussion and strings. He performed the music live, copy-pasting AI-suggested notes into his software, fixing minor bugs, and adding creative touches, showcasing a practical blend of AI automation and human artistry.

Links:

Posted in en-US | Tags: DevoxxUK2025, GenerativeAI, Java, LLMApplications, SpringAI, ThomasVitale | No Comments »

[AWSReInventPartnerSessions2024] Constructing Real-Time Generative AI Systems through Integrated Streaming, Managed Models, and Safety-Centric Language Architectures

Author: Jonathan Lalou

Lecturer

Pascal Vuylsteker serves as Senior Director of Innovation at Confluent, where he spearheads advancements in scalable data streaming platforms designed to empower enterprise artificial intelligence initiatives. Mario Rodriguez operates as Senior Partner Solutions Architect at AWS, concentrating on seamless integrations of generative AI services within cloud ecosystems. Gavin Doyle heads the Applied AI team at Anthropic, directing efforts toward developing reliable, interpretable, and ethically aligned large language models.

Abstract

This comprehensive scholarly analysis investigates the foundational principles and practical methodologies for deploying real-time generative AI applications by harmonizing Confluent’s data streaming capabilities with Amazon Bedrock’s fully managed foundation model access and Anthropic’s advanced language models. The discussion centers on establishing robust data governance frameworks, implementing retrieval-augmented generation with continuous contextual updates, and leveraging Flink SQL for instantaneous inference. Through detailed architectural examinations and illustrative configurations, the article elucidates how these components dismantle data silos, ensure up-to-date relevance in AI responses, and facilitate scalable, secure innovation across organizational boundaries.

Establishing Governance-Centric Modern Data Infrastructures

Contemporary enterprise environments increasingly acknowledge the indispensable role of data streaming in fostering operational agility. Empirical insights reveal that seventy-nine percent of information technology executives consider real-time data flows essential for maintaining competitive advantage. Nevertheless, persistent obstacles—ranging from fragmented technical competencies and isolated data repositories to escalating governance complexities and heightened expectations from generative AI adoption—continue to hinder comprehensive exploitation of these potentials.

To counteract such impediments, contemporary data architectures prioritize governance as the pivotal nucleus. This core ensures that information remains secure, compliant with regulatory standards, and readily accessible to authorized stakeholders. Encircling this nucleus are interdependent elements including data warehouses for structured storage, streaming analytics for immediate processing, and generative AI applications that derive actionable intelligence. Such a holistic configuration empowers institutions to eradicate silos, achieve elastic scalability, and satisfy burgeoning demands for instantaneous insights.

Confluent emerges as the vital connective framework within this paradigm, facilitating uninterrupted real-time data synchronization across disparate systems. By bridging ingestion pipelines, data lakes, and batch-oriented workflows, Confluent guarantees that information arrives at designated destinations precisely when required. Absent this foundational layer, the construction of cohesive generative AI solutions becomes substantially more arduous, often resulting in delayed or inconsistent outputs.

Complementing this streaming backbone, Amazon Bedrock delivers a fully managed service granting access to an array of foundation models sourced from leading providers such as AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon itself. Bedrock supports diverse experimentation modalities, enables model customization through fine-tuning or extended pre-training, and permits the orchestration of intelligent agents without necessitating extensive coding expertise. From a security perspective, Bedrock rigorously prohibits the incorporation of customer data into baseline models, maintains isolation for fine-tuned variants, implements encryption protocols, enforces granular access controls aligned with AWS identity management, and adheres to certifications including HIPAA, GDPR, SOC, ISO, and CSA STAR.

The differentiation of generative AI applications hinges predominantly on proprietary datasets. Organizations possessing comparable access to foundation models achieve superiority by capitalizing on unique internal assets. Three principal techniques harness this advantage: retrieval-augmented generation incorporates external knowledge directly into prompt engineering; fine-tuning crafts specialized models tailored to domain-specific corpora; continued pre-training broadens model comprehension using enterprise-scale information repositories.

For instance, an online travel agency might synthesize personalized itineraries by amalgamating live flight availability, client profiles, inventory levels, and historical preferences. AWS furnishes an extensive suite of services accommodating unstructured, structured, streaming, and vectorized data formats, thereby enabling seamless integration across heterogeneous sources while preserving lifecycle security.

Orchestrating Real-Time Contextual Enrichment and Inference Mechanisms

Confluent assumes a critical position by directly interfacing with vector databases, thereby assuring that conversational AI frameworks consistently operate upon the most pertinent and current information. This integration transcends basic data translocation, emphasizing the delivery of contextualized, AI-actionable content.

Central to this orchestration is Flink Inference, a sophisticated capability within Confluent Cloud that facilitates instantaneous machine learning predictions through Flink SQL syntax. This approach dramatically simplifies the embedding of predictive models into operational workflows, yielding immediate analytical outcomes and supporting real-time decision-making grounded in accurate, contemporaneous data.

Configuration commences with establishing connectivity between Flink environments and target models utilizing the Confluent command-line interface. Parameters specify endpoints, authentication credentials, and model identifiers—accommodating various Claude iterations alongside other compatible architectures. Subsequent commands define reusable prompt templates, allowing baseline instructions to persist while dynamic elements vary per invocation. Finally, data insertion invokes the ML_PREDICT function, passing relevant parameters for processing.

Architecturally, the pipeline initiates with document or metadata publication to Kafka topics, forming ingress points for downstream transformation. Where appropriate, documents undergo segmentation into manageable chunks to promote parallel execution and enhance computational efficiency. Embeddings are then generated for each segment leveraging Bedrock or Anthropic services, after which these vector representations—accompanied by original chunks—are indexed within a vector store such as MongoDB Atlas.

To accelerate adoption, dedicated quick-start repositories provide deployable templates encapsulating this workflow. Notably, these templates incorporate structured document summarization via Claude, converting tabular or hierarchical data into narrative abstracts suitable for natural language querying.

Interactive sessions begin through API gateways or direct Kafka clients, enabling bidirectional real-time communication. User queries generate embeddings, which subsequently retrieve semantically aligned documents from the vector repository. Retrieved artifacts, augmented by available streaming context, inform prompt construction to maximize relevance and precision. The resultant engineered prompt undergoes processing by Claude on Anthropic Cloud, producing responses that reflect both historical knowledge and live situational awareness.

Efficiency enhancements include conversational summarization to mitigate token proliferation and refine large language model performance. Empirical observations indicate that Claude-generated query reformulations for vector retrieval substantially outperform direct human phrasing, yielding markedly superior document recall.

CREATE MODEL anthropic_claude WITH (
  'connector' = 'anthropic',
  'endpoint' = 'https://api.anthropic.com/v1/messages',
  'api.key' = 'sk-ant-your-key-here',
  'model' = 'claude-3-opus-20240229'
);

CREATE TABLE refined_queries AS
SELECT ML_PREDICT(
  'anthropic_claude',
  CONCAT('Rephrase for vector search: ', user_query)
) AS optimized_query
FROM raw_interactions;

Flink’s value proposition extends beyond connectivity to encompass cost-effectiveness, automatic scaling for voluminous workloads, and native interoperability with extensive ecosystems. Confluent maintains certified integrations across major AWS offerings, prominent data warehouses including Snowflake and Databricks, and leading vector databases such as MongoDB. Anthropic models remain comprehensively accessible via Bedrock, reflecting strategic collaborations spanning product interfaces to silicon-level optimizations.

Analytical Implications and Strategic Trajectories for Enterprise AI Deployment

The methodological synthesis presented—encompassing streaming orchestration, managed model accessibility, and safety-oriented language processing—fundamentally reconfigures retrieval-augmented generation from static knowledge injection to dynamic reasoning augmentation. This evolution proves indispensable for domains requiring precise interpretation, such as regulatory compliance or legal analysis.

Strategic ramifications are profound. Organizations unlock domain-specific differentiation by leveraging proprietary datasets within real-time contexts, achieving decision-making superiority unattainable through generic models alone. Governance frameworks scale securely, accommodating enterprise-grade requirements without sacrificing velocity.

Persistent challenges, including data provenance assurance and model drift mitigation, necessitate ongoing refinement protocols. Future pathways envision declarative inference paradigms wherein prompts and policies are codified as infrastructure, alongside hybrid architectures merging vector search with continuous streaming for anticipatory intelligence.

Links:

Video: How to build a real-time gen AI app with AWS, Confluent & Anthropic

Posted in en-US | Tags: AmazonBedrock, Anthropic, AWSreInvent, AWSReInventPartnerSessions2024, Confluent, DataStreaming, EnterpriseArchitecture, FlinkInference, GavinDoyle, GenerativeAI, MarioRodriguez, PascalVuylsteker, RealTimeAI, RetrievalAugmentedGeneration | No Comments »

[GoogleIO2024] Under the Hood with Google AI: Exploring Research, Impact, and Future Horizons

Author: Jonathan Lalou

Delving into AI’s foundational elements, Jeff Dean, James Manyika, and Koray Kavukcuoglu, moderated by Laurie Segall, discussed Google’s trajectory. Their dialogue traced historical shifts, current breakthroughs, and societal implications, offering profound perspectives on technology’s evolution.

Tracing AI’s Evolution and Key Milestones

Jeff recounted AI’s journey from rule-based systems to machine learning, highlighting neural networks’ resurgence around 2010 due to computational advances. Early applications at Google, like spelling corrections, paved the way for vision, speech, and language tasks. Koray noted hardware investments’ role in enabling generative methods, transforming content creation across fields.

James emphasized AI’s multiplier effect, reshaping sciences like biology and software development. The panel agreed that multimodal, long-context models like Gemini represent culminations of algorithmic and infrastructural progress, allowing generalization to novel challenges.

Addressing Societal Impacts and Ethical Considerations

James stressed AI’s mirror to humanity, prompting grapples with bias, fairness, and values—issues societies must collectively resolve. Koray advocated responsible deployment, integrating safety from inception through techniques like watermarking and red-teaming. Jeff highlighted balancing innovation with safeguards, ensuring models align with human intent while mitigating harms.

Discussions touched on global accessibility, with efforts to support underrepresented languages and equitable benefits. The leaders underscored collaborative approaches, involving diverse stakeholders to navigate complexities.

Envisioning AI’s Future Applications and Challenges

Koray envisioned AI accelerating healthcare, solving diseases efficiently worldwide. Jeff foresaw enhancements across human endeavors, from education to scientific discovery, if pursued thoughtfully. James hoped AI fosters better humanity, aiding complex problem-solving.

Challenges include advancing agentic systems for multi-step reasoning, improving evaluation beyond benchmarks, and ensuring inclusivity. The panel expressed optimism, viewing AI as an amplifier for positive change when guided responsibly.

Links:

Posted in en-US | Tags: AIResearch, DeepMind, EthicalAI, GenerativeAI, GoogleAI, GoogleIO2024, HealthcareAI, JamesManyika, JeffDean, KorayKavukcuoglu, LaurieSegall | No Comments »