Posts Tagged ‘LLM’
[DevoxxFR2025] Building an Agentic AI with Structured Outputs, Function Calling, and MCP
The rapid advancements in Artificial Intelligence, particularly in large language models (LLMs), are enabling the creation of more sophisticated and autonomous AI agents – programs capable of understanding instructions, reasoning, and interacting with their environment to achieve goals. Building such agents requires effective ways for the AI model to communicate programmatically and to trigger external actions. Julien Dubois, in his deep-dive session, explored key techniques and a new protocol essential for constructing these agentic AI systems: Structured Outputs, Function Calling, and the Model-Controller Protocol (MCP). Using practical examples and the latest Java SDK developed by OpenAI, he demonstrated how to implement these features within LangChain4j, showcasing how developers can build AI agents that go beyond simple text generation.
Structured Outputs: Enabling Programmatic Communication
One of the challenges in building AI agents is getting LLMs to produce responses in a structured format that can be easily parsed and used by other parts of the application. Julien explained how Structured Outputs address this by allowing developers to define a specific JSON schema that the AI model must adhere to when generating its response. This ensures that the output is not just free-form text but follows a predictable structure, making it straightforward to map the AI’s response to data objects in programming languages like Java. He demonstrated how to provide the LLM with a JSON schema definition and constrain its output to match that schema, enabling reliable programmatic communication between the AI model and the application logic. This is crucial for scenarios where the AI needs to provide data in a specific format for further processing or action.
Function Calling: Giving AI the Ability to Act
To be truly agentic, an AI needs the ability to perform actions in the real world or interact with external tools and services. Julien introduced Function Calling as a powerful mechanism that allows developers to define functions in their code (e.g., Java methods) and expose them to the AI model. The LLM can then understand when a user’s request requires calling one of these functions and generate a structured output indicating which function to call and with what arguments. The application then intercepts this output, executes the corresponding function, and can provide the function’s result back to the AI, allowing for a multi-turn interaction where the AI reasons, acts, and incorporates the results into its subsequent responses. Julien demonstrated how to define function “signatures” that the AI can understand and how to handle the function calls triggered by the AI, showcasing scenarios like retrieving information from a database or interacting with an external API based on the user’s natural language request.
MCP: Standardizing LLM Interaction
While Structured Outputs and Function Calling provide the capabilities for AI communication and action, the Model-Controller Protocol (MCP) emerges as a new standard to streamline how LLMs interact with various data sources and tools. Julien discussed MCP as a protocol that aims to standardize the communication layer between AI models (the “Model”) and the application logic that orchestrates them and provides access to external resources (the “Controller”). This standardization can facilitate building more portable and interoperable AI agentic systems, allowing developers to switch between different LLMs or integrate new tools and data sources more easily. While details of MCP might still be evolving, its goal is to provide a common interface for tasks like function calling, accessing external knowledge, and managing conversational state. Julien illustrated how libraries like LangChain4j are adopting these concepts and integrating with protocols like MCP to simplify the development of sophisticated AI agents. The presentation, rich in code examples using the OpenAI Java SDK, provided developers with the practical knowledge and tools to start building the next generation of agentic AI applications.
Links:
- Julien Dubois: https://www.linkedin.com/in/juliendubois/
- Microsoft: https://www.microsoft.com/
- LangChain4j on GitHub: https://github.com/langchain4j/langchain4j
- OpenAI: https://openai.com/
- Devoxx France LinkedIn: https://www.linkedin.com/company/devoxx-france/
- Devoxx France Bluesky: https://bsky.app/profile/devoxx.fr
- Devoxx France Website: https://www.devoxx.fr/
[DevoxxGR2025] Simplifying LLM Integration: A Blueprint for Effective AI Systems
Efstratios Marinos captivated attendees at Devoxx Greece 2025 with a masterclass on streamlining large language model (LLM) integrations. By focusing on practical, modular patterns, Efstratios demonstrated how to construct robust, scalable AI systems that prioritize simplicity without sacrificing functionality, offering actionable strategies for developers.
Exploring the Complexity Continuum
Efstratios introduced the concept of a complexity continuum for LLM integrations, spanning from straightforward single calls to sophisticated agentic frameworks. At its simplest, a system comprises an LLM, a retrieval mechanism, and tool capabilities, delivering maintainability and ease of updates with minimal overhead. More intricate setups incorporate routers, APIs, and vector stores, enhancing functionality but complicating debugging. Efstratios emphasized that simplicity is a strategic choice, enabling rapid adaptation to evolving AI technologies. He showcased a concise Python implementation, where a single function manages retrieval and response generation in a handful of lines, contrasting this with a multi-step retrieval-augmented generation (RAG) workflow that involves encoding, indexing, and embedding, adding layers of complexity that demand careful justification.
Crafting Robust Interfaces
Central to Efstratios’s philosophy is the design of clean interfaces for LLMs, retrieval systems, tools, and memory components. He compared prompt crafting to API design, advocating for structured formats that clearly separate instructions, context, and queries. Well-documented tools, complete with detailed descriptions and practical examples, empower LLMs to perform effectively, while vague documentation leads to errors. Efstratios underscored the need for resilient error handling, such as fallback strategies for failed retrievals or tool invocations, to ensure system reliability. For example, a system might respond to a failed search by suggesting alternatives or retrying with adjusted parameters, improving usability and simplifying troubleshooting in production environments.
Enhancing Capabilities with Workflow Patterns
Efstratios explored three foundational workflow patterns—prompt chaining, routing, and parallelization—to optimize performance while managing complexity. Prompt chaining divides complex tasks into sequential steps, such as outlining, drafting, and refining content, enhancing clarity at the expense of increased latency. Routing employs an LLM to categorize inputs and direct them to specialized handlers, like a customer support bot distinguishing technical from financial queries, improving efficiency through focused processing. Parallelization, encompassing sectioning and voting, distributes tasks across multiple LLM instances, such as analyzing document segments concurrently, though it incurs higher computational costs. These patterns provide incremental enhancements, ideal for tasks requiring moderate sophistication.
Advanced Patterns and Decision-Making Principles
For more demanding scenarios, Efstratios presented two advanced patterns: orchestrator-workers and evaluator-optimizer. The orchestrator-workers pattern dynamically breaks down tasks, with a central LLM coordinating specialized workers, perfect for complex coding projects or multi-faceted content creation. The evaluator-optimizer pattern establishes a feedback loop, where a generator LLM produces content and an evaluator refines it iteratively, mirroring human iterative processes. Efstratios outlined six decision-making principles—use case alignment, development effort, maintainability, performance granularity, latency, and cost—to guide pattern selection. Simple solutions suffice for tasks like summarization, while multi-step workflows excel in knowledge-intensive applications. He encouraged starting with minimal solutions, establishing performance baselines, identifying specific limitations, and adding complexity only when validated by measurable gains.
Links:
[Voxxed Amsterdam 2025] From Zero to AI: Building Smart Java or Kotlin Applications with Spring AI
At VoxxedDaysAmsterdam2025, Christian Tzolov, a Spring AI team member at VMware and lead of the MCP Java SDK, delivered a comprehensive session titled “From Zero to AI: Building Smart Java or Kotlin Applications with Spring AI.” Spanning nearly two hours, the session provided a deep dive into integrating generative AI into Java and Kotlin applications using Spring AI, a framework designed to connect enterprise data and APIs with AI models. Through live coding demos, Tzolov showcased practical use cases, including conversation memory, tool/function calling, retrieval-augmented generation (RAG), and multi-agent systems, while addressing challenges like AI hallucinations and observability. Attendees left with actionable insights to start building AI-driven applications, leveraging Spring AI’s portable abstractions and the Model Context Protocol (MCP).
Overcoming LLM Limitations with Spring AI
Tzolov began by outlining the challenges of large language models (LLMs): they are stateless, frozen in time, and lack domain-specific knowledge, requiring developers to provide context, manage state, and handle interactions with external systems. Spring AI addresses these issues with high-level abstractions like the ChatClient, similar to Spring’s RestClient or WebClient, enabling seamless integration with models like OpenAI’s GPT-4o, Anthropic’s Claude, or open-source alternatives like LLaMA. A live demo of a flight booking assistant illustrated these concepts. Tzolov started with a basic Spring Boot application connected to OpenAI, demonstrating a simple chat interface. To ground the model, he used system prompts to define its behavior as a customer support agent for “Fun Air,” ensuring contextually appropriate responses. He then introduced conversation memory using Spring AI’s ChatMemoryAdvisor, which retains a chronological list of messages to maintain state, addressing the stateless nature of LLMs. For long-term memory, Tzolov employed a vector store (Chroma) to store conversation history semantically, retrieving only relevant data for queries, thus overcoming context window limitations. This setup allowed the assistant to respond accurately to queries like “What is my flight status?” by fetching booking details (e.g., booking number 103) from a mock database.
Enhancing AI Applications with Tool Calling and RAG
To enable LLMs to interact with external systems, Tzolov demonstrated tool/function calling, where Spring AI wraps existing services (e.g., a flight booking service) as tools with metadata (name, description, JSON schema). In the demo, the assistant used a getBookingDetails tool to query a database, allowing it to provide accurate flight status updates. Tzolov emphasized the importance of descriptive tool metadata to guide the LLM in deciding when and how to invoke tools, reducing the risk of misinterpretation. For domain-specific knowledge, he introduced prompt stuffing—injecting additional context into prompts—and RAG for dynamic data retrieval. In a RAG demo, cancellation policies were loaded into a Chroma vector store, chunked into meaningful segments, and retrieved dynamically based on user queries. This approach mitigated hallucinations, as seen when the assistant correctly cited a 50% refund policy for premium economy bookings within 40 hours. Tzolov highlighted advanced RAG techniques, such as data compression and reranking, supported by Spring AI’s APIs, and stressed the importance of evaluating responses to ensure relevance, referencing frameworks like those from contributor Thomas Vitali.
Building Multi-Agent Systems with MCP
Tzolov explored the Model Context Protocol (MCP), initiated by Anthropic, as a standardized way to integrate AI applications with external tools and resources across platforms. Using Spring AI’s MCP Java SDK, he demonstrated how to build and consume MCP-compliant tools. In one demo, a Spring AI application connected to MCP servers for Brave Search (JavaScript-based) and file system access, enabling an agent to answer queries about Spring AI support for MCP and write summaries to a file. Another demo reversed the setup, exposing a Spring AI weather tool (using Open-Meteo) as an MCP server, accessible by third-party clients like Claude Desktop via standard I/O or HTTP/SSE transports. Tzolov explained MCP’s bidirectional architecture, where clients can act as servers, supporting features like sampling (allowing servers to request LLM processing from clients). He addressed security concerns, noting Spring AI’s integration with Spring Security (referencing a blog by Daniel Garnier-Moiroux) to secure MCP servers with OAuth 2.1. The session also introduced agentic systems, where LLMs act as a “brain” for planning and tools as a “body” for interaction, with an agentic loop evaluating and refining responses. A work-in-progress demo showcased an orchestration pattern, delegating tasks to searcher, fact-checker, and writer agents, to be published on the Spring AI Community Portal.
Observability and Multimodality for Robust AI Systems
Observability was a key focus, as Tzolov underscored its importance in debugging complex AI interactions. Spring AI integrates with Micrometer to provide metrics (e.g., token usage, model throughput, latency), tracing, and logging (via Loki). A dashboard demo displayed real-time metrics for the flight booking assistant, highlighting tool calls and errors, crucial for diagnosing issues in agentic systems. Tzolov also explored multimodality, demonstrating a voice assistant using OpenAI’s GPT-4o audio preview, which processes audio input and output. Configured as “Marvin the Paranoid Android,” the assistant responded to voice queries with humorous, contextually appropriate replies, showcasing Spring AI’s support for non-text modalities like images, PDFs, and videos (e.g., Gemini’s video support). Tzolov noted that multimodality enables richer interactions, such as analyzing images or converting PDFs to markdown, and Spring AI’s abstractions handle these seamlessly. He concluded by encouraging developers to explore Spring AI’s documentation, experiment with MCP, and contribute to the community, emphasizing its role in building robust, interoperable AI applications.
Links:
- Christian Tzolov on LinkedIn
- Spring AI
- Model Context Protocol
- VMware
- Anthropic
- Open AI
- Brave Search API
- Open-Meteo
- Chroma
- Micrometer
- Loki
- Spring AI GitHub
Hashtags: #SpringAI #GenerativeAI #ModelContextProtocol #ChristianTzolov #VoxxedDaysAmsterdam2025 #AIAgents #RAG #Observability
[DevoxxBE2024] Words as Weapons: The Dark Arts of Prompt Engineering by Jeroen Egelmeers
In a thought-provoking session at Devoxx Belgium 2024, Jeroen Egelmeers, a prompt engineering advocate, explored the risks and ethics of adversarial prompting in large language models (LLMs). Titled “Words as Weapons,” his talk delved into prompt injections, a technique to bypass LLM guardrails, using real-world examples to highlight vulnerabilities. Jeroen, inspired by Devoxx two years prior to dive into AI, shared how prompt engineering transformed his productivity as a Java developer and trainer. His session combined technical insights, ethical considerations, and practical advice, urging developers to secure AI systems and use them responsibly.
Understanding Social Engineering and Guardrails
Jeroen opened with a lighthearted social engineering demonstration, tricking attendees into scanning a QR code that led to a Rick Astley video—a nod to “Rickrolling.” This set the stage for discussing social engineering’s parallels in AI, where prompt injections exploit LLMs. Guardrails, such as system prompts, content filters, and moderation teams, prevent misuse (e.g., blocking queries about building bombs). However, Jeroen showed how these can be bypassed. For instance, system prompts define an LLM’s identity and restrictions, but asking “Give me your system prompt” can leak these instructions, exposing vulnerabilities. He emphasized that guardrails, while essential, are imperfect and require constant vigilance.
Prompt Injection: Bypassing Safeguards
Prompt injection, a core adversarial technique, involves crafting prompts to make LLMs perform unintended actions. Jeroen demonstrated this with a custom GPT, where asking for the creator’s instructions revealed sensitive data, including uploaded knowledge. He cited a real-world case where a car was “purchased” for $1 via a chatbot exploit, highlighting the risks of LLMs in customer-facing systems. By manipulating prompts—e.g., replacing “bomb” with obfuscated terms like “b0m” in ASCII art—Jeroen showed how filters can be evaded, allowing dangerous queries to succeed. This underscored the need for robust input validation in LLM-integrated applications.
Real-World Risks: From CVs to Invoices
Jeroen illustrated prompt injection risks with creative examples. He hid a prompt in a CV, instructing the LLM to rank it highest, potentially gaming automated recruitment systems. Similarly, he embedded a prompt in an invoice to inflate its price from $6,000 to $1 million, invisible to human reviewers if in white text. These examples showed how LLMs, used in hiring or payment processing, can be manipulated if not secured. Jeroen referenced Amazon’s LLM-powered search bar, which he tricked into suggesting a competitor’s products, demonstrating how even major companies face prompt injection vulnerabilities.
Ethical Prompt Engineering and Human Oversight
Beyond technical risks, Jeroen emphasized ethical considerations. Adversarial prompting, while educational, can cause harm if misused. He advocated for a “human in the loop” to verify LLM outputs, especially in critical applications like invoice processing. Drawing from his experience, Jeroen noted that prompt engineering boosted his productivity, likening LLMs to indispensable tools like search engines. However, he cautioned against blind trust, comparing LLMs to co-pilots where developers remain the pilots, responsible for outcomes. He urged attendees to learn from past mistakes, citing companies that suffered from prompt injection exploits.
Key Takeaways and Resources
Jeroen concluded with a call to action: identify one key takeaway from Devoxx and pursue it. For AI, this means mastering prompt engineering while prioritizing security. He shared a website with resources on adversarial prompting and risk analysis, encouraging developers to build secure AI systems. His talk blended humor, technical depth, and ethical reflection, leaving attendees with a clear understanding of prompt injection risks and the importance of responsible AI use.
Links:
[DotJs2024] Generative UI: Bring your React Components to AI Today!
The fusion of artificial intelligence and frontend development is reshaping how we conceive interactive experiences, placing JavaScript engineers at the vanguard of this transformation. Malte Ubl, CTO at Vercel, captivated audiences at dotJS 2024 with a compelling exploration of Generative UI, a pivotal advancement in Vercel’s AI SDK. Originally hailing from Germany and now entrenched in Silicon Valley’s innovation hub, Ubl reflected on the serendipitous echoes of past tech eras—from CGI uploads via FTP to his own contributions like Whiz at Google—before pivoting to AI’s seismic impact. His message was unequivocal: frontend expertise isn’t obsolete in the AI surge; it’s indispensable, empowering developers to craft dynamic, context-aware interfaces that transcend textual exchanges.
Ubl framed the narrative around a paradigm shift from Software 1.0’s laborious machine learning to Software 2.0’s accessible, API-driven intelligence. Where once PhD-level Python tinkering dominated, today’s landscape favors TypeScript applications invoking large language models (LLMs) as services. Models have ballooned in scale and savvy, rendering fine-tuning optional and prompting paramount. This velocity—shipping products in days rather than years—democratizes AI development, yet disrupts traditional roles. Ubl’s optimism stems from a clear positioning: frontend developers as architects of human-AI symbiosis, leveraging React components to ground abstract prompts in tangible interactions.
Central to his demonstration was a conversational airline booking interface, where users query seat changes via natural language. Conventional AI might bombard with options like 14C or 19D, overwhelming without context. Generative UI elevates this: the LLM invokes React server components as functions, streaming a interactive seat map pre-highlighting viable window seats. Users manipulate the UI directly—selecting, visualizing availability—bypassing verbose back-and-forth. Ubl showcased the underlying simplicity: a standard React project with TypeScript files for boarding passes and seat maps, hot-module-reloading enabled, running locally. The magic unfolds via AI functions—React server components that embed client-side state, synced back to the LLM through an “AI state” mechanism. Selecting 19C triggers a callback: “User selected seat 19C,” enabling seamless continuations like checkout flows yielding digital boarding passes.
This isn’t mere novelty; Ubl underscored practical ramifications. End-user chatbots gain depth, support teams wield company-specific components for real-time adjustments, and search engines like the open-source Wary (a Perplexity analog) integrate existing product renderers for enriched results. Accessibility leaps forward too: retrofitting legacy sites with AI state turns static pages into voice-navigable experiences, empowering non-traditional input modalities. Ubl likened AI to a potent backend—API calls fetching not raw data, but rendered intelligence—amplifying human-computer dialogue beyond text. As models from OpenAI, Google Gemini, Anthropic’s Claude, and Mistral proliferate, frontend differentiation via intuitive UIs becomes the competitive edge, uplifting the stack’s user-facing stratum.
Ubl’s closing exhortation: embrace this disruption by viewing React components as AI-native building blocks. Vercel’s AI SDK examples offer starter chatbots primed for customization, accelerating prototyping. In a world where AI smarts escalate, frontend artisans—adept at state orchestration and visual storytelling—emerge as the revolution’s fulcrum, forging empathetic, efficient digital realms.
The Dawn of AI-Infused Interfaces
Ubl vividly contrasted archaic AI chats with generative prowess, using an airline scenario to highlight contextual rendering’s superiority. Prompts yield not lists, but explorable maps—streamed via server components—where selections feed back into the AI loop. This bidirectional flow, powered by AI state, ensures coherence, transforming passive queries into collaborative sessions. Ubl’s live demo, from flight selection to boarding pass issuance, revealed the unobtrusive elegance: plain React, no arcane setups, just LLM-orchestrated functions bridging intent and action.
Empowering Developers in the AI Era
Beyond demos, Ubl advocated for strategic adoption, spotlighting use cases like e-commerce search enhancements and accessibility overlays. Existing components slot into AI workflows effortlessly, while diverse models foster toolkit pluralism. The SDK’s documentation and examples lower barriers, inviting experimentation. Ubl’s thesis: as AI commoditizes logic, frontend’s artistry—crafting modality-agnostic interactions—secures its primacy, heralding an inclusive future where developers orchestrate intelligence with familiar tools.