Recent Posts
Archives

Posts Tagged ‘ThomasVitale’

PostHeaderIcon [SpringIO2025] Real-World AI Patterns with Spring AI and Vaadin by Marcus Hellberg / Thomas Vitale

Lecturer

Marcus Hellberg is the Vice President of AI Research at Vaadin, a company specializing in tools for Java developers to build web applications. As a Java Champion with nearly 20 years of experience in Java and web development, he focuses on integrating AI capabilities into Java ecosystems. Thomas Vitale is a software engineer at Systematic, a Danish software company, with expertise in cloud-native solutions, Java, and AI. He is the author of “Cloud Native Spring in Action” and an upcoming book on developer experience on Kubernetes, and serves as a CNCF Ambassador.

Abstract

This article examines practical patterns for incorporating artificial intelligence into Java applications using Spring AI and Vaadin, transitioning from experimental to production-ready implementations. It analyzes techniques for memory management, guardrails, multimodality, retrieval-augmented generation, tool calling, and agents, with implications for security, user experience, and system integration. Insights emphasize robust, observable AI workflows in on-premises or cloud environments.

Memory Management and Streaming in AI Interactions

Integrating large language models (LLMs) into applications requires addressing their stateless nature, where each interaction lacks inherent context from prior exchanges. Spring AI provides advisors—interceptor-like mechanisms—to augment prompts with conversation history, enabling short-term memory. For instance, a MessageChatMemoryAdvisor retains the last N messages, ensuring continuity without manual tracking.

This pattern enhances user interactions in chat-based interfaces, built here with Vaadin’s component model for server-side Java UIs. A vertical layout hosts message lists and inputs, injecting a ChatClientBuilder to construct clients with advisors. Basic interactions involve prompting the model and appending responses, but for realism, streaming via reactive fluxes improves responsiveness, subscribing to token streams and updating UI progressively.

Code illustration:

ChatClient chatClient = builder.build();
messageInput.addSubmitListener(submitEvent -> {
    String message = submitEvent.getMessage();
    MessageItem userItem = messageList.addMessage("You", message);
    chatClient.stream(new Prompt(message))
        .subscribe(response -> {
            userItem.append(response.getResult().getOutput().getContent());
        });
});

Streaming suits verbose responses, reducing perceived latency, while observability integrations (e.g., OpenTelemetry) trace interactions for debugging nondeterministic behaviors.

Guardrails for Security and Validation

AI workflows must mitigate risks like sensitive data leaks or invalid outputs. Input guardrails intercept prompts, using on-premises models to check for compliance with policies, blocking unauthorized queries (e.g., personal information). Output guardrails validate responses, reprompting for corrections if deserialization fails.

Advisors enable this: a default advisor with a local chat model filters inputs/outputs. For example, querying an address might be blocked if flagged, preventing cloud exposure. This ensures determinism in structured outputs, converting unstructured text to Java objects via JSON instructions.

Implications include privacy preservation in regulated sectors and integration with Spring Security for role-based tool access.

Multimodality and Retrieval-Augmented Generation

LLMs extend beyond text through multimodality, processing images, audio, or videos. Spring AI’s entity methods augment prompts for structured extraction, e.g., parsing attendee details from images into tables for programmatic use.

Retrieval-augmented generation (RAG) combats hallucinations by embedding external data as vectors in stores like PostgreSQL. A RetrievalAugmentationAdvisor retrieves relevant documents via similarity search, augmenting prompts. Customizations allow empty contexts for fallback to model knowledge.

Example:

VectorStore vectorStore = // PostgreSQL vector store
DocumentRetriever retriever = new VectorStoreDocumentRetriever(vectorStore);
RetrievalAugmentationAdvisor advisor = RetrievalAugmentationAdvisor.builder()
    .documentRetriever(retriever)
    .queryAugmentor(QueryAugmentor.contextual().allowEmptyContext(true))
    .build();

This pattern grounds responses in proprietary data, with thresholds controlling retrieval scope.

Tool Calling, Agents, and Dynamic Integrations

Tool calling empowers LLMs as agents, invoking external functions for tasks like database queries. Annotations describe tools, passed to clients for dynamic selection. For products, a service might expose query/update methods:

@Tool(description = "Fetch products from database")
public List<Product> getProducts(@P(description = "Category filter") String category) {
    // Database query
}

Agents orchestrate tools, potentially via Model Context Protocol for external services. Demonstrations include theme generation from screenshots, editing CSS via file system tools, highlighting nondeterminism and the need for safeguards.

In conclusion, these patterns enable production AI, emphasizing modularity, security, and observability for robust Java applications.

Links:

PostHeaderIcon [DevoxxUK2025] Concerto for Java and AI: Building Production-Ready LLM Applications

At DevoxxUK2025, Thomas Vitale, a software engineer at Systematic, delivered an inspiring session on integrating generative AI into Java applications to enhance his music composition process. Combining his passion for music and software engineering, Thomas showcased a “composer assistant” application built with Spring AI, addressing real-world use cases like text classification, semantic search, and structured data extraction. Through live coding and a musical performance, he demonstrated how Java developers can leverage large language models (LLMs) for production-ready applications, emphasizing security, observability, and developer experience. His talk culminated in a live composition for an audience-chosen action movie scene, blending AI-driven suggestions with human creativity.

The Why Factor for AI Integration

Thomas introduced his “Why Factor” to evaluate hype technologies like generative AI. First, identify the problem: for his composer assistant, he needed to organize and access musical data efficiently. Second, assess production readiness: LLMs must be secure and reliable for real-world use. Third, prioritize developer experience: tools like Spring AI simplify integration without disrupting workflows. By focusing on these principles, Thomas avoided blindly adopting AI, ensuring it solved specific issues, such as automating data classification to free up time for creative tasks like composing music.

Enhancing Applications with Spring AI

Using a Spring Boot application with a Thymeleaf frontend, Thomas integrated Spring AI to connect to LLMs like those from Ollama (local) and Mistral AI (cloud). He demonstrated text classification by creating a POST endpoint to categorize musical data (e.g., “Irish tin whistle” as an instrument) using a chat client API. To mitigate risks like prompt injection attacks, he employed Java enumerations to enforce structured outputs, converting free text into JSON-parsed Java objects. This approach ensured security and usability, allowing developers to swap models without code changes, enhancing flexibility for production environments.

Semantic Search and Retrieval-Augmented Generation

Thomas addressed the challenge of searching musical data by meaning, not just keywords, using semantic search. By leveraging embedding models in Spring AI, he converted text (e.g., “melancholic”) into numerical vectors stored in a PostgreSQL database, enabling searches for related terms like “sad.” He extended this with retrieval-augmented generation (RAG), where a chat client advisor retrieves relevant data before querying the LLM. For instance, asking, “What instruments for a melancholic scene?” returned suggestions like cello, based on his dataset, improving search accuracy and user experience.

Structured Data Extraction and Human Oversight

To streamline data entry, Thomas implemented structured data extraction, converting unstructured director notes (e.g., from audio recordings) into JSON objects for database storage. Spring AI facilitated this by defining a JSON schema for the LLM to follow, ensuring structured outputs. Recognizing LLMs’ potential for errors, he emphasized keeping humans in the loop, requiring users to review extracted data before saving. This approach, applied to his composer assistant, reduced manual effort while maintaining accuracy, applicable to scenarios like customer support ticket processing.

Tools and MCP for Enhanced Functionality

Thomas enhanced his application with tools, enabling LLMs to call internal APIs, such as saving composition notes. Using Spring Data, he annotated methods to make them accessible to the model, allowing automated actions like data storage. He also introduced the Model Context Protocol (MCP), implemented in Quarkus, to integrate with external music software via MIDI signals. This allowed the LLM to play chord progressions (e.g., in A minor) through his piano software, demonstrating how MCP extends AI capabilities across local processes, though he cautioned it’s not yet production-ready.

Observability and Live Composition

To ensure production readiness, Thomas integrated OpenTelemetry for observability, tracking LLM operations like token usage and prompt augmentation. During the session, he invited the audience to choose a movie scene (action won) and used his application to generate a composition plan, suggesting chord progressions (e.g., I-VI-III-VII) and instruments like percussion and strings. He performed the music live, copy-pasting AI-suggested notes into his software, fixing minor bugs, and adding creative touches, showcasing a practical blend of AI automation and human artistry.

Links:

PostHeaderIcon [DevoxxBE2023] Securing the Supply Chain for Your Java Applications by Thomas Vitale

At Devoxx Belgium 2023, Thomas Vitale, a software engineer and architect at Systematic, delivered an authoritative session on securing the software supply chain for Java applications. As the author of Cloud Native Spring in Action and a passionate advocate for cloud-native technologies, Thomas provided a comprehensive exploration of securing every stage of the software lifecycle, from source code to deployment. Drawing on the SLSA framework and CNCF research, he demonstrated practical techniques for ensuring integrity, authenticity, and resilience using open-source tools like Gradle, Sigstore, and Kyverno. Through a blend of theoretical insights and live demonstrations, Thomas illuminated the critical importance of supply chain security in today’s threat landscape.

Safeguarding Source Code with Git Signatures

Thomas began by defining the software supply chain as the end-to-end process of delivering software, encompassing code, dependencies, tools, practices, and people. He emphasized the risks at each stage, starting with source code. Using Git as an example, Thomas highlighted its audit trail capabilities but cautioned that commit authorship can be manipulated. In a live demo, he showed how he could impersonate a colleague by altering Git’s username and email, underscoring the need for signed commits. By enforcing signed commits with GPG or SSH keys—or preferably a keyless approach via GitHub’s single sign-on—developers can ensure commit authenticity, establishing a verifiable provenance trail critical for supply chain security.

Managing Dependencies with Software Bills of Materials (SBOMs)

Moving to dependencies, Thomas stressed the importance of knowing exactly what libraries are included in a project, especially given vulnerabilities like Log4j. He introduced Software Bills of Materials (SBOMs) as a standardized inventory of software components, akin to a list of ingredients. Using the CycloneDX plugin for Gradle, Thomas demonstrated generating an SBOM during the build process, which provides precise dependency details, including versions, licenses, and hashes for integrity verification. This approach, integrated into Maven or Gradle, ensures accuracy over post-build scanning tools like Snyk, enabling developers to identify vulnerabilities, check license compliance, and verify component integrity before production.

Thomas further showcased Dependency-Track, an OWASP project, to analyze SBOMs and flag vulnerabilities, such as a critical issue in SnakeYAML. He introduced the Vulnerability Exploitability Exchange (VEX) standard, which complements SBOMs by documenting whether vulnerabilities affect an application. In his demo, Thomas marked a SnakeYAML vulnerability as a false positive due to Spring Boot’s safe deserialization, demonstrating how VEX communicates security decisions to stakeholders, reducing unnecessary alerts and ensuring compliance with emerging regulations.

Building Secure Artifacts with Reproducible Builds

The build phase, Thomas explained, is another critical juncture for security. Using Spring Boot as an example, he outlined three packaging methods: JAR files, native executables, and container images. He critiqued Dockerfiles for introducing non-determinism and maintenance overhead, advocating for Cloud Native Buildpacks as a reproducible, secure alternative. In a demo, Thomas built a container image with Buildpacks, highlighting its fixed creation timestamp (January 1, 1980) to ensure identical outputs for unchanged inputs, enhancing security by eliminating variability. This reproducibility, coupled with SBOM generation during the build, ensures artifacts are both secure and traceable.

Signing and Verifying Artifacts with SLSA

To ensure artifact integrity, Thomas introduced the SLSA framework, which provides guidelines for securing software artifacts across the supply chain. He demonstrated signing container images with Sigstore’s Cosign tool, using a keyless approach to avoid managing private keys. This process, integrated into a GitHub Actions pipeline, ensures that artifacts are authentically linked to their creator. Thomas further showcased SLSA’s provenance generation, which documents the artifact’s origin, including the Git commit hash and build steps. By achieving SLSA Level 3, his pipeline provided non-falsifiable provenance, ensuring traceability from source code to deployment.

Securing Deployments with Policy Enforcement

The final stage, deployment, requires validating artifacts to ensure they meet security standards. Thomas demonstrated using Cosign and the SLSA Verifier to validate signatures and provenance, ensuring only trusted artifacts are deployed. On Kubernetes, he introduced Kyverno, a policy engine that enforces signature and provenance checks, automatically rejecting non-compliant deployments. This approach ensures that production environments remain secure, aligning with the principle of validating metadata to prevent unauthorized or tampered artifacts from running.

Conclusion: A Holistic Approach to Supply Chain Security

Thomas’s session at Devoxx Belgium 2023 provided a robust framework for securing Java application supply chains. By addressing source code integrity, dependency management, build reproducibility, artifact signing, and deployment validation, he offered a comprehensive strategy to mitigate risks. His practical demonstrations, grounded in open-source tools and standards like SLSA and VEX, empowered developers to adopt these practices without overwhelming complexity. Thomas’s emphasis on asking “why” at each step encouraged attendees to tailor security measures to their context, ensuring both compliance and resilience in an increasingly regulated landscape.

Links: