Posts Tagged ‘SpringAI’

[SpringIO2025] Spring I/O 2025 Keynote

Lecturers

The keynote features Spring leadership: Juergen Hoeller (Framework Lead), Rossen Stoyanchev (Web), Ana Maria Mihalceanu (AI), Moritz Halbritter (Boot), Mark Paluch (Data), Josh Long (Advocate), Mark Pollack (Messaging). Collectively, they steer the Spring portfolio’s technical direction and community engagement.

Abstract

The keynote unveils Spring Framework 7.0 and Boot 4.0, establishing JDK 21 and Jakarta EE 11 as baselines while advancing AOT compilation, virtual threads, structured concurrency, and AI integration. Live demonstrations and roadmap disclosures illustrate how these enhancements—combined with refined observability, web capabilities, and data access—position Spring as the preeminent platform for cloud-native Java development.

Baseline Evolution: JDK 21 and Jakarta EE 11

Spring Framework 7.0 mandates JDK 21, embracing virtual threads for lightweight concurrency and records for immutable data carriers. Jakarta EE 11 introduces the Core Profile and CDI Lite, trimming enterprise bloat. The demonstration showcases a virtual thread-per-request web handler processing 100,000 concurrent connections with minimal heap, in contrast with traditional thread pools. This baseline shift also enables native image compilation via Spring AOT, cutting startup to milliseconds and memory footprint by roughly 90%.
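
A minimal sketch of the thread-per-request model shown in the demo, assuming Spring Boot 3.2+, where the spring.threads.virtual.enabled property switches the embedded server's request handling onto virtual threads; the endpoint and its simulated blocking call are illustrative:

```java
// In application.properties (Spring Boot 3.2+):
// spring.threads.virtual.enabled=true

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@SpringBootApplication
@RestController
public class VirtualThreadDemo {

    public static void main(String[] args) {
        SpringApplication.run(VirtualThreadDemo.class, args);
    }

    // Each request runs on its own virtual thread; blocking calls
    // (JDBC, HTTP clients) park cheaply instead of pinning one of a
    // small pool of platform threads.
    @GetMapping("/report")
    String report() throws InterruptedException {
        Thread.sleep(1000); // simulated blocking I/O
        return "handled on " + Thread.currentThread();
    }
}
```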

AOT and Native Image Optimization

Spring Boot 4.0 refines AOT processing through Project Leyden integration, pre-computing bean definitions and proxy classes at build time. Native executables start up in under 50 ms, suitable for serverless platforms. The live demo compiles a Kafka Streams application to a GraalVM native image, achieving sub-second cold starts and a 15 MB RSS, transforming deployment economics for event-driven microservices.
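
Closed-world native compilation strips anything not reachable by static analysis, so reflective access needs hints. A hedged sketch using Spring's RuntimeHintsRegistrar (part of the AOT engine since Spring Framework 6); the event record and resource pattern are hypothetical:

```java
import org.springframework.aot.hint.MemberCategory;
import org.springframework.aot.hint.RuntimeHints;
import org.springframework.aot.hint.RuntimeHintsRegistrar;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.ImportRuntimeHints;

// Hypothetical event payload that is deserialized reflectively at runtime.
record OrderEvent(String id, long amountCents) {}

// Registers reflection and resource hints so GraalVM keeps what its
// closed-world analysis would otherwise remove from the image.
class KafkaAppHints implements RuntimeHintsRegistrar {
    @Override
    public void registerHints(RuntimeHints hints, ClassLoader classLoader) {
        hints.reflection().registerType(OrderEvent.class,
                MemberCategory.INVOKE_DECLARED_CONSTRUCTORS,
                MemberCategory.INVOKE_DECLARED_METHODS);
        hints.resources().registerPattern("serde/*.json"); // hypothetical resources
    }
}

@Configuration
@ImportRuntimeHints(KafkaAppHints.class)
class NativeConfig {}
```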

AI Integration and Modern Web Capabilities

Spring AI matures with function calling, tool integration, and vector database support. A live-coded agent retrieves beans from a running context to answer natural-language queries about application metrics. On the web side, structured concurrency advances: virtual threads replace Schedulers.boundedElastic() for offloading blocking work, simplifying reactive code. The demonstration contrasts traditional Mono/Flux composition with straightforward sequential logic executing on virtual threads, preserving backpressure while improving readability.
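
A sketch of that sequential style, under the assumption that requests already run on virtual threads: two blocking calls through Spring's RestClient read top to bottom, with no Mono/Flux chaining (the base URL and endpoints are hypothetical):

```java
import org.springframework.web.client.RestClient;

// Blocking, sequential code; when executed on a virtual thread, each
// blocked call parks that virtual thread only, so no reactive
// composition is needed to stay scalable.
class MetricsService {
    private final RestClient http = RestClient.create("http://localhost:8080");

    String summarize(String app) {
        String metrics = http.get().uri("/apps/{app}/metrics", app)
                .retrieve().body(String.class); // blocks this virtual thread only
        String health = http.get().uri("/apps/{app}/health", app)
                .retrieve().body(String.class);
        return app + ": " + health + " / " + metrics;
    }
}
```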

Data, Messaging, and Observability Advancements

Spring Data advances R2DBC connection pooling and Redis Cluster native support. Spring for Apache Kafka 4.0 introduces configurable retry templates and Micrometer metrics out of the box. Unified observability aggregates metrics, traces, and logs: Prometheus exposes 200+ Kafka client metrics, OpenTelemetry correlates spans across HTTP and Kafka, and structured logging propagates MDC context. A Grafana dashboard visualizes end-to-end latency from REST ingress to database commit, enabling proactive incident response.
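
The aggregation described here builds on Micrometer's Observation API, where a single instrumented block emits a timer metric and, with a tracing bridge on the classpath, a correlated span. A minimal sketch; the observation name and tag are illustrative:

```java
import io.micrometer.observation.Observation;
import io.micrometer.observation.ObservationRegistry;

// One Observation produces both a metric and a trace span, keeping
// the two correlated without separate instrumentation code.
class CheckoutService {
    private final ObservationRegistry registry;

    CheckoutService(ObservationRegistry registry) {
        this.registry = registry;
    }

    void checkout(String orderId) {
        Observation.createNotStarted("checkout", registry)
                .lowCardinalityKeyValue("channel", "web")
                .observe(() -> persist(orderId)); // timed and traced
    }

    private void persist(String orderId) { /* database commit */ }
}
```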

Community and Future Trajectory

The keynote celebrates Spring’s global community, highlighting contributions to null-safety (JSpecify), virtual thread testing, and AOT hint generation. Planned enhancements include JDK 23 support, Project Panama integration for native memory access, and AI-driven configuration validation. The vision positions Spring as the substrate for the next decade of Java innovation, balancing cutting-edge capabilities with backward compatibility.

[DevoxxUK2025] Concerto for Java and AI: Building Production-Ready LLM Applications

At DevoxxUK2025, Thomas Vitale, a software engineer at Systematic, delivered an inspiring session on integrating generative AI into Java applications to enhance his music composition process. Combining his passion for music and software engineering, Thomas showcased a “composer assistant” application built with Spring AI, addressing real-world use cases like text classification, semantic search, and structured data extraction. Through live coding and a musical performance, he demonstrated how Java developers can leverage large language models (LLMs) for production-ready applications, emphasizing security, observability, and developer experience. His talk culminated in a live composition for an audience-chosen action movie scene, blending AI-driven suggestions with human creativity.

The Why Factor for AI Integration

Thomas introduced his “Why Factor” to evaluate hyped technologies like generative AI. First, identify the problem: for his composer assistant, he needed to organize and access musical data efficiently. Second, assess production readiness: LLMs must be secure and reliable for real-world use. Third, prioritize developer experience: tools like Spring AI simplify integration without disrupting workflows. By focusing on these principles, Thomas avoided blindly adopting AI, ensuring it solved specific issues, such as automating data classification to free up time for creative tasks like composing music.

Enhancing Applications with Spring AI

Using a Spring Boot application with a Thymeleaf frontend, Thomas integrated Spring AI to connect to LLMs like those from Ollama (local) and Mistral AI (cloud). He demonstrated text classification by creating a POST endpoint to categorize musical data (e.g., “Irish tin whistle” as an instrument) using a chat client API. To mitigate risks like prompt injection attacks, he employed Java enumerations to enforce structured outputs, converting free text into JSON-parsed Java objects. This approach ensured security and usability, allowing developers to swap models without code changes, enhancing flexibility for production environments.
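
A hedged sketch of that enum-constrained classification with Spring AI's ChatClient (1.0-style fluent API); the categories and endpoint are illustrative:

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

// Illustrative categories; constraining the output to an enum means a
// prompt-injected response cannot smuggle arbitrary text through.
enum MusicItemType { INSTRUMENT, GENRE, MOOD, TECHNIQUE }

@RestController
class ClassificationController {
    private final ChatClient chatClient;

    ClassificationController(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    @PostMapping("/classify")
    MusicItemType classify(@RequestBody String text) {
        // Spring AI appends format instructions to the prompt and
        // parses the model's reply into the requested Java type.
        return chatClient.prompt()
                .user("Classify this musical item: " + text)
                .call()
                .entity(MusicItemType.class);
    }
}
```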

Semantic Search and Retrieval-Augmented Generation

Thomas addressed the challenge of searching musical data by meaning, not just keywords, using semantic search. By leveraging embedding models in Spring AI, he converted text (e.g., “melancholic”) into numerical vectors stored in a PostgreSQL database, enabling searches for related terms like “sad.” He extended this with retrieval-augmented generation (RAG), where a chat client advisor retrieves relevant data before querying the LLM. For instance, asking, “What instruments for a melancholic scene?” returned suggestions like cello, based on his dataset, improving search accuracy and user experience.
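
In Spring AI, the retrieval step can be attached as an advisor so every prompt is augmented automatically. A sketch assuming the QuestionAnswerAdvisor from the vector-store advisor support (its package has moved between releases); the service and prompt wording are illustrative:

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.vectorstore.QuestionAnswerAdvisor;
import org.springframework.ai.vectorstore.VectorStore;

// RAG in one chain: the advisor embeds the question, pulls similar
// documents from the vector store (pgvector in the talk), and
// prepends them to the prompt before the model is called.
class SceneAdvisorService {
    private final ChatClient chatClient;

    SceneAdvisorService(ChatClient.Builder builder, VectorStore vectorStore) {
        this.chatClient = builder
                .defaultAdvisors(new QuestionAnswerAdvisor(vectorStore))
                .build();
    }

    String suggestInstruments(String sceneDescription) {
        return chatClient.prompt()
                .user("What instruments suit this scene? " + sceneDescription)
                .call()
                .content();
    }
}
```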

Structured Data Extraction and Human Oversight

To streamline data entry, Thomas implemented structured data extraction, converting unstructured director notes (e.g., from audio recordings) into JSON objects for database storage. Spring AI facilitated this by defining a JSON schema for the LLM to follow, ensuring structured outputs. Recognizing LLMs’ potential for errors, he emphasized keeping humans in the loop, requiring users to review extracted data before saving. This approach, applied to his composer assistant, reduced manual effort while maintaining accuracy, applicable to scenarios like customer support ticket processing.
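
A minimal sketch of that extraction: Spring AI derives a JSON schema from a Java record and parses the model's reply back into it. The record's fields are hypothetical stand-ins for the director-notes structure:

```java
import java.util.List;
import org.springframework.ai.chat.client.ChatClient;

// Hypothetical shape for a director's notes; Spring AI generates a
// JSON schema from the record and instructs the model to fill it.
record CompositionNotes(String scene, String mood, List<String> instruments) {}

class NotesExtractor {
    private final ChatClient chatClient;

    NotesExtractor(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    CompositionNotes extract(String transcript) {
        // The result should still be reviewed by a human before saving:
        // the model can misread or omit fields.
        return chatClient.prompt()
                .user("Extract composition notes from: " + transcript)
                .call()
                .entity(CompositionNotes.class);
    }
}
```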

Tools and MCP for Enhanced Functionality

Thomas enhanced his application with tools, enabling LLMs to call internal APIs, such as saving composition notes. He annotated service methods backed by Spring Data repositories to make them callable by the model, allowing automated actions like data storage. He also introduced the Model Context Protocol (MCP), implemented in Quarkus, to integrate with external music software via MIDI signals. This allowed the LLM to play chord progressions (e.g., in A minor) through his piano software, demonstrating how MCP extends AI capabilities across local processes, though he cautioned that it is not yet production-ready.
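
Spring AI 1.0 expresses such tools with the @Tool annotation (earlier milestones used function beans, which the talk may have shown); a hedged sketch in which the tool body is a placeholder for the real Spring Data call:

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.tool.annotation.Tool;

// A method the model may invoke: Spring AI sends its name, description,
// and parameter schema to the LLM, which decides when to call it.
class CompositionTools {

    @Tool(description = "Save a composition note for the current project")
    String saveNote(String note) {
        // In the real application this would persist via a Spring Data repository.
        System.out.println("Saved: " + note);
        return "saved";
    }
}

class AssistantService {
    String chat(ChatClient chatClient, String userMessage) {
        return chatClient.prompt()
                .user(userMessage)
                .tools(new CompositionTools()) // expose the tools for this call
                .call()
                .content();
    }
}
```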

Observability and Live Composition

To ensure production readiness, Thomas integrated OpenTelemetry for observability, tracking LLM operations like token usage and prompt augmentation. During the session, he invited the audience to choose a movie scene (action won) and used his application to generate a composition plan, suggesting chord progressions (e.g., I-VI-III-VII) and instruments like percussion and strings. He performed the music live, copy-pasting AI-suggested notes into his software, fixing minor bugs, and adding creative touches, showcasing a practical blend of AI automation and human artistry.

[SpringIO2024] Text-to-SQL: Chat with a Database Using Generative AI by Victor Martin & Corrado De Bari @ Spring I/O 2024

At Spring I/O 2024 in Barcelona, Victor Martin, a product manager for Oracle Database, delivered a compelling session on Text-to-SQL, a transformative approach to querying databases using natural language, powered by generative AI. Stepping in for his colleague Corrado De Bari, who was unable to attend, Victor explored how Large Language Models (LLMs) and Oracle’s innovative tools, including Spring AI and Select AI, enable business users with no SQL expertise to interact seamlessly with databases. The talk highlighted practical implementations, security considerations, and emerging technologies like AI Vector Search, offering a glimpse into the future of database interaction.

The Promise of Text-to-SQL

Text-to-SQL leverages LLMs to translate natural-language queries into executable SQL, democratizing data access for non-technical users. Victor began by posing a challenge: how long would it take to build a REST endpoint for a business user to query a database using plain text? Traditionally, this task required manual SQL construction, schema validation, and error handling. With modern frameworks like Spring Boot and Oracle’s Select AI, this process is streamlined. Select AI, integrated into Oracle Database 19c and enhanced in 23ai, supports features like RUN_SQL to execute generated queries, NARRATE to return results as human-readable text, and EXPLAIN_SQL to detail query reasoning. Victor emphasized that these tools reduce development time, enabling rapid deployment of user-friendly database interfaces.
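
From a Spring application these actions are reachable over plain JDBC, since Select AI is surfaced as the DBMS_CLOUD_AI PL/SQL package on Autonomous Database. A sketch with a hypothetical profile name:

```java
import org.springframework.jdbc.core.JdbcTemplate;

// Calls Select AI's GENERATE function; the action parameter switches
// between returning the SQL, running it, or narrating the result.
class SelectAiClient {
    private final JdbcTemplate jdbc;

    SelectAiClient(JdbcTemplate jdbc) {
        this.jdbc = jdbc;
    }

    String ask(String question, String action) {
        return jdbc.queryForObject(
                "SELECT DBMS_CLOUD_AI.GENERATE(" +
                "  prompt       => ?," +
                "  profile_name => 'SALES_PROFILE'," + // hypothetical profile
                "  action       => ?) FROM dual",
                String.class, question, action);
    }
}
// e.g. ask("how many customers ordered last month", "narrate")
```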

Configuring Oracle Database for Text-to-SQL

Implementing Text-to-SQL requires minimal configuration within Oracle Database. Victor outlined the steps: first, set up an Access Control List (ACL) to allow external LLM calls, specifying the host and port. Next, create credentials for the LLM service (e.g., Oracle Cloud Infrastructure Generative AI, OpenAI, or Azure OpenAI) using the DBMS_CLOUD_AI package. Finally, define a profile linking the schema, tables, and chosen LLM. This profile is selected per session to ensure queries use the correct context. Victor demonstrated this with a Spring Boot application, where the profile is set before invoking Select AI. The simplicity of this setup, combined with Spring AI’s abstraction, makes it accessible even for developers new to AI-driven database interactions.
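
A hedged sketch of that per-session selection; SET_PROFILE is session state, so both statements must run on one connection, which a transaction guarantees here (the profile name is hypothetical):

```java
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.transaction.annotation.Transactional;

public class SelectAiSession {
    private final JdbcTemplate jdbc;

    public SelectAiSession(JdbcTemplate jdbc) {
        this.jdbc = jdbc;
    }

    // The transaction pins one connection, so the profile set here
    // still applies when GENERATE runs in the next statement.
    @Transactional
    public String toSql(String naturalLanguage) {
        jdbc.execute("BEGIN DBMS_CLOUD_AI.SET_PROFILE('SALES_PROFILE'); END;");
        return jdbc.queryForObject(
                "SELECT DBMS_CLOUD_AI.GENERATE(prompt => ?, action => 'showsql') FROM dual",
                String.class, naturalLanguage);
    }
}
```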

Enhancing Queries with Schema Annotations

A key challenge in Text-to-SQL is ensuring LLMs interpret ambiguous schemas correctly. Victor highlighted that table and column names like “C1” or “Table1” can confuse models. To address this, Oracle Database supports annotations—comments on tables and columns that provide business context. For example, annotating a column as “process status” with possible values clarifies its purpose, aiding the LLM in generating accurate joins and filters. These annotations, which don’t affect production queries, are created collaboratively by DBAs and business stakeholders. Victor shared a real-world example from Oracle’s telecom applications, where annotated schemas improved query precision, enabling complex queries without manual intervention.
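
Plain table and column comments are enough to illustrate the idea (Oracle 23ai also offers a dedicated ANNOTATIONS clause); the table and status values below are hypothetical:

```java
import org.springframework.jdbc.core.JdbcTemplate;

// Business context attached to the schema itself; Select AI can pass
// these comments to the LLM so that a column like "status" is no
// longer ambiguous when generating joins and filters.
class SchemaDocumenter {
    void annotate(JdbcTemplate jdbc) {
        jdbc.execute("COMMENT ON TABLE orders IS " +
                "'Customer orders, one row per purchase'");
        jdbc.execute("COMMENT ON COLUMN orders.status IS " +
                "'Process status: NEW, PAID, SHIPPED, or CANCELLED'"); // hypothetical values
    }
}
```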

AI Vector Search: Querying Unstructured Data

Victor introduced AI Vector Search, a cutting-edge feature in Oracle Database 23ai, which extends Text-to-SQL to unstructured data. Unlike traditional SQL, which queries structured data, vector search encodes text, images, or audio into high-dimensional vectors representing semantic meaning. These vectors, stored as a new VECTOR data type, enable similarity-based queries. For instance, a job search query for “software engineer positions in New York” can combine structured filters (e.g., location) with vector-based matching of job descriptions and resumes. Victor explained how embedding models, deployed via Oracle’s DBMS_DATA_MINING package, generate these vectors, with metrics like cosine similarity determining relevance. This capability opens new use cases, from document search to personalized recommendations.
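
A sketch of such a hybrid query, assuming a 23ai VECTOR column and a query embedding computed beforehand by an embedding model; the table, columns, and filter are hypothetical:

```java
import java.util.Arrays;
import java.util.List;
import org.springframework.jdbc.core.JdbcTemplate;

// Combines a structured filter with cosine-distance ranking over a
// VECTOR column; the query text has already been embedded elsewhere.
class JobSearch {
    List<String> findSimilar(JdbcTemplate jdbc, float[] queryEmbedding) {
        String vectorLiteral = Arrays.toString(queryEmbedding); // "[0.1, 0.2, ...]"
        return jdbc.queryForList(
                "SELECT title FROM jobs " +                       // hypothetical table
                "WHERE location = 'New York' " +                  // structured filter
                "ORDER BY VECTOR_DISTANCE(description_vec, TO_VECTOR(?), COSINE) " +
                "FETCH FIRST 5 ROWS ONLY",
                String.class, vectorLiteral);
    }
}
```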

[DevoxxBE2023] Making Your @Beans Intelligent: Spring AI Innovations

At DevoxxBE2023, Dr. Mark Pollack delivered an insightful presentation on integrating artificial intelligence into Java applications using Spring AI, a project inspired by advancements in AI frameworks like LangChain and LlamaIndex. Mark, a seasoned Spring developer since 2003 and leader of the Spring Data project, explored how Java developers can harness pre-trained AI models to create intelligent applications that address real-world challenges. His talk introduced the audience to Spring AI’s capabilities, from simple “Hello World” examples to sophisticated use cases like question-and-answer systems over custom documents.

The Genesis of Spring AI

Mark began by sharing his journey into AI, sparked by the transformative impact of ChatGPT. Unlike traditional AI development, which often required extensive data cleaning and model training, pre-trained models like those from OpenAI offer accessible APIs and vast knowledge bases, enabling developers to focus on application engineering rather than data science. Mark highlighted how Spring AI emerged from his exploration of code generation, leveraging the structured nature of code within these models to create a framework tailored for Java developers. This framework abstracts the complexity of AI model interactions, making it easier to integrate AI into Spring-based applications.

Spring AI draws inspiration from Python’s AI ecosystem but adapts these concepts to Java’s idioms, emphasizing component abstractions and pluggability. Mark emphasized that this is not a direct port but a reimagination, aligning with the Spring ecosystem’s strengths in enterprise integration and batch processing. This approach positions Spring AI as a bridge between Java’s robust software engineering practices and the dynamic world of AI.

Core Components of AI Applications

A significant portion of Mark’s presentation focused on the architecture of AI applications, which extends beyond merely calling a model. He introduced a conceptual framework involving contextual data, AI frameworks, and models. Contextual data, akin to ETL (Extract, Transform, Load) processes, involves parsing and transforming data—such as PDFs—into embeddings stored in vector databases. These embeddings enable efficient similarity searches, crucial for use cases like question-and-answer systems.
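
Spring AI's ETL-style classes make that pipeline concrete (the API has evolved considerably since this 2023 talk, so the sketch uses current class names); the PDF path is hypothetical:

```java
import java.util.List;
import org.springframework.ai.document.Document;
import org.springframework.ai.reader.pdf.PagePdfDocumentReader;
import org.springframework.ai.transformer.splitter.TokenTextSplitter;
import org.springframework.ai.vectorstore.VectorStore;

// Extract: read the PDF page by page. Transform: split into chunks
// small enough to embed. Load: the vector store embeds each chunk via
// the configured embedding model and persists the vectors.
class DocumentIngestion {
    void ingest(VectorStore vectorStore) {
        List<Document> pages =
                new PagePdfDocumentReader("classpath:/docs/handbook.pdf").get(); // hypothetical file
        List<Document> chunks = new TokenTextSplitter().apply(pages);
        vectorStore.add(chunks);
    }
}
```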

Mark demonstrated a simple AI client in Spring AI, which abstracts interactions with various AI models, including OpenAI, Hugging Face, Amazon Bedrock, and Google Vertex. This portability allows developers to switch models without significant code changes. He also showcased the Spring CLI, a tool inspired by JavaScript’s Create React App, which simplifies project setup by generating starter code from existing repositories.

Prompt Engineering and Its Importance

Prompt engineering emerged as a critical theme in Mark’s talk. He explained that crafting effective prompts is essential for directing AI models to produce desired outputs, such as JSON-formatted responses or specific styles of answers. Spring AI’s PromptTemplate class facilitates this by allowing developers to create reusable, stateful templates with placeholders for dynamic content. Mark illustrated this with a demo where a prompt template generated a joke about a raccoon, highlighting the importance of roles (system and user) in defining the context and tone of AI responses.
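
A sketch of the raccoon-joke demo with PromptTemplate and a system role (class names per recent Spring AI releases); the wording of both templates is illustrative:

```java
import java.util.List;
import java.util.Map;
import org.springframework.ai.chat.messages.Message;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.chat.prompt.PromptTemplate;
import org.springframework.ai.chat.prompt.SystemPromptTemplate;

class JokePrompts {
    // The system message fixes context and tone; the user template keeps
    // a reusable placeholder for the part that changes per request.
    Prompt jokeAbout(String animal) {
        Message system = new SystemPromptTemplate(
                "You are a comedian who answers in one short sentence.")
                .createMessage();
        Message user = new PromptTemplate("Tell me a joke about {animal}.")
                .createMessage(Map.of("animal", animal));
        return new Prompt(List.of(system, user));
    }
}
```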

He also touched on the concept of “dogfooding,” where AI models are used to refine prompts, creating a feedback loop that enhances their effectiveness. This iterative process, combined with evaluation techniques, ensures that applications deliver accurate and relevant responses, addressing challenges like model hallucinations—where AI generates plausible but incorrect information.

Retrieval Augmented Generation (RAG)

Mark introduced Retrieval Augmented Generation (RAG), a technique to overcome the limitations of AI models’ context windows, which restrict the amount of data they can process. RAG involves pre-processing data into smaller fragments, converting them into embeddings, and storing them in vector databases for similarity searches. This approach allows developers to provide only relevant data to the model, improving efficiency and accuracy.
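
A minimal sketch of that flow: retrieve only the nearest fragments, then place them in the prompt alongside the question (the template wording is illustrative; Document exposed getContent() in earlier Spring AI versions):

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.chat.prompt.PromptTemplate;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.VectorStore;

// Fetch the fragments nearest to the question, then hand only those
// to the model, instead of the whole corpus, to fit the context window.
class RagPipeline {
    Prompt buildPrompt(VectorStore vectorStore, String question) {
        List<Document> similar = vectorStore.similaritySearch(question);
        String context = similar.stream()
                .map(Document::getText) // getContent() in earlier releases
                .collect(Collectors.joining("\n---\n"));
        return new PromptTemplate("""
                Answer the question using only this context:
                {context}
                Question: {question}
                """).create(Map.of("context", context, "question", question));
    }
}
```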

In a demo, Mark showcased RAG with a bicycle shop dataset, where a question about city-commuting bikes retrieved relevant product descriptions from a vector store. This process mirrors traditional search engines but leverages AI to synthesize answers, demonstrating how Spring AI integrates with vector databases like Milvus and PostgreSQL to handle complex queries.

Real-World Applications and Future Directions

Mark highlighted practical applications of Spring AI, such as enabling question-and-answer systems for financial documents, medical records, or government programs like Medicaid. These use cases illustrate AI’s potential to make complex information more accessible, particularly for non-technical users. He also discussed the importance of evaluation in AI development, advocating for automated scoring mechanisms to assess response quality beyond simple test passing.

Looking forward, Mark outlined Spring AI’s roadmap, emphasizing robust core abstractions and support for a growing number of models and vector databases. He encouraged developers to explore the project’s GitHub repository and participate in its evolution, underscoring the rapid pace of AI advancements and the need for community involvement.
