
[NDCMelbourne2025] How to Work with Generative AI in JavaScript – Phil Nash

Phil Nash, a developer relations engineer at DataStax, delivers a comprehensive guide to leveraging generative AI in JavaScript at NDC Melbourne 2025. His talk demystifies the process of building AI-powered applications, emphasizing that JavaScript developers can harness existing skills to create sophisticated solutions without needing deep machine learning expertise. Through practical examples and insights into tools like Gemini and retrieval-augmented generation (RAG), Phil empowers developers to explore this rapidly evolving field.

Understanding Generative AI Fundamentals

Phil begins by addressing the excitement surrounding generative AI, noting its accessibility since the release of the GPT-3.5 API two years ago. He emphasizes that JavaScript developers are well-positioned to engage with AI due to robust tooling and APIs, despite the field’s Python-centric origins. Using Google’s Gemini model as an example, Phil demonstrates how to generate content with minimal code, highlighting the importance of understanding core concepts like token generation and model behavior.
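That minimal-code claim is easy to picture. As a rough sketch rather than Phil's exact demo, using the @google/generative-ai Node.js package with an API key read from the environment and an assumed model name and prompt:

import { GoogleGenerativeAI } from "@google/generative-ai";

// Illustrative setup only: the model name and prompt are assumptions, not the demo's values.
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash" });

const result = await model.generateContent("Explain generative AI to a JavaScript developer in two sentences.");
console.log(result.response.text()); // the generated text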

He explains tokenization, using OpenAI’s byte pair encoding as an example: text is broken into tokens, and the model predicts each next token probabilistically. Parameters like top-k, top-p, and temperature let developers control output randomness, with Phil cautioning that overly high settings produce nonsensical results, humorously illustrated by a chaotic AI-generated story about a gnome.
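Those sampling parameters map directly onto the request configuration. A minimal sketch, assuming the same @google/generative-ai SDK (the values are illustrative):

// Higher temperature/top-p widen the sampling distribution; lower values make output more predictable.
const storyteller = genAI.getGenerativeModel({
  model: "gemini-1.5-flash",
  generationConfig: {
    temperature: 1.8, // deliberately high, the kind of setting behind the chaotic gnome story
    topK: 40,         // consider only the 40 most likely next tokens
    topP: 0.95,       // nucleus sampling: keep tokens until 95% of the probability mass is covered
  },
});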

Enhancing AI with Prompt Engineering

Prompt engineering emerges as a critical skill for refining AI outputs. Phil contrasts zero-shot prompting, which offers minimal context, with techniques like providing examples or system prompts to guide model behavior. For instance, a system prompt defining a “capital city assistant” ensures concise, accurate responses. He also explores chain-of-thought prompting, where instructing the model to think step-by-step improves its ability to solve complex problems, such as a modified river-crossing riddle.
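As a sketch of the system-prompt idea (the wording is an assumption, not Phil's exact prompt), the SDK accepts a system instruction alongside the model name:

const capitalAssistant = genAI.getGenerativeModel({
  model: "gemini-1.5-flash",
  // The system prompt constrains behaviour before any user message arrives.
  systemInstruction:
    "You are a capital city assistant. Reply only with the capital city of the country the user names.",
});

const reply = await capitalAssistant.generateContent("Australia");
console.log(reply.response.text()); // expected: "Canberra"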

Phil underscores the need for evaluation to ensure prompt reliability, as slight changes can significantly alter outcomes. This structured approach transforms prompt engineering from guesswork into a disciplined practice, enabling developers to tailor AI responses effectively.

Retrieval-Augmented Generation for Contextual Awareness

To address AI models’ limitations, such as outdated or private data, Phil introduces retrieval-augmented generation (RAG). RAG enhances models by integrating external data, like conference talk descriptions, into prompts. He explains how vector embeddings—multidimensional representations of text—enable semantic searches, using cosine similarity to find relevant content. With DataStax’s Astra DB, developers can store and query vectorized data efficiently, as demonstrated in a demo where Phil’s bot retrieves details about NDC Melbourne talks.
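Astra DB performs the comparison itself, but the metric is worth seeing once. A plain JavaScript sketch of cosine similarity between two embedding vectors:

// Returns a value near 1 for semantically similar texts and near 0 for unrelated ones.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}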

This approach allows AI to provide contextually relevant answers, such as identifying AI-related talks or conference events, making it a powerful tool for building intelligent applications.

Streaming Responses and Building Agents

Phil highlights the importance of user experience, noting that AI responses can be slow. Streaming, supported by APIs like Gemini’s generateContentStream, delivers tokens incrementally, improving perceived performance. He demonstrates streaming results to a webpage using JavaScript’s fetch and text decoder streams, showcasing how to create responsive front-end experiences.
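A condensed sketch of both halves of that flow, with assumed endpoint, variable, and element names (illustrative plumbing, not the demo's code):

// Server side: forward tokens to the HTTP response as the model produces them.
const result = await model.generateContentStream("Summarise the AI talks at NDC Melbourne.");
for await (const chunk of result.stream) {
  res.write(chunk.text()); // res is the Node HTTP/Express response object
}
res.end();

// Browser side: read the streamed body incrementally and append it to the page.
const response = await fetch("/api/chat", { method: "POST", body: JSON.stringify({ question }) });
const reader = response.body.pipeThrough(new TextDecoderStream()).getReader();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  outputElement.textContent += value;
}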

The talk culminates with AI agents, which Phil describes as systems that perceive, reason, plan, and act using tools. By defining functions in JSON schema, developers can enable models to perform tasks like arithmetic or fetching web content. A demo bot uses tools to troubleshoot a keyboard issue and query GitHub, illustrating agents’ potential to solve complex problems dynamically.
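The tool definitions themselves are plain JSON schema handed to the model. A hedged sketch of a single declaration (an arithmetic tool is used here for brevity; the demo's actual tools handled keyboard troubleshooting and GitHub queries):

const tools = [{
  functionDeclarations: [{
    name: "add_numbers",
    description: "Add two numbers and return the sum.",
    parameters: {
      type: "object",
      properties: {
        a: { type: "number", description: "First operand" },
        b: { type: "number", description: "Second operand" },
      },
      required: ["a", "b"],
    },
  }],
}];

const agent = genAI.getGenerativeModel({ model: "gemini-1.5-flash", tools });
// When the model chooses to use the tool, its response contains a functionCall part;
// the application runs the matching function and sends the result back for a final answer.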

Conclusion: Empowering JavaScript Developers

Phil concludes by encouraging developers to experiment with generative AI, leveraging tools like Langflow for visual prototyping and exploring browser-based models like Gemini Nano. His talk is a call to action, urging JavaScript developers to build innovative applications by combining AI capabilities with their existing expertise. By mastering prompt engineering, RAG, streaming, and agents, developers can create powerful, user-centric solutions.

Links:

[DevoxxUK2025] Concerto for Java and AI: Building Production-Ready LLM Applications

At DevoxxUK2025, Thomas Vitale, a software engineer at Systematic, delivered an inspiring session on integrating generative AI into Java applications to enhance his music composition process. Combining his passion for music and software engineering, Thomas showcased a “composer assistant” application built with Spring AI, addressing real-world use cases like text classification, semantic search, and structured data extraction. Through live coding and a musical performance, he demonstrated how Java developers can leverage large language models (LLMs) for production-ready applications, emphasizing security, observability, and developer experience. His talk culminated in a live composition for an audience-chosen action movie scene, blending AI-driven suggestions with human creativity.

The Why Factor for AI Integration

Thomas introduced his “Why Factor” for evaluating hyped technologies like generative AI. First, identify the problem: for his composer assistant, he needed to organize and access musical data efficiently. Second, assess production readiness: LLMs must be secure and reliable for real-world use. Third, prioritize developer experience: tools like Spring AI simplify integration without disrupting workflows. By focusing on these principles, Thomas avoided blindly adopting AI, ensuring it solved specific issues, such as automating data classification to free up time for creative tasks like composing music.

Enhancing Applications with Spring AI

Using a Spring Boot application with a Thymeleaf frontend, Thomas integrated Spring AI to connect to LLMs like those from Ollama (local) and Mistral AI (cloud). He demonstrated text classification by creating a POST endpoint to categorize musical data (e.g., “Irish tin whistle” as an instrument) using a chat client API. To mitigate risks like prompt injection attacks, he employed Java enumerations to enforce structured outputs, converting free text into JSON-parsed Java objects. This approach ensured security and usability, allowing developers to swap models without code changes, enhancing flexibility for production environments.
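The demo used Spring AI's chat client with a Java enum to force the model into a fixed set of categories. Purely as a conceptual JavaScript sketch of that guard, not the talk's implementation, and with callModel standing in as a hypothetical helper for any chat completion call:

// Conceptual sketch only - the talk implemented this with Spring AI and a Java enum.
const CATEGORIES = ["INSTRUMENT", "GENRE", "MOOD", "OTHER"];

const prompt = `Classify the musical term into exactly one of: ${CATEGORIES.join(", ")}.
Respond with JSON like {"category": "..."} and nothing else.
Term: Irish tin whistle`;

const { category } = JSON.parse(await callModel(prompt)); // callModel is a placeholder
if (!CATEGORIES.includes(category)) {
  throw new Error("Unexpected category"); // free-form or injected text never reaches the database
}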

Semantic Search and Retrieval-Augmented Generation

Thomas addressed the challenge of searching musical data by meaning, not just keywords, using semantic search. By leveraging embedding models in Spring AI, he converted text (e.g., “melancholic”) into numerical vectors stored in a PostgreSQL database, enabling searches for related terms like “sad.” He extended this with retrieval-augmented generation (RAG), where a chat client advisor retrieves relevant data before querying the LLM. For instance, asking, “What instruments for a melancholic scene?” returned suggestions like cello, based on his dataset, improving search accuracy and user experience.

Structured Data Extraction and Human Oversight

To streamline data entry, Thomas implemented structured data extraction, converting unstructured director notes (e.g., from audio recordings) into JSON objects for database storage. Spring AI facilitated this by defining a JSON schema for the LLM to follow, ensuring structured outputs. Recognizing LLMs’ potential for errors, he emphasized keeping humans in the loop, requiring users to review extracted data before saving. This approach, applied to his composer assistant, reduced manual effort while maintaining accuracy, applicable to scenarios like customer support ticket processing.

Tools and MCP for Enhanced Functionality

Thomas enhanced his application with tools, enabling LLMs to call internal APIs, such as saving composition notes. Using Spring Data, he annotated methods to make them accessible to the model, allowing automated actions like data storage. He also introduced the Model Context Protocol (MCP), implemented in Quarkus, to integrate with external music software via MIDI signals. This allowed the LLM to play chord progressions (e.g., in A minor) through his piano software, demonstrating how MCP extends AI capabilities across local processes, though he cautioned it’s not yet production-ready.

Observability and Live Composition

To ensure production readiness, Thomas integrated OpenTelemetry for observability, tracking LLM operations like token usage and prompt augmentation. During the session, he invited the audience to choose a movie scene (action won) and used his application to generate a composition plan, suggesting chord progressions (e.g., I-VI-III-VII) and instruments like percussion and strings. He performed the music live, copy-pasting AI-suggested notes into his software, fixing minor bugs, and adding creative touches, showcasing a practical blend of AI automation and human artistry.

Links:

[AWSReInventPartnerSessions2024] Constructing Real-Time Generative AI Systems through Integrated Streaming, Managed Models, and Safety-Centric Language Architectures

Lecturer

Pascal Vuylsteker serves as Senior Director of Innovation at Confluent, where he spearheads advancements in scalable data streaming platforms designed to empower enterprise artificial intelligence initiatives. Mario Rodriguez operates as Senior Partner Solutions Architect at AWS, concentrating on seamless integrations of generative AI services within cloud ecosystems. Gavin Doyle heads the Applied AI team at Anthropic, directing efforts toward developing reliable, interpretable, and ethically aligned large language models.

Abstract

This comprehensive scholarly analysis investigates the foundational principles and practical methodologies for deploying real-time generative AI applications by harmonizing Confluent’s data streaming capabilities with Amazon Bedrock’s fully managed foundation model access and Anthropic’s advanced language models. The discussion centers on establishing robust data governance frameworks, implementing retrieval-augmented generation with continuous contextual updates, and leveraging Flink SQL for instantaneous inference. Through detailed architectural examinations and illustrative configurations, the article elucidates how these components dismantle data silos, ensure up-to-date relevance in AI responses, and facilitate scalable, secure innovation across organizational boundaries.

Establishing Governance-Centric Modern Data Infrastructures

Contemporary enterprise environments increasingly acknowledge the indispensable role of data streaming in fostering operational agility. Empirical insights reveal that seventy-nine percent of information technology executives consider real-time data flows essential for maintaining competitive advantage. Nevertheless, persistent obstacles—ranging from fragmented technical competencies and isolated data repositories to escalating governance complexities and heightened expectations from generative AI adoption—continue to hinder comprehensive exploitation of these potentials.

To counteract such impediments, contemporary data architectures prioritize governance as the pivotal nucleus. This core ensures that information remains secure, compliant with regulatory standards, and readily accessible to authorized stakeholders. Encircling this nucleus are interdependent elements including data warehouses for structured storage, streaming analytics for immediate processing, and generative AI applications that derive actionable intelligence. Such a holistic configuration empowers institutions to eradicate silos, achieve elastic scalability, and satisfy burgeoning demands for instantaneous insights.

Confluent emerges as the vital connective framework within this paradigm, facilitating uninterrupted real-time data synchronization across disparate systems. By bridging ingestion pipelines, data lakes, and batch-oriented workflows, Confluent guarantees that information arrives at designated destinations precisely when required. Absent this foundational layer, the construction of cohesive generative AI solutions becomes substantially more arduous, often resulting in delayed or inconsistent outputs.

Complementing this streaming backbone, Amazon Bedrock delivers a fully managed service granting access to an array of foundation models sourced from leading providers such as AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon itself. Bedrock supports diverse experimentation modalities, enables model customization through fine-tuning or extended pre-training, and permits the orchestration of intelligent agents without necessitating extensive coding expertise. From a security perspective, Bedrock rigorously prohibits the incorporation of customer data into baseline models, maintains isolation for fine-tuned variants, implements encryption protocols, enforces granular access controls aligned with AWS identity management, and adheres to certifications including HIPAA, GDPR, SOC, ISO, and CSA STAR.

The differentiation of generative AI applications hinges predominantly on proprietary datasets. Organizations possessing comparable access to foundation models achieve superiority by capitalizing on unique internal assets. Three principal techniques harness this advantage: retrieval-augmented generation incorporates external knowledge directly into prompt engineering; fine-tuning crafts specialized models tailored to domain-specific corpora; continued pre-training broadens model comprehension using enterprise-scale information repositories.

For instance, an online travel agency might synthesize personalized itineraries by amalgamating live flight availability, client profiles, inventory levels, and historical preferences. AWS furnishes an extensive suite of services accommodating unstructured, structured, streaming, and vectorized data formats, thereby enabling seamless integration across heterogeneous sources while preserving lifecycle security.

Orchestrating Real-Time Contextual Enrichment and Inference Mechanisms

Confluent assumes a critical position by directly interfacing with vector databases, thereby assuring that conversational AI frameworks consistently operate upon the most pertinent and current information. This integration transcends basic data translocation, emphasizing the delivery of contextualized, AI-actionable content.

Central to this orchestration is Flink Inference, a sophisticated capability within Confluent Cloud that facilitates instantaneous machine learning predictions through Flink SQL syntax. This approach dramatically simplifies the embedding of predictive models into operational workflows, yielding immediate analytical outcomes and supporting real-time decision-making grounded in accurate, contemporaneous data.

Configuration commences with establishing connectivity between Flink environments and target models utilizing the Confluent command-line interface. Parameters specify endpoints, authentication credentials, and model identifiers—accommodating various Claude iterations alongside other compatible architectures. Subsequent commands define reusable prompt templates, allowing baseline instructions to persist while dynamic elements vary per invocation. Finally, data insertion invokes the ML_PREDICT function, passing relevant parameters for processing.

Architecturally, the pipeline initiates with document or metadata publication to Kafka topics, forming ingress points for downstream transformation. Where appropriate, documents undergo segmentation into manageable chunks to promote parallel execution and enhance computational efficiency. Embeddings are then generated for each segment leveraging Bedrock or Anthropic services, after which these vector representations—accompanied by original chunks—are indexed within a vector store such as MongoDB Atlas.

To accelerate adoption, dedicated quick-start repositories provide deployable templates encapsulating this workflow. Notably, these templates incorporate structured document summarization via Claude, converting tabular or hierarchical data into narrative abstracts suitable for natural language querying.

Interactive sessions begin through API gateways or direct Kafka clients, enabling bidirectional real-time communication. User queries generate embeddings, which subsequently retrieve semantically aligned documents from the vector repository. Retrieved artifacts, augmented by available streaming context, inform prompt construction to maximize relevance and precision. The resultant engineered prompt undergoes processing by Claude on Anthropic Cloud, producing responses that reflect both historical knowledge and live situational awareness.

Efficiency enhancements include conversational summarization to mitigate token proliferation and refine large language model performance. Empirical observations indicate that Claude-generated query reformulations for vector retrieval substantially outperform direct human phrasing, yielding markedly superior document recall.

CREATE MODEL anthropic_claude WITH (
  'connector' = 'anthropic',
  'endpoint' = 'https://api.anthropic.com/v1/messages',
  'api.key' = 'sk-ant-your-key-here',
  'model' = 'claude-3-opus-20240229'
);

CREATE TABLE refined_queries AS
SELECT ML_PREDICT(
  'anthropic_claude',
  CONCAT('Rephrase for vector search: ', user_query)
) AS optimized_query
FROM raw_interactions;

Flink’s value proposition extends beyond connectivity to encompass cost-effectiveness, automatic scaling for voluminous workloads, and native interoperability with extensive ecosystems. Confluent maintains certified integrations across major AWS offerings, prominent data warehouses including Snowflake and Databricks, and leading vector databases such as MongoDB. Anthropic models remain comprehensively accessible via Bedrock, reflecting strategic collaborations spanning product interfaces to silicon-level optimizations.

Analytical Implications and Strategic Trajectories for Enterprise AI Deployment

The methodological synthesis presented—encompassing streaming orchestration, managed model accessibility, and safety-oriented language processing—fundamentally reconfigures retrieval-augmented generation from static knowledge injection to dynamic reasoning augmentation. This evolution proves indispensable for domains requiring precise interpretation, such as regulatory compliance or legal analysis.

Strategic ramifications are profound. Organizations unlock domain-specific differentiation by leveraging proprietary datasets within real-time contexts, achieving decision-making superiority unattainable through generic models alone. Governance frameworks scale securely, accommodating enterprise-grade requirements without sacrificing velocity.

Persistent challenges, including data provenance assurance and model drift mitigation, necessitate ongoing refinement protocols. Future pathways envision declarative inference paradigms wherein prompts and policies are codified as infrastructure, alongside hybrid architectures merging vector search with continuous streaming for anticipatory intelligence.

Links:

[GoogleIO2024] Under the Hood with Google AI: Exploring Research, Impact, and Future Horizons

Delving into AI’s foundational elements, Jeff Dean, James Manyika, and Koray Kavukcuoglu, moderated by Laurie Segall, discussed Google’s trajectory. Their dialogue traced historical shifts, current breakthroughs, and societal implications, offering profound perspectives on technology’s evolution.

Tracing AI’s Evolution and Key Milestones

Jeff recounted AI’s journey from rule-based systems to machine learning, highlighting neural networks’ resurgence around 2010 due to computational advances. Early applications at Google, like spelling corrections, paved the way for vision, speech, and language tasks. Koray noted hardware investments’ role in enabling generative methods, transforming content creation across fields.

James emphasized AI’s multiplier effect, reshaping sciences like biology and software development. The panel agreed that multimodal, long-context models like Gemini represent culminations of algorithmic and infrastructural progress, allowing generalization to novel challenges.

Addressing Societal Impacts and Ethical Considerations

James stressed AI’s mirror to humanity, prompting grapples with bias, fairness, and values—issues societies must collectively resolve. Koray advocated responsible deployment, integrating safety from inception through techniques like watermarking and red-teaming. Jeff highlighted balancing innovation with safeguards, ensuring models align with human intent while mitigating harms.

Discussions touched on global accessibility, with efforts to support underrepresented languages and equitable benefits. The leaders underscored collaborative approaches, involving diverse stakeholders to navigate complexities.

Envisioning AI’s Future Applications and Challenges

Koray envisioned AI accelerating healthcare, solving diseases efficiently worldwide. Jeff foresaw enhancements across human endeavors, from education to scientific discovery, if pursued thoughtfully. James hoped AI fosters better humanity, aiding complex problem-solving.

Challenges include advancing agentic systems for multi-step reasoning, improving evaluation beyond benchmarks, and ensuring inclusivity. The panel expressed optimism, viewing AI as an amplifier for positive change when guided responsibly.

Links:

[AWSReInventPartnerSessions2024] Architecting Real-Time Generative AI Applications: A Confluent-AWS-Anthropic Integration Framework

Lecturer

Pascal Vuylsteker serves as Senior Director of Innovation at Confluent, where he pioneers scalable data streaming architectures that underpin enterprise artificial intelligence systems. Mario Rodriguez functions as Senior Partner Solutions Architect at AWS, specializing in generative AI service orchestration across cloud environments. Gavin Doyle leads the Applied AI team at Anthropic, directing development of safe, steerable, and interpretable large language models.

Abstract

This scholarly examination delineates a comprehensive methodology for constructing real-time generative AI applications through the synergistic integration of Confluent’s streaming platform, Amazon Bedrock’s managed foundation model ecosystem, and Anthropic’s Claude models. The analysis elucidates data governance centrality, retrieval-augmented generation (RAG) with continuous contextual synchronization, Flink-mediated inference execution, and vector database orchestration. Through architectural decomposition and configuration exemplars, it demonstrates how these components eliminate data silos, ensure temporal relevance in AI outputs, and enable secure, scalable enterprise innovation.

Governance-Centric Modern Data Architecture

Enterprise competitiveness increasingly hinges upon real-time data streaming capabilities, with seventy-nine percent of IT leaders affirming its strategic necessity. However, persistent barriers—siloed repositories, skill asymmetries, governance complexity, and generative AI’s voracious data requirements—impede realization.

Contemporary data architectures position governance as the foundational core, ensuring security, compliance, and accessibility. Radiating outward are data warehouses, streaming analytics engines, and generative AI applications. This configuration systematically dismantles silos while satisfying instantaneous insight demands.

Confluent operationalizes this vision by providing real-time data integration across ingestion pipelines, data lakes, and batch processing systems. It delivers precisely contextualized information at the moment of need—prerequisite for effective generative AI deployment.

Amazon Bedrock complements this through managed access to foundation models from Anthropic, AI21 Labs, Cohere, Meta, Mistral AI, Stability AI, and Amazon. The service supports experimentation, fine-tuning, continued pre-training, and agent orchestration. Security architecture prohibits customer data incorporation into base models, maintains isolation for customized variants, implements encryption, enforces granular access controls, and complies with HIPAA, GDPR, SOC, ISO, and CSA STAR.

Proprietary data constitutes the primary differentiation vector. Three techniques leverage this advantage: RAG injects external knowledge into prompts; fine-tuning specializes models on domain corpora; continued pre-training expands comprehension using enterprise datasets.

# Bedrock model customization (conceptual)
modelCustomization:
  baseModel: anthropic.claude-3-sonnet
  trainingData: s3://enterprise-corpus/
  fineTuning:
    epochs: 3
    learningRate: 0.0001

Real-Time Contextual Injection and Flink Inference Orchestration

Confluent integrates directly with vector databases, ensuring conversational systems operate upon current, relevant information. This transcends mere data transport to deliver AI-actionable context.

Flink Inference enables real-time machine learning via Flink SQL, dramatically simplifying model integration into operational workflows. Configuration defines endpoints, authentication, prompts, and invocation patterns.

The architectural pipeline commences with document publication to Kafka topics. Documents undergo chunking for parallel processing, embedding generation via Bedrock/Anthropic, and indexing into MongoDB Atlas with original chunks. Quick-start templates deploy this workflow, incorporating structured data summarization through Claude for natural language querying.

Chatbot interactions initiate via API/Kafka, generate embeddings, retrieve documents, construct prompts with streaming context, and invoke Claude. Token optimization employs conversation summarization; enhanced vector queries via Claude-generated reformulations yield superior retrieval.

-- Flink model definition
CREATE MODEL claude_haiku WITH (
  'connector' = 'anthropic',
  'endpoint' = 'https://api.anthropic.com/v1/messages',
  'api.key' = 'sk-ant-...',
  'model' = 'claude-3-haiku-20240307'
);

-- Real-time inference
INSERT INTO responses
SELECT ML_PREDICT('claude_haiku', enriched_prompt) FROM interactions;

Flink provides cost-effective scaling, automatic elasticity, and native integration with AWS services, Snowflake, Databricks, and MongoDB. Anthropic models remain fully accessible via Bedrock.

Strategic Implications for Enterprise AI

The methodology transforms RAG from static knowledge injection to dynamic reasoning augmentation. Contextual retrieval and in-context learning mitigate hallucinations while enabling domain-specific differentiation.

Organizations achieve decision-making superiority through proprietary data in real-time contexts. Governance scales securely; challenges like data drift necessitate continuous refinement.

Future trajectories include declarative inference and hybrid vector-stream architectures for anticipatory intelligence.

Links:

[DevoxxUK2024] Breaking AI: Live Coding and Hacking Applications with Generative AI by Simon Maple and Brian Vermeer

Simon Maple and Brian Vermeer, both seasoned developer advocates with extensive experience at Snyk and other tech firms, delivered an electrifying live coding session at DevoxxUK2024, exploring the double-edged sword of generative AI in software development. Simon, who recently transitioned to a stealth-mode startup, and Brian, a current Snyk advocate, demonstrate how tools like GitHub Copilot and ChatGPT can accelerate coding velocity while introducing significant security risks. Through a live-coded Spring Boot coffee shop application, they expose vulnerabilities such as SQL injection, directory traversal, and cross-site scripting, emphasizing the need for rigorous validation and security practices. Their engaging, demo-driven approach underscores the balance between innovation and caution, offering developers actionable insights for leveraging AI safely.

Accelerating Development with Generative AI

Simon and Brian kick off by highlighting the productivity boost offered by generative AI tools, citing studies that suggest a 55% increase in developer efficiency and a 27% higher likelihood of meeting project goals. They build a Spring Boot application with a Thymeleaf front end, using Copilot to generate a homepage with a banner and product table. The process showcases AI’s ability to rapidly produce code snippets, such as HTML fragments, based on minimal prompts. However, they caution that this speed comes with risks, as AI often prioritizes completion over correctness, potentially embedding vulnerabilities. Their live demo illustrates how Copilot’s suggestions evolve with context, but also how developers must critically evaluate outputs to ensure functionality and security.

Exposing SQL Injection Vulnerabilities

The duo dives into a search functionality for their coffee shop application, where Copilot generates a query to filter products by name or description. However, the initial code concatenates user input directly into an SQL query, creating a classic SQL injection vulnerability. Brian demonstrates an exploit by injecting malicious input to set product prices to zero, highlighting how unchecked AI-generated code can compromise a system. They then refactor the code using prepared statements, showing how parameterization separates user input from the query execution plan, effectively neutralizing the vulnerability. This example underscores the importance of understanding AI outputs and applying secure coding practices, as tools like Copilot may not inherently prioritize security.
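The refactoring shown was Java and JDBC; the same parameterization pattern, sketched in Node.js with the pg driver (table and column names are hypothetical):

import pg from "pg";

const client = new pg.Client(); // connection settings come from the standard PG* environment variables
await client.connect();

// Safe search: the user-supplied term travels as a bound parameter, never as part of the SQL text,
// so input like "'; UPDATE products SET price = 0; --" cannot alter the statement.
async function searchProducts(term) {
  const { rows } = await client.query(
    "SELECT * FROM products WHERE name ILIKE $1 OR description ILIKE $1",
    [`%${term}%`]
  );
  return rows;
}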

Mitigating Directory Traversal Risks

Next, Simon and Brian tackle a profile picture upload feature, where Copilot generates code to save files to a directory. The initial implementation concatenates user-provided file names with a base path, opening the door to directory traversal attacks. Using Burp Suite, they demonstrate how an attacker could overwrite critical files by manipulating the file name with “../” sequences. To address this, they refine the code to normalize paths, ensuring files remain within the intended directory. The session highlights the limitations of AI in detecting complex vulnerabilities like path traversal, emphasizing the need for developer vigilance and tools like Snyk to catch issues early in the development cycle.
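Again the demo was in Java, but the normalization check translates directly. A Node.js sketch with an assumed upload directory:

import path from "node:path";

const uploadDir = path.resolve("uploads");

function safeUploadPath(fileName) {
  // Resolve the requested name against the upload directory, then confirm the result
  // is still inside it - "../" sequences would escape the directory and fail this check.
  const target = path.resolve(uploadDir, fileName);
  if (!target.startsWith(uploadDir + path.sep)) {
    throw new Error("Invalid file name");
  }
  return target;
}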

Addressing Cross-Site Scripting Threats

The final vulnerability explored is cross-site scripting (XSS) in a product page feature. The AI-generated code directly embeds user input (product names) into HTML without sanitization, allowing Brian to inject a malicious script that captures session cookies. They demonstrate both reflective and stored XSS, showing how attackers could exploit these to hijack user sessions. While querying ChatGPT for a code review fails to pinpoint the XSS issue, Simon and Brian advocate for using established libraries like Spring Utils for input sanitization. This segment reinforces the necessity of combining AI tools with robust security practices and automated scanning to mitigate risks that AI might overlook.
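The talk's remedy was an established server-side escaping utility; the same discipline in plain JavaScript is to neutralize markup characters before user text reaches the page. A minimal sketch:

// Replace the characters HTML treats as markup so an injected <script> tag renders as inert text.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

cell.innerHTML = escapeHtml(productName); // or assign to textContent, which never parses HTML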

Balancing Innovation and Security

Throughout the session, Simon and Brian stress that generative AI, while transformative, demands a cautious approach. They liken AI tools to junior developers, capable of producing functional code but requiring oversight to avoid errors or vulnerabilities. Real-world examples, such as a Samsung employee leaking sensitive code via ChatGPT, underscore the risks of blindly trusting AI outputs. They advocate for education, clear guidelines, and security tooling to complement AI-assisted development. By integrating tools like Snyk for vulnerability scanning and fostering a culture of code review, developers can harness AI’s potential while safeguarding their applications against threats.

Links:

[SpringIO2024] Text-to-SQL: Chat with a Database Using Generative AI by Victor Martin & Corrado De Bari @ Spring I/O 2024

At Spring I/O 2024 in Barcelona, Victor Martin, a product manager for Oracle Database, delivered a compelling session on Text-to-SQL, a transformative approach to querying databases using natural language, powered by generative AI. Stepping in for his colleague Corrado De Bari, who was unable to attend, Victor explored how Large Language Models (LLMs), combined with Spring AI and Oracle tools such as Select AI, enable business users with no SQL expertise to interact seamlessly with databases. The talk highlighted practical implementations, security considerations, and emerging technologies like AI Vector Search, offering a glimpse into the future of database interaction.

The Promise of Text-to-SQL

Text-to-SQL leverages LLMs to translate natural language queries into executable SQL, democratizing data access for non-technical users. Victor began by posing a challenge: how long would it take to build a REST endpoint for a business user to query a database using plain text? Traditionally, this task required manual SQL construction, schema validation, and error handling. With modern frameworks like Spring Boot and Oracle’s Select AI, this process is streamlined. Select AI, integrated into Oracle Database 19c and enhanced in 23 AI, supports features like RUN_SQL to execute generated queries, NARRATE to return results as human-readable text, and EXPLAIN_SQL to detail query reasoning. Victor emphasized that these tools reduce development time, enabling rapid deployment of user-friendly database interfaces.

Configuring Oracle Database for Text-to-SQL

Implementing Text-to-SQL requires minimal configuration within Oracle Database. Victor outlined the steps: first, set up an Access Control List (ACL) to allow external LLM calls, specifying the host and port. Next, create credentials for the LLM service (e.g., Oracle Cloud Infrastructure Generative AI, Open AI, or Azure Open AI) using the DBMS_CLOUD_AI package. Finally, define a profile linking the schema, tables, and chosen LLM. This profile is selected per session to ensure queries use the correct context. Victor demonstrated this with a Spring Boot application, where the profile is set before invoking Select AI. The simplicity of this setup, combined with Spring AI’s abstraction, makes it accessible even for developers new to AI-driven database interactions.

Enhancing Queries with Schema Annotations

A key challenge in Text-to-SQL is ensuring LLMs interpret ambiguous schemas correctly. Victor highlighted that table and column names like “C1” or “Table1” can confuse models. To address this, Oracle Database supports annotations—comments on tables and columns that provide business context. For example, annotating a column as “process status” with possible values clarifies its purpose, aiding the LLM in generating accurate joins and filters. These annotations, which don’t affect production queries, are created collaboratively by DBAs and business stakeholders. Victor shared a real-world example from Oracle’s telecom applications, where annotated schemas improved query precision, enabling complex queries without manual intervention.

AI Vector Search: Querying Unstructured Data

Victor introduced AI Vector Search, a cutting-edge feature in Oracle Database 23 AI, which extends Text-to-SQL to unstructured data. Unlike traditional SQL, which queries structured data, vector search encodes text, images, or audio into high-dimensional vectors representing semantic meaning. These vectors, stored as a new VECTOR data type, enable similarity-based queries. For instance, a job search query for “software engineer positions in New York” can combine structured filters (e.g., location) with vector-based matching of job descriptions and resumes. Victor explained how embedding models, deployed via Oracle’s DBMS_DATA_MINING package, generate these vectors, with metrics like cosine similarity determining relevance. This capability opens new use cases, from document search to personalized recommendations.

Links:

[DotAI2024] DotAI 2024: Ines Montani – Crafting Resilient NLP Systems in the Generative Era

Ines Montani, co-founder and CEO of Explosion AI, illuminated the pitfalls and potentials of natural language processing pipelines at DotAI 2024. As a core contributor to spaCy—an open-source NLP powerhouse—and Prodigy, a data annotation suite, Montani champions modular tools that blend human intuition with computational might. Her address critiqued the “prompts suffice” ethos, advocating hybrid architectures that fuse rules, examples, and generative flair for robust, production-viable solutions.

Harmonizing Paradigms for Enduring Intelligence

Montani traced instruction evolution: from rigid rules yielding brittle systems to supervised learning’s nuanced exemplars, now augmented by in-context prompts’ linguistic alchemy. Rules shine in clarity for novices, yet crumble under data flux; examples infuse domain savvy but demand curation toil; prompts democratize prototyping, yet hallucinate sans anchors.

The synergy? Layered pipelines where rules scaffold prompts, examples calibrate outputs, and LLMs infuse creativity. Montani showcased spaCy’s evolution: rule-based tokenizers ensure consistency, while generative components handle ambiguity, like entity resolution in noisy texts. This modularity mitigates drift, preserving fidelity across model swaps.

In industrial extraction—parsing resumes or contracts—Montani stressed data’s primacy: raw inputs reveal logic gaps, prompting refactorings that unearth “window-knocking machines”—flawed proxies mistaking correlation for causation. A chatbot querying calendars, she analogized, falters if oblivious to time zones; true utility demands holistic orchestration.

Fostering Modularity Amid Generative Hype

Montani cautioned against abstraction overload: leaky layers spawn brittle facades, where one-liners unravel on edge cases. Instead, embrace transparency—Prodigy’s active learning loops refine datasets iteratively, blending human oversight with AI proposals to curb over-reliance.

Retrieval-augmented generation (RAG) exemplifies balanced integration: LLMs query structured stores, yielding chat interfaces atop databases, supplanting clunky GUIs. Yet, Montani warned, context dictates efficacy; for analytical dives, raw views trump conversational veils.

Her ethos: interrogate intent—who wields the tool, what risks lurk? Surprise greets data dives, unveiling bespoke logics that generative magic alone can’t conjure. Efficiency, privacy, and modularity—spaCy’s hallmarks—thwart big-tech monoliths, empowering bespoke ingenuity.

In sum, Montani’s blueprint rejects compromise: generative AI amplifies, not supplants, principled engineering, birthing interfaces that endure and elevate.

Links:

[PHPForumParis2023] Experience Report: Building Two Open-Source Personal AIs with OpenAI – Maxime Thoonsen

Maxime Thoonsen, CTO at Theodo, shared an exhilarating session at Forum PHP 2023, detailing his experience building two open-source personal AI applications using OpenAI’s technologies. As an organizer of the Generative AI Paris Meetup, Maxime’s passion for the PHP community and innovative AI solutions shone through. His step-by-step approach demystified AI development, encouraging PHP developers to explore generative AI by demonstrating its simplicity and potential through practical examples.

Understanding Generative AI’s Potential

Maxime began by introducing the capabilities of generative AI, emphasizing its accessibility for PHP developers. He explained how OpenAI’s APIs enable the creation of applications that process and generate human-like text. Drawing from his work at Theodo, Maxime showcased two personal AI projects, illustrating how they leverage semantic search and embeddings to deliver tailored responses. His enthusiasm for the community, where he began his speaking career, underscored the collaborative spirit driving AI innovation.

Practical AI Development with OpenAI

Delving into the technical details, Maxime walked the audience through building AI applications using OpenAI’s APIs. He highlighted the simplicity of implementing semantic search to retrieve relevant data from documents, advising against premature fine-tuning in favor of straightforward similarity searches. Responding to an audience question, Maxime noted the availability of open-source alternatives like Llama and Mistral, though he acknowledged OpenAI’s GPT-4 as a leader in embedding accuracy. His examples empowered developers to start building AI-driven features in their PHP projects.

Navigating the AI Ecosystem

Maxime concluded by addressing the rapidly evolving AI landscape, likening it to the proliferation of JavaScript frameworks. He emphasized the cost-effectiveness of smaller open-source models for specific use cases, while noting OpenAI’s edge in precision. His talk inspired developers to join communities like the Generative AI Paris Meetup to explore AI further, fostering a sense of curiosity and experimentation within the PHP ecosystem.

Links:

[DevoxxBE2023] Build a Generative AI App in Project IDX and Firebase by Prakhar Srivastav

At Devoxx Belgium 2023, Prakhar Srivastav, a software engineer at Google, unveiled the power of Project IDX and Firebase in crafting a generative AI mobile application. His session illuminated how developers can harness these tools to streamline full-stack, multiplatform app development directly from the browser, eliminating cumbersome local setups. Through a live demonstration, Prakhar showcased the creation of “Listed,” a Flutter-based app that leverages Google’s PaLM API to break down user-defined goals into actionable subtasks, offering a practical tool for task management. His engaging presentation, enriched with real-time coding, highlighted the synergy of cloud-based development environments and AI-driven solutions.

Introducing Project IDX: A Cloud-Based Development Revolution

Prakhar introduced Project IDX as a transformative cloud-based development environment designed to simplify the creation of multiplatform applications. Unlike traditional setups requiring hefty binaries like Xcode or Android Studio, Project IDX enables developers to work entirely in the browser. Prakhar demonstrated this by running Android and iOS emulators side-by-side within the browser, showcasing a Flutter app that compiles to multiple platforms—Android, iOS, web, Linux, and macOS—from a single codebase. This eliminates the need for platform-specific configurations, making development accessible even on lightweight devices like Chromebooks.

The live demo featured “Listed,” a mobile app where users input a goal, such as preparing for a tech talk, and receive AI-generated subtasks and tips. For instance, entering “give a tech talk at a conference” yielded steps like choosing a relevant topic and practicing the presentation, with a tip to have a backup plan for technical issues. Prakhar’s real-time tweak—changing the app’s color scheme from green to red—illustrated the iterative development flow, where changes are instantly reflected in the emulator, enhancing productivity and experimentation.

Harnessing the PaLM API for Generative AI

Central to the app’s functionality is Google’s PaLM API, which Prakhar utilized to integrate generative AI capabilities. He explained that large language models (LLMs), like those powering the PaLM API, act as sophisticated autocomplete systems, predicting likely text outputs based on extensive training data. For “Listed,” the text API was chosen for its suitability in single-turn interactions, such as generating subtasks from a user’s query. Prakhar emphasized the importance of crafting effective prompts, comparing a vague prompt like “the sky is” to a precise one like “complete the sentence: the sky is,” which yields more relevant results.

To enhance the AI’s output, Prakhar employed few-shot prompting, providing the model with examples of desired responses. For instance, for the query “go camping,” the prompt included sample subtasks like choosing a campsite and packing meals, along with a tip about wildlife safety. This structured approach ensured the model generated contextually accurate and actionable suggestions, making the app intuitive for users tackling complex tasks.
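Because a few-shot prompt is just structured text, the technique is easy to sketch outside Dart as well. A JavaScript illustration with made-up wording (not the app's actual prompt):

// Few-shot prompt: fixed instructions, one or more worked examples, then the user's goal.
const examples = [
  {
    goal: "go camping",
    subtasks: ["Choose a campsite", "Book the pitch", "Plan and pack meals"],
    tip: "Store food securely so it does not attract wildlife.",
  },
];

function buildPrompt(userGoal) {
  const shots = examples
    .map((e) => `Goal: ${e.goal}\nSubtasks: ${e.subtasks.join("; ")}\nTip: ${e.tip}`)
    .join("\n\n");
  return `Break the goal into subtasks and give one practical tip.\n\n${shots}\n\nGoal: ${userGoal}\nSubtasks:`;
}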

Securing AI Integration with Firebase Extensions

Integrating the PaLM API into a mobile app poses security challenges, particularly around API key exposure. Prakhar addressed this by leveraging Firebase Extensions, which provide pre-packaged solutions to streamline backend integration. Specifically, he used a Firebase Extension to securely call the PaLM API via Cloud Functions, avoiding the need to embed sensitive API keys in the client-side Flutter app. This setup not only enhances security but also simplifies infrastructure management, as the extension handles logging, monitoring, and optional AppCheck for client verification.

In the live demo, Prakhar navigated the Firebase Extensions Marketplace, selecting the “Call PaLM API Securely” extension. With a few clicks, he deployed Cloud Functions that exposed a POST API for sending prompts and receiving AI-generated responses. The code walkthrough revealed a straightforward implementation in Dart, where the app constructs a JSON payload with the prompt, model name (text-bison-001), and temperature (0.25 for deterministic outputs), ensuring seamless and secure communication with the backend.

Building the Flutter App: Simplicity and Collaboration

The Flutter app’s architecture, built within Project IDX, was designed for simplicity and collaboration. Prakhar walked through the main.dart file, which scaffolds the app’s UI with a material-themed interface, an input field for user queries, and a list to display AI-generated tasks. The app uses anonymous Firebase authentication to secure backend calls without requiring user logins, enhancing accessibility. A PromptBuilder class dynamically constructs prompts by combining predefined prefixes and examples, ensuring flexibility in handling varied user inputs.

Project IDX’s integration with Visual Studio Code’s open-source framework added collaborative features. Prakhar demonstrated how developers can invite colleagues to a shared workspace, enabling real-time collaboration. Additionally, the IDE’s AI capabilities allow users to explain selected code or generate new snippets, streamlining development. For instance, selecting the PromptBuilder class and requesting an explanation provided detailed insights into its parameters, showcasing how Project IDX enhances developer productivity.

Links: