[GoogleIO2025] Google I/O ’25 Keynote
Keynote Speakers
Sundar Pichai serves as the Chief Executive Officer of Alphabet Inc. and Google, overseeing the company’s strategic direction with a focus on artificial intelligence integration across products and services. Born in India, he holds degrees from the Indian Institute of Technology Kharagpur, Stanford University, and the Wharton School, and has been instrumental in advancing Google’s cloud computing and AI initiatives since joining the firm in 2004.
Demis Hassabis acts as the Co-Founder and Chief Executive Officer of Google DeepMind, leading efforts in artificial general intelligence and breakthroughs in areas like protein folding and game-playing AI. A former child chess prodigy with a PhD in cognitive neuroscience from University College London, he has received a knighthood for his contributions to science and technology.
Liz Reid holds the position of Vice President of Search at Google, directing product management and engineering for core search functionalities. She joined Google in 2003 as its first female engineer in the New York office and has spearheaded innovations in local search and AI-enhanced experiences.
Johanna Voolich functions as the Chief Product Officer at YouTube, guiding product strategies for the platform’s global user base. With extensive experience at Google in search, Android, and Workspace, she emphasizes AI-driven enhancements for content creation and consumption.
Dave Burke previously served as Vice President of Engineering for Android at Google, contributing to the platform’s development for over a decade before transitioning to advisory roles in AI and biotechnology.
Donald Glover is an acclaimed American actor, musician, writer, and director, known professionally as Childish Gambino in his music career. Born in 1983, he has garnered multiple Emmy and Grammy awards for his work in television series like Atlanta and music albums exploring diverse themes.
Sameer Samat operates as President of the Android Ecosystem at Google, responsible for the operating system’s user and developer experiences worldwide. Holding a bachelor’s degree in computer science from the University of California San Diego, he has held leadership roles in product management across Google’s mobile and ecosystem divisions.
Abstract
This examination delves into the pivotal announcements from the Google I/O 2025 keynote, centering on breakthroughs in artificial intelligence models, agentic systems, search enhancements, generative media, and extended reality platforms. It dissects the underlying methodologies driving these advancements, their contextual evolution from research prototypes to practical implementations, and the far-reaching implications for technological accessibility, societal problem-solving, and ethical AI deployment. By analyzing demonstrations and strategic integrations, the discourse illuminates how Google’s full-stack approach fosters rapid innovation while addressing real-world challenges.
Evolution of AI Models and Infrastructure
The keynote commences with Sundar Pichai highlighting the accelerated pace of AI development within Google’s ecosystem, emphasizing the transition from foundational research to widespread application. Central to this narrative is the Gemini model family, which has seen substantial enhancements since its inception. Pichai notes the deployment of over a dozen models and features in the past year, underscoring a methodology that prioritizes swift iteration and integration. For instance, the Gemini 2.5 Pro model achieves top rankings on benchmarks like the LMArena leaderboard, reflecting a 300-point increase in Elo score, a metric evaluating model performance across diverse tasks.
This progress is underpinned by Google’s proprietary infrastructure, exemplified by the seventh-generation TPU named Ironwood. Designed for both training and inference at scale, it offers a tenfold performance boost over predecessors, enabling 42.5 exaflops per pod. Such hardware advancements facilitate cost reductions and efficiency gains, allowing models to process outputs at unprecedented speeds—Gemini models dominate the top three positions for tokens per second on leading leaderboards. The implications extend to democratizing AI, as lower prices and higher performance make advanced capabilities accessible to developers and users alike.
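The headline pod figure can be sanity-checked with simple arithmetic; the chip count and per-chip throughput below are publicly reported Ironwood numbers, assumed here for illustration:

```python
# Back-of-envelope check of the 42.5 exaflops-per-pod figure.
# Chip count and per-chip peak throughput are publicly reported
# Ironwood specs, assumed here for illustration.
chips_per_pod = 9_216
tflops_per_chip = 4_614                              # peak TFLOPS per chip

pod_flops = chips_per_pod * tflops_per_chip * 1e12   # total FLOPS per pod
pod_exaflops = pod_flops / 1e18

print(f"{pod_exaflops:.1f} exaflops per pod")        # ≈ 42.5
```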
Demis Hassabis elaborates on the intelligence layer, positioning Gemini 2.5 Pro as the world’s premier foundation model. Updated previews have empowered creators to generate interactive applications from sketches or simulate urban environments, demonstrating multimodal reasoning that spans text, code, and visuals. The incorporation of LearnLM, a specialized educational model, elevates its utility in learning scenarios, topping relevant benchmarks. Meanwhile, the refined Gemini 2.5 Flash serves as an efficient alternative, appealing to developers for its balance of speed and affordability.
Methodologically, these models leverage vast datasets and advanced training techniques, including reinforcement learning from human feedback, to enhance reasoning and contextual understanding. The context of this evolution lies in Google’s commitment to a full-stack AI strategy, integrating hardware, software, and research. Implications include fostering an ecosystem where AI augments human creativity, though challenges like computational resource demands necessitate ongoing optimizations to ensure equitable access.
Agentic Systems and Personalization Strategies
A significant portion of the presentation explores agentic AI, where systems autonomously execute tasks while remaining under user oversight. Pichai introduces Project Starline’s evolution into Google Beam, a 3D video platform that merges multiple camera feeds via AI to create immersive communications. Developed in collaboration with HP, it renders in real time at 60 frames per second, enabling remote interactions that mimic physical presence.
Building on this, Project Astra’s capabilities migrate to Gemini Live, enabling contextual awareness through camera and screen sharing. Demonstrations reveal its application in everyday scenarios, such as interview preparation or fitness training. The introduction of multitasking in Project Mariner allows oversight of up to ten tasks, utilizing “teach and repeat” mechanisms where agents learn from single demonstrations. Available via the Gemini API, this tool invites developer experimentation, with partners like UiPath integrating it for automation.
The agent ecosystem is bolstered by protocols like the open agent-to-agent framework and Model Context Protocol (MCP) compatibility in the Gemini SDK, facilitating inter-agent communication and service access. In practice, agent mode in the Gemini app exemplifies this by sourcing apartment listings, applying filters, and scheduling tours—streamlining complex workflows.
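The plumbing behind such an agent reduces to a dispatch loop: the model emits a structured tool request, the host executes it, and the result is fed back for the next turn. A minimal, model-free sketch of that loop, using the apartment-hunting scenario; all tool names and the message format are hypothetical, not the Gemini API’s:

```python
import json

# Hypothetical tool registry, standing in for services an agent can reach
# (e.g., listing search or tour scheduling in the apartment example).
TOOLS = {
    "search_listings": lambda args: [{"id": 1, "rent": args["max_rent"]}],
    "schedule_tour":   lambda args: {"booked": True, "listing": args["id"]},
}

def run_agent(model_step):
    """Drive a tool-use loop until the model returns a final answer."""
    context = []
    while True:
        msg = json.loads(model_step(context))      # structured model output
        if msg["type"] == "final":
            return msg["answer"]
        result = TOOLS[msg["tool"]](msg["args"])   # execute requested tool
        context.append({"tool": msg["tool"], "result": result})

# Scripted 'model' for demonstration: one tool call, then a final answer.
steps = iter([
    '{"type": "tool", "tool": "search_listings", "args": {"max_rent": 1500}}',
    '{"type": "final", "answer": "Found 1 listing under $1500"}',
])
print(run_agent(lambda ctx: next(steps)))   # Found 1 listing under $1500
```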
Personalization emerges as a complementary frontier, with “personal context” allowing models to draw from user data across Google apps, ensuring privacy through user controls. An example in Gmail illustrates personalized smart replies that emulate individual styles by analyzing past communications and documents. This methodology relies on secure data handling and fine-tuned models, implying deeper user engagement but raising ethical considerations around data consent and bias mitigation.
Overall, these agentic and personalized approaches shift AI from reactive tools to proactive assistants, contextualized within Google’s product suite. The implications are transformative for productivity, yet require robust governance to balance utility with user autonomy.
Innovations in Search and Information Retrieval
Liz Reid advances the discussion on search evolution, framing AI Overviews and AI Mode as pivotal shifts. With over 1.5 billion monthly users, AI Overviews synthesize responses from web content, enhancing query resolution. AI Mode extends this into conversational interfaces, supporting complex, multi-step inquiries like travel planning by integrating reasoning, tool usage, and web interaction.
Methodologically, this involves grounding models in real-time data, ensuring factual accuracy through citations and diverse perspectives. Demonstrations showcase handling ambiguous queries, such as dietary planning, by breaking them into sub-tasks and verifying outputs. The introduction of video understanding allows analysis of uploaded content, providing step-by-step guidance.
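The decomposition step can be pictured as a fan-out: the ambiguous request is split into sub-queries, each answered against a source, and the merged response keeps a citation per claim. A toy sketch, with hypothetical sub-queries and source names:

```python
# Toy sketch of query fan-out with per-claim citations. The canned
# source table and decomposition are hypothetical, not Google's system.
SOURCES = {
    "high-protein vegetarian foods": ("lentils, tofu, greek yogurt", "example-diet-site"),
    "calories in lentils": ("~116 kcal per 100 g cooked", "example-nutrition-site"),
}

def answer(query, sub_queries):
    """Resolve each sub-query, then merge results with their citations."""
    parts = []
    for sq in sub_queries:
        text, source = SOURCES[sq]
        parts.append(f"{text} [{source}]")   # one citation per claim
    return f"{query}: " + "; ".join(parts)

print(answer("plan a high-protein vegetarian day",
             ["high-protein vegetarian foods", "calories in lentils"]))
```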
Contextually, these features address information overload in an era of abundant data, implying improved user satisfaction—evidenced by higher engagement metrics. However, implications include potential disruptions to content ecosystems, necessitating transparency in sourcing to maintain trust.
Generative Media and Creative Tools
Johanna Voolich and Donald Glover spotlight generative media, with the Imagen 4 and Veo 3 models enabling high-fidelity image and video creation. Imagen 4’s stylistic versatility and Veo 3’s narrative consistency allow seamless editing, as Glover illustrates in crafting a short film.
The Flow tool democratizes filmmaking by generating clips from prompts, supporting extensions and refinements. Methodologically, these leverage diffusion-based architectures trained on vast datasets, ensuring coherence across outputs.
Context lies in empowering creators, with implications for industries like entertainment—potentially lowering barriers but raising concerns over authenticity and intellectual property. Subscription plans like Google AI Pro and Ultra provide access, fostering experimentation.
Android XR Platform and Ecosystem Expansion
Sameer Samat introduces Android XR, optimized for headsets and glasses, integrating Gemini for contextual assistance. Project Moohan, built with Samsung, offers immersive experiences, while glasses prototypes enable hands-free interactions like navigation and translation.
Partnerships with Gentle Monster and Warby Parker emphasize style, with developer previews forthcoming. Methodologically, this builds on Android’s ecosystem, ensuring app compatibility.
Implications include redefining human-computer interaction, enhancing accessibility, but demanding advancements in battery life and privacy.
Societal Impacts and Prospective Horizons
The keynote culminates in applications like FireSat for wildfire detection and drone-based relief delivery during disasters, showcasing AI’s role in addressing societal challenges. Pichai envisions near-term realizations in robotics, medicine, quantum computing, and autonomous vehicles.
This forward-looking context underscores ethical deployment, with implications for global equity. Personal anecdotes reinforce technology’s inspirational potential, urging collaborative progress.
[AWSReInforce2025] From possibility to production: A strong, flexible foundation for AI security
Lecturer
The session features AWS security specialists who architect the AI security substrate, combining expertise in machine learning operations, formal methods, and cloud-native controls. Their work spans Bedrock Guardrails, SageMaker security boundaries, and agentic workflow protection.
Abstract
The presentation constructs a comprehensive AI security framework that accelerates development while maintaining enterprise-grade controls. Through layered defenses—data provenance, model isolation, runtime guardrails, and agentic supervision—it demonstrates how AWS transforms AI security from a deployment blocker into an innovation catalyst, with real-world deployments illustrating production readiness.
AI Security Risk Taxonomy and Defense Layering
AI systems introduce novel threat vectors: training data poisoning, prompt injection, model inversion, and agentic escape. AWS categorizes these across the ML lifecycle:
- Data Layer: Provenance tracking, differential privacy, synthetic data generation
- Model Layer: Isolation via confidential computing, integrity verification
- Inference Layer: Input/output filtering, rate limiting, behavioral monitoring
- Agentic Layer: Tool access control, execution sandboxing, human-in-loop gates
Defense in depth applies at each stratum, with controls compounding rather than duplicating effort.
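The compounding effect can be pictured as a pipeline of independent predicates, each layer able to reject a request before it reaches the next. A toy sketch, with checks standing in for the controls named above:

```python
# Each layer is an independent predicate; a request must pass all of them.
# The checks are toy stand-ins for the controls in the taxonomy above.
def data_layer_ok(req):      return "ssn" not in req["payload"].lower()   # PII gate
def inference_layer_ok(req): return len(req["payload"]) < 2_000           # size/rate limit
def agentic_layer_ok(req):   return req["tool"] in req["allowed_tools"]   # tool ACL

LAYERS = [data_layer_ok, inference_layer_ok, agentic_layer_ok]

def defense_in_depth(req):
    """Admit the request only if every layer admits it."""
    return all(layer(req) for layer in LAYERS)

req = {"payload": "rebalance portfolio", "tool": "read_portfolio",
       "allowed_tools": {"read_portfolio"}}
print(defense_in_depth(req))   # True
req["tool"] = "execute_trade"
print(defense_in_depth(req))   # False: rejected at the agentic layer
```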
Data Security and Provenance Foundation
Data forms the bedrock of AI trustworthiness. Amazon Macie now classifies training datasets, identifying PII leakage before model ingestion. SageMaker Feature Store implements cryptographic commitment—hashing datasets to immutable ledger entries—enabling audit trails for regulatory compliance.
# SageMaker data provenance (illustrative sketch; the commit and
# audit-log calls are pseudocode, not the actual SDK surface)
feature_group = FeatureGroup(name="credit-risk")
feature_group.create(...)
# Hash the dataset into an immutable ledger entry for later audit
commit_hash = feature_group.commit(data_frame)
audit_log.put(commit_hash, metadata)
This provenance chain supports model cards that document training data composition, bias metrics, and fairness constraints, satisfying EU AI Act requirements.
Model Isolation and Confidential Computing
Model intellectual property requires protection equivalent to source code. AWS Nitro Enclaves provide hardware-isolated execution environments:
# Fetch the enclave's attestation document (illustrative endpoint)
curl --cert enclave.crt --key enclave.key \
  https://enclave.local/attestation
The enclave receives encrypted model weights, decrypts them internally, and serves inferences without exposing parameters. Memory encryption and remote attestation prevent exfiltration even by privileged host processes. Bedrock custom models execute within enclaves by default, removing the need to trust the underlying host infrastructure.
Runtime Guardrails and Content Moderation
Amazon Bedrock Guardrails implement multi-faceted content filtering:
{
  "blockedInputMessaging": "Policy violation",
  "blockedOutputsMessaging": "Response blocked",
  "contentPolicyConfig": {
    "filtersConfig": [
      {"type": "HATE", "inputStrength": "HIGH", "outputStrength": "HIGH"},
      {"type": "PROMPT_ATTACK", "inputStrength": "MEDIUM", "outputStrength": "NONE"}
    ]
  }
}
Filters operate at token level, with configurable strength thresholds. PII redaction, topic blocking, and word denylists combine with contextual analysis to prevent jailbreak attempts. Guardrails integrate with CodeWhisperer to scan generated code for vulnerabilities before execution.
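The filtering idea can be shown in miniature: a word denylist plus pattern-based PII redaction applied to every input. The sketch below is a toy stand-in for guardrail behavior, not Bedrock’s actual implementation; patterns and messages are illustrative:

```python
import re

# Toy stand-in for guardrail-style filtering: a word denylist plus
# PII redaction. Patterns and policy messages are illustrative,
# not Bedrock's actual implementation.
DENYLIST = {"exploit", "bypass"}
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def apply_guardrails(text):
    """Return (allowed, sanitized_text) after denylist and PII checks."""
    if any(word in text.lower() for word in DENYLIST):
        return False, "Policy violation"            # block the input outright
    return True, SSN_RE.sub("[REDACTED]", text)     # redact SSN-shaped strings

print(apply_guardrails("My SSN is 123-45-6789"))        # (True, 'My SSN is [REDACTED]')
print(apply_guardrails("How do I bypass the filter?"))  # (False, 'Policy violation')
```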
Agentic AI Supervision and Execution Control
Agentic workflows—LLMs that invoke tools, APIs, or other models—amplify risk surface. AWS implements execution sandboxing:
# Illustrative pattern: the bedrock_agent decorator and agent object
# are pseudocode standing in for an agent framework
@bedrock_agent
def trading_agent(prompt):
    # Each tool declares the IAM permission it needs, enforcing least privilege
    tools = [
        {"name": "execute_trade", "permissions": "trading:execute"},
        {"name": "read_portfolio", "permissions": "trading:read"},
    ]
    return agent.invoke(prompt, tools)
IAM-bound tool invocation ensures least privilege. Step Functions orchestrate multi-agent workflows with approval gates for high-risk actions. Anthropic’s enterprise deployment uses this pattern to route sensitive queries through human review while automating routine analysis.
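The least-privilege check plus approval gate can be sketched in a few lines. Permission strings below mimic IAM action names, and the mechanism is an illustrative sketch rather than the actual AWS enforcement path:

```python
# Least-privilege tool dispatch: a tool runs only if the caller's granted
# permissions include the one the tool declares, and high-risk actions
# are parked for human approval. Names mimic IAM actions; the mechanism
# is an illustrative sketch.
TOOLS = {
    "execute_trade":  {"permission": "trading:execute"},
    "read_portfolio": {"permission": "trading:read"},
}

def invoke_tool(name, granted, high_risk=("trading:execute",)):
    perm = TOOLS[name]["permission"]
    if perm not in granted:
        raise PermissionError(f"{name} requires {perm}")
    if perm in high_risk:
        return "pending_human_approval"   # approval gate for risky actions
    return "executed"

print(invoke_tool("read_portfolio", granted={"trading:read"}))   # executed
print(invoke_tool("execute_trade",
                  granted={"trading:read", "trading:execute"}))  # pending_human_approval
```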
Production Deployments and Operational Resilience
Robinhood’s AI-powered fraud detection processes 10 million transactions daily using SageMaker endpoints behind WAF rules that detect prompt injection patterns. BMW’s infrastructure optimization agent operates across 1,300 accounts with VPC-private networking and KMS-encrypted prompts.
These deployments share common patterns:
- Immutable infrastructure via ECS Fargate
- Blue/green model updates with shadow-mode testing
- Continuous evaluation using held-out datasets
- Automated rollback triggered by drift detection
Future Threat Modeling and Adaptive Controls
Emerging risks—model stealing via API querying, adversarial example crafting—require proactive modeling. AWS invests in automated reasoning to prove guardrail efficacy against known attack classes. Formal methods verify that prompt filters cannot be bypassed through encoding obfuscation.
Agentic systems introduce non-deterministic execution paths. Step Functions now support probabilistic branching with confidence thresholds, routing uncertain decisions to human oversight. This hybrid approach balances automation velocity with risk management.
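The routing rule itself is a one-line threshold comparison; the sketch below uses a hypothetical threshold and decision names to make it concrete:

```python
# Confidence-gated routing: decisions below a threshold go to a human
# review queue instead of executing automatically. The threshold and
# decision names are illustrative.
THRESHOLD = 0.85

def route(decision, confidence):
    """Return the execution path for a decision at a given confidence."""
    return "auto_execute" if confidence >= THRESHOLD else "human_review"

print(route("approve_refund", 0.97))   # auto_execute
print(route("close_account", 0.60))    # human_review
```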
Conclusion: Security as AI Innovation Substrate
The AWS AI security framework demonstrates that rigorous controls need not impede velocity. By providing data provenance, model isolation, runtime guardrails, and agentic supervision as managed services, AWS enables organizations to progress from proof-of-concept to production without security debt. The flexible control plane—configurable via console, API, or IaC—adapts to evolving regulations and threat landscapes. Security becomes the substrate that accelerates AI adoption, transforming defensive posture into competitive advantage.
[DevoxxFR2025] Building an Agentic AI with Structured Outputs, Function Calling, and MCP
The rapid advancements in artificial intelligence, particularly in large language models (LLMs), are enabling more sophisticated and autonomous AI agents: programs capable of understanding instructions, reasoning, and interacting with their environment to achieve goals. Building such agents requires effective ways for the model to communicate programmatically and to trigger external actions. In his deep-dive session, Julien Dubois explored key techniques and a new protocol essential for constructing these agentic AI systems: Structured Outputs, Function Calling, and the Model Context Protocol (MCP). Using practical examples and the Java SDK developed by OpenAI, he demonstrated how to implement these features with LangChain4j, showing how developers can build AI agents that go beyond simple text generation.
Structured Outputs: Enabling Programmatic Communication
One of the challenges in building AI agents is getting LLMs to produce responses in a structured format that can be easily parsed and used by other parts of the application. Julien explained how Structured Outputs address this by allowing developers to define a specific JSON schema that the AI model must adhere to when generating its response. This ensures that the output is not just free-form text but follows a predictable structure, making it straightforward to map the AI’s response to data objects in programming languages like Java. He demonstrated how to provide the LLM with a JSON schema definition and constrain its output to match that schema, enabling reliable programmatic communication between the AI model and the application logic. This is crucial for scenarios where the AI needs to provide data in a specific format for further processing or action.
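The contract can be shown with a minimal conformance check: the application declares the shape it expects and rejects any model reply that does not match. The session itself used the OpenAI Java SDK’s schema support; the sketch below uses Python with a hand-rolled validator to stay dependency-free, and the field names are hypothetical:

```python
import json

# Minimal structured-output check: declare the expected shape, then
# reject any model reply that does not conform. Hand-rolled validator
# for illustration; schema and fields are hypothetical.
SCHEMA = {"city": str, "temperature_c": float}

def parse_structured(reply_text):
    """Parse a model reply and enforce the declared field types."""
    data = json.loads(reply_text)
    for key, expected in SCHEMA.items():
        if not isinstance(data.get(key), expected):
            raise ValueError(f"field {key!r} missing or not {expected.__name__}")
    return data

reply = '{"city": "Paris", "temperature_c": 18.5}'
print(parse_structured(reply))   # {'city': 'Paris', 'temperature_c': 18.5}
```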
Function Calling: Giving AI the Ability to Act
To be truly agentic, an AI needs the ability to perform actions in the real world or interact with external tools and services. Julien introduced Function Calling as a powerful mechanism that allows developers to define functions in their code (e.g., Java methods) and expose them to the AI model. The LLM can then understand when a user’s request requires calling one of these functions and generate a structured output indicating which function to call and with what arguments. The application then intercepts this output, executes the corresponding function, and can provide the function’s result back to the AI, allowing for a multi-turn interaction where the AI reasons, acts, and incorporates the results into its subsequent responses. Julien demonstrated how to define function “signatures” that the AI can understand and how to handle the function calls triggered by the AI, showcasing scenarios like retrieving information from a database or interacting with an external API based on the user’s natural language request.
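The interception step Julien described reduces to a lookup and a call: the model’s reply names a function and arguments, and the host executes the match. Sketched here in Python for brevity (the session used Java methods); the function name and canned reply are hypothetical:

```python
import json

# Function-calling loop in miniature: the model's structured reply names
# a function and arguments; the host looks it up, executes it, and would
# feed the result back on the next turn. Names are hypothetical.
def get_weather(city):
    return {"city": city, "forecast": "sunny"}   # canned result for the sketch

FUNCTIONS = {"get_weather": get_weather}         # functions exposed to the model

model_reply = '{"function": "get_weather", "arguments": {"city": "Lyon"}}'
call = json.loads(model_reply)
result = FUNCTIONS[call["function"]](**call["arguments"])
print(result)   # {'city': 'Lyon', 'forecast': 'sunny'}
```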
MCP: Standardizing LLM Interaction
While Structured Outputs and Function Calling provide the capabilities for AI communication and action, the Model Context Protocol (MCP) has emerged as a standard for streamlining how LLM applications connect to data sources and tools. Julien discussed MCP as a client-server protocol that standardizes the layer between the application orchestrating a model and the external resources it exposes. This standardization supports more portable and interoperable agentic systems, allowing developers to switch between LLMs or integrate new tools and data sources more easily. While MCP is still evolving, its goal is a common interface for tasks like tool invocation, access to external knowledge, and management of conversational state. Julien illustrated how libraries like LangChain4j are adopting these concepts and integrating with MCP to simplify the development of sophisticated AI agents. The presentation, rich in code examples using the OpenAI Java SDK, gave developers practical knowledge and tools to start building the next generation of agentic AI applications.
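To make the protocol concrete: MCP is built on JSON-RPC 2.0, and a client asks a server to run a tool with a `tools/call` request. The sketch below constructs such a message; the tool name and arguments are illustrative:

```python
import json

# Sketch of an MCP-style request. MCP is layered on JSON-RPC 2.0; a
# client invokes a server-side tool via the "tools/call" method. The
# tool name and arguments here are illustrative.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "get_weather", "arguments": {"city": "Paris"}},
}
print(json.dumps(request))
```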
Links:
- Julien Dubois: https://www.linkedin.com/in/juliendubois/
- Microsoft: https://www.microsoft.com/
- LangChain4j on GitHub: https://github.com/langchain4j/langchain4j
- OpenAI: https://openai.com/
- Devoxx France LinkedIn: https://www.linkedin.com/company/devoxx-france/
- Devoxx France Bluesky: https://bsky.app/profile/devoxx.fr
- Devoxx France Website: https://www.devoxx.fr/