Jonathan Lalou's Blog

[VoxxedDaysBucharest2026] Building a Sarcastic, Agentic Pair Programmer: Alexander Chatzizacharias on Crafting Playful LLM Workflows

Author: Jonathan Lalou | June 26, 2026

Lecturer

Alexander Chatzizacharias is a software engineer at JDriven, a specialized consultancy in the Netherlands focused on JVM technologies and modern software development practices. With a unique background blending Dutch and Greek influences and a keen interest in game studies, Alexander brings creativity and playful thinking to technical challenges. He frequently speaks on topics including Java, Spring Boot, AI applications, and innovative development workflows.

Abstract

As mainstream AI coding assistants converge toward similar polished but somewhat generic experiences, Alexander Chatzizacharias demonstrates how to build a highly personalized, characterful AI pair programmer named “Pip.” Inspired by interactions with a sarcastic colleague named Ricardo, Pip incorporates personality through vectorized Slack history, utilizes Spring Boot and Kotlin, runs entirely locally with Qwen models via Ollama, and employs sophisticated workflows, multi-vector RAG, and the Model Context Protocol (MCP) to create delightful and productive assistance while addressing challenges like non-determinism and model drift.

The Homogenization of AI Assistants and the Quest for Personality

Alexander observes that leading AI coding tools have converged on remarkably similar chat-based interfaces and interaction patterns, largely influenced by OpenAI’s design choices. While incremental improvements continue, the overall experience feels increasingly uniform. This observation inspired the creation of Pip — an intentionally quirky, sarcastic AI pair programmer that injects personality drawn from real colleague interactions.

By processing Slack conversation history into vector embeddings stored in Qdrant, Pip can retrieve and emulate Ricardo’s characteristic sarcastic tone, witty retorts, and playful threats (such as threatening to delete poorly written code). This transforms the assistant from a neutral tool into a more engaging, human-like collaborator that questions unclear requirements, offers humorous feedback, and makes the development process more enjoyable.

Technical Architecture: Workflows, Agents, and Local Execution

Pip is implemented as a Spring Boot application written in Kotlin, with an IntelliJ IDEA plugin providing the frontend interface. Everything runs locally to maintain privacy and control: Qwen 3.5 models served through Ollama handle the language tasks.

Rather than pursuing fully autonomous agents, Alexander favors structured workflows that provide greater determinism and reliability — attributes particularly valued in enterprise environments. A categorization agent, functioning as an LLM-as-Judge, routes incoming queries to appropriate specialized handlers. Each handler uses carefully crafted system prompts derived from Slack history to consistently embody the desired personality traits.

The architecture incorporates multiple specialized agents for response generation, sophisticated RAG pipelines leveraging both dense and sparse vector representations with ColBERT reranking for improved retrieval quality, and integration with the Model Context Protocol (MCP) for tool usage such as playing music or generating memes when appropriate.

RAG, Tools, and the Challenges of Non-Determinism

Retrieval-Augmented Generation forms a cornerstone of Pip’s capabilities, dynamically pulling relevant context to overcome the inherent token limitations of even advanced models. Multi-vector search strategies combine semantic understanding with keyword precision for more reliable information retrieval from project documentation, codebases, and conversation history.

Tool integration via MCP enables rich interactions but introduces additional complexity due to the non-deterministic nature of local models. Alexander discusses practical challenges including prompt sensitivity to model updates (“model locking” strategies), the art of prompt engineering which he likens to “vibe checking,” and the necessity of implementing guardrails to maintain appropriate behavior boundaries.

Implications for Future AI Development

Alexander encourages attendees to experiment with building personalized, domain-specific AI assistants using accessible open-source tools. While acknowledging the increasing commercialization of AI, he emphasizes the current window of opportunity for creative, playful implementations that enhance both productivity and developer satisfaction.

Pip serves as an inspiring example of how thoughtful combination of RAG techniques, vector databases, workflow orchestration, and personality injection can create AI tools that feel genuinely collaborative rather than merely functional.

Links:

Posted in en-US | Tags: AgenticAI, AI, AlexanderChatzizacharias, Java, Kotlin, LLM, MCP, PairProgramming, Qdrant, Qwen, RAG, SpringBoot, VectorDatabase, VoxxedDaysBucharest2026