
[KotlinConf2024] The Best Programmer I Know: Insights from KotlinConf2024

At KotlinConf2024, Daniel Terhorst-North shared a heartfelt reflection on the traits of exceptional programmers, drawing from his 30-year career and a colleague who embodies these qualities. Without a formal degree, this programmer excels by starting tasks, prioritizing outcomes, simplifying solutions, choosing tools wisely, and fostering team growth. Daniel’s narrative, blending personal anecdotes and practical advice, inspires developers to cultivate curiosity, resilience, and empathy while building impactful software.

Starting with Action

Great programmers dive into tasks without hesitation. Daniel recounted how his colleague tackles projects by starting anywhere, embracing the unknown. This “just start” mindset counters procrastination, which Daniel admits to masking as research. By iterating rapidly—trying, failing, and learning—programmers overcome perfectionism and ego. Daniel likened progress to navigating a grid city, moving stoplight to stoplight, accepting delays as part of the journey, ensuring steady advancement toward solutions.

Prioritizing Outcomes Over Code

Building products, not just code, defines effective programming. Daniel emphasized that emotional investment should focus on outcomes, not code, which is merely a means. The best programmers write minimal, high-quality code, holding no attachment to it. Studying the domain reveals user needs, as Daniel learned during a financial project where ignorance of CDOs led to unintended consequences. Observing users’ frustrations, like manual data entry, uncovers opportunities to eliminate friction, enhancing product value.

Simplifying the Complex

Exceptional programmers see through complexity to find simple solutions. Daniel shared a story of his colleague bypassing bloated Java web servers by writing a lean one from the HTTP spec. In another case, a team debating JSON libraries was guided to implement a simple interface for nine serialized objects, avoiding heavy dependencies. Writing clear documentation, like a streamlined README, drives “embarrassment-driven refactoring,” ensuring solutions remain concise and maintainable, solving only what’s necessary.

Choosing Tools for the Problem

Tool selection should prioritize the product, not team familiarity. Daniel recounted a team learning Scala to sketch code quickly, despite no prior experience, proving adaptability trumps comfort. He advocated for polyglot programming, using Advent of Code to learn Rust and Go, which broadened his problem-solving perspective. By minimizing cognitive distance between problem and solution, as Rich Hickey’s “Simple Made Easy” suggests, programmers select tools that evolve with project needs, ensuring flexibility.

Fostering Team Care

Great programmers uplift their teams. Daniel finds joy in pairing and teaching, inspired by an XKCD comic about the “lucky 10,000” who learn a well-known fact for the first time each day. He creates environments for learning, drawing from jiu-jitsu’s teaching-first philosophy. Sending teams home to rest, as Daniel advocates, boosts effectiveness, while assuming positive intent, per Virginia Satir’s family therapy principle, builds empathy, transforming conflicts into opportunities for collaboration and growth.

Building Psychological Safety

Psychological safety, per Amy Edmondson’s research, is vital for high-performing teams. Daniel explained that safe teams encourage saying “I don’t know,” seeking help, and disagreeing without fear. A study of surgical teams showed high performers report more errors, reflecting trust, not incompetence. In software, this translates to teams where questions spark learning, help fosters collaboration, and dissent drives improvement, creating dynamic, challenging environments that fuel innovation.

Growing as a Programmer

Personal growth sustains programming excellence. Daniel urged developers to stay current through communities, contribute actively, and remain skeptical of trends like AI hype. Practicing via challenges like Advent of Code sharpens skills, as Daniel found when switching languages mid-puzzle. Balancing work with physical activities, like running, and prioritizing rest prevents burnout. By embracing continual learning and kindness, programmers evolve, as Daniel’s colleague demonstrates, into impactful, resilient professionals.


[DefCon32] DEF CON 32: Feet Feud

Tiberius, presenting as “Toes” with the OnlyFeet CTF team, hosted the lively “Feet Feud” game show at DEF CON 32, bringing together cybersecurity enthusiasts for a fun, interactive competition. Team captains Ali Diamond, known for hosting Hak5’s ThreatWire, and John Hammond, a prominent cybersecurity educator, led their teams in a spirited battle of wits. The event, a fan favorite after unofficial runs in previous years, engaged the audience with hacker-themed challenges and quirky prizes, fostering community spirit and camaraderie.

The Game Show Experience

Tiberius kicked off the event with infectious enthusiasm, introducing Ali and John as team captains. Audience members, selected based on vibrant attire like orange bow ties, joined the teams, creating an electric atmosphere. The game, inspired by classic game shows, featured cybersecurity-themed questions and challenges, blending humor with technical knowledge. Tiberius’s dynamic hosting, supported by assistants Helen and Wolfie, ensured a fast-paced, engaging experience that kept the crowd entertained.

Celebrating the Hacker Community

The event celebrated the DEF CON community’s creativity and collaboration, with Ali and John leading their teams through rounds that tested hacking trivia and problem-solving skills. Prizes, including Hack The Box VIP vouchers, coding socks, and whimsical baby foot candles, added a playful touch. Tiberius emphasized the importance of community-driven events like Feet Feud, which provide a lighthearted counterbalance to the conference’s technical intensity, strengthening bonds among attendees.

Building on Tradition

Reflecting on Feet Feud’s evolution from a small gathering to a main-stage event, Tiberius highlighted its growing popularity, with this year’s crowd far exceeding the previous high of 40 attendees. The game’s success, supported by sponsors like Hack The Box, underscores its role in fostering a sense of belonging within the cybersecurity community. By encouraging audience participation and celebrating victories with quirky rewards, Feet Feud reinforces DEF CON’s unique blend of learning and fun.

Looking Ahead

Concluding, Tiberius expressed hope for Feet Feud’s return with even grander prizes, thanking Helen and Wolfie for their invaluable support. The event’s success lies in its ability to unite hackers in a shared celebration of their craft, inspiring future iterations that continue to blend competition with camaraderie. Ali and John’s leadership, combined with the audience’s enthusiasm, ensures Feet Feud remains a cherished DEF CON tradition.


[KotlinConf2025] Building a macOS Screen Saver with Kotlin

Márton Braun, a Developer Advocate at JetBrains, shared a captivating tale of a developer’s obsession and a journey into the less-trodden paths of Kotlin development. It all began with a simple yet compelling observation at a previous KotlinConf: a screen saver featuring bouncing Kotlin logos, reminiscent of old DVD players. On discovering that it was merely a pre-rendered video and not a true screen saver, he took up the challenge. Márton set out to build his own, a native macOS application powered by Kotlin/Native.

The project became a masterclass in interoperability and a candid exploration of the quirks of native application development. Márton detailed how Kotlin/Native’s powerful interop capabilities made it surprisingly easy to call native macOS APIs. However, this ease was often contrasted with the complexities and frustrations of working with the macOS platform itself. The development process was a constant battle, with macOS often proving to be an uncooperative partner in this creative endeavor.

Márton’s perseverance paid off, resulting in a fully functional screen saver. He even managed to create two distinct implementations: one using the traditional AppKit framework and another built with Compose Multiplatform. This dual approach not only demonstrated the capabilities of both technologies but also provided a unique learning experience. He highlighted how the Compose version allowed him to focus on the core UI logic, abstracting away the intricacies of packaging the screen saver. This is a powerful testament to Compose Multiplatform’s potential for simplifying development and improving productivity.

The screen saver project serves as an excellent case study, showcasing Kotlin’s ability to venture into unconventional domains beyond mobile and backend development. Márton successfully demonstrated that with Kotlin and the right tools, developers can create truly native applications for platforms like macOS, leveraging their existing skills and knowledge. The flexibility of Kotlin Multiplatform allows developers to share code across platforms while still delivering a native user experience.

Ultimately, this project is a celebration of the unique possibilities that Kotlin offers. It encourages developers to think creatively about how they can apply the language to solve a wide range of problems and build applications for a diverse set of platforms. Márton’s story is an inspiring reminder that sometimes the most interesting and valuable projects are born from a simple desire to see something that doesn’t exist yet come to life.

[DevoxxFR2025] Building an Agentic AI with Structured Outputs, Function Calling, and MCP

The rapid advancements in Artificial Intelligence, particularly in large language models (LLMs), are enabling the creation of more sophisticated and autonomous AI agents: programs capable of understanding instructions, reasoning, and interacting with their environment to achieve goals. Building such agents requires effective ways for the AI model to communicate programmatically and to trigger external actions. Julien Dubois, in his deep-dive session, explored key techniques and a new protocol essential for constructing these agentic AI systems: Structured Outputs, Function Calling, and the Model Context Protocol (MCP). Using practical examples and the latest Java SDK developed by OpenAI, he demonstrated how to implement these features within LangChain4j, showcasing how developers can build AI agents that go beyond simple text generation.

Structured Outputs: Enabling Programmatic Communication

One of the challenges in building AI agents is getting LLMs to produce responses in a structured format that can be easily parsed and used by other parts of the application. Julien explained how Structured Outputs address this by allowing developers to define a specific JSON schema that the AI model must adhere to when generating its response. This ensures that the output is not just free-form text but follows a predictable structure, making it straightforward to map the AI’s response to data objects in programming languages like Java. He demonstrated how to provide the LLM with a JSON schema definition and constrain its output to match that schema, enabling reliable programmatic communication between the AI model and the application logic. This is crucial for scenarios where the AI needs to provide data in a specific format for further processing or action.
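As a rough illustration of the idea, the sketch below checks a model reply against a small JSON schema before mapping it to a typed object. It is written in Python for brevity rather than with the Java SDK used in the talk, and the schema, the `WeatherReport` class, and the field names are all hypothetical.

```python
import json
from dataclasses import dataclass

# Hypothetical schema: these field names are illustrative, not from the talk.
WEATHER_SCHEMA = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "temperature_celsius": {"type": "number"},
    },
    "required": ["city", "temperature_celsius"],
    "additionalProperties": False,
}


@dataclass
class WeatherReport:
    city: str
    temperature_celsius: float


# Python types accepted for each JSON type used in the schema above.
_JSON_TYPES = {"string": str, "number": (int, float)}


def parse_structured_output(raw: str) -> WeatherReport:
    """Parse a model reply and check it against WEATHER_SCHEMA before mapping it."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    props = WEATHER_SCHEMA["properties"]
    missing = [k for k in WEATHER_SCHEMA["required"] if k not in data]
    extra = [k for k in data if k not in props]
    if missing or extra:
        raise ValueError(f"missing={missing} extra={extra}")
    for name, spec in props.items():
        value = data[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, _JSON_TYPES[spec["type"]]):
            raise ValueError(f"{name} should be a {spec['type']}")
    return WeatherReport(**data)
```

With a provider that enforces the schema on the model side, this check becomes a safety net rather than the main mechanism, but it shows why a fixed structure makes the reply easy to consume from application code.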

Function Calling: Giving AI the Ability to Act

To be truly agentic, an AI needs the ability to perform actions in the real world or interact with external tools and services. Julien introduced Function Calling as a powerful mechanism that allows developers to define functions in their code (e.g., Java methods) and expose them to the AI model. The LLM can then understand when a user’s request requires calling one of these functions and generate a structured output indicating which function to call and with what arguments. The application then intercepts this output, executes the corresponding function, and can provide the function’s result back to the AI, allowing for a multi-turn interaction where the AI reasons, acts, and incorporates the results into its subsequent responses. Julien demonstrated how to define function “signatures” that the AI can understand and how to handle the function calls triggered by the AI, showcasing scenarios like retrieving information from a database or interacting with an external API based on the user’s natural language request.
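The dispatch step described above can be sketched as follows. This is a minimal Python illustration, not the API of any particular SDK: the `get_order_status` tool, the `TOOLS` registry, and the shape of the tool-call JSON are assumptions made for the example.

```python
import json


def get_order_status(order_id: str) -> str:
    """Hypothetical tool: a real one would query a database or an API."""
    return {"A1": "shipped"}.get(order_id, "unknown")


# Tools exposed to the model, keyed by the name the model is told about.
TOOLS = {"get_order_status": get_order_status}


def handle_tool_call(call_json: str) -> dict:
    """Run the function the model asked for and wrap the result as a message."""
    call = json.loads(call_json)
    name, args = call["name"], call.get("arguments", {})
    if name not in TOOLS:
        # Report the problem back to the model instead of crashing the loop.
        return {"role": "tool", "name": name, "content": f"error: unknown tool {name}"}
    result = TOOLS[name](**args)
    return {"role": "tool", "name": name, "content": str(result)}
```

The returned message would be appended to the conversation and sent back to the model, which is what makes the reason, act, and incorporate cycle multi-turn.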

MCP: Standardizing LLM Interaction

While Structured Outputs and Function Calling provide the capabilities for AI communication and action, the Model Context Protocol (MCP) emerges as a new standard to streamline how LLMs interact with various data sources and tools. Julien discussed MCP as a protocol that aims to standardize the communication layer between AI models and the application logic that orchestrates them and provides access to external resources. This standardization can facilitate building more portable and interoperable AI agentic systems, allowing developers to switch between different LLMs or integrate new tools and data sources more easily. While details of MCP might still be evolving, its goal is to provide a common interface for tasks like function calling, accessing external knowledge, and managing conversational state. Julien illustrated how libraries like LangChain4j are adopting these concepts and integrating with protocols like MCP to simplify the development of sophisticated AI agents. The presentation, rich in code examples using the OpenAI Java SDK, provided developers with the practical knowledge and tools to start building the next generation of agentic AI applications.


[DefCon32] DEF CON 32: HookChain – A New Perspective for Bypassing EDR Solutions

Helvio Carvalho Junior, a renowned security researcher and CEO of Sec4US, unveiled his groundbreaking HookChain technique at DEF CON 32, offering a fresh perspective on evading Endpoint Detection and Response (EDR) systems. By combining Import Address Table (IAT) hooking, dynamic System Service Number (SSN) resolution, and indirect system calls, Helvio demonstrated how HookChain stealthily redirects Windows subsystem execution flows, bypassing EDR monitoring without altering application code. His presentation, enriched with live demonstrations, challenged cybersecurity conventions and spurred discussion on adaptive defense strategies.

Understanding EDR Limitations

Helvio opened by outlining the rapid evolution of digital threats, which continuously challenge EDR solutions designed to monitor API calls through Ntdll.dll. He explained that traditional EDRs rely on hooking key functions to detect malicious activity, but these mechanisms can be circumvented. HookChain exploits this by manipulating the execution flow to avoid monitored interfaces, achieving stealth without modifying the source code of applications or malware. Helvio’s approach highlights the need for EDRs to evolve beyond static monitoring techniques.

Technical Mechanics of HookChain

Delving into the technical core, Helvio detailed HookChain’s methodology, which integrates IAT hooking to redirect function calls, dynamic SSN resolution to adapt to varying Windows versions, and indirect system calls to bypass EDR hooks. His live demo showcased shellcode injection into a process, executing it undetected by EDRs like CrowdStrike and SentinelOne. By leveraging techniques like Halo’s Gate to locate unhooked functions, HookChain ensures malicious payloads operate invisibly, achieving an impressive 66% success rate against top EDR products listed in Gartner’s Magic Quadrant.

Testing and Vendor Responses

Helvio shared results from testing HookChain against various EDR solutions, including remote process injection and credential dumping with tools like Mimikatz. His findings revealed that while some vendors, such as SentinelOne, implemented patches to counter HookChain, others lagged in response. He emphasized the importance of open collaboration, noting that two vendors engaged with him to test mitigations. Helvio’s transparency, including sharing his whitepaper and source code on GitHub, encourages the community to refine and challenge his techniques, fostering stronger defenses.

Future Directions for Cybersecurity

Concluding, Helvio urged the DEF CON community to embrace continuous innovation in security research. HookChain not only exposes vulnerabilities in current EDR systems but also paves the way for more adaptive solutions. He advocated for proactive strategies that anticipate emerging threats, inspiring researchers to explore new evasion techniques and defenders to enhance monitoring beyond Ntdll.dll. His work, rooted in a passion for discovery, sets a benchmark for advancing endpoint security in a dynamic threat landscape.


[DefCon32] DEF CON 32: Leveraging Private APNs for Mobile Network Traffic Analysis

Aapo Oksman, a seasoned security researcher specializing in IoT and network protocols, delivered a compelling presentation at DEF CON 32 on harnessing private Access Point Names (APNs) to analyze mobile and IoT device traffic. As devices increasingly rely on 4G and 5G networks, bypassing traditional Wi-Fi monitoring, Aapo’s innovative approach enables security professionals to inspect, filter, and tamper with mobile network traffic. His talk provided practical techniques for both offensive and defensive cybersecurity, from penetration testing to detecting malicious activity in mobile ecosystems.

Challenges in Mobile Network Monitoring

Aapo began by highlighting the shift in device communication from Wi-Fi to mobile networks, which complicates traditional traffic analysis due to direct connections to ISP-operated base stations. Setting up private base stations, while possible, is costly and complex. Aapo introduced private APNs as a cost-effective alternative, allowing users to create isolated networks within ISP infrastructure. This approach grants visibility into device communications, overcoming the limitations of locked-down devices and enabling detailed traffic analysis for security purposes.

Harnessing Private APNs for Security

Delving into the technical details, Aapo explained how private APNs can be configured to route mobile traffic through controlled environments, such as firewalls or custom servers. His demonstration showcased the setup process, emphasizing affordability and scalability, with costs decreasing as more devices are added. By intercepting IP traffic, security professionals can perform penetration testing on IoT devices or monitor for malicious activity, such as command-and-control (C2) communications. Aapo’s approach leverages ISP infrastructure to create a controlled network environment, enhancing both offensive and defensive capabilities.

Uncovering Advanced Malware Threats

Aapo addressed the growing sophistication of mobile malware, which often avoids Wi-Fi or VPN connections to evade detection. He cited an example of a misconfigured malware detected via Wi-Fi traffic, underscoring that advanced threats are designed to operate solely over mobile networks. Private APNs enable defenders to monitor these communications, identifying C2 servers or other malicious activities that would otherwise go unnoticed. Aapo’s insights highlight the critical need for innovative monitoring techniques to counter evolving mobile threats.

Practical Applications and Future Directions

Concluding, Aapo shared project details and encouraged the DEF CON community to explore private APNs for their research. He emphasized the dual-use potential of his approach, enabling both penetration testers and defenders to gain deeper insights into mobile device behavior. By connecting private APNs to existing security infrastructure, organizations can enhance their ability to detect and mitigate threats. Aapo’s work paves the way for future advancements in mobile network security, urging continued exploration of ISP-based solutions.


[NDC Security 2025] Hacking History: The First Computer Worm

Håvard Opheim, a software developer at Kaa, took the audience at NDC Security 2025 in Oslo on a captivating journey through the history of the Morris Worm, the first significant malware to disrupt the early internet. Through a blend of historical narrative and technical analysis, Håvard explored the worm’s impact, its technical mechanisms, and the enduring lessons it offers for modern cybersecurity. His talk, rich with anecdotes and technical insights, highlighted how vulnerabilities exploited in 1988 remain relevant today.

The Dawn of the Morris Worm

Håvard set the stage by describing the internet of 1988, a nascent network connecting research institutions and defense installations via ARPANET. With minimal security controls, this “walled garden” fostered trust among users, allowing easy data sharing but also exposing systems to exploitation. On November 2, 1988, the Morris Worm, created by Cornell graduate student Robert Morris, brought this trusting network to its knees. Håvard recounted how the worm rendered computers across North America unusable, affecting universities, NASA, and the Department of Defense.

The worm’s rapid spread, Håvard explained, was not a deliberate attack but the result of a coding error by Robert. Intended as a proof-of-concept to highlight internet vulnerabilities, the worm’s aggressive replication turned it into a denial-of-service (DoS) fork bomb, overwhelming systems. Håvard’s narrative brought to life the chaos of that night, with system administrators scrambling to mitigate the damage as the worm reinfected systems despite reboots.

Technical Exploits and Vulnerabilities

Delving into the worm’s mechanics, Håvard outlined its exploitation of multiple vulnerabilities. The worm targeted Unix-based systems, leveraging flaws in the finger and sendmail programs. The finger daemon, used to query user information, suffered from a buffer overflow vulnerability due to the gets function, which lacked bounds checking. By sending a 536-byte payload—exceeding the 512-byte buffer—the worm overwrote memory to execute a remote shell, granting attackers full access.
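The arithmetic behind the overflow can be illustrated with a small simulation. The sketch below is Python, not the original C, and it models memory as a flat byte array: the 512-byte buffer and 536-byte payload come from the talk, while the 32 bytes of adjacent data and all function names are assumptions made for the example.

```python
BUFFER_SIZE = 512   # size of the finger daemon's buffer, per the talk
PAYLOAD_SIZE = 536  # size of the worm's payload, per the talk
ADJACENT = 32       # hypothetical bytes that sit right after the buffer


def fresh_memory() -> bytearray:
    """Model memory as the buffer followed by unrelated adjacent data."""
    return bytearray(b"\x00" * BUFFER_SIZE + b"S" * ADJACENT)


def unchecked_copy(memory: bytearray, data: bytes) -> None:
    """In the spirit of gets(): copies everything, however long the input is."""
    memory[0:len(data)] = data


def bounded_copy(memory: bytearray, data: bytes) -> None:
    """In the spirit of a length-checked read: never writes past the buffer."""
    chunk = data[:BUFFER_SIZE]
    memory[0:len(chunk)] = chunk


def clobbered(memory: bytearray) -> int:
    """Count adjacent bytes that no longer hold their original value."""
    return sum(1 for b in memory[BUFFER_SIZE:] if b != ord("S"))
```

Copying a 536-byte payload with `unchecked_copy` overwrites 536 − 512 = 24 bytes beyond the buffer, while `bounded_copy` leaves the adjacent data untouched. On a real system those extra bytes can land on control data such as a return address, which is what turns an overflow into code execution.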

Similarly, the sendmail program, running in debug mode on BSD 4.2 and 4.3, allowed commands in the recipient field, enabling the worm to send itself as an email and execute on the recipient’s system. Håvard also highlighted the worm’s password-cracking capabilities, exploiting predictable user behaviors, such as using usernames as passwords or simple variations like reversed usernames. These flaws, combined with insecure remote execution tools like rexec and rsh, allowed the worm to propagate rapidly across trusted networks.
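Seen from the defender's side, the predictable patterns mentioned above are easy to screen for. The following Python sketch flags a password that matches the username or its reverse; the function names are hypothetical, and the case-insensitive comparison is an assumption added for the example rather than a detail from the talk.

```python
def predictable_guesses(username: str) -> set[str]:
    """Guesses of the kind described above: the username and its reverse."""
    name = username.lower()
    return {name, name[::-1]}


def is_predictable(username: str, password: str) -> bool:
    """Return True if the password is one of the trivially guessable variants."""
    return password.lower() in predictable_guesses(username)
```

A check like this could run when a password is set, for example `is_predictable("alice", "ecila")` flags the reversed username, so that such choices are rejected before an attacker gets to try them.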

Response and Legacy

Håvard described the community’s swift response, with ad-hoc working groups at Berkeley and MIT dissecting the worm overnight. By November 3, 1988, researchers had identified and patched the vulnerabilities, and within days, the worm’s source code was decompiled, revealing its inner workings. The incident, Håvard noted, marked a turning point, introducing the term “internet” to mainstream media and prompting the creation of the Computer Emergency Response Team (CERT).

The legal aftermath saw Robert convicted under the newly enacted Computer Fraud and Abuse Act (CFAA) of 1986, the first such conviction. Despite the worm’s benign intent, its impact, estimated at $100,000 to $10 million in damages, underscored the need for robust cybersecurity. Håvard emphasized that Robert’s career rebounded, with contributions to e-commerce and the co-founding of Y Combinator, but the incident left a lasting mark on the industry.

Enduring Lessons for Cybersecurity

Reflecting on the worm’s legacy, Håvard highlighted its relevance to modern cybersecurity. The vulnerabilities it exploited—buffer overflows, weak passwords, and insecure configurations—persist in today’s systems, albeit in patched forms. He stressed that human behavior remains a weak link, with users still prone to predictable password patterns. The worm’s unintended DoS effect also serves as a cautionary tale about the risks of untested code in production environments.

Håvard advocated for proactive measures, such as regular patching, strong authentication, and threat modeling, to mitigate similar risks today. He underscored the importance of learning from history, noting that the internet’s growth has amplified the stakes. By understanding past incidents like the Morris Worm, developers can build more resilient systems, recognizing that no system is inherently secure.

Hashtags: #MorrisWorm #CybersecurityHistory #NDCSecurity2025 #HåvardOpheim #Kaa #InternetSecurity #Malware

[GoogleIO2024] What’s New in Android Development Tools: Boosting Efficiency and Innovation

Jamal Eason, Tor Norbye, and Ryan McMorrow unveil the latest in Android Studio: AI integration, Compose enhancements, and Firebase tooling for superior app development.

Evolving Roadmap with AI Integration

The roadmap moves from Hedgehog’s vitals and Iguana’s baselines through the stability-focused Jellyfish release to Koala, which previews Gemini enhancements in 200+ regions. Privacy controls empower users. Quality work resolved 900+ bugs and cut memory use by 33%.

Gemini excels in code tasks, from generation to refactoring, accelerating workflows.

Advanced Features in Editing and Firebase

Koala’s IntelliJ base introduces sticky lines, improved navigation, and device-agnostic previews. Firebase’s Genkit streamlines AI, Crashlytics aids prioritization.

App insights aggregate issues; device streaming reproduces crashes on real hardware.

Streamlined Debugging and Release Cadence

Crashlytics’ diffs trace origins; streaming ensures secure testing.

Platform-first releases with feature drops double updates, enhancing stability.

Ladybug (2024.2.1) adds K2 mode; Koala Feature Drop (2024.1.2) expands devices.


[DefCon32] DEF CON 32: Iconv, Set the Charset to RCE – Exploiting glibc to Hack the PHP Engine

Charles Fox, a security researcher with a knack for uncovering hidden vulnerabilities, captivated the DEF CON 32 audience with his exploration of CVE-2024-2961, a long-standing buffer overflow in the GNU C Library (glibc) that he leveraged to compromise the PHP engine. Discovered by chance while auditing PHP, Charles’s work revealed new remote code execution (RCE) vectors and previously unknown zero-day vulnerabilities. His presentation offered a deep dive into the internals of PHP, showcasing innovative exploitation techniques and their impact on the broader PHP ecosystem, while providing actionable insights for securing web applications.

Discovering the glibc Vulnerability

Charles stumbled upon CVE-2024-2961 while auditing PHP, though the flaw resided in glibc’s iconv library, responsible for character set conversion. This buffer overflow, overlooked for years, presented a potent opportunity for exploitation within PHP’s context. Charles detailed how his accidental discovery unfolded, emphasizing the importance of thorough code audits. By analyzing the iconv library’s behavior, he identified a pathway to manipulate PHP’s execution environment, transforming a seemingly innocuous bug into a powerful attack vector. His approach underscores the value of curiosity-driven research in uncovering critical security flaws.

Crafting Remote Code Execution Exploits

Delving into the technical intricacies, Charles explained two distinct methods to achieve RCE using the glibc vulnerability. The first targeted PHP filters, a lesser-known component of the PHP engine, which he manipulated to execute arbitrary code remotely. The second approach exploited direct calls to iconv, bypassing conventional security checks. His live demonstration showcased a sophisticated exploit that navigated PHP’s memory management constraints, even in scenarios without output visibility or with randomized memory allocations. Charles’s ability to achieve a shell under such conditions highlighted the vulnerability’s severity and his ingenuity in exploit development.

Impact on the PHP Ecosystem

Charles explored the broader implications of CVE-2024-2961, revealing its reach across popular PHP libraries and applications, including webmail platforms like Roundcube. He noted that email headers specifying charsets provided an ideal entry point for exploitation, as attackers could craft malicious inputs to trigger the buffer overflow. His analysis of affected sinks, from well-known functions to obscure code paths, underscored the pervasive risk within PHP-based systems. By sharing his findings, Charles aimed to alert developers to the hidden dangers in widely used software and encourage proactive vulnerability management.

Mitigation Strategies for Developers

Concluding, Charles offered practical recommendations to fortify PHP applications against similar exploits. He urged developers to update glibc to patched versions and scrutinize charset handling in their codebases. Additionally, he advocated for robust input validation and the use of secure coding practices to minimize exposure to buffer overflows. His work, shared openly with the community, empowers developers to strengthen their systems and inspires further research into PHP’s security landscape, ensuring the web remains a safer environment.
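One way to scrutinize charset handling is to accept only a short allowlist of encodings before any conversion happens. The sketch below is a Python illustration of that idea and not a recommendation taken from the talk: the allowlist contents and the `safe_charset` helper are hypothetical. For context, the public glibc advisory for CVE-2024-2961 attributes the overflow to conversions into the ISO-2022-CN-EXT charset, which an allowlist like this would never pass through.

```python
# Hypothetical allowlist: adjust to the encodings an application really needs.
ALLOWED_CHARSETS = {"utf-8", "us-ascii", "iso-8859-1", "windows-1252"}


def safe_charset(requested: str, default: str = "utf-8") -> str:
    """Return the requested charset only if it is on the allowlist."""
    # Charset names from headers may be quoted and use any letter case.
    normalized = requested.strip().strip('"').lower()
    return normalized if normalized in ALLOWED_CHARSETS else default
```

Updating glibc remains the actual fix; filtering attacker-supplied charset names only narrows the set of conversion routines that untrusted input can reach.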


[DevoxxUK2025] Mastering Prompt Engineering for Immersive Text-Based Adventures

At DevoxxUK2025, Charles-Philippe Bernard, a software engineer at JPMorgan in Glasgow, captivated attendees with his talk on mastering prompt engineering through his remastered 1980s text-based adventure game, SRAM. Using the Godot engine, a WebSocket Python server, and Ollama for local LLM inference with Llama 3.1, Charles showcased how carefully crafted prompts bring dynamic interactions to life. His presentation explored the art of prompt engineering, demonstrating how to shape AI responses for immersive gameplay, manage game states, and handle NPC interactions. Through practical examples, he shared techniques to harness AI’s potential while navigating its quirks, such as hallucinations, offering developers actionable insights to create engaging experiences.

Crafting the System Prompt

Charles began by emphasizing the importance of a well-defined system prompt, which sets the tone and context for the LLM. In SRAM, the prompt establishes the AI as the “Game Master,” named Gun Master, responsible for narrating the adventure in a JSON-formatted output. This structure includes speaker ID, response text, and actions, ensuring consistency across interactions. By injecting variables like scene state and inventory, Charles demonstrated how the prompt adapts dynamically, enabling the game to track items like a knife or navigate scenes. He stressed the need for clear, structured instructions to guide the LLM, especially for smaller models like Llama 3.1’s 8-billion-parameter version, which may struggle with complex tasks.
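A prompt template of this kind might look like the following Python sketch. The wording of the prompt, the JSON key names, and the `build_system_prompt` helper are assumptions made for illustration and are not taken from the game's actual source.

```python
from string import Template

# Hypothetical system prompt: structured, with uppercase keywords for emphasis.
SYSTEM_PROMPT = Template("""\
You are the Game Master of a text adventure.
ALWAYS reply with a single JSON object with these keys:
- "speaker_id": who is talking
- "text": a short narration
- "actions": a list of actions

Current scene: $scene
Inventory: $inventory
""")


def build_system_prompt(scene: str, inventory: list[str]) -> str:
    """Inject the current game state into the system prompt."""
    # State an empty inventory explicitly so the model does not invent items.
    items = ", ".join(inventory) if inventory else "no items"
    return SYSTEM_PROMPT.substitute(scene=scene, inventory=items)
```

Rebuilding the prompt on every turn, for example with `build_system_prompt("a dark cave", ["knife"])`, keeps the model's view of the scene and inventory in step with the game's own state.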

Managing Game State and NPCs

A key challenge in SRAM is maintaining the game’s state, including inventory, scene descriptions, and NPC interactions. Charles explained how the prompt template incorporates variables to reflect the player’s progress, such as adding a knife to the inventory after picking it up. For NPCs, like the leprechaun Fergus, he crafted specific instructions to define personality, tone (e.g., a humorous Irish accent), and behavior, using few-shot examples to steer responses. However, he noted challenges like the LLM repeating examples verbatim or hallucinating actions, which he mitigates by balancing creativity (via a temperature of 0.8) with structured outputs to ensure consistency.
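Keeping the authoritative state in code, and treating the model's reply only as a request to change it, is one way to contain hallucinated actions. The Python sketch below assumes a reply format with an `actions` list of `type` and `target` pairs; that format, the action names, and the `apply_reply` helper are hypothetical.

```python
import json

KNOWN_ACTIONS = {"take", "drop", "move"}  # hypothetical action names


def apply_reply(raw: str, inventory: list[str]) -> list[str]:
    """Validate a model reply and return the updated inventory."""
    reply = json.loads(raw)
    updated = list(inventory)  # work on a copy, leave the caller's list intact
    for action in reply.get("actions", []):
        kind, target = action.get("type"), action.get("target")
        if kind not in KNOWN_ACTIONS:
            continue  # ignore hallucinated actions instead of trusting them
        if kind == "take" and target not in updated:
            updated.append(target)
        elif kind == "drop" and target in updated:
            updated.remove(target)
    return updated
```

The updated inventory can then be injected into the next prompt, so an invented action such as making a pizza never reaches the game state.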

Handling AI Quirks and Hallucinations

Charles candidly addressed the LLM’s limitations, particularly hallucinations, where the model generates unexpected or incorrect actions, like responding to “make me a pizza” outside the game’s context. By setting a temperature of 0.8, he balances creativity with adherence to instructions, though this sometimes leads to inconsistent outputs. He shared techniques like explicit instructions (e.g., listing no items in the inventory) and iterative prompt refinement, often using larger models like ChatGPT to improve prompts for smaller, local models. Charles also highlighted the importance of testing prompts with humans to ensure clarity, as unclear instructions confuse both humans and AI.

Practical Tips for Prompt Engineering

To master prompt engineering, Charles recommended starting with a clear, structured prompt template, using markdown or bullet points for readability. He advised including specific guidelines, like short responses or JSON formatting, and leveraging few-shot examples to guide the model. For smaller models, verbose yet clear instructions are crucial, as they lack the reasoning power of larger frontier models. Charles also emphasized iterative refinement, storing interactions for testing consistency, and using tools like uppercase keywords or structured formatting to enhance the model’s understanding. His approach empowers developers to create dynamic, AI-driven experiences while managing the inherent challenges of LLMs.
