
Posts Tagged ‘BacklogMD’

[DevoxxBE2025] Backlog.md: Reaching 95% Task Success Rate with AI Agents

Lecturer

Alex Gavrilescu is the developer of Backlog.md, a command-line tool for AI-assisted project management, with a background in software and mobile development. He focuses on workflows that raise AI task success rates, drawing on his experience building side projects.

Abstract

This talk traces the journey from early failures with AI coding assistants to a refined setup that achieves a near-perfect task success rate with Backlog.md. It explains concepts such as spec-driven development and agent orchestration, set against the shortcomings of early prompting. Highlighting strategies for supplying context and choosing models, it examines the impact on productivity, particularly in offline settings. The talk goes into depth on moving to AI-first project management, stressing actionable task lists and integrations.

Initial Challenges with AI Assistance

Early AI experiments, such as pointing Claude at a repository, often failed because "bare" prompts lacked context, producing more rework than progress. Success rates hovered around 50%, hampered by repository clutter and the model's partial understanding.

Context: the AI hype promised automation, but in practice structured input proved essential. Adding context documents raised success rates to about 75%, because agents finally received the details they needed.

Implication: poorly structured setups waste time; a methodical approach turns AI into a dependable assistant.

Refining Workflows for Higher Success Rates

Backlog.md stores tasks as Markdown files inside the repository, enabling parallel work and agent handling. A few CLI commands turn ideas into tasks:

backlog init
backlog task create "Add user authentication"
backlog task list

Agents plan, implement, and verify. Model comparison: Claude for reasoning, Codex for coding, Jules for its particular strengths.

Analysis: task lists define each agent's role: Claude plans, Codex implements. Implication: a 95% success rate through orchestration.

Mobile-Only and Integration Strategies

Mobile-only workflows test portability: the CLI allows task management without a workstation. Merging changes live from a phone demonstrates this flexibility.

In practice, syncing with GitHub Issues broadens the tool's usefulness, though the integration is intricate.

Implication: AI enables development from anywhere, a boon for side projects.

Production Readiness and Future Improvements

Backlog.md reaches its high success rates through specifications; it does not replace tools like Jira but complements them for agents.

Looking ahead: GR integrations for enterprise use.

In summary, structured AI workflows transform development and maximize task success.

Links:

  • Lecture video: https://www.youtube.com/watch?v=LSoDQU_9MMA
  • Alex Gavrilescu on Twitter/X: https://twitter.com/H3xx3n

[VoxxedDaysTicino2026] Backlog.md: The Simplest Project Management Tool for the AI Era

Lecturer

Alex Gavrilescu is a full-stack developer with extensive experience in .NET and Vue.js technologies. He has been actively involved in software development for many years and shifted his focus toward artificial intelligence last year. Alex developed Backlog.md as a side project starting at the end of May 2025, while maintaining a full-time role in the casino industry. He shares insights through blog articles on platforms like LinkedIn and X (formerly Twitter). Relevant links include his LinkedIn profile (https://www.linkedin.com/in/alex-gavrilescu/) and X account (https://x.com/alexgavrilescu).

Abstract

This article examines Alex Gavrilescu’s presentation on his journey in AI-assisted software development and the creation of Backlog.md, a terminal-based project management tool designed to enhance predictability and structure in workflows involving AI agents. Drawing from personal experiences, the discussion analyzes the evolution from unstructured prompting to a systematic approach, emphasizing task decomposition, context management, and delegation modes. It explores the tool’s features, limitations, and implications for spec-driven AI development, highlighting how such methodologies foster deterministic outcomes in non-deterministic AI environments.

Context of AI Integration in Development Workflows

In the evolving landscape of software engineering, the integration of artificial intelligence agents has transformed traditional practices. Alex begins by contextualizing his experiences, noting the shift from basic code completions in integrated development environments (IDEs) like Visual Studio’s IntelliSense, which relied on simple machine learning or pattern matching, to more advanced tools. The advent of models like ChatGPT allowed developers to query and incorporate code snippets, reducing friction but still requiring manual transfers.

The introduction of GitHub Copilot marked a significant advancement, embedding AI directly into IDEs for contextual queries and modifications. However, the true leap came with agent modes, where AI operates in a loop, utilizing tools and gathering context autonomously until task completion. Alex distinguishes between “steer mode,” where developers iteratively guide AI through prompts and approvals, and “delegate mode,” where comprehensive instructions are provided upfront for independent execution. His focus leans toward delegation, aiming for reliable outcomes without constant intervention.

This context is crucial as AI models are inherently non-deterministic, yielding varied results from identical prompts. Alex draws parallels to human collaboration, where structured information—clarifying the “why,” “what,” and “how”—ensures success. He references practices like Gherkin scenarios (given-when-then) but simplifies them to acceptance criteria and definitions of done, adapting them for AI efficiency. Early challenges, such as limited context windows in models like those from May 2025, necessitated task breakdown to avoid information loss during compaction.

The implications are profound: unstructured AI use often leads to abandonment, as complexity escalates failure rates. Alex classifies developers into categories like “vibe coders” (improvisational prompting without code review) and “AI product managers” (structured delegation with final reviews), illustrating how his journey from near-abandonment to 95% success stemmed from imposing structure.

Development and Features of Backlog.md

Backlog.md emerged as Alex’s solution to the limitations of manual task structuring. Initially, he created tasks in Markdown files, logging them in Git repositories for sharing and history. This allowed referencing between tasks, scoping to prevent derailment, and assigning tasks to specialized agents (e.g., Opus for UI, Codex for backend). By avoiding database or API dependencies, agents could directly read files, enhancing efficiency.

The tool formalizes this into a command-line interface (CLI) resembling Git commands: backlog task create, edit, list. Tasks are stored as Markdown with a front-matter section for metadata (title, ID, dependencies, status). Sections include “why” for problem context, acceptance criteria with checkboxes for self-verification, implementation plans generated by agents, and notes/summaries for pull request descriptions.
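A task file following the structure described above might look like the sketch below. It is illustrative only: the front-matter field names and section headings mirror what the talk lists, but their exact spelling in Backlog.md may differ.

```markdown
---
id: task-42
title: Add user authentication
status: To Do
dependencies: [task-41]
---

## Why
Users currently share one account; individual logins are needed.

## Acceptance Criteria
- [ ] Login rejects invalid credentials
- [ ] Unit tests cover the happy path

## Implementation Plan
(generated by the agent when the task starts)

## Implementation Notes
(summary reused for the pull request description)
```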

Backlog.md supports subtasks, dependencies (e.g., “relates to” or “blocked by”), and a web interface for easier editing, including rich text and dark mode. It operates offline, uses Git for synchronization across branches, and avoids conflicts by leveraging repository permissions for security. Notably, 99% of its code was AI-generated, with Alex reviewing initial tasks, demonstrating the tool’s recursive utility.

Limitations include no direct task initiation from the interface, self-hosting requirements, single-repo support, experimental documentation/decisions sections, and absent integrations like GitHub Issues or Jira. As a solo side project, it lacks production-grade support, but welcomes community contributions via issues or pull requests.

In practice, Alex showcases Backlog.md in a live demo for spec-driven development. Starting with a product requirements document (PRD) generated by an agent like Claude, tasks are decomposed. Implementation plans are reviewed per task to adapt to changes, ensuring accuracy. Sub-agents orchestrate parallel planning, with human checkpoints at description, plan, and code stages.

Methodological Implications for Spec-Driven AI Development

Spec-driven AI development, as outlined, requires clear intent expression before execution. Backlog.md facilitates this by breaking projects into manageable tasks, delegating to agents for research, planning, and coding. A feedback loop refines agent instructions, specs, and processes.

Alex’s workflow begins with PRD creation, followed by task decomposition adhering to Backlog.md guidelines. Agents generate plans only upon task start, preventing obsolescence. For a task-scheduling feature, he demonstrates PRD prompting, task creation, and sub-agent orchestration for plans, emphasizing acceptance criteria for verification.
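The checkpointed delegation described here can be sketched as a small loop: the human reviews the task description, then the plan (generated only once the task starts), then the resulting code. Everything below is a hypothetical stand-in for the process, not a Backlog.md API; `agent` is any object exposing `plan` and `implement`.

```python
# Sketch of the delegation loop with human checkpoints at the
# description, plan, and code stages. All names are illustrative.
CHECKPOINTS = ["description", "plan", "code"]

def run_task(task, agent, review):
    """Delegate one task, pausing for human review at each checkpoint.

    `review(stage, artifact)` returns True to continue, False to stop.
    """
    description = task["description"]
    if not review("description", description):
        return "rejected at description"
    # Plans are generated only when the task starts, so they reflect
    # the current state of the codebase rather than a stale snapshot.
    plan = agent.plan(task)
    if not review("plan", plan):
        return "rejected at plan"
    code = agent.implement(task, plan)
    if not review("code", code):
        return "rejected at code"
    return "done"
```

The same loop can run in parallel across independent tasks, which is where the sub-agent orchestration mentioned above comes in.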

The methodology promotes one-task-per-context-window sessions, referencing summaries to avoid bloat. Definitions of done, global across projects, enforce testing, linting, and security checks. This counters “vibe coding’s” directional uncertainty, ensuring guardrails like unit tests prevent premature completion claims.
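Because acceptance criteria live as Markdown checkboxes, an agent's self-verification step can be as simple as scanning the task file for unchecked items before claiming completion. The helper below is a hypothetical illustration of that idea, not part of Backlog.md.

```python
import re

def unchecked_criteria(task_markdown: str) -> list[str]:
    """Return acceptance criteria still unchecked (lines like '- [ ] ...')."""
    return re.findall(r"^- \[ \] (.+)$", task_markdown, flags=re.MULTILINE)

task = """\
## Acceptance Criteria
- [x] Scheduler persists tasks across restarts
- [ ] Unit tests cover overlapping schedules
"""
print(unchecked_criteria(task))  # ['Unit tests cover overlapping schedules']
```

A guardrail like this is what prevents the premature "done" claims the talk warns about: the task cannot close while any box is unchecked.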

Implications extend to project readiness: documentation for agent onboarding mirrors human processes, with skills, code styles, and self-verification loops enhancing efficiency. Alex references a Factory.ai article on AI-ready maturity levels, underscoring documentation’s role.

Challenges persist in UI verification, requiring human QA, and complex integrations. Yet, the approach allows iterations without full restarts, leveraging cheap tokens for refinements.

Consequences and Future Directions

Backlog.md’s simplicity yields repeatability, boosting success from 50% (slot-machine-like prompting) to 95%. By structuring delegation, it mitigates AI’s non-determinism, fostering predictable workflows. Consequences include democratized AI use—no prior experience needed beyond basic Git—potentially broadening adoption.

For teams, Git synchronization enables collaboration, though self-hosting limits non-technical access. Future enhancements might include multi-repo support, integrations, and improved documentation, driven by its 4,600 GitHub stars and community feedback.

Broader implications question AI’s role: accepting “good enough” results accelerates development, but human input remains vital for steering and verification. As models improve (e.g., Opus 5.6’s million-token window), tools like Backlog.md evolve, but foundational structure endures.

In conclusion, Alex’s tool and methodology exemplify pragmatic AI integration, balancing innovation with reliability in an era where agents redefine development.

Links: