← Back to Blog
software development ai

The Ralph Method

Let AI Build Your Software From Start to Finish

| Aaron Hnatiw | Living Document

Note: This blog post was written with the help of AI (how meta!), but is based entirely on my own work implementing Ralph across real software projects. The Quick Start section below is written primarily for humans who want to get up and running fast. The rest of the document is technical reference material, useful for Ralph agents to parse during setup or for fellow nerds who want to understand how it all works under the hood.

Quick Start

Ralph is the best approach I've found for having AI build complete software projects autonomously, from start to finish. You describe what you want, define what "done" looks like, and Ralph loops until you get there. No babysitting required.

The magic is in eventual convergence: each iteration brings the project closer to your goals, even when individual steps aren't perfect. It works for all kinds of projects. I've used it to build several unrelated systems myself, and a good friend used Ralph to build a completely custom Smalltalk VM implementation over a single weekend. It worked flawlessly.

The core idea is simple: an infinite loop feeds prompts to an AI coding assistant (Claude Code, Codex, Cursor, Aider, etc.). The agent reads the prompt, does work, commits changes, and the loop continues. This methodology takes that original Ralph approach and improves upon it in several ways to make fully autonomous development practical and reliable:

  • Beads: Git-native issue tracking that replaces unwieldy markdown TODO files
  • Skills: Modular domain knowledge (Claude Code only, for now)
  • Control scripts: Start, stop, and monitor Ralph with simple commands
  • Notifications: Get push alerts on your phone via ntfy.sh
  • Evidence-based learning: Ralph remembers what worked and what didn't

Prerequisites

Before using the bootstrap prompt, install Beads (the git-native issue tracker):

# Using Homebrew (recommended)
brew install steveyegge/tap/beads

# Or using Go
go install github.com/steveyegge/beads/cmd/bd@latest

Bootstrap Prompt (Greenfield Projects)

For new projects, copy this prompt into your AI coding agent. It will ask you clarifying questions to design the right Ralph setup for your needs:

# Set Up Ralph for My Project

## Read First
- **Ralph Methodology**: https://aaronhnatiw.com/blog/ralph-autonomous-ai-agent-development-methodology.html
- **Beads Issue Tracker**: https://github.com/steveyegge/beads

## My Project

[DESCRIBE YOUR PROJECT HERE - What are you building? What's the tech stack? What does "done" look like?]

## Your Task

Ask me 5-10 clarifying questions to understand:
1. What "done" looks like (measurable success criteria)
2. What rules must NEVER be violated (sacred principles)
3. How to verify work is correct (testing strategy)
4. What specialized knowledge Ralph needs (domain expertise)

After I answer, set up the complete Ralph infrastructure:
- Initialize Beads (`bd init`)
- Create `.ralph-state/` with goals.yaml, status.yaml, lessons-learned.yaml
- Create `ralph.md` (guidance document), `ralph.sh` (loop script), `ralph-control.sh` (management)
- Set up Skills if using Claude Code
- Create initial Beads issues with proper dependencies

When done, I should be able to run `./ralph-control.sh start` and Ralph will work autonomously toward my goals.

For existing codebases, or when you want more control over the setup, see the Detailed Bootstrap Prompt in section 5.0.


Technical Reference Below

Everything above is all you need to get started. The rest of this document explains how all the pieces work under the hood.

Continue reading if you:

  • Are an AI agent parsing this to initialize Ralph
  • Want to understand how the system works internally
  • Need to customize Ralph for your specific needs
  • Are troubleshooting issues with your setup

1. The Core Concept

The Original Ralph

The original Ralph concept (credit: ghuntley.com/ralph) is elegantly simple:

while :; do cat PROMPT.md | npx --yes @sourcegraph/amp ; done

This infinite loop continuously feeds a prompt file to an AI coding agent, enabling autonomous iteration. The AI reads the prompt, does work, commits changes, and the loop continues.

The Philosophy

Ralph operates on several key principles:

  1. Eventual Consistency: Like tuning an instrument, each iteration refines the system toward the goal
  2. Sign-Based Guidance: Prompts act as "signs" that guide AI behavior, refined over time
  3. Autonomous Operation: After initial setup, the system works without human intervention
  4. Deterministic Improvement: Individual iterations vary, but the overall trajectory trends toward the goals

The Problem with Basic Ralph

While powerful, the original approach has limitations:

| Problem | Impact |
| --- | --- |
| Context amnesia | AI forgets task details when context window compresses between sessions |
| No dependency tracking | Agent may attempt blocked work or miss prerequisites |
| Discovered work lost | Bugs found during feature work get forgotten |
| No specialized knowledge | Monolithic prompts become unwieldy and hard to maintain |
| Manual monitoring required | Must watch logs to know if system is working |
| No error recovery | Transient failures can halt the loop |
| No learning | Same mistakes repeated across iterations |

2. Key Innovations

This methodology addresses original Ralph's limitations through five key innovations:

2.1 Beads: Git-Native Issue Tracking (Most Critical)

The single most valuable improvement is integrating Beads, a git-native distributed issue tracker designed specifically for AI agents.

Why Beads matters:

The original Ralph approach often used markdown files (like TODO.md or TASKS.md) to track work. This creates a serious problem: as the project grows, these files become long and unwieldy. AI agents struggle to read and update them accurately, causing drift. The agent loses track of what's been done, what's blocked, and what's next. This drift prevents the eventual consistency that makes Ralph work.

Beads solves this with a proper database that's still git-native:

  • Replaces markdown TODOs: No more giant, hard-to-parse task files
  • Dependency-aware selection: bd ready returns only unblocked, actionable work
  • Discovered work tracking: New issues linked to originating work via discovered-from relationships
  • No external services: Everything commits to git (offline-capable)
  • Machine-readable: JSON output for programmatic parsing

2.2 Skills: Modular Domain Knowledge

Instead of monolithic prompts, domain expertise lives in skills. These are focused, versioned, reusable knowledge modules:

.claude/skills/
├── testing-workflows/SKILL.md
├── quality-assurance/SKILL.md
├── configuration-management/SKILL.md
└── regression-prevention/SKILL.md

Each skill contains procedures, standards, and examples for one operational area. Skills are markdown files with YAML frontmatter that define metadata like dependencies, allowed tools, and version info. When Ralph needs specialized knowledge (e.g., how to run tests in your project), it loads the relevant skill rather than searching through a massive prompt file.

Note: Skills are currently only available in Claude Code. See the Skills documentation for setup details. If you're using a different AI agent, you can achieve similar results with well-organized prompt files, but won't get the automatic skill loading that Claude Code provides.

2.3 Control Scripts: Robust Process Management

The control script (ralph-control.sh) is the primary interface for managing Ralph: you use it to start, stop, and monitor the entire system. It provides:

  • Background execution with PID tracking
  • Graceful stop via signal files
  • Log management and session archiving
  • Status reporting without log diving

2.4 Notifications: Asynchronous Monitoring

Push notifications via ntfy.sh enable monitoring without polling:

  • Iteration start/completion alerts
  • Error escalation with severity levels
  • Milestone tracking
  • Long-running operation progress

2.5 Evidence-Based Learning

A structured lessons-learned system captures:

  • What approaches worked (with metrics)
  • What approaches failed (with root causes)
  • Confidence levels for future decisions

3. Architecture Overview

┌─────────────────────────────────────────────────────────────────┐
│                     RALPH CONTROL LAYER                         │
│  ralph-control.sh: start | stop | status | logs | sessions      │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                   CONTINUOUS LOOP (ralph.sh)                    │
│  ┌─────────┐    ┌─────────┐    ┌─────────┐    ┌─────────┐       │
│  │ Read    │───▶│ Build   │───▶│ Invoke  │───▶│ Handle  │       │
│  │ State   │    │ Prompt  │    │ Agent   │    │ Result  │       │
│  └─────────┘    └─────────┘    └─────────┘    └─────────┘       │
│       │                                            │            │
│       │              ┌─────────────────┐           │            │
│       └──────────────│ Update State    │◀──────────┘            │
│                      │ + Notify        │                        │
│                      └─────────────────┘                        │
└─────────────────────────────────────────────────────────────────┘
                              │
              ┌───────────────┼───────────────┐
              ▼               ▼               ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│  STATE FILES    │ │     BEADS       │ │     SKILLS      │
│  .ralph-state/  │ │   .beads/       │ │  .claude/skills │
│  ├─goals.yaml   │ │   beads.jsonl   │ │  └─*/SKILL.md   │
│  ├─status.yaml  │ │   (247 issues)  │ │                 │
│  └─lessons.yaml │ └─────────────────┘ └─────────────────┘
└─────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                    AI CODING AGENT                              │
│     Claude Code | Codex | Cursor | Aider | Any CLI Agent        │
└─────────────────────────────────────────────────────────────────┘

4. Component Deep Dives

4.1 The Continuous Loop

Important: Don't run ralph.sh directly. Use ralph-control.sh start to launch Ralph in the background with proper process management. See section 4.5, Control Scripts, for the full interface.

The core loop structure:

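#!/bin/bash
# ralph.sh - Continuous improvement loop (run via ralph-control.sh, not directly)
#
# The helper functions used below (goals_achieved, get_stop_reason,
# is_long_operation_active, wait_for_operation, build_prompt, handle_success,
# handle_failure, write_status, notify) are project-specific and must be
# defined or sourced before the loop starts.

STATE_DIR="${STATE_DIR:-.ralph-state}"
ITERATION_DELAY="${ITERATION_DELAY:-30}"   # seconds; placeholder value, tune as needed
iteration=0
errors=0       # consecutive errors, maintained by handle_failure
successes=0    # consecutive successes, maintained by handle_success

while :; do
    # 1. Check stopping conditions
    if [[ -f "${STATE_DIR}/STOP" ]] || goals_achieved; then
        notify "Ralph stopped: $(get_stop_reason)"
        break
    fi

    # 2. Check for long-running operations
    if is_long_operation_active; then
        wait_for_operation  # Low-polling mode
        continue
    fi

    # 3. Increment iteration
    iteration=$((iteration + 1))
    output_file="${STATE_DIR}/logs/last-output.txt"   # example path

    # 4. Build prompt with current context
    prompt=$(build_prompt "${iteration}")

    # 5. Invoke AI agent (replace your-ai-agent with your agent's CLI)
    if echo "${prompt}" | your-ai-agent > "${output_file}" 2>&1; then
        handle_success "${iteration}"
    else
        # In the else branch, $? still holds the agent's exit status
        handle_failure "${iteration}" "$?"
    fi

    # 6. Update state files
    write_status "${iteration}" "${errors}" "${successes}"

    # 7. Wait before next iteration
    sleep "${ITERATION_DELAY}"
done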

Key features:

| Feature | Purpose |
| --- | --- |
| STOP file | Graceful shutdown without killing processes |
| Long operation detection | Prevents polling waste during multi-hour tasks |
| Error counting | Distinguishes transient vs. permanent failures |
| State persistence | Enables handoff between iterations |
| Configurable delay | Tune for hardware/API rate limits |
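
The error-counting row could be implemented along the following lines. This is a minimal sketch rather than the actual implementation: the limit of 3 consecutive errors, the 60-second base delay, and the function names `backoff_seconds` and `should_halt` are all assumptions chosen for illustration.

```shell
# Hypothetical helpers for ralph.sh: treat a short run of failures as
# transient (retry with a growing delay) and a longer run as permanent.

MAX_CONSECUTIVE_ERRORS="${MAX_CONSECUTIVE_ERRORS:-3}"   # assumed limit
BASE_BACKOFF_SECONDS="${BASE_BACKOFF_SECONDS:-60}"      # assumed base delay

# Print the delay before the next retry: base * 2^(n-1) for the nth
# consecutive error, or 0 when there is no error streak.
backoff_seconds() {
    local consecutive_errors="$1"
    if [ "${consecutive_errors}" -lt 1 ]; then
        echo 0
        return 0
    fi
    echo $(( BASE_BACKOFF_SECONDS * (1 << (consecutive_errors - 1)) ))
}

# Succeed (exit 0) once the error streak reaches the limit, meaning the
# loop should stop and ask for help instead of retrying again.
should_halt() {
    [ "$1" -ge "${MAX_CONSECUTIVE_ERRORS}" ]
}
```

Inside `handle_failure`, one option is to sleep for `backoff_seconds "${errors}"` after each failure and, once `should_halt "${errors}"` succeeds, send a critical notification and create the STOP file.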

4.2 State Management

Ralph maintains state in YAML files for human readability and AI parseability:

goals.yaml - Strategic targets:

# What we're working toward
phase: 1
targets:
  test_coverage:
    current: 75
    target: 90
    status: "BELOW_TARGET"

  build_time:
    current: 45
    target: 30
    status: "BELOW_TARGET"   # not yet meeting the target (lower is better here)
    unit: "seconds"

completion_criteria:
  - All targets met
  - Zero failing tests
  - Documentation updated
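
The `goals_achieved` check called by the loop is not shown in this document. One simple way to write it, assuming every unmet target carries `status: "BELOW_TARGET"` as in the example file, is to scan goals.yaml for that marker. The function body is an assumption, not the original code.

```shell
# Sketch: goals count as achieved when goals.yaml exists and no target
# is still marked BELOW_TARGET. Relies on the status vocabulary used in
# the example goals.yaml; this is a text scan, not a YAML parser.

goals_achieved() {
    local goals_file="${1:-.ralph-state/goals.yaml}"
    # A missing file means success cannot be claimed.
    [ -f "${goals_file}" ] || return 1
    # grep -q exits 0 on a match, so invert it: any BELOW_TARGET means not done.
    ! grep -Eq '^[[:space:]]*status:[[:space:]]*"?BELOW_TARGET"?' "${goals_file}"
}
```

Because this trusts the `status` field that the agent maintains, a stricter version could recompute each target from real measurements (test results, coverage reports) before declaring success.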

status.yaml - Operational health:

# Current loop state
iteration: 417
consecutive_errors: 0
consecutive_successes: 25
goals_stable_count: 0
last_run: "2025-11-26 12:00:00"
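
As an illustration of how the loop might produce this file, here is a sketch of `write_status`. The field names follow the example status.yaml; `goals_stable_count` is left out because this document does not describe how it is computed, and the temp-file-then-rename approach is my own assumption.

```shell
# Sketch of write_status: rewrite status.yaml after each iteration.
# Writes to a temporary file first so readers never see a partial file.

write_status() {
    local iteration="$1" errors="$2" successes="$3"
    local state_dir="${STATE_DIR:-.ralph-state}"
    local tmp_file="${state_dir}/status.yaml.tmp"

    mkdir -p "${state_dir}"
    {
        echo "# Current loop state"
        echo "iteration: ${iteration}"
        echo "consecutive_errors: ${errors}"
        echo "consecutive_successes: ${successes}"
        echo "last_run: \"$(date '+%Y-%m-%d %H:%M:%S')\""
    } > "${tmp_file}"
    # A rename within the same filesystem is atomic.
    mv "${tmp_file}" "${state_dir}/status.yaml"
}
```

The atomic rename matters because `ralph-control.sh status` may read the file while the loop is writing it.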

lessons-learned.yaml - Evidence database:

lessons:
  - id: 1
    date: "2025-10-12"
    approach: "Template-based configuration"
    outcome: "SUCCESS"
    metrics:
      before: 52
      after: 98
    confidence: "HIGH"
    why: "Concrete examples beat abstract rules"
    apply: "Use examples in all configuration tasks"

4.3 Beads: Git-Native Issue Tracking

Beads is the most critical improvement to Ralph. It replaces unwieldy markdown TODO files with a proper git-native database, preventing the drift that occurs when AI agents struggle to read and update large task lists.

How Beads Works

Source of Truth: .beads/beads.jsonl (git-tracked)
Local Cache: .beads/*.db (SQLite, gitignored)

User runs: bd create "Fix authentication bug"
    ↓
Beads updates SQLite cache
    ↓
Beads exports to JSONL (5s debounce)
    ↓
Developer commits: git add .beads/beads.jsonl
    ↓
Other machines: git pull → auto-import to local cache

Essential Beads Commands

# Find ready work (unblocked, actionable)
bd ready --priority 1 --limit 5

# Create issue with rich context
bd create "Implement user authentication" \
  -d "Requirements:
      - OAuth2 support
      - Session management
      - Rate limiting

      Acceptance criteria:
      - All auth tests pass
      - Security audit complete" \
  -p 1 -t feature

# Link dependencies
bd dep add <child> <parent> --type blocks
bd dep add <discovered> <origin> --type discovered-from

# Progress tracking
bd update <id> --status in_progress
bd close <id> --reason "Implemented in commit abc123"

# Visualize dependencies
bd dep tree <epic-id>

Four Dependency Types

| Type | Meaning | Use Case |
| --- | --- | --- |
| blocks | Hard prerequisite | Bug fix must complete before feature work |
| related | Soft connection | Reference without blocking |
| parent-child | Hierarchical | Epic contains subtasks |
| discovered-from | Origin tracking | Bug found during feature work |

Why Beads Beats Alternatives

| Aspect | Beads | GitHub Issues | Markdown TODOs |
| --- | --- | --- | --- |
| Offline | Full | None | Full |
| Dependencies | DAG with 4 types | Labels only | None |
| Query speed | <100ms (local) | 1-3s (API) | Linear scan |
| Context persistence | Git commits | Cloud-dependent | Lost on reset |
| Agent-optimized | Yes (JSON output) | Partial | No |

4.4 Skills: Domain Knowledge Modules

Skills encapsulate specialized knowledge in reusable, versioned modules.

Skill Structure

---
name: Testing Workflows
description: Procedures for test execution and validation
version: 1.0
last-updated: 2025-11-01
depends-on: [regression-prevention]
allowed-tools: [Bash, Read, Grep]
---

# Testing Workflows

## Purpose & Scope
When to invoke this skill and what it provides.

## Core Concepts
Key principles and definitions.

## Procedures & Workflows
Step-by-step operational guidance.

### Running Tests
1. Execute unit tests: `make test`
2. Execute integration tests: `make test-integration`
3. Check coverage: `make coverage`

## Standards & Requirements
Rules that must be followed.

## Examples & Templates
Concrete demonstrations.

## Quick Reference
Command summaries and key paths.
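
For agents without automatic skill loading, the loop script has to decide which skill files to inline into the prompt, and reading the frontmatter is a useful building block. The helper below is a sketch under that assumption. `skill_field` is a hypothetical name, and it only handles simple single-line `key: value` entries like those in the example frontmatter.

```shell
# Sketch: read one scalar field from the YAML frontmatter of a SKILL.md.
# Only looks at lines between the first two "---" delimiters.

skill_field() {
    local skill_file="$1" key="$2"
    awk -v key="${key}" '
        /^---[ \t]*$/ { fences++; next }      # count frontmatter delimiters
        fences == 1 {                          # inside the frontmatter
            prefix = key ":"
            if (index($0, prefix) == 1) {
                value = substr($0, length(prefix) + 1)
                sub(/^[ \t]+/, "", value)      # trim leading whitespace
                print value
                exit
            }
        }
        fences >= 2 { exit }                   # past the frontmatter
    ' "${skill_file}"
}
```

For example, `skill_field .claude/skills/testing-workflows/SKILL.md depends-on` would print `[regression-prevention]` for the skill file shown in this section.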

Skill Categories

| Category | Examples | Purpose |
| --- | --- | --- |
| Orchestration | task-selection, verification-protocol | How Ralph operates |
| Operations | testing-workflows, configuration-management | How to do common tasks |
| Domain | quality-assurance, security-guidelines | Specialized expertise |
| Integration | beads-task-tracking, notification-patterns | How systems connect |

Benefits Over Monolithic Prompts

MONOLITHIC PROMPT (200KB):
├─ All knowledge embedded
├─ Hard to update one area
├─ Overwhelming context
└─ Brittle dependencies

SKILLS-BASED (14 × 10-30KB each):
├─ Load only what's needed
├─ Update skills independently
├─ Clear dependency graph
└─ Reusable across agents

4.5 Control Scripts

The control script provides a clean interface for managing Ralph:

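#!/bin/bash
# ralph-control.sh

# Example paths; keep them in sync with ralph.sh.
STATE_DIR="${STATE_DIR:-.ralph-state}"
LOG_DIR="${LOG_DIR:-${STATE_DIR}/logs}"
PID_FILE="${PID_FILE:-${STATE_DIR}/ralph.pid}"

case "$1" in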
    start)
        # Check for existing process
        if [[ -f "${PID_FILE}" ]]; then
            pid=$(cat "${PID_FILE}")
            if kill -0 "$pid" 2>/dev/null; then
                echo "Ralph already running (PID $pid)"
                exit 1
            fi
            rm "${PID_FILE}"  # Stale PID file
        fi

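        # Clear any stale stop request so the loop does not exit immediately
        rm -f "${STATE_DIR}/STOP"
        mkdir -p "${LOG_DIR}"

        # Start in background
        nohup ./ralph.sh > "${LOG_DIR}/ralph-output.log" 2>&1 &
        echo $! > "${PID_FILE}"
        echo "Ralph started (PID $!)"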
        ;;

    stop)
        # Graceful stop via signal file
        touch "${STATE_DIR}/STOP"
        echo "Stop requested. Ralph will exit after current iteration."
        ;;

    kill)
        # Force stop
        if [[ -f "${PID_FILE}" ]]; then
            kill -TERM "$(cat "${PID_FILE}")" 2>/dev/null
            rm "${PID_FILE}"
        fi
        ;;

    status)
        # Show current state
        if [[ -f "${PID_FILE}" ]] && kill -0 "$(cat "${PID_FILE}")" 2>/dev/null; then
            echo "Ralph is running (PID $(cat "${PID_FILE}"))"
            cat "${STATE_DIR}/status.yaml"
        else
            echo "Ralph is not running"
        fi
        ;;

    logs)
        tail -f "${LOG_DIR}/ralph.log"
        ;;

    sessions)
        # List recent sessions
        ls -lt "${STATE_DIR}/sessions" | head -20
        ;;

    *)
        echo "Usage: $0 {start|stop|kill|status|logs|sessions}" >&2
        exit 1
        ;;
esac

4.6 Notifications

Push notifications enable monitoring without active watching:

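# Notification function
notify() {
    local message="$1"
    local title="${2:-Ralph}"

    # Without a topic there is nowhere to send; skip quietly.
    [[ -n "${NTFY_TOPIC:-}" ]] || return 0

    # Failures are ignored so a network hiccup never stops the loop.
    curl -s -X POST "https://ntfy.sh/${NTFY_TOPIC}" \
        -H "Title: ${title}" \
        -d "${message}" > /dev/null 2>&1 || true
}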

# Usage throughout ralph.sh
notify "Ralph iteration ${iteration} starting..."
notify "Iteration ${iteration} completed: ${summary}"
notify "ERROR: ${error_message}" "Ralph: Error"

Notification Types

| Event | Emoji | Example |
| --- | --- | --- |
| Iteration start | 🤖 | "Ralph iteration 417 starting..." |
| Success | | "Fixed auth bug. Tests passing." |
| Failure | | "Iteration 418 failed (2/3 errors)" |
| Milestone | 📊 | "420 iterations completed" |
| Backoff | | "API error. Backing off 5min." |
| Critical | 🚨 | "Manual intervention required" |
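
ntfy accepts a `Priority` header (1 for lowest through 5 for highest), which is one way to implement the severity levels mentioned in section 2.4. The sketch below maps the event types from the table to priorities. The specific priority assigned to each event and the function names are assumptions for illustration.

```shell
# Sketch: map Ralph event types to ntfy priority levels (1 = min, 5 = max).
# Event names mirror the notification table; the mapping is an assumption.

event_priority() {
    case "$1" in
        critical)           echo 5 ;;
        failure|backoff)    echo 4 ;;
        success|milestone)  echo 3 ;;
        *)                  echo 2 ;;   # iteration start and anything else
    esac
}

# Send a notification whose priority depends on the event type.
notify_event() {
    local event="$1" message="$2"
    # Skip quietly when no topic is configured.
    [ -n "${NTFY_TOPIC:-}" ] || return 0
    curl -s \
        -H "Title: Ralph: ${event}" \
        -H "Priority: $(event_priority "${event}")" \
        -d "${message}" \
        "https://ntfy.sh/${NTFY_TOPIC}" > /dev/null 2>&1 || true
}
```

With a mapping like this, routine iteration messages stay quiet on a phone while a `critical` event can break through, depending on how the ntfy app is configured.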

ntfy.sh Setup

  1. Choose a unique topic name. Since ntfy.sh topics are public, use something like ralph-yourname-projectname-2025 or a random string to avoid collisions with other users
  2. Subscribe at https://ntfy.sh/your-unique-topic (or use the ntfy mobile app)
  3. Set NTFY_TOPIC=your-unique-topic in environment
  4. Notifications appear on all subscribed devices

Warning: Do not use generic topic names like ralph or my-project. Anyone can subscribe to any topic, so you may receive notifications from other people's Ralph instances (or send yours to strangers).
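
One way to get a topic name that is hard to guess is to append random hex characters to a readable prefix. This is a small sketch; the function name `generate_ntfy_topic` and the 16-character suffix length are assumptions.

```shell
# Sketch: build a hard-to-guess ntfy topic from a readable prefix plus
# 16 random hex characters read from /dev/urandom.

generate_ntfy_topic() {
    local prefix="${1:-ralph}"
    local suffix
    # 8 random bytes, printed as hex, with spaces and newlines removed.
    suffix="$(head -c 8 /dev/urandom | od -An -tx1 | tr -d ' \n')"
    echo "${prefix}-${suffix}"
}
```

Run it once, for example `generate_ntfy_topic ralph-myproject`, and store the result in `NTFY_TOPIC` so every iteration publishes to the same topic.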

4.7 Evidence-Based Learning

Ralph learns from experience through structured lesson capture:

# .ralph-state/lessons-learned.yaml

lessons:
  - id: 1
    date: "2025-10-12"
    category: "quality_improvement"
    task_type: "test_coverage"
    approach: "Added integration tests for edge cases"
    outcome: "SUCCESS"
    metrics:
      before: 65
      after: 92
      improvement: 27
    lesson: "Edge case tests have highest ROI for coverage"
    why_it_worked: |
      Integration tests exercise multiple code paths.
      Edge cases often have no existing coverage.
    future_application: |
      When improving coverage, prioritize edge cases
      over additional happy-path tests.
    confidence: "HIGH"
    evidence_files:
      - "tests/integration/edge_cases_test.go"
      - ".ralph-state/coverage-report-2025-10-12.html"

  - id: 2
    date: "2025-10-15"
    category: "performance"
    approach: "Added caching layer"
    outcome: "PARTIAL"
    metrics:
      before: 450
      after: 180
      improvement: 270
      unit: "milliseconds"
    lesson: "Caching helps but introduces complexity"
    why_partial: |
      Performance improved but cache invalidation
      caused intermittent test failures.
    future_application: |
      Consider cache complexity cost.
      Ensure invalidation is tested.
    confidence: "MEDIUM"

Query Protocol

Before selecting an approach, Ralph queries lessons:

# In task selection logic:
# 1. Read lessons file
# 2. Filter by category and task_type
# 3. Apply confidence levels:
#    - CRITICAL: Must follow
#    - HIGH: Strongly prefer
#    - MEDIUM: Consider
#    - LOW: Reference only
# 4. Avoid approaches with outcome: FAILURE
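
Step 4 of the protocol can be approximated with a line-based scan, so the loop can inject a list of approaches to avoid into the prompt. This sketch assumes each lesson starts with `- id:` and lists `approach:` before `outcome:`, as in the examples in this section. It is not a real YAML parser, and `failed_approaches` is a hypothetical name.

```shell
# Sketch: list approaches recorded with outcome FAILURE so the prompt
# can tell the agent what to avoid.

failed_approaches() {
    local lessons_file="${1:-.ralph-state/lessons-learned.yaml}"
    [ -f "${lessons_file}" ] || return 0
    awk '
        /^[ \t]*-[ \t]*id:/ { approach = "" }     # new lesson: reset
        /^[ \t]*approach:/ {
            approach = $0
            sub(/^[ \t]*approach:[ \t]*/, "", approach)
            gsub(/"/, "", approach)               # strip YAML quotes
        }
        /^[ \t]*outcome:/ && /FAILURE/ && approach != "" { print approach }
    ' "${lessons_file}"
}
```

For anything beyond this simple layout (multi-line approaches, reordered keys), a proper YAML tool would be a safer choice than text matching.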

5. Implementation Guide

5.0 Detailed Bootstrap Prompt

This is the full bootstrap prompt with all the fields you can customize. Use this when you have an existing codebase, specific requirements, or want fine-grained control over how Ralph is configured. For new/greenfield projects, the simpler prompt in Quick Start may be easier.

The Detailed Bootstrap Prompt

Copy and customize this prompt, replacing the placeholders with your project-specific information:

# Ralph Autonomous Development Setup

## Documentation to Read First

Before we begin, please read and understand these resources:

1. **Ralph Methodology**: https://aaronhnatiw.com/blog/ralph-autonomous-ai-agent-development-methodology.html
   - Understand the continuous loop architecture
   - Learn the state management approach (goals.yaml, status.yaml, lessons-learned.yaml)
   - Understand Beads integration for persistent task tracking
   - Note the notification system via ntfy.sh

2. **Beads Issue Tracker**: https://github.com/steveyegge/beads
   - Git-native distributed issue tracking
   - Dependency management (blocks, related, parent-child, discovered-from)
   - The `bd ready` command for finding unblocked work

## Project Context

**Project Name**: [YOUR PROJECT NAME]

**Project Description**: [BRIEF DESCRIPTION OF WHAT THE PROJECT DOES]

**Current State**: [NEW PROJECT | EXISTING CODEBASE | PROTOTYPE | etc.]

**Tech Stack**: [LANGUAGES, FRAMEWORKS, KEY DEPENDENCIES]

## The Goal

**Primary Objective**: [WHAT YOU WANT RALPH TO ACHIEVE]

**Success Criteria**: [HOW YOU'LL KNOW WHEN YOU'RE DONE]
- [ ] [Criterion 1]
- [ ] [Criterion 2]
- [ ] [Criterion 3]

**Constraints/Requirements**:
- [Any non-negotiable rules]
- [Quality standards]
- [Performance requirements]

## Your Task

### Phase 1: Codebase Exploration (DO THIS FIRST)

Before asking questions or setting anything up, thoroughly explore the existing codebase to understand:

1. **Project structure**: Directory layout, key files, entry points
2. **Architecture**: How components are organized and how they interact
3. **Existing patterns**: Code conventions, testing approach, error handling
4. **Current state**: What's working, what's incomplete, any obvious technical debt
5. **Dependencies**: External libraries, services, and integrations
6. **Build/test process**: How to build, run, and test the project

Summarize your findings before proceeding. This context is essential for asking informed questions and configuring Ralph effectively.

### Phase 2: Clarification

With your understanding of the codebase, ask me clarifying questions to fill in the gaps:

1. **The end state**: What does "done" look like? What are the measurable success criteria?
2. **The iteration loop**: What constitutes one meaningful unit of work? What should each Ralph iteration accomplish?
3. **Testing strategy**: How will Ralph verify its work is correct?
4. **Sacred principles**: What rules must NEVER be violated, even if they slow progress?
5. **Domain knowledge**: What specialized knowledge does Ralph need to do this work effectively?

Ask me 5-10 thoughtful questions that will help you design the most effective Ralph configuration for this project. I want Ralph to have a clear target and tight iteration loop (plan → implement → test → commit → repeat).

### Phase 3: Setup (AFTER CLARIFICATION)

Once you understand the project deeply, set up the complete Ralph infrastructure:

1. **Initialize Beads** for issue tracking (`bd init`)

2. **Create `.ralph-state/` directory** with:
   - `goals.yaml` - Strategic targets based on our discussion
   - `status.yaml` - Initial loop state
   - `lessons-learned.yaml` - Empty lessons template
   - `sessions/` directory for iteration archives
   - `logs/` directory for ralph.log

3. **Create `ralph.md`** - Core guidance document with:
   - Project-specific sacred principles
   - Clear stopping conditions
   - Task selection priorities
   - Verification checklist

4. **Create `ralph.sh`** - The continuous loop script adapted for [YOUR AI AGENT]

5. **Create `ralph-control.sh`** - Management interface (start/stop/status/logs)

6. **Set up Skills** (if using Claude Code) in `.claude/skills/`:
   - Project-specific domain knowledge
   - Testing workflows
   - Any specialized procedures

7. **Configure notifications** via ntfy.sh

8. **Create initial Beads issues** breaking down the primary objective into actionable tasks with proper dependencies

9. **Update `.gitignore`** appropriately

10. **Test with a single iteration** to verify everything works

### Deliverables

When complete, I should be able to run:
```bash
./ralph-control.sh start
```

And Ralph will autonomously work toward the goals we defined, sending me notifications of progress, until all success criteria are met or I manually stop it.

Customization Tips

When using this prompt:

  1. Be specific about the goal: Vague goals lead to wandering iterations. "Improve the codebase" is bad. "Achieve 90% test coverage with all tests passing" is good.
  2. Define measurable success criteria: Ralph needs to know when to stop. Numbers are better than adjectives.
  3. List your sacred principles: What should Ralph NEVER do? What quality standards are non-negotiable?
  4. Describe your testing strategy: How will Ralph verify each iteration's work? Automated tests? Manual validation commands?
  5. Be honest about current state: A new project needs different setup than an existing codebase with technical debt.

Example: Filled-In Bootstrap Prompt

Here's a concrete example of a filled-in bootstrap prompt:

# Ralph Autonomous Development Setup

## Documentation to Read First
1. **Ralph Methodology**: https://aaronhnatiw.com/blog/ralph-autonomous-ai-agent-development-methodology.html
2. **Beads Issue Tracker**: https://github.com/steveyegge/beads

## Project Context

**Project Name**: TaskFlow API

**Project Description**: A REST API for task management with real-time notifications, built in Go with PostgreSQL.

**Current State**: Existing codebase with ~60% test coverage, some technical debt in the authentication module.

**Tech Stack**: Go 1.21, PostgreSQL 15, Redis for caching, Docker for deployment

## The Goal

**Primary Objective**: Achieve production-ready quality with comprehensive test coverage and zero critical bugs.

**Success Criteria**:
- [ ] 90%+ test coverage across all packages
- [ ] All existing tests passing
- [ ] Zero critical or high-severity bugs in issue tracker
- [ ] API response times under 100ms for all endpoints
- [ ] Authentication module refactored with no security vulnerabilities

**Constraints/Requirements**:
- Must maintain backward compatibility with existing API consumers
- No external service dependencies beyond PostgreSQL and Redis
- All changes must include tests
- Security-sensitive code requires extra scrutiny

## Your Task

### Phase 1: Codebase Exploration (DO THIS FIRST)

Explore the codebase to understand:
1. Project structure and architecture
2. Existing code patterns and conventions
3. Current test coverage and testing approach
4. The authentication module's current implementation
5. Build, run, and deployment processes

Summarize your findings before asking questions.

### Phase 2: Clarification

With your understanding of the codebase, ask me about:
1. What "production-ready" means for this specific project
2. How to prioritize: coverage vs. bug fixes vs. performance
3. The authentication refactor scope and constraints
4. Testing strategy for database-dependent code
5. Any existing patterns or conventions I want maintained

[Continue with Phase 3 after discussion...]

What Happens Next

After the AI agent asks clarifying questions and you answer them:

  1. The agent will create all Ralph infrastructure files
  2. Initial Beads issues will be created with proper dependencies
  3. goals.yaml will reflect your specific success criteria
  4. ralph.md will encode your sacred principles
  5. You run ./ralph-control.sh start and Ralph begins autonomous operation

The clarification phase is critical. A few minutes of back-and-forth here saves hours of misdirected autonomous work later.

5.1 Directory Structure

your-project/
├── .ralph-state/
│   ├── goals.yaml              # Strategic targets
│   ├── status.yaml             # Loop health
│   ├── lessons-learned.yaml    # Evidence database
│   ├── sessions/               # Per-iteration archives
│   │   └── session-YYYYMMDD_HHMMSS/
│   │       ├── prompt.txt
│   │       ├── output.txt
│   │       └── metadata.json
│   ├── logs/
│   │   └── ralph.log
│   └── STOP                    # Touch to request stop
├── .beads/
│   ├── beads.jsonl             # Issue database (commit this)
│   ├── beads.db                # Local cache (gitignore)
│   └── config.json
├── .claude/skills/             # Or equivalent for your agent
│   ├── testing-workflows/SKILL.md
│   └── .../SKILL.md
├── ralph.sh                    # Main loop
├── ralph-control.sh            # Management interface
└── ralph.md                    # Core guidance document
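
The per-iteration archive under sessions/ could be produced by a helper like the one below. The directory and file names follow the tree in this section. The fields written to metadata.json are assumptions, since this document does not specify them, and `archive_session` is a hypothetical name.

```shell
# Sketch: archive one iteration under .ralph-state/sessions/ and print
# the path of the new session directory.

archive_session() {
    local iteration="$1" prompt="$2" output_file="$3" exit_code="$4"
    local state_dir="${STATE_DIR:-.ralph-state}"
    local session_dir
    session_dir="${state_dir}/sessions/session-$(date +%Y%m%d_%H%M%S)"

    mkdir -p "${session_dir}"
    printf '%s\n' "${prompt}" > "${session_dir}/prompt.txt"
    cp "${output_file}" "${session_dir}/output.txt"
    # Assumed metadata fields: iteration number, agent exit code, UTC time.
    printf '{"iteration": %d, "exit_code": %d, "finished_at": "%s"}\n' \
        "${iteration}" "${exit_code}" "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
        > "${session_dir}/metadata.json"
    echo "${session_dir}"
}
```

Note that the timestamp has one-second resolution, so two iterations finishing within the same second would share a directory. The iteration delay normally prevents that.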

5.2 Minimal Setup Checklist

  1. Install Beads
    # Follow instructions at https://github.com/steveyegge/beads
    go install github.com/steveyegge/beads/cmd/bd@latest
    bd init
  2. Create State Directory
    mkdir -p .ralph-state/{sessions,logs}
  3. Initialize Goals
    cat > .ralph-state/goals.yaml << 'EOF'
    phase: 1
    targets:
      # Define your project-specific targets
      test_passing:
        current: false
        target: true
        status: "BELOW_TARGET"
    EOF
  4. Create Core Guidance Document
    # ralph.md - Your project's sacred principles and guidance
  5. Set Up Loop Script
    # ralph.sh - Adapted to your AI agent
  6. Configure Notifications
    export NTFY_TOPIC="your-topic-name"
  7. Test Single Iteration
    # Run once manually to verify
    ./ralph.sh --single-iteration
  8. Start Autonomous Operation
    ./ralph-control.sh start

5.3 Adapting for Different AI Agents

The methodology is agent-agnostic. Adapt the invocation:

Claude Code:

echo "${prompt}" | claude --dangerously-skip-permissions

Codex CLI:

echo "${prompt}" | codex --full-auto

Aider:

echo "${prompt}" | aider --yes-always

Cursor (via CLI):

cursor --prompt "${prompt}" --auto-apply

Generic pattern:

echo "${prompt}" | your-agent [autonomous-flags]

Security Warning: Autonomous Mode Flags

The flags shown above (--dangerously-skip-permissions, --full-auto, --yes-always, --auto-apply) grant the AI agent extensive permissions to modify files, execute commands, and make changes without confirmation prompts.

Only use these flags in isolated environments. Ralph is most effective with these elevated permissions, but you should ensure appropriate isolation and sandboxing:

  • Use a dedicated machine or VM solely for the Ralph project
  • Run inside containers with limited host access
  • Ensure the environment has no access to sensitive credentials, production systems, or personal data
  • Use separate git credentials with limited repository access

The tradeoff is real: autonomous operation dramatically increases Ralph's effectiveness, but requires a properly sandboxed environment to mitigate risk.


6. Best Practices

6.1 Prompt Engineering

Layer your prompts:

  1. Session context (dynamic): Iteration number, git status, recent commits
  2. Sacred principles (static): Non-negotiable rules
  3. Current goals (semi-static): What we're working toward
  4. Task guidance (per-iteration): Specific instructions
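
The four layers above can be assembled in order by a small helper. This is a minimal sketch, not the actual ralph.sh: the function name `build_prompt`, its arguments, and the section headings it prints are assumptions made for illustration. It reads ralph.md and .ralph-state/goals.yaml from the paths used elsewhere in this document.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the four prompt layers in order.
# Missing files and a missing git repo are tolerated (those parts are left empty).
build_prompt() {
    local iteration="$1"       # current loop iteration number
    local task_guidance="$2"   # per-iteration instructions

    # Layer 1: session context (dynamic)
    printf '## Session Context\n'
    printf 'Iteration: %s\n' "$iteration"
    printf 'Git status:\n%s\n' "$(git status --short 2>/dev/null)"
    printf 'Recent commits:\n%s\n\n' "$(git log --oneline -5 2>/dev/null)"

    # Layer 2: sacred principles (static), taken from ralph.md
    if [ -f ralph.md ]; then
        cat ralph.md
        printf '\n'
    fi

    # Layer 3: current goals (semi-static)
    printf '## Current Goals\n'
    if [ -f .ralph-state/goals.yaml ]; then
        cat .ralph-state/goals.yaml
    fi
    printf '\n'

    # Layer 4: task guidance (per-iteration)
    printf '## Task Guidance\n%s\n' "$task_guidance"
}
```

A loop might call it as `prompt="$(build_prompt "$iteration" "Pick the top item from bd ready")"` and pipe the result to the agent.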

Include stopping conditions:

## Stopping Conditions (ONLY TWO)
1. **Goals Achieved**: All targets met and validated
2. **Manual Stop**: STOP file exists

You continue until one of these conditions is true.
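
On the loop side, the same two conditions can be checked before each iteration. The sketch below is an assumption about how that check might look, not the real script: `should_stop` and `STATE_DIR` are hypothetical names, and "goals achieved" is approximated as "no target in goals.yaml still carries the BELOW_TARGET status". It does not validate the targets itself.

```shell
#!/usr/bin/env bash
# Hypothetical stop check for the two stopping conditions.
STATE_DIR="${STATE_DIR:-.ralph-state}"

# Prints the reason and returns 0 when the loop should stop; returns 1 otherwise.
should_stop() {
    # Manual stop: the STOP file exists
    if [ -e "${STATE_DIR}/STOP" ]; then
        echo "manual-stop"
        return 0
    fi
    # Goals achieved (approximation): goals.yaml exists and no target is BELOW_TARGET
    if [ -f "${STATE_DIR}/goals.yaml" ] && ! grep -q 'BELOW_TARGET' "${STATE_DIR}/goals.yaml"; then
        echo "goals-achieved"
        return 0
    fi
    return 1
}
```

Usage inside a loop could be as simple as `if reason="$(should_stop)"; then echo "Stopping: $reason"; break; fi`.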

6.2 Task Selection

Use strict priority tiers:

TIER 1 (Critical): Bugs, security issues, broken builds
TIER 2 (Quality): Metrics below targets
TIER 3 (Features): New capabilities

RULE: Never work on TIER 3 while TIER 1 or 2 items exist

Query Beads for ready work:

bd ready --priority 1 --limit 5

6.3 Error Handling

Classify errors:

is_transient_error() {
    local output="$1"
    # API rate limits, 500 errors, network issues
    grep -qE 'API Error: 5[0-9]{2}|rate limit|ECONNREFUSED' <<< "$output"
}

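handle_error() {
    local output="$1"   # Agent output to classify
    if is_transient_error "$output"; then
        # Transient: back off and retry; doesn't count toward the stop limit
        backoff_and_retry
    else
        # Permanent: counts toward the stop limit
        # (plain assignment avoids the non-zero exit status of ((x++)) when x is 0)
        consecutive_errors=$((consecutive_errors + 1))
    fi
}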

Implement backoff with escalating delays (roughly exponential, capped at 30 minutes):

calculate_backoff() {
    local attempt="$1"
    case "$attempt" in
        1) echo 120 ;;   # 2 minutes
        2) echo 300 ;;   # 5 minutes
        3) echo 600 ;;   # 10 minutes
        *) echo 1800 ;;  # 30 minute ceiling
    esac
}
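
The `backoff_and_retry` helper called from `handle_error` is not shown above. One way it might tie into `calculate_backoff` is sketched here, purely as an assumption: the counter `transient_attempts`, the `reset_backoff` helper, and the overridable `SLEEP_CMD` are hypothetical names. `calculate_backoff` is repeated from the document so the block is self-contained.

```shell
#!/usr/bin/env bash
# Hypothetical wiring of the backoff schedule into a retry helper.
SLEEP_CMD="${SLEEP_CMD:-sleep}"   # override (e.g. SLEEP_CMD=true) for dry runs
transient_attempts=0              # consecutive transient failures so far

calculate_backoff() {
    local attempt="$1"
    case "$attempt" in
        1) echo 120 ;;   # 2 minutes
        2) echo 300 ;;   # 5 minutes
        3) echo 600 ;;   # 10 minutes
        *) echo 1800 ;;  # 30 minute ceiling
    esac
}

# Wait for the delay that matches the current attempt; the caller then retries.
backoff_and_retry() {
    transient_attempts=$((transient_attempts + 1))
    local delay
    delay="$(calculate_backoff "$transient_attempts")"
    echo "Transient error (attempt ${transient_attempts}); retrying in ${delay}s"
    "$SLEEP_CMD" "$delay"
}

# Call after a successful iteration so the next failure starts at 2 minutes again.
reset_backoff() {
    transient_attempts=0
}
```

Resetting the counter after a success keeps one bad hour of API trouble from pinning every later retry at the 30-minute ceiling.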

6.4 Long-Running Operations

Detect and wait efficiently:

if is_long_operation_active; then
    # Switch to low-polling mode (e.g., check every 25 minutes)
    # instead of normal iteration cycle (e.g., every 5 minutes)
    wait_for_operation
fi

Track with heartbeat files:

# Long operation updates heartbeat
echo "$(date +%s)" > .ralph-state/operation-heartbeat

# Ralph checks heartbeat age
if heartbeat_older_than 3600; then
    notify "WARNING: Operation may be stuck"
fi
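
The `heartbeat_older_than` check used above could be implemented along these lines. This is a sketch under stated assumptions: the heartbeat file holds a single Unix timestamp (as written by `date +%s`), `HEARTBEAT_FILE` is a hypothetical override variable, a missing file is treated as "no operation recorded" rather than stale, and an unreadable timestamp is treated as stale so it gets flagged.

```shell
#!/usr/bin/env bash
# Hypothetical heartbeat age check.
HEARTBEAT_FILE="${HEARTBEAT_FILE:-.ralph-state/operation-heartbeat}"

# Returns 0 when the heartbeat is older than max_age seconds, 1 otherwise.
heartbeat_older_than() {
    local max_age="$1"
    local last now

    # No heartbeat file: nothing is being tracked, so nothing is stale
    [ -f "$HEARTBEAT_FILE" ] || return 1

    last="$(cat "$HEARTBEAT_FILE")"
    case "$last" in
        ''|*[!0-9]*) return 0 ;;   # empty or non-numeric: flag as stale
    esac

    now="$(date +%s)"
    [ $((now - last)) -gt "$max_age" ]
}
```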

6.5 Verification Protocol

Before claiming completion:

  1. Build succeeds (zero warnings/errors)
  2. All tests pass
  3. No regressions from baseline
  4. Changes committed with clear message
  5. Notification sent
  6. Handoff documentation updated
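
Part of this checklist can be automated as a gate that runs before an issue is closed. The sketch below covers only steps 1, 2, and 4 (build, tests, committed changes). `verify_completion`, `BUILD_CMD`, and `TEST_CMD` are hypothetical names, and the `make` defaults are placeholders to replace with your project's real commands. Regression baselines, notifications, and handoff notes are project-specific and left out.

```shell
#!/usr/bin/env bash
# Hypothetical verification gate; prints a one-line verdict on stdout.
BUILD_CMD="${BUILD_CMD:-make build}"   # placeholder build command
TEST_CMD="${TEST_CMD:-make test}"      # placeholder test command

verify_completion() {
    # Step 1: build must succeed (build output goes to stderr to keep the verdict clean)
    if ! sh -c "$BUILD_CMD" >&2; then
        echo "FAIL: build"
        return 1
    fi
    # Step 2: all tests must pass
    if ! sh -c "$TEST_CMD" >&2; then
        echo "FAIL: tests"
        return 1
    fi
    # Step 4: working tree must be clean, i.e. changes are committed
    if [ -n "$(git status --porcelain 2>/dev/null)" ]; then
        echo "FAIL: uncommitted changes"
        return 1
    fi
    echo "OK"
}
```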

6.6 Beads Workflow

Rich issue descriptions:

bd create "Implement feature X" \
  -d "## Context
      [Why this is needed]

      ## Requirements
      - Requirement 1
      - Requirement 2

      ## Acceptance Criteria
      - [ ] Tests pass
      - [ ] Documentation updated

      ## Technical Notes
      [Implementation hints]" \
  -p 2 -t feature

Track discovered work:

# During feature work, found a bug
bd create "Bug: edge case in auth" -p 1 -t bug
bd dep add <bug-id> <feature-id> --type discovered-from

Maintain dependency hygiene:

# Before starting work
bd dep tree <id>  # Understand blockers

# When blocked
bd update <id> --status blocked
bd create "Unblock: fix dependency" -p 1
bd dep add <id> <unblocker-id> --type blocks

7. Lessons Learned

7.1 What Works

Practice | Why It Works
One task per iteration | Prevents scope creep, enables clean commits
Beads over markdown TODOs | Persists context, tracks dependencies
Skills over monolithic prompts | Easier updates, clearer dependencies
Graceful stop via file | Clean shutdown without kill signals
Notifications for all outcomes | Visibility without polling
Exponential backoff | Handles transient failures gracefully
Session archiving | Enables debugging and analysis

7.2 What Doesn't Work

Anti-Pattern | Why It Fails
Monolithic 200KB prompts | Context overload, hard to maintain
Implementing directly | Orchestrators should delegate
Skipping verification | Regressions compound over iterations
Ignoring lessons learned | Same mistakes repeated
Blocking on transient errors | API hiccups shouldn't stop progress
Claiming without evidence | False confidence leads to regressions

7.3 Captured Lessons Template

- id: N
  date: "YYYY-MM-DD"
  category: "[quality|performance|architecture|debugging]"
  approach: "What was tried"
  outcome: "[SUCCESS|FAILURE|PARTIAL]"
  metrics:
    before: X
    after: Y
  lesson: "One-line takeaway"
  why: "Root cause analysis"
  apply: "How to use this in future"
  confidence: "[CRITICAL|HIGH|MEDIUM|LOW]"
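
Entries in this shape can be appended from a script at the end of an iteration. The helper below is a sketch only: `record_lesson` and `LESSONS_FILE` are hypothetical names, it writes a subset of the template's fields, and it does not escape double quotes inside values, so keep the arguments free of them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append one lesson entry to the evidence database.
LESSONS_FILE="${LESSONS_FILE:-.ralph-state/lessons-learned.yaml}"

record_lesson() {
    local category="$1" approach="$2" outcome="$3" lesson="$4"
    local next_id

    # Next id = number of existing entries + 1 (a missing file counts as zero entries)
    next_id=$(( $(grep -c '^- id:' "$LESSONS_FILE" 2>/dev/null || true) + 1 ))

    mkdir -p "$(dirname "$LESSONS_FILE")"
    {
        printf -- '- id: %s\n' "$next_id"
        printf '  date: "%s"\n' "$(date +%Y-%m-%d)"
        printf '  category: "%s"\n' "$category"
        printf '  approach: "%s"\n' "$approach"
        printf '  outcome: "%s"\n' "$outcome"
        printf '  lesson: "%s"\n' "$lesson"
    } >> "$LESSONS_FILE"
}
```

For example: `record_lesson debugging "Retried flaky test" FAILURE "Fix the race instead of retrying"`.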

8. Troubleshooting

Common Issues

Ralph not starting:

# Check for stale PID file
cat .ralph-state/ralph.pid
ps -p "$(cat .ralph-state/ralph.pid)"
# If process doesn't exist, remove PID file
rm .ralph-state/ralph.pid

Iterations failing repeatedly:

# Check consecutive errors
cat .ralph-state/status.yaml

# Review recent session output
cat .ralph-state/sessions/$(ls -t .ralph-state/sessions | head -1)/output.txt

# Check for transient vs. permanent errors
grep -E "API Error|rate limit" .ralph-state/logs/ralph.log

Beads sync issues:

# Force re-import from JSONL
bd sync --force

# Check for merge conflicts
git status .beads/beads.jsonl

Long operation stuck:

# Check heartbeat age
stat .ralph-state/operation-heartbeat

# If too old, may need manual intervention
# Kill stuck process and update state

Notifications not arriving:

# Test ntfy connectivity
curl -d "Test message" https://ntfy.sh/${NTFY_TOPIC}

# Check topic subscription
# Visit https://ntfy.sh/YOUR_TOPIC in browser
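
A common cause of silent failures is an unset NTFY_TOPIC in the shell that launched Ralph. A sender that fails loudly makes this easier to spot. The function below is a minimal sketch, not necessarily how your ralph.sh sends notifications: the name `notify`, its optional title argument, and the 10-second timeout are assumptions. The `Title` header and the plain POST body follow ntfy's publish API.

```shell
#!/usr/bin/env bash
# Hypothetical notification sender that reports why a send did not happen.
notify() {
    local message="$1"
    local title="${2:-Ralph}"

    # Refuse to send (and say so) when no topic is configured
    if [ -z "${NTFY_TOPIC:-}" ]; then
        echo "NTFY_TOPIC is not set; skipping notification" >&2
        return 1
    fi

    # -f makes curl return non-zero on HTTP errors; -sS hides progress but keeps errors
    curl -fsS --max-time 10 \
        -H "Title: ${title}" \
        -d "$message" \
        "https://ntfy.sh/${NTFY_TOPIC}" >/dev/null
}
```

Running `notify "Test message" || echo "send failed"` from the same shell that starts Ralph shows whether the problem is configuration or connectivity.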

9. Appendix: File Templates

9.1 ralph.md Template

# Ralph Guidance Document

You are Ralph, an autonomous development agent. Your mission:
**complete ONE meaningful task per iteration**, then exit cleanly.

## Sacred Principles
1. [Your non-negotiable rule 1]
2. [Your non-negotiable rule 2]
3. ...

## Stopping Conditions (ONLY TWO)
1. **Goals Achieved**: All targets in goals.yaml met
2. **Manual Stop**: .ralph-state/STOP file exists

## Task Selection Process
1. Read goals.yaml for targets
2. Query `bd ready` for unblocked work
3. Select highest priority item
4. Complete ONE task thoroughly
5. Verify completion
6. Commit and notify
7. Exit for next iteration

## Verification Checklist
- [ ] Build succeeds
- [ ] Tests pass
- [ ] No regressions
- [ ] Changes committed
- [ ] Notification sent

9.2 goals.yaml Template

# Project Goals - Updated by Ralph

phase: 1
phase_status: "ACTIVE"

targets:
  target_1:
    description: "What this target measures"
    current: 0
    target: 100
    status: "BELOW_TARGET"
    last_measured: "YYYY-MM-DD"

  target_2:
    description: "Another metric"
    current: false
    target: true
    status: "BELOW_TARGET"

completion_criteria:
  description: "When is this phase complete"
  rules:
    - "All targets at or above goals"
    - "Zero failing tests"
    - "Documentation updated"

9.3 status.yaml Template

# Ralph Status - Auto-updated each iteration

iteration: 0
consecutive_errors: 0
consecutive_successes: 0
goals_stable_count: 0
last_run: ""
total_sessions: 0

9.4 .gitignore Additions

# Ralph state (selective)
.ralph-state/logs/
.ralph-state/sessions/
.ralph-state/ralph.pid
.ralph-state/STOP

# Beads cache (source of truth is beads.jsonl)
.beads/*.db

Changelog

Date | Version | Changes
2025-11 | 1.0 | Initial release with Beads, Skills, Control Scripts, Notifications

Acknowledgments