Note: This blog post was written with the help of AI (how meta!), but is based entirely on my own work implementing Ralph across real software projects. The Quick Start section below is written primarily for humans who want to get up and running fast. The rest of the document is technical reference material, useful for Ralph agents to parse during setup or for fellow nerds who want to understand how it all works under the hood.
Quick Start
Ralph is the best approach I've found for having AI build complete software projects autonomously, from start to finish. You describe what you want, define what "done" looks like, and Ralph loops until you get there. No babysitting required.
The magic is in eventual convergence: each iteration brings the project closer to your goals, even if individual steps aren't perfect. It works for all kinds of projects. I've used it to build multiple disparate systems myself, and a good friend used Ralph to build a completely custom Smalltalk VM implementation over a single weekend. It worked flawlessly.
The core idea is simple: an infinite loop feeds prompts to an AI coding assistant (Claude Code, Codex, Cursor, Aider, etc.). The agent reads the prompt, does work, commits changes, and the loop continues. This methodology takes that original Ralph approach and improves upon it in several ways to make fully autonomous development practical and reliable:
- Beads: Git-native issue tracking that replaces unwieldy markdown TODO files
- Skills: Modular domain knowledge (Claude Code only, for now)
- Control scripts: Start, stop, and monitor Ralph with simple commands
- Notifications: Get push alerts on your phone via ntfy.sh
- Evidence-based learning: Ralph remembers what worked and what didn't
Prerequisites
Before using the bootstrap prompt, install Beads (the git-native issue tracker):
# Using Homebrew (recommended)
brew install steveyegge/tap/beads
# Or using Go
go install github.com/steveyegge/beads/cmd/bd@latest
Bootstrap Prompt (Greenfield Projects)
For new projects, copy this prompt into your AI coding agent. It will ask you clarifying questions to design the right Ralph setup for your needs:
# Set Up Ralph for My Project
## Read First
- **Ralph Methodology**: https://aaronhnatiw.com/blog/ralph-autonomous-ai-agent-development-methodology.html
- **Beads Issue Tracker**: https://github.com/steveyegge/beads
## My Project
[DESCRIBE YOUR PROJECT HERE - What are you building? What's the tech stack? What does "done" look like?]
## Your Task
Ask me 5-10 clarifying questions to understand:
1. What "done" looks like (measurable success criteria)
2. What rules must NEVER be violated (sacred principles)
3. How to verify work is correct (testing strategy)
4. What specialized knowledge Ralph needs (domain expertise)
After I answer, set up the complete Ralph infrastructure:
- Initialize Beads (`bd init`)
- Create `.ralph-state/` with goals.yaml, status.yaml, lessons-learned.yaml
- Create `ralph.md` (guidance document), `ralph.sh` (loop script), `ralph-control.sh` (management)
- Set up Skills if using Claude Code
- Create initial Beads issues with proper dependencies
When done, I should be able to run `./ralph-control.sh start` and Ralph will work autonomously toward my goals.
For existing codebases or when you want more control over the setup, see the detailed bootstrap prompt later in this document.
Technical Reference Below
Everything above is all you need to get started. The rest of this document explains how all the pieces work under the hood.
Continue reading if you:
- Are an AI agent parsing this to initialize Ralph
- Want to understand how the system works internally
- Need to customize Ralph for your specific needs
- Are troubleshooting issues with your setup
1. The Core Concept
The Original Ralph
The original Ralph concept (credit: ghuntley.com/ralph) is elegantly simple:
while :; do cat PROMPT.md | npx --yes @sourcegraph/amp ; done
This infinite loop continuously feeds a prompt file to an AI coding agent, enabling autonomous iteration. The AI reads the prompt, does work, commits changes, and the loop continues.
The Philosophy
Ralph operates on several key principles:
- Eventual Consistency: Like tuning an instrument, each iteration refines the system toward the goal
- Sign-Based Guidance: Prompts act as "signs" that guide AI behavior, refined over time
- Autonomous Operation: After initial setup, the system works without human intervention
- Deterministic Improvement: Even if individual iterations vary, the trajectory is toward goals
The Problem with Basic Ralph
While powerful, the original approach has limitations:
| Problem | Impact |
|---|---|
| Context amnesia | AI forgets task details when context window compresses between sessions |
| No dependency tracking | Agent may attempt blocked work or miss prerequisites |
| Discovered work lost | Bugs found during feature work get forgotten |
| No specialized knowledge | Monolithic prompts become unwieldy and hard to maintain |
| Manual monitoring required | Must watch logs to know if system is working |
| No error recovery | Transient failures can halt the loop |
| No learning | Same mistakes repeated across iterations |
2. Key Innovations
This methodology addresses original Ralph's limitations through five key innovations:
2.1 Beads: Git-Native Issue Tracking (Most Critical)
The single most valuable improvement is integrating Beads, a git-native distributed issue tracker designed specifically for AI agents.
Why Beads matters:
The original Ralph approach often used markdown files (like TODO.md or TASKS.md) to track work. This creates a serious problem: as the project grows, these files become long and unwieldy. AI agents struggle to read and update them accurately, causing drift. The agent loses track of what's been done, what's blocked, and what's next. This drift prevents the eventual consistency that makes Ralph work.
Beads solves this with a proper database that's still git-native:
- Replaces markdown TODOs: No more giant, hard-to-parse task files
- Dependency-aware selection: bd ready returns only unblocked, actionable work
- Discovered work tracking: New issues linked to originating work via discovered-from relationships
- No external services: Everything commits to git (offline-capable)
- Machine-readable: JSON output for programmatic parsing
2.2 Skills: Modular Domain Knowledge
Instead of monolithic prompts, domain expertise lives in skills. These are focused, versioned, reusable knowledge modules:
.claude/skills/
├── testing-workflows/SKILL.md
├── quality-assurance/SKILL.md
├── configuration-management/SKILL.md
└── regression-prevention/SKILL.md
Each skill contains procedures, standards, and examples for one operational area. Skills are markdown files with YAML frontmatter that define metadata like dependencies, allowed tools, and version info. When Ralph needs specialized knowledge (e.g., how to run tests in your project), it loads the relevant skill rather than searching through a massive prompt file.
Note: Skills are currently only available in Claude Code. See the Skills documentation for setup details. If you're using a different AI agent, you can achieve similar results with well-organized prompt files, but won't get the automatic skill loading that Claude Code provides.
2.3 Control Scripts: Robust Process Management
The control script (ralph-control.sh) is the primary interface for managing Ralph. You'll use it to start, stop, and monitor the entire system. Wrapper scripts provide:
- Background execution with PID tracking
- Graceful stop via signal files
- Log management and session archiving
- Status reporting without log diving
2.4 Notifications: Asynchronous Monitoring
Push notifications via ntfy.sh enable monitoring without polling:
- Iteration start/completion alerts
- Error escalation with severity levels
- Milestone tracking
- Long-running operation progress
2.5 Evidence-Based Learning
A structured lessons-learned system captures:
- What approaches worked (with metrics)
- What approaches failed (with root causes)
- Confidence levels for future decisions
3. Architecture Overview
┌─────────────────────────────────────────────────────────────────┐
│                       RALPH CONTROL LAYER                       │
│    ralph-control.sh: start | stop | status | logs | sessions    │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                   CONTINUOUS LOOP (ralph.sh)                    │
│    ┌─────────┐    ┌─────────┐    ┌─────────┐    ┌─────────┐     │
│    │  Read   │───▶│  Build  │───▶│ Invoke  │───▶│ Handle  │     │
│    │  State  │    │ Prompt  │    │  Agent  │    │ Result  │     │
│    └─────────┘    └─────────┘    └─────────┘    └─────────┘     │
│         │                                            │          │
│         │             ┌─────────────────┐            │          │
│         └─────────────│  Update State   │────────────┘          │
│                       │    + Notify     │                       │
│                       └─────────────────┘                       │
└─────────────────────────────────────────────────────────────────┘
                                 │
         ┌───────────────────────┼───────────────────────┐
         ▼                       ▼                       ▼
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   STATE FILES   │     │      BEADS      │     │     SKILLS      │
│  .ralph-state/  │     │     .beads/     │     │ .claude/skills  │
│  ├─goals.yaml   │     │   beads.jsonl   │     │  └─*/SKILL.md   │
│  ├─status.yaml  │     │  (247 issues)   │     │                 │
│  └─lessons.yaml │     └─────────────────┘     └─────────────────┘
└─────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                         AI CODING AGENT                         │
│      Claude Code | Codex | Cursor | Aider | Any CLI Agent       │
└─────────────────────────────────────────────────────────────────┘
4. Component Deep Dives
4.1 The Continuous Loop
Important: You don't run ralph.sh directly. Instead, use ralph-control.sh start to launch Ralph in the background with proper process management. See Control Scripts for the full interface.
The core loop structure:
#!/bin/bash
# ralph.sh - Continuous improvement loop (run via ralph-control.sh, not directly)
while :; do
    # 1. Check stopping conditions
    if [[ -f "${STATE_DIR}/STOP" ]] || goals_achieved; then
        notify "Ralph stopped: $(get_stop_reason)"
        break
    fi

    # 2. Check for long-running operations
    if is_long_operation_active; then
        wait_for_operation  # Low-polling mode
        continue
    fi

    # 3. Increment iteration
    ((iteration++))

    # 4. Build prompt with current context
    prompt=$(build_prompt "${iteration}")

    # 5. Invoke AI agent
    if echo "${prompt}" | your-ai-agent > "${output_file}" 2>&1; then
        handle_success "${iteration}"
    else
        handle_failure "${iteration}" "$?"
    fi

    # 6. Update state files
    write_status "${iteration}" "${errors}" "${successes}"

    # 7. Wait before next iteration
    sleep "${ITERATION_DELAY}"
done
Key features:
| Feature | Purpose |
|---|---|
| STOP file | Graceful shutdown without killing processes |
| Long operation detection | Prevents polling waste during multi-hour tasks |
| Error counting | Distinguishes transient vs. permanent failures |
| State persistence | Enables handoff between iterations |
| Configurable delay | Tune for hardware/API rate limits |
4.2 State Management
Ralph maintains state in YAML files for human readability and AI parseability:
goals.yaml - Strategic targets:
# What we're working toward
phase: 1
targets:
  test_coverage:
    current: 75
    target: 90
    status: "BELOW_TARGET"
  build_time:
    current: 45
    target: 30
    status: "BELOW_TARGET"
    unit: "seconds"
completion_criteria:
  - All targets at or above goals
  - Zero failing tests
  - Documentation updated
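The loop in section 4.1 calls goals_achieved without showing it. Here is a minimal sketch, assuming goals.yaml keeps the layout above and that a target is unmet whenever its status is "BELOW_TARGET". The function body and the default path are assumptions, and the sketch checks only the status fields, not the free-text completion_criteria.

```shell
#!/bin/bash
# Hypothetical helper: succeeds (exit 0) only when no target in goals.yaml
# is still marked BELOW_TARGET. Assumes the layout shown in this section.
goals_achieved() {
    local goals_file="${1:-.ralph-state/goals.yaml}"

    # A missing goals file means we cannot claim success.
    [[ -f "${goals_file}" ]] || return 1

    # Any remaining BELOW_TARGET status line means work is left.
    if grep -qE '^[[:space:]]*status:[[:space:]]*"?BELOW_TARGET"?' "${goals_file}"; then
        return 1
    fi
    return 0
}
```

Because the check is a plain grep, it keeps working if you add more targets, as long as each one carries a status field.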
status.yaml - Operational health:
# Current loop state
iteration: 417
consecutive_errors: 0
consecutive_successes: 25
goals_stable_count: 0
last_run: "2025-11-26 12:00:00"
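One way write_status from the loop could produce this file is sketched below. It is an assumption, not the original implementation: it writes to a temporary file and renames it so a crash never leaves a half-written status, and it omits goals_stable_count because the source does not say how that value is computed. read_status is a hypothetical companion for reading one key back.

```shell
#!/bin/bash
# Hypothetical write_status: rewrites status.yaml after each iteration.
write_status() {
    local iteration="$1" errors="$2" successes="$3"
    local state_dir="${STATE_DIR:-.ralph-state}"
    local tmp_file="${state_dir}/status.yaml.tmp"

    mkdir -p "${state_dir}"
    # Write to a temp file first, then rename, so readers never see a partial file.
    {
        echo "# Current loop state"
        echo "iteration: ${iteration}"
        echo "consecutive_errors: ${errors}"
        echo "consecutive_successes: ${successes}"
        echo "last_run: \"$(date '+%Y-%m-%d %H:%M:%S')\""
    } > "${tmp_file}"
    mv "${tmp_file}" "${state_dir}/status.yaml"
}

# Hypothetical reader: prints the value of one top-level key.
read_status() {
    local key="$1"
    local state_dir="${STATE_DIR:-.ralph-state}"
    sed -n "s/^${key}:[[:space:]]*//p" "${state_dir}/status.yaml" | head -n 1
}
```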
lessons-learned.yaml - Evidence database:
lessons:
  - id: 1
    date: "2025-10-12"
    approach: "Template-based configuration"
    outcome: "SUCCESS"
    metrics:
      before: 52
      after: 98
    confidence: "HIGH"
    why: "Concrete examples beat abstract rules"
    apply: "Use examples in all configuration tasks"
4.3 Beads: Git-Native Issue Tracking
Beads is the most critical improvement to Ralph. It replaces unwieldy markdown TODO files with a proper git-native database, preventing the drift that occurs when AI agents struggle to read and update large task lists.
How Beads Works
Source of Truth: .beads/beads.jsonl (git-tracked)
Local Cache: .beads/*.db (SQLite, gitignored)
User runs: bd create "Fix authentication bug"
↓
Beads updates SQLite cache
↓
Beads exports to JSONL (5s debounce)
↓
Developer commits: git add .beads/beads.jsonl
↓
Other machines: git pull → auto-import to local cache
Essential Beads Commands
# Find ready work (unblocked, actionable)
bd ready --priority 1 --limit 5
# Create issue with rich context
bd create "Implement user authentication" \
-d "Requirements:
- OAuth2 support
- Session management
- Rate limiting
Acceptance criteria:
- All auth tests pass
- Security audit complete" \
-p 1 -t feature
# Link dependencies
bd dep add <child> <parent> --type blocks
bd dep add <discovered> <origin> --type discovered-from
# Progress tracking
bd update <id> --status in_progress
bd close <id> --reason "Implemented in commit abc123"
# Visualize dependencies
bd dep tree <epic-id>
Four Dependency Types
| Type | Meaning | Use Case |
|---|---|---|
| blocks | Hard prerequisite | Bug fix must complete before feature work |
| related | Soft connection | Reference without blocking |
| parent-child | Hierarchical | Epic contains subtasks |
| discovered-from | Origin tracking | Bug found during feature work |
Why Beads Beats Alternatives
| Aspect | Beads | GitHub Issues | Markdown TODOs |
|---|---|---|---|
| Offline | Full | None | Full |
| Dependencies | DAG with 4 types | Labels only | None |
| Query speed | <100ms (local) | 1-3s (API) | Linear scan |
| Context persistence | Git commits | Cloud-dependent | Lost on reset |
| Agent-optimized | Yes (JSON output) | Partial | No |
4.4 Skills: Domain Knowledge Modules
Skills encapsulate specialized knowledge in reusable, versioned modules.
Skill Structure
---
name: Testing Workflows
description: Procedures for test execution and validation
version: 1.0
last-updated: 2025-11-01
depends-on: [regression-prevention]
allowed-tools: [Bash, Read, Grep]
---
# Testing Workflows
## Purpose & Scope
When to invoke this skill and what it provides.
## Core Concepts
Key principles and definitions.
## Procedures & Workflows
Step-by-step operational guidance.
### Running Tests
1. Execute unit tests: `make test`
2. Execute integration tests: `make test-integration`
3. Check coverage: `make coverage`
## Standards & Requirements
Rules that must be followed.
## Examples & Templates
Concrete demonstrations.
## Quick Reference
Command summaries and key paths.
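If a script outside the agent needs to inspect a skill's metadata (for example, to list versions), the frontmatter can be read with standard tools. This is a sketch under assumptions: skill_field is a hypothetical helper, and it handles only simple single-line "key: value" entries like those in the example above, not nested YAML.

```shell
#!/bin/bash
# Hypothetical helper: print one frontmatter field from a SKILL.md file.
# Only looks between the first and second '---' lines.
skill_field() {
    local field="$1" skill_file="$2"
    awk -v key="${field}" '
        /^---[[:space:]]*$/ { fences++; if (fences == 2) exit; next }
        fences == 1 {
            prefix = key ":"
            if (index($0, prefix) == 1) {
                value = substr($0, length(prefix) + 1)
                sub(/^[[:space:]]+/, "", value)
                print value
                exit
            }
        }
    ' "${skill_file}"
}
```

For example, skill_field version .claude/skills/testing-workflows/SKILL.md would print the version line's value for the skill shown above.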
Skill Categories
| Category | Examples | Purpose |
|---|---|---|
| Orchestration | task-selection, verification-protocol | How Ralph operates |
| Operations | testing-workflows, configuration-management | How to do common tasks |
| Domain | quality-assurance, security-guidelines | Specialized expertise |
| Integration | beads-task-tracking, notification-patterns | How systems connect |
Benefits Over Monolithic Prompts
MONOLITHIC PROMPT (200KB):
├─ All knowledge embedded
├─ Hard to update one area
├─ Overwhelming context
└─ Brittle dependencies

SKILLS-BASED (14 × 10-30KB each):
├─ Load only what's needed
├─ Update skills independently
├─ Clear dependency graph
└─ Reusable across agents
4.5 Control Scripts
The control script provides a clean interface for managing Ralph:
#!/bin/bash
# ralph-control.sh
case "$1" in
    start)
        # Check for existing process
        if [[ -f "${PID_FILE}" ]]; then
            pid=$(cat "${PID_FILE}")
            if kill -0 "$pid" 2>/dev/null; then
                echo "Ralph already running (PID $pid)"
                exit 1
            fi
            rm "${PID_FILE}"  # Stale PID file
        fi

        # Clear any STOP file left by a previous graceful stop,
        # otherwise the loop would exit on its first check
        rm -f "${STATE_DIR}/STOP"

        # Start in background
        nohup ./ralph.sh > "${LOG_DIR}/ralph-output.log" 2>&1 &
        echo $! > "${PID_FILE}"
        echo "Ralph started (PID $!)"
        ;;

    stop)
        # Graceful stop via signal file
        touch "${STATE_DIR}/STOP"
        echo "Stop requested. Ralph will exit after current iteration."
        ;;

    kill)
        # Force stop
        if [[ -f "${PID_FILE}" ]]; then
            kill -TERM "$(cat "${PID_FILE}")" 2>/dev/null
            rm "${PID_FILE}"
        fi
        ;;

    status)
        # Show current state
        if [[ -f "${PID_FILE}" ]] && kill -0 "$(cat "${PID_FILE}")" 2>/dev/null; then
            echo "Ralph is running (PID $(cat "${PID_FILE}"))"
            cat "${STATE_DIR}/status.yaml"
        else
            echo "Ralph is not running"
        fi
        ;;

    logs)
        tail -f "${LOG_DIR}/ralph.log"
        ;;

    sessions)
        # List recent sessions
        ls -lt "${STATE_DIR}/sessions" | head -20
        ;;

    *)
        echo "Usage: $0 {start|stop|kill|status|logs|sessions}"
        exit 1
        ;;
esac
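The script above references PID_FILE, STATE_DIR, and LOG_DIR without defining them. A minimal sketch of defaults that could sit at the top of ralph-control.sh follows, assuming the layout from section 5.1 and the .ralph-state/ralph.pid path used in Troubleshooting. ensure_ralph_dirs is a hypothetical helper.

```shell
#!/bin/bash
# Assumed defaults for the variables ralph-control.sh references.
# Each can be overridden from the environment.
STATE_DIR="${STATE_DIR:-.ralph-state}"
LOG_DIR="${LOG_DIR:-${STATE_DIR}/logs}"
PID_FILE="${PID_FILE:-${STATE_DIR}/ralph.pid}"

# Hypothetical helper: create the directories the control script writes into.
ensure_ralph_dirs() {
    mkdir -p "${STATE_DIR}/sessions" "${LOG_DIR}"
}
```

Calling ensure_ralph_dirs before the case statement avoids a failed redirect on first start, when the logs directory does not exist yet.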
4.6 Notifications
Push notifications enable monitoring without active watching:
# Notification function
notify() {
    local message="$1"
    local title="${2:-Ralph}"

    curl -s -X POST "https://ntfy.sh/${NTFY_TOPIC}" \
        -H "Title: ${title}" \
        -d "${message}" > /dev/null 2>&1 || true
}
# Usage throughout ralph.sh
notify "Ralph iteration ${iteration} starting..."
notify "Iteration ${iteration} completed: ${summary}"
notify "ERROR: ${error_message}" "Ralph: Error"
Notification Types
| Event | Emoji | Example |
|---|---|---|
| Iteration start | 🤖 | "Ralph iteration 417 starting..." |
| Success | ✅ | "Fixed auth bug. Tests passing." |
| Failure | ❌ | "Iteration 418 failed (2/3 errors)" |
| Milestone | 📊 | "420 iterations completed" |
| Backoff | ⏳ | "API error. Backing off 5min." |
| Critical | 🚨 | "Manual intervention required" |
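To make critical alerts stand out from routine ones, the event types in the table can be mapped onto ntfy's Priority header. The sketch below is an assumption about how that could look: the event names and the mapping are hypothetical choices, and notify_event is an illustrative variant of the notify function above, not part of the original scripts.

```shell
#!/bin/bash
# Hypothetical mapping from a Ralph event type to an ntfy priority name.
ntfy_priority_for() {
    case "$1" in
        critical)          echo "urgent" ;;
        failure)           echo "high" ;;
        backoff|milestone) echo "default" ;;
        start|success)     echo "low" ;;
        *)                 echo "default" ;;
    esac
}

# Variant of notify that also sets the Priority header.
# Failures to send are ignored so notifications never break the loop.
notify_event() {
    local event="$1" message="$2" title="${3:-Ralph}"

    curl -s -X POST "https://ntfy.sh/${NTFY_TOPIC}" \
        -H "Title: ${title}" \
        -H "Priority: $(ntfy_priority_for "${event}")" \
        -d "${message}" > /dev/null 2>&1 || true
}
```

Usage would look like notify_event critical "Manual intervention required" "Ralph: Critical".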
ntfy.sh Setup
- Choose a unique topic name. Since ntfy.sh topics are public, use something like ralph-yourname-projectname-2025 or a random string to avoid collisions with other users
- Subscribe at https://ntfy.sh/your-unique-topic (or use the ntfy mobile app)
- Set NTFY_TOPIC=your-unique-topic in your environment
- Notifications appear on all subscribed devices
Warning: Do not use generic topic names like ralph or my-project. Anyone can subscribe to any topic, so you may receive notifications from other people's Ralph instances (or send yours to strangers).
4.7 Evidence-Based Learning
Ralph learns from experience through structured lesson capture:
# .ralph-state/lessons-learned.yaml
lessons:
  - id: 1
    date: "2025-10-12"
    category: "quality_improvement"
    task_type: "test_coverage"
    approach: "Added integration tests for edge cases"
    outcome: "SUCCESS"
    metrics:
      before: 65
      after: 92
      improvement: 27
    lesson: "Edge case tests have highest ROI for coverage"
    why_it_worked: |
      Integration tests exercise multiple code paths.
      Edge cases often have no existing coverage.
    future_application: |
      When improving coverage, prioritize edge cases
      over additional happy-path tests.
    confidence: "HIGH"
    evidence_files:
      - "tests/integration/edge_cases_test.go"
      - ".ralph-state/coverage-report-2025-10-12.html"

  - id: 2
    date: "2025-10-15"
    category: "performance"
    approach: "Added caching layer"
    outcome: "PARTIAL"
    metrics:
      before: 450
      after: 180
      improvement: 270
      unit: "milliseconds"
    lesson: "Caching helps but introduces complexity"
    why_partial: |
      Performance improved but cache invalidation
      caused intermittent test failures.
    future_application: |
      Consider cache complexity cost.
      Ensure invalidation is tested.
    confidence: "MEDIUM"
Query Protocol
Before selecting an approach, Ralph queries lessons:
# In task selection logic:
# 1. Read lessons file
# 2. Filter by category and task_type
# 3. Apply confidence levels:
# - CRITICAL: Must follow
# - HIGH: Strongly prefer
# - MEDIUM: Consider
# - LOW: Reference only
# 4. Avoid approaches with outcome: FAILURE
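Step 4 of the protocol can be approximated without a YAML parser. The sketch below is an assumption rather than the original implementation: failed_approaches is a hypothetical helper that relies on each lesson listing approach before outcome, as in the examples above, and it does not filter by category or task_type.

```shell
#!/bin/bash
# Hypothetical helper: list the approach of every lesson whose outcome is FAILURE.
# Assumes "approach:" appears before "outcome:" within each lesson entry.
failed_approaches() {
    local lessons_file="${1:-.ralph-state/lessons-learned.yaml}"
    awk '
        # A new lesson starts: forget the previous approach.
        /^[[:space:]]*- id:/ { approach = "" }

        # Remember the most recent approach, stripping the key and quotes.
        /^[[:space:]]*approach:/ {
            approach = $0
            sub(/^[[:space:]]*approach:[[:space:]]*/, "", approach)
            gsub(/"/, "", approach)
        }

        # Print the remembered approach when this lesson failed.
        /^[[:space:]]*outcome:[[:space:]]*"?FAILURE"?/ { print approach }
    ' "${lessons_file}"
}
```

The output can be appended to the prompt under a heading such as "Approaches to avoid" so the agent sees past failures before choosing a strategy.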
5. Implementation Guide
5.0 Detailed Bootstrap Prompt
This is the full bootstrap prompt with all the fields you can customize. Use this when you have an existing codebase, specific requirements, or want fine-grained control over how Ralph is configured. For new/greenfield projects, the simpler prompt in Quick Start may be easier.
The Detailed Bootstrap Prompt
Copy and customize this prompt, replacing the placeholders with your project-specific information:
# Ralph Autonomous Development Setup
## Documentation to Read First
Before we begin, please read and understand these resources:
1. **Ralph Methodology**: https://aaronhnatiw.com/blog/ralph-autonomous-ai-agent-development-methodology.html
- Understand the continuous loop architecture
- Learn the state management approach (goals.yaml, status.yaml, lessons-learned.yaml)
- Understand Beads integration for persistent task tracking
- Note the notification system via ntfy.sh
2. **Beads Issue Tracker**: https://github.com/steveyegge/beads
- Git-native distributed issue tracking
- Dependency management (blocks, related, parent-child, discovered-from)
- The `bd ready` command for finding unblocked work
## Project Context
**Project Name**: [YOUR PROJECT NAME]
**Project Description**: [BRIEF DESCRIPTION OF WHAT THE PROJECT DOES]
**Current State**: [NEW PROJECT | EXISTING CODEBASE | PROTOTYPE | etc.]
**Tech Stack**: [LANGUAGES, FRAMEWORKS, KEY DEPENDENCIES]
## The Goal
**Primary Objective**: [WHAT YOU WANT RALPH TO ACHIEVE]
**Success Criteria**: [HOW YOU'LL KNOW WHEN YOU'RE DONE]
- [ ] [Criterion 1]
- [ ] [Criterion 2]
- [ ] [Criterion 3]
**Constraints/Requirements**:
- [Any non-negotiable rules]
- [Quality standards]
- [Performance requirements]
## Your Task
### Phase 1: Codebase Exploration (DO THIS FIRST)
Before asking questions or setting anything up, thoroughly explore the existing codebase to understand:
1. **Project structure**: Directory layout, key files, entry points
2. **Architecture**: How components are organized and how they interact
3. **Existing patterns**: Code conventions, testing approach, error handling
4. **Current state**: What's working, what's incomplete, any obvious technical debt
5. **Dependencies**: External libraries, services, and integrations
6. **Build/test process**: How to build, run, and test the project
Summarize your findings before proceeding. This context is essential for asking informed questions and configuring Ralph effectively.
### Phase 2: Clarification
With your understanding of the codebase, ask me clarifying questions to fill in the gaps:
1. **The end state**: What does "done" look like? What are the measurable success criteria?
2. **The iteration loop**: What constitutes one meaningful unit of work? What should each Ralph iteration accomplish?
3. **Testing strategy**: How will Ralph verify its work is correct?
4. **Sacred principles**: What rules must NEVER be violated, even if they slow progress?
5. **Domain knowledge**: What specialized knowledge does Ralph need to do this work effectively?
Ask me 5-10 thoughtful questions that will help you design the most effective Ralph configuration for this project. I want Ralph to have a clear target and tight iteration loop (plan → implement → test → commit → repeat).
### Phase 3: Setup (AFTER CLARIFICATION)
Once you understand the project deeply, set up the complete Ralph infrastructure:
1. **Initialize Beads** for issue tracking (`bd init`)
2. **Create `.ralph-state/` directory** with:
- `goals.yaml` - Strategic targets based on our discussion
- `status.yaml` - Initial loop state
- `lessons-learned.yaml` - Empty lessons template
- `sessions/` directory for iteration archives
- `logs/` directory for ralph.log
3. **Create `ralph.md`** - Core guidance document with:
- Project-specific sacred principles
- Clear stopping conditions
- Task selection priorities
- Verification checklist
4. **Create `ralph.sh`** - The continuous loop script adapted for [YOUR AI AGENT]
5. **Create `ralph-control.sh`** - Management interface (start/stop/status/logs)
6. **Set up Skills** (if using Claude Code) in `.claude/skills/`:
- Project-specific domain knowledge
- Testing workflows
- Any specialized procedures
7. **Configure notifications** via ntfy.sh
8. **Create initial Beads issues** breaking down the primary objective into actionable tasks with proper dependencies
9. **Update `.gitignore`** appropriately
10. **Test with a single iteration** to verify everything works
### Deliverables
When complete, I should be able to run:
```bash
./ralph-control.sh start
```
And Ralph will autonomously work toward the goals we defined, sending me notifications of progress, until all success criteria are met or I manually stop it.
Customization Tips
When using this prompt:
- Be specific about the goal: Vague goals lead to wandering iterations. "Improve the codebase" is bad. "Achieve 90% test coverage with all tests passing" is good.
- Define measurable success criteria: Ralph needs to know when to stop. Numbers are better than adjectives.
- List your sacred principles: What should Ralph NEVER do? What quality standards are non-negotiable?
- Describe your testing strategy: How will Ralph verify each iteration's work? Automated tests? Manual validation commands?
- Be honest about current state: A new project needs different setup than an existing codebase with technical debt.
Example: Filled-In Bootstrap Prompt
Here's a concrete example of a filled-in bootstrap prompt:
# Ralph Autonomous Development Setup
## Documentation to Read First
1. **Ralph Methodology**: https://aaronhnatiw.com/blog/ralph-autonomous-ai-agent-development-methodology.html
2. **Beads Issue Tracker**: https://github.com/steveyegge/beads
## Project Context
**Project Name**: TaskFlow API
**Project Description**: A REST API for task management with real-time notifications, built in Go with PostgreSQL.
**Current State**: Existing codebase with ~60% test coverage, some technical debt in the authentication module.
**Tech Stack**: Go 1.21, PostgreSQL 15, Redis for caching, Docker for deployment
## The Goal
**Primary Objective**: Achieve production-ready quality with comprehensive test coverage and zero critical bugs.
**Success Criteria**:
- [ ] 90%+ test coverage across all packages
- [ ] All existing tests passing
- [ ] Zero critical or high-severity bugs in issue tracker
- [ ] API response times under 100ms for all endpoints
- [ ] Authentication module refactored with no security vulnerabilities
**Constraints/Requirements**:
- Must maintain backward compatibility with existing API consumers
- No external service dependencies beyond PostgreSQL and Redis
- All changes must include tests
- Security-sensitive code requires extra scrutiny
## Your Task
### Phase 1: Codebase Exploration (DO THIS FIRST)
Explore the codebase to understand:
1. Project structure and architecture
2. Existing code patterns and conventions
3. Current test coverage and testing approach
4. The authentication module's current implementation
5. Build, run, and deployment processes
Summarize your findings before asking questions.
### Phase 2: Clarification
With your understanding of the codebase, ask me about:
1. What "production-ready" means for this specific project
2. How to prioritize: coverage vs. bug fixes vs. performance
3. The authentication refactor scope and constraints
4. Testing strategy for database-dependent code
5. Any existing patterns or conventions I want maintained
[Continue with Phase 3 after discussion...]
What Happens Next
After the AI agent asks clarifying questions and you answer them:
- The agent will create all Ralph infrastructure files
- Initial Beads issues will be created with proper dependencies
- goals.yaml will reflect your specific success criteria
- ralph.md will encode your sacred principles
- You run ./ralph-control.sh start and Ralph begins autonomous operation
The clarification phase is critical. A few minutes of back-and-forth here saves hours of misdirected autonomous work later.
5.1 Directory Structure
your-project/
├── .ralph-state/
│   ├── goals.yaml              # Strategic targets
│   ├── status.yaml             # Loop health
│   ├── lessons-learned.yaml    # Evidence database
│   ├── sessions/               # Per-iteration archives
│   │   └── session-YYYYMMDD_HHMMSS/
│   │       ├── prompt.txt
│   │       ├── output.txt
│   │       └── metadata.json
│   ├── logs/
│   │   └── ralph.log
│   └── STOP                    # Touch to request stop
├── .beads/
│   ├── beads.jsonl             # Issue database (commit this)
│   ├── beads.db                # Local cache (gitignore)
│   └── config.json
├── .claude/skills/             # Or equivalent for your agent
│   ├── testing-workflows/SKILL.md
│   └── .../SKILL.md
├── ralph.sh                    # Main loop
├── ralph-control.sh            # Management interface
└── ralph.md                    # Core guidance document
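The per-iteration archive under sessions/ could be written by a helper like the one below. This is a sketch under assumptions: archive_session is a hypothetical name, and the fields in metadata.json (iteration, exit_code, timestamp) are illustrative, since the source does not specify that file's contents.

```shell
#!/bin/bash
# Hypothetical helper: archive one iteration's prompt, output, and metadata.
# Prints the path of the session directory it created.
archive_session() {
    local iteration="$1" prompt="$2" output_file="$3" exit_code="$4"
    local state_dir="${STATE_DIR:-.ralph-state}"
    local stamp session_dir

    stamp="$(date '+%Y%m%d_%H%M%S')"
    session_dir="${state_dir}/sessions/session-${stamp}"

    mkdir -p "${session_dir}"
    printf '%s\n' "${prompt}" > "${session_dir}/prompt.txt"
    cp "${output_file}" "${session_dir}/output.txt"

    # Only numbers and a timestamp are written, so no JSON escaping is needed.
    printf '{"iteration": %d, "exit_code": %d, "timestamp": "%s"}\n' \
        "${iteration}" "${exit_code}" "${stamp}" > "${session_dir}/metadata.json"

    echo "${session_dir}"
}
```

Two iterations that finish within the same second would share a directory name; adding the iteration number to the name is one way to avoid that.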
5.2 Minimal Setup Checklist
- Install Beads

# Follow instructions at https://github.com/steveyegge/beads
go install github.com/steveyegge/beads/cmd/bd@latest
bd init

- Create State Directory

mkdir -p .ralph-state/{sessions,logs}

- Initialize Goals

cat > .ralph-state/goals.yaml << 'EOF'
phase: 1
targets:
  # Define your project-specific targets
  test_passing:
    current: false
    target: true
    status: "BELOW_TARGET"
EOF

- Create Core Guidance Document

# ralph.md - Your project's sacred principles and guidance

- Set Up Loop Script

# ralph.sh - Adapted to your AI agent

- Configure Notifications

export NTFY_TOPIC="your-topic-name"

- Test Single Iteration

# Run once manually to verify
./ralph.sh --single-iteration

- Start Autonomous Operation

./ralph-control.sh start
5.3 Adapting for Different AI Agents
The methodology is agent-agnostic. Adapt the invocation:
Claude Code:
echo "${prompt}" | claude --dangerously-skip-permissions
Codex CLI:
echo "${prompt}" | codex --full-auto
Aider:
echo "${prompt}" | aider --yes-always
Cursor (via CLI):
cursor --prompt "${prompt}" --auto-apply
Generic pattern:
echo "${prompt}" | your-agent [autonomous-flags]
Security Warning: Autonomous Mode Flags
The flags shown above (--dangerously-skip-permissions, --full-auto, --yes-always, --auto-apply) grant the AI agent extensive permissions to modify files, execute commands, and make changes without confirmation prompts.
Only use these flags in isolated environments. Ralph is most effective with these elevated permissions, but you should ensure appropriate isolation and sandboxing:
- Use a dedicated machine or VM solely for the Ralph project
- Run inside containers with limited host access
- Ensure the environment has no access to sensitive credentials, production systems, or personal data
- Use separate git credentials with limited repository access
The tradeoff is real: autonomous operation dramatically increases Ralph's effectiveness, but requires a properly sandboxed environment to mitigate risk.
6. Best Practices
6.1 Prompt Engineering
Layer your prompts:
- Session context (dynamic): Iteration number, git status, recent commits
- Sacred principles (static): Non-negotiable rules
- Current goals (semi-static): What we're working toward
- Task guidance (per-iteration): Specific instructions
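The four layers can be assembled by the build_prompt function that the loop in section 4.1 calls. The version below is a sketch, not the original implementation: the section headings, the GUIDANCE_FILE variable, and the closing task instruction are assumptions, and the git details are included only when the working directory is a repository.

```shell
#!/bin/bash
# Hypothetical build_prompt: prints a layered prompt on stdout.
build_prompt() {
    local iteration="$1"
    local state_dir="${STATE_DIR:-.ralph-state}"
    local guidance_file="${GUIDANCE_FILE:-ralph.md}"

    # Layer 1: session context (dynamic)
    echo "## Session Context"
    echo "Iteration: ${iteration}"
    if command -v git > /dev/null 2>&1 && git rev-parse --git-dir > /dev/null 2>&1; then
        echo "Recent commits:"
        git log --oneline -n 5 2> /dev/null || true
        echo "Working tree:"
        git status --short 2> /dev/null || true
    fi
    echo

    # Layer 2: sacred principles (static)
    echo "## Sacred Principles and Guidance"
    if [[ -f "${guidance_file}" ]]; then
        cat "${guidance_file}"
    fi
    echo

    # Layer 3: current goals (semi-static)
    echo "## Current Goals"
    if [[ -f "${state_dir}/goals.yaml" ]]; then
        cat "${state_dir}/goals.yaml"
    fi
    echo

    # Layer 4: task guidance (per-iteration placeholder text)
    echo "## Task Guidance"
    echo "Pick one ready task, implement it, verify it, and commit."
}
```

Keeping the static layers in files means you can refine ralph.md between iterations without touching the loop script.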
Include stopping conditions:
## Stopping Conditions (ONLY TWO)
1. **Goals Achieved**: All targets met and validated
2. **Manual Stop**: STOP file exists
You continue until one of these conditions is true.
6.2 Task Selection
Use strict priority tiers:
TIER 1 (Critical): Bugs, security issues, broken builds
TIER 2 (Quality): Metrics below targets
TIER 3 (Features): New capabilities

RULE: Never work on TIER 3 while TIER 1 or 2 items exist
Query Beads for ready work:
bd ready --priority 1 --limit 5
6.3 Error Handling
Classify errors:
is_transient_error() {
    local output="$1"
    # API rate limits, 500 errors, network issues
    grep -qE 'API Error: 5[0-9]{2}|rate limit|ECONNREFUSED' <<< "$output"
}

handle_error() {
    # Agent output to classify, passed in by the caller
    local output="$1"

    if is_transient_error "$output"; then
        # Exponential backoff, don't count toward stop limit
        backoff_and_retry
    else
        # Permanent error, count toward limit
        ((consecutive_errors++))
    fi
}
Implement exponential backoff:
calculate_backoff() {
    local attempt="$1"
    case "$attempt" in
        1) echo 120 ;;   # 2 minutes
        2) echo 300 ;;   # 5 minutes
        3) echo 600 ;;   # 10 minutes
        *) echo 1800 ;;  # 30 minute ceiling
    esac
}
6.4 Long-Running Operations
Detect and wait efficiently:
if is_long_operation_active; then
    # Switch to low-polling mode (e.g., check every 25 minutes)
    # instead of normal iteration cycle (e.g., every 5 minutes)
    wait_for_operation
fi
Track with heartbeat files:
# Long operation updates heartbeat
echo "$(date +%s)" > .ralph-state/operation-heartbeat
# Ralph checks heartbeat age
if heartbeat_older_than 3600; then
    notify "WARNING: Operation may be stuck"
fi
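heartbeat_older_than is used above but not defined. A minimal sketch follows, assuming the heartbeat file holds a single epoch timestamp as written by date +%s. The optional second argument and the choice to treat a corrupt file as stale are assumptions.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when the heartbeat is older than max_age seconds.
heartbeat_older_than() {
    local max_age="$1"
    local heartbeat_file="${2:-.ralph-state/operation-heartbeat}"
    local last now

    # No heartbeat file means no operation is being tracked, so nothing is stale.
    [[ -f "${heartbeat_file}" ]] || return 1

    last="$(cat "${heartbeat_file}")"
    # Treat unreadable content as stale so a corrupt file gets attention.
    [[ "${last}" =~ ^[0-9]+$ ]] || return 0

    now="$(date +%s)"
    (( now - last > max_age ))
}
```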
6.5 Verification Protocol
Before claiming completion:
- Build succeeds (zero warnings/errors)
- All tests pass
- No regressions from baseline
- Changes committed with clear message
- Notification sent
- Handoff documentation updated
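The first two checklist items lend themselves to a scripted gate. The sketch below is an assumption about how that could look: verify_iteration, BUILD_CMD, and TEST_CMD are hypothetical names, make test comes from the skill example in section 4.4, and make build is a placeholder for your project's build command. It does not cover regressions, commits, notifications, or handoff notes.

```shell
#!/bin/bash
# Hypothetical gate: runs the build and test commands, reports the first failure.
verify_iteration() {
    local build_cmd="${BUILD_CMD:-make build}"
    local test_cmd="${TEST_CMD:-make test}"

    if ! bash -c "${build_cmd}" > /dev/null 2>&1; then
        echo "VERIFY FAILED: build"
        return 1
    fi

    if ! bash -c "${test_cmd}" > /dev/null 2>&1; then
        echo "VERIFY FAILED: tests"
        return 1
    fi

    echo "VERIFY OK"
}
```

Running it before bd close keeps an issue open whenever the build or the tests fail.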
6.6 Beads Workflow
Rich issue descriptions:
bd create "Implement feature X" \
-d "## Context
[Why this is needed]
## Requirements
- Requirement 1
- Requirement 2
## Acceptance Criteria
- [ ] Tests pass
- [ ] Documentation updated
## Technical Notes
[Implementation hints]" \
-p 2 -t feature
Track discovered work:
# During feature work, found a bug
bd create "Bug: edge case in auth" -p 1 -t bug
bd dep add <bug-id> <feature-id> --type discovered-from
Maintain dependency hygiene:
# Before starting work
bd dep tree <id> # Understand blockers
# When blocked
bd update <id> --status blocked
bd create "Unblock: fix dependency" -p 1
bd dep add <id> <unblocker-id> --type blocks
7. Lessons Learned
7.1 What Works
| Practice | Why It Works |
|---|---|
| One task per iteration | Prevents scope creep, enables clean commits |
| Beads over markdown TODOs | Persists context, tracks dependencies |
| Skills over monolithic prompts | Easier updates, clearer dependencies |
| Graceful stop via file | Clean shutdown without kill signals |
| Notifications for all outcomes | Visibility without polling |
| Exponential backoff | Handles transient failures gracefully |
| Session archiving | Enables debugging and analysis |
7.2 What Doesn't Work
| Anti-Pattern | Why It Fails |
|---|---|
| Monolithic 200KB prompts | Context overload, hard to maintain |
| Implementing directly | Orchestrators should delegate |
| Skipping verification | Regressions compound over iterations |
| Ignoring lessons learned | Same mistakes repeated |
| Blocking on transient errors | API hiccups shouldn't stop progress |
| Claiming without evidence | False confidence leads to regressions |
7.3 Captured Lessons Template
- id: N
  date: "YYYY-MM-DD"
  category: "[quality|performance|architecture|debugging]"
  approach: "What was tried"
  outcome: "[SUCCESS|FAILURE|PARTIAL]"
  metrics:
    before: X
    after: Y
  lesson: "One-line takeaway"
  why: "Root cause analysis"
  apply: "How to use this in future"
  confidence: "[CRITICAL|HIGH|MEDIUM|LOW]"
8. Troubleshooting
Common Issues
Ralph not starting:
# Check for stale PID file
cat .ralph-state/ralph.pid
ps -p "$(cat .ralph-state/ralph.pid)"
# If process doesn't exist, remove PID file
rm .ralph-state/ralph.pid
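If you hit stale PID files often, the manual steps above can be folded into one helper. This is a sketch under assumptions: `clear_stale_pid` is a hypothetical name, and `kill -0` only tells you whether you are allowed to signal the process, so a process owned by another user would look "dead" here.

```shell
PID_FILE="${PID_FILE:-.ralph-state/ralph.pid}"

# Removes the PID file only when the recorded process is no longer running.
clear_stale_pid() {
  local pid
  [ -f "$PID_FILE" ] || return 0
  pid="$(cat "$PID_FILE")"
  # kill -0 sends no signal; it only checks that the process can be signalled.
  if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
    echo "Ralph is still running (PID $pid)"
    return 1
  fi
  rm -f "$PID_FILE"
  echo "Removed stale PID file"
}
```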
Iterations failing repeatedly:
# Check consecutive errors
cat .ralph-state/status.yaml
# Review recent session output
cat ".ralph-state/sessions/$(ls -t .ralph-state/sessions | head -1)/output.txt"
# Check for transient vs. permanent errors
grep -E "API Error|rate limit" .ralph-state/logs/ralph.log
Beads sync issues:
# Force re-import from JSONL
bd sync --force
# Check for merge conflicts
git status .beads/beads.jsonl
Long operation stuck:
# Check heartbeat age
stat .ralph-state/operation-heartbeat
# If too old, may need manual intervention
# Kill stuck process and update state
Notifications not arriving:
# Test ntfy connectivity
curl -d "Test message" "https://ntfy.sh/${NTFY_TOPIC}"
# Check topic subscription
# Visit https://ntfy.sh/YOUR_TOPIC in browser
9. Appendix: File Templates
9.1 ralph.md Template
# Ralph Guidance Document
You are Ralph, an autonomous development agent. Your mission:
**complete ONE meaningful task per iteration**, then exit cleanly.
## Sacred Principles
1. [Your non-negotiable rule 1]
2. [Your non-negotiable rule 2]
3. ...
## Stopping Conditions (ONLY TWO)
1. **Goals Achieved**: All targets in goals.yaml met
2. **Manual Stop**: .ralph-state/STOP file exists
## Task Selection Process
1. Read goals.yaml for targets
2. Query `bd ready` for unblocked work
3. Select highest priority item
4. Complete ONE task thoroughly
5. Verify completion
6. Commit and notify
7. Exit for next iteration
## Verification Checklist
- [ ] Build succeeds
- [ ] Tests pass
- [ ] No regressions
- [ ] Changes committed
- [ ] Notification sent
9.2 goals.yaml Template
# Project Goals - Updated by Ralph
phase: 1
phase_status: "ACTIVE"
targets:
  target_1:
    description: "What this target measures"
    current: 0
    target: 100
    status: "BELOW_TARGET"
    last_measured: "YYYY-MM-DD"
  target_2:
    description: "Another metric"
    current: false
    target: true
    status: "BELOW_TARGET"
completion_criteria:
  description: "When is this phase complete"
  rules:
    - "All targets at or above goals"
    - "Zero failing tests"
    - "Documentation updated"
9.3 status.yaml Template
# Ralph Status - Auto-updated each iteration
iteration: 0
consecutive_errors: 0
consecutive_successes: 0
goals_stable_count: 0
last_run: ""
total_sessions: 0
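Since this file is flat `key: value` YAML, the loop script can update the integer counters with plain `sed` and no YAML parser. The sketch below is one way to do it. The function names are hypothetical, the path is the one used in the troubleshooting section, and resetting `consecutive_successes` on an error is my assumption about what that counter means.

```shell
STATUS_FILE="${STATUS_FILE:-.ralph-state/status.yaml}"

# Print the integer stored under a top-level key, e.g. get_status iteration
get_status() {
  sed -n "s/^$1: *\([0-9][0-9]*\) *$/\1/p" "$STATUS_FILE"
}

# Overwrite a top-level key, going through a temp file so a failed
# sed run never truncates the real status file.
set_status() {
  local key="$1" value="$2" tmp
  tmp="$(mktemp)"
  sed "s/^$key: .*/$key: $value/" "$STATUS_FILE" > "$tmp" && cat "$tmp" > "$STATUS_FILE"
  rm -f "$tmp"
}

# Call after an iteration that completed and verified cleanly.
record_success() {
  set_status iteration "$(( $(get_status iteration) + 1 ))"
  set_status consecutive_errors 0
  set_status consecutive_successes "$(( $(get_status consecutive_successes) + 1 ))"
}

# Call after a permanent (non-transient) error.
record_error() {
  set_status consecutive_errors "$(( $(get_status consecutive_errors) + 1 ))"
  set_status consecutive_successes 0
}
```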
9.4 .gitignore Additions
# Ralph state (selective)
.ralph-state/logs/
.ralph-state/sessions/
.ralph-state/ralph.pid
.ralph-state/STOP
# Beads cache (source of truth is beads.jsonl)
.beads/*.db
Changelog
| Date | Version | Changes |
|---|---|---|
| 2025-11 | 1.0 | Initial release with Beads, Skills, Control Scripts, Notifications |
Acknowledgments
- Original Ralph concept: ghuntley.com/ralph
- Beads issue tracker: github.com/steveyegge/beads
- ntfy.sh notification service: ntfy.sh
- Claude Code Skills: claude.ai/code