Codex /goal Command: OpenAI's Built-in Ralph Loop for Autonomous Coding (2026 Deep Dive)
The new /goal command in Codex CLI 0.128.0 turns OpenAI's coding agent into a built-in Ralph loop. Setup, prompts, token budgets, slash commands, and how it compares to bash Ralph loops, Claude Code, Cursor, and Copilot.
On April 30, 2026, OpenAI shipped Codex CLI 0.128.0. Buried in the release notes, under the deceptively flat phrase "Added persisted /goal workflows with app-server APIs, model tools, runtime continuation, and TUI controls for create, pause, resume, and clear," OpenAI did something the autonomous coding community has been doing in bash scripts for almost two years: it shipped a built-in Ralph loop. Greg Brockman summarized it in the most efficient possible way on X: "codex now has a built in Ralph loop++."
This is not a small change. The Ralph loop — the pattern where an AI coding agent runs in a tight, persistent cycle until a goal is met — has been the dominant production pattern for serious agentic coding work since Geoffrey Huntley's "everything is a ralph loop" post crystallized it. Until last week, every team running a Ralph loop on Codex was running it with their own bash scripts, their own iteration counters, their own DONE/BLOCKED file conventions, and their own token budget guardrails. Now Codex itself ships with a first-class /goal slash command that handles all of that internally, with model-side audit logic baked into the prompt itself.
If you are searching for codex /goal command, codex CLI ralph loop, openai autonomous coding agent, codex 0.128.0, codex goals feature, how to use /goal in codex, codex CLI tutorial 2026, agentic coding with codex, codex vs claude code, or codex CLI autonomous coding loop, this is the long-form, no-fluff teardown. We will cover what /goal actually does, the exact prompts OpenAI injects, the configuration flags, the slash command surface, the goal lifecycle states, the failure modes already filed in OpenAI's issue tracker, and — most importantly — when you should use /goal versus when you should still hand-roll a bash Ralph loop or reach for Claude Code skills instead.
If you have been running Ralph loops for a while, the short version is: /goal is real, it is good, and it changes how you should think about agent supervision. If you have not been running Ralph loops, this is your invitation to stop micromanaging your AI and start writing objectives.
Table of Contents
1. The Three-Sentence Summary
2. What is a Ralph Loop, in Honest Engineering Terms
3. The Bash-Loop Era: How We Got Here
4. What /goal Actually Ships
5. Enabling /goal: The Config Flag You Will Miss
6. The Slash Command Surface: /goal, /goal pause, /goal resume, /goal clear
7. Goal Lifecycle States: pursuing, paused, achieved, unmet, budget-limited
8. The Two Prompts That Make It Work: continuation.md and budget_limit.md
9. The update_goal Model Tool
10. /goal vs a Hand-Rolled Bash Ralph Loop
11. /goal vs Claude Code Skills
12. /goal vs Cursor Agent and GitHub Copilot CLI
13. When to Use /goal (and When to Skip It)
14. Writing Goal Objectives That Actually Work
15. Token Budget Design: Why Budget Beats Iteration Count
16. Audit-First Completion: The "Do Not Accept Proxy Signals" Rule
17. Production Patterns: Combining /goal with AGENTS.md, MCPs, and Skills
18. Known Issues in 0.128.0 (and What to Do About Them)
19. Scaling /goal Across a Team
20. The Future: Weaving Loom, Evolutionary Code, and the End of the Bash Wrapper
21. Frequently Asked Questions
22. Where to Go Next

1. The Three-Sentence Summary
Codex CLI 0.128.0 adds a /goal <objective> slash command that keeps the agent looping until the objective is achieved or a configured token budget is exhausted. The loop is driven by two prompts (continuation.md and budget_limit.md) that OpenAI injects automatically at the end of each turn, plus an update_goal model tool the agent uses to mark status. You enable it by setting goals = true in the [features] section of ~/.codex/config.toml, and you can pause, resume, or clear active goals through subcommands.
That is the whole thing in one paragraph. Everything below is the texture, the trade-offs, and the production playbook.
2. What is a Ralph Loop, in Honest Engineering Terms
The Ralph loop, named for Ralph Wiggum from The Simpsons, is the most-misunderstood pattern in agentic AI. People think it is a joke — and the name is a joke — but the pattern itself is dead serious. A Ralph loop is the recognition that LLM coding agents have three stable failure modes that compound the longer a single conversation runs: context rot (the model loses precision as the context window fills with intermediate state), goal drift (the model starts solving a slightly different problem than the one you asked about), and proxy-signal collapse (the model declares success when it has only met a surface indicator like "tests pass," not the actual underlying requirement).
The Ralph loop solves all three by doing something obvious in retrospect: instead of one long conversation, you run many short conversations, each with a fresh context, against a persistent on-disk workspace. Each iteration reads the current state of the code, the test results, the task file, and any progress notes; takes one bounded action; writes its output to the workspace; and exits. A wrapper script (or, now, the Codex runtime) checks an exit condition — a DONE file, a BLOCKED file, a max-iteration cap, or a token budget — and either re-invokes the agent for another turn or stops.
It works because the agent is genuinely ignorant, persistent, and optimistic (Geoffrey Huntley's three-word summary). It does not remember its previous frustrations. It does not get tired. It does not start cutting corners after the third failed attempt. Each iteration begins with the same energy and the same clean context as the first. The loop, not the model, holds the long-term memory — and that memory lives in files you can grep, diff, and audit.
If you have not internalized this distinction, internalize it now: the Ralph loop's intelligence is in the loop, not in the agent. The agent is fungible. The loop is what makes it autonomous. This is exactly what /goal codifies.
3. The Bash-Loop Era: How We Got Here
Before /goal, every serious Codex Ralph loop looked roughly like this:
```bash
#!/usr/bin/env bash
set -euo pipefail

MAX_ITERS="${MAX_ITERS:-12}"
TASK_FILE="task.md"
DONE_FILE=".ralph/DONE"
BLOCKED_FILE=".ralph/BLOCKED"
LOG_DIR=".ralph/logs"

mkdir -p "$LOG_DIR" .ralph

for ((i = 1; i <= MAX_ITERS; i++)); do
  ts="$(date -u +%Y%m%dT%H%M%SZ)"
  echo "==> iteration $i ($ts)"

  # One bounded agent turn with a fresh context; the prompt points the agent
  # at the persistent on-disk state instead of relying on chat memory.
  codex exec \
    --skip-git-repo-check \
    -s workspace-write \
    --model gpt-5.2 \
    -- "$(cat <<EOF
You are running iteration $i of a Ralph loop.
Read $TASK_FILE for the goal and constraints.
Read .ralph/notes.md for prior-iteration context (if it exists).
Inspect the current state of the workspace.
Take one bounded action toward the goal.
Append a short summary of what you did to .ralph/notes.md.
If the goal is now complete, write a one-paragraph summary to $DONE_FILE.
If you are blocked and need user input, write the blocker to $BLOCKED_FILE.
Otherwise, do nothing special — the loop will re-invoke you.
EOF
)" 2>&1 | tee "$LOG_DIR/iter-$i.log"

  # Exit conditions: the agent signals through files, the wrapper decides.
  if [[ -f "$DONE_FILE" ]]; then
    echo "==> DONE on iteration $i"
    exit 0
  fi
  if [[ -f "$BLOCKED_FILE" ]]; then
    echo "==> BLOCKED on iteration $i — see $BLOCKED_FILE"
    exit 2
  fi

  # Snapshot every iteration so each turn is individually auditable.
  git add -A && git commit -m "ralph: iteration $i" --allow-empty || true
done

echo "==> MAX_ITERS reached without DONE"
exit 3
```
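Running it is the whole interface — assuming the script above is saved as ralph.sh:

```bash
chmod +x ralph.sh
MAX_ITERS=20 ./ralph.sh   # raise the iteration cap per run via the environment
```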
Variants of this script have been shipped by d4b's Ralph Loops with Codex guide, breezewish's CodexPotter, iannuttall/ralph, nsoderberg/ralph-codex, frankbria/ralph-claude-code, and probably a hundred internal Slack snippets. The pattern is right. The implementation is fragile. Every team writes their own version, gets the iteration cap wrong on their first run, forgets to commit between turns, ships a different DONE-file convention, and eventually builds a small tower of bash to handle the edge cases.
/goal is OpenAI's bet that this entire layer should be in the runtime, not in user-space bash.
4. What /goal Actually Ships
The 0.128.0 release notes describe /goal as "persisted /goal workflows with app-server APIs, model tools, runtime continuation, and TUI controls for create, pause, resume, and clear." That is dense. Unpacking it term by term:
- Persisted — A goal is a first-class object stored in the Codex app-server's state, not a transient prompt. It survives across turns, across compactions, and across /clear operations on the conversation.
- App-server APIs — Goals are addressable from the Codex app-server, which means external tooling (TUIs, IDE integrations, future remote control) can create, query, pause, and resume goals through a stable surface, not by scraping stdout.
- Model tools — The agent has a structured update_goal tool it calls to mark progress (pursuing, paused, achieved, unmet, budget-limited). The model does not just print text; it emits a tool call that the runtime parses.
- Runtime continuation — At the end of each turn, if the goal is not yet achieved and budget remains, the runtime injects the continuation.md prompt and re-invokes the model. This is the actual Ralph loop, but inside the Codex process instead of a wrapping bash script.
- TUI controls — The interactive Codex terminal UI exposes the goal as a visible, manipulable object. You can see goal status, pause it, resume it, or clear it without leaving the session.
The net effect: you type /goal fix the flaky search test and Codex keeps working until either the goal is genuinely achieved (per the audit logic, more on that below) or the token budget is hit. You do not write a wrapper script. You do not poll. You do not babysit.
5. Enabling /goal: The Config Flag You Will Miss
In 0.128.0, /goal is gated behind a feature flag. If you install Codex CLI 0.128.0 today and type /goal something, you will likely get nothing — the slash command will not even be recognized as a known command. This is exactly the failure reported in openai/codex issue #20591 (/goal slash command does not work in 0.128.0). The fix is to add the goals feature flag to your Codex config.
Open ~/.codex/config.toml (create it if it does not exist) and add:
```toml
[features]
goals = true
```

That is the entire opt-in. Save the file, restart your Codex session, and /goal becomes a recognized slash command with full subcommand support. There is no per-project override; this is global to your Codex install. If you are running Codex CLI on a server, set the flag in the deploy user's home directory, not yours.
A note on why the flag exists: as of the 0.128.0 release, /goal is shipped but its lifecycle docs are still in progress, and OpenAI has visible follow-up issues like #19910 (active goal continuation prompt and audit requirements can be lost after mid-turn compaction). Gating the feature behind a flag lets OpenAI ship the runtime without committing to the surface area as stable. If you are on 0.128.0 and /goal works for you, treat it as a strong beta. By 0.130 or so it will probably be on by default.
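If you manage dotfiles or provision servers, the opt-in scripts cleanly. A minimal sketch — it assumes config.toml does not already contain a [features] table, in which case merge by hand instead:

```bash
# Idempotently append the goals feature flag to the Codex config.
mkdir -p ~/.codex
grep -qs '^goals = true' ~/.codex/config.toml || \
  printf '\n[features]\ngoals = true\n' >> ~/.codex/config.toml
```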
6. The Slash Command Surface: /goal, /goal pause, /goal resume, /goal clear
Once enabled, /goal exposes four subcommands. They are deliberately small.
/goal <objective>
Sets the active goal for the current thread. The objective is free-form natural language. Examples:
/goal migrate src/legacy/auth.ts from callback style to async/await without breaking any callers
/goal eliminate every TypeScript error in src/components/dashboard/, do not use ts-ignore or as any
/goal add a Stripe webhook handler at src/api/webhooks/stripe.ts that handles checkout.session.completed and customer.subscription.deleted, with signature verification, idempotency keys, and integration tests
/goal achieve 90% mutation test score on src/lib/billing/ using Stryker

The objective should read like a contract, not a wish. Treat the rules from our iterative prompting guide and the 47 AI coding prompts collection as your style guide for objective wording.
/goal pause
Suspends the active goal. The continuation prompt stops being injected at the end of each turn. The goal object is preserved with status paused and can be resumed later with the full audit context intact. Use this when you want to take manual control mid-run — for example, to investigate something the agent surfaced — without throwing away the goal.
/goal resume
Reactivates a paused goal. The next turn re-invokes the agent with the goal's continuation prompt as if pause had never happened. This is your "ok, keep going" command after a manual interruption.
/goal clear
Deletes the active goal entirely. The agent goes back to behaving like a normal Codex chat — single-turn responses, no continuation prompt. Use this when the goal is genuinely no longer relevant, not when you just want to take a break (use pause for that).
There is no /goal status subcommand documented in 0.128.0, but the TUI surfaces the active goal's status in the chrome of the terminal UI, so you do not need one in interactive mode. For headless or scripted use you would query the app-server API directly.
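Putting the four subcommands together, a typical interactive run looks like this (the slash commands are real; the interleaved annotations are illustrative):

```
/goal eliminate every TypeScript error in src/components/dashboard/, do not use ts-ignore or as any
    ... agent works turn after turn; the TUI shows the goal as pursuing ...
/goal pause      # take manual control to poke at something the agent surfaced
/goal resume     # hand control back; continuation picks up where it left off
    ... agent's audit passes; it calls update_goal and the goal shows achieved ...
```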
7. Goal Lifecycle States: pursuing, paused, achieved, unmet, budget-limited
A goal in Codex 0.128.0 moves through five named states. Understanding these states is the difference between treating /goal as a black box and treating it as a programmable runtime.
pursuing
The default state immediately after /goal <objective> is set. The agent is actively working on it. Each turn, the runtime checks the goal status; if it is pursuing, the continuation prompt is injected and the agent gets another turn. This is the steady-state working condition.
paused
Manually entered via /goal pause. Continuation prompt injection stops. The goal object remains in the thread state with all its audit history intact. Resumes back to pursuing on /goal resume.
achieved
The agent called update_goal with status complete and the runtime accepted it. Continuation stops. This is the success terminal state. Critically, per the continuation.md prompt OpenAI ships, the agent is explicitly instructed not to call this until it has performed a real audit — file inspection, test execution, requirement-by-requirement verification — and confirmed actual completion. "Tests pass" is not enough. "I implemented the function" is not enough. The audit must show the original objective is genuinely satisfied.
unmet
The agent has determined the goal cannot be achieved in the current state — usually because of a blocker that requires user input, an external dependency that is not available, or a conflict between the objective and the workspace constraints. The agent is supposed to explain the blocker and stop. Think of this as the equivalent of writing a BLOCKED file in a hand-rolled Ralph loop.
budget-limited
The token budget configured for the goal has been exhausted. The runtime injects the budget_limit.md prompt instead of continuation.md for the final turn, instructing the agent to wrap up gracefully — summarize progress, identify remaining work, leave the user with a concrete next step — and not to start new substantive work. Importantly, the agent is also told not to mark the goal complete falsely just because budget is exhausted. This is the equivalent of hitting MAX_ITERS in a bash loop, but with a structured graceful exit instead of a hard exit 3.
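Put together, the runtime's end-of-turn decision reduces to a small state machine. Here is a pseudocode sketch of that cycle as described above — an illustration, not the actual codex-rs implementation; goal_status, tokens_used, run_turn, render_template, and set_goal_status are hypothetical helpers, and TOKEN_BUDGET is assumed set elsewhere:

```bash
while :; do
  status="$(goal_status)"                    # hypothetical: pursuing|paused|achieved|unmet
  [[ "$status" == "pursuing" ]] || break     # paused and terminal states stop prompt injection

  if (( $(tokens_used) >= TOKEN_BUDGET )); then
    run_turn "$(render_template budget_limit.md)"   # one final, graceful wrap-up turn
    set_goal_status "budget-limited"                # hypothetical helper
    break
  fi

  run_turn "$(render_template continuation.md)"     # ordinary continuation turn
done
```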
8. The Two Prompts That Make It Work: continuation.md and budget_limit.md
The cleverness of /goal is not the slash command surface. It is the two prompt templates OpenAI ships in codex-rs/core/templates/goals/. These are the prompts the runtime injects automatically; they are the actual policy that makes Codex behave like a Ralph loop.
continuation.md — injected at the end of every turn while the goal is pursuing
The prompt is structured as a directive plus an audit protocol. The directive is straightforward:
Continue working toward the active thread goal.
The audit protocol is where the engineering lives. The prompt instructs the agent to:
- Restate the objective as concrete deliverables. Take the original natural-language goal and decompose it into a checklist of testable requirements.
- Build a checklist mapping requirements to evidence. For each deliverable, identify what would constitute evidence of completion — a file existing, a test passing, a function having a specific signature, a behavior being observable.
- Inspect actual files, outputs, and test results. Do not infer. Read the file. Run the test. Look at the output.
- Verify coverage comprehensively before declaring success. Every requirement must be checked.
- Do not accept proxy signals as completion by themselves.
The prompt explicitly warns the model that passing tests, completed implementation effort, or partial progress are not, on their own, proof of completion. Only genuine requirement fulfillment counts. This is OpenAI hard-coding the lesson every Ralph loop user learned the hard way: agents will mark themselves done after writing a function whether or not the function actually does what was asked. The continuation.md prompt is a system-level inoculation against that failure mode.
The prompt closes with a procedural rule: only call update_goal with status complete when the audit confirms actual achievement. If budget exhausts before completion, do not falsely mark the goal as done. If blockers arise, explain them and await new input — that is, transition to unmet — rather than prematurely terminating the work.
This single prompt is the most opinionated piece of agent design OpenAI has shipped in a public-facing product. It is essentially OpenAI saying: here is how a coding agent should think about completion.
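Concretely, an audit turn that follows this protocol for the earlier async/await migration example would gather direct evidence rather than inferring it — something like the following (commands and paths illustrative):

```bash
grep -rn "callback" src/legacy/auth.ts   # deliverable: no callback-style code remains
npx tsc --noEmit                         # deliverable: the typechecker is green
npm test                                 # deliverable: existing callers still pass
git diff --stat main                     # scope check: only the intended files changed
```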
budget_limit.md — injected on the final turn when the goal hits its token budget
Where continuation.md is the working prompt, budget_limit.md is the graceful exit prompt. It is structured around the directive:
The active thread goal has reached its token budget.
The prompt has placeholder variables for {{ objective }}, {{ time_used_seconds }}, {{ tokens_used }}, and {{ token_budget }} so the model has full context on the resource situation. The primary directive is also explicit:
Do not start new substantive work for this goal. Wrap up this turn soon: summarize useful progress, identify remaining work or blockers, and leave the user with a clear next step.
And the matching guardrail:
Do not call update_goal unless the goal is actually complete.
Read those two prompts together and you can see the design clearly: continuation.md says keep going, but do not lie about being done. budget_limit.md says stop, but do not lie about being done. The agent has two failure modes the runtime cares about — false completion and runaway spend — and the two prompts close both doors.
9. The update_goal Model Tool
The agent does not signal goal completion by writing English. It calls a structured tool — update_goal — that the runtime parses. The tool takes a status (pursuing, paused, achieved, unmet, budget-limited per the documented state set) and presumably an explanation field. This matters for two reasons.
First, it means goal state transitions are auditable. You can log every update_goal call, see the agent's reasoning at the moment it claimed completion, and replay the decision later. If the agent marked a goal achieved when it was not actually complete, you have a clean record of the failure to feed back into prompt tuning or evaluation suites.
Second, it means external tooling can react to state changes. A CI pipeline can wait on goal completion. A monitoring dashboard can show which goals are pursuing, paused, achieved, or unmet across a fleet of Codex sessions. A team lead can get a Slack notification when a long-running goal finally hits achieved. Because the state lives in the app-server, not in a parsed log file, all of that is API-addressable.
This is the part of /goal that bash Ralph loops genuinely cannot match. A bash loop's state is whatever convention you adopted — DONE files, exit codes, log greps. None of it is structured. /goal makes goal state a real object.
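As a sketch of what that enables: a monitoring shim can poll goal state and notify on terminal transitions. The goal_status helper below is hypothetical — it stands in for whatever app-server query your tooling exposes, not a shipped codex subcommand:

```bash
# Poll goal state every 30s; notify Slack once the goal reaches a terminal state.
while sleep 30; do
  s="$(goal_status)"   # hypothetical: prints pursuing|paused|achieved|unmet|budget-limited
  case "$s" in
    achieved|unmet|budget-limited)
      curl -sS -X POST "$SLACK_WEBHOOK_URL" \
        -H 'Content-Type: application/json' \
        -d "{\"text\": \"Codex goal reached terminal state: ${s}\"}"
      break ;;
  esac
done
```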
10. /goal vs a Hand-Rolled Bash Ralph Loop
You will keep both in your toolkit. They are not redundant. Here is the comparison that matters.
| Dimension | Hand-rolled bash Ralph loop | Codex /goal |
|---|---|---|
| Setup cost | Write a bash script (50-150 lines) | Set one config flag |
| Cross-tool portability | Works with any agent (Codex, Claude Code, gemini-cli, custom) | Codex CLI only |
| Per-iteration overhead | One process spawn per iteration | In-process, no spawn |
| Context handling | Fresh context per iteration (by design) | Continuation prompt extends context across turns; runtime may compact mid-goal |
| Completion logic | Whatever your script checks for (DONE file, test pass, etc.) | Model-side audit per continuation.md prompt |
| Budget control | MAX_ITERS cap on iteration count | Token budget enforced at runtime |
| Pause/resume | Not supported (you ctrl-C and lose state) | First-class /goal pause, /goal resume |
| Auditability | Per-iteration git commits, per-iteration logs | update_goal tool call history, per-turn structured state |
| Multi-agent | Trivial (your script can call any binary) | Single-agent only |
| Graceful budget exit | Hard exit on MAX_ITERS | budget_limit.md prompt for clean wrap-up |
| Drift across turns | Zero (each iteration is independent) | Possible — issue #19910 reports continuation prompt loss after compaction |
| Cost overhead | Process spawn cost per iteration | Minimal |
Use /goal for tasks that fit cleanly inside Codex's continuation model — bounded scope, well-defined success criteria, single agent, where the slight risk of mid-turn compaction loss is acceptable. Use a bash Ralph loop when you need agent-agnostic execution, multi-agent reviewer setups (e.g., generator agent plus reviewer agent), absolute context isolation between turns, or per-iteration git diff auditability for production systems where a human will review every commit.
In practice, most teams will use /goal for everyday tasks and keep the bash loop in their back pocket for the gnarly ones. Some — especially those building cross-tool Ralph wrappers for Claude Code and Codex simultaneously — will keep the bash loop as the default precisely because it is portable.
11. /goal vs Claude Code Skills
This comparison is more interesting because the two products are genuinely close in spirit but architecturally distinct.
Claude Code skills are reusable, invokable units of agent behavior — atomic prompts plus tool permissions plus context — that you trigger explicitly. A skill is a verb you can call: /refactor-to-async, /migrate-to-typescript, /add-stripe-webhook. Claude Code itself runs a single conversation; the skill defines what the agent does inside that conversation. Iteration, when needed, is handled either by the skill prompt itself (instructing Claude to iterate until criteria are met) or by an external Ralph loop wrapping the claude invocation.
/goal in Codex is closer to a runtime mode. You set a goal once, and the runtime keeps the agent in a continuation cycle until the goal is met or budget is hit. There is no skill library; there is no reusable invocable unit. The goal is the unit.
Practically:
- For repeated, named workflows you run across many projects: Claude Code skills win. You invoke them by name and they encode your team's conventions.
- For one-off, project-specific objectives that just need to get done: /goal wins. You write the objective in plain language and it runs.
- For atomic-prompt discipline (one prompt does one thing): Claude Code skills win, because skills are designed to be small.
- For long, multi-step objectives that span dozens of files and need many turns: /goal wins, because the runtime handles the continuation natively.
- For team-wide standardization: Claude Code skills win, because skills are versioned files in your repo.
- For experimentation and one-shot ad-hoc work: /goal wins, because there is zero ceremony.
For most teams the answer is both: skills for the named workflows they run repeatedly, /goal for the operations they need to invent on the spot.
12. /goal vs Cursor Agent and GitHub Copilot CLI
Cursor's agent mode and GitHub Copilot CLI both have their own takes on autonomous loops, but neither is a Ralph loop in the same sense. Cursor's agent runs inside the IDE and is heavily oriented around in-context reasoning with diff previews; it is not designed to run unattended for tens of minutes against a token budget. Copilot CLI has agentic capabilities and there are public reports of Ralph loops on Copilot CLI, but again the loop is user-implemented, not first-party.
If you want first-party autonomous looping with explicit goal tracking and budget control, Codex /goal is currently the only game in town. Anthropic will almost certainly ship something analogous in Claude Code at some point — the pattern is too obviously correct to ignore — but as of May 2026 it does not exist as a built-in.
For a richer side-by-side of the major coding agents, see Cursor vs GitHub Copilot 2026 and the Claude Code hub. For the meta-question of which agent to adopt for which kind of work, the hub on alternatives covers the trade-offs in detail.
13. When to Use /goal (and When to Skip It)
The temptation with any new autonomous-loop feature is to wrap every task in it. Resist. Most tasks do not need a loop at all. The right mental model is:
Use /goal when:
- The task is multi-step but deterministic. You can articulate what "done" looks like as a checklist, even if the path to get there involves dozens of edits.
- The task is bounded to a known repository area. "Refactor src/auth/ to use async/await" is bounded. "Improve the codebase" is not.
- The task is easy to validate with commands. Tests, linters, type checkers, and integration suites are your friends. If completion can be verified by running a command, the audit logic in continuation.md will work for you.
- The task is likely to need more than three turns. Below three turns, a single chat usually wins because there is no real loop benefit.
- You are OK with budget-bounded autonomy. You can set a token budget you are comfortable with the agent spending unattended. This is the production guardrail.
Skip /goal when:
- The task fits in one chat turn. Don't loop a one-shot.
- The task is open-ended exploration with no clear success criteria. "Figure out what's wrong with our checkout flow" is exploration; you want a human in the loop on every step.
- The task involves decisions a human must make. Architecture choices, naming, product calls — these need you, not a loop.
- The task is high-stakes irreversible. Production migrations, mass deletes, anything you cannot easily roll back. Do these manually with maximum oversight, or use a hand-rolled bash loop with per-iteration git commits and a human review gate between iterations.
- The task is subjective. "Make the UI feel snappier." Subjective targets break audit logic; the agent will declare victory the moment its own taste is satisfied, which is rarely the moment yours is.
The litmus test: if you can write down what "done" looks like as a verifiable checklist, you can hand the task to /goal. If you cannot, you probably want a chat, not a loop.
14. Writing Goal Objectives That Actually Work
The objective you pass to /goal <objective> is doing more work than you think. It is the input to the audit protocol in continuation.md. The agent will decompose your objective into a checklist of deliverables, then iterate until the checklist passes. If your objective is vague, the checklist will be vague, the audit will be vague, and the goal will be marked achieved based on a vague satisfaction. Garbage in, garbage out, autonomously.
A good objective has four properties. Borrowed from our stop-asking-AI-to-think framework and the explicit pass/fail criteria principle, they are:
1. A scoped target
Name the file, directory, module, function, or system you are operating on. "Refactor the auth code" is bad. "Refactor src/auth/login.ts and src/auth/session.ts to async/await" is good.
2. A behavior contract
Describe what the system must do (or not do) when the goal is achieved. "Add error handling" is bad. "Every async function in src/api/ catches errors, logs them with traceId, and returns a 500 response with a stable error code from src/lib/errors.ts" is good.
3. Explicit non-goals
Tell the agent what not to change. "Migrate to TypeScript" is bad because the agent will rewrite half your codebase. "Add TypeScript to src/utils/, do not change the JavaScript runtime behavior, do not modify src/api/" is good.
4. A verification path
Tell the agent how it can verify success. This is the audit protocol's input. "Done means: npm run typecheck passes with zero errors, npm run test passes the new tests in src/utils/__tests__/, and a sample import from src/api/ to src/utils/ works without type errors" gives the agent a checklist it can actually verify.
A worked example. Bad:
/goal fix the slow dashboard

Better:
/goal Reduce p95 render time of src/pages/dashboard.tsx from the current 1800ms to under 500ms.
Constraints: do not change the API contract of any component exported from src/components/dashboard/.
Verification: run scripts/perf-dashboard.ts after the change; the reported p95 must be under 500ms.
Do not memoize anything in src/lib/ — confine changes to src/pages/dashboard.tsx and src/components/dashboard/.

The second version gives continuation.md a real checklist: scoped to one page, with one constraint, one quantitative target, one verification command, and one explicit non-goal. The audit will be sharp. The completion call will be honest.
15. Token Budget Design: Why Budget Beats Iteration Count
Bash Ralph loops bound execution by iteration count: MAX_ITERS=12. /goal bounds execution by token budget. This is a meaningful design choice.
Iteration count is a proxy. What you actually care about is how much money the agent is allowed to spend. A 12-iteration loop where each iteration uses 50k tokens spends very differently than a 12-iteration loop where each iteration uses 500k tokens. The same is true of /goal: an iteration that needs to read a 200kB file is genuinely more expensive than an iteration that just edits a function. Iteration count obscures this; token budget exposes it.
A practical budget framework:
- Small targeted goals (single file, well-bounded): 100k–500k tokens. Enough for the agent to read the file, make a change, run a test, and verify. Two to four turns at most.
- Medium goals (multi-file, single concern): 500k–2M tokens. Refactors, feature additions, integration test suites.
- Large goals (cross-cutting, many files): 2M–10M tokens. Migrations, large refactors, architecture changes. Be prepared to budget aggressively and check in.
- Open-ended goals: Do not use /goal for these. The right tool is a human in a chat.
When a goal exhausts its budget and budget_limit.md fires, you get a structured wrap-up: progress summary, remaining work, next-step recommendation. That is exactly the report you want before deciding whether to bump the budget and resume, or pivot, or bring a human in. A bash Ralph loop's MAX_ITERS exit gives you no such structured handoff.
The corollary: log the budget consumption per goal. Over a few weeks you will develop a strong intuition for what kind of objective consumes what kind of budget, and you will start writing more cost-aware objectives. This is one of the highest-leverage habits a team running autonomous AI workflows can build.
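A lightweight way to build that habit is a per-goal ledger. The goal_summary helper here is hypothetical — substitute however your tooling reads the app-server — and GOAL_SLUG is just a label you pick per objective:

```bash
# Append one row per finished goal: date, label, final status, tokens used, budget.
read -r status used budget < <(goal_summary)   # hypothetical helper
printf '%s,%s,%s,%s,%s\n' \
  "$(date -u +%F)" "$GOAL_SLUG" "$status" "$used" "$budget" >> goals-ledger.csv
```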
16. Audit-First Completion: The "Do Not Accept Proxy Signals" Rule
The single most important sentence in the entire /goal design is, paraphrased from continuation.md: do not accept proxy signals as completion by themselves. This is OpenAI codifying the deepest failure mode of LLM agents.
A proxy signal is something that correlates with completion but is not completion. Examples:
- The implementation file exists. (But does it do what was asked?)
- The tests pass. (But do they actually exercise the new behavior, or do they cover the old behavior with the new code stubbed out?)
- The TypeScript checker is green. (But are the types meaningful, or did the agent escape-hatch with as any?)
- The function returns the right value for the example input. (But what about the edge cases the objective implied?)
- The build succeeds. (But does the built artifact actually run?)
continuation.md is OpenAI's attempt to break this collapse at the prompt level. It tells the agent: build the audit checklist from the original objective, not from the implementation; verify each item directly; do not call update_goal complete until every item passes audit, not just every item passes proxy.
In practice this means goals that have explicit, verifiable deliverables work much better than goals with implicit ones. "Add a function validateEmail(s: string): boolean that returns true for RFC 5322-compliant addresses and false otherwise" gives the audit a clear target. "Improve email validation" invites proxy collapse.
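To make the contrast concrete for that validateEmail objective, here is proxy evidence versus direct evidence (the module path is illustrative):

```bash
# Proxy signal: the suite is green — necessary, not sufficient.
npm test
# Direct evidence: observe the behavior the objective actually named.
node -e 'const { validateEmail } = require("./src/lib/email");
console.log(validateEmail("user@example.com"));  // expect true
console.log(validateEmail("not an email"));      // expect false'
```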
When you see /goal mark something achieved and you check and find it is not actually done, the failure is almost always in the objective, not the audit. Tighten the objective; the audit will follow.
17. Production Patterns: Combining /goal with AGENTS.md, MCPs, and Skills
/goal is the loop. The agent inside the loop still has the full context of your project: AGENTS.md (Codex's per-project instructions file, analogous to CLAUDE.md), any MCP servers you have wired up, and any skills-style atomic prompts you keep in your repo. Production teams will combine these layers deliberately.
AGENTS.md as the always-on context
Your AGENTS.md should cover the things the agent must know on every turn: tech stack, build commands, test commands, linting rules, branch naming conventions, where to put new files, what not to touch. The continuation prompt does not need to repeat any of this; the AGENTS.md is read every turn. A high-quality AGENTS.md is the single highest-leverage thing you can do to make /goal produce work that matches your team's standards.
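For shape, a skeletal AGENTS.md along those lines — written here as a heredoc so the sketch stays copy-pasteable; the contents are illustrative, not a mandated format:

```bash
cat > AGENTS.md <<'EOF'
Stack: TypeScript on Node 22, pnpm.
Commands: build with `pnpm build`, test with `pnpm test`, lint+types with `pnpm check`.
Conventions: new UI goes in src/components/<area>/; branches are feat/<ticket>-<slug>.
Do not touch: src/generated/ and infra/ are managed elsewhere — never edit them.
EOF
```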
MCPs as the permitted tools
If your goal requires access to your database, your monitoring system, your task tracker, or your design system, wire those up as MCP servers and grant the goal access. The agent will use them inside the loop. Without MCPs, the agent is constrained to filesystem and terminal; with them, it can query your real environment. For long-running goals that need real-world feedback (e.g., "add a feature flag and verify it's wired correctly in our LaunchDarkly project"), MCP access is the difference between speculation and verified work.
Goal objective as the per-task spec
Use the objective itself for things that change task to task: which files, which behavior, which non-goals, which verification command. Do not put framework-level conventions in the objective; those go in AGENTS.md. The objective is the job ticket. AGENTS.md is the company handbook.
Atomic prompt files for invokable sub-procedures
If your goal involves a step you do often — "after every change, run the full lint+typecheck+test suite and only proceed if it passes" — write that as an atomic prompt file in your repo (e.g., prompts/verify-clean.md) and reference it from your AGENTS.md. The agent will follow the same protocol every time. This is the atomic skills idea applied inside a /goal loop.
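A minimal version of that verify-clean protocol file might read as follows (wording illustrative; command names assumed from your own package scripts):

```bash
mkdir -p prompts
cat > prompts/verify-clean.md <<'EOF'
After every change, run: pnpm lint && pnpm typecheck && pnpm test.
If any step fails, fix the failure before doing anything else.
Never report a step as complete while these checks fail.
EOF
```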
The combined pattern: AGENTS.md sets the floor, the /goal objective sets the target, MCPs widen the agent's reach, and atomic prompt files standardize repeated sub-procedures. With all four layers in place, you are running a serious autonomous-coding setup, not a toy.
18. Known Issues in 0.128.0 (and What to Do About Them)
/goal is new. Treat 0.128.0 as a strong beta. Three known issues are worth flagging if you are putting this into production today.
Issue #20591: /goal slash command does not work
If /goal is not recognized in your session, the most likely cause is the goals = true feature flag is not set in ~/.codex/config.toml. See section 5 for the fix. There are also reports that some shell environments need a fresh Codex restart after the config change; if the flag is set and /goal is still unknown, fully exit and reinvoke Codex.
Issue #19910: active goal continuation prompt and audit requirements can be lost after mid-turn compaction
This is the more interesting one. Codex compacts long contexts mid-turn to fit the model's window. There are reports that the continuation prompt and audit requirements injected by continuation.md can be lost in the compaction, leaving the agent without its goal-completion guardrails for the affected turn. The practical implication: long-running goals that hit compaction may exhibit drift or false completion. Until OpenAI fixes this (which they will — the issue is acknowledged), prefer goals that are unlikely to need mid-turn compaction. That means smaller objectives, tighter token budgets, and breaking large goals into a sequence of smaller ones that you launch one after another. A 1M-token goal is much safer than a 10M-token goal in 0.128.0.
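Sequencing in practice just means issuing smaller objectives back to back in the TUI, waiting for each to reach achieved — for example, splitting the Stripe webhook goal from section 6 into stages (objectives illustrative):

```
/goal stage 1: add the webhook route at src/api/webhooks/stripe.ts with signature verification and tests
/goal stage 2: handle checkout.session.completed with idempotency keys and tests
/goal stage 3: handle customer.subscription.deleted and add the integration suite
```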
Issue #20536: documentation for /goal and the goals lifecycle is incomplete
The slash command is shipped but the user-facing docs are still in flight. Expect the official /help output and the developer docs to fill in over the next few releases. In the meantime, the canonical references are the prompt source files in the codex repo and Simon Willison's writeup.
The general advice: read the prompt source files. They are short, clear, and they are the actual policy. Anything you would learn from official docs is downstream of what those two prompts say.
19. Scaling /goal Across a Team
If you are the only person on your team using /goal, you can do whatever you want. If you are introducing it to a team, a few practices pay off.
Standardize AGENTS.md. Every repository where people run /goal should have an AGENTS.md that the team agrees on. Treat AGENTS.md as code: review it, version it, lint it (if you have an internal tool for that). Goals that run on top of inconsistent AGENTS.md files will produce inconsistent work.
Document your team's objective style. Write a short internal document — three pages, not thirty — covering how to write a /goal objective: scope, behavior contract, non-goals, verification path. Share it. Refer to it in code review when someone runs a goal that produced bad output.
Set token budget defaults. Pick a default token budget for your team based on the cost-per-feature you are willing to spend. Document it. Encourage people to override it consciously (up or down) rather than accidentally letting goals run unbounded.
Audit the audits. Periodically pick a goal that was marked achieved and verify by hand whether it actually was. If you find a gap between the agent's audit and reality, the gap usually traces back to objective wording. Tighten the templates.
Pair /goal with code review. Goals that touch shared code should still go through code review before merging. The agent's audit is a quality gate, not a substitute for human review of architectural and product decisions.
Build a goal library. Capture the objective wordings that worked well as templates. Over time you will accumulate a small library of "this is how we write a goal for adding a new API endpoint," "this is how we write a goal for fixing a flaky test," "this is how we write a goal for migrating a module." This library is the team's institutional knowledge of /goal discipline.
20. The Future: Weaving Loom, Evolutionary Code, and the End of the Bash Wrapper
Step back from the implementation details. What does /goal mean for the broader trajectory of AI coding?
It means OpenAI has officially endorsed the Ralph loop pattern. The thing the autonomous-coding community has been doing in bash for two years is now a first-party feature in the dominant CLI agent. That is a strong signal that the autonomous, goal-bounded, budget-capped, audit-gated loop is the correct mental model for serious agentic work. Not chat. Not single-shot generation. Not "agent platforms" with elaborate orchestration UIs. A loop with a goal, a budget, and an audit.
Geoffrey Huntley's framing of "everything is a Ralph loop" — and the broader vision he calls the weaving loom, where evolutionary software autonomously optimizes for a target signal — gets a lot more practical when the loop is built into the runtime. You no longer need a custom infrastructure layer to run Ralph loops at scale. You need Codex CLI, an objective, and a budget.
The natural extensions, which OpenAI will almost certainly ship over the next few releases:
- Multiple concurrent goals. Right now /goal is one active goal per thread. The obvious next step is multiple goals running in parallel against the same workspace, with the runtime arbitrating concurrency.
- Goal templates. Reusable goal definitions you invoke by name, the way you invoke a Claude Code skill. "/goal apply standard-api-endpoint resource=user fields=name,email" — a templated goal with parameter slots.
- Inter-goal dependencies. Goal B does not start until goal A is achieved. Pipelines of goals.
- Cross-thread goal continuation. A goal survives session restarts and can be resumed in a fresh terminal hours later.
- Multi-agent reviewer goals. A goal where one agent writes and a second agent reviews, with the loop terminating only when both agents are satisfied. This is the pattern OpenAI's own harness engineering work describes as the "Ralph Wiggum loop with reviewers."
- Goals as observable infrastructure. Goal status, budget consumption, and audit transitions exposed as metrics and traces. Datadog dashboards for your autonomous coding agents.
The through-line: the loop layer keeps absorbing what used to be user-space tooling. If you have been maintaining a bespoke wrapper for single-agent Codex work, /goal is now the right answer.
If you are building developer tools, this should reshape how you think about your product surface. The interesting unit is no longer "the prompt" or even "the tool call." It is "the goal" — an objective, a budget, an audit, and a runtime that keeps the agent honest while it gets there.
21. Frequently Asked Questions
Is /goal the same as just running Codex in a while true loop?
No. A naive shell loop has no completion logic — it will keep invoking Codex forever, or until you kill it. /goal has model-side audit logic that decides when the goal is genuinely complete, plus a token budget that bounds runaway spend, plus structured state transitions you can observe. A while true loop is close in spirit, but what you want is a real Ralph loop — and /goal is one.
Does /goal work without an internet connection?
No. Codex is a CLI that calls hosted OpenAI models; it requires network access. /goal is a runtime layer on top of that, not a local model.
Which model does /goal use?
Whatever model you have configured for Codex. The /goal runtime is model-agnostic in the sense that the continuation and budget prompts work with any model Codex supports. In practice, more capable models produce better audits and better completion judgments, so prefer the strongest model your budget allows.
Will /goal keep working if I close my terminal?
The goal object is persisted, but execution is tied to an active Codex session. Closing the terminal stops the agent. The goal will be in whatever state it was at the moment of close (likely pursuing or paused). When you reopen Codex, you can resume.
Can I use /goal headlessly in CI?
Yes — the codex exec invocation can carry a goal, and the runtime handles continuation in the same way as in interactive mode. This is one of the genuinely powerful production patterns: kick off a goal in CI for a long-running maintenance task (cleanup, migration, refactor) and let it run against your repo with an aggressive budget.
How does /goal handle git commits?
It does not, by default. The agent edits the workspace; you commit. If you want per-iteration commits for auditability, the cleanest pattern is to put a "commit a snapshot of the current change with a descriptive message every time you complete a meaningful step" instruction in your AGENTS.md. The agent will then issue git commits as it goes.
What happens if the agent's audit is wrong and it marks a goal achieved falsely?
The goal goes into the achieved state and the loop stops. You inspect the workspace, find that the work is not actually done, and either re-run with a tighter objective or take manual control. This is why the "audit the audits" practice in section 19 matters: catch and tighten the cases where the audit drifts from reality.
Can I cancel a goal mid-execution?
Yes. /goal pause suspends; /goal clear cancels entirely. You can also Ctrl-C the active turn, which stops the current execution but leaves the goal in pursuing state — the next turn would resume it.
How do I know how much budget I have left on a goal?
The TUI surfaces budget consumption. Programmatically, query the app-server API for the goal object. The budget_limit.md prompt also includes {{ tokens_used }} and {{ token_budget }} so the agent itself is aware of consumption on the final turn.
Does /goal work with custom models or self-hosted endpoints?
It works with whatever Codex is pointed at. If your Codex install is configured for a custom endpoint, /goal will use that endpoint. The continuation prompts are model-agnostic English, so any sufficiently capable instruction-following model should work, though the audit quality will track the model's reasoning quality.
Is /goal going to replace AGENTS.md?
No. They are layered. AGENTS.md is the always-on project context. /goal is the per-task autonomous loop. Both should be used together.
What's the difference between /goal and a Claude Code skill?
A skill is a reusable, named operation you invoke (/refactor-to-async). A /goal is a one-off objective the runtime keeps working on. Skills are vocabulary; goals are jobs. See section 11 for the fuller comparison.
Can /goal be used for non-coding work?
Codex is a coding agent. /goal is built on top of Codex. So in principle, yes, for any task Codex can do — which includes writing docs, generating reports, manipulating filesystem-based assets — but the audit logic in continuation.md is most powerful when completion can be verified by running a command. Coding is the sweet spot. For non-coding agentic work, see Ralph Loop Beyond Code for the broader pattern applied outside software engineering.
How is /goal priced?
Codex pricing applies normally — you pay for the tokens the agent uses. There is no separate /goal upcharge. The budget you set on a goal is the budget of tokens you are willing to spend; the runtime enforces it, but the spend itself is the same per-token rate as any other Codex usage.
Should I migrate my existing bash Ralph loops to /goal?
Some of them. The ones where you do not need cross-tool portability, multi-agent setups, or absolute context isolation between turns. The ones where /goal's built-in audit and budget logic genuinely replace your wrapper. For everything else — especially anything reviewer-based or anything that runs across both Codex and Claude Code — keep the bash loop. See section 10 for the decision framework.
Where can I see real /goal usage examples?
The early-adopter community is publishing examples on YouTube (search "codex /goal command" — there are good walkthroughs from the first week of the release) and in the openai/codex issue tracker. Simon Willison's writeup is the canonical first-day analysis. For more on the underlying Ralph loop pattern with worked examples, see our Ralph Loop methodology guide and 75+ Ralph Loop examples collection.
22. Where to Go Next
If you have read this far, you are serious about autonomous coding. The next moves:
- Enable the flag. Set goals = true in ~/.codex/config.toml, and run your first /goal against a low-stakes task. Pick something with a verifiable success criterion — "add type hints to every function in src/utils/format.ts so npm run typecheck passes with strict mode" — and let it run.
- Read the two prompts. continuation.md and budget_limit.md are short. They are the actual policy. Understanding them changes how you write objectives.
- Invest in AGENTS.md. A high-quality AGENTS.md is what makes /goal produce team-quality work. Cover stack, build, test, lint, branch conventions, and explicit don't-touch zones.
- Learn the underlying pattern. /goal makes a lot more sense once the underlying pattern is clear — the Ralph Loop methodology guide covers it.
- Then pick a real task, write a contract-grade objective, hand it to /goal, and run.

The Ralph loop stopped being a clever bash trick the moment OpenAI shipped /goal. It is now a first-class engineering primitive. Every team that ships software in 2026 should understand it, write objectives in its idiom, and use it where it fits. The teams that do will move at a different speed than the teams that don't.
The agents are ready. The runtime is ready. The pattern is named, shipped, and documented. The only remaining question is what objective you write first.
---
Related reading on Ralphable:
- The Ralph Loop: 75+ Examples of AI That Iterates Until Done — the full methodology, with worked examples
- 47 AI Coding Prompts That Actually Work — copy-paste templates you can wrap in /goal
- How to Write Prompts for Claude — the equivalent discipline on Anthropic's stack
- Ralph Loop Beyond Code — applying the pattern to non-coding agentic work
- Stop Letting AI Guess Intent: Explicit Pass/Fail Criteria — the discipline that makes /goal audits honest
- How to Structure Atomic Skills for Claude Code Autonomous Refactoring — the per-procedure layer that complements /goal
- Claude Code Tutorial: Zero to Autonomous Coding in 30 Minutes — for teams running both Codex and Claude Code
- Cursor vs GitHub Copilot 2026 — the IDE-native side of the agent landscape