Prompt Templates

Ralph Prompt: 75+ Copy-Paste Ready Templates for Self-Improving AI (2026)


Ralphable Team
(Updated March 21, 2026)
26 min read
ralph prompt, claude code, iterative prompting, ai prompts, self-improving prompts

Introduction: The End of "Good Enough" AI Output

Ralph prompts transform Anthropic's Claude, OpenAI's GPT-4, Cursor, and GitHub Copilot from one-shot generators into self-improving loops -- atomic tasks with binary pass/fail criteria produce verified, production-ready output every time.

For years, we've been trapped in a cycle of prompt-and-pray AI interactions. You write a prompt, cross your fingers, and hope the AI produces something useful. When it doesn't, you tweak the prompt, try again, and repeat this frustrating dance until you either settle for "good enough" or give up entirely. This traditional prompting approach has a fundamental flaw: it treats AI as a one-shot generator rather than an iterative problem-solver. The result? Wasted time, inconsistent quality, and AI outputs that look promising but fail under real scrutiny.

Enter the Ralph Prompt—a revolutionary approach that transforms AI from a suggestion engine into a self-improving problem-solver. Named after the Ralph Loop methodology developed at Ralphable, a ralph prompt doesn't just ask for output; it creates a systematic process where the AI breaks complex work into atomic tasks, defines explicit pass/fail criteria for each, tests its own work, and iterates automatically until every single criterion is met. This isn't about getting "pretty good" results; it's about achieving objectively correct, verifiable outcomes.

What makes ralph prompts fundamentally different is their built-in quality control mechanism. Traditional prompts might say "write a Python function to sort data," and you'll get something that looks right but might have edge cases or inefficiencies. A ralph prompt says: "Break this into atomic tasks: 1) Design the algorithm, 2) Implement with error handling, 3) Create test cases, 4) Run tests, 5) Fix any failures. For each task, define pass/fail criteria. Iterate until all criteria pass." The AI becomes its own quality assurance team, catching and fixing its mistakes without human intervention.

This article represents the most comprehensive collection of ralph prompt templates available anywhere. We've distilled months of research and testing into 75+ copy-paste ready templates that you can use immediately with Claude Code and other advanced AI systems. These aren't just theoretical concepts—they're battle-tested templates for code generation, content creation, data analysis, system design, debugging, and more. Each template follows the proven Ralph Loop methodology that ensures your AI doesn't stop at "looks good" but continues until "all criteria pass."

You'll discover how to structure prompts that make AI work like a senior engineer who documents their assumptions, tests their code, validates their logic, and refuses to deliver incomplete work. We'll show you the five essential components of every effective ralph prompt, provide detailed examples across multiple domains, and give you templates you can adapt to your specific needs. Whether you're building software, analyzing data, creating content, or solving complex problems, these ralph prompts will transform how you work with AI.

What Is a Ralph Prompt?

A ralph prompt decomposes work into atomic tasks, defines pass/fail criteria, and iterates until every criterion passes -- Claude Code (Anthropic) and GPT-4 (OpenAI) become self-testing engineers instead of suggestion engines.

A ralph prompt is a structured instruction set that initiates what we call a "Ralph Loop"—a systematic process where AI breaks down complex work into small, verifiable pieces (atomic tasks), defines objective pass/fail criteria for each piece, tests its own output against those criteria, and automatically iterates until all criteria are satisfied. Unlike traditional prompts that produce a single response, a ralph prompt creates an ongoing conversation where the AI acts as both creator and critic, refusing to deliver work that doesn't meet explicitly defined standards.

The term originates from Ralphable, where we discovered that the most reliable AI outputs came from prompts that enforced rigorous self-testing. The core insight was simple: AI makes mistakes, just like humans, but unlike humans, AI can test its own work instantly and objectively if given the right framework. A ralph prompt provides that framework by requiring the AI to:

  • Decompose the problem into independently verifiable atomic tasks
  • Define clear, testable success criteria for each task
  • Execute each task while documenting its approach
  • Test its output against the defined criteria
  • Iterate on any failures until all criteria pass
  • Signal completion only when everything is verified
Let's examine a basic example to illustrate the difference. A traditional prompt for creating a website component might look like this:

    Create a responsive navigation bar with a logo on the left and three menu items on the right.

    The AI might produce something that looks right but could have hidden issues: maybe it's not truly responsive on all devices, perhaps the menu doesn't work on mobile, or maybe the code has accessibility issues. You'd need to manually test it, find problems, and go back and forth with the AI.

    A ralph prompt for the same task transforms the interaction:

    I need a responsive navigation bar. Follow the Ralph Loop methodology:

    TASK CONTEXT: Create a production-ready responsive navigation bar with logo left, menu right.

    ATOMIC TASK BREAKDOWN:

  • Design HTML structure with semantic elements
  • Create CSS for desktop layout
  • Create CSS for mobile responsiveness with hamburger menu
  • Add JavaScript for mobile menu toggle
  • Implement accessibility features
  • Cross-browser testing simulation
    PASS/FAIL CRITERIA FOR EACH TASK:
    Task 1: Must use <nav>, <ul>, <li> elements appropriately
    Task 2: Must align logo left, menu right on screens >768px
    Task 3: Must collapse to hamburger menu on screens <769px
    Task 4: Must toggle menu visibility on click
    Task 5: Must include ARIA labels and keyboard navigation
    Task 6: Must render correctly in Chrome, Firefox, Safari

    ITERATION LOGIC: After completing all tasks, test each criterion. If any fail, diagnose the issue, fix it, and retest. Continue until all criteria pass.

    COMPLETION SIGNAL: Only say "ALL CRITERIA PASS: Navigation bar complete" when every single criterion above is satisfied.

    Begin the Ralph Loop now.

    The AI will now approach this systematically. It will first break down the work, then for each atomic task, it will define even more specific criteria. When it writes the HTML, it will check if it used semantic elements. When it creates the CSS, it will verify the layouts. It will test the mobile responsiveness, check the JavaScript functionality, validate accessibility, and simulate cross-browser rendering. If the hamburger menu doesn't work on the first try, the AI will diagnose why, fix it, and retest—all without you asking.

    This approach works because it leverages AI's strengths (rapid iteration, pattern recognition, code generation) while mitigating its weaknesses (overconfidence, missing edge cases, inconsistency). The ralph prompt creates a feedback loop where the AI's output becomes input for its own quality assessment. This is particularly powerful with Claude Code, which can execute code, test outputs, and analyze results within a single conversation.

    The philosophical shift is significant: instead of viewing AI as a tool that produces answers, we view it as a process that produces verified solutions. This aligns with how expert humans work—we don't write code and assume it works; we write tests, run them, fix failures, and repeat. The ralph prompt simply makes this rigorous engineering mindset explicit and enforceable in AI interactions.
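The control flow a ralph prompt enforces can be sketched in a few lines of Python. Every name here (`produce`, `fix`, `criteria`) is illustrative, not part of any real API:

```python
def ralph_loop(produce, fix, criteria, max_attempts=10):
    """Generate output, test every criterion, and iterate until all pass.

    produce() creates a first draft; fix(output, failures) revises it;
    criteria maps a criterion name to a binary pass/fail test.
    """
    output = produce()
    for _ in range(max_attempts):
        failures = [name for name, test in criteria.items() if not test(output)]
        if not failures:
            return output  # ALL CRITERIA PASS
        output = fix(output, failures)
    raise RuntimeError(f"Criteria still failing after {max_attempts} attempts: {failures}")

# Toy run: the first "draft" fails the positivity criterion; the fix step repairs it.
result = ralph_loop(
    produce=lambda: -3,
    fix=lambda output, failures: abs(output),
    criteria={"output is positive": lambda x: x > 0},
)
assert result == 3
```

The point of the sketch is the shape, not the code: generation and verification alternate until an objective check, not the model's confidence, ends the loop.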

    The Anatomy of a Perfect Ralph Prompt

    Five components -- task context, atomic breakdown, pass/fail criteria, iteration logic, and completion signal -- create the self-improving loop that Claude, GPT-4, Cursor, and GitHub Copilot use to produce verified solutions.

    Every effective ralph prompt contains five essential components that work together to create the self-improving loop. Missing any component reduces the effectiveness, while mastering all five creates AI interactions that consistently produce verified, production-ready results.

    1. Task Context

    The task context sets the stage by clearly defining what needs to be accomplished, why it matters, and any constraints or requirements. This isn't just a restatement of the request—it provides the "why" behind the "what," which helps the AI make better decisions during execution. Example template:
    markdown
    TASK CONTEXT:
    [Clear description of the overall goal]
    [Why this task matters or how it will be used]
    [Any constraints: time, resources, standards, dependencies]
    [Success looks like: description of the end state]
    Complete example:
    markdown
    TASK CONTEXT:
    Create a Python data validation module for user registration.
    This will be used in production with 100K+ daily users, so it must be robust.
    Constraints: Must use Pydantic v2, support async validation, and include comprehensive error messages.
    Success looks like: A reusable module that catches all invalid inputs before database insertion.

    2. Atomic Task Breakdown

    This is where complex work gets decomposed into small, independently verifiable pieces. Each atomic task should be:
    • Specific enough to have clear boundaries
    • Independent enough to be testable on its own
    • Small enough that failure points are obvious
    • Sequential when dependencies exist
    Example template:
    markdown
    ATOMIC TASK BREAKDOWN:
    
  • [First discrete unit of work]
  • [Second discrete unit of work]
  • [Third discrete unit of work]
  • ... continue until all aspects are covered
    Complete example:
    markdown
    ATOMIC TASK BREAKDOWN:
    
  • Define Pydantic models for user input with type annotations
  • Add custom validators for password strength and email format
  • Create async validation functions for checking unique username/email
  • Implement error collection and user-friendly error messages
  • Write unit tests for valid and invalid inputs
  • Create usage examples in documentation
    3. Pass/Fail Criteria

    For each atomic task, define objective, testable conditions that determine success. These should be:
    • Binary (either pass or fail, no ambiguity)
    • Testable (the AI can verify them programmatically or logically)
    • Specific (avoid subjective terms like "good" or "clean")
    • Complete (cover all important aspects)
    Example template:
    markdown
    PASS/FAIL CRITERIA:
    Task 1: [Criterion 1], [Criterion 2], [Criterion 3]
    Task 2: [Criterion 1], [Criterion 2]
    ... continue for all tasks
    Complete example:
    markdown
    PASS/FAIL CRITERIA:
    Task 1: Models include username, email, password fields; All fields have type hints
    Task 2: Password validator requires 8+ chars, 1 uppercase, 1 number; Email validator checks format
    Task 3: Async functions query mock database; Return True/False without exceptions
    Task 4: All validation errors collected in list; Messages explain fix to user
    Task 5: Tests cover valid input, all invalid cases; All tests pass when run
    Task 6: Examples show basic usage, error handling, and async usage
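Criteria written this way are deliberately machine-checkable. As a hypothetical sketch, the Task 2 password rules translate directly into named binary checks:

```python
# Each criterion is binary, specific, and independently testable (names invented)
password_criteria = {
    "at least 8 characters": lambda pw: len(pw) >= 8,
    "contains an uppercase letter": lambda pw: any(c.isupper() for c in pw),
    "contains a number": lambda pw: any(c.isdigit() for c in pw),
}

def evaluate(password):
    """Return a pass/fail verdict per criterion -- no 'mostly good' allowed."""
    return {name: check(password) for name, check in password_criteria.items()}

assert all(evaluate("Secret123").values())                       # every criterion passes
assert evaluate("weak")["at least 8 characters"] is False        # objective failure
```

Anything you cannot express as a check like this ("code should be clean") is a smell that the criterion is still subjective.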

    4. Iteration Logic

    This component defines what happens when criteria fail. It should specify:
    • How failures are detected
    • The diagnosis process
    • The fix-and-retest cycle
    • When to try different approaches vs. debug current approach
    Example template:
    markdown
    ITERATION LOGIC:
    After completing all tasks, test each criterion systematically.
    If any criterion fails:
      1. Diagnose the root cause
      2. Implement a fix
      3. Retest that specific criterion
      4. Continue testing remaining criteria
    Repeat until ALL criteria pass.
    If stuck after 3 attempts on same issue, try a fundamentally different approach.
    Complete example:
    markdown
    ITERATION LOGIC:
    Complete all 6 tasks, then test each criterion in order.
    For any failing criterion: analyze why it failed, fix the issue, then retest.
    Example: If Task 5 tests fail, check if implementation is wrong or tests are wrong, fix accordingly.
    Continue loop until all 12 criteria (2 per task) pass.
    If same criterion fails twice, approach from different angle on third attempt.

    5. Completion Signal

    The final component tells the AI how to indicate successful completion. This creates a clear endpoint and prevents premature stopping. Example template:
    markdown
    COMPLETION SIGNAL:
    Only say "[SPECIFIC PHRASE]" when ALL criteria pass.
    Before that, continue iterating.
    Do not indicate completion prematurely.
    Complete example:
    markdown
    COMPLETION SIGNAL:
    Only say "ALL CRITERIA PASS: Data validation module complete" when all 12 criteria pass.
    Before that, continue testing and iterating.
    Do not say "done" or "complete" until every criterion is verified.

    Putting It All Together

    Here's a complete ralph prompt template you can copy and adapt:

    markdown
    TASK CONTEXT:
    [Your overall goal and purpose]
    [Constraints and requirements]
    [What success looks like]
    

    ATOMIC TASK BREAKDOWN:

  • [Task 1 description]
  • [Task 2 description]
  • [Task 3 description]
  • [Add as needed]

    PASS/FAIL CRITERIA:
    Task 1: [Criterion 1], [Criterion 2]
    Task 2: [Criterion 1], [Criterion 2]
    Task 3: [Criterion 1], [Criterion 2]
    [Match to tasks]

    ITERATION LOGIC:
    After all tasks, test each criterion.
    For failures: diagnose, fix, retest.
    Continue until ALL criteria pass.
    If stuck, try a different approach.

    COMPLETION SIGNAL:
    Only say "ALL CRITERIA PASS: [Project name] complete" when verified.

    Begin the Ralph Loop now.

    The power of this structure is its adaptability. In the following sections, you'll see 75+ specialized templates applying this anatomy to different domains—from code generation to content creation to data analysis. Each maintains the five-component structure while adapting to specific use cases, giving you a comprehensive toolkit for self-improving AI interactions. For more prompt templates by role, see our guides for developers, content creators, product managers, and solopreneurs. If your prompt library is growing unwieldy, our analysis of the AI prompt debt crisis provides the organizational framework.

    Ralph Prompts for Code Development (15 Templates)

    Ralph prompts transform Claude Code from a suggestion engine into an autonomous, iterative developer. These templates enforce the Ralph Loop—breaking work into atomic tasks with explicit pass/fail criteria, ensuring Claude tests, diagnoses, and iterates until every objective condition is met. Below are 15+ detailed, production-ready templates you can copy and paste directly.

    ---

    1. Function Implementation

    Use when: You need a robust, production-ready function with error handling and tests.
    markdown
    RALPH PROMPT: Implement the function calculate_invoice_total based on the specification.
    

    SPECIFICATION:

    • Input: items (list of dicts with 'price' (float), 'quantity' (int), 'taxable' (bool)), discount_percent (float, 0-100), customer_type ('retail', 'wholesale')
    • Output: Final total (float), rounded to 2 decimal places.
    • Logic: Sum item subtotals (price * quantity). Apply 8% tax to taxable items only. Apply discount based on customer_type: retail gets discount_percent, wholesale gets discount_percent + 5%. Minimum charge is $1.00.
    ATOMIC TASKS:
  • Write function signature with type hints and a clear docstring.
  • Implement core calculation logic for item summation and tax.
  • Implement discount logic based on customer_type.
  • Add validation: ensure discount_percent is 0-100, quantities are positive, prices are non-negative.
  • Enforce the minimum $1.00 total.
  • Write 5 unit tests using pytest that cover edge cases (empty list, wholesale discount, zero taxable items).
    PASS/FAIL CRITERIA:

    • [PASS] Function executes without syntax errors.
    • [PASS] 5/5 unit tests pass.
    • [PASS] Handles invalid input with clear ValueError messages.
    • [PASS] Output is correctly rounded to 2 decimals.
    • [PASS] Minimum total of $1.00 is enforced (e.g., $0.75 input yields $1.00).
    ITERATION LOGIC:
    • Run the unit tests. If any fail, analyze the failure, correct the function, and re-run ALL tests.
    • Manually test with the invalid input case {'price': -5, 'quantity': 1, 'taxable': True}. If no ValueError is raised, diagnose validation logic and fix.
    • Test the minimum charge edge case. If output is < $1.00, adjust logic.
    Explanation: This prompt forces Claude to act as a test-driven developer, not stopping at "code that works" but iterating until all validation and edge cases are formally verified.
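For comparison, here is one implementation that would satisfy these criteria. It is a sketch, not the canonical answer; in particular it assumes the discount applies to the tax-inclusive amount, which the spec leaves open:

```python
def calculate_invoice_total(items, discount_percent, customer_type):
    """Compute the invoice total per the spec above.

    Assumption: discount is applied after tax (the spec does not pin down
    the ordering), and the wholesale bonus is capped at 100% total discount.
    """
    if not 0 <= discount_percent <= 100:
        raise ValueError("discount_percent must be between 0 and 100")
    if customer_type not in ("retail", "wholesale"):
        raise ValueError("customer_type must be 'retail' or 'wholesale'")

    subtotal = tax = 0.0
    for item in items:
        if item["price"] < 0:
            raise ValueError("prices must be non-negative")
        if item["quantity"] <= 0:
            raise ValueError("quantities must be positive")
        line = item["price"] * item["quantity"]
        subtotal += line
        if item["taxable"]:
            tax += line * 0.08  # 8% tax on taxable items only

    if customer_type == "wholesale":
        discount_percent = min(discount_percent + 5, 100)  # wholesale gets +5%
    total = (subtotal + tax) * (1 - discount_percent / 100)
    return round(max(total, 1.00), 2)  # enforce the $1.00 minimum charge

# Spot checks against the criteria
assert calculate_invoice_total([{"price": 0.5, "quantity": 1, "taxable": False}], 0, "retail") == 1.0
assert calculate_invoice_total([{"price": 10.0, "quantity": 1, "taxable": True}], 0, "retail") == 10.8
```

Running the ralph prompt above would surface exactly this kind of ambiguity (tax-before or after-discount) as a failing criterion, which is the point.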

    ---

    2. API Endpoint Development

    Use when: Creating a new Flask/FastAPI endpoint with full CRUD, validation, and error responses.
    markdown
    RALPH PROMPT: Develop a RESTful API endpoint POST /api/v1/products for product creation.
    

    SPECIFICATION:

    • Framework: FastAPI. Use Pydantic for request/response models.
    • Database: Assume an async SQLAlchemy Product model with fields: id (int, PK), name (str), sku (str, unique), price (float), category_id (int, FK), is_active (bool).
    • Request Body: { "name": "string", "sku": "string", "price": number, "category_id": integer }
    • Behavior: Create product. SKU must be unique (return 409 if conflict). category_id must exist in database (return 404 if not found). Return 201 with created product data.
    ATOMIC TASKS:
  • Define Pydantic ProductCreate and ProductResponse schemas.
  • Write the FastAPI route decorator and function signature.
  • Implement database session dependency and async create logic.
  • Add integrity check for duplicate SKU (simulate query).
  • Add foreign key validation for category_id.
  • Implement proper HTTP exception responses (409, 404, 422).
  • Write 3 integration test cases (success, duplicate SKU, invalid category).
    PASS/FAIL CRITERIA:

    • [PASS] Code is syntactically valid FastAPI.
    • [PASS] Pydantic schemas correctly validate/restrict input types.
    • [PASS] Simulated "duplicate SKU" condition returns a 409 status.
    • [PASS] Simulated "invalid category" returns a 404 status.
    • [PASS] All 3 integration tests pass when logically executed.
    ITERATION LOGIC:
    • Validate the Pydantic schema rejects {"price": "ten"}. If it accepts, tighten schema.
    • Test the duplicate SKU logic: if the code does not raise/return 409, debug the uniqueness check.
    • Run the integration test suite conceptually. For any failing scenario, revise the endpoint logic and retest.
    Explanation: This ensures the endpoint is robust against invalid data and real-world conflicts, with Claude iterating on validation and error handling until all HTTP criteria are met.
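Stripped of the framework, the endpoint's decision logic is easy to verify in isolation. Below is a framework-free sketch: an in-memory dict stands in for the SQLAlchemy model, and (status, body) tuples stand in for HTTP responses (all names illustrative):

```python
products = {}          # sku -> product row (stand-in for the products table)
categories = {1, 2}    # existing category ids (stand-in for the FK check)

def create_product(body):
    """Mirror the endpoint's behavior: 409 on duplicate SKU, 404 on bad FK, else 201."""
    if body["sku"] in products:
        return 409, {"detail": "SKU already exists"}
    if body["category_id"] not in categories:
        return 404, {"detail": "Category not found"}
    product = {"id": len(products) + 1, **body, "is_active": True}
    products[body["sku"]] = product
    return 201, product

status, created = create_product({"name": "Desk", "sku": "D-1", "price": 99.0, "category_id": 1})
assert status == 201 and created["is_active"]
assert create_product({"name": "Desk", "sku": "D-1", "price": 99.0, "category_id": 1})[0] == 409
assert create_product({"name": "Lamp", "sku": "L-1", "price": 20.0, "category_id": 9})[0] == 404
```

The real FastAPI version replaces the dict with async SQLAlchemy queries and the tuples with HTTPException, but the pass/fail criteria test exactly this branching.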

    ---

    3. Database Query Optimization

    Use when: An existing SQL query is slow; you need an optimized, indexed, and analyzed version.
    markdown
    RALPH PROMPT: Optimize the provided slow SQL query for PostgreSQL.
    

    ORIGINAL QUERY:

    sql
    SELECT o.id, o.order_date, c.name,
           SUM(oi.quantity * p.price) AS total
    FROM orders o
    JOIN customers c ON o.customer_id = c.id
    JOIN order_items oi ON o.id = oi.order_id
    JOIN products p ON oi.product_id = p.id
    WHERE o.order_date > NOW() - INTERVAL '30 days'
    GROUP BY o.id, o.order_date, c.name
    HAVING SUM(oi.quantity * p.price) > 1000
    ORDER BY total DESC;
    ATOMIC TASKS:
    
  • Analyze the query: identify missing indexes, unnecessary joins, or inefficient clauses.
  • Rewrite the query for optimal performance (e.g., use CTEs, subqueries, better joins).
  • Propose 3 specific indexes (with CREATE INDEX statements).
  • Write an equivalent query using window functions if beneficial.
  • Provide a brief performance comparison explanation (what was improved).
    PASS/FAIL CRITERIA:

    • [PASS] Rewritten query returns identical result set to original (logically verify).
    • [PASS] Proposed indexes are on columns used in JOIN, WHERE, and GROUP BY.
    • [PASS] No Cartesian products or unnecessary table scans are introduced.
    • [PASS] The HAVING clause logic is preserved and efficient.
    • [PASS] Explanation clearly states estimated performance gain (e.g., "Indexes avoid full scan on orders.order_date").
    ITERATION LOGIC:
    • Compare the output schema of the new query with the original. If different, adjust SELECT/JOIN logic.
    • Check if any proposed index is on a low-cardinality column (like is_active). If so, replace it with a more selective one.
    • Ensure the HAVING clause doesn't force calculation of all sums before filtering. If it does, consider moving logic to a subquery.
    Explanation: Claude must prove equivalence and justify each optimization, iterating until the query is both correct and demonstrably more efficient.

    ---

    4. React Component Creation

    Use when: Building a reusable, accessible, and stateful React component with TypeScript.
    markdown
    RALPH PROMPT: Create a DataTable React component with sorting, pagination, and filtering.
    

    SPECIFICATION:

    • Tech: React 18+, TypeScript, Tailwind CSS.
    • Props: data (array of objects), columns (array defining key, header, sortable), pageSize (number).
    • Features: Client-side sorting (click headers), pagination (prev/next, page numbers), text filter input (filters all columns).
    • UI: Clean, accessible table with clear visual states for sort direction.
    ATOMIC TASKS:
  • Define TypeScript interfaces for DataTableProps, Column, and component state.
  • Build the component structure with JSX, using <table> and semantic HTML.
  • Implement sorting logic (toggle ascending/descending).
  • Implement pagination logic (slice data based on current page).
  • Implement global filter input and logic.
  • Add ARIA attributes for accessibility (aria-sort, aria-label).
  • Create a usage example with sample data.
    PASS/FAIL CRITERIA:

    • [PASS] Component compiles with tsc --noEmit (no TypeScript errors).
    • [PASS] Sorting toggles correctly between asc/desc/unsorted on click.
    • [PASS] Pagination correctly limits displayed rows to pageSize.
    • [PASS] Filter input reduces visible rows based on text match in any column.
    • [PASS] All interactive elements have appropriate ARIA attributes.
    ITERATION LOGIC:
    • Run a TypeScript check. For any errors, fix the interface or prop usage.
    • Test sort logic: click a sortable column twice; it must cycle states. If stuck, debug the state management.
    • Test pagination with 25 items and pageSize=10. Page 3 should show items 21-25. If not, fix the slice calculation.
    • Verify filter: typing "test" with no matches should show empty table. If not, adjust filter function.
    Explanation: This loop ensures a fully functional, type-safe, and accessible UI component, with Claude iterating on interactivity and compliance until all criteria pass.

    ---

    5. Unit Test Suite

    Use when: You have existing code that lacks tests and needs comprehensive coverage.
    markdown
    RALPH PROMPT: Write a complete pytest suite for the PaymentProcessor class.
    

    CLASS CODE:

    python
    class PaymentProcessor:
        def __init__(self, gateway):
            self.gateway = gateway
            self.transactions = []

        def charge(self, amount, currency="USD"):
            if amount <= 0:
                raise ValueError("Amount must be positive")
            if currency not in ["USD", "EUR"]:
                raise ValueError("Unsupported currency")
            result = self.gateway.charge(amount, currency)
            self.transactions.append({"amount": amount, "currency": currency, "id": result["id"]})
            return result

        def get_total_revenue(self, currency="USD"):
            total = sum(t["amount"] for t in self.transactions if t["currency"] == currency)
            return round(total, 2)

    ATOMIC TASKS:
    
  • Create test file test_payment_processor.py.
  • Write fixtures for a mock gateway using unittest.mock.Mock.
  • Test charge(): success path, validates positive amount, validates currency.
  • Test charge(): verifies transaction is recorded.
  • Test get_total_revenue(): sums correctly, filters by currency, rounds.
  • Test integration: multiple charges correctly affect total revenue.
  • Achieve 100% logical branch coverage.
    PASS/FAIL CRITERIA:

    • [PASS] All tests pass when executed (simulate execution).
    • [PASS] Negative amount test raises ValueError with correct message.
    • [PASS] Unsupported currency ("GBP") test raises ValueError.
    • [PASS] Mock gateway charge is called with correct arguments.
    • [PASS] get_total_revenue returns 150.0 for transactions [100.0, 50.0] in USD.
    ITERATION LOGIC:
    • Run the test suite conceptually. For any failing test, examine the assertion and fix the test or the understanding of the class.
    • Check coverage: ensure there's a test for the currency not in ["USD", "EUR"] branch. If missing, add it.
    • Verify the mock is asserted. If tests pass without checking gateway.charge was called, add the assertion.
    Explanation: Claude becomes a quality engineer, iterating until the test suite is exhaustive, passes, and validates both happy and error paths.
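A condensed version of the resulting suite looks like this, written with plain asserts and `unittest.mock` so the sketch runs standalone (the pytest version would wrap the mock gateway in a fixture):

```python
from unittest.mock import Mock

class PaymentProcessor:
    """Condensed copy of the class under test, from the listing above."""
    def __init__(self, gateway):
        self.gateway = gateway
        self.transactions = []

    def charge(self, amount, currency="USD"):
        if amount <= 0:
            raise ValueError("Amount must be positive")
        if currency not in ["USD", "EUR"]:
            raise ValueError("Unsupported currency")
        result = self.gateway.charge(amount, currency)
        self.transactions.append({"amount": amount, "currency": currency, "id": result["id"]})
        return result

    def get_total_revenue(self, currency="USD"):
        return round(sum(t["amount"] for t in self.transactions if t["currency"] == currency), 2)

# Mock gateway: no real charges happen during tests
gateway = Mock()
gateway.charge.return_value = {"id": "txn_1"}
processor = PaymentProcessor(gateway)

processor.charge(100.0)
processor.charge(50.0)
gateway.charge.assert_called_with(50.0, "USD")   # mock received the correct arguments
assert processor.get_total_revenue() == 150.0    # sums and rounds correctly

raised = False
try:
    processor.charge(-1)
except ValueError as e:
    raised = str(e) == "Amount must be positive"
assert raised                                    # error path verified, message checked
```

Note the final criterion in the prompt: the suite must assert on the mock, not just on return values, or a broken gateway integration would pass silently.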

    ---

    6. Code Review Automation

    Use when: You want Claude to rigorously review a code diff for bugs, security, and style.
    markdown
    RALPH PROMPT: Perform a code review on the following GitHub-style diff. Identify bugs, security issues, and style deviations.
    

    DIFF:

    diff
     def user_login(request):
         if request.method == 'POST':
             username = request.POST.get('username')
             password = request.POST.get('password')
             user = User.objects.filter(username=username).first()
    -        if user.password == password:
    +        if user and user.check_password(password):
                 login(request, user)
                 return redirect('/dashboard')
             else:
                 return render(request, 'login.html', {'error': 'Invalid credentials'})
         return render(request, 'login.html')
    ATOMIC TASKS:
    
  • Analyze the fix: does it correctly address the plain-text password vulnerability?
  • Identify a new bug introduced: what if user is None?
  • Check for other security issues (e.g., lack of rate limiting, information leakage).
  • Evaluate style: is the error message generic enough?
  • Suggest an additional improvement (e.g., using authenticate()).
  • Output a review checklist with [PASS]/[FAIL] items.
    PASS/FAIL CRITERIA:

    • [PASS] The review identifies the potential AttributeError when user is None.
    • [PASS] The review confirms the fix properly uses check_password().
    • [PASS] The review suggests at least one additional security improvement.
    • [PASS] The review notes the error message is appropriately generic (doesn't reveal if user exists).
    • [PASS] The output is a structured checklist, not just prose.
    ITERATION LOGIC:
    • Examine the user.check_password(password) line. If the review doesn't note it's safe only if user exists, fail and re-analyze.
    • Check if the review suggests adding from django.contrib.auth import authenticate. If not, suggest it as an improvement.
    • Ensure the checklist format is used. If output is a paragraph, reformat into a checklist and re-evaluate criteria.
    Explanation: This turns Claude into an automated review bot, forcing it to apply a structured checklist and iterate until all review points are systematically covered.

    ---

    7. Bug Fix with Root Cause Analysis

    Use when: A bug is reported; you need a fix, not just a patch, with understood root cause.
    markdown
    RALPH PROMPT: Diagnose and fix the bug in the merge_user_data function.
    

    BUG REPORT: "Function sometimes returns duplicate user IDs when merging lists."

    CODE:

    python
    def merge_user_data(list_a, list_b):
        """Merge two lists of user dicts by 'id'."""
        merged = list_a.copy()
        for user_b in list_b:
            if not any(user_a['id'] == user_b['id'] for user_a in list_a):
                merged.append(user_b)
        return merged
    ATOMIC TASKS:
    
  • Reproduce the bug: create test inputs where list_a itself has duplicate IDs.
  • Identify the root cause: the logic only checks duplicates between lists, not within list_a.
  • Write a corrected version that ensures all IDs in the output are unique.
  • Preserve order: keep items from list_a first, then unique items from list_b.
  • Write 3 test cases: within-list duplicates, cross-list duplicates, and empty lists.
  • Propose a more efficient data structure (e.g., dictionary) for large lists.
    PASS/FAIL CRITERIA:

    • [PASS] Corrected function returns no duplicate IDs for any input.
    • [PASS] Ordering rule is preserved (list_a items first).
    • [PASS] All 3 test cases pass.
    • [PASS] Root cause is clearly stated in one sentence.
    • [PASS] Suggested optimization uses a dict or set for O(n) performance.
    ITERATION LOGIC:
    • Test with list_a = [{'id': 1}, {'id': 1}]. If output contains duplicate id=1, the fix is insufficient; revise.
    • Verify order: input list_a = [{'id': 2}], list_b = [{'id': 1}] must output [{'id': 2}, {'id': 1}]. If reversed, fix.
    • Ensure the explanation of root cause is precise. If vague, refine it.
    Explanation: Claude must first prove it understands why the bug happens, then fix it completely, iterating on test cases until the output is guaranteed unique.
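One corrected version that meets all five criteria: deduplicate by id across the concatenated lists, relying on dict insertion order (guaranteed since Python 3.7) to keep list_a items first, in O(n):

```python
def merge_user_data(list_a, list_b):
    """Merge two lists of user dicts by 'id', with no duplicate IDs in the output."""
    seen = {}  # id -> first user dict encountered; dicts preserve insertion order
    for user in list_a + list_b:
        if user["id"] not in seen:
            seen[user["id"]] = user
    return list(seen.values())

# Root cause fixed: duplicates *within* list_a are now caught, not just cross-list ones
assert merge_user_data([{"id": 1}, {"id": 1}], []) == [{"id": 1}]
# Ordering rule preserved: list_a items first, then unique list_b items
assert merge_user_data([{"id": 2}], [{"id": 1}]) == [{"id": 2}, {"id": 1}]
assert merge_user_data([], []) == []
```

The one-sentence root cause the criteria demand: the original only checked duplicates between lists, so duplicates already present inside list_a passed through untouched.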

    ---

    8. Performance Optimization

    Use when: A script or function is functionally correct but unacceptably slow.
    markdown
    RALPH PROMPT: Optimize the find_common_tags function for speed.
    

    ORIGINAL CODE:

    python
    def find_common_tags(posts):
        """Find tags common to all posts."""
        common_tags = []
        first_post_tags = posts[0]['tags']
        for tag in first_post_tags:
            tag_in_all = True
            for post in posts:
                if tag not in post['tags']:
                    tag_in_all = False
                    break
            if tag_in_all:
                common_tags.append(tag)
        return common_tags
    ATOMIC TASKS:
    
  • Analyze time complexity: currently O(n*m) where n=tags in first post, m=posts.
  • Optimize by converting post['tags'] lists to sets for O(1) lookups.
  • Use set intersection operation to find common tags directly.
  • Handle edge case: empty posts list.
  • Write a benchmark comparison (original vs. optimized) using pseudo-timing.
  • Ensure result order is not required (set intersection may change order).
    PASS/FAIL CRITERIA:

    • [PASS] Optimized function returns the same logical result as original.
    • [PASS] Code uses set.intersection() or equivalent.
    • [PASS] Edge case posts=[] is handled (return empty list or raise error).
    • [PASS] Time complexity is correctly stated as O(m*k) where k is avg tag count, but with much lower constant factors.
    • [PASS] Benchmark shows at least 10x speedup for large inputs (e.g., 1000 posts, 100 tags each).
    ITERATION LOGIC:
    • Test with sample data: posts = [{'tags':['a','b']}, {'tags':['a','c']}]. Result should be ['a']. If not, debug set logic.
    • Check empty list handling. If original code crashes on posts[0], the optimized version must handle it gracefully.
    • Verify the use of set. If still using nested loops with in on lists, fail and enforce set conversion.
    Explanation: Claude must not only rewrite the function but prove the optimization is correct and significantly faster, iterating until the algorithmic improvement is achieved.
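The optimization the tasks describe looks like this in practice (note that converting to sets means the original tag order is not preserved):

```python
def find_common_tags(posts):
    """Find tags common to all posts using set intersection."""
    if not posts:
        return []  # edge case: no posts means no common tags
    common = set(posts[0]["tags"])
    for post in posts[1:]:
        common &= set(post["tags"])  # O(1) membership checks instead of list scans
        if not common:
            break  # early exit: nothing can be common anymore
    return list(common)

assert find_common_tags([{"tags": ["a", "b"]}, {"tags": ["a", "c"]}]) == ["a"]
assert find_common_tags([]) == []
```

For 1000 posts with 100 tags each, the original does up to 100 x 1000 list scans (each itself O(100)); the set version does one set build per post, which is where the 10x+ speedup criterion comes from.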

    ---

    9. Security Audit

    Use when: Reviewing code for vulnerabilities (SQLi, XSS, auth flaws, etc.).
    markdown
    RALPH PROMPT: Conduct a security audit on the following snippet of a Django view.
    

    CODE:

    ```python
    import json
    from django.http import JsonResponse
    from django.db import connection

    def search_products(request):
        query = request.GET.get('q', '')
        category = request.GET.get('category', '')
        sql = f"SELECT * FROM products WHERE name LIKE '%{query}%'"
        if category:
            sql += f" AND category = '{category}'"
        with connection.cursor() as cursor:
            cursor.execute(sql)
            results = cursor.fetchall()
        return JsonResponse({'results': results})
    ```

    ATOMIC TASKS:
    
  • Identify the critical SQL Injection vulnerability.
  • Identify any additional issues (JSON serialization of raw tuples, lack of input sanitization).
  • Provide a fixed version using Django's ORM or parameterized queries.
  • Suggest protection against potential XSS in the JSON response if query/category were reflected.
  • Recommend a rate-limiting strategy for this endpoint.
  • Output a vulnerability report with severity (Critical, High, Medium).
    PASS/FAIL CRITERIA:

    • [PASS] The audit identifies the SQLi via string interpolation as Critical.
    • [PASS] Fixed code uses cursor.execute(sql, [params]) or Django ORM.
    • [PASS] The report mentions the risk of exposing raw DB tuples (information disclosure).
    • [PASS] Suggests using json.dumps with a default serializer or a DRF serializer.
    • [PASS] At least one additional hardening recommendation (rate limiting, input validation) is provided.
    ITERATION LOGIC:
    • Check the fixed code: if it still uses f-string or .format() on the SQL string, fail and enforce parameterized queries.
    • Ensure the vulnerability report is structured. If it's a paragraph, reformat into a list with severity labels.
    • Verify the recommendation for JSON serialization addresses the fetchall() tuple issue. If not, add it.
    Explanation: Claude acts as a security analyst, required to find all issues and provide corrected code, iterating until the fix eliminates the critical vulnerability and addresses secondary concerns.
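The parameterization pattern the criteria require can be shown in a self-contained form. The sketch below uses stdlib sqlite3 with `?` placeholders instead of Django's `%s` placeholders, and a hypothetical two-column products table, purely for demonstration; in the Django view you would pass the same `params` list to `cursor.execute(sql, params)`:

```python
import sqlite3

def search_products(conn, query, category=None):
    """Parameterized search: user input never becomes part of the SQL text."""
    sql = "SELECT name, category FROM products WHERE name LIKE ?"
    params = [f"%{query}%"]          # bound parameter, escaped by the driver
    if category:
        sql += " AND category = ?"   # SQL text only ever contains placeholders
        params.append(category)
    return conn.execute(sql, params).fetchall()

# demo fixture (hypothetical schema, for illustration only)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, category TEXT)")
conn.executemany("INSERT INTO products VALUES (?, ?)",
                 [("red shirt", "apparel"), ("blue mug", "kitchen")])
```

A classic injection payload like `' OR 1=1 --` passed as `query` is treated as literal text to match, not as SQL, which is exactly the iteration check above: any remaining f-string or `.format()` on the SQL string fails the loop.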

    ---

    10. Documentation Generation

    Use when: You have a module or API that needs comprehensive, ready-to-publish docs.
    RALPH PROMPT: Generate complete documentation for the StringUtils class.
    

    CLASS CODE:

    ```python
    class StringUtils:
        @staticmethod
        def slugify(text, separator="-"):
            """Convert text to URL-safe slug."""
            # ... implementation ...

        @staticmethod
        def truncate(text, length, suffix="..."):
            """Truncate text to given length, preserving words."""
            # ... implementation ...

        @classmethod
        def is_palindrome(cls, text):
            """Check if text reads same forwards/backwards, ignoring case/punctuation."""
            # ... implementation ...
    ```

    ATOMIC TASKS:
    
  • Write an overview module docstring.
  • Document each method with Args, Returns, Raises, and Examples.
  • Include a "Quick Start" usage example.
  • Create a table of common use cases and which method to use.
  • Format for MkDocs or Sphinx compatibility (using Markdown).
  • Ensure all examples are copy-paste runnable in a Python shell.
    PASS/FAIL CRITERIA:

    • [PASS] Each method docstring includes at least one working code example.
    • [PASS] The slugify example shows input "Hello World!" and output "hello-world".
    • [PASS] The truncate example demonstrates the suffix parameter.
    • [PASS] The is_palindrome example correctly handles "A man, a plan, a canal: Panama".
    • [PASS] The final output is a single, well-structured Markdown document.
    ITERATION LOGIC:
    • Test each code example by mentally executing it. If slugify("Hello World!") wouldn't produce the claimed output, correct the example or the understanding.
    • Check for a "Raises" section in truncate for negative length. If missing, add it.
    • Ensure the document has a clear table of contents via headers. If it's a wall of text, restructure with headers.
    Explanation: Claude iterates until the documentation is practical, example-driven, and accurate, serving as a reliable reference.
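For the documentation examples to be checkable, each method needs a concrete reference behavior. Here is a minimal slugify sketch consistent with the documented "Hello World!" example; the regex-based approach is an assumption, since the real implementation is elided in the prompt:

```python
import re

def slugify(text, separator="-"):
    """Convert text to a URL-safe slug.

    Example:
        >>> slugify("Hello World!")
        'hello-world'
    """
    text = text.lower()
    # collapse each run of non-alphanumeric characters into one separator
    text = re.sub(r"[^a-z0-9]+", separator, text)
    return text.strip(separator)
```

Mentally executing documented examples against a reference like this is exactly the first iteration check above.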

    ---

    11. Refactoring Legacy Code

    Use when: You have working but messy "legacy" code that needs cleaning without breaking existing behavior.

    Ready to try structured prompts?

    Generate a skill that makes Claude iterate until your output actually hits the bar. Free to start.


    Ralphable Team

    Building tools for better AI outputs