My Claude Code Setup
A Comprehensive Guide to AI-Assisted Academic Workflows: Slides, Papers, Analysis, and Beyond
1 Why This Workflow Exists
1.1 The Problem
If you’ve ever done serious academic work — built lecture slides, drafted a research paper, run a data analysis pipeline — you know the pain:
- Context loss between sessions. You pick up where you left off, but Claude doesn’t remember why you chose that notation, what the instructor approved, or which bugs were fixed last time.
- Quality is inconsistent. One slide has perfect spacing; the next overflows. One regression table has proper formatting; the next is missing standard errors. Citations compile in Overleaf but break locally.
- Review is manual and exhausting. You proofread 140 slides by hand. You re-read your paper for the fifth time looking for the same kinds of errors. You miss a typo in an equation. A student or referee catches it.
- No one checks the math. Grammar checkers catch “teh” but not a flipped sign in a decomposition theorem, a misspecified regression, or a broken replication.
This workflow solves all of these problems. You describe what you want — “translate Lecture 5 to Quarto,” “review my paper before submission,” or “analyze this dataset and produce publication-ready tables” — and Claude handles the rest: plans the approach, implements it, runs specialized reviewers, fixes issues, verifies quality, and presents results. Like a contractor who manages the entire job.
1.2 What Makes Claude Code Different
Claude Code runs on your computer with full access to your file system, terminal, and git. It works as a CLI tool, a VS Code extension, or through the Claude Desktop app — same capabilities, same configuration, different interface. Here is what that enables:
| Capability | What It Means for You |
|---|---|
| Read & edit your files | Surgical edits to .tex, .qmd, .R files in place |
| Run shell commands | Compile LaTeX, run R scripts, render Quarto — directly |
| Access git history | Commits, PRs, branches — all from the conversation |
| Persistent memory | CLAUDE.md + MEMORY.md survive across sessions |
| Orchestrator mode | Claude autonomously plans, implements, reviews, fixes, and verifies |
| Multi-agent workflows | 10 specialized agents for proofreading, layout, pedagogy, code review |
| Quality gates | Automated scoring — nothing ships below 80/100 |
This workflow was developed over 6+ sessions building a PhD course on Causal Panel Data. The result: 6 complete lectures (140+ slides each), with Beamer + Quarto versions, interactive Plotly charts, TikZ diagrams, and R replication scripts — all managed by the orchestrator and reviewed by 10 specialized agents across 5 quality dimensions.
While this case study centers on slides, every component — agents, orchestrator, quality gates — works identically for research papers, data analysis, and proposals.
1.3 How It All Works Together
Before diving into setup, here is the key insight: most of this workflow is automatic. You describe what you want in plain English, and Claude figures out which tools to use.
1.3.1 What You Do vs What Happens Automatically
| You Do | Happens Automatically |
|---|---|
| Describe what you want | Claude selects and runs the right skills |
| Approve plans | Orchestrator coordinates agents |
| Review final output | Hooks fire on events (edit, save, compact) |
| Say “commit” when ready | Rules load based on files you touch |
1.3.2 Example: “Fix my slides before tomorrow”
You: "Review my lecture slides and fix all issues before tomorrow's class"
↓
Claude automatically:
→ Runs /proofread (grammar, typos, consistency)
→ Runs /visual-audit (overflow, layout, spacing)
→ Runs /pedagogy-review (narrative flow, notation clarity)
→ Synthesizes findings into prioritized fix list
→ Applies fixes (critical → major → minor)
→ Re-verifies everything compiles
→ Scores against quality gates
↓
You see: "Done. Fixed 12 issues. Score: 88/100. Ready to commit?"
You: "Yes"
↓
Claude runs /commit (only because you explicitly approved)
You: "Review my paper draft and prepare it for submission"
↓
Claude automatically:
→ Runs /review-paper (argument structure, methods, citations)
→ Runs /proofread (grammar, consistency, formatting)
→ Runs /validate-bib (cross-reference citations)
→ Synthesizes findings into prioritized fix list
→ Applies fixes (critical → major → minor)
→ Re-verifies everything compiles
→ Scores against quality gates
↓
You see: "Done. Fixed 18 issues. Score: 91/100. Ready to commit?"
You: "Analyze this dataset and produce publication-ready output"
↓
Claude automatically:
→ Runs /data-analysis (explore, model, tables, figures)
→ Runs /review-r (code quality, reproducibility)
→ Produces R script, tables, and figures
→ Scores against quality gates
↓
You see: "Analysis complete. 3 tables, 4 figures. Score: 85/100."
1.3.3 What You Never Touch Directly
- Agents — Specialized reviewers called by skills, not by you
- Hooks — Fire automatically on events (you never run them)
- Rules — Load automatically based on file paths
1.3.4 Skills: The Only Commands You Might Type
Skills like /proofread or /compile-latex can be invoked two ways:
- Explicitly — You type `/proofread MySlides.tex`
- Automatically — Claude invokes them when relevant to your request
Most of the time, you just describe what you want and Claude handles the rest. Explicit skill invocation is there when you want precise control.
You talk, Claude orchestrates. The 10 agents, 22 skills, and 18 rules exist so you don’t have to think about them. Describe your goal, approve the plan, and let the system work.
This guide describes the full system — 10 agents, 22 skills, 18 rules. That is the ceiling, not the floor. Start with just CLAUDE.md and 2–3 skills (/compile-latex, /proofread, /commit). Add rules and agents as you discover what you need. The template is designed for progressive adoption: fork it, fill in the placeholders, and start working. Everything else is there when you’re ready.
2 Getting Started
You need two things: fork the repo and paste a prompt. That’s it — everything else (agents, hooks, rules) runs automatically in the background.
Claude seems to ignore the configuration files: Make sure you ran claude from inside the project directory (not a parent folder). Claude reads .claude/ and CLAUDE.md from the current working directory.
Hooks not firing (no notifications, no reminders): Check that Python 3 is installed (python3 --version) and hook files are executable (chmod +x .claude/hooks/*).
“What does plan approval look like?” Claude presents a numbered plan and asks for your input. Say “approved”, “looks good”, or “revise step 3”. That’s it — no special commands needed.
For more, see Troubleshooting in the Appendix.
2.1 Step 1: Fork & Clone
# Fork this repo on GitHub (click "Fork" on the repo page), then:
git clone https://github.com/YOUR_USERNAME/claude-code-my-workflow.git my-project
cd my-project

Replace YOUR_USERNAME with your GitHub username.
2.2 Step 2: Start Claude Code and Paste This Prompt
Open your terminal in the project directory, run claude, and paste the following. Fill in the bolded placeholders with your project details:
Everything in this guide works the same in any Claude Code interface. In VS Code, open the Claude Code panel (click the Claude icon in the sidebar or press Cmd+Shift+P → “Claude Code: Open”). In Claude Desktop, open your project folder and start a local session. Then paste the starter prompt below.
The guide shows terminal commands because they are the most universal way to explain things, but every skill, agent, hook, and rule works identically regardless of which interface you use.
I am starting to work on [PROJECT NAME] in this repo. [Describe your project in 2–3 sentences — what you’re building, who it’s for, what tools you use (e.g., LaTeX/Beamer, R, Quarto).]
I want our collaboration to be structured, precise, and rigorous — even if it takes more time. When creating visuals, everything must be polished and publication-ready. I don’t want to repeat myself, so our workflow should be smart about remembering decisions and learning from corrections.
I’ve set up the Claude Code academic workflow (forked from pedrohcgs/claude-code-my-workflow). The configuration files are already in this repo (.claude/, CLAUDE.md, templates, scripts). Please read them, understand the workflow, and then update all configuration files to fit my project — fill in placeholders in CLAUDE.md, adjust rules if needed, and propose any customizations specific to my use case.
After that, use the plan-first workflow for all non-trivial tasks. Once I approve a plan, switch to contractor mode — coordinate everything autonomously and only come back to me when there’s ambiguity or a decision to make. For our first few sessions, check in with me a bit more often so I can learn how the workflow operates.
Enter plan mode and start by adapting the workflow configuration for this project.
What this does: Claude will read CLAUDE.md and all the rules, fill in your project name, institution, Beamer environments, CSS classes, and project state table, then propose any rule adjustments for your specific use case. You approve the plan, and Claude handles the rest. From there, you just describe what you want to build.
2.3 Optional: Manual Setup
If you prefer to configure things yourself instead of letting Claude handle it:
Customize CLAUDE.md — Open CLAUDE.md and replace all [BRACKETED PLACEHOLDERS]:
- Project name and institution
- Folder structure (adjust to your layout)
- Current project state (your lectures/papers)
- Beamer environments (your custom LaTeX environments)
- CSS classes (your Quarto theme classes)
Create your knowledge base — Open .claude/rules/knowledge-base-template.md and fill in:
- Notation registry — every symbol you use, where it’s introduced, and anti-patterns
- Applications database — datasets, papers, and R packages you reference
- Design principles — what you’ve approved and what you’ve overridden
Configure permissions — Review .claude/settings.json. The template includes permissions for git, LaTeX, Quarto, R, and utility scripts. Add any additional tools you use.
Test it:
# In Claude Code, type:
/compile-latex MyFirstLecture
/proofread Slides/MyFirstLecture.tex
python scripts/quality_score.py Slides/MyFirstLecture.tex

You don’t need to fill everything in upfront. Start with 5–10 notation entries and add more as you develop lectures. The starter prompt will set up the essentials — you can always refine later.
2.4 Requirements Specification (For Complex Tasks)
For complex or ambiguous tasks, Claude may ask 3-5 clarifying questions to create a requirements specification before planning. This catches ambiguity early and reduces rework.
You: "Improve my lecture on hypothesis testing"
Claude: "Let me clarify:
1. What needs improvement (clarity, examples, pacing)?
2. Any student feedback to address?
3. Constraints (time, notation changes)?
4. Update which outputs (slides, web version, both)?"
[Creates spec with MUST/SHOULD/MAY priorities]
[You approve]
[Then enters plan mode]
You: "Analyze the survey data"
Claude: "Let me clarify:
1. What's the research question?
2. Expected output format (tables, plots, report)?
3. Any replication targets or baselines?
4. Computational constraints (runtime, memory)?"
You: "Clean up my analysis scripts"
Claude: "Let me clarify:
1. What's the pain point (speed, readability, bugs)?
2. Breaking changes okay or preserve compatibility?
3. Testing infrastructure available?
4. Refactor scope (one file, whole project)?"
After 3-5 questions, Claude creates a specification document in quality_reports/specs/ with:
- MUST have (non-negotiable requirements)
- SHOULD have (preferred features)
- MAY have (optional enhancements)
- Clarity status (CLEAR/ASSUMED/BLOCKED for each aspect)
You approve the spec, then Claude plans implementation. This reduces mid-plan pivots by 30-50%.
Template: templates/requirements-spec.md
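A filled-in spec might look like the following hypothetical sketch (section names follow the MUST/SHOULD/MAY scheme above; the task content is invented for illustration and the real template may differ):

```markdown
# Requirements Spec: Improve Lecture 5 (hypothetical example)

## MUST
- Fix all overflowing equations — CLEAR

## SHOULD
- Add one worked example after Theorem 2 — ASSUMED (instructor prefers numeric examples)

## MAY
- Convert static plots to interactive Plotly versions — CLEAR
```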
3 The System in Action
With setup covered, here is what the system actually does. This section walks through the three core mechanisms that make the workflow powerful: specialized agents, adversarial QA, and automatic quality scoring.
3.1 Why Specialized Agents Beat One-Size-Fits-All
Consider proofreading a 140-slide lecture deck. You could ask Claude:
“Review these slides for grammar, layout, math correctness, code quality, and pedagogical flow.”
Claude will skim everything and catch some issues. But it will miss:
- The equation on slide 42 where a subscript changed from \(m_t^{d=0}\) to \(m_t^0\)
- The TikZ diagram where two labels overlap at presentation resolution
- The R script that uses `k = 10` covariates while the slide says `k = 5`
Now compare with specialized agents:
| Agent | Focus | What It Catches |
|---|---|---|
| `proofreader` | Grammar only | “principle” vs “principal” |
| `slide-auditor` | Layout only | Text overflow on slide 37 |
| `pedagogy-reviewer` | Flow only | Missing framing sentence before Theorem 3.1 |
| `r-reviewer` | Code only | Missing `set.seed()` |
| `domain-reviewer` | Substance | Slide says 10,000 MC reps, code runs 1,000 |
Each agent reads the same file but examines a different dimension with full attention. The /slide-excellence skill runs them all in parallel.
3.2 The Adversarial Pattern: Critic + Fixer
The single most powerful pattern in this system is the adversarial QA loop:
+------------------+
| quarto-critic | "I found 12 issues. 3 Critical."
| (READ-ONLY) |
+--------+---------+
|
+----v----+
| Verdict |
+----+----+
/ \
APPROVED NEEDS WORK
| |
Done +----v---------+
| quarto-fixer | "Fixed 12/12 issues."
| (READ-WRITE) |
+----+---------+
|
+----v----------+
| quarto-critic | "Re-audit: 2 remaining."
| (Round 2) |
+----+----------+
|
... (up to 5 rounds)
Why it works: The critic can’t fix files (read-only), so it has no incentive to downplay issues. The fixer can’t approve itself (the critic re-audits). This prevents the common failure of Claude saying “looks good” about its own work.
In Econ 730 Lecture 6, the critic caught that the Quarto version used \cdots (a placeholder) where the Beamer version had the full Hajek weight formula. The fixer replaced it. On re-audit, the critic found 8 more instances of missing (X) arguments on outcome models. After 4 rounds, the Quarto slides matched the Beamer source exactly.
3.3 The Orchestrator: Coordinating Agents Automatically
Individual agents are specialists. Skills like /slide-excellence and /qa-quarto coordinate a few agents for specific tasks. But in day-to-day work, you should not have to think about which agents to run. That is the orchestrator’s job.
The orchestrator protocol (.claude/rules/orchestrator-protocol.md) is an auto-loaded rule that activates after any plan is approved. It implements the plan, runs the verifier, selects review agents based on file types, applies fixes, re-verifies, and scores against quality gates. It loops until the score meets threshold or max rounds are exhausted.
You never invoke the orchestrator manually — it is the default mode of operation for any non-trivial task. Skills remain available for standalone use (e.g., /proofread for a quick grammar check), but the orchestrator handles the full lifecycle automatically. See Pattern 2 for the complete workflow.
3.4 Quality Scoring: The 80/90/95 System
Every file gets a quality score from 0 to 100:
| Score | Threshold | Meaning | Action |
|---|---|---|---|
| 80+ | Commit | Safe to save progress | git commit allowed |
| 90+ | PR | Ready for deployment | gh pr create encouraged |
| 95+ | Excellence | Exceptional quality | Aspirational target |
| < 80 | Blocked | Critical issues exist | Must fix before committing |
3.4.1 How Scores Are Calculated
Points are deducted for issues:
| Issue | Deduction | Why Critical |
|---|---|---|
| Equation overflow | -20 | Math cut off = unusable |
| Broken citation | -15 | Academic integrity |
| Equation typo | -10 | Teaches wrong content |
| Text overflow | -5 | Content cut off |
| Label overlap | -5 | Diagram illegible |
| Notation inconsistency | -3 | Student confusion |
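As a sketch of the arithmetic (deduction values are taken from the table above; the function and constant names are hypothetical, and the real `scripts/quality_score.py` may be structured differently):

```python
# Hypothetical sketch of the deduction arithmetic behind quality scoring.
# Deduction values mirror the table above; scripts/quality_score.py may differ.

DEDUCTIONS = {
    "equation_overflow": 20,
    "broken_citation": 15,
    "equation_typo": 10,
    "text_overflow": 5,
    "label_overlap": 5,
    "notation_inconsistency": 3,
}

def quality_score(issues):
    """Start at 100 and subtract per-issue deductions, floored at 0."""
    return max(100 - sum(DEDUCTIONS[i] for i in issues), 0)

def gate(score):
    """Map a score to the 80/90/95 gates from the table above."""
    if score < 80:
        return "blocked"
    if score < 90:
        return "commit"
    if score < 95:
        return "pr"
    return "excellence"

print(quality_score(["equation_typo", "text_overflow"]))  # 85
print(gate(85))  # commit
```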
3.4.2 Mandatory Verification
The verification protocol (.claude/rules/verification-protocol.md) requires that Claude compile, render, or otherwise verify every output before reporting a task as complete. The orchestrator enforces this as an explicit step in its loop (Step 2: VERIFY). This means Claude cannot say “done” without actually checking the output.
In Econ 730, verification caught unverified TikZ diagrams that would have deployed with overlapping labels, broken SVGs in Quarto slides that wouldn’t display, and R scripts with missing intercept terms that produced silently wrong estimates.
3.5 Creating Your Own Domain Reviewer
The template includes domain-reviewer.md — a skeleton for building a substance reviewer specific to your field.
3.5.1 The 5-Lens Framework
Every domain can benefit from these five review lenses:
| Lens | What It Checks | Example (Economics) | Example (Physics) |
|---|---|---|---|
| Assumption Audit | Are stated assumptions sufficient? | Is overlap required for ATT? | Is the adiabatic approximation valid here? |
| Derivation Check | Does the math check out? | Do decomposition terms sum? | Do the units balance? |
| Citation Fidelity | Do slides match cited papers? | Is the theorem from the right paper? | Is the experimental setup correctly described? |
| Code-Theory Alignment | Does code implement the formula? | R script matches the slide equation? | Simulation parameters match theory? |
| Logic Chain | Does the reasoning flow? | Can a PhD student follow backwards? | Are prerequisites established? |
To customize, open .claude/agents/domain-reviewer.md and fill in:
- Your domain’s common assumption types
- Typical derivation patterns to verify
- Key papers and their correct attributions
- Code-theory alignment checks for your tools
- Logic chain requirements for your audience
4 The Building Blocks
Understanding the configuration layers helps you customize the workflow and debug when things go wrong. Claude Code’s power comes from five configuration layers that work together — think of them as the operating system for your academic project.
4.1 CLAUDE.md — Your Project’s Constitution
CLAUDE.md is the single most important file. Claude reads it at the start of every session. But here is the critical insight: Claude reliably follows only about 150–200 custom instructions in total. The system prompt already uses ~50, leaving ~100–150 for your project. CLAUDE.md and always-on rules share this budget.
This means CLAUDE.md should be a slim constitution — short directives and pointers, not comprehensive documentation. Aim for ~120 lines:
- Core principles — 4–5 bullets (plan-first, verify-after, quality gates, LEARN tags)
- Folder structure — where everything lives
- Commands — compilation, deployment, key tools
- Customization tables — Beamer environments, CSS classes
- Current state — what’s done, what’s in progress
- Skill quick reference — table of available slash commands
Move everything else into .claude/rules/ files (with path-scoping so they only load when relevant).
# CLAUDE.MD --- My Course Development
**Project:** Econ 730 --- Causal Panel Data
**Institution:** Emory University
## Core Principles
1. **Plan-first** — enter plan mode before non-trivial tasks
2. **Verify-after** — compile/render and check before reporting done
3. **Quality gates** — 80 to commit, 90 for PR, 95 for excellence
4. **LEARN tags** — persist corrections in MEMORY.md
5. **Single source of truth** — Beamer is authoritative; derive, don't duplicate
## Quick Reference
| Command | What It Does |
|---------|-------------|
| `/compile-latex [file]` | 3-pass XeLaTeX compilation |
| `/proofread [file]` | Grammar/typo review |
| `/deploy [Lecture]` | Render and deploy to GitHub Pages |

CLAUDE.md loads every session. If it exceeds ~150 lines, Claude starts ignoring rules silently. Put detailed standards in path-scoped rules (.claude/rules/) instead — they only load when Claude works on matching files, so they don’t compete for attention.
4.2 Rules — Domain Knowledge That Auto-Loads
Rules are markdown files in .claude/rules/ that Claude loads automatically. They encode your project’s standards. The key design principle is path-scoping: rules with a paths: YAML frontmatter only load when Claude works on matching files.
Always-on rules (no paths: frontmatter) load every session. Keep these few and focused:
.claude/rules/
├── plan-first-workflow.md # ~83 lines — plan before you build
├── orchestrator-protocol.md # ~42 lines — contractor mode loop
├── session-logging.md # ~23 lines — three logging triggers
└── meta-governance.md # ~251 lines — template vs working project
Path-scoped rules load only when relevant:
.claude/rules/
├── r-code-conventions.md # paths: ["**/*.R"] — R standards
├── quality-gates.md # paths: ["*.tex", "*.qmd", "*.R"] — scoring
├── verification-protocol.md # paths: ["*.tex", "*.qmd", "docs/"] — verify before done
├── replication-protocol.md # paths: ["scripts/**/*.R"] — replicate first
├── exploration-folder-protocol.md # paths: ["explorations/**"] — sandbox rules
├── orchestrator-research.md # paths: ["scripts/**/*.R", "explorations/**"] — simple loop
└── ...                            # 14 path-scoped rules total
The first three always-on rules total ~148 lines of actionable instructions. meta-governance is a reference document for the template’s dual nature (working project vs. public template) and loads passively. Path-scoped rules add rich, domain-specific guidance exactly when Claude needs it.
Sync vs. translate: The beamer-quarto-sync rule handles incremental edits — fix a typo in Beamer, same fix goes to Quarto. The /translate-to-quarto skill is for full initial translation of a new lecture. Translate once, sync thereafter.
Why rules matter: Without them, Claude will use generic defaults. With them, Claude follows your standards consistently across sessions.
4.2.1 Example: Path-Scoped R Code Conventions Rule
---
paths:
- "**/*.R"
- "Figures/**/*.R"
- "scripts/**/*.R"
---

# R Code Standards
## Reproducibility
- set.seed() called ONCE at top (YYYYMMDD format)
- All packages loaded at top via library()
- All paths relative to repository root
## Visual Identity
primary_blue <- "#012169"
primary_gold <- "#f2a900"

The paths: block means this rule only loads when Claude reads or edits an .R file. When Claude works on a .tex file, this rule doesn’t consume any of the instruction budget.
4.3 Constitutional Governance (Optional)
As your project grows, some decisions become non-negotiable (to maintain quality, reproducibility, or collaboration standards). Others remain flexible.
The templates/constitutional-governance.md template helps you distinguish between:
- Immutable principles (Articles I-V): Non-negotiable rules that ensure consistency
- User preferences: Flexible patterns that can vary by context
4.3.1 Example Articles You Might Define
- Article I: Primary Artifact — Which file is authoritative (e.g., `.tex` vs `.qmd`, `.Rmd` vs `.html`, notebook vs script)
- Article II: Plan-First Threshold — When to enter plan mode (e.g., >3 files, >30 min, multi-step workflows)
- Article III: Quality Gate — Minimum score to commit (e.g., 80/100, all tests passing)
- Article IV: Verification Standard — What must pass before commit (e.g., compile, tests, render)
- Article V: File Organization — Where different file types live (prevents scattering)
The template includes examples for LaTeX, R, Python, Jupyter, and multi-language workflows.
Use constitutional governance after you’ve established 3-7 recurring patterns that you want to enforce consistently. Don’t create it on day one — let patterns emerge first, then codify them. Skip it for solo projects with evolving standards, or when you prefer case-by-case decisions.
Template: templates/constitutional-governance.md
4.4 Skills — Reusable Slash Commands
Skills are multi-step workflows invoked with /command. Each skill lives in .claude/skills/[name]/SKILL.md:
---
name: compile-latex
description: Compile LaTeX with 3-pass XeLaTeX + bibtex
argument-hint: "[filename without .tex extension]"
---
# Steps:
1. cd to Slides/
2. Run xelatex pass 1
3. Run bibtex
4. Run xelatex pass 2
5. Run xelatex pass 3
6. Check for errors
7. Report results

Skills you get in the template:
| Skill | Purpose | When to Use |
|---|---|---|
| `/compile-latex` | Build PDF from .tex | After any Beamer edit |
| `/deploy` | Render Quarto + sync to docs/ | Before pushing to GitHub Pages |
| `/proofread` | Grammar and consistency check | Before every commit |
| `/qa-quarto` | Adversarial Quarto QA | After translating Beamer to Quarto |
| `/slide-excellence` | Full multi-agent review | Before major milestones |
| `/create-lecture` | New lecture from scratch | Starting a new topic |
| `/commit` | Stage, commit, PR, merge | After any completed task |
4.5 Agents — Specialized Reviewers
Agents are the real power of this system. Each agent is an expert in one dimension of quality:
.claude/agents/
+-- proofreader.md # Grammar, typos, consistency
+-- slide-auditor.md # Visual layout, overflow, spacing
+-- pedagogy-reviewer.md # Narrative arc, notation clarity, pacing
+-- r-reviewer.md # R code quality and reproducibility
+-- tikz-reviewer.md # TikZ diagram visual quality
+-- quarto-critic.md # Adversarial Quarto vs Beamer comparison
+-- quarto-fixer.md # Applies critic's fixes
+-- beamer-translator.md # Beamer -> Quarto translation
+-- verifier.md # Task completion verification
+-- domain-reviewer.md # YOUR domain-specific substance review
4.5.1 Agent Anatomy
Each agent file has YAML frontmatter + detailed instructions:
---
name: proofreader
description: Reviews slides for grammar, typos, and consistency
---
# Proofreader Agent
## Role
You are an expert academic proofreader reviewing lecture slides.
## What to Check
1. Grammar and spelling errors
2. Inconsistent notation
3. Missing or broken citations
4. Content overflow (text exceeding slide bounds)
## Report Format
Save findings to: quality_reports/[FILENAME]_report.md
## Severity Levels
- **Critical:** Math errors, broken citations
- **Major:** Grammar errors, overflow
- **Minor:** Style inconsistencies

A single Claude prompt trying to check grammar, layout, math, and code simultaneously will do a mediocre job at all of them. Specialized agents focus on one dimension and do it thoroughly. The /slide-excellence skill runs them all in parallel, then synthesizes results.
4.5.2 Multi-Model Strategy: Cost vs. Quality
Not all agents need the same model. Each agent file has a model: field in its YAML frontmatter. By default, all agents use model: inherit (they use whatever model your main session runs). But you can customize this to optimize cost:
| Task Type | Recommended Model | Why | Examples |
|---|---|---|---|
| Complex translation | `model: opus` | Needs deep understanding of both formats | beamer-translator, quarto-critic |
| Fast, constrained work | `model: sonnet` | Speed matters more than depth | r-reviewer, quarto-fixer |
| Default | `model: inherit` | Uses whatever the main session runs | proofreader, slide-auditor |
The principle: Use Opus for tasks that require holding two large documents in mind simultaneously (translation, adversarial comparison). Use Sonnet for tasks with clear, bounded scope (fix these 12 issues, check this R script). Let everything else inherit.
To change an agent’s model, edit its YAML frontmatter:
---
name: quarto-critic
model: opus # was: inherit
---

If you configure model-per-agent, a typical Beamer-to-Quarto translation runs the critic on Opus (2–4 rounds) while the fixer runs on Sonnet (same rounds). This can save roughly 40–60% compared to running everything on Opus, with no quality loss on the fixing step.
4.6 Settings — Permissions and Hooks
.claude/settings.json controls what Claude is allowed to do. Here is a simplified excerpt — the template includes additional permission entries for git, GitHub CLI, PDF tools, and more:
{
"permissions": {
"allow": [
"Bash(git status *)",
"Bash(xelatex *)",
"Bash(Rscript *)",
"Bash(quarto render *)",
"Bash(./scripts/sync_to_docs.sh *)"
]
},
"hooks": {
"Stop": [
{
"hooks": [{
"type": "command",
"command": "python3 \"$CLAUDE_PROJECT_DIR\"/.claude/hooks/log-reminder.py",
"timeout": 10
}]
}
]
}
}

The Stop hook runs a fast Python script after every response. No LLM call, no latency. It checks whether the session log is current and reminds Claude to update it if not. Behavioral rules like verification and Beamer-Quarto sync are enforced via auto-loaded rules in .claude/rules/, which is the right tool for nuanced judgment that Claude can evaluate in-context.
4.7 Memory — Cross-Session Persistence
Claude Code has an auto-memory system at ~/.claude/projects/[project]/memory/MEMORY.md. This file persists across sessions and is loaded into every conversation.
Use it for:
- Key project facts that never change
- Corrections you don’t want repeated ([LEARN:tag] format)
- Current plan status
# Auto Memory
## Key Facts
- Project uses XeLaTeX, not pdflatex
- Bibliography file: Bibliography_base.bib
## Corrections Log
- [LEARN:r-code] Package X drops obs silently when covariate is missing
- [LEARN:citation] Post-LASSO is Belloni (2013), NOT Belloni (2014)
- [LEARN:workflow] Every Beamer edit must auto-sync to Quarto

4.7.1 Plans — Compression-Resistant Task Memory
While MEMORY.md stores long-lived project facts, plans store task-specific strategy. Every non-trivial plan is saved to quality_reports/plans/ with a timestamp. This means:
- Plans survive auto-compression (they are on disk, not just in context)
- Plans survive session boundaries (readable in any future session)
- Plans create an audit trail of design decisions
See Pattern 1 in Workflow Patterns for the full protocol.
4.7.2 Session Logs — Why-Not-Just-What History (with Automated Reminders)
Git commits record what changed, but not why. Session logs fill this gap. Claude writes to quality_reports/session_logs/ at three points: right after plan approval, incrementally during implementation (as decisions happen), and at session end. This means the log captures reasoning as it happens, before auto-compression can discard it.
Because relying on instructions alone is fragile (Claude forgets during long sessions), a Stop hook (.claude/hooks/log-reminder.py) fires after every response. It tracks how many responses have passed since the session log was last updated. After a threshold, it blocks Claude from stopping until the log is current. This turns a best practice into an enforced behavior.
New sessions can read these logs to understand not just the current state of the project, but the reasoning behind it. See Pattern 1 in Workflow Patterns for the full protocol.
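The counter logic behind such a Stop hook can be sketched as follows (the state-file path, threshold, and exact JSON fields are assumptions for illustration; the template’s actual log-reminder.py may differ):

```python
# Illustrative sketch of a counter-based Stop hook. The state-file path and
# threshold are assumptions; the real .claude/hooks/log-reminder.py may differ.
# A Stop hook can print JSON like {"decision": "block", "reason": ...} to keep
# Claude working instead of stopping.
import json
from pathlib import Path

COUNTER = Path("/tmp/stop_hook_counter")  # hypothetical state file
THRESHOLD = 5  # responses allowed before the session log must be updated

def check(counter_value, threshold=THRESHOLD):
    """Return a hook decision dict: block once the counter passes the threshold."""
    if counter_value >= threshold:
        return {"decision": "block",
                "reason": "Update quality_reports/session_logs/ before stopping."}
    return {}  # empty output = allow the stop

def main():
    # Increment a persistent per-session response counter, then emit the decision.
    n = int(COUNTER.read_text()) + 1 if COUNTER.exists() else 1
    COUNTER.write_text(str(n))
    print(json.dumps(check(n)))

if __name__ == "__main__":
    main()
```

Because the threshold check is plain arithmetic on a file counter, the hook adds no LLM call and negligible latency.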
4.7.3 How It All Fits Together
With CLAUDE.md, MEMORY.md, plans, and session logs, the system has four distinct memory layers. Here is what each one does and when it matters:
| Layer | File | Survives Compression? | Updated When | Purpose |
|---|---|---|---|---|
| Project context | `CLAUDE.md` | Yes (on disk) | Rarely | Project rules, folder structure, commands |
| Corrections | `MEMORY.md` | Yes (on disk) | On `[LEARN]` tag | Prevent repeating past mistakes |
| Task strategy | `quality_reports/plans/` | Yes (on disk) | Once per task | Plan survives planning-to-implementation handoff |
| Decision reasoning | `quality_reports/session_logs/` | Yes (on disk) | Incrementally | Record why decisions were made |
| Conversation | Claude’s context window | No (compressed) | Every response | Current working memory |
The first four layers are your safety net. Anything written to disk survives indefinitely. The conversation context is ephemeral — auto-compression will eventually discard details. The workflow’s design ensures that anything worth keeping is written to one of the four persistent layers before compression can erase it.
4.7.4 Hooks — Automated Enforcement
The session log reminder above is one example of a broader pattern: using hooks to enforce rules that Claude might otherwise forget during long sessions. Rules live in context and can be compressed away. Hooks live in .claude/settings.json and fire every time, regardless of context state.
The template includes hooks for logging, notifications, file protection, and context survival:
| Hook | Event | What It Does |
|---|---|---|
| Session log reminder | Stop | Reminds about session logs after every response |
| Desktop notification | Notification | Desktop alert when Claude needs attention (macOS/Linux) |
| File protection | PreToolUse[Edit\|Write] | Blocks accidental edits to bibliography and settings |
| Context state capture | PreCompact | Saves plan state before auto-compaction |
| Context restoration | SessionStart[compact\|resume] | Restores context after compaction or resume |
| Context monitor | PostToolUse[Bash\|Task] | Progressive warnings at 40%/55%/65%/80%/90% context |
| Verification reminder | PostToolUse[Write\|Edit] | Reminds to compile/render before marking done |
Verification and Beamer-Quarto sync are enforced via auto-loaded rules, which are the right tool for nuanced judgment. Hooks are reserved for enforcement that must survive context compression.
Use command-based hooks for fast, mechanical checks (file exists? counter threshold?). Use rules for nuanced judgment (did Claude verify correctly?). Avoid prompt-based hooks that trigger an LLM call on every response — the latency adds up fast.
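A minimal example of the "fast, mechanical check" category: the file-protection hook reduces to a set lookup. The protected file names here are assumed; in the real hook a match makes the script exit with code 2, which tells Claude Code to block the Edit/Write call:

```python
from pathlib import PurePath

# Files Claude must never edit directly (assumed list; adjust per project).
PROTECTED = {"references.bib", "settings.json"}

def is_protected(file_path: str) -> bool:
    """Fast, mechanical check: is the edit target a protected file?

    In the actual PreToolUse hook, a True result would trigger
    sys.exit(2) with an explanation on stderr, blocking the tool call.
    """
    return PurePath(file_path).name in PROTECTED
```

No LLM call, no judgment: the check runs in microseconds on every Edit/Write, which is exactly why it belongs in a hook rather than a rule.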
4.7.5 Context Survival System (Advanced)
When context compaction happens, Claude loses working memory. The context survival system ensures you can recover seamlessly.
4.7.5.1 How It Works
Two hooks work together to preserve and restore state:
Session running → context fills up → PreCompact fires
↓
pre-compact.py saves:
• Active plan path
• Current task
• Recent decisions
↓
Auto-compaction happens
↓
SessionStart(compact|resume) fires
↓
post-compact-restore.py:
• Reads saved state
• Prints context summary
• Claude knows where it left off
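A sketch of the state snapshot pre-compact.py might build. The field names are illustrative, and the real hook also appends a compaction note to the session log:

```python
import json
import re

def first_open_task(plan_text: str) -> str:
    """Return the first unchecked '- [ ]' item in the plan, or '' if none."""
    m = re.search(r"^- \[ \] (.+)$", plan_text, flags=re.MULTILINE)
    return m.group(1).strip() if m else ""

def snapshot(plan_path: str, plan_text: str, log_entries: list) -> str:
    """Build the JSON blob a pre-compact hook would write to the session cache."""
    return json.dumps({
        "plan_path": plan_path,                 # so Claude can re-read the plan
        "current_task": first_open_task(plan_text),
        "recent_decisions": log_entries[-3:],   # last 3 decision-like entries
    })
```

The restore hook then only has to read this blob back and print it, which is all post-compact-restore.py needs to re-orient Claude.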
4.7.5.2 What Gets Saved
| State | Location | Purpose |
|---|---|---|
| Plan path | Session cache | So Claude can read the plan file |
| Current task | Session cache | First unchecked `- [ ]` item |
| Recent decisions | Session cache | Last 3 decision-like entries from session log |
| Compaction note | Session log | Timestamp marker for reference |
4.7.5.3 Context Monitoring
The context-monitor.py hook tracks approximate context usage and provides progressive warnings:
| Threshold | Message | Purpose |
|---|---|---|
| 40%, 55%, 65% | Suggest /learn | Capture non-obvious discoveries before compaction |
| 80% | Info message | Auto-compact approaching, no rush |
| 90% | Caution | Complete current task with full quality |
Use /context-status to check current session health at any time.
Note: the monitor uses tool call count as a proxy for context usage, so warnings may appear earlier or later than actual compaction.
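A sketch of the threshold logic. The calibration constant below is a made-up number, and the real context-monitor.py estimates usage differently and fires each warning only once per threshold crossing:

```python
# Hypothetical calibration: how many tool calls roughly fill the context window.
CALLS_AT_FULL_CONTEXT = 200

def warning_for(tool_calls: int) -> str:
    """Map an estimated context level to the progressive warning tiers above."""
    pct = 100 * tool_calls / CALLS_AT_FULL_CONTEXT
    if pct >= 90:
        return "caution: complete current task with full quality"
    if pct >= 80:
        return "info: auto-compact approaching, no rush"
    if pct >= 40:
        return "suggest /learn: capture discoveries before compaction"
    return ""
```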
4.7.5.4 Recovery After Compaction
If compaction happens mid-task, Claude will automatically see:
- Restoration message — what plan was active, what task was in progress
- Recovery actions — read the plan, check git status, continue
You can also manually point Claude to the right context:
“We just had compaction. Read quality_reports/plans/2026-02-06_translate-lecture5.md and continue from where we left off.”
5 Workflow Patterns
The first two patterns are meta-patterns — they govern how every task flows. Learn these first, then the specific workflows make more sense.
5.1 Pattern 1: Plan-First Development
The plan-first pattern ensures that non-trivial tasks begin with thinking, not typing.
5.1.1 Why Planning Matters
The most common failure mode in AI-assisted development is not bad code — it is solving the wrong problem, or solving the right problem in a fragile order. Plan-first development forces an explicit design step before any file is touched.
Without a plan:
- Claude starts editing immediately, discovers a dependency on slide 3 that changes the approach, and has to undo work
- Context compression discards the reasoning behind a design choice, and Claude makes a contradictory decision later
- The user and Claude have different mental models of what “done” looks like
With a plan:
- The approach is agreed upon before any edits happen
- The plan is saved to disk, so it survives compression and session boundaries
- Implementation has a checklist to follow, reducing drift
5.1.2 The Protocol
Non-trivial task arrives
|
+-- Step 1: Claude enters plan mode (automatic, or say "plan this first")
+-- Step 2: Draft plan (approach, files, verification)
+-- Step 3: Save to quality_reports/plans/YYYY-MM-DD_description.md
+-- Step 4: Present plan to user
+-- Step 5: User approves (or revises)
+-- Step 6: Save initial session log (capture context while fresh)
+-- Step 7: Orchestrator takes over (see Pattern 2)
+-- Step 8: Update session log + plan status to COMPLETED
5.1.3 Context Preservation
Plans are saved to disk specifically so they survive context compression. The rule: avoid /clear — prefer auto-compression. Use /clear only when context is genuinely polluted.
For details on how the system automatically preserves and restores context during compaction, see Context Survival System in the Building Blocks section.
5.1.4 Session Logging
Session logs (quality_reports/session_logs/YYYY-MM-DD_description.md) are a running record of why things happened. They have three distinct behaviors, each solving a different problem:
After plan approval — create the log with the goal, plan summary, and rationale for the chosen approach (including rejected alternatives). This captures decisions while context is richest. If you wait, auto-compression may discard the reasoning.
During implementation — append to the log as you work. Every time a design decision is made, a problem is discovered, or the approach deviates from the plan, write a 1-3 line entry immediately. This is the most important behavior: context gets compressed as the session progresses, and decisions that live only in the conversation will be lost.
At session end — add a final section with what was accomplished, open questions, and unresolved issues.
Git records what; session logs record why. A commit message says “Update Lecture 5 TikZ diagrams.” A session log says “Redesigned the TWFE decomposition diagram because the DA challenge revealed students couldn’t trace the path from weights to bias. Considered a table format but chose a flow diagram because it shows directionality.”
Incremental logging is the key. A 4-hour session that only logs at the start and end loses everything in the middle. Appending decisions as they happen means auto-compression can never erase them — they are already on disk.
Claude writes all three log entries automatically — no need to ask.
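The "append immediately" behavior is trivial to script. A sketch of a helper that writes one timestamped decision line to the session log; the entry format is an assumption, and the point is simply that the write hits disk the moment the decision is made:

```python
from datetime import datetime, timezone
from pathlib import Path

def append_log_entry(log_path: Path, entry: str) -> str:
    """Append a timestamped 1-3 line decision entry to a session log file.

    Because the entry is flushed to disk as it happens, auto-compression
    of the conversation can never erase it.
    """
    stamp = datetime.now(timezone.utc).strftime("%H:%M")
    line = f"\n- [{stamp}] {entry}\n"
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as f:
        f.write(line)
    return line
```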
For multi-project academics, start each week by asking Claude to read all session logs from the past week and synthesize a status report with priorities and open questions. The session log infrastructure already captures what you need — the weekly review is just a synthesis prompt: “Read all session logs from this week. Summarize: what was accomplished, what’s blocked, what should I prioritize next?”
5.2 Pattern 2: Contractor Mode (Orchestrator)
Once a plan is approved, the orchestrator takes over. It is the natural continuation of Pattern 1: the plan says what, the orchestrator handles how — autonomously.
5.2.1 The Mental Model
Think of the orchestrator as a general contractor. You are the client. You describe what you want. The plan-first protocol is the blueprint phase. Once you approve the blueprint, the contractor takes over: hires the right specialists (agents), inspects their work (verification), sends them back to fix issues (review-fix loop), and only calls you when the job passes inspection (quality gates).
5.2.2 The Loop
User: "Translate Lecture 5 to Quarto"
|
|-- Plan-first (Pattern 1): draft plan, save to disk, get approval
|
|-- User: "Approved"
|
+-- Orchestrator activates:
|
Step 1: IMPLEMENT
| Execute plan steps (create QMD, translate content, etc.)
|
Step 2: VERIFY
| Run verifier: render Quarto, check HTML output
| If render fails -> fix -> re-render
|
Step 3: REVIEW (agents selected by file type)
| +--- proofreader ------+
| +--- slide-auditor ----+ (parallel)
| +--- pedagogy-reviewer +
| +--- quarto-critic ----+ (needs others first)
|
Step 4: FIX
| Apply fixes: Critical -> Major -> Minor
| For quarto-critic issues: invoke quarto-fixer
|
Step 5: RE-VERIFY
| Render again, confirm fixes are clean
|
Step 6: SCORE
| Apply quality-gates rubric
|
+-- Score >= 80?
YES -> Present summary to user
NO -> Loop to Step 3 (max 5 rounds)
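Stripped of the agents themselves, the loop is plain control flow. A sketch with the reviewers and fixers as injectable callables; the 80-point gate and 5-round cap come from the loop above, everything else is simplified:

```python
PASS_SCORE = 80   # quality-gates threshold
MAX_ROUNDS = 5    # review-fix rounds before escalating to the user

def orchestrate(implement, verify, review, fix, score):
    """Run the implement -> verify -> review -> fix -> re-verify -> score loop."""
    implement()
    verify()                      # render/compile; the real loop retries on failure
    for round_ in range(1, MAX_ROUNDS + 1):
        issues = review()         # parallel agents in the real system
        fix(issues)               # Critical -> Major -> Minor
        verify()                  # re-verify after fixes
        s = score()
        if s >= PASS_SCORE:
            return {"status": "pass", "score": s, "rounds": round_}
    return {"status": "needs_attention", "score": s, "rounds": MAX_ROUNDS}
```

The design choice worth noting: verification runs both before and after fixes, so a "fix" that breaks the render can never slip through to scoring.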
5.2.3 Agent Selection
The orchestrator selects agents based on which files were touched:
| Files Modified | Agents Selected |
|---|---|
.tex only |
proofreader + slide-auditor + pedagogy-reviewer |
.qmd only |
proofreader + slide-auditor + pedagogy-reviewer |
.qmd with matching .tex |
Above + quarto-critic (parity check) |
.R scripts |
r-reviewer |
| TikZ diagrams present | tikz-reviewer |
| Domain content | domain-reviewer (if configured) |
| Multiple formats | verifier for cross-format parity |
Agents that are independent of each other run in parallel. The quarto-critic runs after other agents because it may need their context.
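A sketch of the selection logic implied by the table. The agent names are from the table; the dispatch rules are a simplification of what the orchestrator actually does:

```python
def select_agents(files):
    """Pick review agents based on which file types were touched."""
    exts = {f.rsplit(".", 1)[-1] for f in files if "." in f}
    agents = []
    if exts & {"tex", "qmd"}:
        agents += ["proofreader", "slide-auditor", "pedagogy-reviewer"]
    if "R" in exts:
        agents.append("r-reviewer")
    if "qmd" in exts and "tex" in exts:
        agents.append("quarto-critic")  # parity check; runs after the others
    return agents
```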
5.2.4 “Just Do It” Mode
Sometimes you do not want to approve the final result — you just want it done:
“Translate Lecture 5 to Quarto. Just do it.”
In this mode, the orchestrator still runs the full verify-review-fix loop (quality is non-negotiable), but skips the final approval pause and auto-commits if the score is 80 or above. It still presents the summary so you can see what was done.
5.2.5 Relationship to Existing Skills
The orchestrator does NOT replace skills. It coordinates them:
- /qa-quarto remains available as a standalone adversarial QA loop
- /slide-excellence remains available for comprehensive multi-agent review
- /create-lecture remains available as a guided creation workflow
The difference: when you invoke a skill directly, it runs its specific workflow. When the orchestrator is active, it decides which agents to invoke based on context. The orchestrator is the default; skills are for targeted use.
Orchestrator (automatic): “Translate Lecture 5 to Quarto” — the orchestrator figures out the agents.
Skill (explicit): “/qa-quarto Lecture5” — you specifically want the adversarial critic-fixer loop, nothing else.
Both are valid. The orchestrator is the “I trust you, handle it” path. Skills are the “I know exactly what I want” path.
5.3 Pattern 3: Creating a New Lecture
The /create-lecture skill guides you through a structured lecture creation workflow — from gathering source material to deploying polished slides:
/create-lecture
|
+-- Phase 1: Gather materials (papers, outlines)
+-- Phase 2: Design slide structure
+-- Phase 3: Draft Beamer slides
+-- Phase 4: Generate R figures
+-- Phase 5: Polish and verify
| +-- /slide-excellence (domain + visual + pedagogy)
| +-- /proofread (grammar/typos)
| +-- /visual-audit (layout)
+-- Phase 6: Deploy
+-- /translate-to-quarto (optional)
+-- /deploy
5.4 Pattern 4: Translating Beamer to Quarto
Translation preserves all content while adapting format, converting TikZ to SVG and ggplot to interactive Plotly charts:
/translate-to-quarto Lecture5_Topic.tex
|
+-- Phase 1-3: Environment mapping + content translation
+-- Phase 4-5: Figure conversion (TikZ -> SVG)
+-- Phase 6-7: Interactive charts (ggplot -> plotly)
+-- Phase 8-9: Render + verify
+-- Phase 10-11: /qa-quarto adversarial QA
+-- Critic: finds issues
+-- Fixer: applies fixes
+-- Critic: re-audits
+-- ... (until APPROVED or 5 rounds)
5.5 Pattern 5: Replication-First Coding
When working with papers that have replication packages:
Phase 1: Inventory original code
+-- Record "gold standard" numbers (Table X, Column Y = Z.ZZ)
Phase 2: Translate (e.g., Stata -> R)
+-- Match original specification EXACTLY (same covariates, same clustering)
Phase 3: Verify match
+-- Compare every target: paper value vs. our value
+-- Tolerance: < 0.01 for estimates, < 0.05 for SEs
+-- If mismatch: STOP. Investigate before proceeding.
Phase 4: Only then extend
+-- New estimators, new specifications, course-specific figures
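The Phase 3 comparison is mechanical enough to script. A sketch using the tolerances stated above; the (kind, paper_value, our_value) tuple format is illustrative:

```python
# Tolerances from the protocol: < 0.01 for estimates, < 0.05 for standard errors.
TOL = {"estimate": 0.01, "se": 0.05}

def check_replication(targets):
    """targets: list of (kind, paper_value, our_value). Returns mismatches.

    A non-empty result means STOP: investigate before extending the code.
    """
    mismatches = []
    for kind, paper, ours in targets:
        if abs(paper - ours) >= TOL[kind]:
            mismatches.append((kind, paper, ours))
    return mismatches
```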
In one course, we discovered that a widely used R package silently produced incorrect estimates due to a subtle specification issue; the same bug surfaced three times across different scripts. Without the replication-first protocol, those wrong numbers would have been taught to PhD students.
5.6 Pattern 6: Multi-Agent Review
The /slide-excellence skill runs up to 6 agents in parallel:
/slide-excellence Lecture5_Topic.tex
|
+-- Agent 1: Visual Audit (slide-auditor)
+-- Agent 2: Pedagogical Review (pedagogy-reviewer)
+-- Agent 3: Proofreading (proofreader)
+-- Agent 4: TikZ Review (tikz-reviewer, if applicable)
+-- Agent 5: Content Parity (if Quarto version exists)
+-- Agent 6: Substance Review (domain-reviewer)
|
+-- Synthesize: Combined quality score + prioritized fix list
5.7 Pattern 7: Self-Improvement Loop
There are two levels of self-improvement: quick corrections via [LEARN] tags and full skill extraction via /learn.
5.7.2 Automated Skill Capture: /learn
For discoveries that deserve more than a one-line tag, use /learn to create a full skill:
/learn fixest-missing-covariate-handling
The /learn skill guides you through a 4-phase workflow:
Phase 1: EVALUATE
"Was this non-obvious? Would future-me benefit?"
→ If YES to any, continue
↓
Phase 2: CHECK EXISTING
Search .claude/skills/ for related skills
→ Nothing related? Create new. Overlap? Update existing.
↓
Phase 3: CREATE SKILL
Write to .claude/skills/[name]/SKILL.md
• Problem statement
• Trigger conditions (exact errors, symptoms)
• Step-by-step solution
• Verification steps
↓
Phase 4: QUALITY GATE
• Description has specific triggers?
• Solution verified to work?
• Specific enough to be actionable?
• General enough to be reusable?
5.7.2.1 When to Use /learn
The context monitor suggests /learn at 40%, 55%, and 65% context usage. Consider extracting a skill when you encounter:
| Trigger | Example |
|---|---|
| Non-obvious debugging | 10+ minute investigation not in docs |
| Misleading errors | Error message was wrong, found real cause |
| Workarounds | Found limitation with creative solution |
| Undocumented APIs | Tool integration not in official docs |
| Trial-and-error | Multiple attempts before success |
| Repeatable workflows | Multi-step task you’d do again |
5.7.2.2 Skill vs. [LEARN] Tag
| Situation | Use |
|---|---|
| One-liner fix | [LEARN:category] tag in MEMORY.md |
| Multi-step workflow | /learn to create full skill |
| Error + root cause + solution | /learn if reusable, [LEARN] if not |
| Package quirk | /learn if affects multiple projects |
Skills saved to .claude/skills/ survive compaction and session boundaries — if you discover something valuable late in a session, extract it with /learn before compaction erases the details.
5.8 Pattern 8: Devil’s Advocate
At any design decision, invoke the Devil’s Advocate:
“Create a Devil’s Advocate. Have it challenge this slide design with 5-7 specific pedagogical questions. Work through each challenge and tell me what survives.”
This catches:
- Unstated assumptions
- Alternative orderings that might work better
- Notation that could confuse students
- Missing intuition before formalism
- Cognitive load issues
A stronger variant: when Claude reviews its own work in the same conversation, it suffers confirmation bias — it has internalized its own reasoning and will systematically find the work acceptable. The fix: spawn a new agent via the Task tool with NO access to the original conversation. Give it only the artifact and a critique prompt. The fresh agent has no sunk cost in the work and will be ruthless.
“Spawn a new agent. Have it read only my paper draft — not our conversation. Ask it to find the 5 weakest points and suggest how a hostile referee would attack each one.”
Like handing your draft to a colleague who wasn’t in the room when you wrote it.
5.9 Research Workflows
Patterns 1–8 apply broadly, with course materials as the primary example. The next four patterns are designed for research projects — papers, simulations, and empirical analysis — where the rhythm is different: ideas are uncertain, experiments may fail, and code is often written to answer a question rather than to ship. Patterns 13–14 then extend the foundation to reproducibility standards and presentation rhetoric.
5.9.1 Pattern 9: Parallel Agents for Research Tasks
Claude Code can spawn multiple agents simultaneously using the Task tool. This is not limited to review — you can use it for any research or analysis task where independent subtasks can run at the same time.
5.9.1.1 When to Use Parallel Agents
| Scenario | Sequential (slow) | Parallel (fast) |
|---|---|---|
| Reviewing a lecture | Run proofreader, then auditor, then pedagogy | Run all 3 simultaneously |
| Analyzing 3 papers for a new lecture | Read paper 1, then 2, then 3 | Spawn 3 agents, each reading one paper |
| Generating figures | Create plot 1, then plot 2, then plot 3 | Spawn agents for independent plots |
| Comparing estimators | Run simulation 1, then 2, then 3 | Spawn agents for each simulation |
| Debating research design | Consider DiD, then SC, then RDD | 3 agents, each advocating one approach |
5.9.1.2 How It Works
You do not need to manage this manually. The orchestrator recognizes independent subtasks in a plan and spawns parallel agents automatically — both during implementation (Step 1) and review (Step 3). For example, if your plan says “read three papers and extract key results,” the orchestrator will spawn 3 agents, one per paper, without you asking.
You can also request parallelism explicitly:
“Read these three papers in parallel. For each, extract the key identification assumption, the main estimator, and whether they have a replication package. Summarize in a table.”
Either way, Claude spawns up to 3 Task agents, each processing one paper simultaneously, then synthesizes the results.
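The fan-out/fan-in shape is the same as any parallel map. A Python analogy, with the `summarize` stand-in as a hypothetical placeholder (real agents are Claude Task invocations, not local functions):

```python
from concurrent.futures import ThreadPoolExecutor

def summarize(paper: str) -> str:
    # Stand-in for one Task agent reading one paper.
    return f"summary of {paper}"

def fan_out(papers, worker=summarize, max_agents=3):
    """Process independent inputs concurrently, then collect for synthesis."""
    with ThreadPoolExecutor(max_workers=max_agents) as pool:
        return list(pool.map(worker, papers))  # results keep input order
```

The `max_agents=3` cap mirrors the practical limit discussed below: more workers add coordination overhead without proportional speedup.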
5.9.1.3 Agent Debates
A powerful variant: give each parallel agent a distinct methodological perspective and have them argue. Instead of asking “which estimator should I use?”, spawn 3 agents — one advocates for DiD, one for synthetic control, one for RDD — each arguing why their approach fits your research question best and critiquing the others. Synthesize the debate into a decision matrix. This produces genuinely diverse perspectives that a single conversation cannot, because each agent commits fully to its position.
5.9.1.4 Practical Limits
- 3 agents is the sweet spot. More than that increases overhead without proportional speedup.
- Agents are independent — they cannot see each other’s work. If task B depends on task A’s output, they must run sequentially.
- Each agent consumes its own context window. For very large files, sequential processing may be more reliable.
Parallel agents multiply token usage. For cost-sensitive tasks, run the expensive work (Opus agents) sequentially and the cheap work (Sonnet agents) in parallel. The orchestrator already does this: it runs Sonnet-level reviewers in parallel, then the Opus-level critic sequentially.
5.9.2 Pattern 10: Research Exploration Workflow
The exploration workflow provides a structured sandbox for experimental work.
5.9.2.1 The Problem
Without structure, experimental code scatters across the repository: analysis scripts in scripts/, test files in root, comparison documents in quality_reports/. After a week of exploration, the repo is cluttered with files that may or may not be useful, and nobody remembers which version was the good one.
5.9.2.2 The Solution: Exploration Folder
All experimental work goes into explorations/ first:
explorations/
├── [active-project]/
│ ├── README.md # Goal, hypotheses, status
│ ├── R/ # Code iterations (_v1, _v2)
│ ├── scripts/ # Test scripts
│ └── output/ # Results
└── ARCHIVE/
├── completed_[name]/ # Graduated to production
└── abandoned_[name]/ # Documented why stopped
5.9.2.3 Fast-Track vs. Plan-First
The decision tree is simple:
| Question | Answer | Workflow |
|---|---|---|
| “Will this ship?” | YES | Plan-First (80/100 quality) |
| “Am I testing an idea?” | YES | Fast-Track (60/100 quality) |
| “Does this improve the project?” | NO | Don’t build it |
Fast-Track explorations skip formal planning. Instead, a 2-minute research value check gates the work: “Does this improve the paper/slides/analysis?” If the answer is “maybe”, explore. If “no”, skip. If “yes”, use Plan-First rigor.
5.9.2.4 The Lifecycle
Research value check (2 min)
↓
Create explorations/[project]/ (5 min)
↓
Code without overhead (60/100 quality)
↓
Decision point (1-2 hours):
├── Graduate → Move to R/, scripts/, tests/ (upgrade to 80/100)
├── Keep exploring → Stay in explorations/
└── Abandon → Archive with brief explanation
The kill switch is explicit: at any point, you can stop, archive with a one-paragraph explanation, and move on. No guilt, no sunk cost. See .claude/rules/exploration-folder-protocol.md and .claude/rules/exploration-fast-track.md for the full protocols.
5.9.2.5 Simplified Orchestrator for Research
The full orchestrator (Pattern 2) is designed for course materials with multi-agent review loops. For research projects, the simple variant strips this down to: implement → verify → score → done. No multi-round reviews, no parallel agent spawning. This lives in its own path-scoped rule (.claude/rules/orchestrator-research.md) that loads only when working on R scripts or explorations.
5.9.2.6 Merge-Only Quality Reporting
In research projects, commits are frequent and incremental. Generating a quality report for each commit creates noise. Instead, quality reports are generated only at merge time — a permanent snapshot of what was merged and why. Session logs capture the ongoing reasoning. See .claude/rules/session-logging.md.
5.9.3 Pattern 11: Research Skills
Five skills support the research workflow beyond slide development:
| Skill | What It Does | When to Use |
|---|---|---|
| /lit-review [topic] | Search, synthesize, and identify gaps in the literature | Starting a new project or section |
| /research-ideation [topic] | Generate research questions, hypotheses, and empirical strategies | Brainstorming phase |
| /interview-me [topic] | Interactive interview to formalize a vague idea into a concrete specification | When you have an intuition but not a plan |
| /review-paper [file] | Full manuscript review with referee objections | Before submission or after a draft |
| /data-analysis [data] | End-to-end R analysis: explore, regress, produce publication-ready output | Empirical analysis phase |
These skills produce structured reports saved to quality_reports/. The /data-analysis skill also generates R scripts (saved to scripts/R/) and runs the r-reviewer agent automatically.
5.9.3.1 The Research Lifecycle as a Dependency Graph
A research project is not a waterfall — it is a dependency graph. Some phases run in parallel; others are strictly sequential:
/research-ideation ──┐
                     ├──→ /lit-review ──→ /data-analysis ──→ /review-paper
/interview-me ───────┘                          ↑                  ↑
(can run in parallel)                           │                  │
                                      enter mid-pipeline here:
                                      start with data (or a draft)
                                      and work backwards
Enter mid-pipeline. You do not have to start from ideation. If you already have data, start with /data-analysis and work backwards to the research question. If you already have a draft, start with /review-paper. The skills are modular — use what you need, skip what you don’t.
For a production-grade paper pipeline, a dedicated fork takes these same skills and wraps them in full research infrastructure: 15 adversarial worker-critic agent pairs, simulated blind peer review, weighted aggregate scoring, journal targeting, and R&R response routing. If your primary output is research papers, see The Ecosystem for details.
5.9.4 Pattern 12: Branch Isolation with Git Worktrees (Advanced)
This pattern is optional and primarily useful for major translations, risky refactors, or multi-day projects. Most day-to-day work doesn’t need it.
Git worktrees create a separate working directory linked to the same repository. Each directory has its own branch but shares commit history.
your-project/ ← main branch (stays clean)
.worktrees/lecture-06-quarto/ ← isolated branch (Claude works here)
5.9.4.1 Why Use Worktrees?
| Benefit | Example |
|---|---|
| Safe experimentation | Translate Lecture 6 to Quarto — if it fails, main is untouched |
| Clean history | 50 intermediate commits squash into one clean commit |
| Easy discard | Wrong approach? Delete worktree, no trace in main |
| Multi-session work | Resume worktree next day, no context loss |
| Parallel work | Work on slides (main) while Claude translates (worktree) |
5.9.4.2 The Workflow
1. CREATE WORKTREE
git worktree add .worktrees/lecture-06-quarto -b quarto/lecture-06
cd .worktrees/lecture-06-quarto
↓
2. IMPLEMENT
All changes happen in the worktree
Commit frequently (intermediate commits are OK)
↓
3. VERIFY
Run tests, render, review against worktree only
↓
4. SYNC TO MAIN (when ready)
git checkout main
git merge --squash quarto/lecture-06
git commit -m "feat: add Lecture 6 Quarto version"
↓
5. CLEANUP
git worktree remove .worktrees/lecture-06-quarto
git branch -d quarto/lecture-06
5.9.4.3 Commands Reference
# Create a worktree with new branch
git worktree add .worktrees/[name] -b [branch-name]
# List active worktrees
git worktree list
# Remove a worktree (after merging or abandoning)
git worktree remove .worktrees/[name]
# Delete the branch (after removal)
git branch -d [branch-name]
# Squash-merge into main
git checkout main
git merge --squash [branch-name]
git commit -m "feat: description of changes"
5.9.4.4 When to Use
| Situation | Use Worktree? |
|---|---|
| Quick fix to one file | No — just edit main |
| New lecture creation | Maybe — if multi-session |
| Beamer → Quarto translation | Yes — many intermediate states |
| Major refactor | Yes — safe rollback |
| Experimenting with new approach | Yes — easy discard |
5.9.4.5 Complexity Cost
- Adds ~3 commands to learn
- Adds mental model: “Where am I working?”
- Requires discipline to sync/discard, not leave orphan worktrees
For most novice users, working directly on main with frequent commits is simpler and sufficient. Use worktrees when the benefits of isolation outweigh the added complexity.
5.10 Advanced Patterns: Reproducibility and Presentation Design
The patterns above use slides as the primary example, but the infrastructure is domain-agnostic. The next two patterns address dimensions no existing pattern covers: reproducibility standards and presentation rhetoric.
5.10.1 Pattern 13: Reproducibility & Replication Compliance
Pattern 5 covers matching someone else’s results before extending them. This pattern is the complement: packaging your own work so that others — and journal data editors — can verify it.
5.10.1.1 The AEA Data Editor Standard
The Template README for Social Science Replication Packages is the compliance standard for the AEA, Review of Economic Studies, Economic Journal, and other major journals. It requires eight structured sections:
| Section | What It Covers |
|---|---|
| Overview | What the code does, data sources, software, runtime |
| Data Availability Statements | Provenance, access rights, redistribution permissions for every data source |
| Dataset List | Every data file: source, format, whether provided |
| Computational Requirements | Software versions, packages, random seeds, memory, runtime |
| Description of Programs | Directory structure, execution order, dependencies |
| Instructions to Replicators | Numbered steps — ideally one command |
| Table/Figure Mapping | Every exhibit mapped to the specific program and line that generates it |
| References | Proper bibliographic citations for all data sources |
5.10.1.2 Pre-Submission Checklist
Documentation:
Code:
Data:
Verification:
5.10.1.3 Recommended Directory Structure
project/
├── README.pdf # Following AEA template
├── LICENSE.txt # Code: MIT/BSD; Data: CC-BY
├── data/
│ ├── raw/ # Source data (untouched)
│ └── derived/ # Processed/analysis data
├── code/
│ ├── 00_setup.R # Install dependencies
│ ├── 01_dataprep/ # Data cleaning
│ ├── 02_analysis/ # Main results
│ └── 03_appendix/ # Appendix results
└── results/ # Output tables, figures
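The "ideally one command" entry point from the Instructions-to-Replicators row can be as simple as a script that walks code/ in numbered order and fails fast. A sketch in Python, assuming the directory layout above (a real package might equally use an R master script or a Makefile):

```python
import subprocess
from pathlib import Path

def ordered(paths):
    """Execution order implied by the numeric prefixes (plain lexicographic sort)."""
    return sorted(paths, key=str)

def run_all(code_dir: str = "code") -> None:
    """Run every R script under code/ in order, stopping at the first failure."""
    for script in ordered(Path(code_dir).rglob("*.R")):
        # check=True raises on a non-zero exit, so a broken step is never
        # silently skipped and the replicator sees exactly where it failed.
        subprocess.run(["Rscript", str(script)], check=True)
```

Fail-fast matters for compliance: a replication package that limps past a broken step produces tables that do not match the paper, with no error to show why.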
The orchestrator applies these standards automatically: ask Claude to “prepare a replication package” and the replication-protocol rule activates, enforcing the directory structure and checklist above. No manual invocation needed — the path-scoped rule fires whenever Claude works on replication-related files.
A key principle that maps directly to this workflow: separate scientific reasoning from computational execution. Humans design diagnostic templates (what to measure); AI handles execution (how to run it).
This is the template-executor architecture — and you are already using it:
- Your spec (requirements specification) = the template. It says what must be true.
- The orchestrator = the executor. It handles how to make it true.
- Plans = why decisions were made (audit trail)
- Session logs = reasoning documentation
- Git = what changed and when
- MEMORY.md = corrections and accumulated learning
The /learn skill already implements version-controlled knowledge accumulation: each discovery saved as a SKILL.md file with problem, solution, and verification steps — the same pattern sometimes called “structured knowledge bases.”
Key insight: “For a fixed pipeline version and fixed inputs, the workflow produces identical numerical outputs and retains a complete audit trail of intermediate artifacts and logs.”
5.10.2 Pattern 14: The Rhetoric of Decks
The slide-auditor checks technical quality (overflow, spacing). The pedagogy-reviewer checks teaching quality (notation density, prerequisites). Neither addresses rhetorical quality — whether the slides persuade, whether the argument flows, whether beauty serves function.
The Rhetoric of Decks framework fills this gap.
5.10.2.1 The Three Laws
Law 1: Beauty is function. Beautiful slides are not decorated slides. Beauty is clarity made visible. Every element earns its presence. Nothing distracts from the point. “Decoration without function is noise.”
Law 2: Cognitive load is the enemy. One idea per slide. ONE. This is not a guideline — this is the law. The audience has limited working memory. Every unnecessary word, data point, or “just in case” inclusion steals bandwidth from the actual message.
Law 3: The slide serves the spoken word. “If your slides can be understood without you speaking, you have written a document and called it a presentation.” The slide is a visual anchor for speech — a focal point for attention, a memory hook for retention.
5.10.2.2 The MB/MC Equivalence
The most original contribution of this framework — applying marginal analysis to slide design:
Optimal rhetoric equalizes the marginal benefit to marginal cost ratio across all slides: MB₁/MC₁ = MB₂/MC₂ = … = MBₙ/MCₙ
What this means in practice:
- Overloaded slides (MB/MC too low): text running into footer, competing ideas, audience gives up
- Underloaded slides (MB/MC too high): wasted opportunity, attention captured but unused
- The goal is smoothness — consistent cognitive load throughout — not maximum density
- Exception: deliberate “jump scares” — intentional spikes for rhetorical effect (a striking statistic, a provocative claim)
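One way to see where the equalization condition comes from is the standard constrained-optimization argument: choose each slide's content load to maximize total benefit under a fixed attention budget. A sketch, where the functions B_i, C_i and the budget C̄ are illustrative notation rather than part of the original framework:

```latex
\max_{x_1,\dots,x_n}\; \sum_{i=1}^{n} B_i(x_i)
\quad \text{s.t.} \quad \sum_{i=1}^{n} C_i(x_i) \le \bar{C}
\qquad\Longrightarrow\qquad
\frac{B_i'(x_i)}{C_i'(x_i)} = \lambda \quad \text{for all } i
```

At an interior optimum the first-order conditions force every slide's MB/MC ratio to equal the same shadow price λ of audience attention, which is exactly the smoothness condition above.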
5.10.2.3 Actionable Principles
| Principle | Why | Anti-Pattern |
|---|---|---|
| Titles are assertions | “Treatment increased distance by 61 miles” carries the argument | “Results” tells the audience nothing |
| Bullets are defeat | A list says “I couldn’t find the structure” | Find the sequence, contrast, hierarchy, or causal chain |
| White space signals confidence | Crowded slides signal fear — fear of silence, fear of forgetting | Filling every pixel with text |
| Direct labels, not legends | Legends force the eye to travel; labels stay with the data | Color-coded legends requiring a key |
| One message per chart | If you can’t explain it in one sentence, it’s too complex | Multi-panel figures with competing stories |
| Min 24pt body, max 2 fonts | Sans-serif for projection; test from the back row | 12pt text, decorative fonts |
5.10.2.4 How Existing Agents Support This
The /slide-excellence skill already invokes the pedagogy-reviewer and slide-auditor, which enforce many of these principles automatically. To enforce all of them — including title-as-assertion and MB/MC smoothness — customize the domain-reviewer agent (.claude/agents/domain-reviewer.md) with rhetoric-of-decks lenses. The orchestrator will then apply them during every review cycle without manual invocation.
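As a concrete sketch, a rhetoric lens added to that agent file might look like the following (the heading and check wording are illustrative assumptions, not the template's actual contents):

```markdown
## Lens: Rhetoric of Decks (illustrative addition)
- Titles are assertions: flag any frame title that is a label ("Results")
  rather than a claim.
- One idea per slide: flag frames carrying two or more competing messages.
- MB/MC smoothness: flag slides markedly denser or sparser than their
  neighbors, unless marked as a deliberate rhetorical spike.
```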
For the complete philosophical treatment — from Aristotle’s three modes of persuasion through neuroaesthetics and the Netflix analogy — see The Rhetoric of Decks. The repository includes a full essay, example Beamer decks with professional color palettes, a theme_rhetoric() ggplot2 theme, and a tested deck generation prompt for Claude Code.
6 The Ecosystem: What Others Have Built
This repository provides the foundation — the infrastructure patterns (plan-first, orchestrator, quality gates, adversarial review, context survival) that work for any academic task. Others have taken these patterns further, building specialized workflows for specific needs. Here are the principles these projects share and how to apply them:
| Principle | Source | How to Implement Here |
|---|---|---|
| Adversarial review (not self-review) | All | Use fresh-context critique (Pattern 8) or worker-critic pairs |
| Structured intermediate files | Xu & Yang | Save every computed object to disk; agents communicate via files |
| Phase-appropriate rigor | clo-author | Light review for exploration (60/100), full adversarial for submission (95/100) |
| Voice preservation | claudeblattman | Maintain a reference doc with your writing style; load as context |
| Template-executor separation | Xu & Yang | Spec = what to measure, orchestrator = how to execute |
| Self-improving configuration | claudeblattman | Use /learn to capture discoveries; review MEMORY.md periodically |
| Human judgment, AI execution | Xu & Yang | You design the diagnostic; Claude runs it |
| Beauty is function | MixtapeTools | Every visual element earns its presence; decoration without function is noise |
Here is what each project does and when you should use it.
6.2 claudeblattman: Workflows for Non-Technical Academics
Website: claudeblattman.com
Repository: chrisblattman/claudeblattman
Author: Chris Blattman (University of Chicago)
claudeblattman is a comprehensive guide for academics who do not write code, built by a political economist who describes himself as someone who “has never written a line of code.” It demonstrates that Claude Code workflows extend far beyond technical tasks into daily academic life.
What it adds:
- Executive assistant workflows — morning briefings (weather, calendar, inbox, VIP tracking), smart email triage with 14 phases, daily check-in ritual, schedule queries, todo management
- Proposal writing — donor profiles, voice packs (maintain consistent writing style across documents), template gates, resubmission handling with reviewer comment categorization
- Fresh-context critique — the intellectual centerpiece: spin up a fresh-context agent to review your work without self-bias (see Pattern 8)
- Agent debates — multiple agents with distinct identities argue about research design, producing genuinely novel perspectives (see Pattern 9)
- Tips pipeline — self-improving system: capture tips by emailing yourself, /tips-curate quality-filters them, /tips-integrate converts them into concrete configuration changes
- Depth calibration — Light/Standard/Deep thoroughness levels that prevent over-engineering simple requests
- Graceful degradation — every skill works with partial infrastructure. Missing MCP integrations produce explanations, not errors
- Writing style rules — numbers over adjectives, topic sentences make claims, no throat-clearing, hedge only with a reason or number
When to use it: You are new to Claude Code, want practical daily workflows beyond coding, or want to see how an academic non-programmer built a sophisticated system.
6.3 Xu & Yang (2026): Reproducibility as Architecture
Paper: Yiqing Xu and Leo Yang Yang, “Scaling Reproducibility: An AI-Assisted Workflow for Large-Scale Reanalysis,” Stanford University, 2026.
This paper formalizes many principles that this workflow uses intuitively. It demonstrates an AI-assisted pipeline that achieved 100% reproducibility across 92 papers (215 specifications) — conditional on accessible data and code — with each paper processed in under four minutes.
Key principles:
- Template-executor separation — humans design diagnostic templates (what to measure), AI handles execution (how to run it). Maps to our spec-then-plan workflow.
- Three-layer architecture — LLM orchestrator (coordination) → skill descriptions and knowledge bases (contracts and accumulated experience) → deterministic agent code (numerical work). Maps to our orchestrator → skills/rules → agents.
- Structured intermediate files — agents communicate through standardized files on disk (JSON, CSV, logs), not hidden state. Ensures every step is inspectable and rerunnable.
- Version-controlled knowledge accumulation — SKILL.md files with Context/Problem/Fix/Impact format. Maps to our /learn skill.
- Adaptation between runs, not during runs — fixes are incorporated as version-controlled updates between sessions, never as ad hoc patches within a session. This ensures reproducibility.
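The structured-intermediate-files principle can be sketched in a few lines of shell — two pipeline steps that communicate only through files on disk (the file names and JSON fields here are hypothetical, not from the paper's pipeline):

```shell
mkdir -p run_artifacts

# Step 1 ("estimation agent"): write its result as structured JSON,
# not hidden in-memory state
printf '{"spec": "baseline", "coef": 0.61, "se": 0.12}\n' \
  > run_artifacts/estimate.json

# Step 2 ("reporting agent"): read the file and append a log line;
# both artifacts stay on disk, so every step is inspectable and rerunnable
coef=$(grep -o '"coef": [0-9.]*' run_artifacts/estimate.json | cut -d' ' -f2)
echo "baseline coefficient: $coef" >> run_artifacts/pipeline.log
```

Because each step reads and writes named files, any step can be rerun in isolation and its inputs diffed across runs.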
When to reference it: You are designing a reproducibility workflow, building a replication package, or want to formalize the principles underlying this guide’s architecture.
6.4 MixtapeTools: The Rhetoric of Decks
Repository: scunning1975/MixtapeTools
Author: Scott Cunningham (Baylor University), author of Causal Inference: The Mixtape
MixtapeTools provides the philosophical and practical framework for academic presentation design (see Pattern 14). Beyond the Rhetoric of Decks, it includes:
- Referee 2 — a systematic 5-audit adversarial protocol for reviewing and replicating empirical work
- Deck generation prompt — a tested, customizable multi-agent prompt for creating Beamer decks (builder → rhetoric reviewer → graphics specialist)
- Example decks with professional color palettes, custom ggplot2 themes (theme_rhetoric()), and complete Beamer templates
- Zero-warning compilation standard — even a 0.5pt overfull hbox must be fixed
When to use it: You want to make your presentations genuinely beautiful and rhetorically effective, or you want a tested deck generation workflow for Claude Code.
6.5 AEA Data Editor Template
Website: social-science-data-editors.github.io/template_README
Repository: social-science-data-editors/template_README
Maintainer: Lars Vilhuber (Cornell University) and editors from REStat, EJ, CJE
The compliance standard for replication packages at 5+ major economics journals (see Pattern 13). Available in Markdown, Word, LaTeX, and PDF formats.
When to use it: You are preparing a replication package for journal submission and need the exact template that data editors will check against.
7 Customizing for Your Domain
7.1 Step 1: Build Your Knowledge Base
The knowledge base (.claude/rules/knowledge-base-template.md) is the most domain-specific component. It provides skeleton tables for notation conventions, lecture progression, applications, design principles, anti-patterns, and R code pitfalls. Fill them in as you develop your project — you don’t need everything upfront.
7.1.1 Notation Registry
| Symbol | Meaning | Introduced | Anti-Pattern |
|--------|---------|------------|-------------|
| $\beta$ | Regression coefficient | Lecture 1 | Don't use $b$ |
| $\hat{\theta}$ | Estimator | Lecture 2 | Don't use $\hat{\beta}$ for a different estimand |

7.1.2 Applications Database
| Application | Paper | Dataset | Package | Lecture |
|------------|-------|---------|---------|--------|
| Minimum Wage | Card & Krueger (1994) | NJ/PA fast food | `fixest` | 3 |

7.1.3 Validated Design Principles
| Principle | Evidence | Lectures Applied |
|-----------|----------|-----------------|
| Motivation before formalism | DA challenge: "students lost" | All |
| Max 3 new symbols per slide | Pedagogy review caught overload | 2, 4 |

7.2 Step 2: Create Your Domain Reviewer
Copy .claude/agents/domain-reviewer.md and customize the 5 lenses for your field. The template provides the structure; you fill in domain-specific checks.
7.3 Step 3: Adapt Your Theme
The template includes an example Quarto theme SCSS file. To customize:
- Change the color palette to your institution’s colors
- Update CSS class names if needed
- Modify the beamer-translator environment mapping to match your classes
7.4 Step 4: Creating Custom Skills
The guide includes 22 skills for common academic tasks. But if you have repetitive workflows specific to your domain, you can create your own.
7.4.1 When to Create a Skill
Create a skill when:

- You repeatedly explain the same 3+ step workflow to Claude
- You need domain-specific quality checks (citation style, notation consistency, lab protocols)
- You enforce field-specific output formats (thesis structure, journal templates)
- You coordinate multi-tool workflows (data → analysis → manuscript)

Don't create a skill for:

- One-time tasks
- Workflows that change frequently
- Simple 1-2 step operations
7.4.2 Skill Structure
Each skill is a directory in .claude/skills/ with a SKILL.md file:
```markdown
---
name: your-skill-name
description: [What it does] + [When to use] + [Key capabilities]
argument-hint: "[brief hint for user]"
allowed-tools: ["Read", "Write", "Edit", "Bash", "Task"]
---

# Your Skill Name

## Instructions

Step 1: [First action with details]
Step 2: [Second action]
...

## Examples

Example 1: [Common scenario]
...

## Troubleshooting

Error: [Common error]
Solution: [How to fix]
```

7.4.3 Writing Effective Trigger Descriptions
The description field determines when Claude loads your skill. Use specific trigger phrases users would actually say:
Good (Citation Style Enforcement):
description: Enforces APA 7th edition citation format. Use when user asks to “check citations”, “fix references”, “apply APA style”, or when reviewing .tex/.qmd files with bibliographies.

Good (Lab Notebook Entry):

description: Generates structured lab notebook entries from experimental notes. Use when user provides “experiment notes”, “protocol results”, or asks to “format lab entry”.

Bad (Too Vague):

description: Helps with citations

7.4.4 Domain-Specific Examples
Regression Output Formatter
Converts R regression outputs to publication-ready LaTeX tables with proper formatting (standard errors in parentheses, significance stars, fixed effects rows).
Trigger: User runs regressions and says “make a table”, “format results”, “export to LaTeX”
Tools: Read, Write, Bash (to run R scripts)
Protocol Validator
Validates lab protocols against safety and reproducibility standards. Checks for: required sections (materials, procedure, safety), quantitative specifications, controls, and replication details.
Trigger: User provides protocol documents, asks “check protocol”, “validate procedure”
Tools: Read, Write
Citation Cross-Reference Checker
Cross-references in-text citations against bibliography entries. Identifies missing entries, unused references, and formatting inconsistencies.
Trigger: User asks “check citations”, “validate references”, when working on manuscripts
Tools: Read, Grep, Glob, Write
7.4.5 Quick Start
Copy the template:
mkdir -p .claude/skills/your-skill-name
cp templates/skill-template.md .claude/skills/your-skill-name/SKILL.md

Customize for your domain:
- Replace trigger phrases with your field’s terminology
- Add domain-specific file types and tools
- Include field conventions and common errors
Test the skill:
- Skills hot-reload automatically — changes are detected without restarting
- Use one of your trigger phrases
- Verify the skill loads and produces correct output
Iterate:
- If skill doesn’t trigger: Revise description with more specific phrases
- If instructions unclear: Add more examples
- If output wrong: Add validation steps
Full template: See templates/skill-template.md for comprehensive examples from biology, economics, and physics.
The /deep-audit skill was itself extracted from a repeating workflow using /learn. After running 7 rounds of manual consistency audits — each time launching 4 parallel agents to check guide accuracy, hook code quality, skills/rules consistency, and cross-document counts — the pattern was codified into a skill. Now /deep-audit launches those same 4 agents, triages findings, applies fixes, and loops until clean (max 5 rounds). It also encodes a table of known bug patterns from past audits so future rounds catch regressions faster.
This is the /learn lifecycle in action: discover a repeating workflow → extract it → never repeat the manual steps again.
7.5 Tips from 6+ Sessions of Iteration
- Keep CLAUDE.md under 150 lines. Claude follows ~150 instructions reliably. A 400-line CLAUDE.md means rules get silently ignored. Use path-scoped rules for detailed standards.
- Add rules incrementally. Don’t try to write all rules upfront. Add them when you discover patterns. Use paths: frontmatter so they only load when relevant.
- Use the [LEARN] format. Every correction gets tagged and persisted in MEMORY.md. This prevents repeating mistakes across sessions.
- Trust the adversarial pattern. The critic-fixer loop catches things you won’t. Let it run.
- Verify everything. The verification rule exists for a reason. Never skip compilation or rendering checks.
- Session logs matter. Document design decisions, not just what changed. Future-you will thank present-you.
- Devil’s Advocate early. Challenge slide structure before you’ve built 50 slides on a shaky foundation.
- Progressive disclosure. Start with CLAUDE.md + 2–3 rules. Add more as your workflow matures. Newcomers should not face 18 rules on day one.
- Use CLAUDE.local.md for personal overrides. This file is automatically gitignored and loaded alongside CLAUDE.md. Put machine-specific paths, personal preferences, and local tool versions here — they won’t pollute the shared repo.
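As an illustration of the path-scoped pattern, a rule file's frontmatter might look like this (the `paths:` key follows this repo's convention; the glob values are placeholders to replace with your own):

```yaml
---
# Hypothetical path-scoped frontmatter (globs are placeholders):
# the rule loads only when Claude touches matching files.
paths:
  - "*.R"
  - "explorations/**"
---
```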
For capabilities beyond file editing and shell commands — web search during literature review, database queries for replication, or reference manager integration (Zotero, Mendeley) — Claude Code supports MCP servers. Configure them in .claude/settings.json under "mcpServers". Start with skills and agents first; add MCP when you need external integrations.
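A minimal sketch of that configuration (the server name, command, and package argument are placeholders, not a real published server):

```json
{
  "mcpServers": {
    "reference-manager": {
      "command": "npx",
      "args": ["-y", "example-zotero-mcp-server"]
    }
  }
}
```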
8 Appendix: File Reference
8.1 All Agents
| Agent | File | Purpose |
|---|---|---|
| Proofreader | .claude/agents/proofreader.md | Grammar, typos, consistency |
| Slide Auditor | .claude/agents/slide-auditor.md | Visual layout, overflow, spacing |
| Pedagogy Reviewer | .claude/agents/pedagogy-reviewer.md | Narrative arc, notation clarity |
| R Reviewer | .claude/agents/r-reviewer.md | R code quality, reproducibility |
| TikZ Reviewer | .claude/agents/tikz-reviewer.md | Diagram visual quality |
| Beamer Translator | .claude/agents/beamer-translator.md | LaTeX to Quarto translation |
| Quarto Critic | .claude/agents/quarto-critic.md | Adversarial Quarto QA |
| Quarto Fixer | .claude/agents/quarto-fixer.md | Applies critic’s fixes |
| Verifier | .claude/agents/verifier.md | Task completion verification |
| Domain Reviewer | .claude/agents/domain-reviewer.md | Your domain-specific review |
8.2 All Skills
| Skill | Directory | Purpose |
|---|---|---|
| /compile-latex | .claude/skills/compile-latex/ | XeLaTeX 3-pass compilation |
| /deploy | .claude/skills/deploy/ | Quarto render + GitHub Pages sync |
| /extract-tikz | .claude/skills/extract-tikz/ | TikZ to SVG conversion |
| /proofread | .claude/skills/proofread/ | Run proofreading agent |
| /visual-audit | .claude/skills/visual-audit/ | Run layout audit agent |
| /pedagogy-review | .claude/skills/pedagogy-review/ | Run pedagogy review agent |
| /review-r | .claude/skills/review-r/ | Run R code review agent |
| /qa-quarto | .claude/skills/qa-quarto/ | Critic-fixer adversarial loop |
| /slide-excellence | .claude/skills/slide-excellence/ | Combined multi-agent review |
| /translate-to-quarto | .claude/skills/translate-to-quarto/ | Beamer to Quarto translation |
| /validate-bib | .claude/skills/validate-bib/ | Bibliography validation |
| /devils-advocate | .claude/skills/devils-advocate/ | Design challenge questions |
| /create-lecture | .claude/skills/create-lecture/ | Full lecture creation |
| /commit | .claude/skills/commit/ | Stage, commit, PR, and merge |
| /lit-review | .claude/skills/lit-review/ | Literature search and synthesis |
| /research-ideation | .claude/skills/research-ideation/ | Research questions and strategies |
| /interview-me | .claude/skills/interview-me/ | Interactive research interview |
| /review-paper | .claude/skills/review-paper/ | Manuscript review |
| /data-analysis | .claude/skills/data-analysis/ | End-to-end R analysis |
| /learn | .claude/skills/learn/ | Extract discoveries into persistent skills |
| /context-status | .claude/skills/context-status/ | Show session health and context usage |
| /deep-audit | .claude/skills/deep-audit/ | Repository-wide consistency audit |
8.3 All Rules
Always-on (load every session):
| Rule | File | Purpose |
|---|---|---|
| Plan-First Workflow | plan-first-workflow.md | Plan mode + context preservation |
| Orchestrator Protocol | orchestrator-protocol.md | Contractor mode loop |
| Session Logging | session-logging.md | Three logging triggers |
| Meta-Governance | meta-governance.md | Template vs working project distinctions |
Path-scoped (load only when working on matching files):
| Rule | File | Triggers On |
|---|---|---|
| Verification Protocol | verification-protocol.md | .tex, .qmd, docs/ |
| Single Source of Truth | single-source-of-truth.md | Figures/, .tex, .qmd |
| Quality Gates | quality-gates.md | .tex, .qmd, *.R |
| R Code Conventions | r-code-conventions.md | *.R |
| TikZ Quality | tikz-visual-quality.md | .tex |
| Beamer-Quarto Sync | beamer-quarto-sync.md | .tex, .qmd |
| PDF Processing | pdf-processing.md | master_supporting_docs/ |
| Proofreading Protocol | proofreading-protocol.md | .tex, .qmd, quality_reports/ |
| No Pause | no-pause-beamer.md | .tex |
| Replication Protocol | replication-protocol.md | *.R |
| Knowledge Base | knowledge-base-template.md | .tex, .qmd, *.R |
| Orchestrator Research | orchestrator-research.md | *.R, explorations/ |
| Exploration Folder | exploration-folder-protocol.md | explorations/ |
| Exploration Fast-Track | exploration-fast-track.md | explorations/ |
8.4 Hooks
| Hook | Type | Configuration |
|---|---|---|
| Session log reminder | Stop (command) | .claude/hooks/log-reminder.py |
| Desktop notification | Notification (command) | .claude/hooks/notify.sh |
| File protection | PreToolUse (command) | .claude/hooks/protect-files.sh |
| Context state capture | PreCompact (command) | .claude/hooks/pre-compact.py |
| Context restoration | SessionStart[compact|resume] (command) | .claude/hooks/post-compact-restore.py |
| Context monitor | PostToolUse[Bash|Task] (command) | .claude/hooks/context-monitor.py |
| Verification reminder | PostToolUse[Write|Edit] (command) | .claude/hooks/verify-reminder.py |
8.5 Troubleshooting
8.5.1 LaTeX Won’t Compile
Symptom: xelatex errors or missing packages.
Fix:
1. Check you have XeLaTeX installed: which xelatex
2. Ensure TEXINPUTS includes Preambles/: the /compile-latex skill handles this
3. Missing package? Install via TeX Live: tlmgr install [package]
8.5.2 Quarto Won’t Render
Symptom: quarto render fails or produces broken HTML.
Fix:
1. Check Quarto version: quarto --version (need 1.3+)
2. Check for syntax errors in YAML frontmatter
3. Missing TikZ SVGs? Run /extract-tikz first
8.5.3 Hooks Not Firing
Symptom: No context warnings, no verification reminders.
Fix:
1. Check hooks are configured: cat .claude/settings.json | grep hooks
2. Ensure Python 3 is available: which python3
3. Check hook file permissions: ls -la .claude/hooks/
8.5.4 Claude Ignores Rules
Symptom: Claude doesn’t follow conventions in .claude/rules/.
Fix:
1. Rules use paths: frontmatter — check the path matches your files
2. Too many rules? Claude follows ~150 instructions reliably. Consolidate.
3. Try: “Read .claude/rules/[rule].md and follow it for this task”
8.5.5 Context Lost After Compaction
Symptom: Claude forgets what you were working on.
Fix:
1. Point Claude to the plan: “Read quality_reports/plans/[latest].md”
2. Check the session log: “Read quality_reports/session_logs/[latest].md”
3. The post-compact-restore.py hook should print recovery info automatically
8.5.6 Quality Score Too Low
Symptom: Score stuck below 80, can’t commit.
Fix:
1. Run /slide-excellence to get a detailed issue breakdown
2. Fix critical issues first (they cost -10 to -20 points each)
3. Ask Claude: “What are the remaining critical issues?”
8.5.7 Skills Not Auto-Invoked
Symptom: Claude doesn’t use skills when you describe a task.
Fix:
1. Be explicit in your request: “Review my slides for grammar and layout issues”
2. Check the skill has auto-invocation enabled (no disable-model-invocation: true)
3. Skill descriptions help Claude know when to use a skill — check they’re clear
9 Standing on Shoulders
This guide builds on the work of many. We are grateful to these projects and their authors.
Core Infrastructure:
- Claude Code by Anthropic — the CLI tool, VS Code extension, and Desktop app that makes all of this possible
Research Workflows:
- clo-author by Hugo Sant’Anna (UAB) — paper-centric research workflows with adversarial agent pairs, simulated peer review, and full research lifecycle management
- claudeblattman by Chris Blattman (University of Chicago) — comprehensive workflows for non-technical academics: executive assistant, proposal writing, project management, and the fresh-context critique pattern
Reproducibility & Data Management:
- Yiqing Xu and Leo Yang Yang (2026), “Scaling Reproducibility: An AI-Assisted Workflow for Large-Scale Reanalysis,” Stanford University — the template-executor architecture and principles of reproducible AI-assisted research
- Template README for Social Science Replication Packages by Lars Vilhuber et al. (Cornell) — the AEA Data Editor compliance standard adopted by major economics journals
Presentation Design:
- MixtapeTools / The Rhetoric of Decks by Scott Cunningham (Baylor) — the philosophical and practical framework for beautiful, rhetorically effective academic presentations
- Scott Cunningham, Causal Inference: The Mixtape — the textbook whose author developed the presentation framework above
Origin:
- This workflow was extracted from Econ 730: Causal Panel Data at Emory University, developed by Pedro Sant’Anna. The econometrics origin is one application — the patterns are domain-agnostic and have been extended by others across fields.