RFC Idea: The AI-to-AI Handoff Protocol (AAHP)
A Proposal for Standardized Context Handoff Between Sequential AI Agents
Draft - February 2026
Author: AAHP Working Group
Abstract
The agentic AI ecosystem has produced remarkable protocols for connecting models to tools (MCP), enabling agent-to-agent communication (A2A), and bridging agents with user interfaces (AG-UI). Yet a critical gap remains: there is no standardized protocol for sequential context handoff between AI agents working on the same task across time, sessions, or model boundaries. This paper proposes the AI-to-AI Handoff Protocol (AAHP) - a lightweight, file-based standard for preserving intent, decisions, state, and trust between autonomous agents operating in relay-style workflows. Where MCP asks "What tools can I use?" and A2A asks "Which agent can help me right now?", AAHP asks: "What does the next agent need to know to continue my work?"
1. Problem Statement
Modern software development increasingly relies on AI agents. But these agents don't operate in isolation - they work in pipelines:
- Agent A researches requirements and produces an architecture.
- Agent B implements the architecture as code.
- Agent C reviews and validates the implementation.
- Agent D deploys and monitors the result.
Each agent may be a different model, a different instance of the same model, or the same model in a new context window. What they share is a task lineage - a chain of intent, decisions, and artifacts that must survive the transition from one agent to the next.
Today, this handoff happens through ad-hoc mechanisms: copy-pasted summaries, MEMORY.md files, CLAUDE.md conventions, or verbose system prompts that attempt to encode the entire project state. The result is:
- Context loss - Critical decisions evaporate between sessions.
- Redundant work - Successor agents re-derive conclusions their predecessors already reached.
- Trust ambiguity - No formal record of what was verified, what was assumed, and what remains untested.
- Blame diffusion - When something breaks, it's unclear which agent introduced the defect.
The AI-to-AI Handoff Protocol addresses these problems by defining a structured, machine-readable format for inter-agent context transfer.
2. Relationship to Existing Protocols
AAHP does not compete with existing agentic protocols. It fills a gap in the stack:
| Layer | Protocol | Focus |
|---|---|---|
| Tool Access | MCP (Model Context Protocol) | Agent ↔ External Tools & Data |
| Agent Communication | A2A (Agent-to-Agent) | Agent ↔ Agent (real-time, concurrent) |
| User Interaction | AG-UI (Agent-User Interaction) | Agent ↔ Human Interface |
| Context Handoff | AAHP (this proposal) | Agent → Agent (sequential, asynchronous) |
The key distinction: A2A enables agents to collaborate simultaneously; AAHP enables agents to collaborate across time. A2A is a phone call between agents. AAHP is a shift handover log.
3. Design Principles
AAHP is guided by five principles:
3.1 File-First, Not Wire-First
Unlike MCP and A2A, which define transport protocols (JSON-RPC, HTTP, SSE), AAHP is file-based. Handoff documents are Markdown or JSON files committed alongside code. This design choice reflects reality: most agent-to-agent handoffs today happen via files in repositories, not live connections. Files are versionable, diffable, auditable, and human-readable.
3.2 Human-in-the-Loop Compatible
Every AAHP document MUST be readable by a human engineer. An operator should be able to inspect a handoff, override decisions, or redirect the next agent - without needing protocol-specific tooling.
3.3 Minimal Viable Context
AAHP favors concise, structured handoffs over exhaustive dumps. The goal is to transmit the minimal context required for the successor agent to continue work without re-deriving prior conclusions. Brevity is a feature, not a limitation.
3.4 Trust Provenance
Every assertion in a handoff carries a trust level: verified, assumed, or untested. Successor agents inherit these trust markers and are expected to respect them - prioritizing verification of untested claims before building on them.
3.5 Append-Only History
Handoff history is immutable. Agents MUST NOT modify previous handoff entries. They MAY add corrections or amendments as new entries. This ensures a complete audit trail of the agent pipeline.
4. Protocol Specification
4.1 Handoff Directory Structure
An AAHP-compliant project maintains a .ai/handoff/ directory at the repository root:
.ai/
  handoff/
    STATUS.md         # Current state of the system (REQUIRED)
    NEXT_ACTIONS.md   # Prioritized work queue for successor (REQUIRED)
    LOG.md            # Append-only session journal (REQUIRED)
    DASHBOARD.md      # Pipeline state & task registry (RECOMMENDED)
    CONVENTIONS.md    # Project-specific rules for all agents (OPTIONAL)
    WORKFLOW.md       # Pipeline sequence, phases & entry/exit criteria (OPTIONAL)
    TRUST.md          # Trust provenance registry (OPTIONAL)
4.2 STATUS.md - System State Document
The STATUS document provides a snapshot of the entire system at the time of handoff. It MUST include:
# [Project Name] - Current State
> Last updated: [ISO 8601 timestamp]
> Agent: [model identifier, e.g. "claude-opus-4-6"]
> Commit: [git SHA]
## Build Health
[Table: check name, result, notes]
## Component Status
[Table: component, location, state]
## What is Missing
[Table: gap, severity, description]
Requirements:
- The STATUS document MUST be regenerated (not appended) by each agent at session end.
- It MUST reference the exact commit at which the state was captured.
- Component states MUST use one of: `complete`, `implemented`, `partial`, `stub`, `not-started`, `broken`.
- Gap severities MUST use one of: `CRITICAL`, `HIGH`, `MEDIUM`, `LOW`, `DEFERRED`.
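The enumerations above lend themselves to a mechanical check. The following is a minimal sketch, assuming STATUS.md follows the template in this section (a `> Commit:` line and tables under `## Component Status` and `## What is Missing`). The helper names `table_rows` and `validate_status` are illustrative and not part of the protocol.

```python
import re

# Allowed values, copied from the requirements in this section.
COMPONENT_STATES = {"complete", "implemented", "partial", "stub", "not-started", "broken"}
GAP_SEVERITIES = {"CRITICAL", "HIGH", "MEDIUM", "LOW", "DEFERRED"}


def table_rows(markdown: str, heading: str) -> list[list[str]]:
    """Return the data rows of the Markdown table under `## <heading>`."""
    rows: list[list[str]] = []
    in_section = False
    for line in markdown.splitlines():
        if line.startswith("## "):
            in_section = line[3:].strip() == heading
            continue
        if in_section and line.lstrip().startswith("|"):
            rows.append([cell.strip() for cell in line.strip().strip("|").split("|")])
    return rows[2:]  # skip the header row and the |---| separator row


def validate_status(markdown: str) -> list[str]:
    """Return human-readable problems; an empty list means no findings."""
    problems = []
    if not re.search(r"^> Commit: \S+", markdown, re.MULTILINE):
        problems.append("missing '> Commit:' line")
    for row in table_rows(markdown, "Component Status"):
        # Template column order: component, location, state.
        if len(row) < 3 or row[2] not in COMPONENT_STATES:
            problems.append(f"invalid component state: {row}")
    for row in table_rows(markdown, "What is Missing"):
        # Template column order: gap, severity, description.
        if len(row) < 2 or row[1] not in GAP_SEVERITIES:
            problems.append(f"invalid gap severity: {row}")
    return problems
```

A check like this could run in CI or in the orchestrator before a successor is spawned. It only inspects the fields the requirements name and does not judge whether the reported state is accurate.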
4.3 NEXT_ACTIONS.md - Successor Work Queue
The NEXT_ACTIONS document provides a prioritized list of tasks for the successor agent:
# [Project Name] - Next Actions
> Priority order. Work top-down. Each item is independent unless noted.
## 1. [Action Title]
**Goal:** [One-sentence objective]
**Trust Level:** [verified | assumed | untested]
[Detailed instructions, file references, expected outcomes]
## 2. [Action Title]
...
Requirements:
- Actions MUST be ordered by priority (highest first).
- Each action MUST include a clear goal statement.
- Each action SHOULD include file paths the successor will need.
- Actions SHOULD note dependencies between items.
- The predecessor agent MUST NOT include more than 10 actions. If more work remains, the final action SHOULD be "Re-assess remaining work and create updated NEXT_ACTIONS."
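The requirements above can be checked with a small parser. This is a sketch under the assumption that NEXT_ACTIONS.md uses the `## N. Title`, `**Goal:**`, and `**Trust Level:**` markers from the template. `parse_actions` and `check_actions` are hypothetical helper names, and the trust level is only checked when present because the requirements do not make it mandatory.

```python
import re

MAX_ACTIONS = 10  # limit stated in the requirements
TRUST_LEVELS = {"verified", "assumed", "untested"}


def parse_actions(markdown: str) -> list[dict]:
    """Collect numbered actions with their goal and trust level, in file order."""
    actions: list[dict] = []
    current = None
    for line in markdown.splitlines():
        heading = re.match(r"^## (\d+)\. (.+)$", line)
        if heading:
            current = {"number": int(heading.group(1)), "title": heading.group(2).strip(),
                       "goal": None, "trust": None}
            actions.append(current)
        elif current is not None:
            if line.startswith("**Goal:**"):
                current["goal"] = line[len("**Goal:**"):].strip()
            elif line.startswith("**Trust Level:**"):
                current["trust"] = line[len("**Trust Level:**"):].strip()
    return actions


def check_actions(actions: list[dict]) -> list[str]:
    """Return violations of the action-count and goal requirements."""
    problems = []
    if len(actions) > MAX_ACTIONS:
        problems.append(f"{len(actions)} actions listed; the protocol allows no more than {MAX_ACTIONS}")
    for action in actions:
        if not action["goal"]:
            problems.append(f"action {action['number']} has no goal")
        if action["trust"] is not None and action["trust"] not in TRUST_LEVELS:
            problems.append(f"action {action['number']} has unknown trust level {action['trust']!r}")
    return problems
```

Priority ordering itself cannot be verified mechanically; the sketch only confirms the structural rules.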
4.4 LOG.md - Session Journal
The LOG document is an append-only record of all agent sessions:
# [Project Name] - Agent Journal
## Session [ISO date]: [Session Title]
**Agent:** [model identifier]
**Duration:** [approximate duration or context window usage]
**Commits:** [comma-separated SHAs]
### What was built
[Bullet list of artifacts created or modified]
### Verification
[What was tested and the results]
### Decisions made
[Numbered list of architectural or implementation decisions with rationale]
### What was NOT done
[Explicit list of deferred or skipped work with reasons]
### Trust assertions
[Claims made without full verification, marked for successor review]
Requirements:
- Each session MUST append a new entry. Previous entries MUST NOT be modified.
- The "What was NOT done" section is REQUIRED, not optional. Negative assertions are as valuable as positive ones.
- Commits MUST be listed so the successor can `git diff` exactly what changed.
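The append-only rule can be enforced with a prefix comparison between the previous and the new LOG.md contents. The sketch below assumes both versions are available as strings (for example, read from two git revisions); the function names are illustrative only.

```python
def is_append_only(previous: str, current: str) -> bool:
    """True if `current` keeps every byte of `previous` and only adds text after it."""
    return current.startswith(previous)


def appended_sessions(previous: str, current: str) -> list[str]:
    """Return the titles of session entries added since `previous`."""
    if not is_append_only(previous, current):
        raise ValueError("LOG.md history was modified; entries must only be appended")
    added = current[len(previous):]
    # Session entries use the '## Session [ISO date]: [Session Title]' heading.
    return [line[3:].strip() for line in added.splitlines() if line.startswith("## Session ")]
```

The comparison is deliberately strict: even a whitespace change in an earlier entry counts as a modification. A more lenient implementation might normalise trailing whitespace first.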
4.5 TRUST.md - Trust Provenance Registry
The TRUST document tracks the verification status of critical system properties:
# [Project Name] - Trust Registry
| Property | Status | Verified By | Session | Notes |
|---|---|---|---|---|
| Type-check passes | ✅ verified | claude-opus-4-6 | 2026-02-19 | 29/29 |
| Docker Compose boots | ⚠️ untested | - | - | Never run |
| CORS configured | ⚠️ assumed | claude-opus-4-6 | 2026-02-19 | Config written, not tested |
| E2E flow works | ⚠️ untested | - | - | No integration test exists |
Status values:
- `✅ verified` - Agent confirmed this through execution or testing.
- `⚠️ assumed` - Agent believes this is true but did not verify.
- `⚠️ untested` - No agent has attempted verification.
- `❌ broken` - Agent confirmed this is currently failing.
- `🔄 regression` - Previously verified, now broken.
4.6 DASHBOARD.md - Pipeline State Document
While STATUS.md captures the state of the system and NEXT_ACTIONS.md captures the work for the next agent, neither answers the question: where is the pipeline right now? DASHBOARD.md fills this gap. It tracks the state of the pipeline as a whole - which tasks exist, which are running, which are blocked, and which are complete. It is the artifact that an orchestrator reads to decide what to spawn next.
The table below distinguishes all seven handoff artifacts:
| File | Required | Answers | Written by | Read by |
|---|---|---|---|---|
| STATUS.md | ✅ Yes | What is the state of the code right now? | Each agent at session end | Next agent at session start |
| NEXT_ACTIONS.md | ✅ Yes | What should the next agent do? | Each agent at session end | Next agent at session start |
| LOG.md | ✅ Yes | What happened in each session? | Each agent (append-only) | Any agent seeking history |
| DASHBOARD.md | ⭐ Recommended | What is the state of the pipeline itself? | Orchestrator + each agent | Orchestrator before spawning |
| CONVENTIONS.md | ⚪ Optional | What rules must every agent follow? | Human / first agent | Every agent at session start |
| WORKFLOW.md | ⚪ Optional | How is the pipeline sequenced? | Human / Architect agent | Orchestrator + each agent |
| TRUST.md | ⚪ Optional | What was verified vs. assumed? | Each agent (Verifier role) | Successor agents |
A DASHBOARD.md document MUST include:
- A task registry - an enumerated list of all planned tasks with their current status.
- A Ready? flag per task - an orchestrator MUST check this before spawning an agent for that task. Tasks blocked on external dependencies MUST carry `Ready? = ❌`.
- An active work section - the currently running task, its branch, phase, and assigned agent.
- A pipeline history - a compact log of completed tasks with outcomes.
Task statuses MUST use one of: not-started, running, done, blocked, cancelled.
Minimal DASHBOARD.md template:
# [Project Name] - Pipeline Dashboard
> Last updated: [ISO 8601 timestamp]
## Build Health
| Check | Result | Notes |
|---|---|---|
| Type-check | ✅ Pass | X/X packages |
| Tests | ✅ Pass | X passing |
## Tasks
| # | Task | Status | Ready? | Blocked By |
|---|---|---|---|---|
| 1 | [Task name] | ✅ done | - | - |
| 2 | [Task name] | 🔄 running | ✅ | - |
| 3 | [Task name] | ⬜ not-started | ✅ | - |
| 4 | [Task name] | 🔴 blocked | ❌ | [dependency] |
## Active Work
- **Task:** [Task name]
- **Branch:** feat/[scope-description]
- **Phase:** Researcher → Architect → Implementer → Reviewer
- **Agent:** [model identifier]
- **Started:** [ISO timestamp]
## Pipeline History
| Date | Task | Result | Notes |
|---|---|---|---|
| [ISO date] | [Task] | ✅ done | [brief summary] |
Requirements:
- DASHBOARD.md MUST be updated at the start and end of each pipeline phase.
- An orchestrator MUST NOT spawn an agent for a task with `Ready? = ❌`.
- Blocked tasks MUST specify what they are blocked on in the Blocked By column.
- DASHBOARD.md SHOULD NOT duplicate content from STATUS.md or NEXT_ACTIONS.md - it tracks pipeline state, not system state.
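An orchestrator's "what do I spawn next?" decision can be sketched as a read of the task registry. The code below assumes the `## Tasks` table layout from the minimal template (five columns, status written as an emoji followed by the status word). `parse_tasks`, `next_task`, and `find_violations` are hypothetical names, not part of the specification.

```python
TASK_STATUSES = {"not-started", "running", "done", "blocked", "cancelled"}


def parse_tasks(markdown: str) -> list[dict]:
    """Read the rows of the '## Tasks' table into dictionaries."""
    tasks, in_tasks = [], False
    for line in markdown.splitlines():
        if line.startswith("## "):
            in_tasks = line.strip() == "## Tasks"
            continue
        if not (in_tasks and line.startswith("|")):
            continue
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) != 5 or not cells[0].isdigit():
            continue  # header row or |---| separator
        number, name, status, ready, blocked_by = cells
        words = status.split()
        tasks.append({
            "number": int(number),
            "name": name,
            "status": words[-1] if words else "",  # drop the leading emoji
            "ready": ready.startswith("✅"),
            "blocked_by": blocked_by,
        })
    return tasks


def next_task(tasks: list[dict]):
    """Return the first not-started task whose Ready? flag is set, or None."""
    for task in tasks:
        if task["status"] == "not-started" and task["ready"]:
            return task
    return None


def find_violations(tasks: list[dict]) -> list[str]:
    """Flag unknown statuses and blocked tasks without a stated blocker."""
    problems = []
    for task in tasks:
        if task["status"] not in TASK_STATUSES:
            problems.append(f"task {task['number']}: unknown status {task['status']!r}")
        if task["status"] == "blocked" and task["blocked_by"] in ("", "-"):
            problems.append(f"task {task['number']}: blocked without a Blocked By entry")
    return problems
```

Because `next_task` never returns a task whose flag is not `✅`, the rule against spawning agents for unready tasks holds by construction in this sketch.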
4.7 CONVENTIONS.md - Project Rules for All Agents
CONVENTIONS.md encodes the rules that every agent working on a project MUST follow. Where STATUS.md and NEXT_ACTIONS.md change every session, CONVENTIONS.md is stable - it defines the invariants of the project that agents should not violate regardless of the current task.
Typical contents:
- Language and framework conventions - which language version, which libraries are preferred, what is explicitly forbidden.
- Commit and branch naming - format for commit messages, branch prefixes, PR conventions.
- Code style - formatter, linter config, testing requirements (e.g. minimum coverage, required test types).
- Agent-specific restrictions - what agents MUST NOT do autonomously (e.g., install new dependencies, push to main, send external notifications without confirmation).
- File ownership - which files are off-limits or require special care (e.g., migration files, generated code).
Minimal CONVENTIONS.md template:
# [Project Name] - Agent Conventions
> All agents MUST read this file before starting work.
## Language & Stack
- Language: [e.g. TypeScript 5.x strict mode]
- Package manager: [e.g. pnpm]
- Test runner: [e.g. Jest / Vitest]
## Commit Convention
- Format: `type(scope): description`
- Types: feat | fix | test | docs | refactor | chore
- Example: `feat(auth): add PKCE flow for OAuth2`
## Branch Convention
- Features: `feat/<scope>-<short-description>`
- Fixes: `fix/<scope>-<short-description>`
- Never push directly to `main`
## Agent Restrictions
- MUST NOT install new runtime dependencies without noting them in NEXT_ACTIONS.md
- MUST NOT modify files outside the task scope
- MUST run type-check and tests before committing
## Testing Requirements
- Every new function MUST have at least one unit test
- Tests MUST pass before committing
- Test files live alongside source: `*.test.ts`
Requirements:
- Agents MUST read CONVENTIONS.md at session start, before reading NEXT_ACTIONS.md.
- CONVENTIONS.md MUST NOT be modified by agents during implementation - it is an invariant, not a working document.
- If a convention is found to be outdated or incorrect, the agent MUST note it in LOG.md and flag it in NEXT_ACTIONS.md for a human to resolve.
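Conventions that are written as patterns can be checked before an agent commits. The sketch below assumes a project has adopted the commit and branch formats from the minimal template above. The exact character classes (lowercase letters, digits, hyphens in branch names) are an assumption made for illustration, since the template does not define them.

```python
import re

# 'type(scope): description' with the types listed in the template.
COMMIT_RE = re.compile(r"^(feat|fix|test|docs|refactor|chore)\(([^()\s]+)\): (\S.*)$")

# 'feat/<scope>-<short-description>' or 'fix/<scope>-<short-description>'.
# Assumption: scope and description consist of lowercase letters and digits.
BRANCH_RE = re.compile(r"^(feat|fix)/[a-z0-9]+(-[a-z0-9]+)+$")


def is_valid_commit(message: str) -> bool:
    """Check the first line of a commit message against the commit convention."""
    first_line = message.split("\n", 1)[0]
    return COMMIT_RE.match(first_line) is not None


def is_valid_branch(name: str) -> bool:
    """Check a branch name against the branch convention."""
    return BRANCH_RE.match(name) is not None
```

For example, `is_valid_commit("feat(auth): add PKCE flow for OAuth2")` accepts the template's own example, while a message such as `"update stuff"` is rejected.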
4.8 WORKFLOW.md - Pipeline Sequence and Phase Definitions
WORKFLOW.md describes how the pipeline is structured - the sequence of agent roles, what each phase produces, and the entry and exit criteria for each handoff. It is the blueprint that an orchestrator follows when deciding which agent to spawn next and with what context.
Where DASHBOARD.md tracks the current state of the pipeline, WORKFLOW.md defines the intended structure of the pipeline. Together, they allow an orchestrator to answer: "What should happen next, and what is actually happening right now?"
Typical contents:
- Pipeline overview - a linear or branching sequence of phases (e.g. Researcher → Architect → Implementer → Reviewer → Fix → Publish).
- Phase definitions - for each phase: the agent role, the inputs it reads, the outputs it produces, and the acceptance criteria that trigger the next phase.
- Entry criteria - what must be true before a phase can start (e.g. "previous phase committed to a branch").
- Exit criteria - what the agent must produce to complete its phase (e.g. "STATUS.md updated, tests passing, LOG.md entry appended").
- Escalation rules - when the pipeline should pause and notify a human rather than proceeding automatically.
Minimal WORKFLOW.md template:
# [Project Name] - Pipeline Workflow
> This file defines the intended pipeline structure.
> For current pipeline state, see DASHBOARD.md.
## Pipeline Overview
```
Researcher → Architect → Implementer → Reviewer → (Fix →) Publish
```
## Phase Definitions
### Phase 1: Researcher
**Role:** Search-augmented LLM
**Reads:** STATUS.md, NEXT_ACTIONS.md, CONVENTIONS.md
**Produces:** Research notes appended to LOG.md; updated NEXT_ACTIONS.md with implementation plan
**Exit criteria:** NEXT_ACTIONS.md contains a concrete, actionable plan for the Architect
### Phase 2: Architect
**Role:** High-capability reasoning model
**Reads:** LOG.md (research notes), STATUS.md, CONVENTIONS.md
**Produces:** Architecture Decision Record (ADR) in LOG.md; updated NEXT_ACTIONS.md for Implementer
**Exit criteria:** ADR documents technology choices, interfaces, and file structure
### Phase 3: Implementer
**Role:** Fast coding model
**Reads:** LOG.md (ADR), NEXT_ACTIONS.md, CONVENTIONS.md, STATUS.md
**Produces:** Code commits on feature branch; updated STATUS.md, LOG.md, TRUST.md
**Exit criteria:** All NEXT_ACTIONS tasks complete; type-check and tests passing
### Phase 4: Reviewer
**Role:** Independent reasoning model
**Reads:** All handoff files + diff of feature branch
**Produces:** Review notes in LOG.md; updated NEXT_ACTIONS.md (fixes needed or approval)
**Exit criteria:** Either approves merge or produces fix list for Implementer
## Escalation Rules
- If tests fail after 2 fix attempts → pause pipeline, notify operator
- If a task is blocked (Ready? = ❌) → skip to next unblocked task
- If all tasks blocked → notify operator, set pipeline state to paused
Requirements:
- WORKFLOW.md SHOULD be authored by a human or Architect agent at project start and remain stable.
- Agents MAY reference WORKFLOW.md to understand where they fit in the pipeline, but MUST NOT modify it during implementation.
- If the pipeline structure needs to change, this MUST be noted in LOG.md and reviewed by a human before WORKFLOW.md is updated.
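The escalation rules in the template above amount to a small decision function. The following sketch assumes the orchestrator already knows each task's Ready? flag and how many fix attempts have failed. The `Task` class and the `next_step` function are hypothetical and only illustrate the three rules.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    ready: bool        # mirrors the Ready? flag in DASHBOARD.md
    done: bool = False


def next_step(tasks: list[Task], failed_fix_attempts: int, max_fix_attempts: int = 2):
    """Return an (action, detail) pair: 'run', 'pause', or 'finished'."""
    # Rule 1: tests still failing after the allowed fix attempts -> pause.
    if failed_fix_attempts >= max_fix_attempts:
        return ("pause", "tests still failing after fix attempts; notify operator")
    pending = [task for task in tasks if not task.done]
    if not pending:
        return ("finished", "")
    # Rule 2: skip blocked tasks and run the next unblocked one.
    for task in pending:
        if task.ready:
            return ("run", task.name)
    # Rule 3: everything that remains is blocked -> pause.
    return ("pause", "all remaining tasks are blocked; notify operator")
```

How the operator is notified is left open here; the function only reports that a notification is due.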
5. Agent Roles in AAHP Pipelines
AAHP defines four canonical agent roles. A single agent session may fulfill multiple roles, and roles may be performed by different models:
5.1 Researcher
Gathers requirements, reads documentation, explores the codebase, and produces a structured understanding of the problem space. Outputs: updated STATUS.md with findings, NEXT_ACTIONS.md with implementation plan.
5.2 Implementer
Executes the work plan from NEXT_ACTIONS.md. Writes code, creates infrastructure, modifies configuration. Outputs: commits, updated STATUS.md, LOG.md entry with decisions and artifacts.
5.3 Verifier
Reviews the Implementer's work. Runs tests, checks types, validates contracts, attempts deployment. Outputs: updated TRUST.md, LOG.md entry with test results, NEXT_ACTIONS.md with fixes needed.
5.4 Validator
Performs end-to-end acceptance testing. Evaluates whether the original intent has been fulfilled. Outputs: final STATUS.md assessment, LOG.md entry with acceptance criteria results.
Pipeline Example
Researcher   →  Implementer  →  Verifier     →  Implementer (fix)  →  Validator
    ↓               ↓               ↓               ↓                     ↓
STATUS.md       LOG.md          TRUST.md        LOG.md                STATUS.md
NEXT_ACTIONS    STATUS.md       NEXT_ACTIONS    STATUS.md             (final)
                NEXT_ACTIONS                    NEXT_ACTIONS
Each arrow represents an AAHP handoff. Each agent reads the previous handoff files, performs its work, and produces updated handoff files for the successor.
6. Transport and Discovery
6.1 File Transport (Primary)
The primary transport is the filesystem. Handoff documents live in .ai/handoff/ and are committed to version control. This means:
- Every handoff is a git commit.
- Every handoff is reviewable in a pull request.
- Every handoff is auditable through `git log .ai/handoff/`.
6.2 API Transport (Extension)
For orchestration systems that manage agent pipelines programmatically, AAHP defines an optional JSON representation:
{
"aahp_version": "0.1.0",
"session": {
"agent": "claude-opus-4-6",
"started_at": "2026-02-19T14:30:00Z",
"ended_at": "2026-02-19T16:45:00Z",
"commits": ["d9175f7", "a2f1c25", "f9fd7cd"]
},
"status": { ... },
"next_actions": [ ... ],
"trust_assertions": [ ... ],
"log_entry": { ... }
}
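Until a formal schema exists, an orchestrator could apply a shallow structural check to this JSON representation. The sketch below treats every key shown in the example as required, which is an assumption: the draft does not say which fields are mandatory. The elided sections (`status`, `next_actions`, `trust_assertions`, `log_entry`) are only checked for their container type.

```python
import json
from datetime import datetime

# Assumption: all keys from the example are required.
TOP_LEVEL = {"aahp_version": str, "session": dict, "status": dict,
             "next_actions": list, "trust_assertions": list, "log_entry": dict}
SESSION = {"agent": str, "started_at": str, "ended_at": str, "commits": list}


def parse_timestamp(value: str) -> datetime:
    # Older Python versions reject a trailing "Z", so map it to an explicit offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def validate_envelope(raw: str) -> list[str]:
    """Return structural problems found in a serialized handoff; empty if none."""
    doc = json.loads(raw)
    if not isinstance(doc, dict):
        return ["top-level value must be a JSON object"]
    problems = []
    for key, expected in TOP_LEVEL.items():
        if not isinstance(doc.get(key), expected):
            problems.append(f"{key}: expected {expected.__name__}")
    session = doc.get("session")
    if isinstance(session, dict):
        for key, expected in SESSION.items():
            if not isinstance(session.get(key), expected):
                problems.append(f"session.{key}: expected {expected.__name__}")
    if problems:
        return problems
    try:
        if parse_timestamp(session["ended_at"]) < parse_timestamp(session["started_at"]):
            problems.append("session ends before it starts")
    except (ValueError, TypeError):
        problems.append("session timestamps must be comparable ISO 8601 values")
    return problems
```

A JSON Schema would eventually replace a hand-written check like this.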
6.3 Discovery
An AAHP-aware agent discovers handoff context by checking for .ai/handoff/STATUS.md at the repository root. If this file exists, the agent SHOULD read all handoff documents before beginning work.
A CLAUDE.md or equivalent model-specific instruction file MAY reference the handoff directory:
## Agent Handoff
This project uses AAHP. Read `.ai/handoff/STATUS.md` and
`.ai/handoff/NEXT_ACTIONS.md` before starting any work.
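The discovery rule can be sketched as a filesystem probe. The function below assumes the directory layout from Section 4.1; `discover_handoff` is an illustrative name, and the returned dictionary shape is an arbitrary choice for this example.

```python
from pathlib import Path
from typing import Optional

REQUIRED = ("STATUS.md", "NEXT_ACTIONS.md", "LOG.md")
OTHER = ("DASHBOARD.md", "CONVENTIONS.md", "WORKFLOW.md", "TRUST.md")


def discover_handoff(repo_root: str) -> Optional[dict]:
    """Return the handoff files found under repo_root, or None if AAHP is not in use."""
    handoff = Path(repo_root) / ".ai" / "handoff"
    # Discovery is keyed on STATUS.md, as described in this section.
    if not (handoff / "STATUS.md").is_file():
        return None
    found = {name: handoff / name for name in REQUIRED + OTHER if (handoff / name).is_file()}
    missing = [name for name in REQUIRED if name not in found]
    return {"files": found, "missing_required": missing}
```

A caller would typically treat a non-empty `missing_required` list as a reason to stop and ask a human, since the protocol marks those three files as REQUIRED.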
7. Security Considerations
7.1 Secrets
Handoff documents MUST NOT contain secrets, API keys, passwords, or tokens. Environment variable names may be referenced (e.g., "Set ANTHROPIC_API_KEY in .env"), but values MUST NOT appear.
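A simple heuristic can catch the most obvious violations of this rule before a handoff is committed. The sketch below flags lines where a variable whose name ends in KEY, TOKEN, SECRET, or PASSWORD is followed by `=` or `:` and a value. The pattern is an assumption chosen for illustration; it will miss many real secrets and should not be mistaken for a complete scanner.

```python
import re

# Heuristic: NAME_ENDING_IN_KEY = value  (or NAME: value).
SECRET_ASSIGNMENT = re.compile(
    r"\b((?:[A-Z0-9]+_)*(?:KEY|TOKEN|SECRET|PASSWORD))\b\s*[=:]\s*(\S+)"
)


def find_secret_assignments(text: str) -> list[tuple[int, str]]:
    """Return (line number, variable name) pairs; values are never returned."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in SECRET_ASSIGNMENT.finditer(line):
            findings.append((number, match.group(1)))
    return findings
```

Referencing a variable by name, as in "Set ANTHROPIC_API_KEY in .env", does not trigger the pattern, which matches the distinction this section draws between names and values.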
7.2 PII
Handoff documents SHOULD NOT contain personally identifiable information. If agent pipelines process user data, handoff documents should reference data locations, not data contents.
7.3 Prompt Injection
Successor agents SHOULD treat handoff documents as potentially compromised input. An AAHP-aware orchestrator SHOULD validate handoff documents against a schema before presenting them to successor agents (no schema is defined yet; see Section 10.1). Handoff documents MUST NOT contain executable instructions disguised as data (e.g., "Ignore previous instructions and...").
7.4 Trust Escalation
An agent MUST NOT escalate trust assertions from a predecessor without independent verification. If Agent A marks a property as ⚠️ assumed, Agent B MUST NOT promote it to ✅ verified without performing its own verification.
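The escalation rule can be checked by comparing two snapshots of the trust registry. The sketch below assumes the registry has been reduced to a mapping from property name to status word, and that the orchestrator knows which properties the current agent actually checked in this session (for example, from its test run records). Both inputs and the function name are hypothetical.

```python
def unverified_promotions(
    previous: dict[str, str],
    proposed: dict[str, str],
    checked_this_session: set[str],
) -> list[str]:
    """Return properties promoted to 'verified' without a check in this session."""
    flagged = []
    for prop, status in proposed.items():
        if status != "verified":
            continue
        if previous.get(prop) == "verified":
            continue  # already verified by a predecessor; nothing was escalated
        if prop not in checked_this_session:
            flagged.append(prop)
    return sorted(flagged)
```

A property that appears for the first time with status `verified` is treated like a promotion, so it also needs a matching check in the current session.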
8. Real-World Validation
AAHP was not designed in isolation. It was derived from operational experience running autonomous agent pipelines on production software - and refined through the process of building failprompt, a public CLI tool created as an open reference implementation of the protocol.
Across both deployments, a consistent set of findings emerged that shaped the specification:
- The "What was NOT done" section proved more valuable than "What was done." Successor agents consistently reported that explicit negative assertions prevented them from making false assumptions about system readiness.
- Trust provenance eliminated redundant verification. When the Verifier could see that type-checking had been confirmed but integration testing had never run, it immediately prioritized the untested path rather than re-running already-verified checks.
- File-based handoff survived context window limits. Unlike system prompts that compete for tokens, handoff files can be selectively loaded. The successor reads STATUS.md first, then NEXT_ACTIONS.md, and consults LOG.md only for specific historical decisions.
- Git-committed handoffs enabled human oversight. Project maintainers reviewed handoff diffs in pull requests, catching cases where agents made incorrect assumptions or proposed architecturally unsound approaches.
8.1 Autonomous Pipeline in Practice
A structured AAHP pipeline ran autonomously on a production codebase using four specialized model roles:
| Role | Model Type | Responsibility |
|---|---|---|
| Researcher | Search-augmented LLM | Web research, OSS evaluation, compliance requirements |
| Architect | High-capability reasoning model | System design, ADRs, interface contracts |
| Implementer | Fast coding model | Code, tests, feature branches, commits |
| Reviewer | Independent reasoning model | Second opinion, alternative perspective, edge cases |
Each agent received only the AAHP handoff files - STATUS.md, NEXT_ACTIONS.md, LOG.md, DASHBOARD.md - as context. No shared memory. No live connection. No orchestration middleware. The orchestrator's role was reduced to: check DASHBOARD.md, spawn the next agent, wait for completion, repeat.
Over a single overnight session, the pipeline autonomously delivered five production features totalling over 300 new tests - all committed to dedicated feature branches, reviewed by a Reviewer agent, and merged to main without a human reviewing a single diff in real-time.
Key observation: DASHBOARD.md proved to be the fourth AAHP artifact - essential for any pipeline longer than a single agent session. The three required files (STATUS.md, NEXT_ACTIONS.md, LOG.md) describe individual sessions. DASHBOARD.md describes the state of the pipeline itself - which tasks are running, which are blocked, which are complete. Without it, each orchestrator session had to re-derive overall pipeline state from multiple separate files - an error-prone process that occasionally caused tasks to be re-run or skipped.
8.2 failprompt - A Public Reference Implementation
To provide a concrete, inspectable example of AAHP in practice, we built failprompt - a CLI tool that reads GitHub Actions failure logs and produces a structured prompt for any AI assistant. The tool is deliberately simple: it does one thing and does it well, making it an ideal vehicle for demonstrating the protocol without the complexity of a production system.
failprompt was built entirely using AAHP: a Researcher agent surveyed CLI best practices and the GitHub Actions API surface, an Architect agent authored the architecture decision record (a 4-module design: log-fetcher, error-extractor, prompt-builder, and CLI entry point), and an Implementer agent wrote the code and tests. The entire project - from blank repository to published npm package - was orchestrated through handoff files alone.
The tool is published on npm at npmjs.com/package/failprompt and requires no installation. Run it directly in any repo with a GitHub Actions CI:
# Auto-detect the latest failed run in the current repo
npx failprompt
# Target a specific run by ID
npx failprompt --run 22257459273
Both commands fetch the failure log, extract the relevant error section, and output a structured prompt ready to paste into any AI assistant. It was tested end-to-end against its own CI: a deliberate test failure was introduced, the pipeline ran, and npx failprompt without flags produced a correct, paste-ready output.
The project is open source at github.com/homeofe/failprompt and intentionally keeps its .ai/handoff/ directory public. Readers can follow the pipeline as a narrative by reading LOG.md, inspect what was verified vs. assumed in TRUST.md, and see where the pipeline left off in NEXT_ACTIONS.md. It is designed to be the simplest possible complete example of AAHP: from blank repository to a published, tested CLI tool.
8.3 Lessons That Changed the Specification
Three operational findings prompted retroactive updates to this specification:
- Rate-limit and availability awareness is an orchestrator responsibility. AI providers impose token budgets, rate limits, or rolling usage windows. A production AAHP orchestrator must track model availability as an external constraint and delay spawning when a model is temporarily unavailable - not treat this as a pipeline failure.
- Blocked tasks require explicit representation. When a task depends on external resources not yet available (credentials, approvals, external APIs), no agent can unblock it. This led to the `Ready?` field in DASHBOARD.md. Orchestrators check this flag before spawning; blocked tasks are skipped, not retried.
- Out-of-band notification is implicitly part of the protocol. In a fully autonomous pipeline, the human operator needs a push signal when a phase completes - not a dashboard to poll. Any production AAHP orchestrator should define a notification mechanism at session end. Future AAHP tooling should formalize this as a standard hook.
9. Comparison with Related Work
| Feature | AAHP | A2A | MCP | AG-UI |
|---|---|---|---|---|
| Communication model | Asynchronous, file-based | Synchronous, HTTP/SSE | Synchronous, JSON-RPC | Event stream, SSE |
| Agent relationship | Sequential relay | Concurrent collaboration | Client-server (agent ↔ tool) | Agent ↔ Human UI |
| State transfer | Complete context snapshot | Task-scoped messages | Tool call + response | UI events + state patches |
| Human readability | Required (Markdown) | Optional (JSON) | Not prioritized | Not prioritized |
| Trust tracking | First-class (TRUST.md) | Not specified | Not specified | Not specified |
| Transport | Git / filesystem | HTTP + JSON-RPC | HTTP + JSON-RPC / stdio | HTTP + SSE |
| Governance | Linux Foundation (proposed) | Linux Foundation | Linux Foundation (AAIF) | CopilotKit / open community |
AAHP is complementary to all three protocols. An agent using MCP for tool access and A2A for real-time collaboration would use AAHP to hand off its accumulated context to a successor agent in a different session.
10. Future Work
10.1 Schema Validation
Define a JSON Schema for the API transport format, enabling automated validation of handoff documents.
10.2 Multi-Agent Graphs
Extend the linear pipeline model to support directed acyclic graphs (DAGs) where multiple agents work in parallel and their handoffs are merged.
10.3 Conflict Resolution
Define semantics for resolving conflicts when two agents produce contradictory trust assertions or status updates about the same component.
10.4 MCP Integration
Create an MCP server that exposes AAHP handoff documents as MCP resources, allowing agents to query handoff state through the tool protocol.
10.5 Metrics and Observability
Define standard metrics for handoff quality: context preservation rate, redundant work ratio, trust assertion accuracy, and pipeline throughput.
11. Call for Participation
This specification is a draft. We invite the community to:
- Implement AAHP in your own multi-agent workflows and report findings.
- Propose extensions for domain-specific handoff requirements.
- Build tooling - linters, validators, visualizers for handoff pipelines.
- Contribute to governance - help establish AAHP as an open standard under an appropriate foundation.
The specification, reference implementation, and discussion forum are available on GitHub.
Appendix A: AAHP Document Templates
A.1 Minimal STATUS.md
# MyProject - Current State
> Last updated: 2026-02-19T16:00:00Z
> Agent: claude-opus-4-6
> Commit: abc1234
## Build Health
| Check | Result | Notes |
|---|---|---|
| Type-check | ✅ Pass | 12/12 packages |
| Build | ✅ Pass | All targets |
| Tests | ⚠️ Partial | Unit passes, no integration |
## What is Missing
| Gap | Severity | Description |
|---|---|---|
| Integration tests | HIGH | No E2E test suite |
| Production config | MEDIUM | Only local env exists |
A.2 Minimal NEXT_ACTIONS.md
# MyProject - Next Actions
## 1. Write Integration Tests
**Goal:** Verify the API returns correct responses for all endpoints.
**Trust Level:** untested
**Files:** `src/routes/*.ts`, `tests/integration/`
Create a test suite using Vitest that hits each endpoint
with sample payloads and asserts response schemas.
## 2. Add Production Configuration
**Goal:** Create environment configs for staging and production.
**Trust Level:** untested
**Files:** `deployment/`, `.env.example`
A.3 Minimal LOG.md Entry
## Session 2026-02-19: Initial API Implementation
**Agent:** claude-opus-4-6
**Commits:** abc1234, def5678
### What was built
- REST API with 5 endpoints
- Database schema with migrations
- Authentication middleware
### Decisions made
1. Chose Fastify over Express for performance
2. Used Drizzle ORM for type-safe queries
### What was NOT done
- No rate limiting configured
- No WebSocket support (deferred to next session)
### Trust assertions
- API responses match OpenAPI spec (assumed, not tested with validator)
Appendix B: Keywords
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
This document is released under the Creative Commons Attribution 4.0 International License (CC BY 4.0). For questions, contributions, or implementations, contact the AAHP working group.