RFC Idea: The AI-to-AI Handoff Protocol (AAHP)
A Proposal for Standardized Context Handoff Between Sequential AI Agents
Draft - February 2026
Author: AAHP Working Group
Abstract
The agentic AI ecosystem has produced remarkable protocols for connecting models to tools (MCP), enabling agent-to-agent communication (A2A), and bridging agents with user interfaces (AG-UI). Yet a critical gap remains: there is no standardized protocol for sequential context handoff between AI agents working on the same task across time, sessions, or model boundaries. This paper proposes the AI-to-AI Handoff Protocol (AAHP) - a lightweight, file-based standard for preserving intent, decisions, state, and trust between autonomous agents operating in relay-style workflows. Where MCP asks "What tools can I use?" and A2A asks "Which agent can help me right now?", AAHP asks: "What does the next agent need to know to continue my work?"
1. Problem Statement
Modern software development increasingly relies on AI agents. But these agents don't operate in isolation - they work in pipelines:
- Agent A researches requirements and produces an architecture.
- Agent B implements the architecture as code.
- Agent C reviews and validates the implementation.
- Agent D deploys and monitors the result.
Each agent may be a different model, a different instance of the same model, or the same model in a new context window. What they share is a task lineage - a chain of intent, decisions, and artifacts that must survive the transition from one agent to the next.
Today, this handoff happens through ad-hoc mechanisms: copy-pasted summaries, MEMORY.md files, CLAUDE.md conventions, or verbose system prompts that attempt to encode the entire project state. The result is:
- Context loss - Critical decisions evaporate between sessions.
- Redundant work - Successor agents re-derive conclusions their predecessors already reached.
- Trust ambiguity - No formal record of what was verified, what was assumed, and what remains untested.
- Blame diffusion - When something breaks, it's unclear which agent introduced the defect.
The AI-to-AI Handoff Protocol addresses these problems by defining a structured, machine-readable format for inter-agent context transfer.
2. Relationship to Existing Protocols
AAHP does not compete with existing agentic protocols. It fills a gap in the stack:
| Layer | Protocol | Focus |
|---|---|---|
| Tool Access | MCP (Model Context Protocol) | Agent ↔ External Tools & Data |
| Agent Communication | A2A (Agent-to-Agent) | Agent ↔ Agent (real-time, concurrent) |
| User Interaction | AG-UI (Agent-User Interaction) | Agent ↔ Human Interface |
| Context Handoff | AAHP (this proposal) | Agent → Agent (sequential, asynchronous) |
The key distinction: A2A enables agents to collaborate simultaneously; AAHP enables agents to collaborate across time. A2A is a phone call between agents. AAHP is a shift handover log.
3. Design Principles
AAHP is guided by five principles:
3.1 File-First, Not Wire-First
Unlike MCP and A2A, which define transport protocols (JSON-RPC, HTTP, SSE), AAHP is file-based. Handoff documents are Markdown or JSON files committed alongside code. This design choice reflects reality: most agent-to-agent handoffs today happen via files in repositories, not live connections. Files are versionable, diffable, auditable, and human-readable.
3.2 Human-in-the-Loop Compatible
Every AAHP document MUST be readable by a human engineer. An operator should be able to inspect a handoff, override decisions, or redirect the next agent - without needing protocol-specific tooling.
3.3 Minimal Viable Context
AAHP favors concise, structured handoffs over exhaustive dumps. The goal is to transmit the minimal context required for the successor agent to continue work without re-deriving prior conclusions. Brevity is a feature, not a limitation.
3.4 Trust Provenance
Every assertion in a handoff carries a trust level: verified, assumed, or untested. Successor agents inherit these trust markers and are expected to respect them - prioritizing verification of untested claims before building on them.
3.5 Append-Only History
Handoff history is immutable. Agents MUST NOT modify previous handoff entries. They MAY add corrections or amendments as new entries. This ensures a complete audit trail of the agent pipeline.
4. Protocol Specification
4.1 Handoff Directory Structure
An AAHP-compliant project maintains a .ai/handoff/ directory at the repository root:
.ai/
  handoff/
    STATUS.md          # Current state of the system (REQUIRED)
    NEXT_ACTIONS.md    # Prioritized work queue for successor (REQUIRED)
    LOG.md             # Append-only session journal (REQUIRED)
    CONVENTIONS.md     # Project-specific rules and patterns (OPTIONAL)
    TRUST.md           # Trust provenance registry (OPTIONAL)
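A compliance check for this layout can be sketched in a few lines. The file names and their REQUIRED/OPTIONAL status come from the listing above; the function name `missing_required_files` and its return shape are illustrative assumptions, not part of the specification.

```python
from pathlib import Path

# File names mirror the directory listing in section 4.1.
REQUIRED_FILES = ("STATUS.md", "NEXT_ACTIONS.md", "LOG.md")
OPTIONAL_FILES = ("CONVENTIONS.md", "TRUST.md")


def missing_required_files(repo_root: str) -> list[str]:
    """Return the REQUIRED handoff files that are absent under repo_root.

    Hypothetical helper: an empty list means the layout is complete.
    """
    handoff_dir = Path(repo_root) / ".ai" / "handoff"
    return [name for name in REQUIRED_FILES if not (handoff_dir / name).is_file()]
```

A linter or CI step could call `missing_required_files(".")` and fail the build when the list is non-empty.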
4.2 STATUS.md - System State Document
The STATUS document provides a snapshot of the entire system at the time of handoff. It MUST include:
# [Project Name] — Current State
> Last updated: [ISO 8601 timestamp]
> Agent: [model identifier, e.g. "claude-opus-4-6"]
> Commit: [git SHA]
## Build Health
[Table: check name, result, notes]
## Component Status
[Table: component, location, state]
## What is Missing
[Table: gap, severity, description]
Requirements:
- The STATUS document MUST be regenerated (not appended) by each agent at session end.
- It MUST reference the exact commit at which the state was captured.
- Component states MUST use one of: `complete`, `implemented`, `partial`, `stub`, `not-started`, `broken`.
- Gap severities MUST use one of: `CRITICAL`, `HIGH`, `MEDIUM`, `LOW`, `DEFERRED`.
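The requirements above lend themselves to mechanical checking. The following is a minimal sketch, assuming the header lines follow the `> Key: value` shape shown in the template. The names `check_status_header` and `invalid_values` are hypothetical, and the timestamp check relies on Python's `datetime.fromisoformat`, which accepts common ISO 8601 forms rather than the full standard.

```python
import re
from datetime import datetime

# Allowed values as listed in the STATUS.md requirements.
COMPONENT_STATES = {"complete", "implemented", "partial", "stub", "not-started", "broken"}
GAP_SEVERITIES = {"CRITICAL", "HIGH", "MEDIUM", "LOW", "DEFERRED"}

# Assumes header lines look like "> Commit: abc1234", as in the template.
_HEADER_LINE = re.compile(r"^>\s*(Last updated|Agent|Commit):\s*(.+?)\s*$", re.MULTILINE)


def check_status_header(text: str) -> list[str]:
    """Return a list of problems found in the STATUS.md header block."""
    fields = dict(_HEADER_LINE.findall(text))
    problems = [
        f"missing header field: {name}"
        for name in ("Last updated", "Agent", "Commit")
        if name not in fields
    ]
    stamp = fields.get("Last updated")
    if stamp is not None:
        try:
            # Older Python versions reject a trailing "Z", so normalise it first.
            datetime.fromisoformat(stamp.replace("Z", "+00:00"))
        except ValueError:
            problems.append(f"Last updated is not an ISO 8601 timestamp: {stamp}")
    return problems


def invalid_values(values: list[str], allowed: set[str]) -> list[str]:
    """Return the values that are not in the allowed vocabulary."""
    return [value for value in values if value.strip() not in allowed]
```

Extracting the state and severity columns from the Markdown tables is left out here; once extracted, they can be passed to `invalid_values` together with `COMPONENT_STATES` or `GAP_SEVERITIES`.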
4.3 NEXT_ACTIONS.md - Successor Work Queue
The NEXT_ACTIONS document provides a prioritized list of tasks for the successor agent:
# [Project Name] — Next Actions
> Priority order. Work top-down. Each item is independent unless noted.
## 1. [Action Title]
**Goal:** [One-sentence objective]
**Trust Level:** [verified | assumed | untested]
[Detailed instructions, file references, expected outcomes]
## 2. [Action Title]
...
Requirements:
- Actions MUST be ordered by priority (highest first).
- Each action MUST include a clear goal statement.
- Each action SHOULD include file paths the successor will need.
- Actions SHOULD note dependencies between items.
- The predecessor agent MUST NOT include more than 10 actions. If more work remains, the final action SHOULD be "Re-assess remaining work and create updated NEXT_ACTIONS."
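The two MUST-level rules that can be checked mechanically (the 10-action cap and the presence of a goal statement) could be enforced with a small validator. This is a sketch that assumes actions are introduced by `## N. Title` headings and goals by a `**Goal:**` line, as in the template; `check_next_actions` is a hypothetical name. Priority ordering is a judgment call and is not checked.

```python
import re

MAX_ACTIONS = 10  # Cap stated in the NEXT_ACTIONS.md requirements.

# Assumes the heading and goal shapes shown in the template.
_ACTION_HEADING = re.compile(r"^## (\d+)\. (.+)$", re.MULTILINE)
_GOAL_LINE = re.compile(r"^\*\*Goal:\*\*\s*\S", re.MULTILINE)


def check_next_actions(text: str) -> list[str]:
    """Return a list of rule violations found in a NEXT_ACTIONS.md document."""
    problems = []
    headings = list(_ACTION_HEADING.finditer(text))
    if len(headings) > MAX_ACTIONS:
        problems.append(f"{len(headings)} actions listed, maximum is {MAX_ACTIONS}")
    for index, heading in enumerate(headings):
        # An action body runs from its heading to the next action heading.
        end = headings[index + 1].start() if index + 1 < len(headings) else len(text)
        body = text[heading.end():end]
        if not _GOAL_LINE.search(body):
            problems.append(f"action {heading.group(1)} has no goal statement")
    return problems
```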
4.4 LOG.md - Session Journal
The LOG document is an append-only record of all agent sessions:
# [Project Name] — Agent Journal
## Session [ISO date]: [Session Title]
**Agent:** [model identifier]
**Duration:** [approximate duration or context window usage]
**Commits:** [comma-separated SHAs]
### What was built
[Bullet list of artifacts created or modified]
### Verification
[What was tested and the results]
### Decisions made
[Numbered list of architectural or implementation decisions with rationale]
### What was NOT done
[Explicit list of deferred or skipped work with reasons]
### Trust assertions
[Claims made without full verification, marked for successor review]
Requirements:
- Each session MUST append a new entry. Previous entries MUST NOT be modified.
- The "What was NOT done" section is REQUIRED, not optional. Negative assertions are as valuable as positive ones.
- Commits MUST be listed so the successor can `git diff` exactly what changed.
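The append-only rule can be approximated by a prefix check: the previous contents of LOG.md must appear unchanged at the start of the new contents. The sketch below assumes an exact string-prefix comparison is acceptable (a tool might prefer to normalise trailing whitespace first); both function names are hypothetical.

```python
from pathlib import Path


def is_append_only(previous: str, current: str) -> bool:
    """True if current keeps every character of previous and only adds text after it."""
    return current.startswith(previous)


def append_session_entry(log_path: str, entry: str) -> None:
    """Append a session entry to the journal without touching earlier entries."""
    path = Path(log_path)
    previous = path.read_text(encoding="utf-8") if path.exists() else ""
    # Make sure the new entry starts on its own line.
    separator = "" if previous == "" or previous.endswith("\n") else "\n"
    with path.open("a", encoding="utf-8") as handle:
        handle.write(separator + entry.rstrip("\n") + "\n")
```

An orchestrator could compare the LOG.md of the predecessor's commit with the one the successor proposes and reject the handoff when `is_append_only` returns `False`.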
4.5 TRUST.md - Trust Provenance Registry
The TRUST document tracks the verification status of critical system properties:
# [Project Name] — Trust Registry
| Property | Status | Verified By | Session | Notes |
|---|---|---|---|---|
| Type-check passes | ✅ verified | claude-opus-4-6 | 2026-02-19 | 29/29 |
| Docker Compose boots | ⚠️ untested | — | — | Never run |
| CORS configured | ⚠️ assumed | claude-opus-4-6 | 2026-02-19 | Config written, not tested |
| E2E flow works | ⚠️ untested | — | — | No integration test exists |
Status values:
- `✅ verified` - Agent confirmed this through execution or testing.
- `⚠️ assumed` - Agent believes this is true but did not verify.
- `⚠️ untested` - No agent has attempted verification.
- `❌ broken` - Agent confirmed this is currently failing.
- `🔄 regression` - Previously verified, now broken.
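For tooling, the five status values map naturally onto an enumeration. The sketch below reads a TRUST.md table into a dictionary. It assumes the column order shown above and treats the last word of the Status cell as the keyword, ignoring the emoji prefix; `TrustStatus` and `parse_trust_registry` are hypothetical names.

```python
from enum import Enum


class TrustStatus(Enum):
    VERIFIED = "verified"
    ASSUMED = "assumed"
    UNTESTED = "untested"
    BROKEN = "broken"
    REGRESSION = "regression"


def parse_trust_registry(text: str) -> dict[str, TrustStatus]:
    """Map each property in a TRUST.md table to its status.

    Raises ValueError if a Status cell uses a keyword outside the five listed values.
    """
    registry = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # Not a table row.
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) < 2 or cells[0] == "Property" or set(cells[0]) <= {"-"}:
            continue  # Header row or separator row.
        tokens = cells[1].split()
        if not tokens:
            continue  # Empty Status cell.
        registry[cells[0]] = TrustStatus(tokens[-1])
    return registry
```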
5. Agent Roles in AAHP Pipelines
AAHP defines four canonical agent roles. A single agent session may fulfill multiple roles, and roles may be performed by different models:
5.1 Researcher
Gathers requirements, reads documentation, explores the codebase, and produces a structured understanding of the problem space. Outputs: updated STATUS.md with findings, NEXT_ACTIONS.md with implementation plan.
5.2 Implementer
Executes the work plan from NEXT_ACTIONS.md. Writes code, creates infrastructure, modifies configuration. Outputs: commits, updated STATUS.md, LOG.md entry with decisions and artifacts.
5.3 Verifier
Reviews the Implementer's work. Runs tests, checks types, validates contracts, attempts deployment. Outputs: updated TRUST.md, LOG.md entry with test results, NEXT_ACTIONS.md with fixes needed.
5.4 Validator
Performs end-to-end acceptance testing. Evaluates whether the original intent has been fulfilled. Outputs: final STATUS.md assessment, LOG.md entry with acceptance criteria results.
Pipeline Example
Researcher   →  Implementer  →  Verifier     →  Implementer (fix)  →  Validator
    ↓               ↓               ↓               ↓                     ↓
STATUS.md       LOG.md          TRUST.md        LOG.md                STATUS.md
NEXT_ACTIONS    STATUS.md       NEXT_ACTIONS    STATUS.md             (final)
                NEXT_ACTIONS                    NEXT_ACTIONS
Each arrow represents an AAHP handoff. Each agent reads the previous handoff files, performs its work, and produces updated handoff files for the successor.
6. Transport and Discovery
6.1 File Transport (Primary)
The primary transport is the filesystem. Handoff documents live in .ai/handoff/ and are committed to version control. This means:
- Every handoff is a git commit.
- Every handoff is reviewable in a pull request.
- Every handoff is auditable through `git log .ai/handoff/`.
6.2 API Transport (Extension)
For orchestration systems that manage agent pipelines programmatically, AAHP defines an optional JSON representation:
{
  "aahp_version": "0.1.0",
  "session": {
    "agent": "claude-opus-4-6",
    "started_at": "2026-02-19T14:30:00Z",
    "ended_at": "2026-02-19T16:45:00Z",
    "commits": ["d9175f7", "a2f1c25", "f9fd7cd"]
  },
  "status": { ... },
  "next_actions": [ ... ],
  "trust_assertions": [ ... ],
  "log_entry": { ... }
}
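Until a formal schema exists, an orchestrator could run a shallow structural check on this envelope. The sketch below only verifies that the keys shown in the example are present. Treating all of them as required is an assumption, and the contents of `status`, `next_actions`, `trust_assertions`, and `log_entry` are left opaque because the draft does not define them. `check_envelope` is a hypothetical name.

```python
import json

# Key names are taken from the example envelope; requiring all of them is an assumption.
TOP_LEVEL_KEYS = ("aahp_version", "session", "status", "next_actions",
                  "trust_assertions", "log_entry")
SESSION_KEYS = ("agent", "started_at", "ended_at", "commits")


def check_envelope(raw: str) -> list[str]:
    """Return structural problems found in a JSON handoff envelope."""
    doc = json.loads(raw)
    if not isinstance(doc, dict):
        return ["top-level value must be a JSON object"]
    problems = [f"missing top-level key: {key}" for key in TOP_LEVEL_KEYS if key not in doc]
    session = doc.get("session")
    if isinstance(session, dict):
        problems += [f"missing session key: {key}" for key in SESSION_KEYS if key not in session]
        if not isinstance(session.get("commits", []), list):
            problems.append("session.commits must be a list")
    elif session is not None:
        problems.append("session must be a JSON object")
    return problems
```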
6.3 Discovery
An AAHP-aware agent discovers handoff context by checking for .ai/handoff/STATUS.md at the repository root. If this file exists, the agent SHOULD read all handoff documents before beginning work.
A CLAUDE.md or equivalent model-specific instruction file MAY reference the handoff directory:
## Agent Handoff
This project uses AAHP. Read `.ai/handoff/STATUS.md` and
`.ai/handoff/NEXT_ACTIONS.md` before starting any work.
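The discovery rule can be sketched as a small loader: if `STATUS.md` is absent the project is treated as not using AAHP, otherwise every handoff document that exists is read. The function name `discover_handoff` and the choice to return `None` for non-AAHP projects are assumptions made for illustration.

```python
from pathlib import Path
from typing import Optional

# STATUS.md and NEXT_ACTIONS.md come first, matching the reading order
# suggested for successor agents; the rest follow the directory listing.
READ_ORDER = ("STATUS.md", "NEXT_ACTIONS.md", "LOG.md", "CONVENTIONS.md", "TRUST.md")


def discover_handoff(repo_root: str) -> Optional[dict[str, str]]:
    """Return handoff documents keyed by file name, or None if the project has none."""
    handoff_dir = Path(repo_root) / ".ai" / "handoff"
    if not (handoff_dir / "STATUS.md").is_file():
        return None  # No STATUS.md: not an AAHP project.
    return {
        name: (handoff_dir / name).read_text(encoding="utf-8")
        for name in READ_ORDER
        if (handoff_dir / name).is_file()
    }
```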
7. Security Considerations
7.1 Secrets
Handoff documents MUST NOT contain secrets, API keys, passwords, or tokens. Environment variable names MAY be referenced (e.g., "Set ANTHROPIC_API_KEY in .env"), but values MUST NOT appear.
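A pre-commit check can catch the most obvious violations. The sketch below is a heuristic and is far from exhaustive: it flags lines where a secret-looking variable name is followed by `=` or `:` and a value of at least eight characters, while allowing bare mentions of the variable name. The pattern and the name `find_suspected_secrets` are illustrative assumptions; a dedicated secret scanner would be the more robust choice.

```python
import re

# Heuristic: NAME ending in KEY/TOKEN/SECRET/PASSWORD, then "=" or ":", then a value.
_ASSIGNMENT = re.compile(
    r"\b([A-Z][A-Z0-9_]*(?:KEY|TOKEN|SECRET|PASSWORD))\s*[=:]\s*[\"']?([^\s\"'`]{8,})"
)


def find_suspected_secrets(text: str) -> list[tuple[int, str]]:
    """Return (line number, variable name) pairs for lines that appear to carry a value."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in _ASSIGNMENT.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits
```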
7.2 PII
Handoff documents SHOULD NOT contain personally identifiable information. If agent pipelines process user data, handoff documents SHOULD reference data locations, not data contents.
7.3 Prompt Injection
Successor agents SHOULD treat handoff documents as potentially compromised input. An AAHP-aware orchestrator SHOULD validate handoff documents against the schema before presenting them to successor agents. Handoff documents MUST NOT contain executable instructions disguised as data (e.g., "Ignore previous instructions and...").
7.4 Trust Escalation
An agent MUST NOT escalate trust assertions from a predecessor without independent verification. If Agent A marks a property as ⚠️ assumed, Agent B MUST NOT promote it to ✅ verified without performing its own verification.
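An orchestrator cannot prove that verification took place, but it can flag promotions that carry no evidence of it. The sketch below compares a registry before and after a session and reports properties that became `verified` while still being attributed to an earlier session. Using the Session column as that evidence is an assumption, and `TrustEntry` and `unsupported_promotions` are hypothetical names.

```python
from typing import NamedTuple


class TrustEntry(NamedTuple):
    status: str   # One of the status keywords, e.g. "assumed" or "verified".
    session: str  # Session that last verified the property, "" if none.


def unsupported_promotions(
    before: dict[str, TrustEntry],
    after: dict[str, TrustEntry],
    current_session: str,
) -> list[str]:
    """Return properties promoted to verified without attribution to the current session."""
    flagged = []
    for prop, new in after.items():
        if new.status != "verified":
            continue
        old = before.get(prop)
        was_verified = old is not None and old.status == "verified"
        # A fresh "verified" must name the session that performed the check.
        if not was_verified and new.session != current_session:
            flagged.append(prop)
    return flagged
```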
8. Real-World Validation
The AAHP specification was derived from practical experience building a large-scale European microservices platform comprising 10 services, 29 TypeScript packages, and infrastructure spanning multiple cloud providers.
During development, multiple AI agent instances worked in relay across extended sessions. The handoff mechanism that emerged organically - STATUS.md, NEXT_ACTIONS.md, LOG.md - became the basis for this specification.
Key findings from this experience:
- The "What was NOT done" section proved more valuable than "What was built." Successor agents consistently reported that explicit negative assertions prevented them from making false assumptions about system readiness.
- Trust provenance eliminated redundant verification. When the Verifier agent could see that type-checking had been verified but Docker Compose had never been tested, it immediately prioritized the untested path rather than re-running type checks.
- File-based handoff survived context window limits. Unlike system prompts that compete for tokens, handoff files can be selectively loaded - the successor reads STATUS.md first, then NEXT_ACTIONS.md, and only consults LOG.md for specific historical decisions.
- Git-committed handoffs enabled human oversight. Project maintainers reviewed handoff diffs in pull requests, catching cases where agents made incorrect assumptions or proposed architecturally unsound approaches.
9. Comparison with Related Work
| Feature | AAHP | A2A | MCP | AG-UI |
|---|---|---|---|---|
| Communication model | Asynchronous, file-based | Synchronous, HTTP/SSE | Synchronous, JSON-RPC | Event stream, SSE |
| Agent relationship | Sequential relay | Concurrent collaboration | Client-server (agent ↔ tool) | Agent ↔ Human UI |
| State transfer | Complete context snapshot | Task-scoped messages | Tool call + response | UI events + state patches |
| Human readability | Required (Markdown) | Optional (JSON) | Not prioritized | Not prioritized |
| Trust tracking | First-class (TRUST.md) | Not specified | Not specified | Not specified |
| Transport | Git / filesystem | HTTP + JSON-RPC | HTTP + JSON-RPC / stdio | HTTP + SSE |
| Governance | Linux Foundation (proposed) | Linux Foundation | Linux Foundation (AAIF) | CopilotKit / open community |
AAHP is complementary to all three protocols. An agent using MCP for tool access and A2A for real-time collaboration would use AAHP to hand off its accumulated context to a successor agent in a different session.
10. Future Work
10.1 Schema Validation
Define a JSON Schema for the API transport format, enabling automated validation of handoff documents.
10.2 Multi-Agent Graphs
Extend the linear pipeline model to support directed acyclic graphs (DAGs) where multiple agents work in parallel and their handoffs are merged.
10.3 Conflict Resolution
Define semantics for resolving conflicts when two agents produce contradictory trust assertions or status updates about the same component.
10.4 MCP Integration
Create an MCP server that exposes AAHP handoff documents as MCP resources, allowing agents to query handoff state through the tool protocol.
10.5 Metrics and Observability
Define standard metrics for handoff quality: context preservation rate, redundant work ratio, trust assertion accuracy, and pipeline throughput.
11. Call for Participation
This specification is a draft. We invite the community to:
- Implement AAHP in your own multi-agent workflows and report findings.
- Propose extensions for domain-specific handoff requirements.
- Build tooling - linters, validators, visualizers for handoff pipelines.
- Contribute to governance - help establish AAHP as an open standard under an appropriate foundation.
The specification, reference implementation, and discussion forum are available on GitHub.
Appendix A: AAHP Document Templates
A.1 Minimal STATUS.md
# MyProject — Current State
> Last updated: 2026-02-19T16:00:00Z
> Agent: claude-opus-4-6
> Commit: abc1234
## Build Health
| Check | Result | Notes |
|---|---|---|
| Type-check | ✅ Pass | 12/12 packages |
| Build | ✅ Pass | All targets |
| Tests | ⚠️ Partial | Unit passes, no integration |
## What is Missing
| Gap | Severity | Description |
|---|---|---|
| Integration tests | HIGH | No E2E test suite |
| Production config | MEDIUM | Only local env exists |
A.2 Minimal NEXT_ACTIONS.md
# MyProject — Next Actions
## 1. Write Integration Tests
**Goal:** Verify the API returns correct responses for all endpoints.
**Trust Level:** untested
**Files:** `src/routes/*.ts`, `tests/integration/`
Create a test suite using Vitest that hits each endpoint
with sample payloads and asserts response schemas.
## 2. Add Production Configuration
**Goal:** Create environment configs for staging and production.
**Trust Level:** untested
**Files:** `deployment/`, `.env.example`
A.3 Minimal LOG.md Entry
## Session 2026-02-19: Initial API Implementation
**Agent:** claude-opus-4-6
**Commits:** abc1234, def5678
### What was built
- REST API with 5 endpoints
- Database schema with migrations
- Authentication middleware
### Decisions made
1. Chose Fastify over Express for performance
2. Used Drizzle ORM for type-safe queries
### What was NOT done
- No rate limiting configured
- No WebSocket support (deferred to next session)
### Trust assertions
- API responses match OpenAPI spec (assumed, not tested with validator)
Appendix B: Keywords
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
This document is released under the Creative Commons Attribution 4.0 International License (CC BY 4.0). For questions, contributions, or implementations, contact the AAHP working group.