Skip to content
Method: every claim tracked, reviewed every 30–90 days, marked Holding, Partial, or Not holding. Drafted by Claude; signed off by Peter. How this works →
OPS-076pub24 May 2026rev24 May 2026read7 mininOperators

Picking an agent protocol when you are a 6-person agency: MCP, A2A, Llama Stack, and the rule that keeps your tool inventory portable

If your small agency builds agentic features on paid client work, you are picking an agent protocol whether or not you call it that. MCP, A2A, and Llama Stack do not converge in 2026. Pick a default by reading the client's existing stack, not by picking the protocol you find most interesting. The rule that keeps your tool inventory portable across clients: build every tool as a plain HTTP service first, wrap it to the chosen protocol second. The wrapper is the disposable layer; the HTTP service is the asset.

Holding·reviewed24 May 2026·next+38d

If your small agency builds agentic features on paid client work, you are picking an agent protocol whether or not you call it that. MCP, A2A, and Llama Stack do not converge in 2026, and the proprietary surfaces (Microsoft Copilot Agent, Salesforce Agentforce, SAP Joule, ServiceNow Now Assist) do not adopt any of them as defaults. The first time the question becomes a problem for a 6-person agency is the second time the agency is asked to deliver similar functionality on a different protocol estate, and the re-implementation hours land on the agency’s books rather than the client’s, because the engineering work was protocol-coupled by default.

The enterprise framing of the protocol decision is at AM-169, which explains why the protocol decision is harder to reverse than the model decision and what the three procurement clauses are that close the lock-in gap from the contract side. The small-agency version of the same problem has a simpler operational answer: pick by client, build to a portable shape, and track switching cost in hours.

Three rules for picking the protocol

Rule 1: match the client’s existing stack. If the client is Anthropic-aligned, Claude Code in engineering, Cursor as the engineer’s IDE, Claude Enterprise for the broader workspace, pick MCP. If the client is Google-aligned, Vertex AI agent platform, Gemini Enterprise, A2A integrations, pick A2A. If the client is Llama-Stack-aligned, usually because of data sovereignty requirements, fully self-hosted operations, or hard cost constraints, pick Llama Stack. Do not introduce protocol diversity into a client that has none. The client’s existing protocol estate is the path of least resistance for the integration, the audit, and the long-term maintenance.

Rule 2: when the client has no existing stack, default to MCP for tool-heavy work, A2A for agent-collaboration work. MCP has the broadest 2026 server ecosystem, the most reference implementations, and the lowest engineering friction for wrapping internal services as tools. A2A has the cleanest story for cross-agent coordination, if the feature is “agent A delegates to agent B and waits for a result”, A2A is the lower-friction protocol. The two compose; an agency can build the tool surface in MCP and the agent-coordination surface in A2A in the same project.

Rule 3: avoid Llama Stack unless the client has an explicit reason for it. The tooling community is smaller in 2026; the engineering documentation is thinner; the agency’s hour-per-feature productivity is materially lower than on MCP or A2A. If the client says “we want self-hosted, sovereign, with no vendor dependency”, Llama Stack is the answer. If the client has not said that, it is the wrong default.

The portability rule: HTTP first, protocol second

The asset that gets locked in a protocol estate is the tool inventory. The agency’s hours that built the tools are the engineering cost that does not transfer when the next client wants the same functionality on a different protocol.

The rule that breaks the lock-in: build every tool as a plain HTTP service first, wrap it to the chosen protocol second. The wrapper is the disposable layer. The HTTP service is the asset.

An internal Slack-summarisation tool exposed as a REST endpoint at /summarise can be wrapped as an MCP server in roughly 50 lines of glue, as an A2A endpoint in roughly 50 lines of different glue, as a Llama Stack tool plugin in roughly 50 lines of yet-different glue, and as a Copilot connector in approximately the same. The business logic, the prompts, the data joining, the rate limiting, the error handling, lives in the HTTP service once. The protocol wrapping happens at the boundary.

The discipline costs about 20-30% extra on the first tool of a project because the agency is now building two layers instead of one. The payoff is on the second client. Agencies that have adopted this pattern through 2025-2026 report quoting cross-protocol features at roughly 1.2x the single-protocol price and delivering them at 0.4x the single-protocol effort on re-implementation. The savings compound across the agency’s portfolio rather than per-project.

The HTTP service is also the easier asset to support, monitor, and debug. The wrapper is the protocol-specific surface that changes when the protocol changes; the service does not need to change. Maintenance load drops because changes to the business logic are made in one place rather than per protocol.

The proprietary-platform variation

The same rule works for clients on Microsoft Copilot, Salesforce Agentforce, SAP Joule, or ServiceNow Now Assist. The proprietary platforms maintain their own extensibility surfaces (Copilot connectors, Agentforce actions, Joule extensions, Now Assist actions). For a client on one of these, the agency’s tool inventory needs a proprietary wrapper instead of (or in addition to) an open-protocol wrapper.

The plain-HTTP-first discipline still keeps the underlying business logic intact. The same internal service that wraps as an MCP server for one client can wrap as an Agentforce action for another. The wrapper hours are different per platform but the asset is the same. The accounting trick is unchanged: count the plain HTTP services as the asset, count the wrappers as disposable.

The harder case is when the client’s proprietary platform requires the business logic to live inside the platform (Salesforce Apex code embedded in an Agentforce topic, ServiceNow scripts embedded in a Now Assist flow). In those cases, the portability story narrows, the platform-embedded logic is not portable to the next client. The agency should price these projects accordingly and surface the platform-lock to the client in the project scope, rather than discovering it in the next contract negotiation.

Pricing the work

Two practical patterns.

Pattern A: pick the protocol per project, deliver wrapper and service to the client. The agency picks the protocol per project as part of discovery, prices the wrapper hours into the project, and the client owns the wrapper (and usually the service) on delivery. The agency keeps the underlying HTTP service as agency-side intellectual property unless the contract specifies otherwise. The pattern is simple, defensible, and works for one-off projects.

Pattern B: maintain a multi-protocol portfolio internally, reuse the service across clients. The agency builds a portfolio of plain HTTP services covering common patterns (Slack summarisation, CRM enrichment, document Q&A, ticket triage, calendar coordination). For each client project, the agency wraps the relevant services into the client’s protocol estate. The first protocol is priced at the project rate; additional protocols on the same logic are priced at 30-40% of the original because the underlying service is reused. The pattern is more attractive for agencies serving the same vertical across multiple clients on different platforms.

Whichever pattern the agency uses, the contract should be explicit about what is delivered. “Wrapper + service” (client owns everything) versus “wrapper only” (client owns the wrapper, agency retains the service) is the key distinction. Most agency-side ambiguity on this question shows up later as an argument about reuse rights when the second client wants the same functionality.

Tracking the switching cost in hours

The simplest agency-side instrument for the protocol decision is an hours log. Per client project, log: hours building plain HTTP services (the asset), hours building protocol wrappers (the disposable layer), and the protocol(s) the wrappers target. The log is a spreadsheet column added to the existing time-tracking practice.

The log produces two useful answers. First, it makes the asset-versus-disposable ratio visible, agencies typically discover that wrapper hours are 15-25% of the total when the HTTP-first discipline is in place, versus 70-90% when it is not. Second, it produces a credible re-platform cost figure when a client asks “what would it cost to move this to a different protocol”, the answer is the wrapper hours, not the wrapper-plus-service hours. Quoting the re-platform cost accurately is often the difference between winning the second project with the same client and losing it to a competitor with a better number.

The order of operations for the small agency

The credential and identity layer (the NHI starter kit at OPS-074) comes first. The shadow-AI audit (the intra-vendor capability audit at OPS-075) comes second if the agency is consuming AI tools rather than only building them. The protocol picking covered in this piece comes third, and only matters when the agency is building agentic features for clients rather than only using AI tools on its own work.

The enterprise framings at AM-167, AM-168, and AM-169 are the language the agency’s enterprise clients are now using on the procurement side of the table. The agency that can hold both sides of the conversation, the small-team operational version and the enterprise procurement version, has a structural advantage in the 2026 services market. The procurement-mature small agency is the one that ages well as the protocol fragmentation widens through the rest of 2026 and into 2027.

ShareX / TwitterLinkedInEmail

OPS-076holdingsince 24 May 2026SiblingAM-169RegisterReporting

Spotted an error? See corrections policy →

Related reading

OPS-LEDGER · 48 reviewed