Why does protocol choice matter for a small agency that just builds what the client asks for?

Because the protocol is the lock-in layer underneath the agent code. If a 6-person agency builds 15 MCP servers for one client's tool inventory through 2026, those servers are MCP-coupled, they speak the MCP wire format, expect MCP capability discovery, and authenticate the MCP way. If a second client wants the same functionality on an A2A-based stack, the servers do not port. The hours that built them are a sunk cost specific to the first client's protocol estate. Most small agencies do not realise they are accumulating protocol-specific assets until the second client asks for a different stack and the re-implementation bill arrives. The protocol decision is the strategic decision the agency owner should make deliberately, not the engineering choice the first engineer happens to make.

What about clients on the proprietary platforms (Microsoft Copilot, Salesforce Agentforce, SAP Joule, ServiceNow Now Assist)?

The same rule still works. The proprietary platforms have their own extensibility surfaces (Copilot connectors, Agentforce actions, Joule extensions, Now Assist actions) and do not adopt MCP, A2A, or Llama Stack as defaults. For a client on one of these platforms, the agency's tool inventory needs the proprietary wrapper. The plain-HTTP-first discipline keeps the underlying business logic intact: the same internal service that wraps as an MCP server for one client can wrap as an Agentforce action for another. The wrapper hours are different per platform but the asset is the same. The accounting trick is the same: count the plain HTTP services as the asset, count the wrappers as disposable.

How does the agency price this for clients?

Two practical patterns. Pattern A: agency picks the protocol per project as part of the discovery phase, prices the wrapper hours into the project, and the client owns the wrapper on delivery. The agency keeps the underlying HTTP services as agency-side IP unless the contract specifies otherwise. Pattern B: agency builds a multi-protocol portfolio internally, prices the first protocol at the project rate and additional protocols at 30-40% of the original because the underlying service is reused. Pattern B is more attractive for agencies serving the same vertical across multiple clients on different platforms; Pattern A is simpler for agencies doing one-off projects. Whichever pattern, the contract should be clear on what is delivered (wrapper + service vs wrapper only) so the IP and reuse questions are answered before they become an argument.

What does the enterprise sibling piece add?

AM-169 covers the enterprise procurement layer of the same protocol decision. The three procurement clauses it names (protocol portability disclosure, tool inventory exit terms, protocol-roadmap commitment) are the language the agency's enterprise clients will increasingly use in 2026 subcontract and partnership terms. Reading AM-169 gives the agency owner the vocabulary to negotiate from the same starting position as the client's procurement team.

How does this article track its own claim?

Claim OPS-076 in the Holding-up ledger, 45-day review on 8 Jul 2026. Trigger conditions: (1) MCP, A2A, and Llama Stack publish a converged or interoperable spec, would move the claim toward Partial because the protocol fragmentation is closing; (2) one of the major proprietary platforms (Microsoft, Salesforce, SAP, ServiceNow) adopts MCP or A2A as a first-class default, would shift the picking logic materially; (3) a published 2026 SMB-services-agency benchmark on protocol-switching costs, would empirically anchor the plain-HTTP-first rule with real numbers. Sibling: AM-169.

Agent protocol for small agencies: MCP, A2A, Llama Stack

Q: How should the agency pick between MCP, A2A, and Llama Stack?

Three rules. Rule 1: match the client's existing stack. If the client is Anthropic-aligned (Claude Code, Cursor, Claude Enterprise), pick MCP. If the client is Google-aligned (Vertex AI agent platform, Gemini Enterprise), pick A2A. If the client is Llama-Stack-aligned (which usually means a self-hosted or sovereignty-driven deployment), pick Llama Stack. Do not introduce protocol diversity into a client that has none. Rule 2: when the client has no existing stack, default to MCP for tool-heavy work and A2A for inter-agent coordination. MCP has the broadest 2026 server ecosystem; A2A has the cleanest agent-collaboration story. Rule 3: avoid Llama Stack unless the client has an explicit reason for it (data sovereignty, fully self-hosted operations, hard cost constraint), because its tooling community is smaller in 2026 and the agency's hour-per-feature productivity is materially lower.

Q: What is the rule that keeps the agency's tool inventory portable?

Build every tool as a plain HTTP service first. Wrap it to the chosen protocol second. The wrapper is the disposable layer; the HTTP service is the asset. An internal Slack-summarisation tool exposed as a REST endpoint can be wrapped as an MCP server in 50 lines of glue, as an A2A endpoint in 50 lines of different glue, as a Llama Stack tool plugin in 50 lines of yet-different glue, and as a Copilot connector in roughly the same. The business logic, the prompts, the data joining, the rate limiting, the error handling, lives in the HTTP service once. The protocol wrapping happens at the boundary and is rewritable per client. The discipline doubles the up-front cost of the first tool by maybe 20-30% and eliminates the re-implementation cost on protocol switch. Agencies that have adopted this pattern report quoting cross-protocol features at 1.2x the single-protocol price and delivering them at 0.4x the single-protocol effort on re-implementation.

At a glance

Claim

A small agency (1-15 person team) building agentic features on paid client work in 2026 should pick its agent protocol per project by reading the client's existing stack (Anthropic-aligned client → MCP; Google-aligned client → A2A; sovereignty- or self-hosted-aligned client → Llama Stack), default to MCP for tool-heavy work and A2A for agent-collaboration work when the client has no existing stack, and keep its tool inventory portable by building every tool as a plain HTTP service first and wrapping it to the chosen protocol second. The plain-HTTP-first discipline costs roughly 20-30% extra on the first tool of a project and produces 60% wrapper-effort reduction on the second client requesting the same functionality on a different protocol. Tracking wrapper hours separately from service hours in the agency's time log is the simplest instrument for quoting the re-platform cost accurately when a client asks.

Supporting figure

Anthropic's Model Context Protocol crossed broad client and server adoption through 2025; Google's Agent2Agent protocol joined the Linux Foundation in 2025; Meta's Llama Stack consolidated as an open-source runtime on a separate track, the three are not on a convergence trajectory and most 2026 agent platforms support only one or two of them natively

Date

24 May 2026

Verdict

Holding(OPS-076)

Next review

8 Jul 2026(+20d)

If your small agency builds agentic features on paid client work, you are picking an agent protocol whether or not you call it that. MCP, A2A, and Llama Stack do not converge in 2026, and the proprietary surfaces (Microsoft Copilot Agent, Salesforce Agentforce, SAP Joule, ServiceNow Now Assist) do not adopt any of them as defaults. The first time the question becomes a problem for a 6-person agency is the second time the agency is asked to deliver similar functionality on a different protocol estate, and the re-implementation hours land on the agency’s books rather than the client’s, because the engineering work was protocol-coupled by default.

The enterprise framing of the protocol decision is at AM-169, which explains why the protocol decision is harder to reverse than the model decision and what the three procurement clauses are that close the lock-in gap from the contract side. The small-agency version of the same problem has a simpler operational answer: pick by client, build to a portable shape, and track switching cost in hours.

Three rules for picking the protocol

Rule 1: match the client’s existing stack. If the client is Anthropic-aligned, Claude Code in engineering, Cursor as the engineer’s IDE, Claude Enterprise for the broader workspace, pick MCP. If the client is Google-aligned, Vertex AI agent platform, Gemini Enterprise, A2A integrations, pick A2A. If the client is Llama-Stack-aligned, usually because of data sovereignty requirements, fully self-hosted operations, or hard cost constraints, pick Llama Stack. Do not introduce protocol diversity into a client that has none. The client’s existing protocol estate is the path of least resistance for the integration, the audit, and the long-term maintenance.

Rule 2: when the client has no existing stack, default to MCP for tool-heavy work, A2A for agent-collaboration work. MCP has the broadest 2026 server ecosystem, the most reference implementations, and the lowest engineering friction for wrapping internal services as tools. A2A has the cleanest story for cross-agent coordination, if the feature is “agent A delegates to agent B and waits for a result”, A2A is the lower-friction protocol. The two compose; an agency can build the tool surface in MCP and the agent-coordination surface in A2A in the same project.

Rule 3: avoid Llama Stack unless the client has an explicit reason for it. The tooling community is smaller in 2026; the engineering documentation is thinner; the agency’s hour-per-feature productivity is materially lower than on MCP or A2A. If the client says “we want self-hosted, sovereign, with no vendor dependency”, Llama Stack is the answer. If the client has not said that, it is the wrong default.

The portability rule: HTTP first, protocol second

The asset that gets locked in a protocol estate is the tool inventory. The agency’s hours that built the tools are the engineering cost that does not transfer when the next client wants the same functionality on a different protocol.

The rule that breaks the lock-in: build every tool as a plain HTTP service first, wrap it to the chosen protocol second. The wrapper is the disposable layer. The HTTP service is the asset.

An internal Slack-summarisation tool exposed as a REST endpoint at /summarise can be wrapped as an MCP server in roughly 50 lines of glue, as an A2A endpoint in roughly 50 lines of different glue, as a Llama Stack tool plugin in roughly 50 lines of yet-different glue, and as a Copilot connector in approximately the same. The business logic, the prompts, the data joining, the rate limiting, the error handling, lives in the HTTP service once. The protocol wrapping happens at the boundary.

The discipline costs about 20-30% extra on the first tool of a project because the agency is now building two layers instead of one. The payoff is on the second client. Agencies that have adopted this pattern through 2025-2026 report quoting cross-protocol features at roughly 1.2x the single-protocol price and delivering them at 0.4x the single-protocol effort on re-implementation. The savings compound across the agency’s portfolio rather than per-project.

The HTTP service is also the easier asset to support, monitor, and debug. The wrapper is the protocol-specific surface that changes when the protocol changes; the service does not need to change. Maintenance load drops because changes to the business logic are made in one place rather than per protocol.

The proprietary-platform variation

The same rule works for clients on Microsoft Copilot, Salesforce Agentforce, SAP Joule, or ServiceNow Now Assist. The proprietary platforms maintain their own extensibility surfaces (Copilot connectors, Agentforce actions, Joule extensions, Now Assist actions). For a client on one of these, the agency’s tool inventory needs a proprietary wrapper instead of (or in addition to) an open-protocol wrapper.

The plain-HTTP-first discipline still keeps the underlying business logic intact. The same internal service that wraps as an MCP server for one client can wrap as an Agentforce action for another. The wrapper hours are different per platform but the asset is the same. The accounting trick is unchanged: count the plain HTTP services as the asset, count the wrappers as disposable.

The harder case is when the client’s proprietary platform requires the business logic to live inside the platform (Salesforce Apex code embedded in an Agentforce topic, ServiceNow scripts embedded in a Now Assist flow). In those cases, the portability story narrows, the platform-embedded logic is not portable to the next client. The agency should price these projects accordingly and surface the platform-lock to the client in the project scope, rather than discovering it in the next contract negotiation.

Pricing the work

Two practical patterns.

Pattern A: pick the protocol per project, deliver wrapper and service to the client. The agency picks the protocol per project as part of discovery, prices the wrapper hours into the project, and the client owns the wrapper (and usually the service) on delivery. The agency keeps the underlying HTTP service as agency-side intellectual property unless the contract specifies otherwise. The pattern is simple, defensible, and works for one-off projects.

Pattern B: maintain a multi-protocol portfolio internally, reuse the service across clients. The agency builds a portfolio of plain HTTP services covering common patterns (Slack summarisation, CRM enrichment, document Q&A, ticket triage, calendar coordination). For each client project, the agency wraps the relevant services into the client’s protocol estate. The first protocol is priced at the project rate; additional protocols on the same logic are priced at 30-40% of the original because the underlying service is reused. The pattern is more attractive for agencies serving the same vertical across multiple clients on different platforms.

Whichever pattern the agency uses, the contract should be explicit about what is delivered. “Wrapper + service” (client owns everything) versus “wrapper only” (client owns the wrapper, agency retains the service) is the key distinction. Most agency-side ambiguity on this question shows up later as an argument about reuse rights when the second client wants the same functionality.

Tracking the switching cost in hours

The simplest agency-side instrument for the protocol decision is an hours log. Per client project, log: hours building plain HTTP services (the asset), hours building protocol wrappers (the disposable layer), and the protocol(s) the wrappers target. The log is a spreadsheet column added to the existing time-tracking practice.

The log produces two useful answers. First, it makes the asset-versus-disposable ratio visible, agencies typically discover that wrapper hours are 15-25% of the total when the HTTP-first discipline is in place, versus 70-90% when it is not. Second, it produces a credible re-platform cost figure when a client asks “what would it cost to move this to a different protocol”, the answer is the wrapper hours, not the wrapper-plus-service hours. Quoting the re-platform cost accurately is often the difference between winning the second project with the same client and losing it to a competitor with a better number.

The order of operations for the small agency

The credential and identity layer (the NHI starter kit at OPS-074) comes first. The shadow-AI audit (the intra-vendor capability audit at OPS-075) comes second if the agency is consuming AI tools rather than only building them. The protocol picking covered in this piece comes third, and only matters when the agency is building agentic features for clients rather than only using AI tools on its own work.

The enterprise framings at AM-167, AM-168, and AM-169 are the language the agency’s enterprise clients are now using on the procurement side of the table. The agency that can hold both sides of the conversation, the small-team operational version and the enterprise procurement version, has a structural advantage in the 2026 services market. The procurement-mature small agency is the one that ages well as the protocol fragmentation widens through the rest of 2026 and into 2027.

ShareX / Twitter LinkedIn Email

OPS-076holdingsince 24 May 2026SiblingAM-169RegisterReporting

Spotted an error? See corrections policy →

Part of the pillar

AI security for small teams →

Practical agent security without an IT department — non-human identity, shadow-AI audits, kill-switches, and tool-memory hygiene for small teams. 9 other pieces in this pillar.

Picking an agent protocol when you are a 6-person agency: MCP, A2A, Llama Stack, and the rule that keeps your tool inventory portable

Three rules for picking the protocol

The portability rule: HTTP first, protocol second

The proprietary-platform variation

Pricing the work

Tracking the switching cost in hours

The order of operations for the small agency

AI security for small teams →

Related reading

Three rules for picking the protocol

The portability rule: HTTP first, protocol second

The proprietary-platform variation

Pricing the work

Tracking the switching cost in hours

The order of operations for the small agency

AI security for small teams →

Related reading

Windsurf and MCP advisories hit the IDEs your team already runs: the May 2026 small-agency playbook

AI-generated fraud is now aimed at small businesses, and the defense is procedural, not technical

Calendar phishing and ClickFix: the June advisory, read for small teams

AI-written analysis, signed by a practitioner. One or two pieces a week.

AI-written analysis, signed by a practitioner. One or two pieces a week.