Essay·Mar 13, 2026·By Mathieu Stark·7 min read

MCP, agents and the quiet case for boring pipelines.

AI agents are only as good as the data contracts behind them. A practical view on what to automate first and why.

Most of the AI conversations we have in 2026 land in the same place. The team is excited about agents. They're piloting MCP integrations, RAG systems, conversational interfaces. The demos are working. The production deployments are stalling. When we look at the underlying data, we usually find the same set of unglamorous problems.

The boring pipelines, it turns out, are the load-bearing pieces of any AI system that has to survive contact with real users. Most of the AI investment in mid-market organizations is going into the surface layer (the agents, the prompts, the conversational UIs) and not enough is going into the foundation layer that the surface depends on. That mismatch is producing the gap between demo and production that almost every AI program is currently navigating.

What MCP actually changes

Model Context Protocol is genuinely useful. It standardizes how an LLM connects to a data source, which means the integration cost of giving an agent access to your CRM or your warehouse drops by an order of magnitude compared to building the same connection by hand. A team that would have spent three weeks building a custom connector to Salesforce can stand up an MCP server in a day.

That's a real efficiency. It is not, by itself, a strategy. MCP makes the connection cheap. It doesn't make the data on the other side of the connection any cleaner or any better-defined. An MCP integration to a confused warehouse is a faster path to a confused agent. The agent now has direct access to your messy data, refreshed in real time, with no governance layer in between, which is exactly the wrong direction if the goal is trustworthy answers.

The pattern most successful teams are converging on: MCP for the connection, governed semantic layer for the meaning. The agent talks to MCP. MCP talks to a layer that has tested definitions, freshness monitoring, and access controls. The agent inherits all of that without knowing it's there.
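The routing pattern can be sketched in a few lines. This is not the MCP SDK; the registry, metric names, SQL, and roles below are invented for illustration. The point is the shape: the agent's tool call resolves through tested definitions and access controls it never has to know about.

```python
# Hypothetical governed-layer registry: one tested definition per metric,
# plus the roles allowed to query it. Everything here is illustrative.
GOVERNED_METRICS = {
    "revenue": {
        "sql": "SELECT SUM(amount) FROM fct_orders WHERE status = 'closed'",
        "allowed_roles": {"finance", "exec"},
    },
}

def handle_metric_tool_call(metric: str, caller_role: str) -> str:
    """What a tool handler behind MCP might do: resolve the request through
    the governed registry and enforce access before any data moves."""
    defn = GOVERNED_METRICS.get(metric)
    if defn is None:
        return f"unknown metric: {metric}"
    if caller_role not in defn["allowed_roles"]:
        return "access denied"
    return f"execute: {defn['sql']}"

print(handle_metric_tool_call("revenue", "finance"))
# execute: SELECT SUM(amount) FROM fct_orders WHERE status = 'closed'
```

The agent asked for "revenue" and got the one tested definition; it never saw a table name, and a caller without the right role never saw anything at all.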

Why your agents are stalling

The pattern across the agent pilots that didn't reach production: the data layer underneath wasn't ready.

The agent could pull customer records, but the records weren't reconciled across systems. Three definitions of who the customer was, all visible to the agent through MCP, none of them resolving cleanly. The agent's answers reflected the underlying inconsistency. A user asking 'how much have we sold to Acme this year' got three different numbers depending on how the agent's last query happened to route through the entity graph.

The agent could pull metrics, but the metric definitions weren't tested. A model trained to answer 'how is revenue trending' was returning numbers that disagreed with the CFO's monthly report. The agent inherited the disagreement that had been lurking in the warehouse all along.

The agent could pull operational context, but the upstream data freshness wasn't monitored. A salesperson asking 'is this account healthy' got an answer based on data that was three days stale, with no signal to the user that anything was out of date.

None of these are AI problems. They're pipeline problems. The agentic layer just made them visible at a speed nobody was prepared for. Bad data used to surface in monthly reports, where someone could catch it before it reached leadership. Bad data on top of an agent surfaces in real time, in customer-facing conversations, in board meetings.


The case for boring pipelines

What separates the agentic deployments that work from the ones that stall is almost always the maturity of the data layer underneath. Specifically:

Data contracts at the boundary between source systems and the warehouse, so the agent's inputs don't change shape silently when an upstream team renames a column. The contracts catch breaking changes before they propagate, which means the agent stops getting wrong answers because of a Salesforce field rename two systems away.
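A minimal contract check looks something like this. The field names are illustrative, not a real Salesforce schema; the idea is that the boundary validates shape before anything downstream sees the rows.

```python
# Hypothetical contract for rows a source system sends to the warehouse.
CONTRACT = {"account_id": str, "amount": float, "closed_at": str}

def check_contract(row: dict) -> list:
    """Return a list of violations; empty means the row honors the contract."""
    problems = []
    for field, expected in CONTRACT.items():
        if field not in row:
            problems.append(f"missing field: {field}")
        elif not isinstance(row[field], expected):
            problems.append(
                f"{field}: expected {expected.__name__}, "
                f"got {type(row[field]).__name__}"
            )
    for field in row:
        if field not in CONTRACT:
            problems.append(f"unexpected field: {field}")
    return problems

# An upstream team renames account_id to acct_id: the contract catches it
# at the boundary, before the agent ever sees the shape change.
bad_row = {"acct_id": "0011", "amount": 250.0, "closed_at": "2026-03-01"}
print(check_contract(bad_row))
# ['missing field: account_id', 'unexpected field: acct_id']
```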

A tested semantic layer where business metrics are defined exactly once, so the agent's answers reconcile with the dashboards leadership already trusts. The agent and the BI tool draw from the same definitions, and an answer the agent gives in a conversation matches the number leadership saw in last week's report.
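"Defined exactly once" is a structural property, not a policy. In this toy sketch (the metric and data are invented), both the agent path and the dashboard path resolve through the same registry entry, so the two numbers cannot drift apart.

```python
# One registry, one definition per metric. Both consumers read from it.
METRICS = {
    "retention": lambda rows: sum(r["retained"] for r in rows) / len(rows),
}

def agent_answer(metric: str, rows: list) -> float:
    """The conversational path: resolves the metric through the registry."""
    return METRICS[metric](rows)

def dashboard_tile(metric: str, rows: list) -> float:
    """The BI path: resolves the same metric through the same registry."""
    return METRICS[metric](rows)

rows = [{"retained": 1}, {"retained": 0}, {"retained": 1}, {"retained": 1}]
assert agent_answer("retention", rows) == dashboard_tile("retention", rows)
```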

Freshness monitoring, so the agent can refuse to answer when the data is too stale, instead of confidently returning yesterday's view. The agent can say 'this data is older than I'd usually require for this question' and surface the staleness rather than hide it.
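Refusal-on-staleness is a small amount of code. The six-hour budget below is an illustrative assumption, not a recommendation; what matters is that the check runs before the answer, not after.

```python
from datetime import datetime, timedelta, timezone

# Illustrative freshness budget; in practice this varies per data source.
FRESHNESS_BUDGET = timedelta(hours=6)

def answer_or_refuse(question: str, last_loaded: datetime) -> str:
    """Surface staleness instead of answering confidently from old data."""
    age = datetime.now(timezone.utc) - last_loaded
    if age > FRESHNESS_BUDGET:
        return f"refusing: source data is {age.days}d {age.seconds // 3600}h old"
    return f"answering '{question}' from fresh data"

fresh = datetime.now(timezone.utc) - timedelta(hours=1)
stale = datetime.now(timezone.utc) - timedelta(days=3)
print(answer_or_refuse("is this account healthy", fresh))
print(answer_or_refuse("is this account healthy", stale))
```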

A reconciled entity layer, so customer-related queries return one customer instead of three. The agent's answers about Acme are about Acme, not about whichever Acme record the query happened to find first.
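A toy reconciliation pass shows the shape of the problem: three "Acme" records from three systems collapse into one canonical customer before anything queries them. Matching on a normalized name, as here, is purely illustrative; real entity resolution uses stronger keys (domains, tax IDs) and survivorship rules.

```python
def normalize(name: str) -> str:
    """Crude name normalization for illustration only."""
    return name.lower().replace(",", "").replace(".", "").replace(" inc", "").strip()

def reconcile(records: list) -> dict:
    """Collapse records that normalize to the same key into one entity."""
    canonical = {}
    for rec in records:
        key = normalize(rec["name"])
        entry = canonical.setdefault(key, {"revenue": 0.0, "sources": []})
        entry["revenue"] += rec["revenue"]
        entry["sources"].append(rec["source"])
    return canonical

crm     = {"source": "crm",     "name": "Acme, Inc.", "revenue": 120_000.0}
billing = {"source": "billing", "name": "ACME Inc",   "revenue": 45_000.0}
support = {"source": "support", "name": "acme",       "revenue": 0.0}

entities = reconcile([crm, billing, support])
assert len(entities) == 1
assert entities["acme"]["revenue"] == 165_000.0
```

One entity, one revenue number, regardless of which system a query routes through.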

None of this is interesting work. None of it ships in the demo video. All of it is the difference between an agent that surprises a user with a great answer and an agent that surprises them with a wrong one.

The maturity model that actually predicts agent readiness

We've started using a simple four-stage model with clients. Where you sit on it predicts which agentic deployments will work and which will stall.

Stage one: source systems with no shared definitions. AI agents at this stage produce demo-quality answers and production-quality confusion.

Stage two: warehouse and dbt project, but no semantic layer or formal definitions. AI agents here produce inconsistent answers because the data they pull is internally inconsistent. Pilots run. Pilots stall.

Stage three: warehouse with a semantic layer, governed metric definitions, and basic freshness monitoring. AI agents here can produce trustworthy answers on metric-level questions. The first wave of production-quality agentic deployments lives in this stage.

Stage four: stage three plus reconciled entities, data contracts at source boundaries, and end-to-end lineage. AI agents here can be trusted with operational decisions, customer-facing answers, and executive reporting. This is where the next two years of competitive separation in mid-market AI will happen.

Most of the companies running agent pilots in 2026 are at stage two and trying to operate as if they're at stage four. The gap is exactly where their pilots are stalling.

What to automate first and why

We push clients toward a sequence that's the inverse of what most vendors recommend.

First, fix the entity layer. Customer, account, product, transaction. Reconciled, owned, tested. This is the step that makes any agentic deployment more accurate than gut feel. Most companies underestimate how much of their AI ambiguity comes from entity-level mismatches that look fine in any single system but explode when an agent crosses systems.

Second, build the semantic layer for the metrics that matter. Revenue, retention, churn, pipeline. One definition each, tested, version-controlled. The agent's answer to 'how is retention trending' is now traceable to a definition the CFO already endorsed.

Third, add data contracts and freshness monitoring. The agent gets predictable, refusable inputs. When something breaks upstream, the agent knows it before the user does and can degrade gracefully rather than answer confidently with stale data.

Then, and only then, build the agentic layer on top. By the time you get there, you have a system that can be trusted to answer real questions, because the answers are coming from a layer that was already trustworthy without the agent on top.

The agentic future will arrive for the companies who built boring pipelines first. The ones who skipped that step are getting the same answers they always did, just faster and more confidently wrong. MCP makes the connection cheap. It doesn't make the data trustworthy. The work that produces trustworthy data is unglamorous, takes longer than a quarter, and doesn't make for a good demo. It's also the only thing that separates an AI investment that compounds from an AI investment that just costs more every year.

Outcomes start with a Blueprint. We plan, build and run from there.

Thirty minutes with an 829 Analytics partner. You leave with a prioritized view of what to build first, what's worth waiting on, and the business metric anchoring each move. Whether or not we end up working together.