Data, AI, and the question of accountability

Something interesting is happening in the B2B data space. Vendors who have spent years explaining that they don't really hold data — they orchestrate, they enrich, they sit between sources and customers — are now shipping structured APIs designed for AI agents to query.

These are usually announced as MCP servers, after the Model Context Protocol that Anthropic released in late 2024 and that the major AI vendors have since adopted. The story is roughly: AI agents need access to fresh, structured business data; we have the data; here's a clean way to plug us in.

It's a sensible product move. AI agents are going to be doing a lot of B2B data lookups over the next few years, and a structured tool interface is much easier for an agent to use than a web scraper or an ad-hoc API. We're building in this direction ourselves.

But the move surfaces a question that the industry has been avoiding for a long time, and it surfaces it in a way that is hard to evade. The question is simply: if you are returning structured personal data records to an AI agent through a tool API, what exactly is your role under data protection law?

The shape of the new arrangement

To see why this is a sharper question than it was a year ago, it helps to be specific about what an MCP server actually does.

An AI agent receives a user's instruction — “find me 50 SaaS companies in Germany with 100–500 employees and a head of revenue I can email this week.” The agent reasons about how to fulfill that request. It calls a tool. The tool is implemented by a vendor's MCP server. The server queries an internal database. It returns structured records: company names, websites, headcounts, contact information, source links if any. The agent reads the response and continues.
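To make that flow concrete, here is a minimal sketch of the server side of such a tool call. It is an illustration under stated assumptions, not any vendor's actual implementation and not the MCP SDK: the `CompanyRecord` fields, the `search_companies` handler, and the in-memory `DATABASE` are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass(frozen=True)
class CompanyRecord:
    # Hypothetical record shape, mirroring the fields listed in the text.
    company: str
    website: str
    country: str
    headcount: int
    contact_title: str
    contact_email: str
    source_url: Optional[str]  # "source links if any"


# Hypothetical stand-in for the vendor's internal database.
DATABASE = [
    CompanyRecord("Beispiel GmbH", "https://beispiel.example", "DE", 240,
                  "Head of Revenue", "hr@beispiel.example",
                  "https://beispiel.example/team"),
    CompanyRecord("Muster AG", "https://muster.example", "DE", 1200,
                  "Head of Revenue", "rev@muster.example", None),
    CompanyRecord("Exemple SAS", "https://exemple.example", "FR", 300,
                  "Head of Revenue", "rev@exemple.example", None),
    CompanyRecord("Probe UG", "https://probe.example", "DE", 150,
                  "Chief Revenue Officer", "cro@probe.example", None),
]


def search_companies(country: str, min_headcount: int, max_headcount: int,
                     contact_title: str, limit: int = 50) -> List[dict]:
    """Hypothetical tool handler: filter the database, return structured records."""
    matches = [
        r for r in DATABASE
        if r.country == country
        and min_headcount <= r.headcount <= max_headcount
        and contact_title.lower() in r.contact_title.lower()
    ]
    # The vendor, not the agent, decides which fields leave the database
    # and how many records are returned per call.
    return [asdict(r) for r in matches[:limit]]


results = search_companies("DE", 100, 500, "revenue")
```

Even in a toy version, the handler embodies choices about what is stored, what is matched, and what is returned. No person on the vendor's side sees `results` before the agent does.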

From a technical standpoint, this is just an API call. From a regulatory standpoint, several things have happened that are different in kind from the SaaS query patterns that came before.

The records were returned at machine speed, in volume, with no human in the loop on the vendor's side reviewing what was sent. The customer (the human user) sees only the agent's summary, not the per-record data flow. The agent operates with some autonomy — it may fan out additional queries based on what comes back, in ways that the user did not specifically authorize.

None of this is unique to MCP, exactly. It is the natural arrangement when an autonomous agent is the data consumer. But MCP makes the pattern legible enough that the accountability question can finally be asked clearly.

The question MCP forces

Under GDPR Article 4(7), a data controller is the entity that determines the purposes and means of processing personal data. The vendor running the MCP server made decisions about which records to put in the database, how to organize them, what to return when queried, and what to log. Those are controller decisions.

If the vendor's marketing claim is that they don't really hold data — that they are an orchestration layer, a processor for their customers, a facilitator — the MCP server is awkward to reconcile with that claim. You cannot serve structured data through a tool API without holding it, organizing it, and choosing what to return. The architecture forces the question.

Several specific accountability questions follow from this:

Who is the controller for the act of compiling the database? Almost certainly the vendor. The customer did not decide which professionals to include in the index. The vendor did. The vendor is therefore a controller for the database itself, regardless of how it positions itself for the query.

Who is the controller for the act of serving records to an agent? This is where it gets interesting. The vendor decided which records to return, on what basis, with what filtering, at what cadence. The customer decided which question to ask. Each has determined part of the purposes and means of the same processing, so both are controllers, jointly, for the resulting data flow. This is the joint controller arrangement contemplated by Article 26.

What does notification under Article 14 look like at agent scale? Article 14 requires that data subjects be informed when a controller processes personal data about them that was not obtained from them directly. The disproportionate-effort exception in 14(5)(b) is the workhorse provision for B2B data companies. But the exception assumes a reasonable proportionality between effort and benefit. Once an AI agent is fanning out thousands of queries on behalf of a single user, the calculation changes. We expect regulators to re-examine it.

What does a subject access request look like when the requester does not know which agent saw their data? Today, a person sends a SAR to a vendor and the vendor returns what they hold. When the data flow runs through an agent layer, the SAR has to extend to: which agents queried your record, on whose behalf, with what filtering, when. This is a different shape of operational obligation than the industry has built for.

What this means for vendors who claimed they don't hold data

For years, “we don't really have data, we just orchestrate” has been a useful position. It allows a vendor to disclaim controller status, push liability onto upstream sources whose contracts are unknown, and avoid the obligations that come with being a real data company.

The MCP era makes this position structurally untenable. A vendor running an MCP server is, by construction, holding data and serving it. The position survives only as long as nobody looks closely. Once a regulator or a serious customer looks closely, the disclaimer dissolves.

This is not a forecast. It is a description of where the law already is. Article 4(7) does not care what a company calls itself in its marketing. It looks at what the company actually does. A company that compiles a database and serves records from it is a controller for that database. The MCP server is just the API layer in front of the controller relationship; it does not change the relationship.

What is going to change, in our view, is whether anyone notices. The MCP wave is going to make the underlying accountability arrangement legible to regulators, customers, and journalists in a way it was not before. The vendors who were already operating as honest controllers will have nothing new to do. The vendors who were positioned as “we don't really have data” will have a problem that does not have a marketing solution.

What good looks like

A B2B data company that is comfortable being a controller in the MCP era can do a few specific things.

It can describe its role plainly: “We are the controller for the data in our database. When we serve a record through our API, including to an AI agent, we are jointly responsible with the customer for that processing event. We have a privacy policy that says so. We have a joint controller arrangement with our customers that says so.”

It can maintain a per-record audit log that records which queries returned which records, on behalf of which customer, with what filtering. This is what allows it to honour subject access requests in the agent era. It is also what allows it to defend itself when something goes wrong.
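As a rough sketch of what such a log could look like, the example below writes one entry per record served and lets the vendor look up every disclosure of a given record, which is the lookup a subject access request would need. The `AuditLog` class, its method names, and the `agent_id` field are assumptions made for illustration. A real system would use durable, append-only storage rather than a Python list.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass(frozen=True)
class AuditEntry:
    # One entry per record served, so lookups by data subject are direct.
    record_id: str
    customer_id: str
    agent_id: str            # hypothetical identifier for the calling agent
    filters: Dict[str, str]  # the filtering applied to the query
    served_at: datetime


class AuditLog:
    """Hypothetical in-memory audit log; stands in for durable storage."""

    def __init__(self) -> None:
        self._entries: List[AuditEntry] = []

    def record_query(self, customer_id: str, agent_id: str,
                     filters: Dict[str, str],
                     returned_record_ids: List[str],
                     served_at: Optional[datetime] = None) -> None:
        # Fan the query out into one entry per returned record.
        ts = served_at or datetime.now(timezone.utc)
        for record_id in returned_record_ids:
            self._entries.append(
                AuditEntry(record_id, customer_id, agent_id, dict(filters), ts)
            )

    def disclosures_for(self, record_id: str) -> List[AuditEntry]:
        # The lookup a subject access request needs: who received this
        # record, through which agent, with what filtering, and when.
        return [e for e in self._entries if e.record_id == record_id]


log = AuditLog()
log.record_query("cust-1", "agent-a", {"country": "DE"}, ["rec-1", "rec-2"])
log.record_query("cust-2", "agent-b", {"title": "revenue"}, ["rec-2"])
hits = log.disclosures_for("rec-2")
```

Keying the log by record rather than by query is the design choice that matters here. A query-level log can show what a customer asked, but it cannot cheaply show everyone who received one person's data.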

It can support removal at the controller level, not the customer level. When an individual asks to be removed, they are removed from the source database — which means they will not appear in any future agent query, regardless of which customer that agent is running on behalf of. The suppression is permanent and infrastructure-deep.
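A minimal sketch of controller-level suppression, assuming a hypothetical `SourceDatabase` class: removal deletes the record for every customer at once, and a suppression list stops the same record from returning on a later ingestion run. In practice the list would more plausibly hold a hashed identifier than a raw ID. That detail is an assumption and is left out here for brevity.

```python
from typing import Callable, Dict, List, Set


class SourceDatabase:
    """Hypothetical source database with controller-level suppression."""

    def __init__(self, records: List[dict]) -> None:
        self._records: Dict[str, dict] = {r["id"]: r for r in records}
        self._suppressed: Set[str] = set()

    def suppress(self, record_id: str) -> None:
        # Remove the record itself and remember that it was removed.
        self._suppressed.add(record_id)
        self._records.pop(record_id, None)

    def ingest(self, record: dict) -> bool:
        # Refuse to re-add a suppressed record on later ingestion runs.
        if record["id"] in self._suppressed:
            return False
        self._records[record["id"]] = record
        return True

    def query(self, customer_id: str,
              predicate: Callable[[dict], bool]) -> List[dict]:
        # customer_id plays no part in filtering: suppression is not
        # scoped per customer, so no customer can see a removed record.
        return [r for r in self._records.values() if predicate(r)]


db = SourceDatabase([
    {"id": "rec-1", "name": "A. Person", "country": "DE"},
    {"id": "rec-2", "name": "B. Person", "country": "DE"},
])
db.suppress("rec-1")
reingested = db.ingest({"id": "rec-1", "name": "A. Person", "country": "DE"})
visible_to_cust_1 = db.query("cust-1", lambda r: r["country"] == "DE")
visible_to_cust_2 = db.query("cust-2", lambda r: r["country"] == "DE")
```

The contrast is with customer-level removal, where a record is hidden from one account but remains in the index for every other customer's agents.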

None of this is exotic. It is the same posture a serious data company has always had. The MCP era simply raises the cost of pretending otherwise.

The broader pattern

What's happening with MCP is a small example of a larger pattern: as AI agents take on more of the actual data work, the architectural choices that vendors made when humans were the only consumers become visible in new ways. Positions that were sustainable when each interaction was reviewed by a person are not sustainable when each interaction is one of thousands, executed in seconds, with no human review.

This is going to put pressure on a lot of B2B infrastructure that was built for a slower, more forgiving environment. The first wave of that pressure is going to land on the companies that have been least honest about what they actually do. We expect the second wave will land on the companies that have been honest but underinvested in the audit and governance infrastructure required to operate at agent scale.

We're trying to be ready for both waves. The first is mostly a positioning problem. The second is mostly an engineering problem. Both have to be solved before AI agents become the dominant consumer of B2B data — which, on current trajectory, will happen in the next eighteen months.

Built for the agent era

Fullinfo is a controller, and we operate like one.

Source-linked records, per-record audit trail, permanent suppression. Built for the kind of accountability the MCP era is going to require.

This is the third piece in a Fullinfo blog series on what serious B2B data ownership looks like. Earlier pieces: What it means to actually own your data and The four questions every B2B data buyer should ask their vendor.
