Somewhere in your organization, someone is buying contact and company data. The data shows up in your CRM, your marketing automation, your sales enablement tools. People use it to send emails, make calls, build territory plans, screen prospects.
Almost nobody asks where it came from.
That's the story of the B2B data industry. For the better part of two decades, the question of where a contact record originated has been treated as a technical detail — something the vendor's engineers handle, something not worth raising in a procurement conversation. The data is the data. You buy it, you use it, you move on.
We think this era is ending. Three things are forcing the change at once: regulators are starting to take data sourcing seriously, AI agents are about to consume B2B data at a scale that makes accountability questions unavoidable, and buyers themselves are beginning to notice that the contact records they paid for are also, somehow, in their competitors' CRMs.
When you start pulling on the thread, you find that a lot of what gets sold as “owning” data turns out to be something much weaker — a license to query someone else's index, with no real visibility into where the underlying records came from or how they got there.
This piece is about what it means to actually own your data, and why the distinction is going to start mattering more than it did.
The four properties of real ownership
A data company that actually owns its data can answer four questions about every record it serves.
Where did this record come from?
Not in vague terms — specifically. A URL, a public filing, a structured source. Not “our proprietary research network.” Not “a combination of public and partner sources.” A real source you could go look at yourself.
When was it last seen there?
A timestamp. Data goes stale. A record that was accurate in 2022 is not, by default, accurate in 2026. The vendor either keeps that timestamp and refreshes the record, or they don't. If they don't, you're paying for the privilege of guessing.
Who decided this record exists in our database?
This is the controller question, and it's the one most vendors do not want asked. Under GDPR Article 4(7), a data controller is “the natural or legal person... which, alone or jointly with others, determines the purposes and means of the processing of personal data.” Translation: if you decided to collect this record and you decided how to organize it, you are the controller, regardless of whether you call yourself one.
What do we do when someone asks to be removed?
A real vendor can describe their process. A vague vendor cannot. The difference shows up the first time you forward a removal request and watch what comes back.
These four questions — provenance, recency, accountability, removal — are the structural test for whether a data company actually owns what it sells. Most don't pass. Most have never been asked.
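To make the test concrete, here is a minimal sketch of what a record that can answer all four questions might look like, and how a buyer could check one. The field names, the `ContactRecord` type, and the one-year freshness threshold are all assumptions made for illustration, not any vendor's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical record shape: field names are illustrative only.
@dataclass
class ContactRecord:
    record_id: str
    source_url: Optional[str]          # provenance: where the record was found
    last_seen_at: Optional[datetime]   # recency: when it was last observed there
    controller: Optional[str]          # accountability: who decided to hold it
    removal_contact: Optional[str]     # removal: where a removal request goes

# Assumed freshness threshold, chosen only for this example.
MAX_AGE = timedelta(days=365)

def unanswered_questions(record: ContactRecord, now: datetime) -> list[str]:
    """Return which of the four questions this record cannot answer."""
    gaps = []
    if not record.source_url:
        gaps.append("provenance")
    if record.last_seen_at is None or now - record.last_seen_at > MAX_AGE:
        gaps.append("recency")
    if not record.controller:
        gaps.append("accountability")
    if not record.removal_contact:
        gaps.append("removal")
    return gaps

# Example: a record with no source, a 2022 timestamp, and no named controller.
now = datetime(2026, 1, 1, tzinfo=timezone.utc)
stale = ContactRecord(
    record_id="r2",
    source_url=None,
    last_seen_at=datetime(2022, 3, 1, tzinfo=timezone.utc),
    controller=None,
    removal_contact=None,
)
print(unanswered_questions(stale, now))
# → ['provenance', 'recency', 'accountability', 'removal']
```

The point of the sketch is that each question maps to a field that is either populated or not. A vendor who cannot fill these in per record, rather than per dataset, cannot pass the test.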
The cost of not knowing
When provenance is opaque, several things stop working at once.
Compliance becomes guesswork. You cannot, with confidence, tell a regulator how a record entered your system if your vendor cannot tell you how it entered theirs. The “we relied on our supplier's representations” defense is increasingly thin. Recent enforcement actions across the EU have made clear that downstream users carry their own liability when they should have known their data sources were dubious.
Quality degrades invisibly. A record that was sourced from a website three years ago and never refreshed will eventually start producing bounces, wrong phone numbers, and outreach to people who left the company. The vendor's dashboard will still show “100% verified,” because the verification was done three years ago and never repeated. You only find out the data is stale when your reply rates collapse.
Differentiation disappears. When every vendor in a space pulls from substantially the same upstream sources, the data they sell starts to look identical. Your team is reaching out to the same prospects, with the same talking points, drawn from the same database, as everyone else in your category. You paid to access a shared pool. You did not buy an advantage.
Removal requests turn theatrical. Real removal means the record is deleted and it doesn't come back in the next data refresh. Theatrical removal means the vendor flips a flag on their copy of the record, while the upstream source they re-import from monthly continues to surface the same person, who gets re-added under a new ID. The person ends up filing the same removal request every quarter, with no permanent effect.
These costs accumulate quietly. They show up in your numbers months later, attributed to “list fatigue” or “deteriorating outbound performance” — when the real cause is that you've been paying for an asset that was never really yours.
What real ownership requires
A data company that actually owns its data has made some specific choices, and lives with their consequences.
It has chosen to collect data itself rather than buy it from undisclosed brokers. This is harder. It requires building and maintaining collection infrastructure. It rules out the easy path of reselling someone else's database under a different brand. The reward is that every record has a known origin.
It maintains source links on every record, not just at the dataset level. This means more storage, more engineering complexity, and more friction when the data is queried. The reward is that any record can be audited back to its source on demand — by you, by a regulator, by the person the record is about.
It treats itself as a controller for the database it builds, not a processor or an orchestrator. This is a legal posture as much as a technical one. The reward is that subject access requests get real answers, removal requests have real effect, and the company can stand behind what it sells.
It has a permanent suppression mechanism, not just a soft-delete. When someone asks to be removed, they are removed — and they stay removed across every future data refresh. The reward is that the company is not endlessly re-violating the same person's stated preference.
None of these choices are technically impossible. They are simply more expensive than the alternative. Companies that have made them have made them deliberately, knowing the cost.
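The difference between a soft-delete and permanent suppression is easiest to see in code. The sketch below assumes suppression is keyed on a hash of the person's normalized email rather than on an internal record ID, so a re-import under a new ID is still caught. The `SuppressionList` class, the `apply_refresh` function, and the record shape are hypothetical. A production system would likely use a keyed hash and more identifiers than email alone.

```python
import hashlib

def suppression_key(email: str) -> str:
    """Hash a normalized email so the list need not store the address itself."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

class SuppressionList:
    """Hypothetical permanent suppression list, independent of record IDs."""

    def __init__(self) -> None:
        self._keys: set[str] = set()

    def suppress(self, email: str) -> None:
        self._keys.add(suppression_key(email))

    def is_suppressed(self, email: str) -> bool:
        return suppression_key(email) in self._keys

def apply_refresh(incoming: list[dict], suppressed: SuppressionList) -> list[dict]:
    """Drop suppressed people from a fresh import, whatever ID they arrive under."""
    return [r for r in incoming if not suppressed.is_suppressed(r["email"])]

# Example: the person asked to be removed, then reappears upstream with a new ID.
suppressed = SuppressionList()
suppressed.suppress("Jane.Doe@Example.com ")

monthly_import = [
    {"id": "new-991", "email": "jane.doe@example.com"},
    {"id": "new-992", "email": "sam@example.com"},
]
kept = apply_refresh(monthly_import, suppressed)
print([r["id"] for r in kept])
# → ['new-992']
```

A soft-delete that only flags the old record ID would have let `new-991` back in. Checking every refresh against the suppression list is what makes the removal stick.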
What this means for buying B2B data in 2026
The next several years are going to be uncomfortable for vendors who took the opaque path. The pressure will come from three directions at once.
Regulators are getting more specific. Vague responses to subject access requests are becoming actionable rather than tolerated. The legal cover that vendors enjoyed when they could plead ambiguity about their data sourcing is thinning out. When a regulator asks where a record came from, the answer “we obtain data from a network of partners and public sources” is starting to receive the response it deserves.
AI agents are going to query B2B data at machine scale. When a single agent run can fan out into thousands of structured data requests, the question of who is the controller for those queries — and who is liable for what's returned — stops being a hypothetical. The vendors who have positioned themselves as “we don't really have data, we just orchestrate” are about to discover that this positioning does not survive contact with an actual API contract.
And buyers themselves are starting to notice. The first time a senior person in your organization receives a cold outreach to a phone number they only ever shared with two trusted contacts, the conversation about how that number ended up in someone's database becomes harder to avoid. We've spoken with several buyers this year who have started auditing their data vendors for exactly this reason. The audits have not gone well for the vendors.
The shift this implies is not dramatic. It is not a sudden collapse of legacy providers, nor a clean replacement by a new generation. It is gradual: a re-pricing of risk, a slow movement of serious buyers toward vendors who can answer the four questions, and a quiet exit of the vendors who cannot.
What we're trying to do
Fullinfo was built on the bet that this shift would happen, and that being on the right side of it from the start would matter more than being early to market. We collect data ourselves, from sources we can name. Every record carries its source link and a timestamp. We are the controller for the data in our database, and we say so in our privacy policy. Removal means removal — once you're suppressed, you stay suppressed.
None of this is a marketing position. It is the structural cost of doing the work the right way. Companies that don't pay this cost are cheaper to run, in the short term. They are also positioned to be on the wrong side of every trend that's coming.
The next few articles on this blog will explore specific dimensions of what we're describing here — the procurement questions buyers should be asking, the way AI agents are about to reshape accountability for data, what privacy-by-design actually looks like when applied to a B2B data company. If any of that is interesting to you, we'd love to have you read along.
Want to see what data with real provenance looks like?
Request early access. We'll show you how Fullinfo works — including the part where every record carries the source it came from.
Fullinfo is a company intelligence platform built from the open web. We index 100M+ organizations across 195+ countries, with source-linked, auditable records. Our privacy policy describes how we handle personal data.