Inside the Connect system that trawls bank records, online platforms — and even social media — to close the UK tax gap (FY 2024–25)

LONDON — HM Revenue & Customs (HMRC) has disclosed that its flagship data‑analytics platform, known as “Connect,” delivered an additional £4.6bn in tax in the 2024–25 financial year, a sharp step‑up on prior years and a sign that algorithmic detection is now a mainstay of UK tax enforcement. The figure — obtained via a Freedom of Information request and reported this week — represents roughly a 35% increase on the system’s average additional annual yield of about £3.4bn in earlier years, underscoring the scale at which data mining is reshaping compliance.
Connect’s core proposition is simple: by pooling vast volumes of information and linking it using graph analytics, HMRC can spot patterns of under‑reporting that would elude human investigators. Officials say the platform ingests data from banks and other financial institutions, online marketplaces, property listings, customs declarations, company registries, and international information‑exchange networks. It can also draw signals from openly available social‑media activity, such as conspicuous displays of wealth that do not match declared income, though HMRC stresses all use is subject to strict legal gateways and proportionality tests.
The latest uplift lands against a backdrop of a persistent UK “tax gap”, the difference between tax theoretically due and tax actually collected, which HMRC estimated at £46.8bn for 2023–24 (around 5.3% of liabilities). While the gap is among the lowest relative to GDP in advanced economies, the absolute sum remains politically potent: each additional billion recouped can underwrite visible public services or reduce borrowing. Analysts note that a £4.6bn incremental yield is material in that context, equivalent to a mid‑sized departmental budget line or a sizeable slice of personal‑tax allowance changes.
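For readers who want to check the headline arithmetic, the figures quoted above can be reproduced in a few lines. This is only a back-of-envelope sketch: the variable names are illustrative, the inputs are the rounded numbers reported in this article, and the comparison of the 2024–25 yield with the 2023–24 tax gap mixes two different years, so it indicates scale rather than giving an official ratio.

```python
# Rounded figures as quoted in the article (all in £bn).
connect_yield_bn = 4.6      # additional tax attributed to Connect, 2024-25
prior_avg_yield_bn = 3.4    # approximate average annual yield in earlier years
tax_gap_bn = 46.8           # HMRC tax gap estimate, 2023-24
tax_gap_share = 0.053       # tax gap as a share of theoretical liabilities

# Percentage increase of the 2024-25 yield over the earlier average.
uplift_pct = (connect_yield_bn - prior_avg_yield_bn) / prior_avg_yield_bn * 100

# Connect yield set against the tax gap (different years, so indicative only).
yield_vs_gap_pct = connect_yield_bn / tax_gap_bn * 100

# Total theoretical liabilities implied by the gap and its percentage share.
implied_liabilities_bn = tax_gap_bn / tax_gap_share

print(f"Uplift on earlier average: {uplift_pct:.1f}%")
print(f"Yield relative to tax gap: {yield_vs_gap_pct:.1f}%")
print(f"Implied total liabilities: about £{implied_liabilities_bn:.0f}bn")
```

On these inputs the uplift works out at about 35.3%, consistent with the "roughly 35%" reported, and the yield is a little under a tenth of the estimated gap.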
How Connect works
First rolled out in 2010 and expanded repeatedly since, Connect functions as a data lake coupled to a rules‑and‑models engine. At ingestion, multiple identifiers (National Insurance numbers, addresses, phone numbers, IP addresses, bank account details, and director or beneficial‑ownership records) are reconciled to construct entity graphs that map the real‑world relationships between people, companies and assets. From there, the system applies a toolkit of anomaly detectors: outlier detection on income flows versus lifestyle indicators; network‑based risk scoring to flag carousel‑type VAT frauds; and fuzzy matching to spot shell companies that are phoenixing after insolvency.
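HMRC has not published Connect's internals, so the following is only a toy illustration of the general idea of reconciling identifiers into entities. It links records that share any identifier using a union-find structure. The record layout, field names (`nino`, `phone`, `bank_account`), the sample data and the function `build_entities` are all assumptions made for this sketch. Real entity resolution would also need fuzzy matching and far more careful handling of shared or recycled identifiers.

```python
from collections import defaultdict

# Hypothetical records from different source systems (invented data).
RECORDS = [
    {"id": "self_assessment_1", "nino": "AA000001A", "phone": "0700 000001"},
    {"id": "marketplace_seller_7", "phone": "0700 000001", "bank_account": "ACC-1"},
    {"id": "company_director_3", "bank_account": "ACC-1"},
    {"id": "self_assessment_2", "nino": "AA000002B", "phone": "0700 000002"},
]

def build_entities(records, keys=("nino", "phone", "bank_account")):
    """Group record ids that share at least one identifier value."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        # Follow parent links to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Deterministic choice of root: the alphabetically smaller id.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    first_seen = {}  # (field, value) -> first record id carrying it
    for r in records:
        for key in keys:
            value = r.get(key)
            if value is None:
                continue
            token = (key, value)
            if token in first_seen:
                union(r["id"], first_seen[token])
            else:
                first_seen[token] = r["id"]

    groups = defaultdict(list)
    for r in records:
        groups[find(r["id"])].append(r["id"])
    return sorted(sorted(group) for group in groups.values())

entities = build_entities(RECORDS)
for entity in entities:
    print(entity)
```

In this invented sample, the first three records collapse into a single entity because a phone number links the tax return to the marketplace seller, and a bank account links the seller to the director record. The fourth record shares nothing and stays on its own.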
Compliance personnel describe three especially productive channels in 2024–25:
Platform economy signals. Data from online marketplaces and gig‑work platforms (now more systematically reported) enable cross‑checks between gross receipts and self‑assessment figures. Unexplained gaps trigger “nudge letters” or, in higher‑risk cases, formal enquiries.
Property and offshore transparency. Overseas property holdings, land registry records, and beneficial‑ownership disclosures provide a clearer view of rental and capital‑gains liabilities. Automatic exchange of information under international standards, including the Common Reporting Standard, has broadened visibility of overseas accounts.
Wealth‑lifestyle mismatches. Open‑source material (including public posts on social media), insurance and vehicle registries, and card‑spend aggregates can, in combination, highlight lifestyles inconsistent with declared income. Where indicators reach a set threshold, HMRC can escalate to statutory information powers.
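The platform cross-check described in the first of these channels can be pictured as a simple triage rule. The sketch below is purely illustrative: the function name `triage`, the outcome labels and above all the 10% and 50% thresholds are assumptions invented for the example, not HMRC parameters, which are not public.

```python
def triage(platform_receipts: float, declared_turnover: float,
           nudge_ratio: float = 0.10, enquiry_ratio: float = 0.50) -> str:
    """Compare platform-reported gross receipts with the declared figure.

    The thresholds are hypothetical: a shortfall of at least `nudge_ratio`
    of platform receipts suggests a nudge letter, and a shortfall of at
    least `enquiry_ratio` suggests a formal enquiry.
    """
    if platform_receipts <= 0:
        return "no_action"
    # Only under-declaration counts; declaring more than reported is ignored.
    shortfall = max(platform_receipts - declared_turnover, 0.0)
    ratio = shortfall / platform_receipts
    if ratio >= enquiry_ratio:
        return "formal_enquiry"
    if ratio >= nudge_ratio:
        return "nudge_letter"
    return "no_action"

# Invented cases: each seller has £20,000 of platform-reported receipts.
for declared in (19_500, 15_000, 5_000):
    print(declared, triage(20_000, declared))
```

A real system would presumably allow for legitimate reasons why gross receipts and declared figures differ, such as refunds, platform fees or timing differences, before any letter is sent. That human and contextual review is deliberately left out of this sketch.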
A stronger stick — and sometimes a carrot
Connect’s output does not automatically mean a penalty or prosecution. Much of the £4.6bn came from prompted disclosures and negotiated settlements after taxpayers were contacted with evidence of discrepancies. HMRC continues to use “nudge” campaigns to encourage voluntary corrections, reserving civil penalties or criminal action for egregious or deliberate cases. Officials say this blended approach maximises yield while keeping litigation costs contained.
Specialist units have amplified the data engine’s impact. The Wealthy and Mid‑Sized Business Compliance directorate has focused on complex arrangements among high‑income individuals and closely held companies, while taskforces targeting offshore evasion and labour‑market abuses have used Connect to prioritise casework. In parallel, better data about non‑resident online sellers has lifted VAT receipts, according to practitioners — a long‑running weak point that has improved as platform reporting has tightened.
Privacy, proportionality and accuracy
The potency of the system has revived long‑standing questions about data rights. HMRC insists its use of personal data is bounded by statute, with oversight from the Information Commissioner and internal governance. Still, civil‑liberties advocates argue that combining multiple datasets, especially where open‑source social‑media signals are used, can produce inferences that feel intrusive even if lawful. Technologists counter that modern compliance cannot function without such joins, and that the alternative is higher headline tax rates to make up for leakage.
Accuracy is another flashpoint. Parliament’s Public Accounts Committee has previously criticised HMRC for lacking a comprehensive picture of personal wealth and for underestimating amounts lost to sophisticated evasion. Data scientists say no model is perfect: false positives are weeded out by human review and, where needed, by statutory information requests. False negatives — missed cases — remain endemic in any system, which is why the agency continually retunes its models and invests in new data feeds.
What £4.6bn means for policy
For the Treasury, the 2024–25 uplift will be welcomed as a non‑tax‑raising route to revenues. Yet experts warn against treating compliance yield as a piggy bank for unfunded promises. Yields can be lumpy (a single large settlement can swing a year), and staffing, legal capacity and IT resilience are all constraining factors. To lock in gains, HMRC will have to retain specialist investigators in a hot labour market for data skills, keep pace with crypto‑assets and new payment rails, and embed near‑real‑time reporting in high‑risk sectors.
Business groups, meanwhile, see a double‑edged sword. On one hand, robust enforcement levels the playing field for compliant firms and could allow targeted simplification if evasion falls. On the other, small businesses worry about administrative drag from increasingly granular information demands. The direction of travel is clear: more pre‑population of returns from third‑party data, more real‑time prompts within bookkeeping software, and fewer excuses for “honest mistakes.”
The road ahead
With Connect now a proven workhorse, the next frontier is quality as much as quantity: joining still‑siloed datasets, improving entity resolution across aliases and cross‑border holdings, and applying causal analytics to prioritise interventions that actually change behaviour rather than simply raise short‑term cash. If HMRC can sustain that shift, the 2024–25 experience may be remembered not just for a big number, but for a step change in how tax systems and taxpayers alike live inside a data‑driven state.




