
Your round-up of the latest, greatest data stories

The Week in Data

Hello ODI Supporter,

 

An investigation by The Guardian published last weekend has revealed that confidential health data from UK BioBank has been exposed online dozens of times. UK BioBank holds medical records of 500,000 British volunteers, including genome sequences, scans, blood samples and lifestyle information. It has been credited with making breakthroughs in cancer, dementia, and diabetes research. Scientists at private companies and universities from all around the globe can apply for access to the data, and until late 2024, they were free to download that data directly to their own systems. However, it seems that researchers have inadvertently published partial and complete BioBank datasets to GitHub when sharing the code they used to analyse the data, as journals and funders increasingly ask them to publish that code. Between July and December 2025, 80 legal notices were issued to GitHub to remove code; these were complied with, but data remains visible. The Guardian approached BioBank volunteers and was able to re-identify one person through procedure details and dates; according to BioBank, this did not highlight a privacy risk because additional information was required. However, experts are unsure if BioBank will be able to fully regain control of all of the data that has been published online.

 

Companies House have said a glitch in their online filing service, caused by a system update, may have let people view and edit data for other businesses and their directors for up to five months. The update took place in October, but Companies House were only alerted to the bug on Friday last week, when they suspended the service. The accessible data included dates of birth, residential addresses and company email addresses, and it may also have been possible to make unauthorised filings on other companies’ records. Passwords were not visible. The service was back up and running on Monday, and Companies House launched an internal investigation and also reported the incident to the Information Commissioner’s Office and the National Cyber Security Centre.

 

FBI Director Kash Patel admitted this week that the agency has started buying location data of US citizens from data brokers. The admission came under oath at a Senate intelligence committee hearing. It’s the first time since 2023 that the agency has confirmed it buys access to people’s data from data brokers, with much of the information coming from phone apps and games. Government agencies usually have to ask a judge to authorise a search warrant, based on evidence of a crime, before they can demand private data from a phone or tech company. More recently, US agencies have circumvented this process by purchasing commercially available data, something House Republican Warren Davidson said was “a clear violation of the fourth amendment and is why I introduced the Government Surveillance Reform Act.”

 

The Solid Symposium returns for its fourth edition on Thursday 30 April and Friday 1 May, at City St George’s University (Clerkenwell Campus), London EC1V 0HB. Solid lets people and businesses take control of their data and combine it to achieve new results. The symposium brings together people from science, business, the public sector and academia to discuss and learn about the latest developments in Solid and to look to the next chapter for data on the web. We have a fantastic selection of speakers, including Sir Tim Berners-Lee, and we’ll cover a wide range of Solid-related topics, from innovating business models to achieving adoption at scale. Visit the Solid Symposium website for more information and tickets.

 

There are still tickets left for the next edition of Solid World, which will look at modelling, analysing and sharing research data, on Monday 23 March, 16:00-17:00 GMT. On Thursday 26 March, 16:00-17:00 GMT we’ll be looking at data portals and what their designers and maintainers need to consider as the global digital ecosystem shifts. And on Monday 30 March, 12:00-13:00 BST, we have the next Data Ethics Professionals webinar, in which we’ll look at key learnings for organisations on embedding data ethics. Tickets are available now. 

 

And finally… a digital twin of the Tees Valley in the north of England has led to the reduction of traffic delays by almost 14% over the last six months. The pilot project created a digital replica of the local road network, with data analysed with AI to predict where problems will arise. In the next phase of the Tees Valley Traffic Digital Twin Project, freight, active travel and environmental information will be added to the routes and data.

 

 

Until next time. 

 

David and Jo

 

PS: Our friends at The Alan Turing Institute have two lectures coming up that we thought you might be interested in: “Frontier AI under pressure - building resilience across layers” and “Making AI (truly) sustainable - from environmental costs to social impacts”.

Follow us on Bluesky

From the outside world

Confidential health records from UK BioBank project exposed online

The Guardian

Exclusive: Guardian investigation finds data from flagship medical research leaked dozens of times.

 

De-identified UK Biobank health data accidentally published online

Digital Health

UK Biobank has confirmed that volunteers’ de-identified health data has sometimes been unintentionally published online by researchers.

 

What you need to know after millions of UK firms’ data shared in major glitch

The Independent

Experts have called on Companies House to be more transparent about the glitch.

 

Companies House chief apologises over data breach

Civil Service World

Technical error meant logged-in users could view and alter some elements of another company’s details without their consent.

 

Kash Patel admits under oath FBI is buying location data on Americans

The Guardian

Admission came during questioning at Senate intelligence committee worldwide threats hearing.

 

FBI is buying location data to track US citizens, director confirms

TechCrunch

The FBI has resumed purchasing reams of Americans’ data and location histories to aid federal investigations, the agency’s director, Kash Patel, testified to lawmakers on Wednesday.

 

The Solid Symposium

The next chapter for data on the web.

 

AI 'traffic twin' helping to reduce delays

BBC

Transport bosses say a traffic system powered by artificial intelligence (AI) has reduced delays and sped up bus journeys.

 

From the ODI

Head of Research

Reporting to the Director of Research, the Head of Research is responsible for scoping, selling and delivering ODI’s research to support the creation of an open, trustworthy data ecosystem. 

 

Solid World March 2026

Free webinar, Monday 23 March, 4-5pm GMT book here

Modelling, analysing, and sharing research data.

 

Data Centric AI #13: Data Portals of the Future

Free webinar, Thursday 26 March, 2026, 4-5pm GMT book here 

Join our expert panel as they discuss what the designers and maintainers of data portals will need to consider as the global digital ecosystem shifts.

 

Data Ethics Professional #12: Key learnings for embedding data ethics

Free webinar, Monday 30 March 2026, 12-1pm BST book here

Top tips for your organisation's data ethics journey.

 

Data Ethics Professionals #13: How Police Scotland built an ethical data culture

Free webinar, Wednesday 29 April, 12-1pm BST book here

Join Corinne Russell, Data Ethics Lead, Police Scotland for a session looking at how Police Scotland have embedded data ethics in their use of technology.

 

Strategic Data Skills 

Course, Tuesdays from 31 March, 6 weeks, book now

Empower your decision-making with practical data skills and AI-assisted learning—no coding required.

The Week in Data

The Week in Data is our weekly round up of the latest news in data. If you haven't already, you can subscribe here. 

The Open Data Institute, 4th Floor, Kings Place, 90 York Way, London, N1 9AG