My Lords, anonymisation of data is crucially important in this debate. I want to see, through the Bill, a requirement for personal data, particularly medical data, to be held within trusted research environments. This is a well-developed technique and Britain is the leader. It should be a legal requirement. I am not quite sure that we have got that far in the Bill; maybe we will need to return to the issue on Report.
The extent to which pseudonymisation—I cannot say it—is possible is vastly overrated. There is a sport among data scientists of spotting people within generally available datasets. For example, the data available to TfL through people’s use of Oyster cards and so on tells you an immense amount about individuals. Medical data is particularly susceptible to this, although the problem is not restricted to medical data. I will cite a simple example from publicly available data.
4.30 pm
Let us say that you know that someone, with a date of birth of 6 May 1953, had two minor cardiac operations, once on 19 October 2003 and again on 24 September 2004. With that information, you would know virtually everything there is to know about Tony Blair. It is that easy to get hold of information. Of course, you will not have a medical dataset without a date of birth or pre-existing conditions. The whole idea that you can easily anonymise data is a blind alley.
Ultimately, information should be retained within a locked box, where it stays. The medical researchers, who are crucial, develop their programme in a sandbox, and that programme is then applied to the locked-away data. The researchers would just get the results; they would not go anywhere near the data. The outcome of the research is identical but people’s medical information—their genetic information—would be kept away, secure. We have to work to that objective. I do not quite know yet whether the Bill gets that far, but it is crucial.
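As an illustration of the locked-box arrangement described above, the following is a minimal sketch in Python, not a description of any real trusted research environment. Everything in it is an assumption made for the example: the `LockedBox` class, the `mean_age_with_procedure` analysis, the minimum group size of 3, and the records themselves, which are entirely synthetic. A real environment would enforce the separation through infrastructure and human output checking rather than through a Python class.

```python
from statistics import mean
from typing import Callable, Dict, List

Record = Dict[str, object]

# Synthetic stand-in for the locked-away data (invented values, no real people).
_LOCKED_RECORDS: List[Record] = [
    {"age": 50, "cardiac_procedure": True},
    {"age": 60, "cardiac_procedure": True},
    {"age": 70, "cardiac_procedure": True},
    {"age": 80, "cardiac_procedure": True},
    {"age": 35, "cardiac_procedure": False},
]

# Synthetic sandbox data with the same shape, for developing the programme.
SANDBOX_RECORDS: List[Record] = [
    {"age": 40, "cardiac_procedure": True},
    {"age": 55, "cardiac_procedure": False},
]


class LockedBox:
    """Hypothetical holder of sensitive records that releases only summaries."""

    def __init__(self, records: List[Record], min_group_size: int = 3) -> None:
        self.__records = [dict(r) for r in records]
        self._min_group_size = min_group_size

    def run(self, analysis: Callable[[List[Record]], Dict[str, float]]) -> Dict[str, float]:
        # The researcher's programme is applied to the data inside the box.
        result = analysis([dict(r) for r in self.__records])
        if not isinstance(result, dict) or "n" not in result:
            raise ValueError("analysis must return a dict with a group size 'n'")
        # Release numeric summaries only, never rows or other structures.
        for key, value in result.items():
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise ValueError(f"output {key!r} is not a numeric summary")
        # Assumed disclosure rule: refuse results based on very small groups.
        if result["n"] < self._min_group_size:
            raise ValueError("group too small to release")
        return dict(result)


def mean_age_with_procedure(records: List[Record]) -> Dict[str, float]:
    """Example programme: count and mean age of patients with a procedure."""
    ages = [r["age"] for r in records if r["cardiac_procedure"]]
    return {"n": len(ages), "mean_age": mean(ages)}


# Developed against the sandbox, then submitted to the locked box.
sandbox_result = mean_age_with_procedure(SANDBOX_RECORDS)
box = LockedBox(_LOCKED_RECORDS)
released = box.run(mean_age_with_procedure)
```

In this sketch the researcher only ever handles `SANDBOX_RECORDS` and the `released` summary. A programme that tried to return the rows themselves, or a result drawn from fewer than three people, would be refused by `run`.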