Pseudonymisation is a process that replaces personal information in data with artificial identifiers, or ‘pseudonyms’, so that individuals cannot be directly identified. Examples include replacing an NHS number with a random number, a name with a code, or an address with a location code.
Pseudonyms should not contain any information that could identify the individual to whom they relate (for example, they should not be made up of characters from the date of birth). Using a single pseudonym for each replaced field or collection of replaced fields makes the data record less identifiable while keeping it suitable for data analysis and data processing.
Pseudonymised data can be restored to its original state with the addition of information which allows individuals to be re-identified. In contrast, anonymisation is intended to prevent re-identification of individuals within the dataset.
Why this is important
Pseudonymisation by Data Services for Commissioners Regional Offices (DSCROs) is a crucial step in protecting patient privacy while allowing useful data analysis. DSCROs apply pseudonymisation to patient-level data before it is sent to NHS Mid and South Essex, replacing personal details with pseudonyms so that individual identities are protected. Only DSCROs hold the key to reverse the pseudonymisation, meaning neither NHS Mid and South Essex nor its partner organisations can re-identify individuals from this data.
Because pseudonyms are kept consistent across datasets and over time, patient information can be linked and analysed without compromising privacy. This enables the NHS to trace patient journeys across different services and providers, giving a better understanding of how different parts of the healthcare system interact and affect overall patient care.
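As a rough illustration of why consistency matters, the Python sketch below groups records from two already-pseudonymised extracts by their shared pseudonym. The records, the field names and the `build_journeys` helper are all hypothetical and invented for this example; they do not describe any real NHS dataset or system.

```python
from collections import defaultdict

# Made-up, already-pseudonymised extracts. No real patient data is shown.
gp_appointments = [
    {"pseudonym": "5L7 TWX 619Z", "date": "2024-01-08", "event": "GP appointment"},
    {"pseudonym": "8K2 RMD 304Q", "date": "2024-01-09", "event": "GP appointment"},
]
hospital_admissions = [
    {"pseudonym": "5L7 TWX 619Z", "date": "2024-01-15", "event": "Hospital admission"},
]


def build_journeys(*datasets):
    """Group events from several datasets by pseudonym, in date order."""
    journeys = defaultdict(list)
    for dataset in datasets:
        for record in dataset:
            journeys[record["pseudonym"]].append((record["date"], record["event"]))
    # ISO-formatted dates (YYYY-MM-DD) sort correctly as plain strings.
    return {p: sorted(events) for p, events in journeys.items()}


journeys = build_journeys(gp_appointments, hospital_admissions)
for pseudonym, events in journeys.items():
    print(pseudonym, events)
```

Because the same pseudonym appears in both extracts, the two events can be placed on one timeline without a name or NHS number ever being present in the data.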
How it’s done
To pseudonymise data effectively, the following actions must be taken:
- Applying an algorithm: An algorithm is applied to specific data fields within the patient record, such as the NHS number, to create a pseudonymised identifier for use in secondary reports.
- Unique pseudonyms for each field: Each field of personal confidential data receives a unique pseudonym to maintain privacy while allowing data linkage.
- Consistent formatting: Pseudonyms are formatted to match the length and structure of the original NHS number or other field, ensuring readability in reports. For instance, an NHS number pseudonym might look like “5L7 TWX 619Z”, which includes letters to distinguish it from a real NHS number.
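The steps above can be sketched in Python. This is a minimal illustration only, not the algorithm DSCROs actually use, which is not described here. It assumes a keyed hash (HMAC-SHA256) as a stand-in technique, and the function name, alphabet and secret key are hypothetical. A keyed hash cannot be reversed on its own, so in this sketch re-identification would depend on the key holder keeping a separate lookup table, which is also an assumption.

```python
import hashlib
import hmac

# Hypothetical alphabet: digits plus upper-case letters, leaving out the
# look-alike letters I and O. The first 10 characters are the digits.
ALPHABET = "0123456789ABCDEFGHJKLMNPQRSTUVWXYZ"


def pseudonymise_nhs_number(nhs_number: str, secret_key: bytes) -> str:
    """Return a 10-character pseudonym laid out like an NHS number (3-3-4)."""
    # Ignore spaces so "943 476 5919" and "9434765919" give the same result.
    normalised = nhs_number.replace(" ", "")
    digest = hmac.new(secret_key, normalised.encode("utf-8"), hashlib.sha256).digest()

    # Map the first 10 digest bytes onto the alphabet (slight modulo bias,
    # which is acceptable for an illustration).
    chars = [ALPHABET[b % len(ALPHABET)] for b in digest[:10]]

    # Force at least one letter so the result cannot pass for a real NHS number.
    if all(c.isdigit() for c in chars):
        letters = ALPHABET[10:]
        chars[-1] = letters[digest[10] % len(letters)]

    code = "".join(chars)
    return f"{code[:3]} {code[3:6]} {code[6:]}"


# The input is an illustrative number, not a real patient's NHS number.
example = pseudonymise_nhs_number("943 476 5919", b"hypothetical-secret-key")
print(example)
```

The same input and key always produce the same pseudonym, which is what allows linkage across datasets, while a different key produces an unrelated pseudonym. Nothing from the date of birth or any other personal field is used to build the output.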