Identifiability, Anonymisation and Pseudonymisation
A person is identifiable when, though not yet identified, it is possible to identify them - directly (by name) or indirectly (by an identifier, or by combining data, i.e. jigsaw identification). Recital 26 sets the threshold: account is taken of all the means reasonably likely to be used (cost, time, available technology), so a merely hypothetical possibility is not enough. In Breyer, the CJEU held that dynamic IP addresses can be personal data where a third party (an ISP) holds data that could be combined with them to identify the user. Anonymised data fall outside the GDPR; pseudonymised data do not - they are still personal data.
Identification can be direct (a name) or indirect (an ID number, IP address, or piecing together separate data - jigsaw identification). The big-data era makes jigsaw identification easier and identifiability a growing challenge.
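
To make jigsaw identification concrete, here is a minimal Python sketch. The field names (`postcode`, `birth_year`, `gender`), the sample records and the helper `unique_combinations` are all hypothetical, chosen only for illustration. It counts how many records are singled out once several individually harmless attributes are combined.

```python
from collections import Counter

# Hypothetical records with no direct identifiers (no name, no ID number).
records = [
    {"postcode": "1011", "birth_year": 1980, "gender": "F"},
    {"postcode": "1011", "birth_year": 1980, "gender": "M"},
    {"postcode": "1011", "birth_year": 1975, "gender": "F"},
    {"postcode": "2022", "birth_year": 1975, "gender": "M"},
    {"postcode": "2022", "birth_year": 1980, "gender": "F"},
    {"postcode": "2022", "birth_year": 1975, "gender": "M"},
]


def unique_combinations(rows, fields):
    """Return the rows whose combination of values for `fields` occurs only once."""
    def combo(row):
        return tuple(row[f] for f in fields)

    counts = Counter(combo(row) for row in rows)
    return [row for row in rows if counts[combo(row)] == 1]


# Postcode alone singles out nobody in this sample.
alone = unique_combinations(records, ["postcode"])

# Combining three attributes singles out most of the sample.
combined = unique_combinations(records, ["postcode", "birth_year", "gender"])

print(len(alone), len(combined))
```

A record that is unique on the combined attributes is not yet identified, but anyone holding a second dataset with the same attributes plus a name could link the two. Whether that linkage is reasonably likely is the legal question; the count only shows where the risk sits.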
To decide if a person is identifiable, take account of all the means reasonably likely to be used - including the cost, time and available technology for identification. A purely hypothetical chance of identification is not enough; there must be a reasonable likelihood.
In Patrick Breyer v. Germany the CJEU ruled that dynamic IP addresses can be personal data: a website operator could indirectly identify a user by combining the IP address with data held by the user's ISP, where a legal route to obtain that data exists (e.g. after a cyberattack). With CCTV, WP29 insists footage is personal data because the very purpose is to single out and identify individuals - to claim otherwise would be 'a sheer contradiction in terms'.
| | Personal data | Pseudonymised data | Anonymised data |
|---|---|---|---|
| Can the person be identified? | Yes (directly or indirectly) | Yes, but only with separately-held additional info (the 'key') | No - not, or no longer, identifiable |
| Within the GDPR? | Yes | Yes - still personal data | No - outside the GDPR |
| Typical technique | Raw records with identifiers | Replace identifiers (name, email) with a reference number; key kept separately and secured | Irreversible removal of identifiers; aggregation with a large enough sample |
| Purpose / benefit | Normal processing | Safeguard for data minimisation; helps assess compatibility of a new purpose | Removes data from GDPR scope entirely |
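
The pseudonymisation technique in the table (replace identifiers with a reference number, keep the key separately) can be sketched roughly as follows. This is an illustrative sketch only: the function names `pseudonymise` and `reidentify`, the field names and the choice of a random token as the reference number are assumptions, not a prescribed method.

```python
import secrets


def pseudonymise(records, identifier_fields=("name", "email")):
    """Split records into pseudonymised rows and a key table.

    The key table maps each reference number back to the removed identifiers.
    It must be stored separately from the rows and secured.
    """
    key_table = {}
    rows = []
    for record in records:
        # Random reference, not derived from the data, so it cannot be recomputed.
        ref = secrets.token_hex(8)
        key_table[ref] = {f: record[f] for f in identifier_fields if f in record}
        row = {k: v for k, v in record.items() if k not in identifier_fields}
        row["ref"] = ref
        rows.append(row)
    return rows, key_table


def reidentify(row, key_table):
    """Re-attach the identifiers; only possible for someone holding the key."""
    return {**row, **key_table[row["ref"]]}


# Hypothetical sample record.
source = [{"name": "Alice Example", "email": "alice@example.org", "diagnosis": "flu"}]
rows, key_table = pseudonymise(source)
restored = reidentify(rows[0], key_table)
```

Because `reidentify` works for anyone who holds `key_table`, the rows remain personal data. Destroying the key would not by itself make them anonymous either: the remaining attributes may still allow indirect identification.
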
Terms like 'de-identified', 'indirectly identifiable' and 'pseudo-anonymised' are not defined in the GDPR and usually mean only that direct identifiers were removed - i.e. you still have pseudonymised personal data. 'PII' is also not a GDPR term and cannot be assumed to mean the same as personal data (U.S. sites often exclude IP addresses from PII).