DIMP (De-Identification-Minimisation-Pseudonymisation)

DIMP is the act of applying the following three steps to the data of a data use project in order to preserve patient privacy:

  • De-identifying: Aggregating or transforming data to prevent re-identification (e.g. truncating the birth date to the month, shortening the ZIP code from 5 to 2 characters)
  • Minimising: Removing any data from a data set that is not necessary for a specific data use project (e.g. a study that requires diagnosis codes does not need the free-text annotation of the diagnosis)
  • Pseudonymising: Replacing identifiers or IDs with pseudonyms or hashed IDs to avoid direct re-identification (e.g. Patient-ID-123 -> Patient_PSEUDONYM-999)
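
The three DIMP steps can be illustrated with a small Python sketch on a flat record. This is only an illustration of the idea, not how the fhir-pseudonymizer works internally: the record layout, the field names, the secret key and the use of HMAC-SHA-256 for the hashed ID are all assumptions made for this example.

```python
import hashlib
import hmac

# Hypothetical flat record; the field names are illustrative only.
record = {
    "patient_id": "Patient-ID-123",
    "birth_date": "1984-07-19",
    "zip": "91054",
    "diagnosis_code": "E11.9",
    "diagnosis_note": "free-text remark written by the clinician",
}

# Assumption: a project-specific secret used to key the hash.
SECRET_KEY = b"replace-with-a-project-specific-secret"


def de_identify(rec):
    """Coarsen quasi-identifiers: birth date to YYYY-MM, ZIP to 2 characters."""
    out = dict(rec)
    out["birth_date"] = rec["birth_date"][:7]
    out["zip"] = rec["zip"][:2]
    return out


def minimise(rec, needed_fields):
    """Keep only the fields the data use project actually needs."""
    return {key: value for key, value in rec.items() if key in needed_fields}


def pseudonymise(rec):
    """Replace the patient ID with a keyed hash, truncated to 32 hex characters."""
    out = dict(rec)
    if "patient_id" in out:
        digest = hmac.new(
            SECRET_KEY, out["patient_id"].encode("utf-8"), hashlib.sha256
        ).hexdigest()
        out["patient_id"] = digest[:32]
    return out


def dimp(rec, needed_fields):
    """Apply de-identification, minimisation and pseudonymisation in sequence."""
    return pseudonymise(minimise(de_identify(rec), needed_fields))


result = dimp(record, {"patient_id", "birth_date", "zip", "diagnosis_code"})
```

In this sketch the free-text note is dropped by the minimisation step because the hypothetical project only asked for the diagnosis code, which mirrors the example given in the list above.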

FHIR Pseudonymizer and DIMP DUP Base YAML

To support standardized data use projects (DUPs), a DIMP DUP base configuration has been created, which can be used in conjunction with the fhir-pseudonymizer to apply DIMP functions to data. It implements the DIMP pseudonymization functions required by most data use projects for the fields defined in the MII core dataset.

This configuration is provided as a guideline only and does not guarantee compliance with applicable data privacy regulations.

Depending on your specific setup or the characteristics of your data, this base configuration will likely need to be extended or adjusted to meet the requirements of your particular project.

DSC Concept: Technical ID
  • FHIR Resource: All
  • FHIR Element: .id
  • Privacy Requirement: Crypto hash
  • Description: Technical resource ID, generated and assigned by the FHIR server. Not meaningful outside the system.
  • DIMP Implementation: Replace with CryptoHash
  • DUP Base YAML:
      - path: Resource.id
        method: cryptoHash
        truncateToMaxLength: 32

DSC Concept: Technical References
  • FHIR Resource: All
  • FHIR Element: .reference
  • Privacy Requirement: Crypto hash
  • Description: Technical reference IDs linking resources to one another.
  • DIMP Implementation: Replace with CryptoHash
  • DUP Base YAML:
      - path: nodesByType('Reference').reference
        method: cryptoHash
        truncateToMaxLength: 32

DSC Concept: Reference Identifier
  • FHIR Resource: All
  • FHIR Element: Reference.identifier
  • Privacy Requirement: Redact unless otherwise specified; see Encounter and Patient identifier rules
  • Description: Logical identifier embedded in a reference. Redacted by default; specific identifier types are handled by more targeted rules below.
  • DIMP Implementation: Redact
  • DUP Base YAML:
      - path: nodesByType('Reference').identifier
        method: redact

DSC Concept: Encounter Identifier
  • FHIR Resource: All
  • FHIR Element: Encounter.identifier
  • Privacy Requirement: IDAT – do not export
  • Description: Logical encounter identifier, potentially a direct reference to the hospital's internal encounter ID (e.g. VN).
  • DIMP Implementation: Replace via re-pseudonymization using pseudonymization software
  • DUP Base YAML:
      - path: nodesByType('Identifier').where(type.coding.where(system='http://terminology.hl7.org/CodeSystem/v2-0203' and code='VN').exists()).value
        method: pseudonymize
        domain: https://my-dic-domain/identifiers/encounter-id

DSC Concept: Patient Identifier
  • FHIR Resource: All
  • FHIR Element: Patient.identifier
  • Privacy Requirement: IDAT – do not export
  • Description: Logical patient identifier, potentially a direct reference to the hospital's internal patient ID (e.g. MR).
  • DIMP Implementation: Replace via re-pseudonymization using pseudonymization software
  • DUP Base YAML:
      - path: nodesByType('Identifier').where(type.coding.where(system='http://terminology.hl7.org/CodeSystem/v2-0203' and code='MR').exists()).value
        method: pseudonymize
        domain: https://my-dic-domain/identifiers/patient-id

DSC Concept: Name
  • FHIR Resource: Patient
  • FHIR Element: Patient.name
  • Privacy Requirement: IDAT – do not export
  • Description: Patient name; multiple HumanName elements may be present (e.g. official, maiden, nickname).
  • DIMP Implementation: Redact all HumanName nodes
  • DUP Base YAML:
      - path: nodesByType('HumanName')
        method: redact

DSC Concept: Sex
  • FHIR Resource: Patient
  • FHIR Element: Patient.gender
  • Privacy Requirement: IDAT and MDAT – export permitted
  • Description: Administrative gender per the FHIR required value set (male, female, other, unknown).
  • DIMP Implementation: (none listed)
  • DUP Base YAML: (none listed)

DSC Concept: Date of Birth
  • FHIR Resource: Patient
  • FHIR Element: Patient.birthDate
  • Privacy Requirement: IDAT and MDAT – generalize to at least month precision
  • Description: Full date of birth of the patient. Must be generalized before export.
  • DIMP Implementation: Generalize to year-month (YYYY-MM)
  • DUP Base YAML:
      - path: Patient.birthDate
        method: generalize
        cases:
          "$this": "$this.toString().replaceMatches('(?<year>\\d{2,4})-(?<month>\\d{2})-(?<day>\\d{2})\\b', '${year}-${month}')"

DSC Concept: Deceased (flag)
  • FHIR Resource: Patient
  • FHIR Element: Patient.deceased.ofType(boolean)
  • Privacy Requirement: IDAT – removal recommended per DSC; subject to further discussion
  • Description: Boolean flag indicating whether the patient is deceased (true/false).
  • DIMP Implementation: Keep as-is
  • DUP Base YAML:
      - path: Patient.deceased.ofType(boolean)
        method: keep

DSC Concept: Deceased (date)
  • FHIR Resource: Patient
  • FHIR Element: Patient.deceased.ofType(dateTime)
  • Privacy Requirement: IDAT – removal recommended per DSC; subject to further discussion
  • Description: Date and time of death. Could potentially be generalized to month precision, analogous to date of birth (open for discussion). Redacted for now.
  • DIMP Implementation: Redact
  • DUP Base YAML:
      - path: Patient.deceased.ofType(dateTime)
        method: redact

DSC Concept: Address
  • FHIR Resource: Patient
  • FHIR Element: Patient.address
  • Privacy Requirement: IDAT – remove
  • Description: Full address information in any form (home, work, temp, etc.).
  • DIMP Implementation: Redact all Address nodes
  • DUP Base YAML:
      - path: nodesByType('Address')
        method: redact

DSC Concept: Postal Code
  • FHIR Resource: Patient
  • FHIR Element: Patient.address.postalCode
  • Privacy Requirement: IDAT and MDAT – generalize to 2 digits
  • Description: Postal code component of an address. Retaining the first 2 digits preserves regional granularity while reducing re-identification risk.
  • DIMP Implementation: Generalize to first 2 characters
  • DUP Base YAML:
      - path: Patient.address.postalCode
        method: generalize
        cases:
          "$this": "$this.toString().substring(0,2)"

DSC Concept: Free Text
  • FHIR Resource: All
  • FHIR Element: nodesByType('Annotation')
  • Privacy Requirement: IDAT – remove
  • Description: Unstructured free-text fields such as Observation.note. May contain patient-identifiable information and cannot be reliably de-identified automatically.
  • DIMP Implementation: Redact
  • DUP Base YAML:
      - path: nodesByType('Annotation')
        method: redact
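
To get a feel for what the two generalize rules (Patient.birthDate and Patient.address.postalCode) are meant to produce, the expressions can be approximated in Python and tried on sample values. This is a rough sketch only: the function names are hypothetical, the regex is a Python translation of the one in the birthDate rule (Python writes named groups as `(?P<name>...)`), and the behaviour of the actual FHIRPath engine may differ in edge cases.

```python
import re

# Python translation of the regex used in the Patient.birthDate rule.
BIRTH_DATE_PATTERN = re.compile(
    r"(?P<year>\d{2,4})-(?P<month>\d{2})-(?P<day>\d{2})\b"
)


def generalize_birth_date(value: str) -> str:
    """Drop the day component so that only year and month remain."""
    return BIRTH_DATE_PATTERN.sub(r"\g<year>-\g<month>", value)


def generalize_postal_code(value: str) -> str:
    """Keep only the first two characters, like substring(0,2) in the rule."""
    return value[:2]


# Sample values are made up for illustration.
for birth_date in ["1984-07-19", "1984-07", "1984"]:
    print(birth_date, "->", generalize_birth_date(birth_date))

print("91054", "->", generalize_postal_code("91054"))
```

Values that are already at month or year precision do not match the pattern and pass through unchanged in this sketch, which is the behaviour one would want from a rule that only removes the day.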