DIMP Pseudonymization

DIMP provides de-identification and pseudonymization for FHIR data, protecting patient privacy while keeping the data useful for research.

What DIMP Does

Removes or masks identifying information (names, addresses, etc.)
Generates consistent pseudonyms for patient identifiers
Preserves clinical data (diagnoses, procedures, lab values)
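
The three behaviours above can be illustrated with a minimal Python sketch. It is not DIMP's actual implementation: the keyed hash (HMAC-SHA256), the function names, the set of identifying fields, and the demo key are all assumptions chosen for illustration.

```python
import hashlib
import hmac


def pseudonymize_id(identifier: str, secret_key: bytes) -> str:
    """Map an identifier to a stable pseudonym with a keyed hash (HMAC-SHA256).

    The same identifier and key always give the same pseudonym, which is
    what keeps records for one patient linkable after de-identification.
    """
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]  # shortened for readability in this sketch


def deidentify_patient(patient: dict, secret_key: bytes) -> dict:
    """Return a copy with direct identifiers dropped and the id pseudonymized."""
    # Hypothetical field list; a real DIMP deployment is driven by its own rules.
    identifying_fields = {"name", "address", "telecom"}
    cleaned = {k: v for k, v in patient.items() if k not in identifying_fields}
    cleaned["id"] = pseudonymize_id(patient["id"], secret_key)
    return cleaned


demo_key = b"demo-key-not-for-production"  # assumption: any secret byte string
patient = {
    "resourceType": "Patient",
    "id": "patient-123",
    "name": [{"family": "Example"}],
    "address": [{"city": "Exampletown"}],
    "gender": "female",
}
protected = deidentify_patient(patient, demo_key)
```

In this sketch the name and address are removed, the id becomes a pseudonym that is identical on every run with the same key, and non-identifying fields such as gender pass through unchanged.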

Configuration

Add the DIMP service and the dimp pipeline step to your aether.yaml:

yaml

services:
  dimp:
    url: "http://your-dimp-server:32861/fhir"

pipeline:
  enabled_steps:
    - torch   # or local_import
    - dimp    # Pseudonymize after import

jobs_dir: "./jobs"

Running Pseudonymization

bash

aether pipeline start your-query.crtdl

Aether will:

Extract data from TORCH (or import from files)
Send it to DIMP for pseudonymization
Save the protected data in the jobs folder

Output

Results are saved in a per-job folder under jobs_dir:

jobs/<job-id>/
├── status.json          # Job status
└── dimp_results.ndjson  # Pseudonymized data
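
To sanity-check the output, a short Python sketch like the following could summarise what the file contains. It assumes that each non-empty line of dimp_results.ndjson is a single JSON object with a resourceType field; the function name is hypothetical and not part of Aether.

```python
import json
from collections import Counter
from typing import Iterable


def count_resource_types(lines: Iterable[str]) -> Counter:
    """Count FHIR resource types in NDJSON input (one JSON object per line)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        resource = json.loads(line)
        counts[resource.get("resourceType", "unknown")] += 1
    return counts


# Usage (replace <job-id> with a real job folder name):
# with open("jobs/<job-id>/dimp_results.ndjson", encoding="utf-8") as handle:
#     print(count_resource_types(handle))
```

Because the function accepts any iterable of lines, it reads the file as a stream and does not need to load the whole result into memory.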

Large Bundles

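For large datasets, Aether automatically splits oversized bundles before sending them to DIMP. The bundle_split_threshold_mb setting controls the size, in megabytes, above which a bundle is split: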

yaml

services:
  dimp:
    url: "http://your-dimp-server:32861/fhir"
    bundle_split_threshold_mb: 10   # Split bundles larger than 10 MB
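
The idea behind size-based splitting can be sketched in Python. This is an illustration only, not Aether's actual algorithm: it assumes a greedy strategy that packs entries in order, measures only the serialized entries (not the bundle envelope), and uses hypothetical function names.

```python
import json


def serialized_size(obj: dict) -> int:
    """Size in bytes of the object's UTF-8 JSON serialization."""
    return len(json.dumps(obj).encode("utf-8"))


def split_bundle(bundle: dict, threshold_bytes: int) -> list[dict]:
    """Greedily pack entries, in order, into smaller bundles.

    A new chunk starts whenever adding the next entry would push the
    running total above threshold_bytes. A single entry that is larger
    than the threshold still gets a chunk of its own.
    """
    chunks, current, current_size = [], [], 0
    for entry in bundle.get("entry", []):
        entry_size = serialized_size(entry)
        if current and current_size + entry_size > threshold_bytes:
            chunks.append(current)
            current, current_size = [], 0
        current.append(entry)
        current_size += entry_size
    if current:
        chunks.append(current)
    return [
        {
            "resourceType": "Bundle",
            "type": bundle.get("type", "collection"),
            "entry": chunk,
        }
        for chunk in chunks
    ]


# A threshold of 10 MB, as in the configuration above, would be passed as:
# parts = split_bundle(bundle, 10 * 1024 * 1024)
```

Keeping the entries in their original order means the smaller bundles can be concatenated again after pseudonymization without reordering resources.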
