Configuration
Aether uses a YAML configuration file. Create an aether.yaml anywhere on disk and pass its path as the first positional argument on every command. Aether does not auto-discover config files.
Basic Configuration
services:
  torch:
    base_url: "https://your-torch-server.org"
    username: "your-username"
    password: "your-password"
  dimp:
    url: "http://your-dimp-server:32861"
pipeline:
  enabled_steps:
    - torch
    - dimp
jobs_dir: "./jobs"

Service Configuration
TORCH
services:
  torch:
    base_url: "https://your-torch-server.org"
    username: "your-username"
    password: "your-password"
    extraction_timeout: PT30M
    polling_interval: PT5S

DIMP
services:
  dimp:
    url: "http://your-dimp-server:32861" # server root; /fhir appended by client
    bundle_split_threshold_mb: 10 # Auto-split large bundles

Flattening
services:
  flattening:
    service_url: "http://fhir-flattener:8000"
    lookup_path: "/path/to/flatten-lookup.json"
    formats:
      - csv
    timeout: PT30M

Send
Direct to FHIR server:
services:
  send:
    send_as: "direct_resource_load"
    url: "https://fhir-server.example.com" # server root; /fhir appended by client
    batch_size: 100
    auth:
      username: "${FHIR_USER}"
      password: "${FHIR_PASSWORD}"

DSF transfer:
services:
  send:
    send_as: "transfer_load"
    url: "https://transfer-server.example.com" # server root; /fhir appended by client
    auth:
      oauth_issuer_uri: "${OAUTH_ISSUER}"
      oauth_client_id: "${OAUTH_CLIENT_ID}"
      oauth_client_secret: "${OAUTH_CLIENT_SECRET}"
    transfer:
      project_identifier: "MII-PROJECT"
      organization_identifier: "your-org.example.de"

S3 upload (AWS S3, MinIO, Ceph):
services:
  send:
    send_as: "s3_upload"
    s3:
      bucket: "${S3_BUCKET}"
      region: "eu-central-1"
      access_key_id: "${AWS_ACCESS_KEY_ID}"
      secret_access_key: "${AWS_SECRET_ACCESS_KEY}"
      # endpoint: "http://minio.example.com:9000" # for non-AWS stores
      # use_path_style: true # required for MinIO
      # timeout: PT30M

See the Send step guide for full S3 options and proxy-auth behaviour.
Local Import
services:
  local_import:
    dir: "/path/to/fhir/data" # Override with --dir flag

Validation
services:
  validation:
    url: "http://your-validator:8080/fhir"
    fail_on_error: true # false to continue pipeline despite validation errors

Pipeline Steps
pipeline:
  enabled_steps:
    - torch       # OR local_import OR http_import
    - validation  # Validate FHIR data (optional)
    - dimp        # Pseudonymization
    - wait        # Pause for inspection (optional)
    - flattening  # FHIR to CSV (requires CRTDL)
    - send        # Upload to destination

Step Placement Rules
Wait steps:
- Can be placed between any two steps
- Cannot be the first step (needs previous step output)
- Cannot be consecutive (redundant)
- Multiple wait steps are supported at different points in the pipeline
Processing steps (dimp, flattening):
- Should only appear once in the pipeline
- Multiple instances are not supported (output directories would be overwritten)
Import steps (torch, local_import, http_import):
- Must be first
- Only one import step allowed
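The placement rules above can be checked mechanically before a pipeline is started. The following is a minimal Python sketch, not Aether's own validation code: the function name `check_step_order` and the wording of the messages are hypothetical, and it enforces only the rules listed in this section.

```python
# Step names are taken from the lists above; everything else is illustrative.
IMPORT_STEPS = {"torch", "local_import", "http_import"}
SINGLE_USE_STEPS = {"dimp", "flattening"}


def check_step_order(steps):
    """Return a list of rule violations for enabled_steps (empty if none)."""
    if not steps:
        return ["pipeline has no steps"]

    problems = []

    # An import step must come first, so a leading wait step is also rejected.
    if steps[0] not in IMPORT_STEPS:
        problems.append(f"first step must be an import step, got {steps[0]!r}")

    # Only one import step is allowed in the whole pipeline.
    if sum(1 for step in steps if step in IMPORT_STEPS) > 1:
        problems.append("only one import step is allowed")

    # Processing steps would overwrite their own output directories.
    for name in sorted(SINGLE_USE_STEPS):
        if steps.count(name) > 1:
            problems.append(f"{name!r} may appear only once")

    # Two wait steps in a row are redundant.
    if any(a == "wait" and b == "wait" for a, b in zip(steps, steps[1:])):
        problems.append("wait steps cannot be consecutive")

    return problems
```

For example, `check_step_order(["torch", "wait", "dimp", "wait", "send"])` returns an empty list, while `["torch", "wait", "wait", "dimp"]` yields a single violation for the consecutive wait steps.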
Compression
compression:
  enabled: true # default: true
  level: "default" # fastest, default, better, best

Output files use .ndjson.zst extension when enabled.
TLS
Trust custom or internal certificates and, when needed, disable verification entirely:
tls:
  # PEM bundle of additional CA or server certificates to trust
  # (system CAs are still trusted alongside these)
  ca_cert_path: "${CA_CERT_PATH}"
  # Skip certificate verification (development/testing only)
  insecure_skip_verify: false

tls applies to every outgoing HTTP client, including TORCH, DIMP, validation, flattening, send (FHIR + S3), and HTTP import.
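To make the two tls options concrete, here is a rough Python sketch of the same trust model using the standard `ssl` module. It illustrates the semantics described above (extra certificates added on top of the system CAs, verification optionally switched off) and is not how Aether itself is implemented; the function name `build_ssl_context` is hypothetical.

```python
import ssl


def build_ssl_context(ca_cert_path=None, insecure_skip_verify=False):
    """Build a client TLS context mirroring the two tls options above."""
    # Start from the platform defaults, which trust the system CAs.
    ctx = ssl.create_default_context()

    if ca_cert_path:
        # Add the PEM bundle on top of the system CAs instead of replacing them.
        ctx.load_verify_locations(cafile=ca_cert_path)

    if insecure_skip_verify:
        # Hostname checking has to be disabled before verify_mode can be
        # set to CERT_NONE. Development/testing only.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE

    return ctx
```

With both arguments left at their defaults the context verifies certificates and hostnames against the system trust store, which corresponds to omitting the tls block altogether.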
Retry
Transient failures (network errors, 5xx responses, S3 SlowDown / ServiceUnavailable / timeouts) are retried with exponential backoff:
retry:
  max_attempts: 5 # 1-10
  initial_backoff_ms: 1000
  max_backoff_ms: 30000

CRTDL Preprocessing
Enriches CRTDL files with extra attributes (e.g. pseudonymisation identifiers) before sending them to TORCH. Disabled by default.
services:
  crtdl_preprocessing:
    enabled: true
    # Option A: external rules file
    enrichments_path: "/path/to/dimp-enrichments.json"
    # Option B: inline rules (mutually exclusive with enrichments_path)
    # enrichments:
    #   - group_reference: "https://www.medizininformatik-initiative.de/fhir/core/modul-person/StructureDefinition/Patient"
    #     create_if_not_exists:
    #       group_name: "Patient"
    #     attributes_to_add:
    #       - attribute_ref: "Patient.identifier:PseudonymisierterIdentifier"
    #         must_have: true

Environment Variables
Use environment variables for sensitive data:
services:
  torch:
    username: "${TORCH_USERNAME}"
    password: "${TORCH_PASSWORD}"

export TORCH_USERNAME="researcher"
export TORCH_PASSWORD="secret"

Next Steps
- Quick Start - Run your first pipeline
- Pipeline Steps - Step details
- Configuration Reference - All options