# Configuration Reference

Complete reference for all Aether configuration options.
## Configuration Schema

```yaml
services:
  torch:
    base_url: string
    username: string
    password: string
    extraction_timeout_minutes: integer # default: 30
    polling_interval_seconds: integer # default: 5
    max_polling_interval_seconds: integer # default: 30
  dimp:
    url: string
    bundle_split_threshold_mb: integer # 1-100, default: 10
  flattening:
    service_url: string
    lookup_path: string
    formats: [string] # ["csv"]
    timeout: duration # default: 30m
  send:
    send_as: string # "direct_resource_load" or "transfer_load"
    url: string
    batch_size: integer # 1-1000, default: 100
    auth:
      username: string
      password: string
      oauth_issuer_uri: string
      oauth_client_id: string
      oauth_client_secret: string
    transfer:
      project_identifier: string
      organization_identifier: string
  validation:
    url: string
    max_concurrent_requests: integer # default: 4
    bundle_chunk_size_mb: integer # default: 10
    fail_on_error: boolean # default: true
  local_import:
    dir: string

pipeline:
  enabled_steps: [string]
  max_ndjson_line_size_mb: integer # default: 100

retry:
  max_attempts: integer # 1-10, default: 5
  initial_backoff_ms: integer # default: 1000
  max_backoff_ms: integer # default: 30000

compression:
  enabled: boolean # default: true
  level: string # fastest, default, better, best

jobs_dir: string # default: ./jobs
```

## Services
### TORCH

TORCH server for FHIR data extraction.

```yaml
services:
  torch:
    base_url: "https://torch.example.org"
    username: "${TORCH_USER}"
    password: "${TORCH_PASSWORD}"
    extraction_timeout_minutes: 30
    polling_interval_seconds: 5
    max_polling_interval_seconds: 30
```

| Option | Type | Default | Description |
|---|---|---|---|
| base_url | string | - | TORCH server URL (required if the torch step is enabled) |
| username | string | - | Authentication username |
| password | string | - | Authentication password |
| extraction_timeout_minutes | int | 30 | Maximum wait time for extraction |
| polling_interval_seconds | int | 5 | Initial status-check interval |
| max_polling_interval_seconds | int | 30 | Maximum interval (exponential backoff cap) |
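The two polling options interact as a status-check interval that grows up to the cap. A minimal sketch, assuming the interval doubles between checks (`wait_for_extraction` and `check_status` are hypothetical names, not Aether's API):

```python
import time

def wait_for_extraction(check_status, timeout_minutes=30,
                        interval_s=5, max_interval_s=30):
    """Poll check_status() until it returns True, doubling the wait
    between checks up to max_interval_s, or fail at the timeout."""
    deadline = time.monotonic() + timeout_minutes * 60
    delay = interval_s
    while time.monotonic() < deadline:
        if check_status():
            return True
        time.sleep(delay)
        # Exponential backoff capped at the configured maximum.
        delay = min(delay * 2, max_interval_s)
    raise TimeoutError("extraction did not complete within the timeout")
```

With the defaults, checks happen after 5, 10, 20, 30, 30, … seconds until the extraction finishes or the timeout elapses.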
### DIMP

DIMP pseudonymization service.

```yaml
services:
  dimp:
    url: "http://dimp:32861/fhir"
    bundle_split_threshold_mb: 10
```

| Option | Type | Default | Description |
|---|---|---|---|
| url | string | - | DIMP service URL (required if the dimp step is enabled) |
| bundle_split_threshold_mb | int | 10 | Split Bundles larger than this size (1-100 MB) |
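Size-based splitting can be sketched by grouping a Bundle's entries until their serialized size reaches the threshold. This is an illustration of the idea only (`split_bundle` is not Aether's implementation, whose splitting logic may differ):

```python
import json

def split_bundle(bundle, threshold_mb=10):
    """Split a FHIR Bundle's entries into chunks whose serialized
    size stays under threshold_mb, returning one Bundle per chunk."""
    limit = threshold_mb * 1024 * 1024
    chunks, current, size = [], [], 0
    for entry in bundle.get("entry", []):
        entry_size = len(json.dumps(entry).encode())
        if current and size + entry_size > limit:
            chunks.append(current)
            current, size = [], 0
        current.append(entry)
        size += entry_size
    if current:
        chunks.append(current)
    return [{"resourceType": "Bundle", "type": bundle.get("type"),
             "entry": chunk} for chunk in chunks]
```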
### Flattening

fhir-flattener service for FHIR-to-CSV transformation.

```yaml
services:
  flattening:
    service_url: "http://fhir-flattener:8000"
    lookup_path: "/config/flatten-lookup.json"
    formats:
      - csv
    timeout: 30m
```

| Option | Type | Default | Description |
|---|---|---|---|
| service_url | string | - | fhir-flattener service URL |
| lookup_path | string | - | Path to lookup table file |
| formats | []string | ["csv"] | Output formats |
| timeout | duration | 30m | Request timeout |
### Send

Destination server for uploading processed data.

#### Direct Resource Load

Upload FHIR resources directly to a FHIR server.

```yaml
services:
  send:
    send_as: "direct_resource_load"
    url: "https://fhir-server.example.com/fhir"
    batch_size: 100
    auth:
      username: "${FHIR_USER}"
      password: "${FHIR_PASSWORD}"
```

#### Transfer Load

Package files for DSF-based transfer.

```yaml
services:
  send:
    send_as: "transfer_load"
    url: "https://transfer.example.com/fhir"
    auth:
      oauth_issuer_uri: "${OAUTH_ISSUER}"
      oauth_client_id: "${OAUTH_CLIENT}"
      oauth_client_secret: "${OAUTH_SECRET}"
    transfer:
      project_identifier: "MII-PROJECT"
      organization_identifier: "your-org.example.de"
```

| Option | Type | Default | Description |
|---|---|---|---|
| send_as | string | - | direct_resource_load or transfer_load |
| url | string | - | FHIR server base URL |
| batch_size | int | 100 | Resources per transaction (direct mode, 1-1000) |

Authentication (choose one):

| Option | Description |
|---|---|
| auth.username + auth.password | Basic authentication |
| auth.oauth_issuer_uri + auth.oauth_client_id + auth.oauth_client_secret | OAuth2 client credentials |

Transfer settings (transfer_load mode only):

| Option | Description |
|---|---|
| transfer.project_identifier | MII project identifier |
| transfer.organization_identifier | Organization identifier |
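In direct mode, batch_size simply bounds how many resources are packed into each transaction sent to the server. A minimal sketch of that grouping (`transaction_batches` is an illustrative helper, not Aether's API):

```python
def transaction_batches(resources, batch_size=100):
    """Yield successive groups of at most batch_size resources;
    each group would become one FHIR transaction Bundle."""
    for start in range(0, len(resources), batch_size):
        yield resources[start:start + batch_size]

# 250 resources with the default batch_size produce groups of 100, 100, 50.
```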
### Validation

FHIR validation service for data quality checks.

```yaml
services:
  validation:
    url: "http://validator:8080/fhir"
    max_concurrent_requests: 4
    bundle_chunk_size_mb: 10
    fail_on_error: true
```

| Option | Type | Default | Description |
|---|---|---|---|
| url | string | - | Validation service URL (required if the validation step is enabled) |
| max_concurrent_requests | int | 4 | Concurrent validation requests |
| bundle_chunk_size_mb | int | 10 | Bundle chunk size for batching resources (MB) |
| fail_on_error | bool | true | Stop the pipeline when validation finds data quality errors |

When fail_on_error is true (the default), the pipeline stops after the validation step completes with errors. When false, validation reports are written but the pipeline continues.
### Local Import

Default directory for local FHIR imports.

```yaml
services:
  local_import:
    dir: "/data/fhir"
```

| Option | Type | Description |
|---|---|---|
| dir | string | Default import directory (overridable with the --dir flag) |
## Pipeline

```yaml
pipeline:
  enabled_steps:
    - local_import
    - dimp
    - flattening
  max_ndjson_line_size_mb: 100
```

| Option | Type | Default | Description |
|---|---|---|---|
| enabled_steps | []string | - | Pipeline steps to execute, in order |
| max_ndjson_line_size_mb | int | 100 | Maximum NDJSON line size in MB. Increase this if you encounter "token too long" errors when reading large FHIR Bundles; set to 0 to use the default. |

Available steps:

| Step | Description |
|---|---|
| torch | Import via TORCH (requires CRTDL) |
| local_import | Import from local directory |
| http_import | Import from HTTP URL |
| dimp | Pseudonymize via DIMP |
| wait | Pause for manual inspection |
| flattening | Transform to CSV (requires CRTDL) |
| send | Upload to destination server |
| validation | Validate FHIR data against profiles |
| csv_conversion | Convert to CSV (placeholder) |
| parquet_conversion | Convert to Parquet (placeholder) |

Rules:

- Exactly one import step (torch, local_import, or http_import) must come first
- The wait step cannot be first, and two wait steps cannot be consecutive
- The flattening step requires CRTDL input
## Retry

```yaml
retry:
  max_attempts: 5
  initial_backoff_ms: 1000
  max_backoff_ms: 30000
```

| Option | Type | Default | Range | Description |
|---|---|---|---|---|
| max_attempts | int | 5 | 1-10 | Maximum retry attempts for transient errors |
| initial_backoff_ms | int | 1000 | - | Initial backoff delay |
| max_backoff_ms | int | 30000 | - | Maximum backoff delay |

Exponential backoff: `wait = min(initial * 2^attempt, max)`
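With the defaults above, the formula yields the following delay schedule (a quick sketch; `backoff_ms` is an illustration, not Aether's code):

```python
def backoff_ms(attempt, initial_ms=1000, max_ms=30000):
    """Backoff before retry number `attempt` (0-based), following
    wait = min(initial * 2^attempt, max)."""
    return min(initial_ms * 2 ** attempt, max_ms)

print([backoff_ms(a) for a in range(6)])
# [1000, 2000, 4000, 8000, 16000, 30000]
```

Note how the sixth delay is capped at max_backoff_ms (1000 * 2^5 = 32000 would otherwise exceed it).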
## Compression

```yaml
compression:
  enabled: true
  level: "default"
```

| Option | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Enable zstd compression |
| level | string | "default" | Compression level |

Compression levels:

| Level | Speed | Ratio | Use Case |
|---|---|---|---|
| fastest | ~500 MB/s | ~3-4x | Large datasets, CPU-constrained |
| default | ~200 MB/s | ~4-5x | Balanced (recommended) |
| better | ~100 MB/s | ~5-6x | Storage-constrained |
| best | ~50 MB/s | ~6-7x | Archival |

Output files use the .ndjson.zst extension when compression is enabled. Aether auto-detects and reads both compressed and uncompressed files.
## Jobs Directory

```yaml
jobs_dir: "./jobs"
```

Directory for job state and data files.
## Environment Variables

All string values support environment variable substitution:
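The substitution mechanics can be sketched as a simple placeholder expansion (`substitute_env` is illustrative only; Aether's handling of unset variables may differ):

```python
import os
import re

_PLACEHOLDER = re.compile(r"\$\{(\w+)\}")

def substitute_env(value):
    """Expand ${NAME} placeholders using the process environment.
    Unset variables expand to "" here; real tools may error instead."""
    return _PLACEHOLDER.sub(lambda m: os.environ.get(m.group(1), ""), value)

os.environ["TORCH_USERNAME"] = "alice"
print(substitute_env("username: ${TORCH_USERNAME}"))  # username: alice
```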
```yaml
services:
  torch:
    username: "${TORCH_USERNAME}"
    password: "${TORCH_PASSWORD}"
  send:
    url: "${FHIR_SERVER_URL}"
```

## Example Configurations
### TORCH + DIMP

```yaml
services:
  torch:
    base_url: "https://torch.hospital.org"
    username: "${TORCH_USER}"
    password: "${TORCH_PASS}"
  dimp:
    url: "http://dimp:32861/fhir"

pipeline:
  enabled_steps:
    - torch
    - dimp

jobs_dir: "./jobs"
```
### Local Import with Flattening

```yaml
services:
  local_import:
    dir: "/data/fhir"
  dimp:
    url: "http://dimp:32861/fhir"
  flattening:
    service_url: "http://fhir-flattener:8000"
    lookup_path: "/config/lookup.json"

pipeline:
  enabled_steps:
    - local_import
    - dimp
    - flattening

compression:
  enabled: true
  level: "default"

jobs_dir: "./jobs"
```
### Full Pipeline with Send

```yaml
services:
  torch:
    base_url: "https://torch.hospital.org"
    username: "${TORCH_USER}"
    password: "${TORCH_PASS}"
  dimp:
    url: "http://dimp:32861/fhir"
  send:
    send_as: "transfer_load"
    url: "https://transfer.mii.de/fhir"
    auth:
      oauth_issuer_uri: "${OAUTH_ISSUER}"
      oauth_client_id: "${OAUTH_CLIENT}"
      oauth_client_secret: "${OAUTH_SECRET}"
    transfer:
      project_identifier: "MII-PROJECT"
      organization_identifier: "hospital.example.de"

pipeline:
  enabled_steps:
    - torch
    - dimp
    - send

retry:
  max_attempts: 5

compression:
  enabled: true

jobs_dir: "/data/aether/jobs"
```

## Next Steps
- CLI Commands - Command reference
- Pipeline Steps - Step details