Configuration Reference

Complete reference for all Aether configuration options.

Configuration Schema

yaml
services:
  torch:
    base_url: string
    username: string
    password: string
    extraction_timeout_minutes: integer  # default: 30
    polling_interval_seconds: integer    # default: 5
    max_polling_interval_seconds: integer # default: 30

  dimp:
    url: string
    bundle_split_threshold_mb: integer   # 1-100, default: 10

  flattening:
    service_url: string
    lookup_path: string
    formats: [string]                    # default: ["csv"]
    timeout: duration                    # default: 30m

  send:
    send_as: string                      # "direct_resource_load" or "transfer_load"
    url: string
    batch_size: integer                  # 1-1000, default: 100
    auth:
      username: string
      password: string
      oauth_issuer_uri: string
      oauth_client_id: string
      oauth_client_secret: string
    transfer:
      project_identifier: string
      organization_identifier: string

  local_import:
    dir: string

pipeline:
  enabled_steps: [string]
  max_ndjson_line_size_mb: integer           # default: 100

retry:
  max_attempts: integer                  # 1-10, default: 5
  initial_backoff_ms: integer            # default: 1000
  max_backoff_ms: integer                # default: 30000

compression:
  enabled: boolean                       # default: true
  level: string                          # fastest, default, better, best

jobs_dir: string                         # default: ./jobs

Services

TORCH

TORCH server for FHIR data extraction.

yaml
services:
  torch:
    base_url: "https://torch.example.org"
    username: "${TORCH_USER}"
    password: "${TORCH_PASSWORD}"
    extraction_timeout_minutes: 30
    polling_interval_seconds: 5
    max_polling_interval_seconds: 30

| Option | Type | Default | Description |
|---|---|---|---|
| base_url | string | - | TORCH server URL (required if torch step enabled) |
| username | string | - | Authentication username |
| password | string | - | Authentication password |
| extraction_timeout_minutes | int | 30 | Max wait time for extraction |
| polling_interval_seconds | int | 5 | Initial status check interval |
| max_polling_interval_seconds | int | 30 | Max interval (exponential backoff cap) |
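
How the three timing options interact can be sketched as follows. This is an illustration only, not Aether's implementation: the doubling factor is an assumption (the table only states that the interval backs off exponentially up to a cap), and `polling_schedule` is a hypothetical helper name.

```python
from typing import Iterator


def polling_schedule(
    polling_interval_seconds: int = 5,
    max_polling_interval_seconds: int = 30,
    extraction_timeout_minutes: int = 30,
    factor: float = 2.0,  # assumed growth factor; not stated in the reference
) -> Iterator[float]:
    """Yield successive wait times until the extraction timeout is used up."""
    deadline = extraction_timeout_minutes * 60
    elapsed = 0.0
    interval = float(polling_interval_seconds)
    while elapsed < deadline:
        # Never wait past the overall deadline.
        wait = min(interval, deadline - elapsed)
        yield wait
        elapsed += wait
        # Grow the interval, but cap it at the configured maximum.
        interval = min(interval * factor, float(max_polling_interval_seconds))


print(list(polling_schedule())[:5])  # [5.0, 10.0, 20.0, 30.0, 30.0]
```

Under these assumptions the defaults reach the 30-second cap after three status checks and stay there until the 30-minute timeout expires.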

DIMP

DIMP pseudonymization service.

yaml
services:
  dimp:
    url: "http://dimp:32861/fhir"
    bundle_split_threshold_mb: 10

| Option | Type | Default | Description |
|---|---|---|---|
| url | string | - | DIMP service URL (required if dimp step enabled) |
| bundle_split_threshold_mb | int | 10 | Split Bundles larger than this (1-100 MB) |
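
To make the threshold concrete, here is one way a size-based Bundle split could work. The strategy is an assumption, not a description of Aether's splitter: entries are measured by their serialized JSON size, packed greedily, and 1 MB is taken as 1024 * 1024 bytes. `split_bundle` is a hypothetical name.

```python
import json
from typing import Any


def split_bundle(bundle: dict[str, Any], threshold_mb: int = 10) -> list[dict[str, Any]]:
    """Greedily split a FHIR Bundle into parts whose entries fit the threshold."""
    if not 1 <= threshold_mb <= 100:
        raise ValueError("bundle_split_threshold_mb must be between 1 and 100")
    limit = threshold_mb * 1024 * 1024  # assumed: 1 MB = 1024 * 1024 bytes

    chunks: list[list[Any]] = []
    current: list[Any] = []
    current_size = 0
    for entry in bundle.get("entry", []):
        size = len(json.dumps(entry).encode("utf-8"))
        # Start a new part when adding this entry would exceed the limit.
        if current and current_size + size > limit:
            chunks.append(current)
            current, current_size = [], 0
        current.append(entry)
        current_size += size
    if current:
        chunks.append(current)
    if not chunks:
        return [bundle]  # nothing to split

    # Copy the Bundle-level fields (resourceType, type, ...) into every part.
    header = {k: v for k, v in bundle.items() if k != "entry"}
    return [{**header, "entry": chunk} for chunk in chunks]
```

In this sketch an entry that is larger than the threshold on its own still ends up in a part by itself, since an entry cannot be divided further.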

Flattening

fhir-flattener service for FHIR to CSV transformation.

yaml
services:
  flattening:
    service_url: "http://fhir-flattener:8000"
    lookup_path: "/config/flatten-lookup.json"
    formats:
      - csv
    timeout: 30m

| Option | Type | Default | Description |
|---|---|---|---|
| service_url | string | - | fhir-flattener service URL |
| lookup_path | string | - | Path to lookup table file |
| formats | []string | ["csv"] | Output formats |
| timeout | duration | 30m | Request timeout |

Send

Destination server for uploading processed data.

Direct Resource Load

Upload FHIR resources directly to a FHIR server.

yaml
services:
  send:
    send_as: "direct_resource_load"
    url: "https://fhir-server.example.com/fhir"
    batch_size: 100
    auth:
      username: "${FHIR_USER}"
      password: "${FHIR_PASSWORD}"

Transfer Load

Package files for DSF-based transfer.

yaml
services:
  send:
    send_as: "transfer_load"
    url: "https://transfer.example.com/fhir"
    auth:
      oauth_issuer_uri: "${OAUTH_ISSUER}"
      oauth_client_id: "${OAUTH_CLIENT}"
      oauth_client_secret: "${OAUTH_SECRET}"
    transfer:
      project_identifier: "MII-PROJECT"
      organization_identifier: "your-org.example.de"

| Option | Type | Default | Description |
|---|---|---|---|
| send_as | string | - | direct_resource_load or transfer_load |
| url | string | - | FHIR server base URL |
| batch_size | int | 100 | Resources per transaction (direct mode, 1-1000) |
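
The effect of `batch_size` in direct mode can be illustrated by grouping resources into FHIR transaction Bundles. This is a sketch under assumptions: the request method (`PUT` to `ResourceType/id`) is chosen for illustration and may differ from what Aether sends, and `transaction_bundles` is a hypothetical helper.

```python
from typing import Any, Iterable, Iterator


def _bundle(resources: list[dict[str, Any]]) -> dict[str, Any]:
    """Wrap resources in a FHIR transaction Bundle (PUT is an assumption)."""
    return {
        "resourceType": "Bundle",
        "type": "transaction",
        "entry": [
            {
                "resource": r,
                "request": {"method": "PUT", "url": f"{r['resourceType']}/{r['id']}"},
            }
            for r in resources
        ],
    }


def transaction_bundles(
    resources: Iterable[dict[str, Any]], batch_size: int = 100
) -> Iterator[dict[str, Any]]:
    """Yield transaction Bundles holding at most batch_size resources each."""
    if not 1 <= batch_size <= 1000:
        raise ValueError("batch_size must be between 1 and 1000")
    batch: list[dict[str, Any]] = []
    for resource in resources:
        batch.append(resource)
        if len(batch) == batch_size:
            yield _bundle(batch)
            batch = []
    if batch:  # final, possibly smaller batch
        yield _bundle(batch)
```

With 250 resources and the default `batch_size` of 100, this produces three transactions of 100, 100, and 50 resources.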

Authentication (choose one):

| Option | Description |
|---|---|
| auth.username + auth.password | Basic authentication |
| auth.oauth_issuer_uri + auth.oauth_client_id + auth.oauth_client_secret | OAuth2 client credentials |

Transfer settings (transfer_load mode only):

| Option | Description |
|---|---|
| transfer.project_identifier | MII project identifier |
| transfer.organization_identifier | Organization identifier |
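
The "choose one" and "transfer_load mode only" constraints above can be expressed as a small pre-flight check. This is a sketch, not Aether's validation logic: `check_send` is a hypothetical helper, and treating transfer settings in direct mode as a problem is an assumption (Aether may simply ignore them).

```python
from typing import Any

BASIC_KEYS = ("username", "password")
OAUTH_KEYS = ("oauth_issuer_uri", "oauth_client_id", "oauth_client_secret")


def check_send(send: dict[str, Any]) -> list[str]:
    """Return a list of problems found in a services.send mapping."""
    problems: list[str] = []
    mode = send.get("send_as")
    if mode not in ("direct_resource_load", "transfer_load"):
        problems.append("send_as must be direct_resource_load or transfer_load")

    auth = send.get("auth") or {}
    basic_set = [k for k in BASIC_KEYS if auth.get(k)]
    oauth_set = [k for k in OAUTH_KEYS if auth.get(k)]
    if basic_set and oauth_set:
        problems.append("choose basic auth or OAuth2, not both")
    elif basic_set and len(basic_set) < len(BASIC_KEYS):
        problems.append("basic auth needs both username and password")
    elif oauth_set and len(oauth_set) < len(OAUTH_KEYS):
        problems.append("OAuth2 needs issuer URI, client ID, and client secret")

    # Assumption: transfer settings outside transfer_load mode are a mistake.
    if send.get("transfer") and mode != "transfer_load":
        problems.append("transfer settings apply to transfer_load mode only")
    return problems
```

An empty list means none of these checks found a problem; it does not guarantee that the destination server accepts the credentials.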

Local Import

Default directory for local FHIR imports.

yaml
services:
  local_import:
    dir: "/data/fhir"

| Option | Type | Description |
|---|---|---|
| dir | string | Default import directory (overridable with --dir flag) |

Pipeline

yaml
pipeline:
  enabled_steps:
    - local_import
    - dimp
    - flattening
  max_ndjson_line_size_mb: 100

| Option | Type | Default | Description |
|---|---|---|---|
| enabled_steps | []string | - | Pipeline steps to execute in order |
| max_ndjson_line_size_mb | int | 100 | Maximum NDJSON line size in MB. Increase if you encounter "token too long" errors when reading large FHIR Bundles. Set to 0 to use default. |
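
The line-size limit can be pictured as a per-line check while streaming an NDJSON file. The sketch below is an illustration, not Aether's reader: `read_ndjson` is a hypothetical name, 1 MB is assumed to be 1024 * 1024 bytes, and a value of 0 falls back to the default of 100 as described in the table.

```python
import json
from typing import Any, BinaryIO, Iterator

DEFAULT_MAX_LINE_MB = 100


def read_ndjson(stream: BinaryIO, max_line_size_mb: int = 0) -> Iterator[Any]:
    """Yield one parsed JSON value per line, rejecting oversized lines."""
    # 0 means "use the default", mirroring the documented behaviour.
    limit_mb = max_line_size_mb or DEFAULT_MAX_LINE_MB
    limit = limit_mb * 1024 * 1024
    number = 0
    while True:
        # Read at most limit + 1 bytes so an oversized line is detected
        # without loading all of it into memory.
        line = stream.readline(limit + 1)
        if not line:
            break
        number += 1
        if len(line) > limit:
            raise ValueError(f"line {number} exceeds {limit_mb} MB")
        if line.strip():  # skip blank lines
            yield json.loads(line)
```

Since each NDJSON line holds one complete resource, a single very large FHIR Bundle can exceed the limit on its own, which is the case where raising `max_ndjson_line_size_mb` helps.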

Available steps:

| Step | Description |
|---|---|
| torch | Import via TORCH (requires CRTDL) |
| local_import | Import from local directory |
| http_import | Import from HTTP URL |
| dimp | Pseudonymize via DIMP |
| wait | Pause for manual inspection |
| flattening | Transform to CSV (requires CRTDL) |
| send | Upload to destination server |
| validation | Validate FHIR data (placeholder) |
| csv_conversion | Convert to CSV (placeholder) |
| parquet_conversion | Convert to Parquet (placeholder) |

Rules:

  • One import step must be first (torch, local_import, or http_import)
  • The wait step cannot be first, and two wait steps cannot be consecutive
  • Flattening requires CRTDL input
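
The ordering rules above can be checked before a run with a few lines of code. This is a sketch of one possible check, not Aether's validator: `validate_steps` is a hypothetical helper, and the CRTDL requirement is left out because it depends on the job input rather than on the configuration.

```python
IMPORT_STEPS = {"torch", "local_import", "http_import"}
KNOWN_STEPS = IMPORT_STEPS | {
    "dimp", "wait", "flattening", "send",
    "validation", "csv_conversion", "parquet_conversion",
}


def validate_steps(steps: list[str]) -> list[str]:
    """Return a list of rule violations for pipeline.enabled_steps."""
    if not steps:
        return ["enabled_steps must not be empty"]
    errors: list[str] = []
    unknown = [s for s in steps if s not in KNOWN_STEPS]
    if unknown:
        errors.append(f"unknown steps: {', '.join(unknown)}")
    # An import step must come first; this also rules out a leading wait.
    if steps[0] not in IMPORT_STEPS:
        errors.append("first step must be torch, local_import, or http_import")
    # Two wait steps in a row are not allowed.
    if any(a == b == "wait" for a, b in zip(steps, steps[1:])):
        errors.append("wait steps cannot be consecutive")
    return errors


print(validate_steps(["local_import", "dimp", "flattening"]))  # []
```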

Retry

yaml
retry:
  max_attempts: 5
  initial_backoff_ms: 1000
  max_backoff_ms: 30000

| Option | Type | Default | Range | Description |
|---|---|---|---|---|
| max_attempts | int | 5 | 1-10 | Max retry attempts for transient errors |
| initial_backoff_ms | int | 1000 | - | Initial backoff delay |
| max_backoff_ms | int | 30000 | - | Max backoff delay |

Exponential backoff: wait = min(initial_backoff_ms * 2^attempt, max_backoff_ms)
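
As a worked example, the backoff formula can be evaluated for the default settings. The sketch assumes that `attempt` counts from 0 and that no jitter is added; neither detail is stated in this reference, and `backoff_ms` is a hypothetical name.

```python
def backoff_ms(attempt: int, initial_backoff_ms: int = 1000, max_backoff_ms: int = 30000) -> int:
    """Delay before the next retry: doubles per attempt, capped at the maximum."""
    return min(initial_backoff_ms * 2 ** attempt, max_backoff_ms)


# Delays for attempts 0..4 with the default settings.
print([backoff_ms(a) for a in range(5)])  # [1000, 2000, 4000, 8000, 16000]
```

Under these assumptions the cap only takes effect from attempt 5 onward, where the uncapped value of 32000 ms is reduced to 30000 ms.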

Compression

yaml
compression:
  enabled: true
  level: "default"

| Option | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Enable zstd compression |
| level | string | "default" | Compression level |

Compression levels:

| Level | Speed | Ratio | Use Case |
|---|---|---|---|
| fastest | ~500 MB/s | ~3-4x | Large datasets, CPU-constrained |
| default | ~200 MB/s | ~4-5x | Balanced (recommended) |
| better | ~100 MB/s | ~5-6x | Storage-constrained |
| best | ~50 MB/s | ~6-7x | Archival |

Output files use .ndjson.zst extension when enabled. Aether auto-detects and reads both compressed and uncompressed files.
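
If your own tooling needs to tell compressed job files from uncompressed ones, one option is to look at the first four bytes, since every Zstandard frame starts with the magic number 0xFD2FB528 stored in little-endian order. Whether Aether detects files this way or by extension is not stated here; `is_zstd` is a hypothetical helper, and the check ignores zstd skippable frames.

```python
from pathlib import Path

# Zstandard frame magic number 0xFD2FB528 as it appears on disk (little-endian).
ZSTD_MAGIC = b"\x28\xb5\x2f\xfd"


def is_zstd(path: Path) -> bool:
    """Return True if the file starts with a Zstandard frame header."""
    with path.open("rb") as f:
        return f.read(4) == ZSTD_MAGIC
```

A content-based check like this keeps working when a file has been renamed and no longer carries the .ndjson.zst extension.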

Jobs Directory

yaml
jobs_dir: "./jobs"

Directory for job state and data files.

Environment Variables

All string values support environment variable substitution using the ${VAR} syntax:

yaml
services:
  torch:
    username: "${TORCH_USERNAME}"
    password: "${TORCH_PASSWORD}"
  send:
    url: "${FHIR_SERVER_URL}"
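
The substitution can be pictured as a recursive pass over the parsed configuration. This is a sketch under assumptions, not Aether's loader: `substitute` is a hypothetical helper, only the `${NAME}` form seen in the examples is handled, and raising an error for an unset variable is a choice made here (Aether may behave differently).

```python
import os
import re
from typing import Any, Mapping

_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute(value: Any, env: Mapping[str, str] = os.environ) -> Any:
    """Replace ${NAME} in every string of a nested config structure."""
    if isinstance(value, str):
        def replace(match: re.Match) -> str:
            name = match.group(1)
            if name not in env:
                # Assumed behaviour: fail loudly instead of leaving a gap.
                raise KeyError(f"environment variable {name} is not set")
            return env[name]
        return _VAR.sub(replace, value)
    if isinstance(value, dict):
        return {key: substitute(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [substitute(item, env) for item in value]
    return value  # numbers, booleans, None are left untouched
```

For example, `substitute({"send": {"url": "${FHIR_SERVER_URL}"}}, {"FHIR_SERVER_URL": "https://fhir.example.org"})` returns the same mapping with the URL filled in.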

Example Configurations

TORCH + DIMP

yaml
services:
  torch:
    base_url: "https://torch.hospital.org"
    username: "${TORCH_USER}"
    password: "${TORCH_PASS}"
  dimp:
    url: "http://dimp:32861/fhir"

pipeline:
  enabled_steps:
    - torch
    - dimp

jobs_dir: "./jobs"

Local Import with Flattening

yaml
services:
  local_import:
    dir: "/data/fhir"
  dimp:
    url: "http://dimp:32861/fhir"
  flattening:
    service_url: "http://fhir-flattener:8000"
    lookup_path: "/config/lookup.json"

pipeline:
  enabled_steps:
    - local_import
    - dimp
    - flattening

compression:
  enabled: true
  level: "default"

jobs_dir: "./jobs"

Full Pipeline with Send

yaml
services:
  torch:
    base_url: "https://torch.hospital.org"
    username: "${TORCH_USER}"
    password: "${TORCH_PASS}"
  dimp:
    url: "http://dimp:32861/fhir"
  send:
    send_as: "transfer_load"
    url: "https://transfer.mii.de/fhir"
    auth:
      oauth_issuer_uri: "${OAUTH_ISSUER}"
      oauth_client_id: "${OAUTH_CLIENT}"
      oauth_client_secret: "${OAUTH_SECRET}"
    transfer:
      project_identifier: "MII-PROJECT"
      organization_identifier: "hospital.example.de"

pipeline:
  enabled_steps:
    - torch
    - dimp
    - send

retry:
  max_attempts: 5

compression:
  enabled: true

jobs_dir: "/data/aether/jobs"
