Configuration Reference

Complete reference for all Aether configuration options.

Configuration Schema

yaml
services:
  torch:
    base_url: string
    username: string
    password: string
    extraction_timeout_minutes: integer  # default: 30
    polling_interval_seconds: integer    # default: 5
    max_polling_interval_seconds: integer # default: 30

  dimp:
    url: string
    bundle_split_threshold_mb: integer   # 1-100, default: 10

  flattening:
    service_url: string
    lookup_path: string
    formats: [string]                    # ["csv"]
    timeout: duration                    # default: 30m

  send:
    send_as: string                      # "direct_resource_load" or "transfer_load"
    url: string
    batch_size: integer                  # 1-1000, default: 100
    auth:
      username: string
      password: string
      oauth_issuer_uri: string
      oauth_client_id: string
      oauth_client_secret: string
    transfer:
      project_identifier: string
      organization_identifier: string

  validation:
    url: string
    max_concurrent_requests: integer   # default: 4
    bundle_chunk_size_mb: integer      # default: 10
    fail_on_error: boolean             # default: true

  local_import:
    dir: string

pipeline:
  enabled_steps: [string]
  max_ndjson_line_size_mb: integer           # default: 100

retry:
  max_attempts: integer                  # 1-10, default: 5
  initial_backoff_ms: integer            # default: 1000
  max_backoff_ms: integer                # default: 30000

compression:
  enabled: boolean                       # default: true
  level: string                          # fastest, default, better, best

jobs_dir: string                         # default: ./jobs

Services

TORCH

TORCH server for FHIR data extraction.

yaml
services:
  torch:
    base_url: "https://torch.example.org"
    username: "${TORCH_USER}"
    password: "${TORCH_PASSWORD}"
    extraction_timeout_minutes: 30
    polling_interval_seconds: 5
    max_polling_interval_seconds: 30

| Option | Type | Default | Description |
|---|---|---|---|
| base_url | string | - | TORCH server URL (required if torch step enabled) |
| username | string | - | Authentication username |
| password | string | - | Authentication password |
| extraction_timeout_minutes | int | 30 | Max wait time for extraction |
| polling_interval_seconds | int | 5 | Initial status check interval |
| max_polling_interval_seconds | int | 30 | Max interval (exponential backoff cap) |

DIMP

DIMP pseudonymization service.

yaml
services:
  dimp:
    url: "http://dimp:32861/fhir"
    bundle_split_threshold_mb: 10

| Option | Type | Default | Description |
|---|---|---|---|
| url | string | - | DIMP service URL (required if dimp step enabled) |
| bundle_split_threshold_mb | int | 10 | Split Bundles larger than this (1-100 MB) |

Flattening

fhir-flattener service for FHIR to CSV transformation.

yaml
services:
  flattening:
    service_url: "http://fhir-flattener:8000"
    lookup_path: "/config/flatten-lookup.json"
    formats:
      - csv
    timeout: 30m

| Option | Type | Default | Description |
|---|---|---|---|
| service_url | string | - | fhir-flattener service URL |
| lookup_path | string | - | Path to lookup table file |
| formats | []string | ["csv"] | Output formats |
| timeout | duration | 30m | Request timeout |

Send

Destination server for uploading processed data.

Direct Resource Load

Upload FHIR resources directly to a FHIR server.

yaml
services:
  send:
    send_as: "direct_resource_load"
    url: "https://fhir-server.example.com/fhir"
    batch_size: 100
    auth:
      username: "${FHIR_USER}"
      password: "${FHIR_PASSWORD}"

Transfer Load

Package files for DSF-based transfer.

yaml
services:
  send:
    send_as: "transfer_load"
    url: "https://transfer.example.com/fhir"
    auth:
      oauth_issuer_uri: "${OAUTH_ISSUER}"
      oauth_client_id: "${OAUTH_CLIENT}"
      oauth_client_secret: "${OAUTH_SECRET}"
    transfer:
      project_identifier: "MII-PROJECT"
      organization_identifier: "your-org.example.de"

| Option | Type | Default | Description |
|---|---|---|---|
| send_as | string | - | direct_resource_load or transfer_load |
| url | string | - | FHIR server base URL |
| batch_size | int | 100 | Resources per transaction (direct mode, 1-1000) |

Authentication (choose one):

| Option | Description |
|---|---|
| auth.username + auth.password | Basic authentication |
| auth.oauth_issuer_uri + auth.oauth_client_id + auth.oauth_client_secret | OAuth2 client credentials |
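
The "choose one" rule can be checked before a run. The following is a minimal sketch, not Aether's own validation: the function name `pick_auth_mode` is hypothetical, and treating an empty auth block as "none" is an assumption, since this page does not say whether unauthenticated targets are accepted.

```python
BASIC_KEYS = ("username", "password")
OAUTH_KEYS = ("oauth_issuer_uri", "oauth_client_id", "oauth_client_secret")


def pick_auth_mode(auth: dict) -> str:
    """Classify a services.send.auth mapping as "basic", "oauth2", or "none".

    Raises ValueError when both methods are mixed or one is only partly set.
    """
    basic_set = [k for k in BASIC_KEYS if auth.get(k)]
    oauth_set = [k for k in OAUTH_KEYS if auth.get(k)]

    if basic_set and oauth_set:
        raise ValueError("configure either basic auth or OAuth2, not both")
    if basic_set:
        if len(basic_set) != len(BASIC_KEYS):
            raise ValueError("basic auth needs both username and password")
        return "basic"
    if oauth_set:
        if len(oauth_set) != len(OAUTH_KEYS):
            raise ValueError("OAuth2 needs issuer URI, client id, and client secret")
        return "oauth2"
    # Assumption: an empty auth block is reported rather than rejected.
    return "none"
```

Calling it on the parsed `auth` mapping before starting a job surfaces a half-filled credential set early instead of at upload time.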

Transfer settings (transfer_load mode only):

| Option | Description |
|---|---|
| transfer.project_identifier | MII project identifier |
| transfer.organization_identifier | Organization identifier |

Validation

FHIR validation service for data quality checks.

yaml
services:
  validation:
    url: "http://validator:8080/fhir"
    max_concurrent_requests: 4
    bundle_chunk_size_mb: 10
    fail_on_error: true

| Option | Type | Default | Description |
|---|---|---|---|
| url | string | - | Validation service URL (required if validation step enabled) |
| max_concurrent_requests | int | 4 | Concurrent validation requests |
| bundle_chunk_size_mb | int | 10 | Bundle chunk size for batching resources (MB) |
| fail_on_error | bool | true | Stop pipeline when validation finds data quality errors |

When fail_on_error is true (default), the pipeline stops after the validation step completes with errors. When false, validation reports are written but the pipeline continues.

Local Import

Default directory for local FHIR imports.

yaml
services:
  local_import:
    dir: "/data/fhir"

| Option | Type | Description |
|---|---|---|
| dir | string | Default import directory (overridable with --dir flag) |

Pipeline

yaml
pipeline:
  enabled_steps:
    - local_import
    - dimp
    - flattening
  max_ndjson_line_size_mb: 100

| Option | Type | Default | Description |
|---|---|---|---|
| enabled_steps | []string | - | Pipeline steps to execute in order |
| max_ndjson_line_size_mb | int | 100 | Maximum NDJSON line size in MB. Increase if you encounter "token too long" errors when reading large FHIR Bundles. Set to 0 to use default. |
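
To pick a value for max_ndjson_line_size_mb, it can help to measure the longest line in an existing file. The helper below is a rough sketch: `suggest_line_size_mb` is a hypothetical name, it expects an uncompressed .ndjson file, and it assumes 1 MB means 1024 * 1024 bytes, which this page does not confirm.

```python
import math

BYTES_PER_MB = 1024 * 1024  # assumption: Aether counts MB as MiB


def suggest_line_size_mb(path: str) -> int:
    """Return the smallest whole-MB limit that fits the longest line in the file."""
    longest = 0
    with open(path, "rb") as f:
        # Iterating a binary file yields one line (including the newline) at a time.
        for line in f:
            longest = max(longest, len(line))
    return max(1, math.ceil(longest / BYTES_PER_MB))
```

If the result is above the configured limit, raising max_ndjson_line_size_mb to at least that number should leave room for the largest Bundle in the file.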

Available steps:

| Step | Description |
|---|---|
| torch | Import via TORCH (requires CRTDL) |
| local_import | Import from local directory |
| http_import | Import from HTTP URL |
| dimp | Pseudonymize via DIMP |
| wait | Pause for manual inspection |
| flattening | Transform to CSV (requires CRTDL) |
| send | Upload to destination server |
| validation | Validate FHIR data against profiles |
| csv_conversion | Convert to CSV (placeholder) |
| parquet_conversion | Convert to Parquet (placeholder) |

Rules:

  • One import step must be first (torch, local_import, or http_import)
  • The wait step cannot be first, and two wait steps cannot be consecutive
  • Flattening requires CRTDL input
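
The ordering rules above can be sketched as a small pre-flight check on enabled_steps. This is an illustration under assumptions, not Aether's validator: `check_enabled_steps` is a hypothetical name, and the CRTDL requirement is left out because it depends on the job input rather than on the step list.

```python
IMPORT_STEPS = {"torch", "local_import", "http_import"}


def check_enabled_steps(steps: list[str]) -> list[str]:
    """Return rule violations for a step list; an empty list means it passes."""
    problems = []

    # Requiring an import step first also rules out a leading wait step.
    if not steps or steps[0] not in IMPORT_STEPS:
        problems.append("first step must be torch, local_import, or http_import")

    # Compare each step with its successor to find back-to-back waits.
    for prev, cur in zip(steps, steps[1:]):
        if prev == "wait" and cur == "wait":
            problems.append("wait steps cannot be consecutive")
            break

    return problems


print(check_enabled_steps(["local_import", "dimp", "flattening"]))  # []
```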

Retry

yaml
retry:
  max_attempts: 5
  initial_backoff_ms: 1000
  max_backoff_ms: 30000

| Option | Type | Default | Range | Description |
|---|---|---|---|---|
| max_attempts | int | 5 | 1-10 | Max retry attempts for transient errors |
| initial_backoff_ms | int | 1000 | - | Initial backoff delay |
| max_backoff_ms | int | 30000 | - | Max backoff delay |

Exponential backoff: wait = min(initial_backoff_ms * 2^attempt, max_backoff_ms)
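
As a worked example, the formula can be evaluated for each attempt index. The sketch assumes the attempt counter starts at 0, which this page does not state. `backoff_schedule` is a hypothetical helper, not part of Aether.

```python
def backoff_schedule(
    max_attempts: int = 5,
    initial_backoff_ms: int = 1000,
    max_backoff_ms: int = 30000,
) -> list[int]:
    """Evaluate min(initial * 2^attempt, max) for attempt = 0 .. max_attempts - 1."""
    return [
        min(initial_backoff_ms * 2 ** attempt, max_backoff_ms)
        for attempt in range(max_attempts)
    ]


print(backoff_schedule())                # [1000, 2000, 4000, 8000, 16000]
print(backoff_schedule(max_attempts=8))  # [1000, 2000, 4000, 8000, 16000, 30000, 30000, 30000]
```

With the defaults the cap is never reached. It only applies from the sixth attempt onward, where 32000 ms would exceed max_backoff_ms.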

Compression

yaml
compression:
  enabled: true
  level: "default"

| Option | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Enable zstd compression |
| level | string | "default" | Compression level |

Compression levels:

| Level | Speed | Ratio | Use Case |
|---|---|---|---|
| fastest | ~500 MB/s | ~3-4x | Large datasets, CPU-constrained |
| default | ~200 MB/s | ~4-5x | Balanced (recommended) |
| better | ~100 MB/s | ~5-6x | Storage-constrained |
| best | ~50 MB/s | ~6-7x | Archival |

Output files use the .ndjson.zst extension when compression is enabled. Aether auto-detects and reads both compressed and uncompressed files.

Jobs Directory

yaml
jobs_dir: "./jobs"

Directory for job state and data files.

Environment Variables

All string values support environment variable substitution using the ${VAR} syntax:

yaml
services:
  torch:
    username: "${TORCH_USERNAME}"
    password: "${TORCH_PASSWORD}"
  send:
    url: "${FHIR_SERVER_URL}"
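
To illustrate what ${VAR} substitution does to a string value, here is a minimal sketch. It is not Aether's implementation: `expand_env` is a hypothetical helper, and raising an error for an unset variable is an assumption, since this page does not describe that case.

```python
import os
import re

# Matches ${NAME} where NAME is a conventional environment variable name.
_VAR_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: str, env=None) -> str:
    """Replace every ${NAME} in value with the matching entry from env."""
    env = os.environ if env is None else env

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name not in env:
            # Assumption: fail loudly instead of substituting an empty string.
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _VAR_PATTERN.sub(replace, value)


print(expand_env("${FHIR_SERVER_URL}", {"FHIR_SERVER_URL": "https://fhir-server.example.com/fhir"}))
# https://fhir-server.example.com/fhir
```

Passing an explicit `env` mapping, as in the last line, keeps the example independent of the variables set in the current shell.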

Example Configurations

TORCH + DIMP

yaml
services:
  torch:
    base_url: "https://torch.hospital.org"
    username: "${TORCH_USER}"
    password: "${TORCH_PASS}"
  dimp:
    url: "http://dimp:32861/fhir"

pipeline:
  enabled_steps:
    - torch
    - dimp

jobs_dir: "./jobs"

Local Import with Flattening

yaml
services:
  local_import:
    dir: "/data/fhir"
  dimp:
    url: "http://dimp:32861/fhir"
  flattening:
    service_url: "http://fhir-flattener:8000"
    lookup_path: "/config/lookup.json"

pipeline:
  enabled_steps:
    - local_import
    - dimp
    - flattening

compression:
  enabled: true
  level: "default"

jobs_dir: "./jobs"

Full Pipeline with Send

yaml
services:
  torch:
    base_url: "https://torch.hospital.org"
    username: "${TORCH_USER}"
    password: "${TORCH_PASS}"
  dimp:
    url: "http://dimp:32861/fhir"
  send:
    send_as: "transfer_load"
    url: "https://transfer.mii.de/fhir"
    auth:
      oauth_issuer_uri: "${OAUTH_ISSUER}"
      oauth_client_id: "${OAUTH_CLIENT}"
      oauth_client_secret: "${OAUTH_SECRET}"
    transfer:
      project_identifier: "MII-PROJECT"
      organization_identifier: "hospital.example.de"

pipeline:
  enabled_steps:
    - torch
    - dimp
    - send

retry:
  max_attempts: 5

compression:
  enabled: true

jobs_dir: "/data/aether/jobs"
