
aether DUP pipeline coordinator

The DUP pipeline requires a data node that is set up with all the services the pipeline needs.

Please refer to the architecture of a data node here and a list of all the data node services here.

aether use

See the aether documentation here.

For an example configuration, see the base configuration in our example setup here.

aether simple example to get started

First, install aether locally by following the install instructions here.

To get started using aether, configure a simple pipeline as shown here.

Then, in the aether folder of your data node, run aether with the command aether pipeline --config base-pipeline-config-simple.yml start queries/example-crtdl.json

aether runs the pipeline and then prints the ID of your job, e.g. Job ID: 20260331_0915_5932b1e1-0ed5-4bab-902e-25f328209390. This ID corresponds directly to a folder in your jobs directory.
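If you want to script the lookup of the job folder, the following is a minimal sketch in Python. It assumes the output contains a line of the form "Job ID: <id>" as in the example above and that the jobs directory is called jobs. The function names and the directory name are illustrative assumptions, not part of aether.

```python
import re
from pathlib import Path
from typing import Optional

# Matches text such as "Job ID: 20260331_0915_5932b1e1-..." (format assumed
# from the example output above; IDs are taken to consist of letters, digits,
# underscores and hyphens).
JOB_ID_PATTERN = re.compile(r"Job ID:\s*([\w-]+)")


def extract_job_id(output: str) -> Optional[str]:
    """Return the first job ID found in the given text, or None if there is none."""
    match = JOB_ID_PATTERN.search(output)
    return match.group(1) if match else None


def job_folder(jobs_dir: Path, job_id: str) -> Path:
    """Return the path of the folder for this job inside the jobs directory."""
    return jobs_dir / job_id


if __name__ == "__main__":
    sample = "Job ID: 20260331_0915_5932b1e1-0ed5-4bab-902e-25f328209390"
    job_id = extract_job_id(sample)
    if job_id is not None:
        # "jobs" is an assumed directory name; adjust it to your setup.
        print(job_folder(Path("jobs"), job_id))
```

The sketch only parses text and builds a path. It does not run aether or check that the folder exists.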

For this simple example you will find your extracted data in the import folder in the directory of your specific job.

Note that aether always creates all necessary folders for all supported steps:

  1. import (TORCH export directory)
  2. pseudonymized (Pseudonymized data)
  3. validation (output information from the validation step - note this does not contain the data but validation results instead)
  4. csv (flattened output if csv is chosen)
  5. send (information about the send step)
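Because all step folders are created regardless of which steps actually ran, it can help to check which ones contain output. The following is a minimal sketch in Python that counts the files below each step folder of a job directory. The folder names are taken from the list above; the function name and the way the job directory is passed in are illustrative assumptions.

```python
from pathlib import Path
from typing import Dict

# Step folder names as given in the list above.
STEP_FOLDERS = ["import", "pseudonymized", "validation", "csv", "send"]


def summarize_job(job_dir: Path) -> Dict[str, int]:
    """Count the files below each step folder; a missing folder counts as 0."""
    summary = {}
    for step in STEP_FOLDERS:
        step_dir = job_dir / step
        if step_dir.is_dir():
            # Count regular files at any depth below the step folder.
            summary[step] = sum(1 for p in step_dir.rglob("*") if p.is_file())
        else:
            summary[step] = 0
    return summary


if __name__ == "__main__":
    import sys

    # Usage: python summarize_job.py <path to your job folder>
    if len(sys.argv) == 2:
        for step, count in summarize_job(Path(sys.argv[1])).items():
            print(f"{step}: {count} file(s)")
```

For the simple example above, you would expect only the import folder to report files.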

The DUP Reference Pipeline Detailed

Zooming in, the pipeline can be depicted in more detail as follows: