Implementation Details
The TORCH REST API follows the Asynchronous Bulk Data Request Pattern.
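In the FHIR Bulk Data asynchronous pattern, the kick-off request is answered with 202 Accepted and a Content-Location header pointing to a status URL. The client polls that URL, which keeps returning 202 until the job is done and then returns 200. The following Python sketch illustrates such a polling loop. It assumes TORCH follows this generic behaviour; the function names, polling interval, and poll limit are illustrative and not part of TORCH.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Tuple


def http_get_status(url: str) -> Tuple[int, bytes]:
    """Fetch a URL and return (HTTP status code, response body)."""
    try:
        with urllib.request.urlopen(url) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx; convert that back into a plain status code.
        return error.code, error.read()


def wait_for_completion(
    status_url: str,
    get_status: Callable[[str], Tuple[int, bytes]] = http_get_status,
    interval_seconds: float = 5.0,  # assumed value, not a TORCH default
    max_polls: int = 120,           # assumed value, not a TORCH default
) -> bytes:
    """Poll the status URL until the job completes and return the final body."""
    for _ in range(max_polls):
        status, body = get_status(status_url)
        if status == 200:
            return body  # job finished
        if status != 202:
            raise RuntimeError(f"extraction failed with HTTP {status}")
        time.sleep(interval_seconds)  # still in progress, try again later
    raise TimeoutError(f"no result after {max_polls} polls")
```

Here `status_url` would be the value of the Content-Location header returned by the kick-off request. Passing `get_status` as a parameter keeps the loop testable without a running server.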
$extract-data Kick-off
The $extract-data endpoint initiates the extraction. It expects a FHIR Parameters resource with a crtdl parameter whose valueBase64Binary holds the Base64-encoded CRTDL definition. In all examples, TORCH is configured with the base URL http://localhost:8080.
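The request body described above can also be assembled programmatically. The following Python sketch builds a Parameters resource from raw CRTDL bytes and, optionally, the repeated patient parameters documented under Request Body Structure. The function name `build_parameters` is hypothetical, and the sketch is not a description of how the project's own script works.

```python
import base64
import json
from typing import Iterable


def build_parameters(crtdl: bytes, patient_ids: Iterable[str] = ()) -> dict:
    """Build a FHIR Parameters resource for the $extract-data kick-off."""
    parameters = [
        {
            "name": "crtdl",
            # valueBase64Binary carries the Base64-encoded CRTDL definition.
            "valueBase64Binary": base64.b64encode(crtdl).decode("ascii"),
        }
    ]
    # Optional: one "patient" parameter per known patient id.
    for patient_id in patient_ids:
        parameters.append({"name": "patient", "valueString": patient_id})
    return {"resourceType": "Parameters", "parameter": parameters}


if __name__ == "__main__":
    # Placeholder CRTDL content, for illustration only.
    body = build_parameters(b'{"placeholder": true}', ["<Patient Id 1>"])
    print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the request body with the Content-Type application/fhir+json.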
scripts/create-parameters.sh src/test/resources/CRTDL/CRTDL_observation.json | curl -s 'http://localhost:8080/fhir/$extract-data' -H "Content-Type: application/fhir+json" -d @- -v
Request Body Structure
The Parameters resource created by create-parameters.sh looks like this:
{
  "resourceType": "Parameters",
  "parameter": [
    {
      "name": "crtdl",
      "valueBase64Binary": "<Base64 encoded CRTDL>"
    }
  ]
}
Optionally, patient IDs can be submitted for a known cohort, bypassing the cohort selection in the CRTDL:
{
  "resourceType": "Parameters",
  "parameter": [
    {
      "name": "crtdl",
      "valueBase64Binary": "<Base64 encoded CRTDL>"
    },
    {
      "name": "patient",
      "valueString": "<Patient Id 1>"
    },
    {
      "name": "patient",
      "valueString": "<Patient Id 2>"
    }
  ]
}
Result Files
Upon successful completion, the data extraction results consist of multiple NDJSON files:
- One NDJSON file per patient batch, each containing FHIR transaction Bundles
- One core.ndjson file, containing a single FHIR transaction Bundle with all non-patient-specific resources
Patient Batch Files
Each patient batch NDJSON file contains:
- One transaction Bundle per patient
- Each Bundle includes:
  - exactly one Patient resource
  - all patient-specific resources extracted according to the CRTDL (e.g. Encounter, Condition, Observation, etc.)
Patient batch files can be processed independently and in any order.
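The structure described above can be checked when reading a patient batch file. The following Python sketch parses NDJSON lines into Bundles and verifies that each one is a transaction Bundle with exactly one Patient resource. The function names are hypothetical; only the standard FHIR Bundle layout (`entry[].resource.resourceType`) is assumed.

```python
import json
from collections import Counter
from typing import Iterable, Iterator


def iter_bundles(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed Bundle per non-empty NDJSON line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def check_patient_bundle(bundle: dict) -> Counter:
    """Validate one patient Bundle and return its resource type counts."""
    if bundle.get("resourceType") != "Bundle" or bundle.get("type") != "transaction":
        raise ValueError("expected a FHIR transaction Bundle")
    counts = Counter(
        entry["resource"]["resourceType"]
        for entry in bundle.get("entry", [])
        if "resource" in entry
    )
    if counts["Patient"] != 1:
        raise ValueError(f"expected exactly one Patient, found {counts['Patient']}")
    return counts
```

Because each line is a self-contained Bundle, a file can be streamed line by line, for example with `for bundle in iter_bundles(open(path, encoding="utf-8")): check_patient_bundle(bundle)`.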
Core Bundle File
The core.ndjson file contains:
- a single transaction Bundle
- all non-patient-specific resources (e.g. shared reference data such as Medication)
For referential integrity, core.ndjson must be processed before any patient batch files.
For this reason, the provided transfer script uploads core.ndjson first.
If core.ndjson contains resources but no patient batch files are present, no patient survived the extraction, but the core (non-patient) resources were still loaded.
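The ordering rule above can be expressed as a small helper. The following Python sketch puts core.ndjson first and the remaining NDJSON files after it, and flags the case where only the core file is present. It is not the provided transfer script; the function names are hypothetical, and sorting the batch files by name is only a convenience, since they may be processed in any order.

```python
from pathlib import PurePath
from typing import Iterable, List

CORE_FILE_NAME = "core.ndjson"


def upload_order(file_names: Iterable[str]) -> List[str]:
    """Return the NDJSON files with core.ndjson first, then the batch files."""
    ndjson = [name for name in file_names if name.endswith(".ndjson")]
    core = [name for name in ndjson if PurePath(name).name == CORE_FILE_NAME]
    batches = sorted(name for name in ndjson if PurePath(name).name != CORE_FILE_NAME)
    return core + batches


def only_core_present(file_names: Iterable[str]) -> bool:
    """True if core.ndjson exists but no patient batch file does.

    This checks file names only; whether core.ndjson actually contains
    resources must be checked separately.
    """
    ordered = upload_order(file_names)
    return len(ordered) == 1 and PurePath(ordered[0]).name == CORE_FILE_NAME
```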
Example
Given:
- Bundle size: 20
- 100 patients
- Per patient:
  - 1 Encounter
  - 1 Diagnosis
- Shared references:
  - 30 Medication resources
The extraction result will contain:
- 5 patient batch NDJSON files (each containing 20 transaction Bundles)
- 1 core.ndjson file
Each patient batch file contains:
- one transaction Bundle per patient
- each Bundle includes:
  - 1 Patient
  - ≥1 Diagnosis
  - ≥1 Encounter
The core.ndjson contains:
- a single transaction Bundle
- 30 Medication resources
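The file count in this example follows from dividing the number of patients by the bundle size. The following Python sketch reproduces that arithmetic. Rounding up for a patient count that is not a multiple of the bundle size is an assumption; the example only covers the evenly divisible case.

```python
import math


def expected_patient_batch_files(patient_count: int, bundle_size: int) -> int:
    """Number of patient batch files for a given patient count and bundle size.

    Assumption: a final, partially filled batch still gets its own file.
    """
    if bundle_size <= 0:
        raise ValueError("bundle_size must be positive")
    return math.ceil(patient_count / bundle_size)


# The example above: 100 patients with a bundle size of 20.
print(expected_patient_batch_files(100, 20))  # prints 5
```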