Dataset Workflow
This stage prepares a reproducible local dataset state before simulation submission.
1. Discover Available Data Sources
kub-dataset list-dmps
kub-dataset list-locations --show-versions
To list locations from a single backend only:
kub-dataset list-locations --dmp girder-unistra --show-versions
2. Pull Location Dataset
kub-dataset pull arz \
--version 0.1.0 \
--cemdb-root cemdb/locations \
--dmp girder-unistra
To pass an API key explicitly, add --api-key:
kub-dataset pull arz \
--version 0.1.0 \
--cemdb-root cemdb/locations \
--dmp girder-unistra \
--api-key "$GIRDER_API_KEY"
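Because the command above reads the key from the GIRDER_API_KEY environment variable, an empty or unset variable would pass an empty string to --api-key. A minimal sketch of a guard that fails early in that case follows; the helper name require_api_key is hypothetical and not part of kub-dataset.

```shell
# Hypothetical helper (not part of kub-dataset): stop early when the
# GIRDER_API_KEY environment variable is unset or empty.
require_api_key() {
  if [ -z "${GIRDER_API_KEY:-}" ]; then
    echo "GIRDER_API_KEY is not set" >&2
    return 1
  fi
}

# Intended use, shown as a comment so the sketch has no side effects:
#   require_api_key && kub-dataset pull arz --version 0.1.0 \
#     --cemdb-root cemdb/locations --dmp girder-unistra \
#     --api-key "$GIRDER_API_KEY"
```

Chaining with && means the pull is attempted only when the guard succeeds.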
3. Pull Simulator Dataset
kub-dataset pull-simulator \
--version 0.2.0 \
--cemdb-root cemdb \
--force
4. Verify Local Dataset State
kub-dataset summary arz --cemdb-root cemdb/locations
kub-dataset manifest-show arz --version 0.1.0 --cemdb-root cemdb/locations
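The pull and verification commands shown so far can be grouped into one function so that a failed pull stops before the checks run. This is only a sketch: the function name pull_and_verify and the KUB_DATASET_BIN override are assumptions made here for illustration, and the flags are limited to those used on this page.

```shell
# Hypothetical wrapper: pull one location version, then inspect it.
# KUB_DATASET_BIN is an assumption of this sketch; it lets you substitute
# another command (for example echo) to preview the calls without running them.
pull_and_verify() {
  pv_location="$1"
  pv_version="$2"
  pv_root="${3:-cemdb/locations}"
  pv_bin="${KUB_DATASET_BIN:-kub-dataset}"

  # Stop at the first failing step so later checks never run on a bad pull.
  "$pv_bin" pull "$pv_location" --version "$pv_version" \
    --cemdb-root "$pv_root" --dmp girder-unistra || return 1
  "$pv_bin" summary "$pv_location" --cemdb-root "$pv_root" || return 1
  "$pv_bin" manifest-show "$pv_location" --version "$pv_version" \
    --cemdb-root "$pv_root"
}

# Intended use, shown as a comment:
#   pull_and_verify arz 0.1.0
```

Setting KUB_DATASET_BIN=echo before the call prints the three command lines instead of executing them, which is a cheap way to review the arguments.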
5. Optional: Track Pulls With DVC
kub-dataset init
kub-dataset pull arz --version 0.1.0 --dvc --force --cemdb-root cemdb/locations
kub-dataset status
6. Expected Layout Check
find cemdb/locations/arz/v0.1.0 -maxdepth 2 -type d | sort
You should see key directories such as geo/, weather/, scenarios/, preprocessing/, and simulations/.
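The visual check above can also be scripted. The sketch below assumes the five directories named above are all required for the dataset version you pulled, which may not hold for every dataset; the function name check_layout is hypothetical.

```shell
# Hypothetical helper: report which of the expected directories are missing
# under a dataset version root. The list mirrors the directories named above
# and is an assumption, not an exhaustive specification.
check_layout() {
  cl_root="$1"
  cl_missing=0
  for cl_dir in geo weather scenarios preprocessing simulations; do
    if [ ! -d "$cl_root/$cl_dir" ]; then
      echo "missing: $cl_root/$cl_dir" >&2
      cl_missing=1
    fi
  done
  # Returns 0 when every directory exists, 1 otherwise.
  return "$cl_missing"
}

# Intended use, shown as a comment:
#   check_layout cemdb/locations/arz/v0.1.0
```

A non-zero exit status makes the check usable as a gate in a script before simulation submission.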
7. Next Step
Proceed to Run Simulation.