kub-dataset: Location Datasets

1. Purpose

This page documents kub-dataset commands used for location datasets under:

cemdb/locations/<location>/vX.Y.Z

Use these commands for archive creation, remote sync (Girder/CKAN), local summaries, data generation, and version maintenance.

For exhaustive option-by-option details, see kub-dataset: Full Reference.
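The layout above can be sketched as a small shell helper. This is illustrative only: `dataset_dir` is a hypothetical function, not part of kub-dataset, and it assumes that a CLI version such as `0.98.0` maps to a directory named `v0.98.0`, as the manifest path later on this page suggests.

```shell
# Hypothetical helper (not part of kub-dataset): build the on-disk
# directory for a location version under the cemdb root.
# Assumption: CLI version "0.98.0" corresponds to directory "v0.98.0".
dataset_dir() {
  location=$1
  version=$2
  root=${3:-cemdb/locations}
  # Strip a leading "v" so both "0.98.0" and "v0.98.0" are accepted.
  version=${version#v}
  printf '%s/%s/v%s\n' "$root" "$location" "$version"
}

dataset_dir kernante 0.98.0
```

The optional third argument overrides the root, mirroring what `--cemdb-root` does for the commands that accept it.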

2. Location Command Synopsis

# Local archive operations
kub-dataset pack <location> [options]
kub-dataset unpack <archive> [options]
kub-dataset list <archive>

# Remote discovery and sync
kub-dataset list-dmps [--format {table,json}]
kub-dataset list-locations [--dmp <ids>] [--show-versions] [--format {table,json}]
kub-dataset list-versions <location> [--dmp <id>]
kub-dataset push <location> --version <version> --api-key <key> [--dmp <id>]
kub-dataset pull <location> [--version <version>] [--dmp <id>] [--dvc]
kub-dataset delete <location> [--version <version>] --api-key <key>

# Local inspection and generation
kub-dataset summary [<location> ...] [--all] [--version <version>] [--format {table,json}] [--compact]
kub-dataset generate {meteo,iaq,both} <location> --start <date|datetime> --end <date|datetime> [--version <version>]

# Manifest and migration
kub-dataset manifest-show <location> [--version <version>]
kub-dataset manifest-regenerate <location> [--version <version>]
kub-dataset manifest-verify <location> [--version <version>]
kub-dataset manifest-validate [manifest|location] [--version <version>] [--cemdb-root <path>] [--strict]
kub-dataset fix <location> [--version <version>] [--policy-file <path>] [--dry-run]
kub-dataset policy list
kub-dataset policy show [policy_ref]
kub-dataset policy validate [policy_ref]
kub-dataset policy apply <location> [--version <version>] [--template <name>|--file <path>] [--dry-run]
kub-dataset policy-validate <policy-file> [-v]
kub-dataset migrate [<location>] [--execute]
kub-dataset migrate-layout [<location>|--all]
kub-dataset migrate-org-layout [<location>|--all]
kub-dataset migrate-simulator [<version>] [--dry-run]

# Version and naming maintenance
kub-dataset copy-version <location> <source-version> <target-version>
kub-dataset rename-version <location> <old-version> <new-version>
kub-dataset rename-location <old-name> <new-name>
kub-dataset set-current <location> <version>
kub-dataset duplicate <location> [<source-version>]

# Components and references
kub-dataset push-component <location> --version <version> --component {geo,config,inputs,preprocessing,all} --api-key <key>
kub-dataset pull-component <location> --version <version> --component {geo,config,inputs,preprocessing}
kub-dataset list-components <location> --version <version>
kub-dataset tag-reference <location> <run-id> [--type {single,ensemble}]
kub-dataset list-reference <location> [--type {single,ensemble}]

3. Core Workflows

3.1. Archive a Dataset

# Pack location inputs (simulation outputs excluded)
kub-dataset pack kernante --cemdb-root cemdb/locations

# Inspect archive content
kub-dataset list kernante_input.zip

# Restore archive
kub-dataset unpack kernante_input.zip --cemdb-root cemdb/locations
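The three archive steps above could be chained in a wrapper that stops at the first failure. The sketch below is a dry run by default: `RUN`, `archive_name`, and `archive_roundtrip` are hypothetical names, and the `<location>_input.zip` naming is inferred from the example rather than confirmed by the tool. It also assumes the archive is written to the current directory.

```shell
# Dry-run by default: commands are printed, not executed.
# Export RUN as an empty string to execute them for real.
RUN=${RUN-echo}
CEMDB_ROOT=${CEMDB_ROOT:-cemdb/locations}

# Assumption: pack writes <location>_input.zip, as in the example above.
archive_name() {
  printf '%s_input.zip\n' "$1"
}

# Pack, inspect, and restore one location, stopping at the first failure.
archive_roundtrip() {
  location=$1
  archive=$(archive_name "$location")
  $RUN kub-dataset pack "$location" --cemdb-root "$CEMDB_ROOT" || return 1
  $RUN kub-dataset list "$archive" || return 1
  $RUN kub-dataset unpack "$archive" --cemdb-root "$CEMDB_ROOT"
}

archive_roundtrip kernante
```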

3.2. Pull and Push Across DMPs

# Discover registered DMPs and locations
kub-dataset list-dmps
kub-dataset list-locations --show-versions

# Pull the latest version from Girder (the default DMP)
kub-dataset pull kernante --api-key "$GIRDER_API_KEY"

# Push a version to CKAN
kub-dataset push kernante \
  --version 0.98.0 \
  --dmp ckan-hidalgo2 \
  --api-key "$CKAN_API_KEY"
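Because the push example reads its key from an environment variable, an unset variable would pass an empty value to `--api-key`. One way to guard against that is a small wrapper, sketched here under stated assumptions: `push_version` and `RUN` are hypothetical names, and only flags shown in the synopsis are used.

```shell
# Dry-run by default: commands are printed, not executed.
# Export RUN as an empty string to execute them for real.
RUN=${RUN-echo}

# Hypothetical wrapper: refuse to push when the API key is empty.
push_version() {
  location=$1
  version=$2
  dmp=$3
  api_key=$4
  if [ -z "$api_key" ]; then
    echo "error: API key is empty; export it before pushing" >&2
    return 1
  fi
  # In dry-run mode, mask the key so it does not end up in logs.
  if [ "$RUN" = "echo" ]; then
    api_key='***'
  fi
  $RUN kub-dataset push "$location" --version "$version" --dmp "$dmp" --api-key "$api_key"
}

push_version kernante 0.98.0 ckan-hidalgo2 "${CKAN_API_KEY:-}" ||
  echo "push skipped: CKAN_API_KEY is not set" >&2
```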

3.3. Summaries and Input Generation

# Local summary for one or all datasets
kub-dataset summary kernante --cemdb-root cemdb/locations
kub-dataset summary --all --cemdb-root cemdb/locations

# Generate weather and IAQ inputs
kub-dataset generate meteo strasbourg --version 0.1.0 --start 2023-01-01 --end 2023-12-31
kub-dataset generate iaq   strasbourg --version 0.1.0 --start 2023-01-01 --end 2023-12-31
kub-dataset generate both  strasbourg --version 0.1.0 --start 2023-01-01 --end 2023-12-31
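A reversed or malformed date range is easy to pass to `generate` by mistake. The sketch below checks the range before calling the command. It is a simplification: `is_iso_date`, `generate_inputs`, and `RUN` are hypothetical names, and the check accepts only plain `YYYY-MM-DD` dates even though the synopsis also allows datetimes.

```shell
# Dry-run by default: commands are printed, not executed.
# Export RUN as an empty string to execute them for real.
RUN=${RUN-echo}

# Shape check only (YYYY-MM-DD); it does not validate calendar dates.
is_iso_date() {
  case $1 in
    [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical wrapper around "kub-dataset generate".
generate_inputs() {
  kind=$1 location=$2 version=$3 start=$4 end=$5
  case $kind in
    meteo|iaq|both) ;;
    *) echo "error: kind must be meteo, iaq, or both" >&2; return 1 ;;
  esac
  if ! is_iso_date "$start" || ! is_iso_date "$end"; then
    echo "error: dates must look like YYYY-MM-DD" >&2
    return 1
  fi
  # ISO dates sort lexicographically, so the earlier date sorts first.
  first=$(printf '%s\n%s\n' "$start" "$end" | LC_ALL=C sort | head -n 1)
  if [ "$first" != "$start" ]; then
    echo "error: --start must not be after --end" >&2
    return 1
  fi
  $RUN kub-dataset generate "$kind" "$location" --version "$version" --start "$start" --end "$end"
}

generate_inputs both strasbourg 0.1.0 2023-01-01 2023-12-31
```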

3.4. Manifest and Version Maintenance

# Manifest lifecycle
kub-dataset manifest-show kernante --version 0.98.0
kub-dataset manifest-regenerate kernante --version 0.98.0
kub-dataset manifest-verify kernante --version 0.98.0
kub-dataset manifest-validate cemdb/locations/kernante/v0.98.0/manifest.json --strict

# Dataset repair and policy management
kub-dataset fix kernante --version 0.98.0 --dry-run
kub-dataset policy list
kub-dataset policy apply kernante --version 0.98.0 --template legacy-default

# Version operations
kub-dataset copy-version kernante 0.98.0 0.99.0
kub-dataset set-current kernante 0.99.0
kub-dataset duplicate kernante
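The version operations above can be combined so that a new version becomes current only after its manifest verifies. This ordering is an assumption about a sensible workflow, not documented behavior, and `promote_version` and `RUN` are hypothetical names.

```shell
# Dry-run by default: commands are printed, not executed.
# Export RUN as an empty string to execute them for real.
RUN=${RUN-echo}

# Hypothetical workflow: copy a version, verify its manifest, then make
# it current. The && chain stops at the first failing command.
promote_version() {
  location=$1 source=$2 target=$3
  $RUN kub-dataset copy-version "$location" "$source" "$target" &&
  $RUN kub-dataset manifest-verify "$location" --version "$target" &&
  $RUN kub-dataset set-current "$location" "$target"
}

promote_version kernante 0.98.0 0.99.0
```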

4. Notes

  • kub-case-summary is deprecated. Use kub-dataset summary.

  • kub-dataset policy-validate is a compatibility alias. Prefer kub-dataset policy validate.

  • Legacy compatibility commands are still available but deprecated: list-locations-ckan, list-versions-ckan, pull-ckan, push-ckan.