kub-dataset: DVC and Authentication

1. Purpose

This page covers:

  • DVC initialization and status commands

  • Authentication commands for configured Keycloak providers

  • --dvc integration used with dataset pulls

For full CLI details, see kub-dataset: Full Reference.

2. DVC Commands

kub-dataset init [--path <path>]
kub-dataset status [--path <path>]

2.1. init

Initialize DVC metadata in a repository. Without --path, the current repository is used.

# Initialize in current repository
kub-dataset init

# Initialize in an explicit path
kub-dataset init --path /path/to/repo
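
In scripts it can be convenient to run init only once. The following is a minimal sketch, assuming that a successful init leaves a .dvc directory in the repository (the workflow at the end of this page commits one). The function name ensure_dvc_init and the KUB_DATASET override variable are hypothetical and exist only for this example.

```shell
# Sketch: run `kub-dataset init` only when the repository has no .dvc directory.
# KUB_DATASET is a hypothetical override so the command can be replaced in tests.
KUB_DATASET="${KUB_DATASET:-kub-dataset}"

ensure_dvc_init() {
    repo="${1:-.}"
    # Assumption: an initialized repository contains a .dvc directory.
    if [ -d "$repo/.dvc" ]; then
        echo "DVC already initialized in $repo"
        return 0
    fi
    "$KUB_DATASET" init --path "$repo"
}
```

Calling ensure_dvc_init /path/to/repo twice should run init at most once, provided the assumption about .dvc holds.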

2.2. status

Show DVC state and tracked datasets.

kub-dataset status
kub-dataset -v status

3. Authentication Commands

kub-dataset login --provider <provider-id> [--username <user> --password <pass>] [--timeout <seconds>]
kub-dataset logout [--provider <provider-id>|--all]
kub-dataset whoami [-v]

3.1. login

Authenticate to a configured provider.

# Interactive login
kub-dataset login --provider hidalgo2

# Non-interactive login
kub-dataset login --provider hidalgo2 --username "$USER" --password "$PASSWORD"
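
For automated jobs, the non-interactive form can be wrapped so that credentials come from the environment. This is a sketch only: the variable names KUB_USERNAME, KUB_PASSWORD and KUB_DATASET and the function login_from_env are hypothetical, and the optional second argument is simply forwarded to --timeout as listed in the synopsis above.

```shell
# Sketch: non-interactive login with credentials taken from the environment.
# KUB_USERNAME, KUB_PASSWORD and KUB_DATASET are hypothetical names.
KUB_DATASET="${KUB_DATASET:-kub-dataset}"

login_from_env() {
    provider="$1"
    timeout="${2:-}"   # optional; forwarded to --timeout (seconds) when given
    if [ -z "${KUB_USERNAME:-}" ] || [ -z "${KUB_PASSWORD:-}" ]; then
        echo "KUB_USERNAME and KUB_PASSWORD must be set" >&2
        return 1
    fi
    # Build the argument list step by step so values with spaces stay intact.
    set -- login --provider "$provider" \
        --username "$KUB_USERNAME" --password "$KUB_PASSWORD"
    if [ -n "$timeout" ]; then
        set -- "$@" --timeout "$timeout"
    fi
    "$KUB_DATASET" "$@"
}
```

A job would then call login_from_env hidalgo2, or login_from_env hidalgo2 60 to pass a timeout.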

3.2. logout / whoami

Remove stored sessions, or list the active ones.

# Remove one provider session
kub-dataset logout --provider hidalgo2

# Remove all sessions
kub-dataset logout --all

# Display active sessions and identities
kub-dataset whoami
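
One possible pattern for short-lived jobs is to log in, run a single command, and always log out afterwards. The sketch below uses only the login and logout forms shown on this page; the function with_session and the KUB_DATASET override variable are hypothetical.

```shell
# Sketch: run a command inside a provider session and always log out afterwards.
# KUB_DATASET is a hypothetical override so the command can be replaced in tests.
KUB_DATASET="${KUB_DATASET:-kub-dataset}"

with_session() {
    provider="$1"
    shift
    # Stop early if the login itself fails.
    "$KUB_DATASET" login --provider "$provider" || return 1
    status=0
    "$@" || status=$?
    # Log out whether or not the wrapped command succeeded.
    "$KUB_DATASET" logout --provider "$provider"
    return "$status"
}
```

For example, with_session hidalgo2 kub-dataset whoami would log in, show the active sessions, and log out again, returning the exit status of the wrapped command.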

4. Pull with DVC Tracking

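Add --dvc to a pull command to generate .dvc tracking files for the downloaded data and to update the ignore rules (.gitignore), so that Git tracks the small .dvc files rather than the data itself.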

# Location dataset pull with tracking
kub-dataset pull kernante --version 0.98.0 --dvc --api-key "$GIRDER_API_KEY"

# Simulator pull with tracking
kub-dataset pull-simulator --version 0.1.0 --dvc --force
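
After a tracked pull, it can help to see which tracking and ignore files now exist. The following is a minimal sketch that relies only on find; the default root cemdb is taken from the git add paths used on this page, and the function name list_tracking_files is hypothetical.

```shell
# Sketch: list the .dvc tracking files and .gitignore files below a data root.
# The default root "cemdb" is an assumption based on the paths used on this page.
list_tracking_files() {
    root="${1:-cemdb}"
    find "$root" -type f \( -name '*.dvc' -o -name '.gitignore' \) | sort
}
```

Running list_tracking_files before and after a pull with --dvc gives a quick view of what the pull added.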

5. Typical Workflow

# 1) One-time setup
kub-dataset init
git add .dvc .dvcignore
git commit -m "Initialize DVC"

# 2) Pull tracked datasets
kub-dataset pull kernante --version 0.98.0 --dvc --force
kub-dataset pull-simulator --version 0.1.0 --dvc --force

# 3) Check and commit tracking files
kub-dataset status
# Patterns are quoted so that git, not the shell, expands them
git add 'cemdb/locations/*.dvc' 'cemdb/simulators/**/*.dvc' 'cemdb/**/.gitignore'
git commit -m "Track KUB datasets with DVC"
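
The three steps above can also be collected into one function. This is a sketch under stated assumptions, not a definitive script: DRY_RUN, run and track_datasets are hypothetical names, the setup step is skipped when a .dvc directory already exists, and the glob patterns are quoted so that git rather than the shell expands them.

```shell
# Sketch: the workflow steps as one function, with a dry-run switch.
# DRY_RUN is a hypothetical variable: when set to 1, commands are printed
# instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

track_datasets() {
    location="$1"
    location_version="$2"
    simulator_version="$3"

    # 1) One-time setup, skipped when .dvc already exists
    if [ ! -d .dvc ]; then
        run kub-dataset init &&
        run git add .dvc .dvcignore &&
        run git commit -m "Initialize DVC" || return 1
    fi

    # 2) Pull tracked datasets
    run kub-dataset pull "$location" --version "$location_version" --dvc --force || return 1
    run kub-dataset pull-simulator --version "$simulator_version" --dvc --force || return 1

    # 3) Check and commit tracking files
    run kub-dataset status || return 1
    run git add 'cemdb/locations/*.dvc' 'cemdb/simulators/**/*.dvc' 'cemdb/**/.gitignore' || return 1
    run git commit -m "Track KUB datasets with DVC"
}
```

With DRY_RUN=1 set, track_datasets kernante 0.98.0 0.1.0 only prints the commands it would run, which is a cheap way to review the sequence before executing it.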