ModelOps
ModelOps is the control-plane domain for managed models in VANESSA. It sits beside the GenAI platform capability/provider/deployment architecture and does not replace it.
Responsibilities
- Maintain the managed model catalog and stable identity records.
- Enforce lifecycle transitions for local and cloud-hosted models.
- Centralize ownership, visibility, and invocation eligibility.
- Persist validation history and current validation state.
- Track model usage in generic daily rollups.
- Expose only eligible models to deployment bindings as managed-model resources.
Canonical Model Record
model_registry remains the anchor record, but it now carries ModelOps metadata:
- node_id
- global_model_id
- task_key
- category
- hosting_kind
- runtime_mode_policy
- lifecycle_state
- visibility_scope
- validation summary fields
- lightweight version fields (model_version, source, revision, checksum)
task_key and category are the canonical task classifiers. Older compatibility fields are no longer part of the active ModelOps contract.
Artifact Separation
Local model artifacts are tracked separately from catalog metadata:
- model_registry: control metadata and lifecycle state
- model_artifacts: storage path, artifact status, checksum/provenance, runtime requirements
This keeps local file concerns out of the main catalog record while still allowing validation and UI inspection.
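The split can be pictured with two small record types. This is only a sketch: the table and field names come from this page, but the Python types, the join on global_model_id, and the describe helper are assumptions made for illustration and may not match the real schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelRegistryRecord:
    """Control metadata and lifecycle state (types are assumed)."""
    node_id: str
    global_model_id: str
    task_key: str
    category: str
    hosting_kind: str
    lifecycle_state: str
    visibility_scope: str


@dataclass
class ModelArtifactRecord:
    """Local file concerns, kept out of the catalog record."""
    global_model_id: str  # assumed join key; the real foreign key may differ
    storage_path: str
    artifact_status: str
    checksum: Optional[str] = None


def describe(model: ModelRegistryRecord, artifact: Optional[ModelArtifactRecord]) -> str:
    """Hypothetical helper: combine both records for UI inspection."""
    location = artifact.storage_path if artifact else "no local artifact"
    return f"{model.global_model_id} [{model.lifecycle_state}] -> {location}"
```

A cloud-hosted model would simply have no artifact record, which is why the helper accepts None.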
Lifecycle
Local models:
created -> registered -> validated -> active <-> inactive -> unregistered
Cloud models:
registered -> validated -> active <-> inactive -> unregistered
Rules enforced by the backend:
- activation requires a current successful validation
- validation requires a registered or previously managed model state
- active models must be deactivated before unregister
- models must be unregistered before deletion
- cloud models are blocked in offline runtime mode
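The chains and rules above can be sketched as a small transition check. This is an illustration, not the backend's code: the function and exception names are hypothetical, the edge table follows the two chains literally (so revalidation paths are not modeled), and the offline rule is assumed to apply at activation time.

```python
class TransitionError(Exception):
    """Raised when a requested lifecycle transition is not allowed."""


# Edges read directly off the two lifecycle chains above.
TRANSITIONS = {
    "created": {"registered"},
    "registered": {"validated"},
    "validated": {"active"},
    "active": {"inactive"},
    "inactive": {"active", "unregistered"},
    "unregistered": set(),
}

# Cloud models have no "created" state and start at "registered".
INITIAL_STATE = {"local": "created", "cloud": "registered"}


def check_transition(current, target, *, hosting_kind,
                     is_validation_current, runtime_mode):
    """Return the target state if the move is allowed, else raise."""
    if target not in TRANSITIONS.get(current, set()):
        raise TransitionError(f"{current} -> {target} is not allowed")
    if target == "active":
        if not is_validation_current:
            raise TransitionError("activation requires a current successful validation")
        # Assumption: the offline block is enforced when activating.
        if hosting_kind == "cloud" and runtime_mode == "offline":
            raise TransitionError("cloud models are blocked in offline runtime mode")
    return target
```

Because "active" only has an edge to "inactive", the rule that active models must be deactivated before unregister falls out of the table rather than needing a separate check.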
Validation
Validation history is append-only in model_validations.
- Cloud validation probes the configured OpenAI-compatible endpoint and credential.
- Local validation checks artifact existence and marks failures explicitly.
- Config changes reset validation freshness through is_validation_current.
model_registry keeps denormalized summary fields for fast reads, while model_validations stores the durable audit trail.
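The relationship between the append-only history and the denormalized summary can be sketched in memory as follows. The class, its method names, and the last_ok summary key are hypothetical; only model_validations, model_registry, and is_validation_current are taken from this page.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ValidationStore:
    history: list = field(default_factory=list)  # stands in for model_validations
    summary: dict = field(default_factory=dict)  # stands in for summary fields on model_registry

    def record(self, model_id, ok, detail=""):
        """Append a validation result and refresh the summary for fast reads."""
        self.history.append({
            "model_id": model_id,
            "ok": ok,
            "detail": detail,
            "at": datetime.now(timezone.utc),
        })  # rows are only ever appended, never updated or deleted
        self.summary[model_id] = {"last_ok": ok, "is_validation_current": True}

    def config_changed(self, model_id):
        """A config change makes the last validation stale without touching history."""
        if model_id in self.summary:
            self.summary[model_id]["is_validation_current"] = False

    def can_activate(self, model_id):
        """Activation needs a validation that is both successful and current."""
        s = self.summary.get(model_id)
        return bool(s and s["last_ok"] and s["is_validation_current"])
```

Note that config_changed only flips the summary flag; the audit trail keeps every earlier result.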
Access and Sharing
Visibility is normalized around ModelOps scopes:
- private
- user
- group
- platform
Assignment tables remain part of the implementation, but access decisions flow only through the ModelOps access path.
Platform Integration
Model activation is not the same as deployment binding.
- ModelOps decides whether a model is active, validated, visible, and invokable.
- /control/platform decides which provider/deployment binding should use which managed-model resources.
- Deployment binding pickers should only show ModelOps-eligible models for the relevant capability.
- For local model-bearing providers, there is an extra state between "downloaded in ModelOps" and "bound into a deployment": the model must be loaded into the provider's local runtime slot so the runtime advertises it through /v1/models.
This separation is especially important for embeddings, where a provider binding may exist before an operator selects the bound model resource.
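As a rough illustration of what a binding picker might do with ModelOps data, the filter below keeps only active, currently validated models for one capability. The function name, the dict shape, and the idea that a capability maps one-to-one onto task_key are all assumptions; visibility and invocation checks are left out because their rules are not described here.

```python
from typing import Dict, List


def eligible_models(models: List[Dict], capability: str) -> List[Dict]:
    """Hypothetical picker filter: active, currently validated, matching capability."""
    return [
        m for m in models
        if m.get("lifecycle_state") == "active"
        and m.get("is_validation_current") is True
        and m.get("task_key") == capability  # assumed capability-to-task_key mapping
    ]
```

An empty result is a valid outcome, which matches the embeddings case where a provider binding exists before any model resource is selected.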
Current local-superadmin flow:
- Discover or download the model into ModelOps.
- Register and validate the managed model.
- Assign that managed model to the local llm_inference or embeddings provider slot from Platform Control.
- Wait for the matching local runtime controller to load the assigned model and advertise it through /v1/models.
- Use GET /v1/modelops/models/{id}/test-runtimes and POST /v1/modelops/models/{id}/test to verify the runtime is actually serving it.
- Activate the model.
- Bind the active validated managed model into a deployment profile resource list and choose its default resource when appropriate.
In this design, "downloaded" and "currently served by the runtime" are intentionally separate states.
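The ModelOps side of that flow maps onto an ordered series of calls. The sketch below only builds the method and path pairs; the helper name is hypothetical, and the slot assignment, runtime loading, and deployment binding steps are omitted because they happen in Platform Control, whose routes are not listed on this page.

```python
from typing import List, Tuple


def modelops_flow_calls(model_id: str) -> List[Tuple[str, str]]:
    """Return the ModelOps calls of the local-superadmin flow, in order."""
    base = f"/v1/modelops/models/{model_id}"
    return [
        ("POST", f"{base}/register"),
        ("POST", f"{base}/validate"),
        # Slot assignment and runtime loading happen here, outside ModelOps.
        ("GET", f"{base}/test-runtimes"),
        ("POST", f"{base}/test"),
        ("POST", f"{base}/activate"),
    ]
```

Keeping activation last reflects the flow's intent: the model becomes active only after the runtime has been shown to serve it.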
APIs
Canonical routes live under /v1/modelops/models:
- GET /v1/modelops/models
- POST /v1/modelops/models
- GET /v1/modelops/models/{id}
- POST /v1/modelops/models/{id}/register
- POST /v1/modelops/models/{id}/validate
- GET /v1/modelops/models/{id}/tests
- GET /v1/modelops/models/{id}/test-runtimes
- POST /v1/modelops/models/{id}/test
- POST /v1/modelops/models/{id}/activate
- POST /v1/modelops/models/{id}/deactivate
- POST /v1/modelops/models/{id}/unregister
- DELETE /v1/modelops/models/{id}
For local LLM validation flows, superadmins can now select a compatible llm_inference runtime just for the test action. This does not change the active deployment profile, and ModelOps records a failure instead of silently falling back when the chosen runtime is not actually serving the selected local artifact.
For both local LLM and embeddings flows, GET /v1/modelops/models/{id}/test-runtimes now exposes structured advertised runtime entries plus local-slot diagnostics such as loaded_managed_model_id, loaded_runtime_model_id, and load_state. Internal identifier matching still accepts raw path/source identifiers when needed, but those are no longer the primary user-facing advertised choices.
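A client could use those diagnostics to decide whether a runtime is worth testing against. The sketch below assumes the response can be treated as a list of dicts and that load_state uses the value "loaded"; neither the response shape nor the state values are specified on this page, and the function name and runtime_id key are hypothetical.

```python
from typing import Dict, List


def runtimes_serving(model_id: str, runtimes: List[Dict]) -> List[Dict]:
    """Keep runtime entries whose local slot reports this managed model as loaded."""
    return [
        r for r in runtimes
        if r.get("loaded_managed_model_id") == model_id
        and r.get("load_state") == "loaded"  # assumed state value
    ]


# Usage with made-up diagnostics entries
sample = [
    {"runtime_id": "rt-a", "loaded_managed_model_id": "m1", "load_state": "loaded"},
    {"runtime_id": "rt-b", "loaded_managed_model_id": "m2", "load_state": "loaded"},
    {"runtime_id": "rt-c", "loaded_managed_model_id": "m1", "load_state": "loading"},
]
serving = runtimes_serving("m1", sample)
```

If the list comes back empty, the test action should be expected to record a failure rather than fall back to another runtime.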
Runtime inference routes such as /v1/models/generate and /v1/models/inference remain product-facing, but they now resolve eligibility strictly through ModelOps.
Extension Points
The current implementation leaves room for:
- broader multimodal task taxonomies via task_key
- group-based sharing UI
- richer dependency tracking in model_runtime_dependencies
- multi-node global_model_id federation
- raw usage event storage alongside daily rollups
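To make the last item concrete, the sketch below shows how raw usage events could be reduced to daily rollups. The event shape (a model_id plus a timezone-aware at timestamp), the UTC day boundary, and the function name are assumptions for illustration; the stored rollup columns may differ.

```python
from collections import Counter
from datetime import timezone


def daily_rollups(events):
    """Count events per (model_id, UTC calendar day)."""
    counts = Counter()
    for event in events:
        # Timestamps are assumed to be timezone-aware datetimes.
        day = event["at"].astimezone(timezone.utc).date().isoformat()
        counts[(event["model_id"], day)] += 1
    return dict(counts)
```

Storing raw events alongside rollups would allow the rollups to be rebuilt with a function like this if the aggregation rules change.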
Current implementation status:
- schema, lifecycle, validation history, usage rollups, and deployment-binding eligibility are implemented
- group UI, federation sync, and richer validator plugins are intentionally deferred
For implementation boundaries and maintenance conventions, see the companion ModelOps maintenance note.