Metadata

Card Metadata

Both ModelCard and DataCard objects have a metadata attribute that stores information about the model or data. If you do not provide one, a default metadata object is created. When a card is registered, its metadata is updated with the latest information. In addition to the automatically generated attributes, the metadata object can hold custom information, such as a description.

User-Defined Attributes (Optional)

description: Description
Description object for your model

Description

Description is a simple data structure that can be used to store extra descriptive information about your model or data.

Args

summary: Optional[str]
Summary text, or the name of a markdown file that describes the model or data
sample_code: Optional[str]
Sample code that can be used to load and run the model or data
Notes: Optional[str]
Any additional information not captured by the other attributes

Example

from opsml import ModelCard
from opsml.types import Description, ModelCardMetadata

# logic that creates `interface` and `datacard` goes here
...

modelcard = ModelCard(
    name="my_model",
    repository="my_repo",
    contact="user",
    interface=interface,
    datacard_uid=datacard.uid,
    # summary can be plain text or the name of a markdown file
    metadata=ModelCardMetadata(
        description=Description(summary="my_summary.md"),
    ),
)

Docs

opsml.types.Description

Bases: BaseModel

Source code in opsml/types/extra.py
class Description(BaseModel):
    summary: Optional[str] = None
    sample_code: Optional[str] = None
    Notes: Optional[str] = None

    @field_validator("summary", mode="before")
    @classmethod
    def load_summary(cls, summary: Optional[str]) -> Optional[str]:
        if summary is None:
            return summary

        if ".md" in summary.lower():
            try:
                mkdwn_path = FileUtils.find_filepath(name=summary)
                with open(mkdwn_path, "r", encoding="utf-8") as file_:
                    summary = file_.read()

            except IndexError as idx_error:
                logger.info(f"Could not load markdown file {idx_error}")

        return summary
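
The validator above treats any summary containing ".md" as a file name and falls back to the raw string when no file is found. The sketch below mirrors that behavior using only the standard library. `load_summary` and `search_root` are hypothetical names, and the recursive `rglob` search is an assumed stand-in for `FileUtils.find_filepath`, whose lookup rules are not shown here.

```python
import tempfile
from pathlib import Path
from typing import Optional


def load_summary(summary: Optional[str], search_root: Path) -> Optional[str]:
    """Return the markdown file's text when `summary` names a .md file, else the string itself."""
    if summary is None:
        return None

    if ".md" in summary.lower():
        # Assumed stand-in for FileUtils.find_filepath: first match under search_root
        matches = sorted(search_root.rglob(summary))
        if matches:
            return matches[0].read_text(encoding="utf-8")
        # No file found: keep the original string, as the validator does

    return summary


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "docs").mkdir()
    (root / "docs" / "my_summary.md").write_text("# My model\n", encoding="utf-8")

    from_file = load_summary("my_summary.md", root)  # file contents
    missing = load_summary("missing.md", root)  # unchanged file name
    inline = load_summary("A linear regression baseline", root)  # unchanged text
```

One consequence worth noting: a mistyped file name does not raise, so the stored summary would simply be the file name itself.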

opsml.types.ModelCardMetadata

Bases: BaseModel

Create modelcard metadata

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| interface_type | str | Type of interface | "" |
| description | Description | Description for your model | Description() |
| data_schema | DataSchema | Data schema for your model | DataSchema() |
| runcard_uid | Optional[str] | RunCard associated with the ModelCard | None |
| pipelinecard_uid | Optional[str] | Associated PipelineCard | None |
| auditcard_uid | Optional[str] | Associated AuditCard | None |

Source code in opsml/types/model.py
class ModelCardMetadata(BaseModel):
    """Create modelcard metadata

    Args:
        interface_type:
            Type of interface
        description:
            Description for your model
        data_schema:
            Data schema for your model
        runcard_uid:
            RunCard associated with the ModelCard
        pipelinecard_uid:
            Associated PipelineCard
        auditcard_uid:
            Associated AuditCard
    """

    interface_type: str = ""
    description: Description = Description()
    data_schema: DataSchema = DataSchema()
    runcard_uid: Optional[str] = None
    pipelinecard_uid: Optional[str] = None
    auditcard_uid: Optional[str] = None

    model_config = ConfigDict(protected_namespaces=("protect_",))

opsml.types.DataCardMetadata

Bases: BaseModel

Create a DataCard metadata

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| interface_type | str | Type of interface that contains data | "" |
| data_type | str | Type of data | "" |
| description | Description | Description for your data | Description() |
| feature_map | Dict[str, Feature] | Map of features in data (inferred when converting to pyarrow table) | {} |
| additional_info | Dict[str, Union[float, int, str]] | Dictionary of additional info to associate with data (e.g. if the data is a tokenized dataset, metadata could be {"vocab_size": 200}) | {} |
| runcard_uid | Optional[str] | Id of RunCard that created the DataCard | None |
| pipelinecard_uid | Optional[str] | Associated PipelineCard | None |
| auditcard_uid | Optional[str] | Associated AuditCard | None |

Source code in opsml/types/data.py
class DataCardMetadata(BaseModel):
    """Create a DataCard metadata

    Args:
        interface_type:
            Type of interface that contains data
        data_type:
            Type of data
        description:
            Description for your data
        feature_map:
            Map of features in data (inferred when converting to pyarrow table)
        additional_info:
            Dictionary of additional info to associate with data
            (e.g. if data is tokenized dataset, metadata could be {"vocab_size": 200})
        runcard_uid:
            Id of RunCard that created the DataCard
        pipelinecard_uid:
            Associated PipelineCard
        auditcard_uid:
            Associated AuditCard
    """

    interface_type: str = ""
    data_type: str = ""
    description: Description = Description()
    feature_map: Dict[str, Feature] = {}
    additional_info: Dict[str, Union[float, int, str]] = {}
    runcard_uid: Optional[str] = None
    pipelinecard_uid: Optional[str] = None
    auditcard_uid: Optional[str] = None

Registered Model Metadata

One benefit of the model registration process (especially when auto-converting to ONNX) is that it produces model metadata that downstream applications can use to load and run models through APIs or batch jobs. The example below shows the metadata produced for a registered model.

Example

{
    "model_name": "regression",
    "model_class": "SklearnEstimator",
    "model_type": "LinearRegression",
    "model_interface": "SklearnModel",
    "onnx_uri": "opsml-root:/OPSML_MODEL_REGISTRY/opsml/regression/v1.4.0/onnx-model.onnx",
    "onnx_version": "1.14.1",
    "model_uri": "opsml-root:/OPSML_MODEL_REGISTRY/opsml/regression/v1.4.0/trained-model.joblib",
    "model_version": "1.4.0",
    "model_repository": "opsml",
    "sample_data_uri": "opsml-root:/OPSML_MODEL_REGISTRY/opsml/regression/v1.4.0/sample-model-data.joblib",
    "opsml_version": "2.0.0",
    "data_schema": {
        "data_type": "numpy.ndarray",
        "input_features": {
            "inputs": {
                "feature_type": "float64",
                "shape": [
                    1,
                    10
                ]
            }
        },
        "output_features": {
            "outputs": {
                "feature_type": "float64",
                "shape": [
                    1,
                    1
                ]
            }
        },
        "onnx_input_features": {
            "predict": {
                "feature_type": "tensor(float)",
                "shape": [
                    null,
                    10
                ]
            }
        },
        "onnx_output_features": {
            "variable": {
                "feature_type": "tensor(float)",
                "shape": [
                    null,
                    1
                ]
            }
        },
        "onnx_data_type": null,
        "onnx_version": "1.14.1"
    }
}
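
As a rough illustration of how a downstream application might consume this metadata, the sketch below parses a trimmed copy of the JSON above and builds a placeholder input that matches the declared ONNX input shape. `feature_shapes` and `zeros` are hypothetical helpers, not part of opsml, and filling a `null` dimension with a batch size is an assumption about how the consumer wants to treat dynamic axes.

```python
import json
from typing import Any, Dict, List, Optional

# Trimmed copy of the registered model metadata shown above
METADATA_JSON = """
{
    "model_name": "regression",
    "model_version": "1.4.0",
    "data_schema": {
        "data_type": "numpy.ndarray",
        "input_features": {
            "inputs": {"feature_type": "float64", "shape": [1, 10]}
        },
        "onnx_input_features": {
            "predict": {"feature_type": "tensor(float)", "shape": [null, 10]}
        }
    }
}
"""


def feature_shapes(metadata: Dict[str, Any], key: str) -> Dict[str, List[Optional[int]]]:
    """Map each feature name under data_schema[key] to its declared shape."""
    return {name: spec["shape"] for name, spec in metadata["data_schema"][key].items()}


def zeros(shape: List[Optional[int]], batch_size: int = 1) -> Any:
    """Build nested lists of 0.0, replacing dynamic (null) dimensions with batch_size."""
    dims = [batch_size if dim is None else dim for dim in shape]
    if not dims:
        return 0.0
    head, *rest = dims
    return [zeros(rest, batch_size) for _ in range(head)]


metadata = json.loads(METADATA_JSON)
shapes = feature_shapes(metadata, "input_features")
onnx_shapes = feature_shapes(metadata, "onnx_input_features")

# Placeholder payload keyed by ONNX input name, with a batch of two rows
payload = {name: zeros(shape, batch_size=2) for name, shape in onnx_shapes.items()}
```

Reading shapes from the metadata rather than hard-coding them lets the same client code work across model versions whose input width changes.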

opsml.ModelMetadata

Bases: BaseModel

Model metadata associated with all registered models

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| model_name | str | Name of model | required |
| model_class | str | Name of model class | required |
| model_type | str | Type of model | required |
| model_interface | str | Type of interface | required |
| onnx_uri | Optional[str] | URI to onnx model | None |
| onnx_version | Optional[str] | Version of onnx model | None |
| model_uri | str | URI to model | required |
| model_version | str | Version of model | required |
| model_repository | str | Model repository | required |
| sample_data_uri | str | URI to sample data | required |
| opsml_version | str | Opsml version | __version__ |
| data_schema | DataSchema | Data schema for model | required |
| preprocessor_uri | - | (only present if preprocessor is used) URI to preprocessor | - |
| preprocessor_name | - | (only present if preprocessor is used) Name of preprocessor | - |
| quantized_model_uri | - | (only present if huggingface model is quantized) URI to huggingface quantized onnx model | - |
| tokenizer_uri | - | (only present if huggingface tokenizer is used) URI to tokenizer | - |
| tokenizer_name | - | (only present if huggingface tokenizer is used) Name of tokenizer | - |
| feature_extractor_uri | - | (only present if huggingface feature extractor is used) URI to feature extractor | - |
| feature_extractor_name | - | (only present if huggingface feature extractor is used) Name of feature extractor | - |

The last seven entries are not declared on the class. They are accepted as extra fields (extra="allow") and appear only when applicable.

Source code in opsml/types/model.py
class ModelMetadata(BaseModel):
    """Model metadata associated with all registered models

    Args:
        model_name:
            Name of model
        model_class:
            Name of model class
        model_type:
            Type of model
        model_interface:
            Type of interface
        onnx_uri:
            URI to onnx model
        onnx_version:
            Version of onnx model
        model_uri:
            URI to model
        model_version:
            Version of model
        model_repository:
            Model repository
        sample_data_uri:
            URI to sample data
        opsml_version:
            Opsml version
        data_schema:
            Data schema for model
        preprocessor_uri: (only present if preprocessor is used)
            URI to preprocessor
        preprocessor_name: (only present if preprocessor is used)
            Name of preprocessor
        quantized_model_uri: (only present if huggingface model is quantized)
            URI to huggingface quantized onnx model
        tokenizer_uri: (only present if huggingface tokenizer is used)
            URI to tokenizer
        tokenizer_name: (only present if huggingface is used)
            Name of tokenizer
        feature_extractor_uri: (only present if huggingface feature extractor is used)
            URI to feature extractor
        feature_extractor_name: (only present if huggingface feature_extractor is used)
            Name of feature extractor
    """

    model_name: str
    model_class: str
    model_type: str
    model_interface: str
    onnx_uri: Optional[str] = None
    onnx_version: Optional[str] = None
    model_uri: str
    model_version: str
    model_repository: str
    sample_data_uri: str
    opsml_version: str = __version__
    data_schema: DataSchema

    model_config = ConfigDict(
        protected_namespaces=("protect_",),
        extra="allow",
    )
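
Because several URI fields are optional or only present as extra fields, a consumer of this metadata should not assume they exist. The sketch below is one way to collect whichever artifact URIs are available from a metadata dictionary. `artifact_uris` is a hypothetical helper, not an opsml function; the required and optional key lists are taken from the class definition and docstring above.

```python
from typing import Any, Dict

# Fields declared without a default on ModelMetadata
REQUIRED_KEYS = frozenset(
    {
        "model_name",
        "model_class",
        "model_type",
        "model_interface",
        "model_uri",
        "model_version",
        "model_repository",
        "sample_data_uri",
        "data_schema",
    }
)

# URI fields that may be None or missing entirely
OPTIONAL_URI_KEYS = (
    "onnx_uri",
    "preprocessor_uri",
    "quantized_model_uri",
    "tokenizer_uri",
    "feature_extractor_uri",
)


def artifact_uris(metadata: Dict[str, Any]) -> Dict[str, str]:
    """Return the artifact URIs that are actually set in the metadata."""
    missing = REQUIRED_KEYS - set(metadata)
    if missing:
        raise KeyError(f"metadata is missing required keys: {sorted(missing)}")

    uris = {
        "model_uri": metadata["model_uri"],
        "sample_data_uri": metadata["sample_data_uri"],
    }
    for key in OPTIONAL_URI_KEYS:
        uri = metadata.get(key)
        if uri is not None:
            uris[key] = uri
    return uris


# Values copied from the registered model example; data_schema is left empty for brevity
example = {
    "model_name": "regression",
    "model_class": "SklearnEstimator",
    "model_type": "LinearRegression",
    "model_interface": "SklearnModel",
    "onnx_uri": "opsml-root:/OPSML_MODEL_REGISTRY/opsml/regression/v1.4.0/onnx-model.onnx",
    "model_uri": "opsml-root:/OPSML_MODEL_REGISTRY/opsml/regression/v1.4.0/trained-model.joblib",
    "model_version": "1.4.0",
    "model_repository": "opsml",
    "sample_data_uri": "opsml-root:/OPSML_MODEL_REGISTRY/opsml/regression/v1.4.0/sample-model-data.joblib",
    "data_schema": {},
}

uris = artifact_uris(example)
```

For the sklearn example, only `model_uri`, `sample_data_uri`, and `onnx_uri` are returned, since no preprocessor or huggingface artifacts are present.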