Stages & Manifest
Stages
maldet supports three stages: train, evaluate, and predict. Each stage is
invoked with maldet run <stage> --config <path>.
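
As a minimal sketch of driving the stages from a script, the helper below only assembles the argument list for the command shown above. The function name `build_run_command` and the placeholder config path are hypothetical; the only CLI surface used is `maldet run <stage> --config <path>`.

```python
import shlex

# The three stages named in this section.
STAGES = ("train", "evaluate", "predict")


def build_run_command(stage, config_path):
    """Return the argv list for `maldet run <stage> --config <path>`."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage {stage!r}; expected one of {STAGES}")
    return ["maldet", "run", stage, "--config", str(config_path)]


if __name__ == "__main__":
    for stage in STAGES:
        # Print a shell-safe command line. To execute it, pass the list
        # itself to subprocess.run(..., check=True).
        print(shlex.join(build_run_command(stage, "path/to/config")))
```

Building a list rather than a single string avoids shell quoting problems when the config path contains spaces.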
train
Loads data.train_csv, materializes features, calls Trainer.fit(), and
writes artifacts to paths.output_dir:
- model/: serialized estimator or Lightning checkpoint directory
- events.jsonl: full event stream
- manifest.json: snapshot of maldet.toml for provenance
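
A quick post-run check for the train artifacts named above might look like the following sketch. The helper name `missing_train_artifacts` is hypothetical, and the check only tests for existence; it does not inspect the contents of any artifact.

```python
from pathlib import Path

# Artifact names from the train stage description; "model" is a directory.
TRAIN_ARTIFACTS = ("model", "events.jsonl", "manifest.json")


def missing_train_artifacts(output_dir):
    """Return the expected train artifacts that are absent from output_dir."""
    root = Path(output_dir)
    return [name for name in TRAIN_ARTIFACTS if not (root / name).exists()]
```

An empty list means every expected artifact is present under the directory that paths.output_dir points to.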
evaluate
Loads paths.source_model, runs the model over data.test_csv, calls
Evaluator.evaluate(), and writes metrics.json.
predict
Loads paths.source_model, runs the model over data.predict_csv, calls
Predictor.predict(), and writes predictions.csv.
maldet.toml fields
maldet.toml is the detector manifest. It must be present in the working
directory or at the path given by $MALDET_MANIFEST.
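
The lookup described above can be sketched as follows. This is an illustration, not maldet's own resolver: the function name `locate_manifest` is hypothetical, and the precedence (the environment variable wins over the working directory) is an assumption that the text does not confirm.

```python
import os
from pathlib import Path


def locate_manifest(cwd=None, environ=None):
    """Return the manifest path, or raise FileNotFoundError.

    ASSUMPTION: $MALDET_MANIFEST, when set, takes precedence over
    ./maldet.toml. The source only says either location is accepted.
    """
    env = os.environ if environ is None else environ
    override = env.get("MALDET_MANIFEST")
    if override:
        candidate = Path(override)
    else:
        candidate = Path(cwd) if cwd is not None else Path.cwd()
        candidate = candidate / "maldet.toml"
    if not candidate.is_file():
        raise FileNotFoundError(f"detector manifest not found at {candidate}")
    return candidate
```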
[detector]
| Field | Type | Required |
|---|---|---|
| name | string | yes |
| version | string | yes |
| framework | sklearn \| lightning \| sklearn+lightning | yes |
| description | string | no |
[input]
| Field | Type | Default |
|---|---|---|
| binary_format | elf \| pe \| apk \| raw_bytes | required |
| required_sections | list[str] | [] |
| dataset_contract | string | "sample_csv" |
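
The defaults in the [input] table can be applied to a parsed table along these lines. The helper `resolve_input` is a hypothetical illustration that works on a plain dict, such as the result of parsing maldet.toml.

```python
ALLOWED_BINARY_FORMATS = ("elf", "pe", "apk", "raw_bytes")
INPUT_DEFAULTS = {"required_sections": [], "dataset_contract": "sample_csv"}


def resolve_input(table):
    """Validate binary_format and fill in the documented defaults."""
    if "binary_format" not in table:
        raise ValueError("binary_format is required")
    if table["binary_format"] not in ALLOWED_BINARY_FORMATS:
        raise ValueError(f"unsupported binary_format: {table['binary_format']!r}")
    # Copy list defaults so callers cannot mutate the shared template.
    resolved = {
        key: list(value) if isinstance(value, list) else value
        for key, value in INPUT_DEFAULTS.items()
    }
    resolved.update(table)
    return resolved
```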
[output]
| Field | Type | Default |
|---|---|---|
| task | binary_classification \| multiclass_classification \| ... | required |
| classes | list[str] | [] |
| score_range | [float, float] | [0.0, 1.0] |
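
One way a consumer might use score_range is to sanity-check predicted scores against it. The sketch below assumes both bounds are inclusive, which the table does not state; `score_in_range` is a hypothetical helper.

```python
def score_in_range(score, score_range=(0.0, 1.0)):
    """Return True if score lies within score_range.

    ASSUMPTION: both bounds are inclusive. The default mirrors the
    documented default of [0.0, 1.0].
    """
    low, high = score_range
    if low > high:
        raise ValueError(f"invalid score_range: {score_range!r}")
    return low <= score <= high
```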
[resources]
| Field | Type | Default |
|---|---|---|
| supports | list of cpu/gpu1/gpu2/gpu4/gpu8 | ["cpu"] |
| recommended | string | "cpu" |
| min_memory_gib | int | 1 |
| gpu_required | bool | false |
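
The tier names suggest a GPU count in the suffix. The sketch below reads them that way and adds two consistency checks. Both the suffix interpretation and the checks (recommended must appear in supports; gpu_required needs at least one GPU tier) are assumptions for illustration, not rules stated in the manifest schema.

```python
# Tier names from the [resources] table.
TIERS = ("cpu", "gpu1", "gpu2", "gpu4", "gpu8")


def gpu_count(tier):
    """ASSUMPTION: the numeric suffix is the number of GPUs; cpu means 0."""
    if tier not in TIERS:
        raise ValueError(f"unknown resource tier: {tier!r}")
    return 0 if tier == "cpu" else int(tier[len("gpu"):])


def check_resources(table):
    """Return a list of consistency problems using the documented defaults."""
    supports = table.get("supports", ["cpu"])
    recommended = table.get("recommended", "cpu")
    problems = []
    if recommended not in supports:
        problems.append(f"recommended tier {recommended!r} is not in supports")
    if table.get("gpu_required", False) and all(gpu_count(t) == 0 for t in supports):
        problems.append("gpu_required is true but supports lists no GPU tier")
    return problems
```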
[lifecycle]
| Field | Type | Default |
|---|---|---|
| stages | list | ["train", "evaluate", "predict"] |
| supports_serving | bool | false |
| supports_hpsweep | bool | true |
| supports_distributed | bool \| "ddp" \| "fsdp" | false |
| supports_multinode | bool | false |
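
Because supports_distributed accepts either a boolean or one of two strings, a type check needs some care. The following sketch (with the hypothetical helper `check_supports_distributed`) accepts exactly the documented forms and rejects everything else, including integers that merely look truthy.

```python
DISTRIBUTED_STRATEGIES = ("ddp", "fsdp")


def check_supports_distributed(value=False):
    """Return value unchanged if it is a bool, "ddp", or "fsdp"."""
    # Test for bool explicitly: bool is a subclass of int in Python,
    # so a plain truthiness check would also accept 0 and 1.
    if isinstance(value, bool):
        return value
    if isinstance(value, str) and value in DISTRIBUTED_STRATEGIES:
        return value
    raise ValueError(
        f"supports_distributed must be a bool or one of "
        f"{DISTRIBUTED_STRATEGIES}, got {value!r}"
    )
```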
[artifacts]
Defines the expected output paths for model, metrics, and predictions.
[compat]
| Field | Default |
|---|---|
| min_python | "3.12" |
| min_maldet | "1.0" |
| schema_version | 1 |
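
A consumer that enforces min_python should compare versions numerically, not as strings, since "3.9" sorts after "3.12" lexicographically. The sketch below assumes simple dotted numeric versions; `parse_version` and `python_is_compatible` are hypothetical helpers.

```python
import sys


def parse_version(text):
    """Turn a dotted numeric version such as "3.12" into (3, 12)."""
    return tuple(int(part) for part in text.split("."))


def python_is_compatible(min_python="3.12", running=None):
    """Return True if the running interpreter meets min_python."""
    if running is None:
        running = tuple(sys.version_info[:2])
    return tuple(running) >= parse_version(min_python)
```

For example, `python_is_compatible("3.12", running=(3, 9))` is false even though the string "3.9" compares greater than "3.12".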
[stages.<name>]
Each stage block declares dotted-import paths (module:Class) for the
layer symbols to instantiate:
[stages.train]
reader = "mydet.readers:MyReader"
extractor = "mydet.features:MyFeatures"
model = "mydet.models:make_rf"
trainer = "maldet.trainers.sklearn_trainer:SklearnTrainer"
[stages.evaluate]
evaluator = "maldet.evaluators.binary:BinaryClassificationEvaluator"
[stages.predict]
predictor = "maldet.builtins.predictors:BatchPredictor"
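
Resolving a module:Class string of the kind shown above can be sketched with the standard-library `importlib`. This is an illustration of the general technique, not maldet's own loader; the name `load_symbol` is hypothetical, and the demonstration uses a standard-library target so that it runs without any detector package installed.

```python
import importlib


def load_symbol(spec):
    """Import "package.module:attribute" and return the attribute."""
    module_name, sep, attr_path = spec.partition(":")
    if not sep or not module_name or not attr_path:
        raise ValueError(f"expected 'module:Name', got {spec!r}")
    obj = importlib.import_module(module_name)
    # Allow nested attributes such as "module:Outer.Inner".
    for part in attr_path.split("."):
        obj = getattr(obj, part)
    return obj


# Standard-library stand-in for a path like "mydet.readers:MyReader".
ordered_dict_cls = load_symbol("collections:OrderedDict")
print(ordered_dict_cls.__name__)  # prints OrderedDict
```

The same call works for factory functions, which is how a value such as "mydet.models:make_rf" would resolve to a callable.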
Integration with lolday
The lolday backend imports DetectorManifest from maldet.manifest to validate
the base64-decoded io.maldet.manifest OCI image label. Once lolday Phase 11b
ships, the backend will pin maldet ~= 1.0 in backend/pyproject.toml and get
type-safe manifest validation for free.
Platform-side, lolday also reads:
- /mnt/output/events.jsonl: via a sidecar tail that POSTs each event to the internal /internal/jobs/{job_id}/events endpoint
- /mnt/output/manifest.json: stored alongside MLflow artifacts for provenance
- /mnt/output/metrics.json: parsed as the canonical evaluation record
- /mnt/output/predictions.csv: served back to the user via /api/v1/runs/{run_id}/artifacts/download
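
The event path can be illustrated with a one-shot reader for events.jsonl. This sketch assumes the file is JSON Lines (one JSON object per line, as the extension suggests), reads the file once instead of following it like a real tail, and does not perform the HTTP POST. The helper names are hypothetical; only the endpoint template comes from the list above.

```python
import json
from pathlib import Path


def iter_events(path):
    """Yield one parsed event per non-empty line of a JSON Lines file."""
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def events_endpoint(job_id):
    """Fill in the internal events endpoint path for a job."""
    return f"/internal/jobs/{job_id}/events"
```

A sidecar built on this would loop over `iter_events("/mnt/output/events.jsonl")` and POST each event to `events_endpoint(job_id)` on the lolday backend.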
For the Volcano Job spec that mounts these paths and invokes maldet run <stage>,
see the lolday repo's services/job_spec.py (Phase 11b onward).