Stages & Manifest

Stages

maldet supports three stages. Each stage is driven by maldet run <stage> --config <path>.

train

Loads data.train_csv, materializes features, calls Trainer.fit(), and writes artifacts to paths.output_dir:

model/ — serialized estimator or Lightning checkpoint directory
events.jsonl — full event stream
manifest.json — snapshot of maldet.toml for provenance

evaluate

Loads paths.source_model, runs the model over data.test_csv, calls Evaluator.evaluate(), and writes metrics.json.

predict

Loads paths.source_model, runs the model over data.predict_csv, calls Predictor.predict(), and writes predictions.csv.

maldet.toml fields

maldet.toml is the detector manifest. It must be present in the working directory or at the path given by $MALDET_MANIFEST.

`[detector]`

Field	Type	Required
`name`	string	yes
`version`	string	yes
`framework`	`sklearn` | `lightning` | `sklearn+lightning`	yes
`description`	string	no

`[input]`

Field	Type	Default
`binary_format`	`elf` | `pe` | `apk` | `raw_bytes`	required
`required_sections`	list[str]	`[]`
`dataset_contract`	string	`"sample_csv"`

`[output]`

Field	Type	Default
`task`	`binary_classification` | `multiclass_classification` | ...	required
`classes`	list[str]	`[]`
`score_range`	[float, float]	`[0.0, 1.0]`

`[resources]`

Field	Type	Default
`supports`	list of `cpu`/`gpu1`/`gpu2`/`gpu4`/`gpu8`	`["cpu"]`
`recommended`	string	`"cpu"`
`min_memory_gib`	int	`1`
`gpu_required`	bool	`false`

`[lifecycle]`

Field	Type	Default
`stages`	list[str]	`["train", "evaluate", "predict"]`
`supports_serving`	bool	`false`
`supports_hpsweep`	bool	`true`
`supports_distributed`	bool | `"ddp"` | `"fsdp"`	`false`
`supports_multinode`	bool	`false`

`[artifacts]`

Defines the expected output paths for model, metrics, and predictions.

`[compat]`

Field	Default
`min_python`	`"3.12"`
`min_maldet`	`"1.0"`
`schema_version`	`1`

`[stages.<name>]`

Each stage block declares dotted-import paths in module:Class form (or module:function for a factory such as make_rf in the example) for the layer symbols to instantiate:

[stages.train]
reader    = "mydet.readers:MyReader"
extractor = "mydet.features:MyFeatures"
model     = "mydet.models:make_rf"
trainer   = "maldet.trainers.sklearn_trainer:SklearnTrainer"

[stages.evaluate]
evaluator = "maldet.evaluators.binary:BinaryClassificationEvaluator"

[stages.predict]
predictor = "maldet.builtins.predictors:BatchPredictor"
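Strings in this `module:Name` shape are conventionally resolved with `importlib`. The sketch below shows that general technique; `load_symbol` is a hypothetical name, and maldet's actual loader may differ.

```python
import importlib
from typing import Any


def load_symbol(spec: str) -> Any:
    """Resolve a "package.module:Name" string to the object it names."""
    module_name, sep, attr = spec.partition(":")
    if not sep or not module_name or not attr:
        raise ValueError(f"expected 'module:Name', got {spec!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


# A standard-library symbol keeps the example runnable without maldet installed.
ordered_dict_cls = load_symbol("collections:OrderedDict")
```

Because the lookup ends in a plain `getattr`, the same helper handles classes and factory functions alike.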

Integration with lolday

The lolday backend imports DetectorManifest from maldet.manifest to validate the base64-decoded io.maldet.manifest OCI image label. When lolday Phase 11b ships, it will pin maldet ~= 1.0 in its backend/pyproject.toml and gain type-safe manifest validation without maintaining its own schema.
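The decoding step can be sketched with the standard library alone. Two assumptions are made here: the label uses the standard (not URL-safe) base64 alphabet, and the decoded payload is UTF-8 text. `decode_manifest_label` is a hypothetical helper; handing the decoded text to `DetectorManifest` is left out because its API is not described in this document.

```python
import base64

MANIFEST_LABEL = "io.maldet.manifest"


def decode_manifest_label(labels: dict[str, str]) -> str:
    """Return the decoded manifest text from an image's label map."""
    try:
        encoded = labels[MANIFEST_LABEL]
    except KeyError:
        raise LookupError(f"image has no {MANIFEST_LABEL} label") from None
    # ASSUMPTION: standard base64 alphabet and a UTF-8 payload.
    # validate=True rejects stray characters instead of silently skipping them.
    return base64.b64decode(encoded, validate=True).decode("utf-8")
```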

Platform-side, lolday also reads:

/mnt/output/events.jsonl — via a sidecar tail that POSTs each event to the internal /internal/jobs/{job_id}/events endpoint
/mnt/output/manifest.json — stored alongside MLflow artifacts for provenance
/mnt/output/metrics.json — parsed as the canonical evaluation record
/mnt/output/predictions.csv — served back to the user via /api/v1/runs/{run_id}/artifacts/download
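The tailing half of such a sidecar can be approximated as follows. This is a sketch of the general technique, not lolday's implementation: `drain_new_events` is a hypothetical helper, it assumes each event is one JSON object per line, and the POST to the events endpoint is intentionally omitted.

```python
import json
from typing import BinaryIO


def drain_new_events(fh: BinaryIO) -> list[dict]:
    """Decode the complete lines appended to fh since the previous call."""
    events = []
    while True:
        pos = fh.tell()
        line = fh.readline()
        if not line:
            break  # reached the current end of the file
        if not line.endswith(b"\n"):
            # Partial line: the writer is mid-append. Rewind and retry later.
            fh.seek(pos)
            break
        if line.strip():
            events.append(json.loads(line))
    return events
```

A sidecar would call this in a polling loop on a file opened in binary mode and forward each returned event. Rewinding on a partial line avoids sending a truncated JSON document while the detector is still writing.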

For the Volcano Job spec that mounts these paths and invokes maldet run <stage>, see the lolday repo's services/job_spec.py (Phase 11b onward).