API Reference
maldet.types
Core dataclasses shared across maldet layers.
MetricReport
dataclass
Return value of Evaluator.evaluate().
to_json_dict()
Serialize into the wire shape of metrics.json (schema_version=1).
Sample
dataclass
A single binary sample.
label is None during predict. metadata is a per-instance mutable
dict — frozen=True forbids reassigning the field, not mutating the contained dict.
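The distinction between reassigning a frozen field and mutating its contents can be illustrated with a small stand-in. This is a sketch, not the real maldet.types.Sample: the class name SampleSketch and the data field are assumptions, and only label and metadata come from the description above.

```python
from dataclasses import FrozenInstanceError, dataclass, field
from typing import Any, Optional


# Hypothetical stand-in for maldet.types.Sample. Only `label` and
# `metadata` are named in the reference; `data` is an assumed field.
@dataclass(frozen=True)
class SampleSketch:
    data: bytes
    label: Optional[int] = None                     # None during predict
    metadata: dict[str, Any] = field(default_factory=dict)  # one dict per instance


s = SampleSketch(data=b"MZ")
s.metadata["source"] = "unit-test"   # allowed: mutates the contained dict

try:
    s.metadata = {}                  # rejected: reassigns a frozen field
    reassigned = True
except FrozenInstanceError:
    reassigned = False
```

Because default_factory builds a new dict for every instance, mutating one sample's metadata does not leak into another sample.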
TrainResult
dataclass
Return value of Trainer.fit().
model is the trained estimator or LightningModule. best_checkpoint is set
by Lightning when a ModelCheckpoint callback ran. extras is a free-form dict
that round-trips into events.
maldet.protocols
Runtime-checkable Protocols for maldet's six layers + EventLogger.
Protocols use structural typing — implementations do not need to inherit.
@runtime_checkable enables isinstance(obj, Trainer) for pipeline-assembly
validation.
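A minimal sketch of how such a structural check behaves, assuming a simplified protocol. TrainerLike and its fit signature are hypothetical; the real maldet.protocols.Trainer may declare different methods.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class TrainerLike(Protocol):
    """Hypothetical simplified protocol; not the real maldet Trainer."""

    def fit(self, samples: Any) -> Any: ...


class MyTrainer:
    # No inheritance from TrainerLike: having a `fit` method is enough.
    def fit(self, samples):
        return {"n_samples": len(samples)}


class NotATrainer:
    pass


is_trainer = isinstance(MyTrainer(), TrainerLike)
is_not_trainer = isinstance(NotATrainer(), TrainerLike)
```

Note that isinstance against a runtime-checkable Protocol only verifies that the named methods exist. It does not check their signatures or return types, so it is a cheap assembly-time guard rather than a full type check.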
maldet.events
CompositeEventLogger — fans out to N loggers, isolating failures.
CompositeEventLogger
Forwards every call to every wrapped logger.
An exception raised by a delegate is caught and logged at WARNING; the other delegates still run. This keeps a broken MLflow or filesystem backend from killing the detector run.
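The fan-out behaviour can be sketched as follows. CompositeSketch, Broken, and Recorder are hypothetical names used for illustration, and only a single log_event method is modelled; the real class forwards every call of the EventLogger protocol.

```python
import logging

log = logging.getLogger("maldet.sketch")


class CompositeSketch:
    """Hypothetical stand-in for CompositeEventLogger (log_event only)."""

    def __init__(self, *delegates):
        self._delegates = delegates

    def log_event(self, kind, **payload):
        for delegate in self._delegates:
            try:
                delegate.log_event(kind, **payload)
            except Exception:
                # Isolate the failure: report it and keep going.
                log.warning("event logger %r failed", delegate, exc_info=True)


class Broken:
    def log_event(self, kind, **payload):
        raise OSError("disk full")


class Recorder:
    def __init__(self):
        self.events = []

    def log_event(self, kind, **payload):
        self.events.append((kind, payload))


rec = Recorder()
CompositeSketch(Broken(), rec).log_event("epoch_end", epoch=1)
```

Even though the first delegate raises, the recorder still receives the event.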
JSONL event logger — append-only, fsync per event.
JsonlEventLogger
Writes one NDJSON line per event to path.
Each write is followed by os.fsync so a pod kill does not lose events in the
page cache. Parent directory is created if missing.
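A rough sketch of the append-and-fsync pattern described above. The function name append_event and the record shape (a "kind" key merged with the payload) are assumptions; the actual JsonlEventLogger may lay out its records differently.

```python
import json
import os
import tempfile
from pathlib import Path


def append_event(path, kind, **payload):
    """Append one NDJSON line and force it to disk (hypothetical helper)."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)   # create parent if missing
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps({"kind": kind, **payload}) + "\n")
        fh.flush()                # push Python's buffer to the OS
        os.fsync(fh.fileno())     # push the OS page cache to disk


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "run" / "events.jsonl"
    append_event(target, "train_begin")
    append_event(target, "epoch_end", epoch=1)
    lines = target.read_text(encoding="utf-8").splitlines()
```

The flush before os.fsync matters: fsync only covers data the operating system has already received, so skipping the flush could leave the event in the process buffer.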
Stdout event logger — prefixed JSON line per event.
StdoutEventLogger
Writes maldet.event: {json}\n lines to stdout.
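A consumer can recover events from mixed stdout by filtering on the prefix. The sketch below assumes the JSON object carries a "kind" key alongside the payload; write_event and parse_events are hypothetical helpers, not part of maldet.

```python
import io
import json

PREFIX = "maldet.event: "


def write_event(stream, kind, **payload):
    # One prefixed JSON line per event, as described above.
    stream.write(PREFIX + json.dumps({"kind": kind, **payload}) + "\n")


def parse_events(text):
    # Keep only prefixed lines and decode the JSON that follows the prefix.
    return [
        json.loads(line[len(PREFIX):])
        for line in text.splitlines()
        if line.startswith(PREFIX)
    ]


buf = io.StringIO()
buf.write("unrelated training output\n")
write_event(buf, "epoch_end", epoch=2)
events = parse_events(buf.getvalue())
```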
MLflow-backed event logger.
Keeps the mlflow import optional — if the caller passes mlflow=None and
mlflow is not importable, log_* methods silently no-op. This makes MLflow
a soft dependency (install maldet[mlflow] to enable).
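The soft-dependency pattern might look roughly like this. SoftLogger, resolve_optional, and the module_name parameter are hypothetical and exist only so the sketch can be exercised without MLflow installed; the backend call used is mlflow.log_metric(key, value, step=...).

```python
import importlib


def resolve_optional(name, injected=None):
    """Return the injected module, else try importing `name`, else None."""
    if injected is not None:
        return injected
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


class SoftLogger:
    """Hypothetical stand-in for MlflowEventLogger's optional-import logic."""

    def __init__(self, backend=None, module_name="mlflow"):
        self._backend = resolve_optional(module_name, backend)

    def log_metric(self, key, value, step=None):
        if self._backend is None:
            return  # silent no-op when the backend is unavailable
        self._backend.log_metric(key, value, step=step)


class FakeBackend:
    def __init__(self):
        self.calls = []

    def log_metric(self, key, value, step=None):
        self.calls.append((key, value, step))


fake = FakeBackend()
SoftLogger(backend=fake).log_metric("loss", 0.25, step=3)
# A module name that cannot be imported exercises the no-op path.
SoftLogger(module_name="maldet_sketch_missing_module").log_metric("loss", 0.25)
```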
MlflowEventLogger
log_event(kind, **payload)
Flatten event payload into tags named maldet.<kind>.<field>.
Metric events are NOT forwarded here — the caller calls log_metric for
those.
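The tag naming scheme can be sketched as a small pure function. flatten_to_tags is a hypothetical helper; converting values with str() and handling only flat payloads are assumptions, since the reference does not say how nested or non-string values are treated.

```python
def flatten_to_tags(kind, payload):
    """Map each payload field to a tag named maldet.<kind>.<field>."""
    # Assumption: values are stringified, since MLflow tag values are strings.
    return {f"maldet.{kind}.{name}": str(value) for name, value in payload.items()}


tags = flatten_to_tags("train_begin", {"epochs": 3, "device": "cpu"})
```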
maldet.builtins
Built-in sample readers.
SampleCsvReader
Reads a sample_csv contract CSV: columns file_name[,label].
Resolves each sample path under samples_root/<sha[:2]>/<sha>.
When strict=False (default), missing sample files are skipped without
raising an error. lolday frequently produces CSVs that reference samples
that are not yet present; the platform guarantees that the SHA is valid,
not that the byte stream is.
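A minimal sketch of the path layout and the non-strict skip. It assumes file_name holds the sample's SHA and that strict=True raises FileNotFoundError; neither is confirmed by the reference, and sample_path and iter_rows are hypothetical helpers.

```python
import csv
import io
import tempfile
from pathlib import Path


def sample_path(samples_root, sha):
    # samples_root/<sha[:2]>/<sha>
    return Path(samples_root) / sha[:2] / sha


def iter_rows(csv_text, samples_root, strict=False):
    """Yield (path, label) per CSV row; label is None if the column is absent."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        path = sample_path(samples_root, row["file_name"])
        if not path.is_file():
            if strict:
                raise FileNotFoundError(path)   # assumed strict behaviour
            continue                            # non-strict: skip silently
        yield path, row.get("label")


with tempfile.TemporaryDirectory() as root:
    present = "ab" + "0" * 62
    missing = "cd" + "1" * 62
    target = sample_path(root, present)
    target.parent.mkdir(parents=True)
    target.write_bytes(b"MZ")
    csv_text = f"file_name,label\n{present},1\n{missing},0\n"
    found = [(p.name, label) for p, label in iter_rows(csv_text, root)]
```

Only the row whose file exists on disk is yielded; the second row is dropped without an error.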
Built-in predictor: batch prediction over a SampleReader.
BatchPredictor
Iterate samples, extract features, call model.predict in one batch.
Writes a CSV with the required columns file_name, pred_label, pred_score.
Extra columns are added as pred_prob_<class> when predict_proba is
available.
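The output header can be sketched as below. output_columns is a hypothetical helper, and passing the class list explicitly is an assumption; the real predictor may read the classes from the model instead.

```python
def output_columns(model, classes):
    """Required columns, plus one pred_prob_<class> column per class if supported."""
    columns = ["file_name", "pred_label", "pred_score"]
    if hasattr(model, "predict_proba"):
        columns += [f"pred_prob_{c}" for c in classes]
    return columns


class LabelsOnly:
    def predict(self, X):
        return [0 for _ in X]


class WithProba(LabelsOnly):
    def predict_proba(self, X):
        return [[1.0, 0.0] for _ in X]


basic = output_columns(LabelsOnly(), [0, 1])
extended = output_columns(WithProba(), [0, 1])
```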
maldet.evaluators
Binary-classification evaluator.
BinaryClassification
Binary classification metrics using sklearn.metrics.
Runs model.predict once over the whole reader. Optionally calls
predict_proba if available to compute ROC-AUC.
maldet.trainers
SklearnTrainer — thin wrapper around sklearn estimator.fit/predict/predict_proba.
SklearnTrainer
Trainer for scikit-learn-compatible estimators (fit + predict).
Lightning-based Trainer for deep-learning detectors.
Reads the platform-injected env vars MALDET_GPU_COUNT and
MALDET_DISTRIBUTED_STRATEGY to pick accelerator, devices, and
strategy for lightning.Trainer.
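One plausible mapping from these env vars to lightning.Trainer keyword arguments is sketched below. The defaults (no GPUs when MALDET_GPU_COUNT is unset, "auto" when MALDET_DISTRIBUTED_STRATEGY is unset) and the CPU fallback are assumptions; lightning_kwargs is a hypothetical helper, not the maldet implementation.

```python
import os


def lightning_kwargs(environ=None):
    """Derive accelerator/devices/strategy from the platform env vars."""
    env = os.environ if environ is None else environ
    gpu_count = int(env.get("MALDET_GPU_COUNT", "0"))            # assumed default
    strategy = env.get("MALDET_DISTRIBUTED_STRATEGY", "auto")    # assumed default
    if gpu_count > 0:
        return {"accelerator": "gpu", "devices": gpu_count, "strategy": strategy}
    # Assumed fallback: a single CPU process, ignoring the strategy hint.
    return {"accelerator": "cpu", "devices": 1, "strategy": "auto"}


cpu_cfg = lightning_kwargs({})
gpu_cfg = lightning_kwargs(
    {"MALDET_GPU_COUNT": "4", "MALDET_DISTRIBUTED_STRATEGY": "ddp"}
)
```

The resulting dict could then be splatted into the trainer, for example lightning.Trainer(**gpu_cfg), since accelerator, devices, and strategy are all keyword arguments of that class.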
LightningTrainer
PyTorch Lightning-based Trainer.
MaldetLightningLogger
Bases: Logger
Adapter from Lightning's Logger API onto maldet EventLogger.
MaldetProgressCallback
Bases: Callback
Emits epoch_begin / epoch_end events through the EventLogger.
maldet.manifest
Detector manifest — Pydantic model for maldet.toml and helpers.
DetectorManifest
Bases: _Frozen
The full manifest (maldet.toml root).
ManifestNotFoundError
Bases: FileNotFoundError
Raised by search_manifest when no maldet.toml is discoverable.
search_manifest()
Return the first manifest path found in:
1. $MALDET_MANIFEST env var (absolute path)
2. $PWD/maldet.toml
3. /app/maldet.toml (the scaffold Docker WORKDIR)
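The lookup order above can be sketched as follows. find_manifest and ManifestNotFoundSketch are hypothetical names, and falling through to the next candidate when $MALDET_MANIFEST points at a missing file is an assumption; the real search_manifest may raise in that case instead.

```python
import os
import tempfile
from pathlib import Path


class ManifestNotFoundSketch(FileNotFoundError):
    """Hypothetical stand-in for ManifestNotFoundError."""


def find_manifest(environ=None, cwd=None, app_dir="/app"):
    env = os.environ if environ is None else environ
    candidates = []
    if env.get("MALDET_MANIFEST"):
        candidates.append(Path(env["MALDET_MANIFEST"]))                  # 1. env var
    candidates.append(Path(cwd) if cwd else Path.cwd())                  # 2. $PWD
    candidates[-1] = candidates[-1] / "maldet.toml"
    candidates.append(Path(app_dir) / "maldet.toml")                     # 3. /app
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    raise ManifestNotFoundSketch(f"no maldet.toml in {[str(c) for c in candidates]}")


with tempfile.TemporaryDirectory() as tmp:
    manifest = Path(tmp) / "maldet.toml"
    manifest.write_text("", encoding="utf-8")
    via_env = find_manifest({"MALDET_MANIFEST": str(manifest)}, cwd="/nonexistent-dir")
    via_cwd = find_manifest({}, cwd=tmp)
```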