Quick reference
| Engine | Data type | Catches | Severity driver |
|---|---|---|---|
| mislabel | 2D images | Wrong‑class images | % flagged |
| mislabel_3d | 3D volumes / NIfTI | Wrong labels, slice + case level | % flagged (weighted) |
| mislabel_broad | 2D images | Mislabels + OOD + low quality | Weighted (mislabels × 3 + outliers) |
| poisoning | 2D images | Backdoors, adversarial patches, Glaze / Nightshade | % flagged + fused score |
| text_analysis | Plain text corpus | Secrets, PII, prompt injections, topic outliers | Finding counts by severity |
| bias_shortcut | 2D images (+ extras CSV) | Spurious correlations and shortcuts | % flagged + confirmed hypotheses |
| mask_quality | Segmentation masks | Invalid / inconsistent / missing masks | % flagged |
Mislabel 2D
Use it when you have a standard image classification dataset (class_a/, class_b/, …) and suspect some images were mislabeled.
Inputs. Folder‑structured image dataset (JPEG, PNG, WebP, TIFF).
Per‑image fields:
| Field | Meaning |
|---|---|
| given_label | The label the dataset currently assigns. |
| predicted_label | What Antidote thinks the label should be. |
| label_quality_score | Confidence in the given label (0–1). Low means suspect. |
| top_alt_label | Most likely alternative class. |
| confidence_gap | How much the model prefers the alternative over the given label. |
| is_issue | Final flag. True means “look at this image.” |
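The fields above are enough to build a review queue. The following is a minimal sketch, assuming the per‑image results have been exported to a CSV whose columns match the field names in the table. The export format, the image column, the "True"/"False" serialization of is_issue, and the load_review_queue helper are all assumptions for illustration, not part of Antidote.

```python
import csv
import io


def load_review_queue(csv_text, max_items=50):
    """Return flagged rows, most suspect (lowest label_quality_score) first."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Assumption: is_issue is serialized as "True" / "False" in the export.
    flagged = [r for r in rows if r["is_issue"].strip().lower() == "true"]
    flagged.sort(key=lambda r: float(r["label_quality_score"]))
    return flagged[:max_items]


# Hypothetical export; the "image" column name is an assumption.
SAMPLE = """image,given_label,predicted_label,label_quality_score,is_issue
a.png,cat,cat,0.97,False
b.png,cat,dog,0.08,True
c.png,dog,cat,0.21,True
"""

for row in load_review_queue(SAMPLE):
    print(row["image"], row["given_label"], "->", row["predicted_label"])
```

Sorting by label_quality_score puts the images the model is least willing to accept at the top, which is usually where manual review pays off first.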
Mislabel 3D (medical imaging)
Use it when you’re working with CT, MRI, or ultrasound data organized per patient or per volume.
Inputs.
- .nii.gz volumes.
- NIfTI‑style PNG stacks (label_*/case001_0042.png).
- Optional segmentation metadata to restrict analysis to an anatomical region of interest.
Outputs.
- Case‑level probability: whether the whole volume is suspect.
- Neighbor review: whether slices directly above and below disagree.
- Multi‑label probability: probability across the full label set for multi‑label cases.
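The PNG‑stack layout listed under the inputs can be checked before upload by grouping slices into volumes. This is a rough sketch, assuming file names follow a case‑ID, underscore, slice‑index pattern as in case001_0042.png and that the parent folder carries the label. The regular expression, the group_slices helper, and the folder names in the example are illustrative assumptions.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

# Assumed naming: <case id>_<slice index>.png inside a label_* folder.
SLICE_NAME = re.compile(r"^(?P<case>.+)_(?P<index>\d+)\.png$")


def group_slices(paths):
    """Group slice paths by (label folder, case id), ordered by slice index."""
    volumes = defaultdict(list)
    for p in map(PurePosixPath, paths):
        m = SLICE_NAME.match(p.name)
        if m is None:
            continue  # skip files that do not follow the assumed naming
        volumes[(p.parent.name, m["case"])].append((int(m["index"]), str(p)))
    return {key: [path for _, path in sorted(items)]
            for key, items in volumes.items()}


# Hypothetical folder and file names.
paths = [
    "label_tumor/case001_0043.png",
    "label_tumor/case001_0042.png",
    "label_healthy/case002_0001.png",
    "label_tumor/notes.txt",
]
for key, slices in group_slices(paths).items():
    print(key, len(slices), "slices")
```

A case with far fewer slices than its neighbors, or with gaps in the slice indices, is worth inspecting before running the engine.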
Mislabel Broad (OOD + mislabel)
A superset of Mislabel 2D. Besides wrong labels, it finds out‑of‑distribution images: things that don’t belong in the dataset at all (logos, screenshots, unrelated photos, corrupted files).
Use it when the dataset is from mixed sources, scraped from the web, or otherwise possibly contaminated.
Issue types. Every flag is tagged with exactly one:
- mislabel: definitely the wrong class.
- ood: does not belong in the dataset.
- low_quality: the model is uncertain but has no clear alternative.
Flag labels:
- Mislabeled: high‑confidence wrong label.
- MislabeledCandidate: probable but less certain.
- MislabeledCandidateWeak: weak signal.
- Outlier: OOD, not a mislabel.
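The quick reference gives this engine's severity driver as mislabels × 3 + outliers. The arithmetic can be sketched as follows, assuming that mislabel flags count as mislabels and ood flags count as outliers. That mapping, the decision to ignore low_quality flags, and the weighted_issue_count helper are assumptions, and how the weighted count is turned into a final severity is not specified here.

```python
from collections import Counter

MISLABEL_WEIGHT = 3  # from the quick reference: mislabels × 3 + outliers


def weighted_issue_count(issue_types):
    """Weighted count of flags; low_quality is ignored (assumption)."""
    counts = Counter(issue_types)
    return MISLABEL_WEIGHT * counts["mislabel"] + counts["ood"]


flags = ["mislabel", "mislabel", "ood", "low_quality"]
print(weighted_issue_count(flags))  # 2 * 3 + 1 = 7
```

The weighting means a dataset with a few confirmed mislabels can rank as more severe than one with a larger number of outliers.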
Poisoning
Use it when you’re ingesting data from untrusted sources (scraped web, user uploads, third‑party dumps) before training a production model.
Catches. Backdoor triggers, adversarial patches, Glaze and Nightshade style‑transfer attacks, and any image tampered with to manipulate training.
Per‑image fields:
| Field | Meaning |
|---|---|
| is_poisoned | Binary flag. |
| fused_score | Overall suspicion score (0–1). |
| decision_basis | Which detector fired (heuristic / VLM / trigger / depoison). |
| trigger_bbox | Bounding box of the suspected trigger patch (if any). |
| vlm_prompt_score | Vision‑language model confidence that the image is risky. |
| depoison_score | How much model behavior changes under counter‑perturbations. |

| Knob | Meaning |
|---|---|
| sensitivity | 0 = strict, 1 = aggressive. |
| consensus_level | How many independent signals must agree. |
| aggressive | Quick preset for maximum recall. |
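When comparing runs at different sensitivity settings, it helps to see which detectors drive the flags and how many images each fused_score cutoff would catch. The snippet below is an illustrative sketch over records shaped like the per‑image fields above. The record layout, the exact decision_basis strings, the candidate cutoffs, and the summarize helper are assumptions.

```python
from collections import Counter

# Hypothetical results; field names follow the per-image table.
records = [
    {"image": "a.png", "is_poisoned": True, "fused_score": 0.93, "decision_basis": "trigger"},
    {"image": "b.png", "is_poisoned": True, "fused_score": 0.74, "decision_basis": "VLM"},
    {"image": "c.png", "is_poisoned": False, "fused_score": 0.31, "decision_basis": "heuristic"},
    {"image": "d.png", "is_poisoned": True, "fused_score": 0.88, "decision_basis": "trigger"},
]


def summarize(records, cutoffs=(0.5, 0.7, 0.9)):
    """Count flagged images per detector and images at or above each cutoff."""
    by_basis = Counter(r["decision_basis"] for r in records if r["is_poisoned"])
    at_cutoff = {c: sum(r["fused_score"] >= c for r in records) for c in cutoffs}
    return by_basis, at_cutoff


by_basis, at_cutoff = summarize(records)
print(dict(by_basis))
print(at_cutoff)
```

If a single detector accounts for nearly all flags, review a sample of its hits before trusting the overall percentage.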
Text Analysis
Use it when you’re building fine‑tuning, RAG, or instruction datasets and need to sanitize them before training or publishing.
Catches three problem classes in one pass:
- Leaked secrets and PII: API keys, tokens, passwords, emails, IPs.
- Prompt injection attempts: system‑prompt overrides, jailbreaks.
- Topic outliers: paragraphs that don’t belong with the rest.

| Field | Meaning |
|---|---|
| type | secret, injection, or topic_outlier. |
| subtype | github_pat, aws_access_key, explicit, corpus_outlier, etc. |
| score | Severity (0–1). |
| severity | CRITICAL, MEDIUM, or INFO. |
| snippet | ≤ 200 characters around the match. |
| start / end | Exact character offsets in the source document. |
| doc_id / paragraph_index | Where in the corpus the finding lives. |

Built‑in secret patterns include api_key=… / password=…, emails, IPv4, IPv6, plus a high‑entropy token heuristic.
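To make the pattern‑plus‑entropy idea concrete, here is a minimal sketch of a scanner that emits findings with start, end, and a snippet of at most 200 characters. The regular expressions, the subtype names, the 20‑character minimum, and the entropy threshold of 4.0 bits per character are all illustrative assumptions and are deliberately loose. They are not Antidote's actual rules.

```python
import math
import re
from collections import Counter

# Illustrative patterns only; the IPv4 pattern does not validate octet ranges.
PATTERNS = {
    "assignment": re.compile(r"\b(?:api_key|password)\s*=\s*\S+", re.IGNORECASE),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}
TOKEN = re.compile(r"[A-Za-z0-9+/_\-]{20,}")


def shannon_entropy(s):
    """Shannon entropy of the string in bits per character."""
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in Counter(s).values())


def _snippet(text, start, end, limit=200):
    """Return at most `limit` characters of context around the match."""
    pad = max(0, (limit - (end - start)) // 2)
    return text[max(0, start - pad):min(len(text), end + pad)][:limit]


def _finding(text, subtype, start, end):
    return {"type": "secret", "subtype": subtype, "start": start, "end": end,
            "snippet": _snippet(text, start, end)}


def scan(text, entropy_threshold=4.0):
    """Return pattern and high-entropy findings, ordered by start offset."""
    findings = [
        _finding(text, subtype, m.start(), m.end())
        for subtype, pattern in PATTERNS.items()
        for m in pattern.finditer(text)
    ]
    for m in TOKEN.finditer(text):
        if shannon_entropy(m.group()) >= entropy_threshold:
            findings.append(_finding(text, "high_entropy_token", m.start(), m.end()))
    return sorted(findings, key=lambda f: f["start"])


for f in scan("contact ops@example.com, password=hunter2 host 10.0.0.5"):
    print(f["subtype"], f["start"], f["end"])
```

The entropy check is what catches tokens with no recognizable prefix: long strings drawn from a large alphabet score high, while ordinary words and repeated characters score low.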
Built‑in injection phrases. A curated list of 12 phrases
(ignore previous instructions, reveal the system prompt,
disregard the earlier rules, …) plus jailbreak and role‑reassignment
heuristics.
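Phrase matching of this kind can be sketched as case‑insensitive search that tolerates extra whitespace between words. The example uses only the three phrases quoted above, since the full curated list is not reproduced here. The find_injections helper and the shape of its output are assumptions, and the jailbreak and role‑reassignment heuristics are not covered.

```python
import re

# Only the three phrases quoted in the text above.
PHRASES = [
    "ignore previous instructions",
    "reveal the system prompt",
    "disregard the earlier rules",
]


def _phrase_regex(phrase):
    # Allow any run of whitespace (including newlines) between words.
    return re.compile(r"\s+".join(map(re.escape, phrase.split())), re.IGNORECASE)


INJECTION_PATTERNS = [(p, _phrase_regex(p)) for p in PHRASES]


def find_injections(text):
    """Return phrase hits with character offsets, ordered by position."""
    hits = [
        {"type": "injection", "phrase": phrase, "start": m.start(), "end": m.end()}
        for phrase, rx in INJECTION_PATTERNS
        for m in rx.finditer(text)
    ]
    return sorted(hits, key=lambda h: h["start"])


sample = "Please IGNORE   previous\ninstructions and reveal the system prompt."
for hit in find_injections(sample):
    print(hit["phrase"], hit["start"], hit["end"])
```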
What to expect. A searchable table of findings per document, with
the match highlighted in context. Triage feedback (confirm / dismiss)
is remembered across subsequent scans, so repeat patterns stop
nagging you.
Bias & Shortcut
Use it when you’re about to train a model and want to know whether it will actually learn the task or cheat on a spurious cue.
Catches:
- Background color correlates with label.
- Watermarks or logos leak the class.
- Resolution, aspect ratio, or file metadata is informative.
- A confounder column (site, scanner, date) determines the outcome.

| Field | Meaning |
|---|---|
| kind | Hypothesis type: border, global_stat, extra_field, metadata, embedding_cluster, patch_cluster, outer_ring, loc_patch, duplicates. |
| shortcut_score | Evidence strength (0–100). |
| decision | confirmed / suspected_strong / suspected / not_confirmed. |
| rationale | Why this decision was made. |
| explain_plain | Human‑friendly explanation. |
| recommendations | Concrete mitigations (“crop the bottom 15% to remove the watermark”). |
| artifacts | Heatmaps, grids, plots. |
| suspects_path | CSV of the most suspect images for the hypothesis. |

Flags: is_bias (subtle), is_shortcut (strong), plus the underlying score.
Confounder columns. Drop an extras.csv alongside your images
with columns like site, scanner, date, patient_id. The engine
tests each one for label correlation and train/test shift.
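One common way to measure how strongly a categorical column predicts the label is Cramér's V, which ranges from 0 (no association) to 1 (the column determines the label). The sketch below shows that statistic in plain Python. It is not necessarily the test the engine runs, and the cramers_v helper and the example values are assumptions for illustration.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Cramér's V between two equal-length sequences of categorical values."""
    n = len(xs)
    rows, cols = Counter(xs), Counter(ys)
    k = min(len(rows), len(cols))
    if n == 0 or k < 2:
        return 0.0  # association is undefined with a single category
    joint = Counter(zip(xs, ys))
    chi2 = 0.0
    for x, row_total in rows.items():
        for y, col_total in cols.items():
            expected = row_total * col_total / n
            chi2 += (joint.get((x, y), 0) - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (k - 1)))


# Hypothetical extras column and labels, one entry per image.
site = ["A", "A", "B", "B"]
label = ["pos", "pos", "neg", "neg"]
print(cramers_v(site, label))
```

Columns with nearly one unique value per row, such as patient_id, score close to 1 by construction, so they need a different check than low‑cardinality columns like site or scanner.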
What to expect. A dedicated bias results page
(/datasets/:id/bias) listing every hypothesis with its decision and
visualizations. Confirmed shortcuts come with specific mitigations.
Mask Quality
Use it when you’re preparing data for semantic or instance segmentation.
Catches. Missing labels, inconsistent sizing, class imbalance, invalid pixel values.
What to expect. Per‑mask flags with a textual description of what’s wrong and an overlay preview against the original image.
Picking the right engine
I just got a dataset and want a quick health check
Run Mislabel Broad. It catches mislabels, OOD, and low quality
in one pass and gives you the cleanest first impression.
I'm about to train and worry about shortcuts
Run Bias & Shortcut with your extras.csv. Read the confirmed
hypotheses before training; train on the cured branch if any
confirmed shortcut would have leaked into the model.
The dataset comes from the public web
Run Poisoning with aggressive=true for the first pass to see
the worst offenders, then drop sensitivity and run again to
settle on a production threshold.
I'm building a fine‑tune corpus
Run Text Analysis before anything else. Anything CRITICAL
should be redacted with Healing before training.
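For a sense of what offset‑based redaction involves, here is a hedged sketch that replaces CRITICAL findings using their start and end offsets. It illustrates the idea only and does not describe how Healing works. The redact helper, the placeholder text, and the assumption that findings do not overlap are all illustrative.

```python
def redact(text, findings, placeholder="[REDACTED]"):
    """Replace the span of every CRITICAL finding with a placeholder."""
    critical = [f for f in findings if f["severity"] == "CRITICAL"]
    # Work right to left so earlier offsets stay valid after each replacement.
    # Assumption: critical findings do not overlap.
    for f in sorted(critical, key=lambda f: f["start"], reverse=True):
        text = text[:f["start"]] + placeholder + text[f["end"]:]
    return text


doc = "key=abc123 mail bob@x.io"
findings = [
    {"start": 0, "end": 10, "severity": "CRITICAL"},
    {"start": 16, "end": 24, "severity": "MEDIUM"},
]
print(redact(doc, findings))
```

Replacing from the end of the document backwards avoids having to recompute offsets after each substitution.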
