Consequence Scoring Worksheet

Not all AI errors are equal. Map failure modes to business consequences and prioritize what to test first.

Worksheet | For: PM, Risk, Eng | Est. time: 45 min

How Consequence Scoring Works

Traditional accuracy metrics treat all errors equally: a misspelling and a compliance violation both count as one error. Consequence scoring fixes this by weighting each error by its real-world impact.

Formula: Consequence Score = Σ (Error Severity × Frequency × Detection Difficulty), summed over all failure modes. Each term in the sum is the Risk Score of one failure mode.
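
The formula can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names are hypothetical, and it assumes each factor is an integer from 1 to 5, with a higher detection-difficulty score meaning the error is harder to catch.

```python
def risk_score(severity: int, frequency: int, detection_difficulty: int) -> int:
    """Risk contribution of one failure mode: the product of its three scores."""
    factors = {
        "severity": severity,
        "frequency": frequency,
        "detection_difficulty": detection_difficulty,
    }
    for name, value in factors.items():
        # Assumption: every factor uses the same 1-5 scale.
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {value}")
    return severity * frequency * detection_difficulty


def consequence_score(failure_modes) -> int:
    """Sum the per-failure-mode risk scores (the sigma in the formula)."""
    return sum(risk_score(*scores) for scores in failure_modes)


# Two illustrative failure modes as (severity, frequency, detection difficulty).
modes = [(5, 3, 4), (1, 5, 1)]
print(risk_score(5, 3, 4))       # 60
print(consequence_score(modes))  # 65
```

Because the factors are multiplied, a failure mode that is severe but rare and easy to detect can still score lower than a moderate one that is frequent and hard to spot.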

Step 1: Define Severity Scale

Score	Severity	Description	Examples
5	Critical	Legal liability, safety risk, regulatory violation	Wrong medical advice, GDPR violation, financial misstatement
4	High	Revenue loss, customer churn, brand damage	Wrong pricing, offensive response, data leak via hallucination
3	Medium	Bad user experience, support ticket generated	Wrong feature described, outdated info, confusing answer
2	Low	Minor friction, user can self-correct	Slightly verbose, suboptimal formatting, minor imprecision
1	Cosmetic	No real impact, polish issue	Typo, awkward phrasing, slightly wrong emoji
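
If scores are recorded in code rather than a spreadsheet, the scale above could be captured as an enumeration so that severity levels are named instead of typed as bare numbers. A small sketch, assuming Python and a hypothetical Severity type (the member names simply mirror the table):

```python
from enum import IntEnum


class Severity(IntEnum):
    """Severity levels from the scale above; the value is the score."""
    COSMETIC = 1   # No real impact, polish issue
    LOW = 2        # Minor friction, user can self-correct
    MEDIUM = 3     # Bad user experience, support ticket generated
    HIGH = 4       # Revenue loss, customer churn, brand damage
    CRITICAL = 5   # Legal liability, safety risk, regulatory violation


# IntEnum members behave like integers, so they can be multiplied directly.
print(Severity.CRITICAL * 3 * 4)  # 60
print(Severity(4).name)           # HIGH
```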

Step 2: Inventory Failure Modes

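List every way your AI system can fail. Score each from 1 to 5 on severity, frequency, and detection difficulty (higher means more severe, more frequent, or harder to detect), then multiply the three scores to get the Risk Score.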

Failure Mode	Category	Severity (1-5)	Frequency (1-5)	Detection Difficulty (1-5)	Risk Score	Priority
Hallucinated policy details	Accuracy	5	3	4	60	P0
Compliance with injected prompts	Safety	5	2	3	30	P1
Wrong product recommendation	Relevance	3	4	3	36	P1
Outdated pricing info	Freshness	4	2	2	16	P2
Verbose answers	Tone	1	5	1	5	P3
[Add failure mode]					—	—
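
The inventory can also be kept as structured data so that Risk Scores are computed rather than typed by hand. The sketch below assumes a hypothetical FailureMode record whose field names mirror the table columns; the rows are the five example entries from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    name: str
    category: str
    severity: int    # 1-5
    frequency: int   # 1-5
    detection: int   # 1-5, detection difficulty

    @property
    def risk_score(self) -> int:
        # Product of the three scores, as in the table's Risk Score column.
        return self.severity * self.frequency * self.detection


inventory = [
    FailureMode("Hallucinated policy details", "Accuracy", 5, 3, 4),
    FailureMode("Prompt injection compliance", "Safety", 5, 2, 3),
    FailureMode("Wrong product recommendation", "Relevance", 3, 4, 3),
    FailureMode("Outdated pricing info", "Freshness", 4, 2, 2),
    FailureMode("Verbose answers", "Tone", 1, 5, 1),
]

# Highest risk first, so the top of the list is what to test first.
ranked = sorted(inventory, key=lambda mode: mode.risk_score, reverse=True)
for mode in ranked:
    print(f"{mode.risk_score:>3}  {mode.name}")
```

Sorting by computed score keeps the ranking in step with the inputs whenever a severity, frequency, or detection value is revised.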

Step 3: Priority Matrix

Priority	Risk Score	Action	Timeline
P0 — Block release	≥ 50	Must have eval coverage. Zero tolerance for failure.	Before any release
P1 — Fix this sprint	25 – 49	Eval coverage required. Monitor closely.	Within 1 sprint
P2 — Track & plan	10 – 24	Add to golden set. Include in quarterly review.	Within 1 quarter
P3 — Backlog	< 10	Log for awareness. Evaluate if resources allow.	Best effort
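
The thresholds in the matrix translate directly into a lookup. A minimal sketch, assuming integer Risk Scores and a hypothetical function name:

```python
def priority(risk_score: int) -> str:
    """Map a Risk Score to a priority band using the matrix thresholds."""
    if risk_score >= 50:
        return "P0"  # Block release
    if risk_score >= 25:
        return "P1"  # Fix this sprint
    if risk_score >= 10:
        return "P2"  # Track and plan
    return "P3"      # Backlog


# Risk Scores from the example inventory in Step 2.
for score in (60, 30, 36, 16, 5):
    print(score, priority(score))
```

Checking from the highest threshold down means each score falls into exactly one band, including the boundary values 50, 25, and 10.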