What does a real RICE backlog scoring session look like?

A team session takes 90–120 minutes for a backlog of 10–15 items. Pre-work: the PM drafts initial scores for each item; the team reviews async and challenges any input where they disagree. The session itself focuses on disputed inputs — usually Confidence and Impact — and arrives at consensus scores. The output is a ranked backlog with explicit reasoning for each input, which makes the ranking defensible to stakeholders later.

What were the surprising losers in this backlog?

The most visible loser was multi-workspace support, the founder's favorite: high Impact, low Reach (about 200 enterprise accounts), 80% Confidence, very high Effort (6 engineer-months). Its RICE score of 80 ranked last of the 12 candidates. The team had been assuming it was 'must-ship' because the request came from the top, but the math said the engineering capacity was better spent on smaller-scoped features reaching most of the 1,800 active accounts.

Why is Confidence often the most-disputed RICE input?

Confidence is the only RICE input without a direct unit. Reach has users-per-quarter. Impact has the 0.25–3 scale. Effort has person-months. Confidence is a percentage representing 'how sure are we about Reach and Impact?' — and teams routinely default to 80% to avoid the harder conversation. Disputed Confidence usually means someone in the room has evidence that should change the Reach or Impact estimates, which is the conversation actually worth having.

Should the highest-RICE item always ship first?

Not automatically. RICE produces a starting point, not a verdict. Override the ranking when dependency chains, strategic alignment, or capacity constraints justify it — but document the override reason. The discipline is treating RICE as the default and requiring an explicit reason to deviate, rather than treating it as advisory and falling back to gut feel. Without that discipline, the framework becomes a ritual rather than a forcing function.

RICE prioritization on a real SaaS team's Q3 backlog

A B2B SaaS team was running Q3 planning with 12 candidate features on the backlog and capacity for roughly 5. The team had been arguing about three favorites for two weeks. They ran RICE on a Tuesday afternoon; by 5pm the ranking was settled, and Thursday's planning meeting took 20 minutes instead of two hours.

If you need the formula and scoring rules before reading this example, see RICE score calculator: the formula with 3 worked examples.

Setup

Product: workflow automation for mid-market customer success teams
Active accounts: 1,800
Team capacity: ~12 engineer-months for Q3 features (after carrying ops and bugs)
Scoring scale: Reach = customers affected/quarter; Impact = 0.25/0.5/1/2/3; Confidence = %; Effort = engineer-months
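
For reference, the scores in this example are consistent with the standard RICE formula applied to the scale above. The block below is only a restatement of that formula, with the bulk-edit inputs from the table that follows used as a worked instance; Confidence is written as a fraction (80% = 0.80).

```latex
\[
\text{RICE} = \frac{\text{Reach} \times \text{Impact} \times \text{Confidence}}{\text{Effort}}
\qquad\text{e.g.}\qquad
\frac{1400 \times 1 \times 0.80}{1.5} \approx 747
\]
```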

The candidates and scores

Feature	Reach (customers/quarter)	Impact	Confidence	Effort (eng-months)	RICE
Bulk-edit account fields	1,400	1	80%	1.5	747
Slack notification routing	900	2	70%	2	630
Custom dashboard widgets	600	2	60%	3	240
New onboarding wizard for trial users	1,200	0.5	90%	2	270
Salesforce 2-way sync v2	400	3	60%	4	180
Mobile push notifications	1,800	0.5	70%	1.5	420
Account merge tool	300	1	90%	1	270
Inline AI summaries on accounts	1,800	1	40%	2.5	288
Multi-workspace support	200	3	80%	6	80
Auto-renew reminder emails	1,600	0.5	90%	0.5	1,440
Custom report builder	500	2	50%	4	125
Real-time activity feed	1,800	1	60%	3	360

Total effort if everything shipped: 31 engineer-months. Available: 12.
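
The table can be rechecked in a few lines. The sketch below is not the team's tooling; it assumes the standard RICE formula, copies the rows above into a hypothetical `BACKLOG` list, and reports any row whose recomputed score (rounded to a whole number) differs from the published one, along with the total effort.

```python
# Rows copied from the table: (feature, reach, impact, confidence, effort, published RICE).
# Confidence is stored as a fraction; effort is in engineer-months.
BACKLOG = [
    ("Bulk-edit account fields", 1400, 1, 0.80, 1.5, 747),
    ("Slack notification routing", 900, 2, 0.70, 2, 630),
    ("Custom dashboard widgets", 600, 2, 0.60, 3, 240),
    ("New onboarding wizard for trial users", 1200, 0.5, 0.90, 2, 270),
    ("Salesforce 2-way sync v2", 400, 3, 0.60, 4, 180),
    ("Mobile push notifications", 1800, 0.5, 0.70, 1.5, 420),
    ("Account merge tool", 300, 1, 0.90, 1, 270),
    ("Inline AI summaries on accounts", 1800, 1, 0.40, 2.5, 288),
    ("Multi-workspace support", 200, 3, 0.80, 6, 80),
    ("Auto-renew reminder emails", 1600, 0.5, 0.90, 0.5, 1440),
    ("Custom report builder", 500, 2, 0.50, 4, 125),
    ("Real-time activity feed", 1800, 1, 0.60, 3, 360),
]


def rice(reach, impact, confidence, effort):
    """Standard RICE: (Reach x Impact x Confidence) / Effort."""
    return reach * impact * confidence / effort


def check_backlog(rows):
    """Return the features whose recomputed, rounded score differs from the published one."""
    return [
        name
        for name, reach, impact, confidence, effort, published in rows
        if round(rice(reach, impact, confidence, effort)) != published
    ]


mismatches = check_backlog(BACKLOG)
total_effort = sum(row[4] for row in BACKLOG)
print("mismatched rows:", mismatches)
print("total effort (engineer-months):", total_effort)
```

An empty mismatch list means every published score follows from its four inputs, which is a cheap sanity check to run before anyone starts debating the ranking.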

The ranked top 5

By RICE:

1. Auto-renew reminder emails: 1,440 (the surprise winner)
2. Bulk-edit account fields: 747
3. Slack notification routing: 630
4. Mobile push notifications: 420
5. Real-time activity feed: 360

Total effort for top 5: 8.5 engineer-months. Fits with room to spare.
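
The ranking and the capacity check are a sort and a sum. A minimal sketch, assuming the scores and efforts from the table; `SCORED`, `CAPACITY`, and `top_n` are illustrative names, not part of any tool mentioned here.

```python
CAPACITY = 12  # engineer-months available for Q3 features

# (feature, RICE score, effort in engineer-months), taken from the scored table
SCORED = [
    ("Bulk-edit account fields", 747, 1.5),
    ("Slack notification routing", 630, 2),
    ("Custom dashboard widgets", 240, 3),
    ("New onboarding wizard for trial users", 270, 2),
    ("Salesforce 2-way sync v2", 180, 4),
    ("Mobile push notifications", 420, 1.5),
    ("Account merge tool", 270, 1),
    ("Inline AI summaries on accounts", 288, 2.5),
    ("Multi-workspace support", 80, 6),
    ("Auto-renew reminder emails", 1440, 0.5),
    ("Custom report builder", 125, 4),
    ("Real-time activity feed", 360, 3),
]


def top_n(scored, n):
    """Return the n highest-scoring rows, highest first."""
    return sorted(scored, key=lambda row: row[1], reverse=True)[:n]


top5 = top_n(SCORED, 5)
effort_used = sum(effort for _, _, effort in top5)

for rank, (name, score, effort) in enumerate(top5, start=1):
    print(f"{rank}. {name}: {score} ({effort} eng-months)")
print("effort used:", effort_used, "| remaining:", CAPACITY - effort_used)
```

With these inputs the top 5 use 8.5 of the 12 engineer-months, leaving 3.5 unallocated.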

What was surprising

Auto-renew reminders almost didn't make the candidate list. A product manager had floated it casually; nobody pushed hard for it. When the team computed RICE honestly — affects almost every customer, modest per-customer impact, very high confidence (we know exactly what this does), and tiny effort (0.5 eng-month) — it landed at 1,440. The Effort being so low and Reach so high produced the disproportionate score the formula was designed to surface.

Multi-workspace support, the founder's favorite, ranked dead last. Reach was small (only enterprise customers needed it), Effort was huge (6 eng-months), and the resulting RICE score of 80 was honest. The team would have shipped it on intuition; the scoring forced an explicit conversation about whether enterprise-tier expansion justified the trade. They concluded: not this quarter. Revisit when 3+ enterprise prospects ask for it.

Inline AI summaries had high Reach × Impact (1,800 × 1) but low Confidence (40%) because the team had no evidence summaries would actually be read. The Confidence multiplier cut the numerator from 1,800 to 720, so the score came out at 288 rather than the 720 it would have earned at full confidence. The penalty for speculation worked as intended.
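
To see how much the Confidence term alone moves this item, it helps to sweep it while holding the other inputs fixed. The sketch below assumes the standard RICE formula and the table's inputs for Inline AI summaries; `CUT_LINE` is an illustrative name for the score of the fifth-ranked item (360).

```python
def rice(reach, impact, confidence, effort):
    """Standard RICE: (Reach x Impact x Confidence) / Effort."""
    return reach * impact * confidence / effort


# Inline AI summaries, from the table
REACH, IMPACT, EFFORT = 1800, 1, 2.5
CUT_LINE = 360  # score of the fifth-ranked item (Real-time activity feed)

# Score at several Confidence levels, everything else held fixed
for confidence in (0.40, 0.60, 0.80, 1.00):
    score = rice(REACH, IMPACT, confidence, EFFORT)
    print(f"{confidence:.0%} confidence -> RICE {score:.0f}")

# Confidence at which this item would tie the fifth-ranked score
break_even = CUT_LINE * EFFORT / (REACH * IMPACT)
print(f"break-even confidence: {break_even:.0%}")
```

The sweep gives 288, 432, 576, and 720, and the break-even lands at 50%. On these numbers, evidence that moved Confidence from 40% to just above 50% would be enough to put the feature into the top 5.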

The "we cheated" check

After the scoring, the lead asked the question that prevents most RICE abuse: "are any of these scores driven by what we wanted the answer to be?"

One score had to be revised back. An engineer had raised the Confidence on Salesforce sync v2 from 60% to 80% in an attempt to push the score up. The team caught it because Salesforce sync v2 had failed user testing twice before, so 60% was the right number. With Confidence restored to 60%, Salesforce sync stayed out of the top 5.
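
One way to make this check routine is to log every Confidence change alongside the evidence cited for it, then flag upward revisions that cite none. This is a hypothetical sketch, not something the team is described as using: the `ConfidenceRevision` record, the `needs_challenge` rule, and the empty `evidence` string for the Salesforce change are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class ConfidenceRevision:
    feature: str
    old: float      # Confidence before the change, as a fraction
    new: float      # Confidence after the change, as a fraction
    evidence: str   # what justified the change; empty if nothing was cited


def needs_challenge(rev: ConfidenceRevision) -> bool:
    """Flag upward revisions that cite no supporting evidence."""
    return rev.new > rev.old and not rev.evidence.strip()


def rice(reach, impact, confidence, effort):
    """Standard RICE: (Reach x Impact x Confidence) / Effort."""
    return reach * impact * confidence / effort


# Assumed log entry: the 60% -> 80% change, with no evidence recorded
revisions = [
    ConfidenceRevision("Salesforce 2-way sync v2", old=0.60, new=0.80, evidence=""),
]
flagged = [rev.feature for rev in revisions if needs_challenge(rev)]
print("challenge these:", flagged)

# Effect of the change on the score (Reach 400, Impact 3, Effort 4 from the table)
honest = rice(400, 3, 0.60, 4)
inflated = rice(400, 3, 0.80, 4)
print(f"RICE at 60%: {honest:.0f}, at 80%: {inflated:.0f}")
```

On the table's inputs the change moves the score from 180 to 240, still below the fifth-ranked 360, so in this case the inflation would not have changed the top 5 even if it had gone unnoticed.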

What this teaches

The formula's denominator (Effort) is what produces non-obvious winners. A small Effort with even modest other inputs jumps to the top. Most teams under-weight this.
Confidence is the term that penalizes wishful thinking. Without honest confidence numbers, the score becomes "what we hope is true × what we wish it cost".
Surprise winners and surprise losers are signals. When RICE produces a ranking that conflicts with team intuition, both deserve a serious look. Sometimes the intuition was right and the scoring is off; more often the scoring is honest and the intuition was anchored on irrelevant factors.

Run your own

The full method is in the Academy guide. Open the framework page for the catalog entry, or start a canvas to score your own backlog.
