Beacon Hill Lab · a tool for delivery folks

The AI Delivery Monte Carlo

If you're new to delivering AI work, here's the thing nobody puts on the planning slide: depending on which study you read, somewhere between 70% and 85% of enterprise AI projects fail to deliver the value they promised. About half make it from pilot to production. Of the ones that do ship, a meaningful chunk run two or three times longer than planned.

That's not because the teams are bad. It's because AI delivery has a long tail most plans don't model. Data access stalls in procurement. Eval loops don't converge on the timeline you picked. Adoption flatlines after launch. Each phase looks reasonable in isolation, then variance compounds and the date quietly slips.

This tool runs your plan ten thousand times so you can see that long tail before you commit to a date. What's the actual probability you hit it? Plug in your numbers and find out.
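At its core, a run like this is simple: sample each phase from a triangular(best, likely, worst) distribution, sum the draws, repeat ten thousand times, and count how often you land inside the target. A minimal sketch, with made-up phase names and numbers rather than the tool's defaults:

```python
import random

phases = {                 # (best, likely, worst) in weeks -- illustrative only
    "data access": (2, 4, 12),
    "modeling":    (3, 5, 10),
    "evals":       (2, 4, 14),
}
target_weeks = 20
runs = 10_000

hits = 0
for _ in range(runs):
    # random.triangular takes (low, high, mode) -- note the argument order
    total = sum(random.triangular(lo, hi, mode) for lo, mode, hi in phases.values())
    if total <= target_weeks:
        hits += 1

print(f"P(done in {target_weeks} weeks) \u2248 {hits / runs:.0%}")
```

Notice that the sum of three "likely" values is 13 weeks, yet the probability of finishing by week 13 is well under half: the worst cases sit much further from the mode than the best cases, so the tail drags the total right.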

On the numbers above: Gartner has reported AI project failure rates in the 70–85% range across multiple recent surveys. RAND's 2024 study on AI project failure put the figure around 80%, roughly twice the rate of conventional IT projects. Standish CHAOS data shows IT projects more broadly run 40–60% over schedule when they slip. Treat these as directional, not gospel — methodology varies and the field is moving.
Mode
Display unit
Phase coupling
ρ = 0.30 · loose coupling — typical project
independent · realistic default · tightly coupled

Most simulators pretend phases fail independently. They don't. When data access slips, eval slips with it. When framing drifts, adoption gets harder. This slider lets you tell the model how much your phases move together. 0.3 is a sensible starting point for most AI work — slide it around and watch how the result changes.
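One standard way to model that co-movement is a Gaussian copula: draw one shared "bad project" factor plus an independent factor per phase, mix them with weight ρ, push each through the normal CDF to get correlated uniforms, then invert the triangular CDF. This is a sketch of the approach, not necessarily this tool's exact implementation, and the phase numbers are illustrative:

```python
import math
import random

def inv_triangular(u, low, mode, high):
    """Inverse CDF of a triangular(low, mode, high) distribution."""
    cut = (mode - low) / (high - low)
    if u < cut:
        return low + math.sqrt(u * (high - low) * (mode - low))
    return high - math.sqrt((1 - u) * (high - low) * (high - mode))

def sample_total(phases, rho):
    """One correlated draw of total weeks across all phases."""
    shared = random.gauss(0, 1)                     # common factor all phases feel
    total = 0.0
    for low, mode, high in phases:
        z = math.sqrt(rho) * shared + math.sqrt(1 - rho) * random.gauss(0, 1)
        u = 0.5 * (1 + math.erf(z / math.sqrt(2)))  # normal CDF -> uniform in (0,1)
        total += inv_triangular(u, low, mode, high)
    return total

phases = [(2, 4, 12), (3, 5, 10), (2, 4, 14)]       # (best, likely, worst), made up
totals = sorted(sample_total(phases, rho=0.3) for _ in range(10_000))
print(f"p50 \u2248 {totals[5_000]:.1f} wk, p90 \u2248 {totals[9_000]:.1f} wk")
```

At ρ = 0 this collapses to independent phases; at ρ = 1 every phase has the same luck, so bad weeks stack instead of averaging out, and the p90 moves noticeably even though each phase's own distribution is unchanged.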

01 — Phase estimates (weeks)

Pre-launch
5 phases
Phase · Best · Likely · Worst
'Good enough' tends to drift mid-project.
Externally blocked. Variance lives here.
Predictable once access is in hand.
The phase that plans usually estimate well.
Non-linear. 'One more eval round' is the killer.
Post-launch
2 phases · usually under-planned
Phase · Best · Likely · Worst
Where most AI projects actually fail.
The lifecycle starts at go-live; it doesn't end there.
[Result histogram · weeks from start]
Plug in your numbers above and hit run.