Estimation Ramblings

Disclaimer

Some of what I say here is probably original, based on my recent experience of estimating and much earlier dabblings with things like COCOMO II and Wideband Delphi. Some of this is probably unwittingly plagiarised from Steve McConnell’s book Rapid Development (which is well worth a read for techies who are starting to get more involved with team leading and management aspects of the job).

A further disclaimer for any mistakes on the statistics side of things (it’s over 25 years since my A-levels, which was the last time I looked at stats in anger…)

The Fallacy of “The Estimate”

One of the things I think we’re guilty of as an industry is treating the estimation process as a means of getting to a single, presumably correct value. Whether it be a cost, a number of days, story points, whatever – the temptation is always to come up with “The Estimate”. This isn’t how the real world works, and probably isn’t how we should be modelling things.

By providing “The Estimate” you are almost immediately embarking on a lose-lose proposition. If your estimate is too high you risk failing to sell in the first place or a perceived revenue under-spend; too low and the team will be facing a struggle to deliver “on time”. Both of these have reputational and commercial impact.

“The Estimate” implies a level of precision that is almost certainly misleading. If my estimation crystal ball comes up with a figure of (say) 2,239.5 developer-days, then this implies a whopping 5 significant figures of estimation confidence which is almost certainly false confidence.

So, what’s the answer?

If we assert that coming up with an estimate isn’t a good idea, what should we do? We need to know how big a team is needed, the client needs to know how much budget to seek, and there’s a point at which this impacts overall feasibility as the cost/benefit trade-off is no longer justified.

Model a Range, not a Point

The first part of the answer is to understand that when we estimate we should be providing a range of values, not a single value. There is a lower bound below which the project scope will not be delivered, irrespective of how skilled the team is and how lucky we are with assumptions, risks and dependencies falling in our favour. Similarly, there is usually an upper bound which it would be unreasonable to expect delivery to exceed. Our estimate lives somewhere in the range between these points.

Model Uncertainty

Once we accept that estimates are inherently about a range rather than a point, we instantly gain another tool to work with. The probability of successful delivery will likely follow some sort of bell curve between these bounds. An estimation performed with a high level of confidence will have a tall, narrow curve between the bounding points, whereas an estimation with a low level of confidence will have a wide, short curve. I’ve seen some early-days, low-confidence estimates produced with ambiguous requirements where the upper bound is a factor of 5 or more times the lower bound (and this is great, because it accurately reflects our uncertainty and shows us the areas that need to be investigated and refined to increase our confidence).
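
As a purely illustrative sketch of what this means in practice, we can turn a lower/upper bound pair into a distribution and see the narrow-vs-wide shape fall out. The snippet below assumes Python with scipy, and assumes (just for illustration) that the two bounds sit at roughly the 5th and 95th percentiles of a normal curve; all the figures are invented.

    # Illustrative sketch only: turn a [lower, upper] estimate range into a
    # distribution. Assumes the bounds sit at roughly the 5th and 95th
    # percentiles of a normal curve; all numbers are invented.
    from scipy.stats import norm

    def estimate_distribution(lower, upper, coverage=0.90):
        """Normal distribution whose central `coverage` interval spans [lower, upper]."""
        z = norm.ppf(0.5 + coverage / 2)      # ~1.645 for 90% coverage
        mean = (lower + upper) / 2
        sigma = (upper - lower) / (2 * z)
        return norm(loc=mean, scale=sigma)

    high_confidence = estimate_distribution(900, 1100)   # tall, narrow curve
    low_confidence = estimate_distribution(500, 2500)    # wide, short curve (5x spread)

    print(high_confidence.std())   # ~61 (developer-days, say)
    print(low_confidence.std())    # ~608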

Pricing and Risk as a Statistical Exercise

Given that we’re likely to have to provide our estimates in terms other than a probability function, how do we reconcile this? Probably the easiest way is to think in terms of standard deviations and appetite for risk. We’re unlikely to want to price something at the mid point (a 50% chance of success), but assuming this follows a normal distribution we might price at one or two standard deviations above the mean – giving roughly an 84% or 98% chance respectively of delivering within the priced figure.
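
To make that concrete, here is a minimal sketch (Python with scipy assumed; the mean and standard deviation are invented figures, not from any real project) of reading price points off such a curve, and of how narrowing the curve lowers the price at the same level of confidence:

    # Sketch only: the "price point" for a chosen appetite for risk is just a
    # point some number of standard deviations above the mean. Invented figures.
    from scipy.stats import norm

    mean, sigma = 1000, 200                  # developer-days, illustrative only
    print(mean + 1 * sigma, norm.cdf(1))     # 1200 days, ~84% chance of coming in at or under
    print(mean + 2 * sigma, norm.cdf(2))     # 1400 days, ~98% chance

    # Reducing uncertainty (a smaller sigma) lowers the advertised price at the
    # same confidence level, without trimming scope or squeezing estimates.
    sigma = 120
    print(mean + 2 * sigma)                  # 1240 days at the same ~98% confidence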

In doing this, our estimation process assumes a couple of nice characteristics:

  • Whereas in the traditional view the motivations of sales and delivery are pulling in opposite directions to one another, in this case both groups are motivated by the need to reduce uncertainty. By reducing the uncertainty both the one-sigma and two-sigma cost points on the curve are reduced. Thus, by reducing uncertainty we can reduce the advertised cost of the project without trimming scope or arbitrarily squeezing estimates.
  • Pricing of a project is now a trade-off between advertised cost and delivery risk. Providing we have a plan for mitigating this risk we might reasonably price something more competitively up front without entering a "too much fat" vs. "not enough contingency" debate.

Increasing Confidence

Off the back of this, the emphasis of our estimation process shifts from downwards pressure on estimates to downwards pressure on uncertainty. There are a handful of techniques we can apply to assist with this:

Use Uncertainty to Drive Feasibility/Discovery Activities

We should explicitly dedicate a period of our Feasibility or Discovery phases to hunting down areas of low-confidence estimation. This provides us with a number of tangible benefits:

  • It flushes out technical or delivery risks by pulling investigation of unknown aspects to the left.
  • It reduces the number of low-confidence areas.
  • It increases the overall level of confidence.
  • It provides a basis of legitimacy for poking at thorny problem areas that the client might otherwise try to arm-wave away.

Split High and Low

Usually when estimating (especially at the earlier stages of a project) there is a bunch of stuff that the customer knows well, and a bunch of stuff they are less knowledgeable about (legacy black boxes, external systems, blue-sky requirements). The usual approach is to disregard this separation, but by combining high-confidence and low-confidence areas into a single estimate we simply end up with a muddied low-confidence estimate. Conversely, by presenting areas of high confidence and low confidence as separate estimates we can provide a better basis for costing and also provide an improved focus for subsequent activities (e.g. by pulling forwards analysis or deferring uncertain scope to later phases).
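
A rough illustration of the “muddying” effect (plain Python, invented numbers; it assumes the two chunks of work are independent and roughly normal, so their uncertainties combine in quadrature):

    # Illustrative only: a well-understood piece of work combined with a
    # poorly-understood one. The combined estimate inherits almost all of the
    # uncertainty of the unknown part.
    import math

    known_mean, known_sigma = 400, 20        # well-understood scope
    unknown_mean, unknown_sigma = 400, 200   # legacy black box / blue-sky scope

    combined_mean = known_mean + unknown_mean
    combined_sigma = math.sqrt(known_sigma ** 2 + unknown_sigma ** 2)

    print(combined_mean, round(combined_sigma))   # 800, ~201: the whole estimate now looks low-confidence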

Harmful Multipliers

One of the common certainty-killers during estimation is the need to provide an estimate based upon a multiplication of two or more variables, especially where we’re having to estimate for an unknown (or arbitrarily large) number of unknown (or arbitrarily large) things. This seems especially prevalent in areas like MI (management information) reporting, where a client often doesn’t know what reports they have, which of those are actually needed or used, and how complex they are.

The trick here is to bound one (or more) of the dimensions with reasonable, documented assumptions so that we can present an estimate that isn’t simply the worst case multiplied by the worst case. By bounding the quantity dimension and providing a price-per-thing estimate (price-per-report, price-per-location, etc.) rather than a single total, we remove the temptation to quote an arbitrarily large number and we also allow the customer to think about phasing and the MVP. They might not need “all reports”, “all stores”, etc. up front, especially if this significantly increases the cost of their initial roll-out.
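
As a sketch of what that might look like (the report categories, effort figures and assumed mix below are all invented, purely to show the shape of the idea):

    # Sketch: rather than quoting worst-case count x worst-case effort, bound
    # the quantity with a documented assumption and present a price-per-report
    # instead of a single total. All numbers are invented for illustration.
    price_per_report_days = {"simple": 2, "medium": 5, "complex": 12}

    # Documented assumption: the initial phase covers at most 40 reports,
    # with an assumed complexity mix rather than "everything is complex".
    assumed_phase_one_mix = {"simple": 20, "medium": 15, "complex": 5}

    phase_one_days = sum(price_per_report_days[k] * n
                         for k, n in assumed_phase_one_mix.items())

    print(price_per_report_days)   # the per-thing price the customer can reason about
    print(phase_one_days)          # 175 days for the bounded initial scope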

In A Nutshell

  • Think ranges, not points
  • Avoid false high-precision representations
  • Embrace and be transparent around uncertainty
  • Higher confidence means less contingency is required; increasing confidence is a win-win scenario
  • Focus Discovery activities on low-confidence aspects
  • Strive for convergence of “deliverable” and “sellable” ranges
  • Provide separate estimates for “knowns” vs. “unknowns”
  • Break down and restate arbitrarily large numbers
  • Seek the MVP by deferring aspects that aren’t known to be required.

