Yet another place where intuitions about the behavior of distributions, derived from the normal distribution, screw people over.
An explanation I like, from michaelochurch:
Let's say that you have 20 tasks. Each involves rolling a 10-sided die.
If it's a 1 through 8, wait that number of minutes. If it's a 9, wait 15
minutes. If it's a 10, wait an hour.
How long is this string of tasks going to take? Summing the median time
expectancy, we get a sum of 110 minutes, because the median time for a task is
5.5 minutes. The actual expected time to completion is 222 minutes, with 5+
hours not being unreasonable if one rolls a lot of 9's and 10's.
This is an obvious example where summing the median expected time for the
tasks is ridiculous, but it's exactly what people do when they compute time
estimates, even though the reality on the field is that the time-cost
distribution has a lot more weight on the right. (That is, it's more common
for a "6-month" project to take 8 months than 4. In statistics-wonk terms,
the distribution is "log-normal".)
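The dice example above is easy to check with a quick Monte Carlo simulation (a hypothetical sketch, not from the original comment): the sum of per-task medians says 110 minutes, while the simulated mean lands near the true expectation of 222.

```python
import random

# 20 tasks, each a d10 roll: 1-8 means wait that many minutes,
# a 9 means 15 minutes, a 10 means an hour.
WAIT = {**{i: i for i in range(1, 9)}, 9: 15, 10: 60}

def run_project(n_tasks=20):
    return sum(WAIT[random.randint(1, 10)] for _ in range(n_tasks))

random.seed(0)
totals = [run_project() for _ in range(100_000)]
mean_total = sum(totals) / len(totals)

# Sum of per-task medians: median of {1..8, 15, 60} is 5.5, so 20 * 5.5 = 110.
# True expected total: 20 * (1+2+...+8+15+60)/10 = 20 * 11.1 = 222.
print(f"sum of medians: 110, simulated mean: {mean_total:.0f}")
```

The gap between 110 and 222 is exactly the skew the comment is pointing at: the rare 9s and 10s barely move the median but dominate the mean.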
When I put together plans and estimates, I take care to separate the things that are linear from the things that impact the rest of the schedule exponentially, and to note the likely inflection points. I may not know where I'm going to roll a 9 or 10, since they can crop up anywhere, but there are certainly areas where they are more likely and less likely.
In a sane world, at least. Can't do much with a black swan.
In a startup I've worked on, we did estimates based on hours, but also had the "no idea" tasks. We would then make sure to alternate sprints to work on well estimated tasks (as you say, "linear" tasks) and others to work on the hard ones, without a set deadline; or split the team effort in that manner.
I think that's a better strategy than making wild guesses and ultimately falling behind schedule; at the same time it maintains cadence, which buys you the power to sometimes say "hard things are hard, I don't know when it'll be ready to ship".
If, as the article says, "Clients’ Focus on Low Price Is a Major Reason for Effort Overruns", then probably a simpler theoretical explanation can just be the winner's curse.
No need to talk about distributions, gaussian or otherwise.
I think many, perhaps most, poor estimates happen because the initial estimate is viewed as too high for the project: instead of deciding the project isn't worth doing at its estimated cost, people decide the estimate must be wrong, so that expected project cost lines up with expected project value.
Perhaps in a twisted future where we estimate project cost before deciding which projects to take on, we might discover our estimates are much better.
A related pathology is trading technical debt for speed, every time, on every project. The debt will be paid.
Well, exactly. If I understand the paper, the only condition you need in order to arrive at accurate estimates is the absence of pressure to underestimate.
Software effort estimation methods fail because they ignore the margin of error. In mathematics, engineering, and statistics, a result means nothing without its margin of error: one month may be one month if the margin of error is one day, and one month may be one year if the margin of error is one year.
Classic estimation techniques like COCOMO or Albrecht Function Points ignore this fact. They have no mathematical rigor; presented with rigor they would be absurd, because their margin of error is between 100% and 600%. Classic software effort estimation techniques are harmful and dangerous, because by ignoring the margin of error they invite decisions that ignore existing risks.
No automatic method can replace human experience and wisdom. Human estimates have no bounded margin of error either, but at least they do not pretend to hide existing risks.
Many customers/clients don't even want accurate estimates. Given the choice between an accurate estimate of $x, and a competing estimate of $0.75x with later surprises and deadline stress and renegotiations to pay another $0.35x for "phase 2" which gets the product up to what they originally wanted, especially when the business relationship has "bonded" in a way where it's all rah-rah, go-team, we're-in-this-together... clients will go for the latter path way more often than they should.
Part of the reason estimates are inaccurate is because there's that business disincentive to be accurate.
Hofstadter's Law: It always takes longer than you expect,
even when you take into account Hofstadter's Law.
The best advice I ever got on project-time estimation (from a biology postdoc) was: make your best, most honest effort, and then double it.
When I make projections with a spreadsheet, I have a cell that copies my grand total of all costs and call that copy "unforeseen costs". I always hate bidding that high at the start, but the estimate ends up being close to right surprisingly often.
This article says 30% overruns are common, which is within my former boss' +100% bounds.
The other nice thing about doubling your cost estimate is it prevents you from catching the winner's curse and landing an overly-stingy client. Plus if you really can keep costs within your spec for the project, then you win extra profits. You'll never win that "game" if you don't leave room for error.
I think people should start using confidence intervals; then the upper bounds become more realistic. If you estimate with rough 90% confidence intervals, a developer can communicate the uncertainty: I think task A will take about a week. At least 2 days. And no more than 3 months.
You can immediately see that it's probably best to either a) work on this task for a few days and make a new estimate based on the acquired knowledge, or b) if that's not possible, try to split the task into smaller subtasks to identify which parts are the most uncertain.
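One way to act on interval estimates is to combine them by Monte Carlo rather than summing single numbers. A minimal sketch, with made-up task numbers and triangular sampling as a crude stand-in for whatever distribution the tasks really follow:

```python
import random

# Each task: (optimistic, likely, pessimistic) estimate in days.
# The first one is the "about a week, at least 2 days, at most
# 3 months" task from the comment above; the rest are invented.
tasks = [
    (2, 5, 60),
    (1, 3, 10),
    (5, 8, 30),
]

random.seed(1)
totals = sorted(
    sum(random.triangular(lo, hi, mode) for lo, mode, hi in tasks)
    for _ in range(10_000)
)
p5, p50, p95 = (totals[int(len(totals) * q)] for q in (0.05, 0.50, 0.95))
print(f"project total: {p50:.0f} days (90% interval {p5:.0f}-{p95:.0f})")
```

The resulting project-level interval is much wider than the sum of the "likely" values, which is the whole point: the skew of the individual tasks survives aggregation.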
Doubling your estimate is a common rule of thumb that goes way back, but over time, if you keep learning from your previous estimates, you should become more accurate on average. There's another subtle thing going on: many times when the estimate increases, so does the actual time, a self-fulfilling prophecy, hence Hofstadter's law...
I've been giving estimates for a decade, and still feel like I'm winging it every time. Planning poker definitely works, provided you understand the requirements, and by 'understand' I don't mean you have read a spec document, but that you understand the customer's business problem and have figured out how the proposed solution aims to solve it. Sadly, most large projects don't give the team that produces the estimate enough time in the pre-sales phase to build an understanding of the whole problem domain. Paradoxically, this low confidence in the estimate will tempt the sales team to cut it even further, since they interpret uncertainty as license to seek the low bound (or even lower).
As others have pointed out in those large projects it's better not to make up-front estimates and just build as much value as possible for a fixed cost, using agile principles. However, that's typically not how large software projects are sold (or bought). Fixed price almost always means fixed scope. I'd like to know of any large software project sold to a customer in truly agile fashion (no fixed scope determined in advance). To me it sounds like a software development unicorn: you hear about it, but you're never the one building it.
Often a realistic technical schedule is derived and presented, but the business side deems the project cost too high and asks for the schedule to be "optimized." The optimistic scenarios in the schedule are adopted and the schedule is revised. Of course reality sets in when the project goes forward, and it ends up taking as much time as originally predicted.
That's often what I see. An estimate roadmap is presented, management expresses that it wants it sooner, the roadmap is "shuffled" and "optimized", it is approved, yet reality still sets in during development :-)
IMHO it is quite easy to make a reliable estimate of a well-planned project. However, it is extremely difficult to plan the project more than one step ahead of what is already done... This is why agile development is so popular.
In general when under-estimating the project you can make it:
> IMHO it is quite easy to make a reliable estimate of a well-planned project.
Evidence suggests otherwise. Sure, you can estimate +/- 100-200% early on, but that isn't what anyone is aiming for in a software project. Even detailed plans of repeatable (non-trivial) software projects do not result in error bars that anyone really desires.
I don't know that I'd say it's easy, but it is certainly possible. The big takeaway from the article that I agree with is that historical data significantly improves estimates. If you know that e.g., the last 5 projects took an average of x weeks on the authentication layer, then it's likely that your project will take somewhere around the same time.
The problem is that most companies don't record this data. Start today!
My point was that it is the planning step that is extremely difficult, not the estimating one. With most real-world projects the project plan must follow changing requirements (based on external input or on things you have learned during development). It is extremely unlikely that the original plan will (or should) be followed to the end.
"A tendency toward underestimation of effort is particularly present in price-competitive situations, such as bidding rounds. In less price-competitive contexts, such as inhouse software development, there are no such tendencies - in fact, you might even see the opposite. This suggests that a main reason for effort overruns is that clients tend to focus on low price when selecting software providers - that is, the project proposals that underestimate effort are more likely to be started. "
Is that really correct? Are there studies that show that in-house projects (or non-fixed-price projects) do not systematically underestimate, as opposed to fixed-price client projects?
"Six to eight weeks" was the default estimate my project managers gave for anything above trivial. Long enough to make the task seem difficult, not too long to scare off the client.
Double it with each layer. Double your estimate for each individual task, then when you've got the whole Gantt chart built for the iteration, double that.
We have so much technical debt I can no longer estimate with any accuracy. Something that should take half a day takes a week. Most of the week is spent cleaning up all of the crap in the code; then a couple of hours go to writing the few lines of code that solve the business requirement.
That we don't know whether software development is subject to economies or diseconomies of scale stuck out to me.
Estimation cost (the cost of doing the estimation, not the consequences of the estimate) isn't mentioned. Is estimation itself significantly costly relative to the thing being estimated?
To the extent open source works relatively well as a development practice, how much of a role does suppression of estimation play (assuming there is suppression; harder to even pretend to hold anyone to an estimate without a contract, so why bother)?
The problem with estimates is that once there is an estimate, the team can always stick to it, regardless of quality.
It is feasible for the team to claim that it met the estimate, and to have all indicators green on the day the deadline arrives: simply do less design, less refactoring, less thinking, fewer tests, less collaborative work, less engineering...
Can Machine Learning (or NLP) ever help in estimating effort based on expected lines of code where the model would be trained upon similar applications/files that already exist? If so, is anyone researching this in academia or in any research lab?
> An implication of these observations is that clients can avoid effort overrun by being less price - and more competence - focused when selecting providers.
I never understood why software estimates are so bad when other fields do so well.
When a contractor gives you an estimate of how long and how much it's going to take to do a remodel, he's invariably on time and on schedule, right?
And when Boeing spends billions of dollars on a new plane, it's ready on time and on budget, right?
So why can't software people do the same?
Oh wait, complex, badly defined projects tend to run late and over budget. It's not that complicated. Spend 3-6 months defining all the details of your new web application, promise not to change anything on the fly, don't ask us to make it work on IE 7, and by the time we do 3-4 of these, we'll be able to give you a good estimate.