On software project scheduling
The fundamental problem with estimating software projects is, of course, that at some point in the chain of command there is a person who wants two numbers: the date when the software will be ready and the number of dollars the project will cost. That would be relatively OK if the interpretation of these numbers were unambiguous, but unfortunately it is far from that. The person who ultimately decides whether money is invested in a software project wants to hear that the software will be finished with 100% certainty before a certain date (and if it isn't ready, they want compensation) and that, with 100% certainty, the project's cost will stay below a certain figure. This is insane, of course.
(Almost all of the following is stolen from Steve McConnell's excellent Rapid Development.)
Software projects have numerous uncertainty factors (i.e. variability) that make it impossible to estimate one exact figure. Rather, the possible schedules form a probability distribution, something like the following:
            ^
            |
Probability |
of          |
completing  |
exactly on  |
the         |
scheduled   |         _____
date        |       _/     \_
            |      /         \_
            |     /            \_
            |     |              \__
            |    /                  \___
            |    |                      \____
            |    |                           \-------\
            -------------------------------------------------------->
                        Scheduled completion date
So, say, you estimate that your project will be finished on the date marked with an asterisk (*) in the following graph:
            ^
            |
Probability |
of          |
completing  |
exactly on  |
the         |
scheduled   |         _____
date        |       _/     \_
            |      /         \_
            |     /|           \_
            |     ||             \__
            |    / |                \___
            |    | |                    \____
            |    | *                         \-------\
            -------------------------------------------------------->
                        Scheduled completion date
You can deduce the probability of finishing by that date by measuring the area bounded by the vertical line and the part of the curve to its left, and you can see that it isn't too likely that you will finish on schedule.
Now, to make an estimate that has the characteristics mentioned in the first paragraph, you would need to move the schedule point (*) close to the far right end of the curve:
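To make the "area under the curve" idea concrete, here is a minimal sketch. It assumes, purely for illustration, that the project's duration follows a log-normal distribution (a common right-skewed shape that resembles the graph); the median and spread values are made up, not measured from any real project.

```python
import math
from statistics import NormalDist

# Assumption: duration in months is log-normally distributed.
# MEDIAN_MONTHS and SIGMA are hypothetical illustration values.
MEDIAN_MONTHS = 6.0
SIGMA = 0.5  # standard deviation of log(duration)


def prob_done_by(months: float) -> float:
    """Area under the curve to the left of `months`: P(duration <= months)."""
    if months <= 0:
        return 0.0
    z = (math.log(months) - math.log(MEDIAN_MONTHS)) / SIGMA
    return NormalDist().cdf(z)


for deadline in (4, 6, 12):
    print(f"P(done within {deadline} months) = {prob_done_by(deadline):.2f}")
```

A deadline to the left of the hump gets only a small slice of the area, which is the point the graph makes.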
            ^
            |
Probability |
of          |
completing  |
exactly on  |
the         |
scheduled   |         _____
date        |       _/     \_
            |      /         \_
            |     /            \_                     |
            |     |              \__                  |
            |    /                  \___              |
            |    |                      \____         v
            |    |                           \-------\*
            -------------------------------------------------------->
                        Scheduled completion date
But such an estimate doesn't make any sense either. It is very likely that you will finish before the scheduled completion date, which is nice as such, but imagine trying to sell the software project with a bill of 1 million dollars versus a bill of, say, 300 000 dollars. And imagine the dynamics of the project if it's estimated at one year versus four months, both with similar requirements. Most probably the one-year project will produce a better product, but does that justify spending three times more money and three times more time? Not likely.
The most accurate estimate is somewhere around the asterisk in the following graph:
            ^
            |
Probability |
of          |
completing  |
exactly on  |
the         |
scheduled   |         _____
date        |       _/     \_
            |      /        |\_
            |     /         |  \_
            |     |         |    \__
            |    /          |       \___
            |    |          v           \____
            |    |          *                \-------\
            -------------------------------------------------------->
                        Scheduled completion date
But given this estimate (schedule), the project has a 50% chance of being late. This sounds unacceptable to many managers, but in one form or another, they just have to accept it. The fact remains that a software project is far from deterministic.
There are ways to make the curve more "bent", meaning narrower and more sharply peaked:
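The trade-off between the 50/50 estimate and the "100% certain" one can be sketched by asking the reverse question: which deadline gives a chosen confidence level? As before, the log-normal shape and the numbers are assumptions for illustration only.

```python
import math
from statistics import NormalDist

# Hypothetical values: a 4-month median with a fairly wide spread.
MEDIAN_MONTHS = 4.0
SIGMA = 0.5  # standard deviation of log(duration)


def deadline_for(confidence: float) -> float:
    """Deadline in months that is met with probability `confidence` (0 < c < 1)."""
    z = NormalDist().inv_cdf(confidence)
    return MEDIAN_MONTHS * math.exp(SIGMA * z)


for confidence in (0.50, 0.90, 0.99):
    print(f"{confidence:.0%} confidence: {deadline_for(confidence):.1f} months")
```

With these made-up values the 99% deadline comes out at roughly three times the median, and a true 100% deadline does not exist at all, because the tail of the curve never quite reaches zero.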
            ^
            |
Probability |
of          |
completing  |
exactly on  |
the         |
scheduled   |         __
date        |        /  \_
            |       |     \
            |      /       |
            |      |        \__
            |     /            \___
            |     |                \____
            |     |                     \----\
            -------------------------------------------------------->
                        Scheduled completion date
But this takes years of hard work. You need measurement, you need people specializing in estimation, and so on. You need better methodologies, stricter schedule control, more iterative processes, whatever.
Making projects faster, i.e. moving the curve to the left in the graph, is possible too, but it's even harder than making better estimates and reducing the variability of projects.
Having said all that, I need to point out that reality is actually much bleaker than this. The graphs above are nice idealized models of software schedule probability distributions. In reality, you have no idea what your probability distributions are. So you might as well make a wild guess for your schedules.
There are ways to improve estimation too, of course, but they are not within the scope of this essay. Later.
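What a narrower curve buys you can be sketched as the safety margin needed on top of the 50/50 date. This again assumes a log-normal shape, and the sigma values are hypothetical stand-ins for "high", "medium" and "low" variability.

```python
import math
from statistics import NormalDist


def buffer_ratio(sigma: float, confidence: float = 0.90) -> float:
    """Factor by which the median must be multiplied to reach `confidence`."""
    return math.exp(sigma * NormalDist().inv_cdf(confidence))


# Hypothetical spreads: smaller sigma means a narrower curve.
for sigma in (0.6, 0.4, 0.2):
    print(f"sigma={sigma}: 90% deadline = {buffer_ratio(sigma):.2f} x median")
```

The smaller the spread, the closer the "safe" date moves to the 50/50 date, which is why reducing variability makes a schedule easier to sell.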