Introduction to Critical Chain Project Management
Robert C. Newbold, ProChain Solutions, Inc. 
Introduction
We are all aware that virtually every business sector has become
more and more competitive in recent years. There is cutthroat
competition both at home and abroad, and the need for improvement
embraces virtually every aspect of business. The popularity of
downsizing, rightsizing and re-engineering attests to the need for
change. There can be no question that this need will grow even
stronger into the next century. We need more than one-time fixes;
there is a clear need for processes for ongoing improvement,
processes that can enable major leaps in performance.
This paper explores such a process, which is derived from an
improvement methodology called "Theory of Constraints" (TOC). TOC
consists of a number of common-sense tools and processes. These
tools allow us to focus efforts on those few areas, called
"constraints," which restrict our ability to improve. These
constraints are the leverage points towards which successful
improvement efforts must be directed.
The TOC concepts are well established in manufacturing; see for
example (Noreen, Smith and Mackey 1995). Application of TOC to
project management is relatively new, but initial results are very
promising. Completion times have been dramatically shrunk for
defense R&D contracts, aircraft repair, new product development
and various types of construction.
The Goal
What is a process for ongoing improvement? Intuitively it sounds
like a useful idea, but we need to be very clear about what we want.
A "process" is a systematic series of actions. "Ongoing" means the
process can be repeated over and over. We want a process that we can
use more than once; otherwise we’ll have to spend all our time
looking for the next bandage and hoping it works.
What is an improvement? In order to know this, we first need to
understand the goal of the organization we’re talking about. For
what reason was it created? We could define the goal of the U.S.
military to be "defense readiness." It needs to be as prepared as
possible to protect U.S. interests, primarily (let’s say) as a
response to aggression. The goal of a school district might be to
educate children who can live up to their potential to contribute to
society. The goal of a public company is better bottom-line results,
now and in the future. People invest in it in order to make
money.
In determining an organization’s performance there are two
important, fundamental measurements: what goes into the company to
allow it to produce, and what it produces. Since we are measuring
over time, these measurements must be rates. The rate of input is
called "operating expense;" the rate of output is called
"throughput" (Goldratt 1992, 60-61). Some examples:
| Type of Organization | Throughput | Operating Expense |
| U.S. military | "Defense units" | Federal tax dollars |
| School district | "Potential achieved" | Local tax dollars |
| Public company | Money made (through sales) | Expenses |
EXHIBIT 1. FUNDAMENTAL MEASUREMENTS
It is not easy to measure throughput of not-for-profit
organizations precisely. However, many counter-productive actions
and attitudes can be avoided just by a broad understanding of the
organization’s throughput, by asking "What are people supposed to be
achieving?" We will simplify the discussion by confining ourselves
to organizations whose goal is to make money, now and in the future.
In for-profit organizations, "throughput" is the rate at which
the organization generates money through sales. Usually the costs
that directly depend on an individual product (also known as truly
variable costs), such as raw materials and contractor prices, are
subtracted from sales in order to get a value for throughput
(Goldratt 1990, 19-20). Operating expense is the rate at which money
is spent generating throughput. The standard bottom-line measurement
"net profit" can be defined as throughput minus operating expense
(Goldratt 1990, 32).
In addition to net profit, there is another common bottom-line
measurement: return on assets. Return on assets is net profit,
divided by total asset value. (Goldratt 1990, 23) concentrates on
inventory rather than assets in this calculation, probably because
of the importance of physical inventory in manufacturing. Typically
the effect of increasing assets is difficult to assess. On the one
hand more capital is tied up, and on the other hand those assets
represent value that the company could (through sales) turn into
cash. For purposes of this discussion we’ll assume that reduction of
assets is not a major avenue for improvement.
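The definitions above reduce to simple arithmetic. The following minimal sketch restates them in Python; the function names and all dollar figures are hypothetical and chosen only to illustrate the relationships between throughput, operating expense, net profit and return on assets.

```python
def throughput(sales, truly_variable_costs):
    """Money generated through sales, net of truly variable costs."""
    return sales - truly_variable_costs


def net_profit(throughput_rate, operating_expense):
    """Net profit defined as throughput minus operating expense."""
    return throughput_rate - operating_expense


def return_on_assets(net_profit_value, total_assets):
    """Net profit divided by total asset value."""
    return net_profit_value / total_assets


# Hypothetical annual figures, for illustration only.
sales = 1_200_000
truly_variable_costs = 300_000   # e.g. raw materials, contractor prices
operating_expense = 700_000
total_assets = 1_000_000

t = throughput(sales, truly_variable_costs)
profit = net_profit(t, operating_expense)
roa = return_on_assets(profit, total_assets)
print(t, profit, roa)  # 900000 200000 0.2
```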
We’re now in a position to be much more precise about the meaning
of "improvement." We want throughput to go up and operating expense
to go down. To make operating expense go down, we can reduce head
count, if necessary by laying people off. This avenue has a lower
limit for improvement. To make throughput go up, we can sell more,
or sell at higher prices; the potential here is unlimited. Whether
we choose the limited or unlimited direction, we must leverage our
resources to improve. We must find those key areas that are most
important to focus improvement efforts. There are several ways to do
this.
Leverage Points
Let’s look at a hypothetical company that sells projects to
develop custom hardware. This company has no problem with sales; the
demand will meet the supply for the foreseeable future. Furthermore,
the times and resources required to complete each project are very
similar. The generic flow is shown in Exhibit 2:
Each rectangle represents a task. The horizontal lengths of the
rectangles are proportional to the expected task durations. Inside
each rectangle is printed the number of weeks the task is expected
to take, and the type of person doing the work. The required
resources are a customer services representative (CS), an engineer
(Eng), a hardware technician (HW), and a computer programmer (Prog).
Each rectangle which immediately follows another is dependent on the
prior task. The arrow between 3:HW and 2:CS also represents a
dependency. This means that both 3:HW and 5:HW must finish before
the last 2:CS can start. The vertical bar to the right represents
project completion.
How many of these projects could be completed in a year, assuming
that the organization has one person of each type, and that each
person has fifty productive weeks in a year? We can at least answer
this theoretically, by noting that the hardware technician is going
to be the busiest. Since eight weeks of her time are required for
each project, about six projects can be completed per year. How can
the company produce more? There are a few choices:
- Make sure the technician is focused on her work, and that she
has as few distractions as possible. Every minute she is
productive translates to more projects completed. Every project
completed increases the bottom line.
- Make sure that everyone else produces in order to keep her
busy. The programmer or engineer can’t take a vacation unless
there is sufficient work to keep the technician busy in the
meantime.
- Hire another technician.
In fact, we have just gone through the following five-step
improvement process, derived from (Goldratt 1990, 59-62):
- Identify the system’s leverage points. We noted that a shortage of
the hardware technician’s time prevented us from completing more
projects, and hence from making more money.
- Exploit the leverage points. That is, squeeze as much as
possible from the technician. This might be as simple as freeing
the technician from distractions or communicating her importance.
- Subordinate everything else to the above decisions. Make sure
everyone else is working to keep the "leverage point" busy. It
only makes sense for others to produce enough to keep the
technician busy, and no more. It only makes sense for the sales
people to sell as much as the technician can produce.
- Elevate the leverage points. That is, spend money to eliminate
them. An example might be hiring another technician.
What happens if a new technician is hired? How many projects could be
completed in a year? Assuming the new technician comes up to speed
quickly, the new leverage point will be the programmer. At
six weeks per project, we now expect to be able to produce about
eight projects per year. Further improvement focused on the
technician will be at best useless. This suggests a fifth step:
- Go back to step 1; don’t let inertia become a constraint.
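The identify, elevate and repeat cycle can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `find_constraint` is hypothetical, and the customer service load of four weeks per project assumes two 2-week CS tasks, which is inferred from the reference to "the last 2:CS" rather than stated outright. The other loads (eight weeks for the technician, six for the programmer, three for the engineer) come from the text.

```python
def find_constraint(weeks_per_project, headcount, weeks_per_person=50):
    """Return (resource, projects per year) for the most constrained resource."""
    capacity = {
        resource: headcount[resource] * weeks_per_person / load
        for resource, load in weeks_per_project.items()
    }
    constraint = min(capacity, key=capacity.get)
    return constraint, int(capacity[constraint])


# Weeks of each resource needed per project (the CS total is an assumption).
weeks_per_project = {"CS": 4, "Eng": 3, "HW": 8, "Prog": 6}
headcount = {"CS": 1, "Eng": 1, "HW": 1, "Prog": 1}

# Step 1: identify the leverage point.
print(find_constraint(weeks_per_project, headcount))  # ('HW', 6)

# Step 4: elevate it by hiring a second technician, then go back to step 1.
headcount["HW"] += 1
print(find_constraint(weeks_per_project, headcount))  # ('Prog', 8)
```

Once the technician is no longer the constraint, adding a third technician would change nothing in this calculation; the programmer now sets the limit.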
It looks simple and logical. Why aren’t we doing it? Here we
address two key reasons: local performance measurements and
uncertainty. Local performance measurements are formal and informal
means used to evaluate individuals. Uncertainty is embodied in
Murphy’s Law: if anything can go wrong, it will. The one thing we
can be sure of regarding our project schedules is that they will
never be followed precisely.
Local Performance Measurements
How can local performance measurements hurt the application of
the five-step process? Let’s use the company of Exhibit 2. What
"improvements" would typical management look for? They would
probably note that several people aren’t working all the time. In
fact, the customer service rep is spending half his time on coffee
breaks. Maybe the engineer and programmer could help out with
customer service. With the purchase of an automated phone system,
the customer service person could be let go, and money would be
saved. The bottom line has improved, and therefore the company seems
better off.
The reality is that the "keep busy" mentality is a measurement.
The message is "work or be laid off." It may not even be a formal
measurement, but chances are everyone knows about it. If they don’t,
they will learn very quickly after the first layoff. What is the
response to this measurement? One must keep busy. There are then two
choices for someone who has insufficient work to do: slow down, or
accept more work.
Slowing down is a manifestation of Parkinson’s Law (Parkinson
1957, 2): the work expands to fill the time available for its
completion. People become less productive. People procrastinate.
Accepting more work sounds promising. On the other hand, if
people besides the technician continue to accept enough work to keep
busy, uncompleted work will build up in the system. For a discussion
of some of the results of this buildup of work, including increased
lead times and reduced quality, see (Goldratt and Fox 1986, 32-53).
For now, it’s sufficient to note that excess work adds to the
overall confusion. The more papers on your desk, the less likely you
are to find the ones you need. The combination of Parkinson’s Law
and the buildup of uncompleted work means that management can’t
really predict when tasks will be done, or how much time is
available to do more. It therefore becomes very difficult to predict
how long a given project, or even a given task, will take.
There’s another local performance measurement that causes
problems: the necessity of keeping individual commitment dates. When
people commit to finishing a task by a certain date, they are likely
to be held to this date. At first this seems both sensible and
inevitable. But consider what typically happens in order that people
can make their personal commitments. Suppose someone has a task that
should take, on average, five weeks of work. If they estimate five
weeks, they’ll be late at least half the time, even if no other
tasks come up. That is unacceptable. In order to have a good chance
of finishing on time, they’ll probably estimate at least ten.
They’ll provide for the worst case. Of course, they’re only busy
half the time, so they’ll have to take some other actions — either
slow down, or accept more work.
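The effect of padding on the chance of meeting a commitment can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that the task duration follows a lognormal distribution whose median is five weeks; the spread parameter `SIGMA` and the function name are likewise assumptions, not figures from any real project.

```python
from math import log
from statistics import NormalDist

# Assumption: task duration is lognormal with a median of five weeks.
MEDIAN_WEEKS = 5.0
SIGMA = 0.5  # assumed spread of the log-duration

log_duration = NormalDist(mu=log(MEDIAN_WEEKS), sigma=SIGMA)


def on_time_probability(estimate_weeks):
    """P(actual duration <= estimate) under the assumed distribution."""
    return log_duration.cdf(log(estimate_weeks))


for estimate in (5, 10):
    print(estimate, round(on_time_probability(estimate), 2))
```

Under these assumptions a five-week estimate is met about half the time, while a ten-week estimate is met roughly nine times out of ten, which is why doubling the estimate is so tempting.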
Now consider how these local measurements affect the bidding
process. The project bid must be competitive. Time to complete is
usually a significant factor. In order to be competitive, task times
are often factored down, sometimes arbitrarily. In the process
project performance criteria may be compromised as well. The chances
of completing on time, and the chances of satisfying the customer,
suffer.
Let’s look at how this affects project schedules. A possible
critical path schedule for the project in Exhibit 2 is shown in
Exhibit 3.
The tasks with bold borders are on the critical path. Note that
the individual tasks have been padded to protect their completion
times. Some of that padding may have been reduced to make a
competitive bid. The overall project duration is nineteen weeks. The
task durations, and hence the project completion date, are rather
arbitrary.
An astute project manager might note that there is a conflict for
the hardware technician’s time. They might decide to resolve the
conflict by creating an artificial dependency between the
technician’s two tasks, as follows:
This plan is probably more realistic. But now the project
duration is twenty-four weeks, which is twice the duration of the
critical path in Exhibit 2. Furthermore, the last task has only one
week of protection; this may still be insufficient to protect the
project due date from problems.
This scheduling approach also raises many questions. Suppose
everyone gave estimated average completion times for tasks. Suppose
we’re not worried about keeping everyone busy. It seems that at this
point we’re more exposed than ever to the effects of Murphy’s Law.
How can we protect project completion dates against inevitable
fluctuations? If we want to implement the five-step improvement
process, if we want to establish some predictability, we will need a
new approach to project planning. We need a technique that provides
significantly more protection to the project commitment dates,
without adding more slack time than traditional methods.
The Critical Chain: Breaking Murphy’s Law
Our new technique is called "critical chain" scheduling, and is
discussed in (Goldratt 1997) and (Pittman 1994). We start with the
initial project layout, Exhibit 2, and completely ignore
uncertainty. Exhibit 2 is not really a feasible schedule, because it
has two tasks contending for the hardware technician’s time. As our
first scheduling step, let’s resolve the resource contention.
This schedule would be perfect if there were no uncertainty. So
next we need to protect against the uncertainty. We need to protect
the commitment date, because the commitment date is directly tied to
throughput. In order to protect it, we must decide which tasks are
responsible for the current project duration. If delayed, those
tasks would make the project longer. Those tasks should therefore be
considered most important. They should be protected. That set of
tasks is called the "critical chain." The critical chain tasks are
shown with bold outlines in Exhibit 6:
It’s easy to see that a delay of any of the bold tasks would
delay the project. Note that this is different from the traditional
critical path in two ways: resource contention is taken into
account, and tasks are placed at their late start times. Because of
the resource contention the critical chain, unlike the critical
path, can hop from one path to another.
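The difference between the critical path and the critical chain can be sketched as a longest-chain calculation. The task network below is an assumed reconstruction of Exhibit 2 from the descriptions in the text (an initial 2:CS task is inferred from the phrase "the last 2:CS"), and the names `CS1`, `HW5` and so on are labels invented for this example. Tasks are placed at their earliest start here rather than their late start; the chain and its length are the same either way.

```python
# Assumed reconstruction of Exhibit 2; durations are in weeks.
durations = {"CS1": 2, "Eng": 3, "HW5": 5, "Prog": 6, "HW3": 3, "CS2": 2}
task_deps = {
    "CS1": [], "Eng": ["CS1"], "HW5": ["Eng"],
    "Prog": [], "HW3": ["Prog"], "CS2": ["HW5", "HW3"],
}


def longest_chain(durations, deps):
    """Return (length, tasks) of the longest chain of dependent tasks."""
    finish, prev = {}, {}

    def visit(task):
        if task not in finish:
            start, prev[task] = 0, None
            for p in deps[task]:
                if visit(p) > start:
                    start, prev[task] = finish[p], p
            finish[task] = start + durations[task]
        return finish[task]

    last = max(durations, key=visit)
    chain = []
    while last is not None:
        chain.append(last)
        last = prev[last]
    return finish[chain[0]], chain[::-1]


# Critical path: task dependencies only.
print(longest_chain(durations, task_deps))
# (12, ['CS1', 'Eng', 'HW5', 'CS2'])

# Critical chain: also make 3:HW wait for 5:HW, since one technician does both.
leveled = dict(task_deps, HW3=task_deps["HW3"] + ["HW5"])
print(longest_chain(durations, leveled))
# (15, ['CS1', 'Eng', 'HW5', 'HW3', 'CS2'])
```

The second result shows the chain hopping from one path to the other through the shared technician, while 6:Prog drops off the longest chain and becomes a feeding task.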
Having identified the key tasks, how can we best protect the
customer? We are dealing with a fixed number of resources, so the
only feasible way of adding protection is by adding time.
Traditionally we protect the schedule by padding individual tasks or
"spreading the slack" throughout the schedule. Using the critical
chain approach we don’t protect individual tasks; we protect the
project completion. We do this by means of lumps of protection,
scheduled blocks of time, called buffers. The buffers look like
slack and feel like slack, but they are not slack. They are
necessary components of the schedule.
We need to identify the key places to put these buffers. First,
since the critical chain determines the project duration, we need to
protect the critical chain itself. If work is not ready for the
critical chain tasks to start, the critical chain will be delayed,
thus likely delaying project completion. This means we must have
some protection every time a non-critical chain task feeds the
critical chain. This type of protection is called a "feeding
buffer".
With the feeding buffer we have protected the critical chain from
fluctuations. We haven’t yet protected the commitment date from
fluctuations on the critical chain. This is done by means of a
"project buffer" placed after the last-scheduled task. Exhibit 7
shows the fully-buffered schedule:
The feeding buffer protects the critical chain task 3:HW from
uncertainty in the task 6:Prog. The project completion date is also
protected by a project buffer of five weeks. This means that every
task has at least five weeks of protection. By specifying average
task durations and by removing the necessity to keep everyone busy,
we have drastically reduced the need for people to take on multiple
tasks to keep busy. This, in turn, helps reduce the normal chaos
associated with fighting fires across multiple projects. If you
compare this schedule with Exhibit 4, you’ll see that while the
overall project duration is four weeks shorter, we have actually
gained significantly in reliability by pooling our slack into
buffers.
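One way to see why pooled protection can be smaller than the sum of individual paddings is a rough statistical sketch. It assumes that task durations vary independently and approximately normally, and that each task's standard deviation is half its average duration; all of these are illustrative assumptions, as are the task averages, which follow an assumed reconstruction of the critical chain. This is not necessarily how the buffers in Exhibit 7 were sized.

```python
from math import sqrt
from statistics import NormalDist

# Average durations (weeks) of the critical chain tasks, assumed reconstruction.
chain_means = [2, 3, 5, 3, 2]
# Assumption: each task's standard deviation is half its average duration.
chain_sigmas = [m / 2 for m in chain_means]

# One-sided safety factor for a 90% chance of finishing on time.
z = NormalDist().inv_cdf(0.90)

# Padding every task separately: the protections simply add up.
padding_per_task = sum(z * s for s in chain_sigmas)

# Pooling: independent variations partly cancel, so variances add instead.
pooled_buffer = z * sqrt(sum(s * s for s in chain_sigmas))

print(round(padding_per_task, 1), round(pooled_buffer, 1))  # 9.6 4.6
```

Under these assumptions a single buffer of about five weeks gives the project roughly the protection that nearly ten weeks of scattered padding gives the individual tasks.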
Imagine for a moment that the critical chain tasks in Exhibit 7
are parts of an automobile, and that the uncertainty causes them to
vibrate unpredictably. The buffers act as shock absorbers, so that
the vibrations don’t affect the passenger, who also happens to be
the customer. If we think about Exhibit 7 in this context, we’ll
realize that the vibrations go both ways, left and right. Some tasks
complete early, some complete late.
The critical chain approach helps projects to complete more
quickly by encouraging tasks to start early. Typically,
opportunities for starting tasks early are lost or ignored, because
people don’t know which are high priority tasks and therefore need
to be started early. When things go well, these "positive" schedule
disruptions can’t accumulate. Frequent rescheduling ensures that
late tasks or "negative" disruptions do accumulate, because they
must be taken into account in the revised schedules. If we know
which tasks are most important, i.e. the critical chain tasks, we
know which tasks we would like to start early. Furthermore, the
buffers allow this to happen. Suppose in Exhibit 7 the task 5:HW
completes in four weeks. Unless 6:Prog is very late, the feeding
buffer ensures that 3:HW can start a week early, thus speeding the
project along.
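The early-start example can be traced with a short calculation. The planned finish of 5:HW at week ten follows from an assumed reconstruction of the schedule (2:CS, then 3:Eng, then 5:HW), and the three-week feeding buffer is a hypothetical size, since the text does not state one.

```python
def hw3_start(hw5_finish, prog_finish):
    """3:HW needs the technician free (5:HW done) and its input (6:Prog done)."""
    return max(hw5_finish, prog_finish)


PLANNED_HW5_FINISH = 10   # 2:CS + 3:Eng + 5:HW, assumed reconstruction
FEEDING_BUFFER = 3        # hypothetical size, not given in the text
PLANNED_PROG_FINISH = PLANNED_HW5_FINISH - FEEDING_BUFFER

# 5:HW takes four weeks instead of five.
early_hw5_finish = PLANNED_HW5_FINISH - 1

for prog_delay in (0, 2, 4):
    start = hw3_start(early_hw5_finish, PLANNED_PROG_FINISH + prog_delay)
    weeks_gained = PLANNED_HW5_FINISH - start
    print(prog_delay, start, weeks_gained)
```

With the assumed buffer, 6:Prog can run up to two weeks late and 3:HW still starts a week early; only a delay larger than the buffer pushes 3:HW past its planned start.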
There is a useful refinement that can be added to the schedule.
Consider what will happen if the Engineer is working on another
project before the 3:Eng task starts, and that other work is
delayed. That delay can delay the entire critical chain of this
project, and potentially use up some of the project buffer. To avoid
that, we can schedule a "wake-up call" (also known as a "resource
buffer") some time before resources are due to start their critical
chain tasks. The resources are told in advance when they will be
needed to perform these key tasks. This lets them know that they
need to be ready for high-priority work, and adds further
reliability to the critical chain schedule.
In creating a critical chain schedule, we have in fact carried
out the five-step improvement process. We identified the leverage
point by resolving resource contention and identifying the critical
chain. We will exploit the leverage point by focusing on critical
chain tasks. We have allowed everyone else to subordinate to the
leverage point by inserting buffers. We know how to elevate the
leverage point or "crash the project", if we so desire: we can
increase the resources available to work on critical chain tasks. Of
course, this may create a new critical chain; so we must then go
back to step 1.
An important question remains. If we don’t worry about late
tasks, how do we monitor project status? The answer is simple: we
monitor how much of each buffer has been used up, compared with how
much work remains on the path feeding it. For example, suppose
delays have pushed completion of the final project task into the
project buffer, so that only 30% of the buffer remains. If the
project is 90% complete, we’re probably in good shape. If it’s only
50% complete, there may be a serious problem.
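The comparison described above can be written as a simple check. The rule used here, flagging the project when the fraction of buffer consumed exceeds the fraction of work completed, is an assumed threshold chosen for illustration; the function name and status labels are hypothetical as well.

```python
def buffer_status(buffer_remaining, work_complete):
    """Compare buffer consumption with progress; both are fractions in [0, 1]."""
    buffer_used = 1.0 - buffer_remaining
    # Assumed rule: trouble when the buffer is being used faster than work completes.
    if buffer_used <= work_complete:
        return "ok"
    return "at risk"


print(buffer_status(0.30, 0.90))  # ok
print(buffer_status(0.30, 0.50))  # at risk
```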
Conclusions
We can expect a number of benefits from the critical chain
process for individual projects. Completion dates are more reliable
due to the addition of buffers to the schedules. Project
times-to-complete are reduced by pooling the slack into buffers.
Costs typically go down as lead times go down. Lower lead times also
minimize the opportunity for customers to change specifications,
which is a common cause of uncertainty in projects. Because people
are not rigidly held to task start and finish times, they can feel
comfortable taking the time to address quality problems without fear
of missing their completion dates. This reduces rework, a common and
severe problem with defects discovered late in a project (Boehm
1981, 40), and helps ensure a higher-quality result.
In a multiple-project environment, there are additional benefits.
If people are not measured by how busy they are or by precisely when
their tasks complete, the incentive to slow down or accept multiple
tasks is reduced. They are then free to follow an important rule:
when you have work, finish it as quickly as possible. It is then
much easier to estimate resource availability. It becomes possible
to identify "constraint" resources that are leverage points for the
organization to produce more projects; it even becomes possible to
select such resources as strategic leverage points around which the
business can be managed.
The TOC improvement tools come with a warning: there is no single
individual who can implement these concepts. In an individual
project, the entire project team needs to understand what is needed.
In a company, the entire organization must be involved. A successful
implementation requires going from a cost-oriented approach that
requires attention everywhere, to a throughput-oriented approach in
which everyone must work together and focus on key leverage
points.
References
Boehm, Barry W. 1981. Software Engineering Economics.
Englewood Cliffs, NJ: Prentice Hall.
Goldratt, Dr. Eliyahu M. and Fox, Robert E. 1986. The Race.
Croton-on-Hudson, NY: North River Press.
Goldratt, Dr. Eliyahu M. 1990. The Haystack Syndrome.
Croton-on-Hudson, NY: North River Press.
Goldratt, Dr. Eliyahu M. 1992. The Goal, Second Revised
Edition. Croton-on-Hudson, NY: North River Press.
Goldratt, Dr. Eliyahu M. 1997. Critical Chain. Great
Barrington, MA: North River Press.
Noreen, Eric; Smith, Debra; and Mackey, James. 1995. The
Theory of Constraints and Its Implications for Management
Accounting. Great Barrington, MA: North River Press.
Pittman, Paul Howard. 1994. Project Management: A More
Effective Methodology for the Planning and Control of Projects. Ann
Arbor, MI: University Microfilms International.