Friday, June 5, 2009

A Blast from the Past - Scoping and Estimating with Context Diagrams



A technique developed "in the old days", Context Diagrams, can often help us develop a plausible first approximation of project scope before requirements are really defined. It's easy to learn, quick to apply, and works with any flavor of development, including both traditional and Agile approaches.

No rocket science here, just practical common sense. A Context Diagram, illustrated here in the general case, is simply a list of "objects" or elements that are typically included in a software system. This particular format originated back in the days of stone tools, perhaps the Mesozoic era, sometimes referred to as the epoch of "structured analysis". Truly amazing what those cave people were able to accomplish! But I digress ...

This was, and is, a useful quick way to scope out a new project. Let's take a simple example. Suppose for some reason our fearless leader decides we're going to build a new accounts receivable system. Le Grand Fromage wants to know by Tuesday how long it's going to take and what it will cost.

Step 1: Assemble a small group of individuals who have at least a fuzzy idea what an AR system entails - e.g., the billing supervisor, the head of collections, a rep from the CFO's office, and the BA who deals with the order-to-cash process.

Step 2: A facilitator (perhaps the BA) draws a big box on the whiteboard and asks the group, "What's inside the AR box, and what's not?" Billing speaks up: "We're using quill pen and abacus now - we really think we ought to be inside." Collections chimes in: "If we're going to include billing it will greatly increase the scope - is that really what M. le Brie intended?" Already the tribes are forming - who's going to get voted off the island first? We have a choice to make here - do we go back to the big cheese right now, or do we continue and develop several alternative scope estimates - e.g., with billing and without?

Step 3: Whatever we decide is (at least provisionally) inside the big box, our next step is to start describing the various elements that define "it". For most business and administrative systems there are five broad categories of "objects" that will account for the vast majority of the effort and cost associated with building a system - specifically, Inputs, Outputs, Data sets (tables, files), Interfaces to other systems (both one-time and on-going), and Queries. We now make a list of each item in each category. For our hypothetical AR system, Inputs might include New Customer, Merge Customers, Delete Customer, Update Customer, Invoice, Credit Memo, Credit Report, etc. A list of several hundred items can easily be identified in one day or less.

Notice we're not trying to define the content or format of any of these items, and we're not exploring business rules, workflow aspects, or any of the other myriad details that will need to be uncovered before we can actually build this. Of course we'll need to do that later, but just this list, which in a great many cases can be pretty well completed in a day or two, really tells us a lot about what we're getting into.
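For those who like to keep such a list somewhere sturdier than a whiteboard, here is a minimal Python sketch of the inventory. The item names are the Inputs listed in Step 3; the "area" tags (which items belong to billing) and the helper name `count_by_category` are purely my illustrative assumptions, there only to show how the "with billing" and "without billing" alternatives from Step 2 could be counted side by side.

```python
# The five categories named in Step 3.
CATEGORIES = ("Inputs", "Outputs", "Data sets", "Interfaces", "Queries")

# Each entry is (category, item name, area).
# Item names come from the Step 3 example; the area tags are ASSUMPTIONS
# made up for illustration, not something the workshop has decided.
inventory = [
    ("Inputs", "New Customer", "core"),
    ("Inputs", "Merge Customers", "core"),
    ("Inputs", "Delete Customer", "core"),
    ("Inputs", "Update Customer", "core"),
    ("Inputs", "Invoice", "billing"),
    ("Inputs", "Credit Memo", "billing"),
    ("Inputs", "Credit Report", "core"),
]


def count_by_category(items, excluded_areas=()):
    """Count items per category, skipping any area ruled out of scope."""
    counts = {category: 0 for category in CATEGORIES}
    for category, _name, area in items:
        if area not in excluded_areas:
            counts[category] += 1
    return counts


with_billing = count_by_category(inventory)
without_billing = count_by_category(inventory, excluded_areas={"billing"})

print("With billing:   ", with_billing)
print("Without billing:", without_billing)
```

Only names and categories are recorded, with no content, format, or business rules, which is exactly the level of detail this step calls for.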

Step 4: Convert our list to an estimate of Size (in "function points"). Here we can use a simplified approach I call "function points lite", also known as "count on average". This approach will give you perhaps 80% of the accuracy of full-blown function points at a small fraction of the cost.

For those who may not be familiar with Function Points (FP), they are the subject of an ANSI and ISO standard developed by the International Function Point Users Group (IFPUG). This is a method to quantify the "size" of a software system, and size has been shown to be a primary driver of cost and schedule. Many estimating methods and tools use Function Points as a primary input.

Each of the five entity types mentioned above has been assigned a low, average, and high function point value based primarily on the number of data elements making up the entity. (The 'official' counting method is a good bit more involved than we'll go into here - references on this abound.) You can get a very good approximation of the total count very quickly by simply assuming every element is "average" for its type. Taking our example a step further might lead us to ...

7 Inputs * 4 = 28 FP
9 Data sets * 10 = 90 FP
6 Queries * 4 = 24 FP
12 Outputs * 5 = 60 FP
8 Interfaces * 7 = 56 FP
=================
Total = 258 Function Points

This is then the "size" of our baseline scope. As items are added or removed along the way, the count (and so the scope) can easily be adjusted accordingly - AND we're adjusting in a way that is very easy for the customer to understand.
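The "count on average" arithmetic is simple enough to do on a napkin, but a short Python sketch makes the re-counting painless when scope shifts. The weights and counts are the ones from the table above; the function name and the "revised" scope change (two Outputs dropped, one Interface added) are hypothetical, invented only to show the adjustment.

```python
# Average function point weight per category, as used in the table above.
AVERAGE_WEIGHTS = {
    "Inputs": 4,
    "Data sets": 10,
    "Queries": 4,
    "Outputs": 5,
    "Interfaces": 7,
}


def function_points(counts, weights=AVERAGE_WEIGHTS):
    """Sum count * average weight over every category in the inventory."""
    return sum(counts[category] * weights[category] for category in counts)


# Item counts from the worked example.
baseline = {
    "Inputs": 7,
    "Data sets": 9,
    "Queries": 6,
    "Outputs": 12,
    "Interfaces": 8,
}
baseline_fp = function_points(baseline)

# HYPOTHETICAL scope change: two Outputs dropped, one Interface added.
revised = dict(baseline, Outputs=10, Interfaces=9)
revised_fp = function_points(revised)

print("Baseline:", baseline_fp, "FP")
print("Revised: ", revised_fp, "FP")
```

The point of keeping it this crude is that the customer can follow every line: drop a report, lose 5 points; add an interface, gain 7.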

Step 5: Schedule and Cost Estimates - We can convert this size value into estimates of schedule and cost by many different means - e.g., all of the best-selling estimating tools will take our FP count (and other factors) as inputs and give us the results. Alternatively we can use simpler 'rule of thumb' methods such as those suggested by Capers Jones (note: Capers would readily agree these are much less reliable than tool-based approaches that account for many more factors) - e.g.,

project duration (months) = FP raised to the .4 power = 258**.4 = 9.2 months

project staffing = FP / 150 = 258 / 150 = 1.7 staff (full time equivalent)

project effort (person months) = schedule months * staff = 9.2 * 1.7 = 15.6 person months
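The same three rules of thumb, as a small Python sketch (the function name is mine, not anything official). One thing to watch for: the code carries full precision from step to step, so the effort comes out a touch higher than the 15.6 above, which was calculated from the already rounded 9.2 and 1.7.

```python
def rule_of_thumb(fp):
    """Apply the three rules of thumb quoted above to a function point count."""
    duration_months = fp ** 0.4                 # schedule = FP raised to the 0.4 power
    staff_fte = fp / 150                        # staffing = FP / 150
    effort_pm = duration_months * staff_fte     # effort = schedule * staff
    return duration_months, staff_fte, effort_pm


duration, staff, effort = rule_of_thumb(258)

print(f"Duration: {duration:.1f} months")
print(f"Staff:    {staff:.1f} FTE")
print(f"Effort:   {effort:.1f} person months")
```

As noted, these are coarse approximations; treat the results as a sanity check, not a commitment.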

Step 6: Evaluate Risk - unlikely we would do so for a small project such as our illustration, but if it's a big and nasty 'bet your job' estimate we can go a step further and apply a technique such as Monte Carlo simulation to evaluate a range of outcomes and probabilities. It's beyond our scope to get into that here, but if you've got a risky situation I'll be most pleased to consult with you on how to manage it.
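Just to give a flavor of the idea, here is a toy Monte Carlo sketch in Python, and I do mean toy. Instead of assuming every item is "average", each item is randomly rated low, average, or high, and the duration rule of thumb from Step 5 is applied to each simulated total. The low and high weights are the standard IFPUG unadjusted values (not given in this post); the equal one-in-three odds for each rating, the trial count, and all the names are assumptions of mine, not a calibrated risk model.

```python
import random

# (low, average, high) weights per category. The average column matches the
# Step 4 table; low and high are the standard IFPUG unadjusted weights.
WEIGHTS = {
    "Inputs": (3, 4, 6),
    "Data sets": (7, 10, 15),
    "Queries": (3, 4, 6),
    "Outputs": (4, 5, 7),
    "Interfaces": (5, 7, 10),
}

# Item counts from the worked example.
COUNTS = {"Inputs": 7, "Data sets": 9, "Queries": 6, "Outputs": 12, "Interfaces": 8}


def simulate_durations(trials=10_000, seed=1):
    """Return a sorted list of simulated project durations in months.

    ASSUMPTION: every item is equally likely to be low, average, or high.
    """
    rng = random.Random(seed)
    durations = []
    for _trial in range(trials):
        fp = sum(
            rng.choice(WEIGHTS[category])
            for category, count in COUNTS.items()
            for _item in range(count)
        )
        durations.append(fp ** 0.4)  # duration rule of thumb from Step 5
    durations.sort()
    return durations


def percentile(sorted_values, p):
    """Pick the value at roughly the p-th percentile of a sorted list."""
    index = min(len(sorted_values) - 1, int(p / 100 * len(sorted_values)))
    return sorted_values[index]


durations = simulate_durations()
for p in (10, 50, 90):
    print(f"P{p}: {percentile(durations, p):.1f} months")
```

A real risk analysis would also vary the item counts themselves, productivity, staffing, and so on; this only shows the mechanics of turning a single number into a range.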

Once we have this list we can get back to his or her Cheesiness to provide the requested estimates and get some initial decisions on what will and won't be included.