Modeling and simulating the execution of a software project

TL;DR:

In this article I will walk through how to model and simulate the execution of a software project based on actual data. I will also demonstrate how to run a Monte Carlo simulation to provide a forecast for a backlog of 30 items.

Using a data set that contains almost 100 data points, I built a model based on the first 30 items.

I then modeled a backlog for the next 30 items in the list and ran a simulation to forecast a delivery date.

Introduction

Forecasting release dates with a reasonable level of confidence is nothing short of a Holy Hand Grenade in Software Development.

After coming across the KanbanSim tool by the excellent Troy Magennis, I immediately wondered what would happen if I tried to forecast a date based on the real data our Kanban team had collected over a period of more than six months.

So I decided to run an experiment: I created a model of the development process upon which the data was collected, and then ran a Monte Carlo simulation to forecast a delivery date for a subset of our backlog.

Naturally, I had the benefit of hindsight when doing this exercise. Nevertheless, some of the conclusions I reached were quite interesting, especially the fact that the simulation resulted in a range of dates pretty close to the actual final delivery date.

Too good to be true? Let's see...

The Tool

The KanbanSim (TM) tool used for this simulation was developed by Magennis' company, Focused Objective, and it comes in a free and a licensed version.

KanbanSim allows modeling a software development process using the SimML language, which is essentially an XML representation of a model. The model can be based on a Kanban (flow-based) or Scrum (iteration-based) process, and can incorporate different elements such as WIP limits, an estimated backlog, total costs, cycle time distributions, and blocking events, defects or expedites that occur with a given probability.

[Image: example of a SimML model]

On top of that, KanbanSim can execute a simulation of the model and perform more advanced techniques such as forecasting or sensitivity analysis, to understand which variables have the greatest impact on aspects such as cycle time. One of the cool features is the ability to visualize the result of a simulation, which can be great for coaching purposes:

[Image: visualization of a simulation run]

The simulation illustrated in this article made use of the free version, and kudos to Troy Magennis for making this piece of software widely available for free.

Steps

The first task is to build a model using the SimML language that represents the environment and team for which we will run the simulation.

KanbanSim comes with pre-loaded models and a reference book (see Sources at the bottom of this article for a link) that contains examples of the many attributes SimML supports. A model can use range estimates as well as actual distributions; for this experiment I had the benefit of real data, but when real data is not available, modeling on estimates is also possible.
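To make that distinction concrete, here is a minimal Python sketch (my own illustration, not KanbanSim code) of the two ways a simulation can draw cycle times: from a low/high range estimate, or by resampling recorded data. The sample values are invented.

    import random

    # Illustrative sample: cycle times (in days) recorded for completed items.
    actual_cycle_times = [1, 2, 2, 3, 3, 4, 5, 5, 7, 18]

    def sample_from_range(low, high):
        # Sample a cycle time from a simple low/high range estimate (uniform).
        return random.randint(low, high)

    def sample_from_data(history):
        # Sample a cycle time by resampling the recorded data (a bootstrap draw).
        return random.choice(history)

    print(sample_from_range(1, 8))               # no data yet: rely on an estimate
    print(sample_from_data(actual_cycle_times))  # data available: let it speak for itself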

The spreadsheet I retrieved the data from looks like this:

[Image: spreadsheet of recorded work item data]

The main data points recorded were work item type, estimate (as a T-shirt size), total cycle time, count of defects found during QA testing, and number of days blocked. Let's now look at how to incorporate those into the model.
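For illustration, this is roughly how one row of such a spreadsheet could be represented in code; the field names and sample values below are my own guesses, not the actual spreadsheet layout.

    from dataclasses import dataclass

    @dataclass
    class WorkItemRecord:
        # One row of the tracking spreadsheet (field names are my own guess).
        item_type: str        # e.g. "Story" or "Bug"
        size: str             # T-shirt estimate: XS, S, M or L
        cycle_time_days: int  # total cycle time
        defects_found: int    # defects caught during QA testing
        days_blocked: int     # days the item spent blocked

    sample_rows = [
        WorkItemRecord("Story", "S", 3, 0, 0),
        WorkItemRecord("Bug", "XS", 1, 0, 0),
        WorkItemRecord("Story", "L", 16, 2, 3),
    ]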

The steps I took for modeling the process were:

  1. Define a process and team composition
  2. Define process stages (Kanban columns)
  3. Define the backlog
  4. Define blocking events
  5. Define defects
  6. Define added scope
  7. Fine-tuning

Process & Team Composition

By default, a KanbanSim model is based on a Kanban process, so there is nothing specific to add here regarding the process.

It is handy to have a variable storing the number of team members for each phase of the process, so we can later tweak WIP limits and costs, for example. Our team originally had two developers and two testers. The SimML would look like this:

[Image: SimML snippet defining the team composition]

Process Stages

The second step was to define the process stages. The process for this team consisted of three basic stages: Analysis, Development and Testing. Each stage was divided into two columns, Waiting For and Doing (taking inspiration from TameFlow).

The "Waiting For" column for testing and analysis were modeled as buffers (which was the closest thing SimML would allow me to do). Regarding analysis, from the sample data I used for the model, it averaged to 4 stories every week and that's the value I have used.

[Image: SimML snippet defining the process stages]
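In the model that arrival rate is just a number. To make the idea tangible, here is a small Python sketch (my own illustration, not KanbanSim code) of stories arriving into an analysis buffer at an average of four per week.

    import random

    ARRIVAL_RATE_PER_DAY = 4 / 5   # four stories per five-working-day week

    def simulate_arrivals(days):
        # Return the simulated arrival times (in days) of new stories over a period.
        arrivals, clock = [], 0.0
        while True:
            # Exponential inter-arrival times give a simple Poisson-style arrival process.
            clock += random.expovariate(ARRIVAL_RATE_PER_DAY)
            if clock >= days:
                return arrivals
            arrivals.append(clock)

    week = simulate_arrivals(5)
    print(len(week), "stories arrived this week, at days", [round(t, 1) for t in week])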

Backlog

For modeling the backlog I used a sample of 30 items comprising stories and bugs. These were already estimated, so I could take advantage of that to create a backlog containing four categories of deliverables: extra-small, small, medium and large.

For each category defined, SimML allows assigning a cycle time range estimate. For this I looked at the cycle time ranges I had for the first 30 items and provided estimates based on the actual data recorded.

[Image: SimML snippet defining the backlog categories]
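The actual ranges in the SimML above came straight from the spreadsheet. The snippet below only sketches how such per-category ranges could be derived from recorded data; the sample values are invented.

    from collections import defaultdict

    # Invented sample: (T-shirt size, cycle time in days) for completed items.
    completed = [("XS", 1), ("XS", 2), ("S", 2), ("S", 4),
                 ("M", 5), ("M", 7), ("L", 9), ("L", 14)]

    by_size = defaultdict(list)
    for size, days in completed:
        by_size[size].append(days)

    # One low/high cycle time estimate per deliverable category.
    ranges = {size: (min(days), max(days)) for size, days in by_size.items()}
    print(ranges)   # e.g. {'XS': (1, 2), 'S': (2, 4), 'M': (5, 7), 'L': (9, 14)}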

At this point I ran a single simulation to check that the cycle time distribution roughly matched the one from the sample of the first 30 items.

[Image: cycle time results from a single simulation run]

While the values matched for the most part, my model at this point did not account for outliers. In the example above, an item took 18 days to complete due to a blocker. SimML supports adding blockers and other events, and that was the next set of steps I had to take as I was building the model.
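As an aside, a quick home-grown way to make this kind of comparison outside the tool is to line up a few percentiles of the two distributions, as in this sketch (the numbers are illustrative, not my actual data).

    from statistics import quantiles

    actual = [1, 1, 2, 2, 3, 3, 4, 5, 5, 6, 7, 18]     # recorded cycle times (days)
    simulated = [1, 2, 2, 3, 3, 4, 4, 5, 6, 6, 7, 8]   # cycle times from one simulation run

    for label, data in (("actual", actual), ("simulated", simulated)):
        cuts = quantiles(data, n=100)   # 99 percentile cut points
        print(f"{label:>9}: 50th={cuts[49]:.1f} days, 85th={cuts[84]:.1f} days, max={max(data)}")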

Blocking Events

Looking at the data, there was a single event that blocked an item for 14 days, and another item was blocked for a couple of days. Accordingly, I added two entries in the SimML mark-up to account for these. For each event I specified the likelihood (4%) and a range estimate for the blocking time.

[Image: SimML snippet defining the blocking events]
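Conceptually this amounts to occasionally adding a blocking delay on top of a sampled cycle time, as in the sketch below. The 4% probabilities follow the figures above; the blocking-time ranges are my own illustrative choices.

    import random

    def cycle_time_with_blockers(base_days):
        # Occasionally extend an item's cycle time with a blocking event.
        days = base_days
        if random.random() < 0.04:          # ~4% of items hit the long blocker
            days += random.randint(10, 14)  # blocked for roughly two weeks
        if random.random() < 0.04:          # ~4% hit a short blocker
            days += random.randint(1, 3)    # blocked for a couple of days
        return days

    print(cycle_time_with_blockers(3))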

Defects

SimML also makes it possible to model the creation of defects, but we have to understand how this works in KanbanSim, as I made a mistake when doing this.

In my spreadsheet I kept a count of integration tasks (i.e. defects caught during testing) for every item. I thought I should include these in my model, so I added the following snippet:

[Image: SimML snippet defining defect creation]

The above means there is a 27% probability that an item will raise a defect, which would take roughly one hour to fix and between one and four hours to test. Also, when a defect is raised in the Test column, the item moves directly back to the Develop column.

After running a simulation again, the cycle time distributions were quite different. The mistake I was making was double-counting the cycle time of defects. In the actual process we would not flow work items back or create new defect tickets; items would stay in the Test column until all issues were resolved. Therefore, the cycle time recorded for every item already included the time spent doing rework.

That was the biggest flaw in my model; once I realized what I was doing, I removed the <defects> element from it.
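The sketch below illustrates the double-counting problem: if the recorded cycle times already include rework done in the Test column, layering an extra defect delay on top of them inflates the simulated figures. The probability and fix/test times follow the snippet above; everything else is illustrative.

    import random

    def simulated_cycle_time(recorded_days, add_defect_delay=False):
        # The recorded cycle times already include rework done while in the Test column.
        days = recorded_days
        if add_defect_delay and random.random() < 0.27:
            days += 1 / 8 + random.uniform(1, 4) / 8   # ~1h fix + 1-4h test, assuming 8h days
        return days

    # With the <defects> element switched on, rework is counted twice
    # and simulated cycle times drift above the recorded ones.
    print(simulated_cycle_time(3), simulated_cycle_time(3, add_defect_delay=True))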

Added Scope

From time to time escalations would hit the team, and so-called expedite items would jump straight onto the board. In hindsight it was great to have those labelled, because I could add them to the model as well; and as we will see later, being able to specify them goes a long way towards increasing the model's accuracy.

[Image: SimML snippet defining added scope (expedites)]

For this model, one expedite item is created for every 20 to 30 items, and only one expedite is allowed on the board at any given time.
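The SimML above states those two rules declaratively. The sketch below only mimics the injection frequency in plain Python, as my own illustration; the "one expedite on the board at a time" rule is a WIP constraint on the board and is not captured here.

    import random

    def inject_expedites(backlog_size):
        # Return item labels with roughly one expedite injected per 20-30 regular items.
        items, until_next = [], random.randint(20, 30)
        for _ in range(backlog_size):
            items.append("regular")
            until_next -= 1
            if until_next == 0:
                items.append("expedite")             # jumps straight onto the board
                until_next = random.randint(20, 30)
        return items

    sequence = inject_expedites(60)
    print(sequence.count("expedite"), "expedites injected among", len(sequence), "items")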

Fine-tuning

Once all the elements were in place, I started to perform single runs to spot-check that blockers and expedites were being created at roughly the same rate as in my sample data.

[Image: single-run results used for spot-checking blockers and expedites]

Having verified that was OK, the next step was to check the cycle time distribution. It was slightly off, and inspecting the visualization I noticed tickets were backing up in front of the Test column.

[Image: simulation visualization showing tickets queuing before the Test column]

It was a practice in this team that, from time to time, a developer would switch to testing when tickets started to back up under the Waiting for Test column. This is an opportunity to invoke WIP constraint principle no. 13 from Principles of Product Development Flow, the Principle of Skill Overlap: cross-train resources at adjacent processes.

We didn't become cross-functional because a guide told us to do so; we did it in the spirit of enabling flow.

Fortunately SimML has another trick, called phases. With phases we can simulate fluctuations in capacity and/or estimates, and I used this to simulate three cycles where the WIP limit of Test would increase at the expense of the Develop WIP limit.

[Image: SimML snippet defining the phases]
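A crude way to picture the effect of those phases is a per-phase WIP split between Develop and Test, as in this sketch; the phase names and limits are illustrative, not the actual SimML values.

    # Illustrative WIP limits per phase: a developer temporarily moves over to testing.
    phases = [
        {"name": "steady state", "develop_wip": 2, "test_wip": 2},
        {"name": "test crunch",  "develop_wip": 1, "test_wip": 3},
        {"name": "steady state", "develop_wip": 2, "test_wip": 2},
    ]

    for phase in phases:
        print(f"{phase['name']:>12}: Develop WIP = {phase['develop_wip']}, Test WIP = {phase['test_wip']}")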

I re-ran the single-run simulation and compared the count of items in queued positions before and after the change, to confirm that the phases I added had the intended impact.

[Image: comparison of queued item counts before and after adding the phases]

We can see in the comparison chart above that, with the phases added, the count of items in queue goes down, becoming almost on par with the number of pull transactions (liquidity).

At this point I was happy enough that the model behaved similarly to the actual process. It was time to run a Monte Carlo simulation on the backlog to check how it compared with the real figures I had recorded.

Performing the simulation

For the forecast, we need to set a start date and a delivery date. For this I simply went to my spreadsheet and took note of the date the first item started development (3rd of February) and the date the last item was completed (20th of March).

I then kicked off a basic Monte Carlo simulation of 250 cycles and this was the result I got:

[Image: Monte Carlo forecast results for the 30-item backlog]

With 94% confidence, the backlog of 30 items could be delivered by the 24th of March. Considering the actual date I recorded was 20th of March, that is a pretty impressive result.
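KanbanSim drives all of this from the SimML model. To show the bare mechanics of such a forecast, here is a minimal throughput-style Monte Carlo in plain Python: it resamples historical daily throughput to answer "by when will 30 items be done?". This is a simplification of what the tool does, and the throughput samples (and the year in the start date) are invented.

    import random
    from datetime import date, timedelta

    # Invented history: items completed per working day over recent weeks.
    daily_throughput = [0, 1, 0, 2, 1, 0, 1, 3, 0, 1, 2, 0, 1, 1, 0, 2]

    def days_to_finish(backlog_size, history):
        # One Monte Carlo trial: resample daily throughput until the backlog is empty.
        remaining, days = backlog_size, 0
        while remaining > 0:
            remaining -= random.choice(history)
            days += 1
        return days

    trials = sorted(days_to_finish(30, daily_throughput) for _ in range(250))
    p85 = trials[int(0.85 * len(trials))]    # 85th percentile of the 250 runs
    start = date(2020, 2, 3)                 # 3rd of February; the year is a placeholder
    # Simplification: timedelta counts calendar days, while throughput is per working day.
    print("85% of runs finish within", p85, "days, i.e. by", start + timedelta(days=p85))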

After this promising outcome, I wondered what would happen if I included more items from the backlog: would the model still hold?

To do so I created a second set of deliverables, again of 30 items, resulting in a total backlog of 60 items. Team composition had not changed during that period; would the forecast still be consistent with the delivery date of the last item?

[Image: backlog extended with a second set of 30 deliverables]

I updated the end date to be 22nd of April and then re-ran another Monte Carlo simulation of 250 cycles.

[Image: Monte Carlo forecast results for the 60-item backlog]

The simulation above gives 28th of April as the delivery date with a confidence of almost 90%. That is 6 days off the actual date recorded (22nd of April).

A quick look at the cycle time distribution shows that something is indeed off: the minimum simulated cycle time was two days, when in fact quite a few items in the actual data set took one day to complete.

This suggests the model needs more fine-tuning, as its cycle time distribution should resemble the one actually displayed by the team. Nevertheless, I consider the experiment a success: the forecasts were off the actual dates by 3% and 6.5% respectively, which I believe is quite a reasonable margin for a software project.

Conclusion

With KanbanSim it is possible to build a model that can be used to forecast delivery dates for software projects.

There are some caveats to predictability; Daniel Vacanti does a wonderful job of explaining them in his outstanding Actionable Agile Metrics for Predictability, so I won't elaborate here. In short, ensuring that conservation of flow applies to our process, and having a stable system, both contribute positively to a predictable delivery cadence.

Something I haven't mentioned so far is that this team took to heart certain principles to minimize the chance of items getting stuck on our board. In addition to a team composition that remained stable for three months and a cross-functional spirit, we also adopted the Full Kitting philosophy: making sure we had everything in place to execute the work (dev and test environments up and running, adequate resources, etc.).

For me, that explains why the result of the simulation matches the actual figures so closely, and it highlights that a thorough understanding of Kanban and some principles of flow definitely help to create a stable system.

Sources

Magennis, Troy (2011) Forecasting and Simulating Software Development Projects

Vacanti, Daniel (2016) Actionable Agile Metrics for Predictability

KanbanSim
