Escalators, Software Projects, and the Science of Queues

I recently started a new job. New jobs always bring new challenges like building relationships with colleagues, learning about the business domain, or getting up to speed on the tech stack.

This job has all of those challenges, but also a challenge that I wasn’t expecting – getting into the office.

Let me explain.

The main entrance to this office is on the third floor of the building. There are many ways to get into the first floor of the building (probably dozens, I don’t know for sure, but I’m not curious enough to find out). There are four ways that I know of to get to the second floor. There are two ways to get to the third floor – one single-file escalator and one elevator. Everyone is trying to get to the third floor.

You probably know where I’m going with this, but stick around.

I get into the office “early”. Early is subjective for a tech company, but it’s enough to say that I get in before 90% of the rest of the people in the office. From the time I enter the building, it usually takes just a few minutes to reach my desk. Sometimes I ride the escalator, though most of the time I walk up it. When there are few other people using the escalator, I can get right on and climb my way up quickly.

But sometimes I get into the office later – around the same time that almost everyone else is getting into the office. As I get closer to that single escalator going up to the third floor there is often a huge line of people also trying to get up that escalator. As the line grows, causing lines to form on other floors, the typical three minute trip can take anywhere up to fifteen minutes. The distance I travel is exactly the same. The route is exactly the same. The only difference is utilization.

This is a problem that software teams run into often. We can work quickly and deliver more predictably when the route is clear and we have spare capacity. When we load up a team with multiple projects to “fully utilize” them, we’re actually slowing them down and drastically reducing their predictability.

This is the crux of queuing theory. Here’s an excellent write-up of queuing theory in the context of software development. When utilization is low, delivery speeds are fast. When utilization increases, delivery speeds slow and queues grow. You can start to predict delivery times by measuring queue lengths. And you can improve delivery times by simply reducing or controlling queue size.
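To make that concrete, here’s a tiny sketch (my own illustration, not taken from the write-up) of the classic M/M/1 queueing formula. It assumes random arrivals and a single server – like our one escalator:

```python
# Expected time in system for an M/M/1 queue: W = 1 / (mu - lambda).
# As utilization rho = lambda/mu approaches 1, W grows without bound.

def time_in_system(arrival_rate: float, service_rate: float) -> float:
    """Average time a job spends waiting plus being served (M/M/1)."""
    if arrival_rate >= service_rate:
        raise ValueError("utilization >= 100%: the queue grows forever")
    return 1.0 / (service_rate - arrival_rate)

# Suppose one escalator moves 10 people per minute.
service_rate = 10.0
for utilization in (0.5, 0.8, 0.9, 0.99):
    w = time_in_system(utilization * service_rate, service_rate)
    print(f"{utilization:.0%} utilized -> {w:.2f} min in system")
```

Notice the time in system at 99% utilization is fifty times the time at 50% utilization – same route, same distance, different utilization.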

Even if an escalator breaks down, the total time it takes me to get into the office won’t be drastically impacted if there aren’t many other people using it. This is true of a software project, as well. A bug found when the team has spare capacity will have little impact on the overall delivery times.

Here’s the other fun component of queuing theory – batch sizes. For illustration purposes, let’s say it takes one minute to get one person from the entrance of the building to the office. We’ll call this the cycle time. If people arrive no more than once per minute, there will never be a wait. Each person will get to their desk in one minute.

Our office is connected to a train station. So very frequently we have hundreds of people showing up at exactly the same time, which drastically increases the utilization of the escalators. Quickly, a queue forms that continues to grow until people return to arriving less than once per minute. While the first person in the line will have a cycle time of about one minute, the last person in the queue will have a cycle time orders of magnitude higher (depending on the length of the queue).
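Here’s a toy model of the train-station effect. The one-minute cycle time and the batch of 200 are made-up numbers, just to show how batch size dominates cycle time for the people at the back:

```python
# A batch of people arrives at once; the escalator serves one per minute.
# The Nth person's cycle time is N minutes: batch size sets the worst case.

def cycle_times(batch_size: int, minutes_per_person: float = 1.0) -> list[float]:
    """Time from arrival to desk for each person in a simultaneous batch."""
    return [(i + 1) * minutes_per_person for i in range(batch_size)]

train = cycle_times(200)
print(f"first person: {train[0]:.0f} min, last person: {train[-1]:.0f} min")
```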

The effect of a broken escalator in this situation is significantly more impactful to overall cycle times.

Big batches slow down delivery. A huge project ahead of a small one means the small one will not be completed until the big one is finished, making its delivery timeframe wildly unpredictable.

There are some simple options to fix this issue. The obvious one is to reduce batch sizes and work in progress. Don’t start more than can be finished based on historical cycle times. Don’t pull in new work until the existing work is complete. Split projects into tiny deliverables. I’ve talked before about how splitting big stories can improve delivery times.

Other more difficult options focus on reducing utilization or cycle times in other ways. In our example, we could add more escalators. Or we could speed them up. This is considerably more complex for a software team, of course, but could be the right long term option.

Maybe we need more people on the team. Maybe we need more automation. Maybe we need to move to a more componentized architecture. These may be positive changes, but they require significantly more effort to implement.

If we want to reduce delivery times, the last thing we need is an attempt to “keep everyone busy”.

Much of this is inspired by the work of Don Reinertsen and Mary & Tom Poppendieck. I’d highly recommend reading The Principles of Product Development Flow: Second Generation Lean Product Development and Lean Software Development: An Agile Toolkit for a much deeper dive into lean software development.

SimCity BuildIt – A Lean Software Training Ground

Confession time. I’m addicted to SimCity BuildIt.

Ok, with that out of the way I want to talk about how this game is making me more conscious of lean software development principles. The gameplay is pretty simple – you construct roads and buildings, then produce raw materials and manufacture goods to complete missions and upgrade buildings.

Simple. But there are some challenges. One challenge is that the time it takes to produce materials and goods varies by type. This complicates my life because I don’t know ahead of time what I’ll need to complete a mission. So, if a mission comes up which requires 5 watches, for example, I’m in trouble. A watch is built from a chemical, a glass, and a couple plastic materials. Chemicals take 2 hours to make, glass takes 5 hours, plastic takes 15 minutes. The watch itself takes a couple hours and I can only produce 1 watch at a time.

If I go full-blown lean, I’ll do this just-in-time and the first watch will take about 7 hours to build, then each additional one will take 6 hours (I can produce up to 45 materials in parallel, so the material production time is equal to the longest part, in this case the glass).
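Here’s roughly how I think about the math, sketched in code. The timings come from the game as described above; the model (all materials started in parallel, watches assembled one at a time) is my simplification of the in-game bookkeeping:

```python
# Material production times from the game; the watch factory is serial.
MATERIAL_HOURS = {"chemical": 2.0, "glass": 5.0, "plastic": 0.25}
WATCH_ASSEMBLY_HOURS = 2.0

def hours_for_watches(count: int) -> float:
    """Just-in-time: materials produced in parallel, then watches serially."""
    material_time = max(MATERIAL_HOURS.values())  # glass is the long pole: 5 h
    return material_time + count * WATCH_ASSEMBLY_HOURS

print(hours_for_watches(1))  # 7.0 hours for the first watch
```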

Here’s the problem. If the mission only lasts 3 hours I won’t be able to complete it. What’s a mayor to do?

I have the ability to carry a limited set of inventory, so I could pre-produce some of these more ‘expensive’ items and hang on to them for when they’re needed. The drawback is that this takes up shelf space and I have no idea when or how many items I’ll need. If I fully stock the shelves but end up needing another item, I’ll have to get rid of some of my fully produced items.

I can purchase fully produced items from other cities. I don’t have to carry the inventory, but I pay a premium in this case and there may not be any available when I need it.


Ok, I hear ya. So what does any of this have to do with software development?

One of my all time favorite software development books is Lean Software Development by Mary and Tom Poppendieck.

I learned that overproduction or producing things that nobody needs is waste. We should be building the smallest increment of an application, measuring the value it returns, and iterating on that. Spending a month adding a feature that nobody uses is a month of lost time that could have been used to produce something of value.

Along the same lines, producing something and letting it sit on a shelf provides no value. Small batch sizes can help with this. Often, we believe that we can’t release a software product until it’s “done”. So we build all the features we can think of, then push them all out at once. While we’re holding all of those completed features in inventory, we’re not realizing any value from them.

Another obvious, but seldom recognized, truth of software development is the effect of the theory of constraints. When building an application we’re only as fast as the slowest part of the process. If I need to produce cheese in SimCity, for example, it’s only a couple of hours – but producing the raw materials takes 5 hours. So what is a 2 hour process on its own is actually 7 hours end-to-end.

The theory of constraints shifts our focus to optimizing the slowest part of a system. Instead of compartmentalizing the development process into analysis, design, development, testing, release, etc, see the process as one piece. It doesn’t matter if we can test 20 things per week if we can only develop 4. Optimize the whole.
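As a sketch, the constraint math is almost embarrassingly simple. The stage capacities below are made up, echoing the 20-tests-vs-4-features example:

```python
# End-to-end throughput is bounded by the slowest stage (the constraint).
def weekly_throughput(stage_capacity: dict[str, int]) -> int:
    """Items per week the whole pipeline can actually finish."""
    return min(stage_capacity.values())

pipeline = {"analysis": 12, "development": 4, "testing": 20, "release": 8}
print(weekly_throughput(pipeline))  # 4: testing's capacity of 20 is irrelevant
```

Speeding up any stage other than development here buys nothing; optimizing the whole means finding that minimum first.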

We have a lot of tools to visualize and optimize our processes. Value Stream Maps can show us where our time is actually spent from start to finish. Cumulative Flow Diagrams can visualize bottlenecks in our workflow. There’s nothing I can do to speed up the production of electrical components, but we have complete control over everything we do when building a software project. The hardest part can be identifying where to focus.

Well, my appliances are done being made. Time to go upgrade a building in my city!

Look Mom, No Hands! Test Driving Code Without A Mocking Framework

I love TDD. I haven’t found a more effective way to incrementally design and build software in the 15+ years that I’ve been doing this. I have formed and evolved a lot of opinions about how I approach TDD, though.

Recently, I wrote a post for EuroStar Software Testing titled Look Mom, No Hands! Test Driving Code Without A Mocking Framework

This is a topic that has been on my mind for a long time. It’s not intended to start a mocks vs stubs flamewar or anything like that. Instead, I wanted to walk through my progression of TDD practices over the years and share what I’ve learned.

Don’t get me wrong – test-driving with a mocking framework is better than not test-driving at all. I just prefer stubs.

Looking back at the test cases in the Booked source code which utilize PHPUnit’s mocking framework (yes, there are still a lot), I can see just how entangled the test code is with the implementation of the production code. The source for Booked changes frequently and it is covered by more than 1000 unit tests. New features are introduced and, occasionally, some of the unrelated tests fail.

They fail because there is too much specified in the mock setup. In order to validate the behavior of some area of the code, I have to set up unrelated mock expectations to get collaborating objects to return usable data. If I change the implementation of an object to no longer use that data, my test shouldn’t fail.

A couple of years ago I stopped using PHPUnit’s mock objects and I’ve seen the resiliency of my unit test suite increase. I’ve also seen my development speed and design quality improve. Instead of ambiguous mock expectations scattered throughout the tests, I’ve built up a library of stub objects which have logical default behavior.
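Booked is PHP, but the pattern translates anywhere. Here’s a minimal Python sketch of a hand-rolled stub with a logical default – the `Clock` and `ReservationService` names are invented for illustration, not from Booked:

```python
# A hand-written stub with sensible defaults, instead of a mocking framework.
from datetime import datetime

class Clock:
    def now(self) -> datetime:
        return datetime.now()

class StubClock(Clock):
    """Logical default behavior: always returns a fixed, known time."""
    def __init__(self, fixed: datetime = datetime(2018, 4, 18, 10, 0)):
        self.fixed = fixed

    def now(self) -> datetime:
        return self.fixed

class ReservationService:
    def __init__(self, clock: Clock):
        self.clock = clock

    def is_in_past(self, when: datetime) -> bool:
        return when < self.clock.now()

# The test reads as behavior; there are no mock expectations to maintain,
# so refactoring ReservationService internals won't break it.
service = ReservationService(StubClock())
print(service.is_in_past(datetime(2018, 4, 17)))  # True
```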

When test-driving increments of functionality, I’m able to concentrate on the behavior that I need to implement rather than getting distracted with test setup and management.

More focus. Better design. Higher quality. No mocks.

We Built the Wrong Thing – From Ambiguity to Stability

Let’s set the scene. You’re out to lunch with your team celebrating a successful launch of a new feature. Your product owner interrupts the conversation to relay an email from a disappointed stakeholder.

From: Stakeholder, Mary
Sent: Thursday, April 19, 2018 11:51 AM
To: Owner, Product
Subject: Can we talk?

Thank you for all of your work, but this doesn’t do what I thought it would. Can we talk?

– Mary

The discussion around the table quietly shifts to how nobody ever knows what they want. “We followed all the agile best practices”, a senior developer frustratedly quips, “How did we build the wrong thing?”

What went wrong?

When you get back to the office, you huddle up with Mary and pull up the acceptance criteria.

Story: Adding events to a calendar
As a user
I want to enter events into a calendar
So that everyone knows when people are available
Given I've entered an event into the calendar
When I view the calendar
Then I can see that event

“This is what you asked for, right?”

Mary replies, “Yes – but it’s not what I wanted.”

“What do you mean?”

“Look at this. I want to set up a 3 day training session, but I only have one date picker. And every new event is the same color, so it’s really hard to see who is booked when. And I have no way to know when a new event is created. And…”

“Oh.” you interrupt. “We didn’t know you wanted that. You had all of those meetings with our PO. Why didn’t you ask?”

Mary, now frustrated with the amount of time seemingly wasted, responds “I thought we were all on the same page!”

Specification by Example

Is this a familiar story? Even using the de-facto acceptance criteria format so popular in agile, it’s very easy to build ambiguous expectations. Ambiguity leads to disappointed customers and frustrated developers.

Years ago, I read Gojko Adzic’s Specification by Example and it changed the way I view user stories. I cannot possibly do justice to all of the incredible advice and ideas from the book in this blog post, but I’ll try to summarize.

Instead of a PO or BA working with customers to capture the stories and later reviewing those stories with developers, Gojko recommends running specification workshops. We follow a simple workflow for this:

Derive scope from goals > Specify collaboratively > Illustrate requirements using examples > Refine specifications > Frequently validate the application against the specifications

Deriving scope from goals is probably the biggest change a team will need to make. Instead of being presented a set of acceptance criteria, the team is presented with a goal. For example, stating a goal of knowing people’s availability instead of a scope of building a calendar.

Working with the stakeholders, the team collaboratively identifies the acceptance criteria. Maybe a calendar is what is built. Maybe it’s a simple list. Maybe it’s a search. The point is that we start with the goal in mind, and collectively identify the scope. This eliminates the translation layer from stakeholder to product owner to development team.

Ambiguity--

The next couple steps are iterative. We extract real-world examples from the scenarios, and illustrate the acceptance criteria using those examples.

Instead of

Given I've picked a date
When I book that date
Then that date is booked

We have something like

Given Mary has selected 10:00 am on April 18th, 2018
When she completes the booking
Then the calendar indicates that Mary is unavailable on April 18th 2018 between 10:00 am and 10:30 am

It’s only a slight change, but it has massive effects. Using real examples leads to real questions. What if Mary is already busy at that time? What kind of indication should we show? Is the default event length 30 minutes? Can that be changed?

Ambiguity--

And here’s where it gets fun

Most teams write automated end-to-end tests for their applications, but a lot of the time these tests are defined and written after the functionality is built. We end up simply validating that what we built works how we built it. Even if the tests are built based on more traditional acceptance criteria, the person writing the test has to make some assumptions about how to make the application behave in the way that meets the criteria.

If we have a Cucumber feature file that looks like this:

Story: Adding events to a calendar
As a user
I want to enter events into a calendar
So that everyone knows when people are available
Given I've entered an event into the calendar
When I view the calendar
Then I can see that event

The person implementing the tests has no choice but to make up some dates to pick and the validation will likely be something generic.

When writing automated acceptance tests based on real-world examples, the tests can match the acceptance criteria 1:1. Not only does this enhance the clarity of how to test the application, it also brings gaps in the shared understanding of a story to light early.

Story: Adding events to a calendar
As an event organizer
I want to be able to indicate any events I'm participating in
So that everyone knows when I am available
Given Mary has selected 10:00 am on April 18th, 2018
When she completes the booking
Then the calendar indicates that Mary is unavailable on April 18th 2018 between 10:00 am and 10:30 am

Ambiguity--
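To show how directly a concrete example can drive a test, here’s a minimal, framework-free sketch in Python. The `Calendar` class and its rules are hypothetical stand-ins for the real application:

```python
# A direct translation of the concrete example into an executable check.
from datetime import datetime, timedelta

DEFAULT_EVENT_MINUTES = 30  # a question the concrete example surfaced

class Calendar:
    def __init__(self):
        self.bookings = []

    def book(self, who: str, start: datetime,
             minutes: int = DEFAULT_EVENT_MINUTES) -> None:
        self.bookings.append((who, start, start + timedelta(minutes=minutes)))

    def is_unavailable(self, who: str, at: datetime) -> bool:
        return any(w == who and s <= at < e for w, s, e in self.bookings)

# Given Mary has selected 10:00 am on April 18th, 2018
calendar = Calendar()
start = datetime(2018, 4, 18, 10, 0)
# When she completes the booking
calendar.book("Mary", start)
# Then the calendar indicates Mary is unavailable between 10:00 and 10:30
print(calendar.is_unavailable("Mary", datetime(2018, 4, 18, 10, 15)))  # True
```

The test and the specification are now the same sentences, so a gap in one is a gap in the other.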

Automating the Acceptance Criteria

One common frustration of test automation is maintenance and fragility. Features change and evolve over time. When tests are driven from an interpretation of the specifications rather than the actual specifications, maintenance becomes a challenge. It’s difficult to trace a specification change to an associated test (or set of tests). So minor changes in specifications tend to have major impacts to tests.

If the specifications are automated, instead of translated into automated tests, you know exactly what test is affected. In changing the specification, you are forced to change the test and underlying code. You can make micro changes and receive instant feedback that the application still works.


No silver bullets

This isn’t an overnight change. Like most things, it takes deliberate practice. Practice facilitating specification discussions with non-technical people. Practice finding the right type and number of examples.

The return on this investment can be huge. Specification workshops often lead to significant reduction in scope because technical people and business people are speaking the same language and understand the problem in the same way.

The resulting specifications are free of ambiguity, so everyone has a shared understanding of the exact behaviors they should expect from the application. Validating the application against the specifications in an automated way ensures the application is always working the way everyone understands and expects.

Eliminating the specification ambiguity builds a shared understanding between everyone involved, which leads to long term application stability. And that’s good for everyone.

Have you tried this?

I’m interested in hearing from readers about their experiences. Have you tried this or something similar? How did it go?

How Transaction Costs Influence Story Size

Of all the aspects of the INVEST principle, I think the attribute that software developers have the most influence over is the S: Small.

Small stories have huge benefits. We can control what gets built with much more granularity. Less scope means fewer requirements for the team to understand. Less for the end user to understand. Less code for the developer to change. Less code to test. Less code to release and maintain.

Less risk.

Ron Jeffries recently tweeted a short video on story slicing that triggered a question in my mind about why engineers resist small stories. As I thought about it more, connections formed to Don Reinertsen’s work in  Managing the Design Factory and Principles of Product Development Flow. There he describes the “transaction cost” of an activity. The transaction cost is the cost/time of everything needed to execute an activity.

If we put this in the context of releasing software, this may include the time to build, test, and deploy an application. If those tasks are manual and time consuming, human nature is to perform these tasks less often.

Transaction costs follow a U-curve, where performing an activity may be prohibitively expensive if the size of the activity is too small or too big.  For example, think about a story for user account registration. It would be costly to create a story for each word in the form text. Likewise, the cost of a story would be huge if it includes creating the form, validating it, sending notifications, account activation, error handling, and so on.
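Here’s a toy model of that U-curve. The fixed transaction cost and the risk cost that grows with story size are invented numbers, just to show the shape:

```python
# Per-point cost = amortized fixed transaction cost + size-driven risk cost.
# Both the 8.0 and the 0.5 are illustrative, not measured values.

def cost_per_point(story_points: float, transaction_cost: float = 8.0,
                   risk_factor: float = 0.5) -> float:
    """Total cost per story point for a story of a given size."""
    overhead = transaction_cost / story_points  # fixed cost spread thin
    risk = risk_factor * story_points           # grows with batch size
    return overhead + risk

for size in (1, 4, 8, 16):
    print(f"{size:>2} points -> {cost_per_point(size):.2f} per point")
```

Both extremes are expensive; the sweet spot in the middle moves left as we drive the transaction cost down.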

So, back to Ron’s video and push for thinly sliced stories. I posed the question about what to do when developers resist small stories. To dig into that, we need to understand why developers resist small stories.

A common argument that I hear against small stories is that it’s not efficient to work that way. We developers are always looking to maximize the work not done 🙂

This can be a problem, because the true work for a story includes everything to get it production ready. Writing each story. Developing each story. Code reviewing each story. Testing each story.

That is the transaction cost of a story.

Driving down those costs drives down the resistance from the developers. There are lots of ways to reduce these costs. We can pair program to eliminate the out-of-band code review. Use TDD and automated acceptance testing to build quality in from the start. Create an automated build and deploy pipeline to continuously deliver stories.

As we reduce the overhead of each story, we can slice stories thinner and thinner.

But wait, there’s more!

As Bill Caputo rightfully pointed out, big “efficient” stories include a lot more work than may be necessary. Thinking back about an account registration story, we may put account registration and account activation into the same story. That builds in an assumption that we have to activate accounts – which we may not.

Not splitting stories means we may efficiently build something we don’t even need. Peter Drucker famously said –

There is nothing so useless as doing efficiently that which should not be done at all.

Addendum – Be careful what you measure

Another reason developers may resist breaking things down into small deliverables is how they’re measured.

In most scrum worlds, a team is being measured based on velocity. Small stories are worth fewer story points, thus lower velocity.

Uh oh, let’s make bigger stories and drive higher velocity.

To address this problem, Ron suggested giving “credit” for stories completed rather than story points completed. This encourages splitting large stories into thin stories with little pieces of the overall functionality.

What Do DevOps, Test Automation, and Test Metrics Have in Common?

This is a guest post by Limor Wainstein

Agile testing is a method of testing software that follows the twelve principles of Agile software development. An important objective in Agile is frequently releasing high-quality software, and automating tests is one of the main practices that achieves this aim through faster testing efforts.

DevOps is a term that Adam Jacob (CTO of Chef Software) defines as “a word we will use to describe the operational side of the transition to enterprises being software led”. In other words, DevOps is an engineering practice that aims at unifying software development (Dev) and software operation (Ops). DevOps strongly advocates automation and monitoring at all stages of the software project.

DevOps is not a separate concept from Agile; rather, it extends Agile to include Operations in its cross-functional team. In a DevOps organization, different parts of the team that were previously siloed collaborate as one, with a single objective: to deliver software fully to the customer.

Agile and DevOps both utilize the value of automation. But there must be a way to measure automation, its progress, and how effective it is in achieving the aims of Agile and DevOps—this is where test metrics become useful.

In this article, you’ll find out about different types of test automation, how automating tests can help your company transition to DevOps, and some relevant test metrics that Agile teams and organizations transitioning to DevOps can benefit from to improve productivity and achieve their aims.

Types of Test Automation

Test automation means using tools that programmatically execute software tests, report outcomes, and compare those outcomes with predicted values. However, there are different types of automation that aim to automate different things.

Automated Unit Tests

Unit tests are coded verifications of the smallest testable parts of an application. Automating unit tests corresponds with one of the main Agile objectives—rapid feedback on software quality. You must aim to automate all possible unit tests.

Automated Functional Tests

Functional tests verify whether applications do what the user needs them to do by testing a slice of functionality of the whole system in each test. Automating functional tests is useful because it saves time—typical testing tools can mimic the actions of a human, and then check for expected results, saving valuable time and improving productivity.

Automated Integration Tests

Integration tests combine individual software modules (units) and test how they work together. Again, by automating integration tests, you get tests that are repeatable and run quickly, increasing the chances of finding defects as early as possible, when they are cheaper to fix.

Test Automation and DevOps

DevOps is a culture that aims to reduce overall application deployment time by uniting development and operations in software-led enterprises. Automation is at the heart of the DevOps movement—reduced deployment time and more frequent software releases mean increased testing volume. Without automation, teams must run large numbers of test cases manually, which slows down deployment and undermines the aims of DevOps.

One potential pitfall that can hamper the transition to DevOps is not having the required automation knowledge. After all, test automation is technically complex. Acquiring the knowledge to effectively automate tests takes time and costs money. You can either hire expert consultants to get you up and running with automation, hire qualified automation engineers, or retrain current testing staff. Whichever option you choose, it’s clear that automating is essential for implementing a DevOps culture in your development teams.

Enter Test Metrics

What Are Test Automation Metrics?

Implementing automation blindly without measuring it and improving on automated processes is a waste of time. This is where test metrics provide invaluable feedback on your automated testing efforts—test automation metrics are simply measurements of automated tests.

Test automation metrics allow you to gauge the ROI of automated tests, get feedback on test efficiency in finding defects, and gain a host of other valuable insights.

How Can Test Automation Metrics Help DevOps?

By measuring testing duration, you find out whether current automation efforts are decreasing development cycles and accelerating the time-to-market for software. If automated tests don’t run quicker than manual tests, then there are clearly technical issues with the automation efforts—perhaps the wrong tests are being automated.

How to Measure Test Automation

Some examples of test metrics used to measure test automation are:

  • Total test duration—a straightforward and useful metric that tracks whether automation is achieving the shared Agile and DevOps aim of faster software testing through increased automation.
  • Requirements coverage—a helpful metric to track what features are tested, and how many tests are aligned with a user story or requirement. This metric provides insight on the maturity of test automation in your company.
  • Irrelevant results—this measurement highlights test failures resulting from changes to the software or problems with the testing environment. In other words, you get insight on the factors that reduce the efficiency of automation from an economic standpoint. Irrelevant results are often compared with useful results, which are test results corresponding to a simple test pass or test failure caused by a defect.
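As a sketch, the requirements coverage metric from the list above could be computed like this. The data shapes are illustrative, not from any particular tool:

```python
# Requirements coverage: what fraction of requirements have automated tests?
def requirements_coverage(requirements: set[str], tested: set[str]) -> float:
    """Fraction of requirements with at least one automated test."""
    if not requirements:
        return 0.0
    return len(requirements & tested) / len(requirements)

reqs = {"login", "booking", "notifications", "reporting"}
covered = {"login", "booking", "notifications"}
print(f"{requirements_coverage(reqs, covered):.0%}")  # 75%
```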

Closing Thoughts

The DevOps movement extends Agile and aims to modernize software development with faster releases of high-quality software. Testing is a common bottleneck in the development cycle, which can hamper any attempt to integrate a DevOps culture.

Test automation is the link that gets software testing up to the speed of development, helping to achieve the aims of Agile and DevOps.

However, there must be a way to track all attempts to automate tests, since test automation is, in itself, an expensive investment and a technical challenge. Test metrics provide valuable feedback to help improve automation and ensure positive ROI.

About Limor

Limor is a technical writer and editor at Agile SEO, a boutique digital marketing agency focused on technology and SaaS markets. She has over 10 years’ experience writing technical articles and documentation for various audiences, including technical on-site content, software documentation, and dev guides. She specializes in big data analytics, computer/network security, middleware, software development and APIs.

When will you be home?

What time will you be home today? It’s a simple question. If you’re like most people you go to work and come home about the same time every day. You should be an expert in estimating when you’ll arrive at home. I bet you cannot predict the time that you’ll get home tonight, though. I bet you’d be even less accurate if I asked you to predict what time you’ll be home six months from now.

Sometimes I feel like this is what we ask software teams to do when we ask them for estimates. Software developers build software every day. We should be able to predict how long it will take to build new software – but we can’t reliably do it! There are ways to improve our estimates, though.

Let’s jump back to the commute example. If your commute is short then you’ll be accurate most of the time. Sometimes you’ll have a meeting that runs late. Maybe it’s Friday and your calendar is clear so you take off early. You can be way off from time to time, but pretty often you’ll be pretty darn accurate. This is why we want to break deliverables into small pieces. Small pieces have less uncertainty.

Things get less predictable the further out you go. Going from a half-mile commute to a mile commute will increase the variability. One mile to twenty miles increases that variability by orders of magnitude. There are simply more unknowns as the size of your commute grows. Traffic, accidents, weather – there are a lot of variables that can affect the actual time.
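Here’s a toy Monte Carlo sketch of the commute idea: if each mile adds an independent random delay, the spread of arrival times widens as the trip grows. (Real commutes are worse – delays like accidents are correlated across the whole trip, which widens the spread far more than this simple model shows.)

```python
# Each mile takes 2 minutes plus an exponentially distributed random delay.
# The standard deviation of the total grows with the number of miles.
import random
import statistics

def simulate_commute(miles: int, trials: int = 10_000,
                     seed: int = 42) -> float:
    """Standard deviation of total commute time, in minutes."""
    rng = random.Random(seed)
    totals = [sum(2 + rng.expovariate(1.0) for _ in range(miles))
              for _ in range(trials)]
    return statistics.stdev(totals)

print(f"1 mile:   +/- {simulate_commute(1):.1f} min")
print(f"20 miles: +/- {simulate_commute(20):.1f} min")
```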

Big features and deliverables have this same problem. There are simply too many unknowns to be accurate. Undocumented limitations of libraries, complex algorithms, changing requirements, for example. And asking software developers to commit to anything with that level of variance isn’t fair.

But it’s done anyway and leads to all kinds of dysfunctions. Unrealistic estimates based on little knowledge are treated as promises. Broken promises lead to distrust. Distrust leads to infighting, incessant status reporting, and pissed off developers.

So let’s stop asking for estimates that are months away. I don’t even know what time I’ll be home tonight 🙂