I don’t believe in Scrum
I read “I Don’t Believe in Sprints” yesterday and found myself disagreeing with almost all of the article, while agreeing strongly with its headline. Here’s my attempt to collect my thoughts about why.
The worst thing you can do in software development
The worst thing you can do in software development is to build the wrong thing. If you build the wrong thing, you have to pay three costs:
- you pay the cost of delay for the valuable thing that you could have been building while you spent your time building the wrong thing;
- you pay the cost of maintenance on the software you’ve built, every moment up until the point you realize it’s the wrong software and you delete it (and you pay the cost of deleting it, which may not be negligible by the time you get around to deleting it);
- you pay the frustration cost of realizing that the thing that you spent your time and effort building wasn’t valuable.
How do we end up building the wrong thing? There are plenty of reasons, but two of the most common are:
- we don’t know what other developers are working on, so two of us end up writing code that solves the same problem, or pieces of code that conflict with each other;
- we don’t understand the customer’s requirements properly and so we end up building a solution that doesn’t solve the problems they’re facing.
How do we avoid those situations? We coordinate. We coordinate within the team to minimize the risk of duplicate or conflicting work, and we coordinate with our external stakeholders to ensure that we are building the right thing at the right time and in the right way to solve their problems.
Now, as I said at the top of the article, I don’t believe in sprints. I think Scrum and Scrum-like processes are a pretty inefficient way to do coordination. However, it’s not enough simply to point at Scrum and say “this is bureaucracy, bureaucracy is bad.” One has to recognize what problems Scrum is trying to solve and either say “those problems aren’t worth solving,” or say “those problems are worth solving, but Scrum solves them badly.”
If you don’t say either of these things, but simply say “Scrum is bureaucracy,” then I have to ask: given this list of Scrum events and the coordination problem they aim to solve, which would you remove?
- Sprint planning is supposed to ensure that the team is working on the most important things. If you remove sprint planning, how do you aim to ensure this happens?
- The daily Scrum (much more commonly known as ‘standup’) is there to ensure that, given the product team is working on somewhat-related things, often on a shared codebase, we do not do conflicting work and we find ways to help each other in case there are impediments. If you remove the standup, how do you aim to make sure those conversations happen?
- The sprint review is there to look with our stakeholders at the work that has been done, check whether it accomplishes the goal it was supposed to accomplish, and give feedback to help guide future iterations of the product. If you remove the review, how do you plan to know if your stakeholders are happy with your work?
- The retrospective is there to look at the way the team worked together during the sprint, how that felt, and how the team can improve for next time. If you remove the retrospective, how do you plan to reflect on the way the work is being done and make improvements?
Looking at the list of Scrum events, all the coordination problems they aim to solve seem essential to me. I cannot see a way to throw out any of the events without providing a suitable replacement. So, to me, what the author of “I Don’t Believe in Sprints” calls “bureaucracy,” I would call “a perhaps inefficient way to deal with essential challenges facing any software development team.” Seeing any coordination as bureaucracy and as getting in the way of the ‘real’ work makes it pretty difficult to work with others at all, unless you’re happy to build the wrong thing a lot.
The second-worst thing you can do in software development
I said that I agreed with the headline of the article to which I’m responding. I don’t believe in sprints, or more precisely I don’t believe in Scrum. And yet, the majority of this post so far has amounted to a pretty vigorous defence of Scrum against the charge that it’s just bureaucracy. What gives?
Well, perhaps the second-worst thing you can do in software development is to have a product backlog which expands in size over time. Our aim (I would suggest) as a software development team is to sustainably deliver the maximum value in the minimum amount of time. The best measure of our success is our lead time: the time between our agreeing to solve a problem and our delivering a solution to that problem. If our backlog expands and we treat it as a first-in, first-out queue, then our lead time will also increase over time, and we will become worse and worse at our job.
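To make the link between backlog size and lead time concrete, here is a minimal sketch. It leans on Little’s law (average wait is roughly queue length divided by throughput) and on simplifying assumptions of my own: constant weekly rates, equally sized items, and a strict first-in, first-out queue. The function name and the numbers are purely illustrative.

```python
def fifo_lead_times(arrivals_per_week, throughput_per_week, weeks):
    """Estimate the lead time (in weeks) of an item accepted at the end of each week.

    Assumes a strict first-in, first-out backlog, constant rates, and
    equally sized items, so a new item waits for everything ahead of it.
    """
    backlog = 0
    lead_times = []
    for _ in range(weeks):
        # The backlog changes by the gap between intake and delivery.
        backlog = max(0, backlog + arrivals_per_week - throughput_per_week)
        # Queue length divided by throughput approximates the wait.
        lead_times.append(backlog / throughput_per_week)
    return lead_times


growing = fifo_lead_times(arrivals_per_week=12, throughput_per_week=10, weeks=4)
balanced = fifo_lead_times(arrivals_per_week=10, throughput_per_week=10, weeks=4)
print(growing)
print(balanced)
```

With twelve items arriving and ten delivered each week, the estimated wait climbs every single week; with matched rates it stays at zero.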
Of course, no one treats their backlog as a first-in, first-out queue. Instead, teams prioritize the product backlog, aiming to work on the right thing first. However, there are two problems with this approach if the product backlog is growing:
- correct prioritization requires us to apply a ‘weighted shortest job first’ rule. This requires accurate estimates of both the value we can realize by finishing the job and the time it will take to finish it. Both quantities are hard to estimate.[1]
- estimates are not static over time. An item in the product backlog which was hard to do 3 months ago is now trivial because Mary ended up doing a bunch of foundational work when she worked on something else, and we can use all of that to deliver this item now. Another item in the product backlog would have delivered tremendous value to our customers 6 months ago, but now Apple have built it into the latest version of iOS so there’s no need for it. So we ought to re-estimate every item in the backlog periodically.
As the backlog grows, either lead time increases, or we are required to do more and more estimation — with higher and higher penalties (in terms of cost of delay) for getting it wrong.
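As a sketch of what ‘weighted shortest job first’ means in practice, and of why stale estimates matter, the snippet below ranks items by estimated value divided by estimated duration. The item names, units, and numbers are hypothetical, and real WSJF schemes often use a richer cost-of-delay figure than a single value score.

```python
from dataclasses import dataclass


@dataclass
class BacklogItem:
    name: str
    value: float     # estimated value of finishing the job (arbitrary units)
    duration: float  # estimated time to finish the job (e.g. days)


def wsjf_order(items):
    """Return items sorted by value per unit of time, highest first."""
    return sorted(items, key=lambda item: item.value / item.duration, reverse=True)


backlog = [
    BacklogItem("export", value=8, duration=4),   # ratio 2
    BacklogItem("search", value=9, duration=3),   # ratio 3
    BacklogItem("sync", value=20, duration=20),   # ratio 1
]
before = [item.name for item in wsjf_order(backlog)]

# Foundational work done elsewhere makes "sync" much cheaper, so it is re-estimated.
backlog[2].duration = 4                           # ratio is now 5
after = [item.name for item in wsjf_order(backlog)]

print(before)
print(after)
```

The ranking is only as good as the two estimates behind each ratio: one re-estimate moves “sync” from last place to first, which is why a long backlog demands so much recurring estimation work.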
Scrum by itself doesn’t offer us solutions for keeping the size of the backlog under control. It concerns itself with how the team works and introduces cadence and some measurement of work, which can help us. In the best-possible interpretation of Scrum, the measurement of velocity can be used over time by the team to match the rate of incoming items accepted to the backlog to the rate of work delivered, hence ensuring the backlog doesn’t get out of control. However, this requires relatively accurate estimation, so that velocity means enough that you can use it for forecasting. Accurate estimation is hard, and harder if the work we’re doing is non-trivial, and estimation isn’t intrinsically valuable — it’s an activity we do in service of our process.
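A minimal sketch of that best-possible interpretation: use recent velocity as a cap on how much new work is accepted per sprint. The function name is hypothetical, and the whole idea assumes the point estimates are accurate enough for an average to mean something.

```python
def intake_budget(recent_velocities, already_accepted):
    """Points that can still be accepted this sprint without growing the backlog.

    recent_velocities: points delivered in each of the last few sprints.
    already_accepted: points of new work accepted into the backlog so far this sprint.
    """
    average_velocity = sum(recent_velocities) / len(recent_velocities)
    # Accepting more than we deliver on average makes the backlog grow.
    return max(0.0, average_velocity - already_accepted)


remaining = intake_budget(recent_velocities=[20, 24, 22], already_accepted=15)
print(remaining)
```

If the estimates behind those velocities are unreliable, the budget is unreliable too, which is exactly the dependency on estimation described above.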
Ideally, we would find ways to reduce the importance of — and hence the time spent on — estimation. Scrum on its own does not help us to do this. Most teams working with something Scrum-like, in my experience, have a growing backlog and are more and more dependent on estimation (not to mention all the other maintenance costs of a growing backlog) to ensure that they deliver with a reasonable lead time.
How things could be better
I criticized “I Don’t Believe in Sprints” for not offering alternative solutions to problems which seem to me unavoidable. That places the onus on me to offer an alternative solution if I agree with the headline. Here’s a brief attempt.
Ideally, we spend as little time as possible coordinating with each other to deliver the value we require. That is, there is no coordination effort expended purely in service of our process, or on dealing with queues that we have created for ourselves. Concretely, this would mean that in the ideal state, we have a Kanban board showing each activity required in our process to take something from conception to completion (for example user research, visual design, programming, code review, deployment, release, marketing, and sales), and have at most one item currently in progress in each column. We would have achieved one-piece flow, and all activities in support of this flow (planning, stakeholder review, work coordination, reflecting on and improving ways of working) would and could happen on-demand.
When the team working on programming has nothing in their queue, they trigger a planning conversation: “what should we work on next? Let’s get together with our product owner and decide together.” When someone on the team notes a problem or recurring pattern in the way we work together, they can gather the team to discuss and improve this situation. When an item of work is finished, it can be released immediately. These conversations and activities can happen on-demand because there is no other work in progress, and so people are quickly available.[2]
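The pull mechanics of that ideal state can be sketched as a toy model. Everything here is an assumption for illustration: the `Board` class, the three column names, and the `plan_next` callback standing in for the on-demand planning conversation.

```python
class Board:
    """Toy Kanban board with a work-in-progress limit of one item per column."""

    def __init__(self, columns):
        self.columns = list(columns)
        self.slots = {column: None for column in self.columns}
        self.done = []

    def pull(self, plan_next):
        """Advance every item one step where possible, then plan if the first column is empty."""
        last = self.columns[-1]
        if self.slots[last] is not None:
            # Finished work is released immediately.
            self.done.append(self.slots[last])
            self.slots[last] = None
        # Walk right to left so a slot freed downstream is refilled in the same pass.
        for i in range(len(self.columns) - 2, -1, -1):
            source, target = self.columns[i], self.columns[i + 1]
            if self.slots[source] is not None and self.slots[target] is None:
                self.slots[target] = self.slots[source]
                self.slots[source] = None
        first = self.columns[0]
        if self.slots[first] is None:
            # An empty queue triggers the planning conversation on demand.
            self.slots[first] = plan_next()


ideas = ["A", "B"]
board = Board(["design", "programming", "release"])
for _ in range(5):
    board.pull(lambda: ideas.pop(0) if ideas else None)
print(board.done)
```

Planning is never scheduled in this model; it happens only when the first column runs dry, and no column ever holds more than one item.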
So: is Scrum closer to my ideal than other ways of working? Absolutely. I would much rather see planning and delivery happening in (typically) two-week iterations than in cycles of months or years, with all the documentation, Gantt-chart-wrangling, and rework that this implies. However, there are two issues:
- If Scrum is not paired with a mechanism to limit the size of the product backlog, it does not move us closer to an ideal state of one-piece flow and minimized lead time, but away from it.
- When the sprint-based approach of two-week cycles is actually serving to deliver on commitments or plans that are on a longer cycle (for example, the quarterly product/epic planning process which is common in the industry), teams experience it as a frustratingly artificial slicing of work. We know we’re going to work on the FooController a bunch over the next 3 months. Why are we only planning the next two weeks of that work and not budgeting in time to improve its foundations so that the total workload over the next 3 months is reduced? Or: we know that our product plan is largely fixed for the next 3 months. Why are we presenting the last two weeks’ work and asking for feedback, if there is no significant way in which that feedback will affect the work we plan and execute in the next two weeks?
Wrapping up
All of this is to say: I agree that many implementations of Scrum, or something Scrum-like with sprints, end up with a lot of bureaucracy and conversations which are ultimately pretty pointless. But, where the bureaucracy is actually unnecessary, this is often an effect of the team’s cadence not being matched to the actual cadence of product planning. And where it’s actually necessary to have the conversations, we need to have an answer for how those conversations will take place.
As ever, working with others is hard, and complex, and every team is on a journey (one hopes in the right direction) and at a different point in that journey. It’s simply not enough for us to point at a thing and say ‘this thing is bad.’ We have to be clear about the goal we’re trying to achieve, whether the thing we’re pointing at is aiming to solve the problems that get in the way of us achieving that goal, and how we would solve those problems better. And we need to keep doing this every day, because the answers change as our team evolves, as our codebases grow, as the technology landscape changes, and as our users and their requirements and expectations change. That might sound like a drag, and a distraction from the ‘real work’, but it is the real work.
1: The more complicated the job, the harder it is to estimate. Unfortunately, complexity often correlates with value: my moving a button 4 pixels to the left is (generally) not a very complex job, but it’s also (generally) not very valuable. Building a working nuclear fusion reactor is so complex it might not even be possible, but it could change the course of human history.
2: What I’m describing here is an absolute ideal state, which would come at the end of a very long journey of continual reduction of cycle times and improvement of coordination and psychological safety levels on the team. I see this as a ‘North Star’ and I’m not suggesting that any team throw out regular cadence for their meetings any time soon. Even in the ideal state where the queues are short enough that one could do everything on demand, there’s probably value in having regular and predictable cadence for many conversations.