Anti Agile - Introducing the Surge Sprint
What started off as occasional whispers is increasingly turning into a constant background hum: a frustration with “Agile”, and a sense that something is not quite right.
Are these people pining for the good ol’ years of waterfall? Of course not. Many of them have never known anything except Agile, or at least the modern take on it. It amuses me that many of those complaining probably don’t realise how good they have it, and that many companies carry far more overhead than the things they are complaining about.
But when you listen, you quickly see that they aren’t complaining about Agile as laid out in The Agile Manifesto. They are complaining about Scrum, which has become virtually synonymous with Agile at this point. This was not always the case, and it certainly wasn’t what the authors of the manifesto intended.
Scrum is big business, borderline religious. Any complaint is met with “You haven’t implemented it properly”. But as I have previously discussed, the purpose of a system is what it does, not what we intend. If, after more than 20 years, hardly anyone can implement Scrum properly, and many of those who have to work with it get frustrated, then perhaps the issue is more complex than just an implementation failure.
So what are the main complaints?
The main argument is the strictly enforced overhead, ironic for an “agile” methodology (“Individuals and interactions over processes and tools” anyone?). Between all the ceremonies, a developer spends a lot of time in meetings. In a properly implemented Scrum, 15-20% of sprint time is spent in ceremonies.
Standups in particular get a lot of negative attention, especially in distributed teams where timezone differences force them into the middle of the day, breaking flow.
Next, it’s a very inward-looking approach. There is literally nothing in the Scrum best practices about getting real customer feedback. The Sprint Review/Demo is the closest thing, but sometimes not even non-tech stakeholders attend it, never mind customers.
Should roadmaps be changing every two weeks? Probably not. Bug fixes and minor realignments, sure, but not more significant adjustments. And estimating over such a short period is tough, especially with external dependencies, unexpected absences or production issues. So a lot of time is spent explaining why dates weren’t hit, or why estimates were wrong.
Finally, story points are confusing. The intention behind distinguishing story points from man-days is noble, but in practice it leads to far more debate and explanation (especially to impatient business stakeholders) than the benefit it brings can justify.
So what happens? In many organisations, I’ve seen Scrum drift into Kanban, and one can see why:
- Easier to ignore certain ceremonies - or make them less frequent - without feeling guilty
- Shifts in priorities, illnesses and newly raised bugs can be organically absorbed without too much replanning - since engineers work on whatever is top of the queue
The issue with this approach is that it can be a bit demoralising for the engineers as there is less visibility into clear goals, and so the roadmap looks never-ending - there is no light at the end of the tunnel. Similarly, business sponsors can get frustrated by the lack of clarity.
Surge Sprints
I propose an amendment to Scrum: a longer, differently structured sprint that I call a Surge.
The most important thing to change is the focus on the two-week sprint. Two weeks isn’t long enough, and it means the ratio of work to ceremony is off-kilter. It also means small delays or prioritisation reshuffles commonly spill across sprints, leading to more meetings to explain why.
So what is a better sprint length? Basecamp has advocated for the Shape Up method, which splits work into six-week blocks - with a very clear deliverable at the end of that period.
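To make the work-to-ceremony ratio concrete, here is a rough back-of-the-envelope sketch. Every number in it is an assumption picked for illustration (an eight-hour day, 9.5 hours of per-sprint ceremonies, a 15-minute daily standup), not a measurement, and `ceremony_share` is a hypothetical helper.

```python
HOURS_PER_DAY = 8  # assumed length of a working day


def ceremony_share(sprint_days: int, per_sprint_hours: float, daily_hours: float) -> float:
    """Fraction of one person's sprint time that goes to ceremonies."""
    total_hours = sprint_days * HOURS_PER_DAY
    ceremony_hours = per_sprint_hours + daily_hours * sprint_days
    return ceremony_hours / total_hours


# Assumed, illustrative figures only:
# planning + review + retrospective + refinement, held once per sprint.
PER_SPRINT_HOURS = 4 + 2 + 1.5 + 2
STANDUP_HOURS = 0.25  # a 15-minute daily standup

two_week = ceremony_share(10, PER_SPRINT_HOURS, STANDUP_HOURS)
six_week = ceremony_share(30, PER_SPRINT_HOURS, STANDUP_HOURS)

print(f"two-week sprint: {two_week:.1%}")
print(f"six-week sprint: {six_week:.1%}")
```

With these inputs the share drops from 15% to roughly 7%. In reality a longer block would probably need somewhat longer planning and review sessions, so treat the gap as an upper bound rather than a promise.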
Why were sprints two weeks long in the first place? I believe it was to give as good a release cadence as possible. Leaving aside the fact that many teams don’t release that frequently (and so multiple sprints get lumped together into a single, less frequent release for convenience), best practice has since shifted to continuous integration anyway. There is nothing stopping a six-week Surge from continuously releasing, using feature flags.
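For anyone who hasn’t worked with feature flags, here is a minimal sketch of the idea: unfinished work ships to production but stays switched off until it is ready. The `FeatureFlags` class, the `surge_discount` flag and the `checkout_total` function are all hypothetical names for illustration. A real team would more likely read flags from configuration or a dedicated flag service.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureFlags:
    """Hypothetical in-memory flag store; real flags would come from config."""
    enabled: set[str] = field(default_factory=set)

    def is_enabled(self, name: str) -> bool:
        return name in self.enabled


def checkout_total(prices: list[float], flags: FeatureFlags) -> float:
    """Sum the prices; apply the in-progress discount only when its flag is on."""
    total = sum(prices)
    if flags.is_enabled("surge_discount"):  # hypothetical flag name
        total *= 0.9
    return round(total, 2)


# The same code is deployed in both cases; only the flag differs.
print(checkout_total([10.0, 20.0], FeatureFlags()))
print(checkout_total([10.0, 20.0], FeatureFlags({"surge_discount"})))
```

The point is that releasing code and releasing functionality become separate decisions, so a six-week Surge can still deploy daily.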
The daily standup predates ubiquitous, asynchronous chat. Fifteen years ago, standups were essential. Now, they can be a repetitive distraction. We’re already continuously updating one another and untangling blockers through open, shared chat interfaces.
It is still important to use something like a bot to ensure everyone checks in, but that is enough. Then the team can come together - ideally with product - a couple of times a week.
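The core of such a bot can be very small. The sketch below assumes the chat tool can tell us the date of each person’s latest check-in. The names `missing_checkins` and `reminder` are hypothetical, and wiring them up to a real chat platform is left out.

```python
from datetime import date


def missing_checkins(team: list[str], checkins: dict[str, date], today: date) -> list[str]:
    """Return team members with no check-in dated today, in team order."""
    return [member for member in team if checkins.get(member) != today]


def reminder(missing: list[str]) -> str:
    """Build the message the bot would post to the team channel."""
    if not missing:
        return "Everyone has checked in today."
    return "Still waiting on a check-in from: " + ", ".join(missing)


# Example with made-up names and dates.
today = date(2024, 1, 10)
team = ["ana", "ben", "chi"]
latest = {"ana": today, "ben": date(2024, 1, 9)}  # chi has never checked in

print(reminder(missing_checkins(team, latest, today)))
```

A scheduled job could run this once a day and nudge only the people who haven’t posted, which keeps the check-in asynchronous.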
Every surge should have a session working directly with real customers. The engineers should participate, and develop empathy towards the problems they are trying to solve. I like Steve Krug’s thinking on this with usability testing - Don’t Make Me Think is a great read. The main takeaway is that we should stay silent while we watch users attempt to use our software. That way, we see how people really interact, without influencing them.
As an aside, I think a large reason for the existence of the Demo Gods is that we don’t spend enough time using our software directly with real people.
For the six-week surge to make sense, we need to have a good idea of what will be included. Instead of lengthy, inevitably incorrect, estimation sessions, we should first build a detailed “Definition of Done”. This is not a functional specification, but it should set out the boundary of what can and can’t be achieved in six weeks.
If it is not clear what the surge definition of done will look like, perhaps the problem statement is not well understood. Instead of launching into a poorly understood set of work, set aside a week for a prototype sprint. The goal of this sprint should be to reduce ambiguity of the most unclear problems. It is not to wireframe a well-understood flow that feels good to visualise, but doesn’t reduce uncertainty.
With a clear Definition of Done and a longer timeframe to work within, detailed, task-level estimates are not required. Developers hate estimating, because we have all been burnt by the “easy” requirement that turned out to have huge hidden complexity, or the framework bug that wasted days of investigation. And that’s assuming something is possible at all!
Note: XKCD is always amazing, but this goes to show how quickly the macro environment can change - that research team and five years produced great results!
Since there is a commitment to complete everything within the six-week sprint, prioritisation is less important - developers can use their own common sense. Wildly incorrect assumptions, or team members not pulling their weight, will still be addressed in the retrospective - at least to the extent that was ever possible during a normal sprint.
At MISSION+ we’re still experimenting with Surge Sprints - we’ll write a follow-up article soon to let everyone know how it went!
So in summary:
- Six-week sprints
- Chat standup bot updates every day
- Two alignment meetings with product per week
- Release frequently, using feature flags to control what functionality is available
- Agree to a clear Definition of Done, simplify estimations
- If a clear Definition of Done is not possible, do a one-week prototype sprint
- One (or more) user testing session per surge
- Developers can use common sense to prioritise within a sprint, but make sure misbehaviours are addressed in the retrospective
The Agile Manifesto starts with “We are uncovering better ways of developing software by doing it and helping others do it.” Our collective learnings and understanding have uncovered better ways of doing Scrum, and that’s what Agile is all about! So we’re not Anti-Agile after all.