The First Rule of Scaling Agile: Don't.
Updated: Feb 28, 2022
At least not in the ways being sold right now.
In this post, I'll work to convince you of the following:
Waterfall and functional/hierarchical organizations are the same thing.
Over time, we tend to adopt waterfall because of cultural issues.
To scale Agile, you've got to solve those cultural issues and avoid functional/hierarchical organizations.
You want to have a product organization.
If you manage this, you can scale agile - but only horizontally, not vertically.
Most "scaled agile frameworks" attempt to have an agile software department exist and survive inside a waterfall functional organization.
Most of these frameworks won't work.
We will start with the conclusion and work towards the premises here. That conclusion is: "Agile" is the natural state of doing things. It's how people want to work; it's been invented dozens of times; things usually are this way without any intervention.
Think of any small start-up. Most of the work is ad hoc and trust-based. Process is introduced only when it speeds people up, and they work together to establish it.
So the real question isn't how do we scale agile in a waterfall organization, but rather how did the organization become waterfall in the first place?
Without understanding that, you won't know what you're up against. Without understanding what you're up against, you won't know how to fix it.
Looking Behind The Waterfall
Unfortunately, there's no master sword here.
First, let's drop the term waterfall. We call things waterfall in software, but in actuality, waterfall is just a subset of a much broader way of doing things we call gated development.
Gated development is like a relay race. One team does its work, then passes the artifact off to the next team, and then to the next.
We know waterfall (and gated development) doesn't work because:
Errors accumulate in the handoffs.
It's tough to backpropagate issues upstream.
Gated development is also tough to scale, just as running in multiple relay races at the same time is. Any time you're in two or more races at once, you couple those races, much like tightly coupled software. You may be busy running race A when race B is ready to be handed off to you: "Just wait a second; I'm nearly done."
(We go more into this in our "Agile: Deconstructed" training.)
In terms of software design patterns, the only way to "parallelize" gated development is the pipeline design pattern. But each pipeline stage has high variability in when it finishes. For example, UX may take an average of 3 months to get done with their part, but the 95% confidence interval is more like 2 to 5 months. There's also a large amount of variability BETWEEN projects: project A may be twice as complex as project B. This variability busts the pipeline apart.
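To make that concrete, here is a minimal sketch of a three-stage pipeline working through four projects. Every number in it is a made-up assumption, picked only so that each stage still averages 3 months per project in both scenarios. The single rule modeled is that a stage can't start a project until the upstream stage has handed it over AND the stage itself has finished the previous project.

```python
def makespan(durations):
    """Months until the last project leaves the last stage.

    durations[s][p] is how long stage s spends on project p.
    """
    finished = [[0] * len(row) for row in durations]
    for s, row in enumerate(durations):
        for p, months in enumerate(row):
            # When the upstream stage hands this project over (0 for the first stage).
            handed_over = finished[s - 1][p] if s > 0 else 0
            # When this stage is done with the previous project (0 for the first project).
            stage_free = finished[s][p - 1] if p > 0 else 0
            finished[s][p] = max(handed_over, stage_free) + months
    return finished[-1][-1]


# Hypothetical durations: rows are three stages, columns are four projects.
steady = [[3, 3, 3, 3], [3, 3, 3, 3], [3, 3, 3, 3]]
variable = [[5, 2, 3, 2], [2, 5, 2, 3], [2, 3, 5, 2]]  # every row still sums to 12

print("steady:", makespan(steady), "months")
print("variable:", makespan(variable), "months")
```

Same total work and same averages, yet in this toy case the schedule slips from 18 months to 22, because stages spend time either blocked on a handoff or idle waiting for one. Real projects will differ; the point is only that the averages hide the damage.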
You only go about as fast as if you were doing one thing at a time, BUT you're as stressed as if you were doing twelve.
Given all the downsides of gated development, AND given that small teams start with agile anyway, why do people ADOPT gated development?
There are two reasons.
Dunbar's 48 Laws of Power
A complex system that works is invariably found to have evolved from a simple system that also worked - John Gall
Keep the above quote in mind; we'll come back to it.
Gated development is a complex system. It's a MORE complex system than agile development. If complex systems have to evolve from simpler ones that work, we have to ask, why does gated development grow from more straightforward, often agile development?
What are the 'evolutionary forces' pushing gated development into the product, finance, marketing, and engineering spaces?
The first is a breakdown of trust.
As group size grows, the ability to have trusting relationships fades. Much research has been done on this limit, which is known as Dunbar's Number.
The specific numbers really don't matter, but what does matter is that as groups grow past a certain size, they reach a breaking point in trust.
There are two reasons trust breaks down, but only one significant consequence.
First, evildoers slip in...
The larger a group, the more likely you will find people who are only out for themselves in the group. We'll call these folks evildoers.
Evil is apparently arranged in society along a Gaussian curve. In other words, about 1 in 25 people you meet may meet the technical qualifications to be called a sociopath. One in twenty-five!
And it's worse at the C-Level, where the rate rises to 1 in 5.
That 1 in 25 number is essential, as that's about where Dunbar's number kicks in. Most teams can't scale past the 15-30 threshold without changing how they work. Now you know why. Because not everyone is actually pulling towards the shared goal - at least one, maybe more, are pulling towards their own personal goals in a very destructive way.
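A quick back-of-the-envelope check on those odds, as a minimal sketch. It assumes each hire is an independent draw at the 1 in 25 base rate, which is a simplification (hiring is rarely independent), and the function name is just for illustration.

```python
def chance_of_at_least_one(group_size, base_rate=1 / 25):
    """Probability that a group contains at least one evildoer.

    Assumes each person is an independent draw at the given base rate.
    """
    return 1 - (1 - base_rate) ** group_size


for size in (5, 15, 25, 30):
    print(f"team of {size}: {chance_of_at_least_one(size):.0%}")
```

Under that assumption, a team of 15 is a bit under a coin flip, a team of 25 is at roughly 64%, and a team of 30 is at roughly 71%. That is right in the range where teams tend to stop scaling without changing how they work.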
Why not just fire the evildoers?
That'd be a nice simple way to solve this puzzle, wouldn't it? They wouldn't be very good evildoers if they got caught easily, would they? These folks lie, cheat, and steal routinely.
Every medium-sized team is a game of Among Us - no one knows who the imposter is, but we know, or should know by the laws of probability, that there is probably at least one.
Firing is throwing someone out of the airlock. Ultimately, it is the solution you'd want if you found the imposter. But you're probably going to throw many good people out of the airlock first and possibly allow the imposter to hire other imposters.
Eventually, game over.
Second, mistakes are made.
Let's look at the game The Evolution of Trust. In the game, there are always-betray players; these are your evildoers from the last section. But, as you get to play more advanced scenarios, you run into problems with the dominant 'tit-for-tat' strategy of simply punishing those who betray you.
That problem is mistakes. There will be times when it feels like someone betrayed you, but they didn't intend it. They just made a mistake. The right way to handle this is to have some patience in your culture - forgive errors, ask for forgiveness, and avoid a culture of blame.
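Here is a minimal sketch of why mistakes are so damaging to plain tit-for-tat. It is not the game's actual code: the round count, the single forced slip, and the forgiving variant (often called tit-for-two-tats) are all assumptions chosen for illustration. "C" means cooperate and "D" means defect.

```python
def tit_for_tat(their_history):
    """Cooperate first, then copy whatever the other player did last."""
    return their_history[-1] if their_history else "C"


def tit_for_two_tats(their_history):
    """Forgiving variant: only retaliate after two defections in a row."""
    return "D" if their_history[-2:] == ["D", "D"] else "C"


def play(strategy, rounds=8, mistake_round=2):
    """Both players use the same strategy; player A slips once by accident."""
    a_moves, b_moves = [], []
    for r in range(rounds):
        a = strategy(b_moves)
        b = strategy(a_moves)
        if r == mistake_round:
            a = "D"  # A meant to cooperate but made a mistake
        a_moves.append(a)
        b_moves.append(b)
    return "".join(a_moves), "".join(b_moves)


print(play(tit_for_tat))       # ('CCDCDCDC', 'CCCDCDCD')
print(play(tit_for_two_tats))  # ('CCDCCCCC', 'CCCCCCCC')
```

With plain tit-for-tat, one slip in round three echoes back and forth as retaliation for the rest of the game. The forgiving variant absorbs it and both sides go straight back to cooperating, which is the code version of having some patience in your culture.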
The most heinous evildoer of all is the dishonest agreeable: the personality type who will always seem to convince you that they just, gosh darn it, made another mistake!
But how can we throw anyone outside the airlock without a culture of blame?
What usually happens is that people are really only assuaged by punitive measures - they want to see 'some process' introduced to ensure the last mistake is never made again.
It's far easier to forgive and understand at small group sizes. At larger group sizes, the chances go up that someone you don't have a relationship with makes a mistake that hurts you, so the demands for a disciplinary process start to kick in.
This is only made worse when evildoers are afoot, as one effective sociopathic tactic is to intentionally make mistakes and ensure they're blamed on perceived enemies.
To 'fix' things, punitive measures are introduced.
So we add rules, then rules on top of those rules, and more and more rules. Everyone needs to ask for permission; all work needs to be double-checked. The gates are down.
For those trying to tamp down on mistakes, double-checking and permissions are seen as a way to reduce them. However, of course, they reduce mistakes at the expense of a) productivity and b) employee growth, since we need mistakes to learn.
For those looking to get ahead, backing various punitive measures at different turns in the team's internal politics becomes a great way to punish rivals.
All of this comes down to culture - the H part of our Big OH culture model. If you want to scale agile, you will need to hire, train, retain and cultivate high H people. H stands for humility and honesty, and it's a personality trait that tends to predict collaborative vs. competitive values.
This still doesn't fix mistakes, but it does allow you to at least baseline your new scaled agile on a foundation of trust.
A foundation of trust does three things:
Helps each team grow while limiting the kinds of mistakes that attract punitive measures
Enables people to be more patient since you can often assume good intent for far longer
Allows TEAMS themselves to trust one another when they do grow too large and need to split
How you split is very important, and pushing for the right way to break teams up will cause you to run into the other significant cultural issue.
"...But that's not how Global Corp does it..."
That "other big cultural issue" is familiarity and momentum in organizational design.
How you split teams is important
Let's say we're working at Taco Bell. We have a workspace, making tacos, burritos, and burrito bowls. Tacos take hard shells, beef, and cheese; burritos take soft shells, beef, and cheese; and burrito bowls take chips, lettuce, and beef.
Demand is outstripping supply; we need to scale; what do we do? We get some extra space, but how do we split things up?
One way would be to have a taco workspace, a burrito workspace, and a burrito bowl workspace. This way, an employee can stay in one place, keep all their needed materials around them, and generate a lot of output without getting exhausted.
This would be a product-focused team split. That's not how most people do it, for some reason, though.
Instead, they specialize by ingredient: one place is the BEEF area, another place is the SOFT TORTILLA area, and so on. Workers have to go to each location, take their part, and walk all the way through the kitchen to finish their product. They get in each other's way, and when one person is busy in the beef area, others have to wait.
"Well, that's silly."
I agree, but we do it in software all the time.
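The kitchen is small enough to model directly. This is a minimal sketch using the three recipes from above; the function names and the idea of counting "shared stations" as a stand-in for contention are my own assumptions, not a real scheduling model.

```python
RECIPES = {
    "taco": ["hard shell", "beef", "cheese"],
    "burrito": ["soft shell", "beef", "cheese"],
    "burrito bowl": ["chips", "lettuce", "beef"],
}


def stations_by_ingredient(recipes):
    """One station per ingredient; map each station to the products that use it."""
    stations = {}
    for product, ingredients in recipes.items():
        for ingredient in ingredients:
            stations.setdefault(ingredient, []).append(product)
    return stations


def stations_by_product(recipes):
    """One station per product, stocked with everything that product needs."""
    return {product: [product] for product in recipes}


def shared_stations(stations):
    """Stations where more than one product line has to queue."""
    return {name for name, products in stations.items() if len(products) > 1}


print(sorted(shared_stations(stations_by_ingredient(RECIPES))))  # ['beef', 'cheese']
print(sorted(shared_stations(stations_by_product(RECIPES))))     # []
```

Split by ingredient, every order makes three stops and all three product lines queue at the beef station. Split by product, every order makes one stop and nobody waits on anybody else.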
How are you supposed to split teams if you're building up a full stack of code from three initial founders, and now you have fifteen people?
Two forces are going to be in play. The first is familiarity. People will split based on how they've seen it done elsewhere. That almost invariably means a backend team and a frontend team.
This is frustratingly dumb, as now each product (which is what people actually work on) needs a meeting between two teams of people. Meanwhile, those teams have little to talk about other than whether to use tabs or spaces.
The second force is that each founder will probably get a subteam.
So these splits often happen in a way that minimizes the difference from how people have seen org divisions done in the past and maximizes the recognition of existing power structures.
This is the wrong function to optimize.
The function you want to optimize is expected effectiveness.
Culture strikes again - low openness leads to functional teams.
We've already talked about how trying to build a foundation of trust can weaken the need to recognize accidental/implicit power structures and instead arrange power structures explicitly along the direction of organizational effectiveness.
In other words, imagine all organizations possible with 15 people - like an algorithm that generates every possible graph as an org chart. In some graphs, the intern will be the CEO. So think of EVERY possible one.
The function you want to optimize across this space is expected effectiveness - how well does this organization deliver value in the short, medium, and long terms? That's it.
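As a thought experiment in code, here is a minimal sketch that brute-forces every org chart for a team of four and picks the highest-scoring one. Everything specific here is hypothetical: the names, the honesty numbers, the assumption that an org chart is a tree with one person on top, and above all the toy `effectiveness` function. Real effectiveness is far harder to measure.

```python
from itertools import product


def reaches_top(chart, person):
    """Follow the chain of managers; False if it loops instead of reaching the top."""
    seen = set()
    while chart[person] is not None:
        if person in seen:
            return False
        seen.add(person)
        person = chart[person]
    return True


def org_charts(people):
    """Yield every org chart as a dict of person -> manager (None for the top)."""
    for top in people:
        reports = [p for p in people if p != top]
        options = [[m for m in people if m != p] for p in reports]
        for managers in product(*options):
            chart = dict(zip(reports, managers))
            chart[top] = None
            if all(reaches_top(chart, p) for p in reports):
                yield chart


def effectiveness(chart, honesty):
    """Hypothetical score: each report benefits from their manager's honesty."""
    return sum(honesty[m] for m in chart.values() if m is not None)


# Made-up people and made-up honesty scores.
HONESTY = {"ana": 0.9, "ben": 0.5, "cy": 0.4, "dee": 0.2}
charts = list(org_charts(list(HONESTY)))
best = max(charts, key=lambda chart: effectiveness(chart, HONESTY))

print(len(charts), "possible org charts")
print(best)
```

Four people already give 64 charts. Under the tree assumption the count grows as n^(n-1), so 15 people means 15^14 charts and brute force is off the table. The point of the sketch is the framing: score every candidate on effectiveness and see what wins, instead of starting from the chart that looks familiar.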
Low openness, another personality trait that's highly anticorrelated with personal values around risk-taking and being open to change, will predict how much your org chart is determined not by effectiveness but familiarity.
Is the intern the CEO? Yeah right. Let's not even consider that org chart. And, you're probably right. There are very few organizations where that's a good idea. But that same knee-jerk response also will push away more reasonable approaches like "let's put the most honest person in the management position."
Instead of putting the most honest person in a management position, we fall back on a variety of more familiar choices like:
Let's put the person I owe a favor to in management
Let's put the person who seems best at Skill X in management
Let's put the person who screamed the loudest about being in management in management
Let's put the bully in management (because he'll "get things done" regardless of the social cost)
What organization ensures expected effectiveness?
First, to recap - how did we get here?
Because while ensuring we have high H provides a foundation of trust, and ensuring we have high O allows us to try new team topologies, neither actually tells you which topology to choose - just how to evaluate them.
There are two related reasons we tend to specialize by function rather than by product, both related to low O or traditional values.
First, we tend to promote people to management who excel at a skillset. This doesn't make sense, as management is an entirely unrelated skill. However, that is how it traditionally happens, so our managers tend to sit over similarly skilled individuals.
Second, we tend to split things by functions because we've seen others do it and assume that's how it's supposed to be done.
These things work together since it's usually easiest to split by direct manager, and we've already got those.
But none of this should be new to you, dear software engineer. Functional breakdowns should already reek...
We're trying to group people together, much like we're trying to group code in modules. And we already know how to do that: maximize cohesion inside modules and minimize coupling between them!
Functional breakdown is similar to Logical Cohesion. For example, think of a folder structure where the models, views, and controllers all go under their own folders.
This kind of cohesion isn't all that helpful, because when we want to add something new to our code, we rarely want to update all the models at the same time, all the views at the same time, etc... which is what this kind of cohesion would make easy. Instead, we usually want to add or modify a single view, add or modify a single model, etc.
(Yes, I know it's popular. I still hate it.)
Product or project breakdown is what we're after, which is the (ironically named) Functional Cohesion in the types of cohesion. Here, all the parts contribute to a single goal.
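A minimal sketch of the difference, using two hypothetical layouts of the same six files. The feature names (`billing`, `search`) and file paths are invented for illustration; the only thing measured is how many top-level folders a single feature change has to touch.

```python
from pathlib import PurePosixPath

# Logical cohesion: files grouped by what kind of thing they are.
BY_LAYER = [
    "models/billing.py", "views/billing.py", "controllers/billing.py",
    "models/search.py", "views/search.py", "controllers/search.py",
]

# Functional cohesion: files grouped by the goal they contribute to.
BY_FEATURE = [
    "billing/model.py", "billing/view.py", "billing/controller.py",
    "search/model.py", "search/view.py", "search/controller.py",
]


def folders_touched(paths, feature):
    """Top-level folders containing any file whose path mentions the feature."""
    return {PurePosixPath(p).parts[0] for p in paths if feature in p}


print(sorted(folders_touched(BY_LAYER, "billing")))    # ['controllers', 'models', 'views']
print(sorted(folders_touched(BY_FEATURE, "billing")))  # ['billing']
```

Changing billing in the layered layout reaches into every folder in the project. In the feature layout it stays inside one. Swap "folder" for "team" and you have the org chart version of the same problem.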
Likewise, functional (the departments, not the cohesion) breakdown leads to a LOT of coupling between teams. In fact, to get anything done, every single team, from finance to IT, has to touch it. A product breakdown ensures all these skillsets sit in a single group, so the teams DON'T need to talk, reducing coupling.
Reducing talking and dependencies minimizes the rate of mistakes that one team might make that affect other groups. This means that when mistakes happen, they happen internally to a team in a high trust environment where they can be used to maximize learning and minimize punishment.
Mistakes are like defects. Well-factored code will not have a defect that causes other modules to fail because dependencies are minimized.
Basic software engineering, folks.
"I've seen Product Teams not work too."
The number one reason you've seen product teams not work is that the following happened:
The company was initially functionally oriented
They tried to do a re-org
The functional heads still held most of the power and allowed everyone to call it a product org
You're new and were told this is a product org, but you still have to get your code reviewed by someone you've never met two buildings away who can say no for any reason.
Doing a functional organization FIRST builds many implicit power structures and momentum. It's harder to move away from functional than just avoiding it in the first place.
So what do we do? How do we "scale agile"?
First, get your culture right.
You need to get culturally aligned on the H and O values. This will affect your hiring, how you give feedback and the behaviors that get people promotions and PIPs.
Only after you get culturally aligned will scaling agile have the foundation to succeed. And if you decide that you don't need Big OH culture, scaling agile isn't for you.
If you're an oH culture (think a defense contractor), people probably weren't that pissed off by waterfall in the first place and aren't that interested in agile techniques.
If you're an Oh culture (think a tech shop with a certain charismatic, Mars-lusting C level), no amount of process will work because the culture will always reward 'genius' and 'big thinkers' who (often by cheating) get results no matter what.
If you're in an oh culture, you're in a workaholic shop. Agile is a waste of time. Everything is a waste of time. Put that coffee down. Always. Be. Coding.
Second, scale horizontally. Not vertically!
Most "scaled agile" frameworks attempt to scale vertically. Agile doesn't work this way and never has. I'll have more on this in another blog post that goes more into our approach of "Agile: Deconstructed," but suffice to say, vertical scaling is just another way of introducing waterfall.
You've got to remain cross-functional - if anything, add more functions to the product teams. Does each product team have its own HR? Its own finance? That's easier said than done.
Third, respect Gall's Law.
A complex system must come from a simple system - this is how you might have gotten waterfall in the first place. But you now know what's going on, and you want to fix it. This means no big bang scaling agile implementations, but rather relatively slow, incremental, iterative, and intentional process improvements. We call this "Recursive Agile" - apply agile's lessons to all your agile transformations.
A socio-technical system - that's what an engineering team is - abides by many of the same laws as a traditional technical system. So iterative, incremental changes will succeed more than any big bang (waterfall) approach.
Finally, invest in people management. You will need a way to decide "who's in charge" - that's one of the main reasons we even introduce org structures. You want to find people who demonstrate your Big OH culture and train them in people management. The authors of this blog offer coaching - just reach out.