I updated my blog this weekend and wanted to share some thoughts along the way:
Disclaimer: Based on real user interviews.
I've been very interested in the topic of development velocity for a long time. It's something we spend a lot of time thinking about at Onboard AI, both for our internal operations as well as for the thousands of developers that use Onboard to ship code faster.
In my conversations with engineering leaders, there is near unanimous agreement that development speed should be a top 3 priority for CTOs, alongside quality and compliance. CTOs, however, have:
- a wide variety of ways to measure dev speed
- very little consensus on what actually slows devs down.
Naturally, #2 varies with company size and culture, as well as the nature of the product and the composition of the team. Is it a team of young, high-energy but low-experience developers? Relatively flat hierarchy? Are the teams siloed?
I figured it would be interesting to examine this topic at the grassroots, or individual contributor, level. So I found 100 software engineers from these companies →
- Meta
- Amazon
- Paradigm
- Heroku
- Zeta
- C3
- Roblox
- Humane
- Stripe
- Granular
- Palantir
- Segment
- eBay
- Splunk
- DraftKings
- Arista Networks
- Zillow
- Workday
- Cisco
- Convoy
- CrowdStrike
- SpaceX
…and I asked them exactly one question.
What's stopping you from shipping faster?
I was surprised by two things:
- Almost no one had to think about it. People shot back with an immediate answer, as though I had just asked about a pain that was always on their mind, but never surfaced by another person.
- There were a surprisingly high number of people for whom the reason had to do with build, compile, and deployment times. Devs really hate waiting for things.
The answers were diverse but I have done my best to broadly divide them into categories and subcategories.
Codebase
Dependency bugs
Software is a delicate house of cards with open source libraries all the way down. In a lot of software, the source code consists mostly of dependencies, and very often your most frustrating bug is not even your fault.
Jack, ex-Microsoft and ex-Bridgewater, lists his #1 barrier to shipping faster -
Hitting random mysterious bugs with libraries that require reading tons of old stack overflow links and Github issues
Complicated codebase
This one is closest to my heart. In the last decade as dev speed has evolved into a necessary competitive advantage, nearly every software team suffers from code sprawl. The logic is sound:
→ Growing startup
→ must ship faster
→ no time to write docs
→ early engineers churn as company grows
→ new engineers need to navigate the mess.
Maria, who is an SDE at Amazon, says this about her team’s codebase:
There is so much undocumented in our service, including poor records of new features, nonexistent or outdated info on our dependencies, or even essential things like best practices for testing, a lot of time is wasted in syncs trying to find the right information
Interestingly, this problem only gets worse with time -
Nobody has time to update the documentation, which creates a vicious cycle where more features are shipped, causing the docs to be more outdated and useless, meaning no one updates or uses them, and so on. A lot of our docs haven’t been updated in years
Microservices architectures bring the additional challenge of being difficult to understand on a systems level.
Jennifer, a senior engineer at Palantir, describes her struggles with navigating microservices architectures:
…ramping up on how all the different microservices interacted with each other for a project that touched many pieces at once was always a challenge.
Of course, this problem extends to enormous monoliths too. Here’s what Pranav, a senior engineer at Stripe, had to say about that:
We had Ruby code in the millions of lines. We had a service called livegrep that would let us search the entire codebase very quickly but it was still hard to find things sometimes and other times, you found way too many matches (for eg, if you were search for a typed structure to see how folks have used it)
Often at larger companies, the problem is understanding systems rather than modules. This is especially hard when the stack is fragmented. Here’s Dharma, a senior engineer at Meta, on why this matters:
I have to touch cross-language repos. One part of the code will be in PHP, and another in Python or C++, and switching between each and seeing how they interact makes it harder to finish the features.
Niam, an engineer at Google, had this to say about how editing codebases, not code, is the hard part -
I would say deeply understanding the different components that make up a system.
Like since the codebase is so big and you have to design a feature implementation, you have to make sure that you understand very well how your feature fits within the whole picture along with finding code pointers to the individual subsystems you need to make changes to for it to work
Process
QA Loops
QA processes represent the tradeoff between quality and velocity. Developers will often accuse QA of being too “nitty” and adding useless repetition to the dev cycle.
Taylor, who has worked at a series of high-growth startups and now runs one, described the QA process like this:
Me creating a test spec for QA. QA finding problems (because QA will always find problems). Getting list of problems 2 days later. Fixing merge conflicts because more code has shipped since I last pushed. Plus context switching. Back to QA.
In the amount of time it takes QA to review code, underlying context might change, making the change more involved.
Waiting for spec
As teams grow, the number of stakeholders involved in any decision grows superlinearly. Invariably, this introduces additional steps, redraws, and amendments to the spec before engineers can begin execution.
Brianna, an ex-Convoy engineer, had only this to say when asked why she wasn’t shipping faster -
just awaiting spec approval
Awaiting stakeholder approval
Raj had this to say about Amazon’s tedious stakeholder involvement, which he felt was overreaching to the point of being detrimental.
At Amazon? Meetings, approval, talking to 10 different stakeholders because changing the color of a button affects 15 micro services
There is certainly stakeholder creep and that can significantly slow down development. Classic too many cooks. Here’s Josh, ex-Meta senior software engineer -
Understanding product requirements took forever. Too much process for sign off and too many reviewers.
Writing tests
Dev complaints around tests could basically be divided into two categories: not enough tests, and bad tests.
Not enough tests - not having E2E testing means every new task needs to include writing a lot of new ad-hoc tests, no matter how small the task.
Here’s what Grant, an SWE at a fintech unicorn, had to say about that -
…the biggest thing was we didn’t have good tests or good types, so I had to do a lot of work to do e2e testing of stuff whenever I wanted to ship stuff
Bad tests - these clog up the CI/CD pipeline and add unnecessary time to the deploy process. Here’s Sadir, a senior engineer at Splunk, on what slows him down:
My reason would be running pipelines take lots of time and to ensure proper code coverage with test cases sometimes we require these pipelines taking their due time, which in turn slows us down
Tooling
Deployment/build speed
This one was quite surprising for me because I did not realize how many people suffered from this. I have a whole theory around the toxicity of idle time - time that is not a break but not productive either. Waiting for deployments is exactly that.
Deploy times can often run into hours. Here’s what Aryan, a developer at Arista Networks, had to say about what slows him down:
So for me at work it takes like 3-4 hours for a full build to complete that I can run all the tests for a PR on, and those test can take anywhere from 1d to 3d to get done, even with a lot of optimization So even a small change, like modifying a filename can get tedious
Raymond, senior engineer at Stripe, had this to say:
CI/CD pipelines being slow
People
Awaiting PR Review
Most modern software companies have some version of a code review process where peers or sometimes senior engineers review code before it is merged. Often, engineers view this as a secondary responsibility (the primary one being writing new code). Naturally, this means code spends a disproportionate amount of time in review.
Roman, a former senior product engineer at Stripe, had this to say about PR reviews:
The entire PR flow i’d say, even when it goes well then you wait till someone approves the PR to continue
At some companies, the problem is at the other end of the spectrum - overreaching PR reviews eating time.
I think the number one reason I’m unable to ship faster at my job is the tedious code review process. Other engineers scrutinize and meticulously review code, pointing out the tiniest flaws, making it harder to ship sometimes.
It seems like other engineers do it just for the sake of it. Another reason I can’t ship code faster sometimes is because I’m asking for approval from senior engineers. I work on a product team, and so, every change I make is very crucial and can be catastrophic, which leads to the senior engineers (L6 folks) scrutinizing code at times.
This was a recurring theme, and there were some patterns:
- Most engineers agreed that time spent awaiting PR review plus time spent doing revisions is a necessary evil.
- Most also agree that nit-picky PR reviews are often a result of:
  - Personal vendettas/prejudice/patronization.
  - Developers having too much free time.
  - Misaligned incentives (just because thoroughly reviewing code is good doesn’t mean every nit should be picked).
The general sense I got from engineers about the PR process does lead me to recommend to engineering leaders that they audit their teams’ PR practices. There might be lost time there that is easily recoverable.
Scope creep
Josh, a former engineer at C3.ai answered with brevity when asked what slowed him down -
PM-induced scope creep.
The human tendency to stuff last-minute items into the crevices of their luggage minutes before leaving for the airport manifests itself at software companies as scope creep. Slowly but surely, it pushes back your release date, with every incremental addition feeling like an insignificant task but in aggregate adding significant molasses to a team’s velocity.
Unclear requirements
Unclear requirements slowing developers down is unsurprising. Based on my conversations, this was broadly because of three reasons:
- Productivity closely varies with conviction. Lack of clarity on goals and success state definitions = low conviction = low motivation = low productivity.
- You can run faster when the ground underneath you feels firm.
- Developers don’t trust management when management lacks clarity on what should be built. More so when developers don’t feel that they have influence over what should be built.
Justine, an ex-Meta software engineer, said this about unclear requirements in her org:
Understanding product requirements took forever. Too much process for sign off and too many reviewers. Bad management.
Excessive meetings
This is unsurprisingly a major concern for developers. A common framework to think about this is the “maker” schedule and the “manager” schedule. Managers can scatter meetings through their day since their tasks can be done in pieces in the time between meetings. Making is deep work - it requires vast swaths of uninterrupted creative time.
The problem usually takes some form of managers managing maker time, inadvertently imposing the manager schedule on makers.
Maria, an SDE at Amazon, cites excessive meetings as her #1 time sink:
…lot of time is wasted in syncs trying to find the right information.
Motivation
Diane, a former engineer at Meta, had this to say about the #1 reason that slowed her down:
honest answer is i was on ads and that’s a very old / complicated / large stack and i didn’t understand it my friends on younger teams seemed happier, i was miserable
This one should not be surprising. Great engineers are usually very smart people, so they want to work on things that are interesting and inspiring. Some projects are simply more interesting than others, and those tend to move faster.
Another aspect of this is the lack of reward for marginally more effort. Rohan, an Amazon SDE, had this to say:
Also motivation. I didn’t care to grind because I was on track for a promotion the same rate anyway so what’s the point of putting in more than a few hours if I’m beyond expectations anyway
At large companies, individual impact is nearly impossible to measure. This means two things:
- It’s easy to get away with coasting (rest and vest).
- It’s hard to get recognized and rewarded for going above and beyond.
The result? Many people coast, far fewer blow past expectations.
Conclusion
It is generally accepted that development speed is a business-critical advantage. Great companies live and die by how fast they ship. From the patterns I gathered, larger companies generally tend to be at a disadvantage when it comes to shipping fast. Sometimes for good reason, sometimes not.
They are intrinsically more risk-averse because there are serious costs to missteps. This manifests itself as extensive PR reviews, QA, planning, and so on. They are more likely to have excessive meetings and to push makers to run on the manager schedule.
This is perhaps the circle of life - it is the added agility of an early-stage startup that allows them to upset slow incumbents. If they succeed, they themselves become slow-moving incumbents, waiting to be upset by their upstart successor.
Now, some graphs.
Why aren’t you shipping faster (visualized)?
I had to use some discretion to assign each message a category, and below is my best attempt. Complicated codebase and awaiting stakeholder approval were the biggest deterrents to dev speed. That said, there is a significant long tail of reasons. Turns out, developers have a lot to complain about.
Nearly a third of respondents alluded to a company- or team-specific qualm, so I decided to omit those in the interest of keeping this generally useful. (e.g. Workday engineers feel slowed down by their proprietary programming language. Fascinating, but not relevant to nearly any other company.)
Top responses to "Why aren't you shipping faster?"
To make this more readable, I made a more concise chart that combines these into 7 overarching categories, leaving some of the long-tail reasons out.
- Process
- People
- Codebase
- DevOps/tooling
- Motivation
- Debugging
- Docs
Summary: top blockers to shipping faster.
People and process are usually related, and those two combined account for nearly half of all respondents.
Disclaimer: names of survey participants have been changed to protect their privacy.