At Teachers Pay Teachers, we’ve made fearlessness a core part of our engineering culture. We believe the fear of breaking things prevents teams from being as productive as possible. It forces teams to make the same mistakes perpetually, deters risk taking, creates silos of knowledge, and makes it hard to move quickly. You could say that fear is a mind-killer… obligatory Dune quote time…
Years ago, I developed a fear of breaking things. I want to tell you how that happened, why the fear sucks, and how to recognize if the fear of breaking things controls your team.
I had never broken anything fantastically.
That was easy to say since it had been less than two years since I'd received my CS degree. I had recently moved to NYC and started programming for a financial data provider. My team wrote crawlers that scraped public web pages and predicted stocks based on our findings. We dealt with gigs and gigs of data (back when that was a lot), and as the newbie engineer on the team, I was trying to do everything by the book to avoid creating any problems.
I had been working on a stressful project that was kicked off by a fancy-dressing project manager and a good ol’ game of guess-the-estimate-I’m-looking-for-when-I-ask-how-long-it-will-take-are-you-sure-it-won’t-take-half-that-long. Now it was the night before the project was due.
I had built a crawler that should have started running that day, Wednesday, but our queuing system scheduled it to run on Saturday for some reason. I needed to fix that before everyone left the office that night. This involved editing the database.
We stored all of our scheduled jobs in a database table. I wanted to change the date of my job to run on the correct date. This was a common enough occurrence for our team. We typically wrote queries to the database and ran them on production, directly from SQL Server Management Studio, without any reviews, whenever we felt like it. The astute reader may notice a few problems with this process.
I proceeded to write something like:
    UPDATE jobs
    SET runs_at = '11-16-2015'
    WHERE id = 8675309
Then I highlighted the query with my cursor and hit the run button… and I noticed the query didn’t complete as quickly as I thought it should.
I immediately realized that I had only highlighted and executed the first two lines of my query, without the WHERE clause… on the entire table.
I scrambled to where our only-slightly-prickly DBA sat and confessed what I’d done. He assured me that things were even worse than I thought: I had just rescheduled all the jobs to run at the same time. And that time was in about an hour. And I had wiped out all our previous schedule history.
After making me sweat long enough, our DBA told me he was able to reconstruct the table using a backup, but we’d lose about a day of history. He told me that I should learn an important lesson from this: "When executing database queries on production, write your update statements on one line so this won’t happen again."
And that’s how I developed a fear of breaking things.
I became more concerned with not breaking things and less concerned with why things were breakable in the first place. I should have taken away: "What can I do to make it harder to break things and prevent people from breaking things the way I just did?"
Like me, many engineers feel the fear of breaking things because we take responsibility for our own work, we hate making our team deal with problems we created, and we naturally get embarrassed when the breaking happens fantastically and publicly. At Teachers Pay Teachers, we’ve found that the problems that crop up due to the fear of breaking things aren’t obvious, so we’ve outlined them here along with a few telltale signs that you may struggle with this fear.
Problems caused by the fear of breaking things
When your team is too afraid of breaking things, you actually end up breaking more things by repeating past mistakes. Individuals end up solving their own problems and dealing with individual incidents, but these solutions fail to address problems systemically.
In my story, each developer actually had a checklist they ran through to avoid making mistakes when issuing queries to production. All the documentation in the world wasn't going to solve the problem, though. These checklists were too vulnerable to engineers forgetting or skipping steps when they thought they might not apply. At some point, every engineer on the team created a problem similar to mine. And the systemic question, "Should we be able to query production directly like this?" never came up.
At least we tried with documentation; more often, though, you’ll end up with symptom-fixes and no documentation or communication at all. Certain engineers will unintentionally end up hoarding information in their heads and workflows. Other engineers will actively but quietly resolve issues, then keep their fixes secret because they fear being blamed for breaking things in the first place. These knowledge silos make it tough for the rest of the team to take risks or experiment with better approaches.
These problems, in combination with the responsibility engineers feel for breaking things, deter engineers from refactoring, extending the codebase, experimenting, or taking risks. Engineers know they’ll have to fix the problems they create. Let’s say an engineer refactors some code, and they have to spend a few late nights fighting fires that popped up because you don’t have systems in place to detect, prevent, and resolve problems they accidentally created. How long do you think it will take for that engineer to stop trying to refactor things? Once this happens to enough engineers, your development speed and ability to react to changes grind to a halt.
Warning signs this fear controls you
To prevent the fear of breaking things from overrunning your team, you’ve got to know how much fear exists now. Teams quickly grow accustomed to the fear and develop patchy workarounds, so recognizing this fear can be tough. Here are a few things to watch out for:
Creative punishments are a telltale sign that your team suffers from the fear of breaking things. I once worked on a team that had a policy that if you broke the build, you had to buy everyone on the team doughnuts. An engineer on our continuous delivery team mentioned that a past team made people wear “funny” outfits for a day. Most people have heard of “break the build” tip jars.
If you squint hard enough, it’s possible to see how creative punishments can be fun bonding experiences for some people, but when you’ve seen enough of these punishments, you realize something: The only result of “creative punishments” is “creative avoidance of punishments.” Underlying problems rarely get fixed. If you spent the time that goes into punishing on fixing issues systemically instead, you’d get more done long-term.
Scary files are those files that never get refactored. They’re 2k+ lines long. People call them “scary files.” You know the ones I’m talking about. At least one of them probably has “manager” in its name. We’ve all got files that are scarier than others, but these are the ones people actively avoid touching.
We’re in the process of actively addressing our scariest files, and it takes making a commitment. We’ll post an update on how it goes, but we’re starting by surfacing metrics on the line count, complexity, and test coverage of the scariest files. Now it’s just a matter of making each of those graphs go the right direction (the hardest part, of course).
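As a rough illustration of the line-count metric only (complexity and test coverage need dedicated tools), here is a minimal Python sketch. The function names, the `*.py` pattern, and the 2,000-line threshold are assumptions made for this example, not a description of our actual tooling.

```python
from pathlib import Path


def count_lines(path):
    """Count the lines in a text file, replacing bytes that are not valid UTF-8."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return sum(1 for _ in handle)


def scariest_files(root, pattern="*.py", min_lines=2000, top=10):
    """Return (line_count, path) pairs for the longest files under root.

    Only files with at least min_lines lines are kept, longest first.
    The default pattern and threshold are illustrative assumptions.
    """
    counts = [
        (count_lines(path), str(path))
        for path in Path(root).rglob(pattern)
        if path.is_file()
    ]
    big = [entry for entry in counts if entry[0] >= min_lines]
    return sorted(big, reverse=True)[:top]
```

Calling `scariest_files(".")` from the root of a repository returns a short list you could log on every build, which is enough to start watching whether the numbers move in the right direction.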
Only Fabio knows how to do that. A lot of teams have a Fabio. Fabio has worked there forever and knows all the tricks of making things happen on your system. Fabio is the only one allowed to change the authentication system. Fabio isn’t always named Fabio (but usually is). Fabio happens because of the problem with knowledge silos we mentioned above.
Watch out for areas in your system or codebase that only one or two people can safely modify. On top of pigeonholing Fabio and adding a potential single point of failure to your system, this is a clear sign that you probably suffer from the fear of breaking things.
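One rough way to spot candidate silos is to count how many distinct people have committed to each file. The sketch below is a minimal Python example, assuming you feed it the output of `git log --name-only --pretty=format:@%an` (the leading `@` is just a marker chosen here to tell author lines from file paths). The function names and the sample names other than Fabio are hypothetical, and commit history is only a proxy: it cannot tell you who can safely modify a file.

```python
from collections import defaultdict


def authors_per_file(log_text):
    """Map each path to the set of authors who committed to it.

    Expects text shaped like the output of:
        git log --name-only --pretty=format:@%an
    Lines starting with "@" are treated as author names, so a path that
    itself starts with "@" would be misread (a limitation of this sketch).
    """
    authors = defaultdict(set)
    current = None
    for raw in log_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("@"):
            current = line[1:]
        elif current is not None:
            authors[line].add(current)
    return authors


def silo_candidates(log_text, max_authors=2):
    """Return paths touched by at most max_authors distinct authors, sorted by path."""
    return sorted(
        path
        for path, names in authors_per_file(log_text).items()
        if len(names) <= max_authors
    )


# Made-up sample in the assumed format.
sample_log = """@Fabio
auth/session.py
auth/tokens.py

@Dana
api/routes.py

@Fabio
auth/session.py
api/routes.py

@Lee
api/routes.py
"""

print(silo_candidates(sample_log, max_authors=1))
```

In the made-up sample, only the two `auth/` files come back, because Fabio is their sole committer. A list like this is a starting point for a conversation, not a verdict.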
How to combat the fear of breaking things
There are many fears in software engineering, but the fear of breaking things, once it runs amok, is the most toxic. It kills speed, agility, and productivity extremely quickly. To prevent the fear from overtaking your team, you must change your attitude and response to breaking things.
You can start by making a conscious shift:
ask “why are things breakable” rather than “why did things break”
celebrate identification of systemic issues over blaming individuals
solve underlying problems in your system instead of patching symptoms
build a system of safety nets to cover common areas of breakage
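To make the last item concrete, here is a minimal sketch of one possible safety net for the kind of mistake in my story: run every update inside a transaction and roll it back unless it touched exactly the number of rows you expected. It uses Python's built-in `sqlite3` module as a stand-in for a production database. The `guarded_update` helper, the table layout, and the sample dates are all hypothetical, and the sketch relies on `sqlite3`'s default behavior of opening a transaction before an UPDATE.

```python
import sqlite3


def guarded_update(conn, sql, params, expected_rows):
    """Run one UPDATE and commit only if it touched exactly expected_rows rows.

    Any other row count rolls the change back and raises RuntimeError.
    """
    cur = conn.cursor()
    try:
        cur.execute(sql, params)
        if cur.rowcount != expected_rows:
            raise RuntimeError(
                f"expected {expected_rows} row(s), statement touched {cur.rowcount}"
            )
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    return cur.rowcount


# In-memory stand-in for the jobs table, with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, runs_at TEXT)")
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?)",
    [(1, "2015-11-14"), (2, "2015-11-15"), (8675309, "2015-11-21")],
)
conn.commit()

# The accidental statement without a WHERE clause is rejected and rolled back.
try:
    guarded_update(conn, "UPDATE jobs SET runs_at = ?", ("2015-11-18",), expected_rows=1)
except RuntimeError as err:
    print("blocked:", err)

# The intended single-row statement goes through.
guarded_update(
    conn,
    "UPDATE jobs SET runs_at = ? WHERE id = ?",
    ("2015-11-18", 8675309),
    expected_rows=1,
)
```

The point of a guard like this is that it does not depend on anyone remembering a checklist: the system itself refuses the dangerous version of the query.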
In a future post, we’ll dive deeper into these solutions, particularly building a system of safety nets.
In the comments, we’d love to hear stories of how you’ve broken systems fantastically or other warning signs that signal the fear of breaking things.