Day 26

Technical debt is typically understood as the consequence of shortcuts programmers take to meet a deadline. At some point in the future this debt is repaid by cleaning up the code, or so the basic finance analogy goes.

Technical debt, much like debt in the real world, is not all created equal. Some types of technical debt come with a high interest rate and put serious constraints on what you can do in the future. Debt like this can easily ruin your startup. Other types of technical debt just exist, harmlessly, like a 30-year mortgage bearing a 1% interest rate.

You want to distinguish between the different kinds of shortcuts you can take and the ways in which the debt can bite you. In this post I expand the use of ‘technical debt’ so that all design choices that create future obligations are considered debt. You can accumulate debt to the point where you spend all day every day dealing with these obligations. You have to work longer and longer hours just to avoid drowning. Any additional setbacks and you’re going under. The financial analogy here is that you don’t want to get to the point where the majority of your income is eaten by interest payments. You won’t have a good time digging yourself out of that hole.

I categorize tech debt roughly like this:

Sloppy code debt. This can be anything from gratuitous copy-pasting to a hairball of dependencies between different parts of your stack. This is what most people think of in the context of technical debt. Harmless technical debt in this category is confined to self-contained parts at the edges of your architecture. Self-contained parts are easy to understand and easy to rewrite if you need to. The worst kind of debt in this category is at the core of your app. If the code you have to deal with all the time is a mess, that’s going to slow you down and result in bugs.

Creative user debt. Your users will use your app in ways you don’t anticipate. Suppose you introduce a ‘tags’ feature. Some users will inevitably try to add 100 or 1000 tags in places where you expected people to use at most a handful. If you’re in a hurry and don’t at least add an assert()[1] that caps the number of tags that can be associated with a thing, you’ll wake up one day to an email from a user about the app being slow. After some minutes of confusion and looking at some SQL logs you figure out what’s going on. Goodbye road map, hello long days of frantic repair work.

Not making a nice error message when a user adds too many tags? Harmless tech debt. Having the UI glitch when there are a silly number of tags? That’s fine, too. Having servers melt down because you didn’t spend 10 seconds to enforce limits? Not worth it.
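As a minimal sketch of what such a 10-second limit could look like, assuming tags live in a plain Python list: the names `MAX_TAGS_PER_ITEM` and `add_tag` and the limit of 50 are made up for illustration and are not taken from any real codebase.

```python
# Hypothetical cap; pick whatever number fits your app.
MAX_TAGS_PER_ITEM = 50


def add_tag(tags: list[str], new_tag: str) -> list[str]:
    """Return a new tag list with new_tag appended, enforcing a hard cap."""
    # Crude on purpose: no friendly error message, just a hard stop
    # before a user with 1000 tags can slow down every query.
    assert len(tags) < MAX_TAGS_PER_ITEM, f"too many tags: {len(tags)}"
    return tags + [new_tag]
```

Keep in mind that Python strips assert statements when run with the -O flag, so if you deploy that way you would need an explicit check that raises an exception instead.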

Package dependency debt. Software you write today will stop working sooner or later, and often much sooner than you’d think. Sometimes people call this bit-rot, as if software suffers from biological degradation. It doesn’t, of course, but the environments we write software for change and that amounts to the same thing. Server-side packages get patched for security updates, as does the OS itself. On the client, browsers change and break your code.

When your app is large and has many dependencies this kind of debt isn’t just a chore to deal with. It can actually become a real problem, especially when you’re forced into version upgrades in order to get those security patches. It gets even worse when you compile your own packages or deviate from the packages that come with your OS. Have your app depend on 250 packages? That’s easy. Monitoring 250 packages for security vulnerabilities? Practically impossible.

To mitigate this we think it’s best to roll your own solution instead of adding dependencies to your app, whenever possible. It’s more work up front, but your code will solve exactly the right problem and once it’s done it will continue to work with minimal maintenance.

3rd party integrations. All integrations with 3rd party services create debt and debt of this kind is always bad. Every integration you make with a 3rd party is going to break, and probably at the most inconvenient time. There is a time for caveats, but this isn’t one of those times. Any 3rd party you depend on is either (slowly) going out of business or they’re growing so quickly that they have growing pains and need to make breaking changes to cope. In the past we’ve built integrations with Slack, Twitter, Google Maps, Google Analytics, Google Workspace né Google Suite né Google Apps, MS Active Directory and more. They have all completely broken down, at some point or another. The ongoing cost of integration maintenance is real and not to be underestimated.

Overengineering debt. When you overengineer you make something stronger than it needs to be. When you under-engineer you make something that you expect will collapse sooner rather than later. This might sound bad, but it isn’t. You can’t predict where the bottlenecks in your architecture will be in advance, especially in the face of exponentially growing traffic. Exceptions notwithstanding, the simplest pieces of architecture are the easiest to diagnose and improve.

If we have to choose between having extra package dependencies (debt of itself) and a brittle 100 line Python script we’ll go with the Python script every time. Sometimes the hacky solution keeps working for years with zero issues. Using your SQL database as a message queue? It’s not pretty and it requires polling, but it’ll work and your customers won’t know the difference. Do you want to wake up a script? Polling a file’s atime or sending a SIGUSR signal is no crime. Customizations for customers? You can hardcode them. It’s fine.

Feature debt. Sometimes features turn out bad and confusing. If at all possible, you want to get rid of those features right away. When you’ve just launched you can still delete a feature without notice. Once you have paying customers, removing features from your product becomes a far more delicate matter. Sometimes two features are fundamentally incompatible, so by introducing one you rule out the other. For instance, if you promise end-to-end encryption you can’t also offer server-side search functionality (the inverse is also true). Once you realize your mistake you have to fork your code and you’ll end up with customers on both forks.

Make this mistake 5 times and you’ll struggle to keep the different versions of your product apart. This will slow your development efforts down to a crawl. Every feature has a recurring cost to it, in maintenance and in the space it takes up in the user interface, so you can’t afford to add features haphazardly.

Customer support debt. When you build a feature in a rush and don’t spend enough time making the interface intuitive you’ll also pay a price in customer support. This debt isn’t so bad because the cost of the fix doesn’t increase over time. However, if you find yourself repeatedly answering the same questions that’s a problem. Either make the feature good and self-explanatory or only enable it for a handful of users. If you allow poorly conceived features to accumulate you’re not just creating a lot of support work for yourself, you also alienate your users. A backlog of support tickets can also be considered a type of debt.

Every startup accumulates some technical debt. This is inevitable, and it’s not even a bad thing. When you start out you don’t know exactly what you’re making. By writing and rewriting your app you figure out what works. There’s no point in creating beautiful code if you’re likely to rip it out in a week or two anyway.

Besides, before you have any users technical debt isn’t real. Anything can be fixed with the delete key at this early stage. The more users (or worse, customers!) you have the harder it gets to keep debt at a manageable level. Database migrations must now be planned carefully. Removing features becomes difficult or near impossible. 3rd party integrations break and will hijack your development schedule. Security updates and other maintenance work will slowly eat up a larger and larger percentage of your week. You can’t let these factors spiral out of control.

If at all possible, make the core of your app rock solid before you start charging people money. That way you can continue building on a solid foundation with little tech debt. If this means building a much smaller app, so be it. Smaller is better. You want few dependencies and few moving parts. Knowing exactly which corners to cut and which parts to rewrite until they’re just right is very, very hard. We still get it wrong all the time.

[1] Asserts abort a program when a condition isn’t met. The user gets a “server error” message and the programmer gets an automated email that a specific edge case has been hit. By adding many asserts throughout your code that check whether your data and inputs look correct you’ll save yourself days and weeks of debugging time.


Thinking different about technical debt

