DIY javascript error logging

There are many SaaS products out there that help you with javascript error and event logging, but in this blog post I want to make the case for rolling your own solution.

We log 3 types of events: (1) javascript exceptions with stack traces, (2) failed assertions, and (3) general usage/diagnostics information.

Exception handling

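We can use a global event handler to log exceptions. This used to be somewhat difficult, but nowadays window.onerror works great. The browser gives you everything you need: according to the Mozilla docs, the handler receives the error message, the source URL, the line and column numbers, and the Error object itself.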

You can even get a pretty good stacktrace with Error.stack. It is not part of the official web standard, but it works in all major browsers and that’s good enough. Once you’ve collected all the data you want to log you can just send it to your server with an ajax request. Alternatively, you can use an <img> tag. Something like this works just fine:

// obj holds the error details you collected in your error handler
let errimg = document.createElement('img');
errimg.src = '/jserror.png?e=' + encodeURIComponent(JSON.stringify(obj));
document.body.appendChild(errimg);

One thing to watch out for is that long GET URLs can get truncated, so keep the payload small. You also want to make sure that you don’t log errors when you’re already in your error handler (otherwise you’ll DDOS yourself :)) and you probably want to drop errors you’ve already reported. Reporting an exception once per session is enough for debugging purposes.
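The guards described above can be sketched roughly like this. It is a minimal sketch, not our exact code: createErrorReporter, the 2000-character URL cap, the 200-character message cap, and the shape of the report object are all assumptions made for illustration.

```javascript
// Sketch only: createErrorReporter, MAX_URL_LENGTH and the report fields are
// illustrative names, not part of any library.
const MAX_URL_LENGTH = 2000; // assumed conservative cap for GET URLs

function createErrorReporter(send) {
  const seen = new Set(); // errors already reported in this session
  let reporting = false;  // true while we are inside the reporter

  return function report(message, source, lineno, colno, error) {
    if (reporting) return false; // never log errors raised while logging
    reporting = true;
    try {
      const key = [message, source, lineno, colno].join('|');
      if (seen.has(key)) return false; // once per session is enough
      seen.add(key);

      const obj = {
        message: String(message).slice(0, 200),
        source: source,
        lineno: lineno,
        colno: colno,
        stack: error && error.stack ? String(error.stack) : ''
      };
      let url = '/jserror.png?e=' + encodeURIComponent(JSON.stringify(obj));
      if (url.length > MAX_URL_LENGTH) {
        // Drop the stack rather than send a URL that gets cut off mid-JSON.
        obj.stack = '';
        obj.truncated = true;
        url = '/jserror.png?e=' + encodeURIComponent(JSON.stringify(obj));
      }
      send(url);
      return true;
    } catch (e) {
      return false; // reporting must never throw
    } finally {
      reporting = false;
    }
  };
}

// Browser wiring, skipped when there is no window (for example in tests).
if (typeof window !== 'undefined') {
  const report = createErrorReporter(function (url) {
    new Image().src = url; // same idea as the <img> tag, without touching the DOM
  });
  window.onerror = function (message, source, lineno, colno, error) {
    report(message, source, lineno, colno, error);
    return false; // keep the browser's default console output
  };
}
```

Passing the send function in as a parameter keeps the guard logic separate from the transport, so you can swap the image request for an ajax POST if your payloads outgrow a URL.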

What metadata you want to log is up to you, but we find it useful to log these things:

  • Username, account name. If you find a bug and fix it you want to tell people you fixed the bug, but you can’t do that if you don’t know which people got the error message.
  • Browser version. Helps when you want to replicate the bug. This was super important back in the IE6-9 days, when you had to make tons of browser-specific workarounds. Nowadays you mainly want to know if people are using a really old or unusual browser.
  • Javascript app bundle version and page load timestamp. Some people keep their browser open for weeks at a time and you don’t want to waste hours trying to replicate a bug that was fixed ages ago.
  • Adblocker usage. Add a <div> with a bunch of spammy keywords to your page. Use setTimeout to check the getBoundingClientRect() of that node a couple of seconds after your page has finished loading. If the node is gone or has collapsed to zero size, you know they have an adblocker installed.
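As a rough sketch of how this metadata could be gathered: the helper names (collectMetadata, isBaitBlocked), the field names, the bait class names, and the 3 second delay below are all assumptions for illustration. Which class names an adblocker actually reacts to depends on its filter lists, so test against the blockers your users run.

```javascript
// Hypothetical helpers; field names and bait keywords are assumptions.
function collectMetadata(env) {
  return {
    username: env.username,
    account: env.account,
    userAgent: env.userAgent,
    bundleVersion: env.bundleVersion,
    pageLoadedAt: new Date(env.pageLoadedAt).toISOString(),
    adblock: env.adblock
  };
}

// The bait counts as blocked when it was removed or collapsed to zero size.
function isBaitBlocked(node) {
  if (!node || !node.isConnected) return true;
  const rect = node.getBoundingClientRect();
  return rect.width === 0 || rect.height === 0;
}

let adblockDetected = null; // null means "not checked yet"

// Browser wiring; assumes this script runs after <body> exists.
if (typeof document !== 'undefined' && document.body) {
  const bait = document.createElement('div');
  bait.className = 'ad ads adsbox ad-banner sponsored'; // spammy keywords
  bait.style.cssText = 'position:absolute;left:-9999px;width:10px;height:10px';
  document.body.appendChild(bait);
  setTimeout(function () {
    adblockDetected = isBaitBlocked(bait);
  }, 3000);
}
```

When an error comes in you would call collectMetadata with navigator.userAgent, your own bundle version string, the page load timestamp, and the current value of adblockDetected, then attach the result to the report.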

Be careful not to log anything that could contain customer data. Easier debugging is great, but not when you have to compromise your customers’ privacy to do it. It’s fine to log counts, IDs, and checksums. If you can’t figure out how to replicate the bug with only a stack trace to guide you then you can always add more asserts to your code and wait for one of them to trigger.

Assertions

To debug exceptions you only have a stack trace to work with. Debugging is a lot simpler when you make liberal use of assertions in your clientside code. You can use the same error logging code you use for exceptions, but asserts can log some extra diagnostics variables.
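A logging assert could look something like this. This is a sketch under assumptions: createSoftAssert and the entry fields are made-up names, and logError stands in for whatever function sends reports to your server.

```javascript
// Sketch: createSoftAssert and the entry shape are illustrative names.
// logError is whatever function ships a report to your server.
function createSoftAssert(logError) {
  return function softAssert(condition, message, diagnostics) {
    if (condition) return true;
    // Create an Error only to capture a stack trace at the call site.
    const err = new Error('Assertion failed: ' + message);
    logError({
      type: 'assert',
      message: err.message,
      stack: err.stack || '',
      diagnostics: diagnostics || {} // counts, IDs, checksums; no customer data
    });
    return false;
  };
}

// Example: log a failed assertion with some extra diagnostics.
const exampleLog = [];
const softAssert = createSoftAssert(function (entry) { exampleLog.push(entry); });
const rows = [1, 2, 3];
softAssert(rows.length === 4, 'row count mismatch', { rowCount: rows.length, expected: 4 });
```

This version logs and returns false instead of throwing, so a failed assertion doesn't break the page for the user. Whether a failed assert should also throw is a judgment call for your own app.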

Usage tracking

Every time you add a new feature to your product you want to track if it gets used. If not, figure out why not. Is the feature too hard to discover? Do people just not care about it? Adding a tracking hook takes 1 minute, but the insights you get are invaluable.

Our rule of thumb: we get an email notification every single time a new feature is used. If the notifications don’t drive us nuts that means we built the wrong thing. This really helps us calibrate our intuition. And it’s motivating to see the notifications flow in right after you push to production!

You also want to track how often users get blocked by your software. Every time a user wants to do something but they get a “computer says no!” message they get a little bit unhappy with your software. They upload a file and it doesn’t work because the extension is wrong or the file is too large? Log it and fix the problem. Sometimes the fix can be as simple as telling users the file is too large before they have uploaded it. Instead of a simple “access denied” error see if you can make the error more helpful. You can add a button “ask administrator (name) for permission”. Measure which problems users run into and fix them one by one.
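A tracking hook really can be this small. The sketch below uses assumed names throughout: createTracker, the event names, and the /track endpoint are hypothetical, and navigator.sendBeacon is just one possible transport (an ajax POST works too).

```javascript
// Sketch: createTracker, the event names and '/track' are hypothetical.
function createTracker(send, now) {
  return function track(name, details) {
    const event = { name: name, at: now(), details: details || {} };
    send(event);
    return event;
  };
}

// One possible transport: sendBeacon queues a small POST that survives page
// unloads. It does nothing where sendBeacon is unavailable.
function beaconSend(event) {
  if (typeof navigator !== 'undefined' && typeof navigator.sendBeacon === 'function') {
    navigator.sendBeacon('/track', JSON.stringify(event));
  }
}

const track = createTracker(beaconSend, Date.now);

// Feature usage: one line in the click handler of the new feature.
function onExportClicked() {
  track('feature.export_used');
}

// Blocked user: log sizes and limits, never the file contents or name.
function onUploadRejected(fileSizeBytes, limitBytes) {
  track('blocked.upload_too_large', { sizeBytes: fileSizeBytes, limitBytes: limitBytes });
}
```

Using a naming prefix such as feature. and blocked. is just one way to keep the two kinds of events easy to tell apart when you query them later.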

Serverside

We take a whitelisting approach. We get email notifications about everything to start with. Then we add filters for all the errors we can’t do much about. Errors caused by connection timeouts, errors caused by virus scanner browser plugins, things like that. Every javascript exception potentially breaks your entire site for some users. That means every exception is worth investigating. You’ll inevitably discover your site breaks when an ajax POST times out, or when a dependency fails to load. Or when an adblocker removes some DOM nodes. No matter how well you test your software, your users will find creative ways to break it.
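On the server, the filter step can be a short list of rules that you grow over time. The sketch below is an illustration only: classifyReport, the rule list, and the two example patterns are assumptions, not the filters we actually run.

```javascript
// Sketch: IGNORE_RULES and classifyReport are illustrative names, and the
// patterns are examples only. Start with an empty list and add rules as
// you learn which errors you can't do much about.
const IGNORE_RULES = [
  {
    reason: 'connection timeout',
    test: function (r) { return /timed? ?out/i.test(r.message || ''); }
  },
  {
    reason: 'browser extension',
    test: function (r) { return /(chrome|moz)-extension:\/\//.test(r.stack || ''); }
  }
];

// Default is to notify; a report is only silenced when a rule matches.
function classifyReport(report) {
  for (const rule of IGNORE_RULES) {
    if (rule.test(report)) return { notify: false, reason: rule.reason };
  }
  return { notify: true, reason: null };
}
```

Keeping a reason on every rule means that even the silenced reports can still be counted per category, which tells you when a "harmless" error suddenly spikes.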

You can also use feature usage tracking for spam/fraud detection. If your SaaS service is inexpensive it will be used by credit card fraudsters to test if their stolen credit cards work. You can easily distinguish between real users and bots or fraud signups by comparing some basic statistics on feature usage and which buttons have been clicked on.

If you use a 3rd party service for error logging you can’t cross-reference data. You can’t easily discover which features get used by people who end up buying vs by trial users that fizzle out. If you have subscription data in one database and usage/error tracking in another database querying gets complicated, so you won’t do it.

Another reason why we want to do our own event logging is that we might accidentally log something that contains sensitive data. Our own logs rotate automatically, but 3rd party logging/event services will hang on to that data indefinitely.

Writing your own javascript error/event logging code isn’t much work and it will give you valuable insight into how people try to use your software and the bugs they run into.


You can follow us on Twitter @jdvhouten and @wcools and look for #80daystartup
