How to explain Log Levels to your kids and your team

I’ve had to explain log levels on many occasions when working with new teams and over time developed a hopefully simple explanation of how logging should work in any application. This post is of course opinionated; As long as you, your team and your organisation have the same opinion you should be fine and my opinion won’t matter. But if you are seeking a better explanation of logging levels then this post is here to help you reason through this.

Log Levels

Fatal, Info, Debug, Trace, Silly, Verbose, Alert– there are so many logging levels that every time you start writing code and use a particular logging library it is often confusing what level to use when logging a particular message. To add to the confusion there is also a level associated with the logger and usually the loggers’ level decides what log levels will be output by that logger.

Syslog RFC 5424 section section 6.2.1 defines seven severities and many logging libraries support these. You don’t really need so many levels and the more levels you have the more the cognitive load of choosing the right level when writing code. Sticking to some core principles discussed here and a few levels should suffice.

Exploring a Colossal Cave

So imagine besides your nighttime programing job you are in daytime an intrepid cave explorer and are about to set out exploring a new cave with your teammate. You are the first humans ever to enter into the cave as you lead your less experienced teammate.

To share your first hand experience you have a twitter feed where you will post updates as you progress. But you are also aware of the dangers of entering an unexplored cave and are well prepared. You carry with you a walkie talkie which connects you with a fellow teammate who follows behind you and have an external hard communication line that connects you to emergency services. Both of you carry a backpack with essential first-aid kit, food and water.

You start into the cave by timing yourself and post some info into your twitter feed “We are entering the cave…”, “the time now is 14:00 hours”.

As you explore the cave you notice a deep slippery gorge filled with poisonous snakes and you warn your fellow explorer behind you “Watch your step on the left, snake nest”.

The level you log a message at should depend solely on the severity of the message itself. The output level the logger is set at should not worry you too much

Further in, your teammate splits up and decides to explore a small stream going down a gully. Soon you hear him cry out on your walkie talkie, “I’ve slipped, I’ve slipped, I’m going to have to drop my backpack and swim out”. “Okay I’ve gotten out but no backpack, I’m going to have to turn around.” You realise it was an error letting him go it alone and now he has had to abort the exploration, too bad. Before leaving he puts up a few flares and takes a snap of his whereabouts. He will share this internally so that you both can check and learn, debug and decipher what went wrong afterwards.

As you wander deeper into the cave you see beautiful cave paintings and in excitement make a misstep. You fall into a pit and while you manage to regain your control you hit your ankle hard and feel the crack of bone. This is critical. You need someone to come and help you out. Luckily the hardline to emergency services gets you some help.

Defining Log Levels

So now that you are out of the Colossal Cave and resting your injured ankle you decide to use your experience and write some rules for log levels.

Message Severity and Output Level

You realise that every message you want to log has a level of severity associated with it. The level you log a message at should depend solely on the severity of the message itself. The output level the logger is set at should not worry you too much. The loggers’ output level should automatically filter out messages that are of lower severity, it’s not your worry. For example, emergency services use a logger set at level critical which will automatically filter out any info messages even if you send it to the logger while your twitter feed uses a logger set at level info and will show everything on your twitter feed so your fans can follow your progress and mishaps in real time!

Importantly the loggers output level does not change the severity of your message.

log data may inadvertently expose runtime, security or personal details held in your system…

The Rules

info (or verbose): Use this to log messages that communicate details of where you are in your code, what you are about to do. If you are tracking performance log when you entered a code block, how long you took, what you did or did not do.

warn: Use this level to communicate expected error conditions or errors you can safely recover from. Also use it to highlight any risks which might need addressing in time or may get more serious if they continue to be ignored in subsequent releases of your code.

error: Use this level to communicate unexpected conditions that caused your code to abort what it intended to do during this run.

critical (or alert or emergency): Use this level to communicate unexpected conditions that will cause your code to continue failing even during subsequent runs. Usually critical is a call for help to some human to come and help your program or system to recover, even if it’s 2AM in the night.
Note: If your chosen logging library or platform does not support a critical level, use error and include a tag in the message to say it’s a critical condition. You can then get it immediate attention or human intervention to resolve.

debug: This is an interesting level and should be used to log information that helps you debug the program, dump the contents of memory, stack traces and other variables in context when errors happen. This kind of variable log data may inadvertently expose runtime, security or personal details held in your system and hence should be logged on this seperate level. Remember you can combine the debug level with one of the other levels, usually error or critical. First log a friendly message at one of the other levels and then log the system state variables using debug immediately after. On productions systems this gives you the option to completely turn off the debug level logs or purge the debug logs based on your organisations security guidelines.

That’s it, I hope you enjoy exploring many more caves and do so safely.

Image credit: Jorge Fernández on Unsplash