LifeInVistaprint

April 20th, 2015

Avoiding Null References with Functional Patterns

Author: David Durschlag

No one likes null reference exceptions. They convey little information, are hard to debug, and can “delay” error catching by being thrown so long after the original problem that the underlying issue is lost to the knowledge of man. One of the most common causes of these painful errors is the use of null as a way of expressing non-exceptional failure. This article discusses that particular case, points out some of the dangers inherent in this common pattern, and suggests a couple of alternatives, taking inspiration from the world of functional programming.

What is null?

C#’s null constant is used in a number of ways. Sometimes, it is used as a pointer with value zero, a holdover from C and C++ made necessary by COM interoperability. Sometimes, it is treated as a special value, distinct from any other “normal” value of the type. Often, however, it is used to express a “non-exceptional failure.”

When should exceptions be thrown?

This is a complicated question, with a lot of opinions and justifications on both sides. The important part is that, in many cases, it is considered undesirable to throw, and this leads to the non-exceptional failure case — something has gone awry, and that needs to be expressed to calling code without throwing an exception.

Non-Exceptional Failures

Consider a dictionary from a person’s name to their favorite pet. Its type would be Dictionary<string,Pet>. Maybe this is a member variable of the PopulationManager class. The population manager exposes a method with the signature Pet GetFavoritePet(string name), but there’s a caveat: it will often be called with names of people the system doesn’t know about. Heck, maybe this function is publicly available from a web service, and anyone can call it with anything. For performance reasons, the developer doesn’t want to throw an exception if the name isn’t recognized.

Luckily, the developer finds the handy TryGetValue method of Dictionary. She uses it to return the correct value if it’s available, and null otherwise, neatly avoiding any exceptions. Yet the Dictionary class itself doesn’t have a method that works by returning null if the key isn’t found: you can either use TryGetValue or its indexer, which throws an exception if the key is missing. Why?

The reason is that Dictionary can store null values. For example, a user who has registered with the system, but not specified their favorite pet, might be in the dictionary, but with a null value. In some cases, this is the same as not being in the dictionary at all. In some cases, however, it’s not. If PopulationManager.GetFavoritePet returns null, does it mean that the user isn’t registered, or that the user is registered but hasn’t entered their pet? What if things get worse? Sometimes the dictionary isn’t accessible due to thread locking, and we don’t want to throw, so we’ll return null. Sometimes the user is registered, but has no pets. Which null is which?
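To make the ambiguity concrete, here is a minimal sketch of the null-returning lookup described above. The Pet class, the Register helper, and the field name are assumptions made for illustration; only the GetFavoritePet signature comes from the text.

```csharp
using System.Collections.Generic;

// Hypothetical Pet type, assumed to be a reference type.
public class Pet { public string Name { get; set; } }

public class PopulationManager
{
    private readonly Dictionary<string, Pet> _favoritePets = new Dictionary<string, Pet>();

    // Hypothetical helper so the sketch can be exercised.
    public void Register(string name, Pet favorite) { _favoritePets[name] = favorite; }

    // Returns null when the name is unknown, and also when the person
    // is registered with a null favorite. Callers cannot tell which.
    public Pet GetFavoritePet(string name)
    {
        Pet pet;
        return _favoritePets.TryGetValue(name, out pet) ? pet : null;
    }
}
```

A registered user with no favorite and a completely unknown name both come back as null, which is exactly the question the paragraph above raises.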

Differentiating between levels of null-ness

Let’s create a class that helps solve this problem. First, let’s define an interface that could help us:
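A minimal sketch of what such an interface could look like. The member names Failed and Value, and the two implementing classes, are hypothetical and chosen only for illustration.

```csharp
using System;

// Hypothetical shape for the interface; member names are assumptions.
public interface IMightHaveFailed<out T>
{
    bool Failed { get; }   // true when the operation did not produce a value
    T Value { get; }       // the result; only meaningful when Failed is false
}

// A successful result. The wrapped value may itself be null.
public sealed class Success<T> : IMightHaveFailed<T>
{
    public Success(T value) { Value = value; }
    public bool Failed { get { return false; } }
    public T Value { get; private set; }
}

// A failed result. Asking for the value is a programming error.
public sealed class Failure<T> : IMightHaveFailed<T>
{
    public bool Failed { get { return true; } }
    public T Value
    {
        get { throw new InvalidOperationException("The operation failed; there is no value."); }
    }
}
```

The point of the design is that "succeeded with null" and "failed" are now two different objects rather than the same null.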

Now we can change the signature for getting favorite pets to IMightHaveFailed<Pet> GetFavoritePet(string name). That gives us a value that means “we successfully looked up that user, but they had a null pet” which is different from “we failed to look up that user.” This lets callers respond to each in turn:
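One way the lookup and its caller might look, as a sketch. The interface is restated compactly so the example compiles on its own; LookupResult, Register, and Caller.Describe are hypothetical names, not part of any library.

```csharp
using System;
using System.Collections.Generic;

// Compact restatement of the hypothetical interface.
public interface IMightHaveFailed<out T> { bool Failed { get; } T Value { get; } }

public sealed class LookupResult<T> : IMightHaveFailed<T>
{
    private readonly T _value;
    private LookupResult(bool failed, T value) { Failed = failed; _value = value; }
    public static LookupResult<T> Success(T value) { return new LookupResult<T>(false, value); }
    public static LookupResult<T> Failure() { return new LookupResult<T>(true, default(T)); }
    public bool Failed { get; private set; }
    public T Value
    {
        get
        {
            if (Failed) throw new InvalidOperationException("No value: the lookup failed.");
            return _value;
        }
    }
}

public class Pet { public string Name { get; set; } }

public class PopulationManager
{
    private readonly Dictionary<string, Pet> _favoritePets = new Dictionary<string, Pet>();

    public void Register(string name, Pet favorite) { _favoritePets[name] = favorite; }

    public IMightHaveFailed<Pet> GetFavoritePet(string name)
    {
        Pet pet;
        return _favoritePets.TryGetValue(name, out pet)
            ? LookupResult<Pet>.Success(pet)   // pet may legitimately be null
            : LookupResult<Pet>.Failure();     // the name is unknown
    }
}

public static class Caller
{
    // Responds differently to "lookup failed" and "lookup succeeded with null".
    public static string Describe(PopulationManager manager, string name)
    {
        IMightHaveFailed<Pet> result = manager.GetFavoritePet(name);
        if (result.Failed) return name + " is not registered.";
        if (result.Value == null) return name + " has not entered a favorite pet.";
        return name + "'s favorite pet is " + result.Value.Name + ".";
    }
}
```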

This also lets us handle the case where we might have a null dictionary, and if not we might have a null user, and if not they might have no pets, and if not they might have no favorite. It’s not pretty, though: IMightHaveFailed<IMightHaveFailed<IMightHaveFailed<IMightHaveFailed<Pet>>>>. Is there a better way to do this?

Maybe

It turns out that there’s already a type in .NET that looks very much like IMightHaveFailed. Its name is Nullable<T>, but it’s more commonly seen as ?. Unfortunately, there are a few problems here. Nullable<T> only accepts non-nullable value types, so it can neither wrap a reference type such as Pet nor be nested inside itself. Even if it could, Nullable<Nullable<Nullable<Nullable<Pet>>>> doesn’t win any prizes for aesthetic beauty, either, and the syntactic sugar only goes one level deep: Pet???? isn’t legal.

It turns out that Nullable isn’t a Microsoft innovation: it’s actually a cheap knockoff of the Option type found in F# and other ML-family languages (known as the Maybe monad in Haskell). The actual Option/Maybe does a few tricks the knockoff doesn’t, and one of them is providing a way to branch and run different code based on whether the result succeeded or not. Perhaps by going down that path we can come up with a more legible, yet effective, way of expressing different levels of null-ness.
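For readers who have not met Option/Maybe, here is a minimal sketch of the branching trick in C#. Maybe<T>, Some, None, and Match are hypothetical names; this is not a .NET framework type, and real functional libraries offer considerably more.

```csharp
using System;

// Hypothetical minimal Maybe<T>, written for illustration only.
public struct Maybe<T>
{
    private readonly bool _hasValue;
    private readonly T _value;

    private Maybe(T value) { _hasValue = true; _value = value; }

    public static Maybe<T> Some(T value) { return new Maybe<T>(value); }

    // The default struct value has _hasValue == false, so it serves as "no value".
    public static Maybe<T> None { get { return default(Maybe<T>); } }

    // Runs exactly one of the two branches, depending on whether a value is present.
    public TResult Match<TResult>(Func<T, TResult> some, Func<TResult> none)
    {
        return _hasValue ? some(_value) : none();
    }
}
```

Unlike Nullable<T>, this sketch places no constraint on T, so it can wrap reference types and can be nested, for example as Maybe<Maybe<string>>.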

Trying it out

It turns out that C# already has a standard idiom for this — in fact, it’s already shown up earlier! The Try-Out pattern is used in various classes provided by Microsoft to avoid using null, including in TryGetValue. This immediately changes the signature of pet retrieval to bool TryGetFavoritePet(string name, out Pet pet), but that only gets us one layer of null-ness. The most elegant way to get more seems to be to introduce an enum:
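A sketch of how the enum-based variant might look. The enum members, the Person class, and the Register helper are assumptions for illustration; here the return type changes from bool to the enum so that each reason for failure gets its own value.

```csharp
using System.Collections.Generic;

public class Pet { public string Name { get; set; } }

// Hypothetical enum: one member per reason a favorite pet might be missing.
public enum PetLookupResult { Success, NoSuchUser, NoPets, NoFavorite }

// Hypothetical Person type holding all pets plus an optional favorite.
public class Person
{
    public Person() { Pets = new List<Pet>(); }
    public List<Pet> Pets { get; private set; }
    public Pet Favorite { get; set; }
}

public class PopulationManager
{
    private readonly Dictionary<string, Person> _people = new Dictionary<string, Person>();

    public void Register(string name, Person person) { _people[name] = person; }

    // Try-Out with an enum in place of bool: pet is only set on Success.
    public PetLookupResult TryGetFavoritePet(string name, out Pet pet)
    {
        pet = null;
        Person person;
        if (!_people.TryGetValue(name, out person)) return PetLookupResult.NoSuchUser;
        if (person.Pets.Count == 0) return PetLookupResult.NoPets;
        if (person.Favorite == null) return PetLookupResult.NoFavorite;
        pet = person.Favorite;
        return PetLookupResult.Success;
    }
}
```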

Our client code can now handle all the possible reasons for petlessness, but we’ve introduced a new enum which will invariably cause the following:
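The kind of switch statement in question might look like the sketch below. The delegate FavoritePetLookup is a hypothetical stand-in for the Try-Enum method so that the example compiles on its own; the enum members and messages are likewise assumptions.

```csharp
using System;

public class Pet { public string Name { get; set; } }

public enum PetLookupResult { Success, NoSuchUser, NoPets, NoFavorite }

// Hypothetical stand-in for the Try-Enum lookup method.
public delegate PetLookupResult FavoritePetLookup(string name, out Pet pet);

public static class Caller
{
    public static string Describe(FavoritePetLookup tryGetFavoritePet, string name)
    {
        Pet pet;
        switch (tryGetFavoritePet(name, out pet))
        {
            case PetLookupResult.Success:
                return name + "'s favorite pet is " + pet.Name + ".";
            case PetLookupResult.NoSuchUser:
                return name + " is not registered.";
            case PetLookupResult.NoPets:
                return name + " has no pets.";
            case PetLookupResult.NoFavorite:
                return name + " has pets, but no favorite.";
            default:
                // Needed so the compiler accepts the method, and to catch future enum members.
                throw new ArgumentOutOfRangeException();
        }
    }
}
```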

That’s a lot of boilerplate to say “I want different behaviors for each possible state.” It also doesn’t parameterize well — what if in the “no favorite” case we want to provide the caller with the list of all pets, since there is no favorite? The signature becomes a mess, null starts showing up again, or both.

Functioning functionally

Of course, if you only need one level of null, you could use a reimplementation of the Maybe Monad itself. But what if we want to do something different for each case, and include the list of pets in the “no favorite” case? Let’s pull a different tool out of the functional toolbox and break out first class functions themselves:
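A sketch of a callback-based lookup along these lines. The parameter names (onFavorite, onNoFavorite, onNoPets, onNoSuchUser), the Person class, and the choice of Func<T> over Action are assumptions for illustration. Each case gets its own function, and the "no favorite" case receives the full list of pets.

```csharp
using System;
using System.Collections.Generic;

public class Pet { public string Name { get; set; } }

// Hypothetical Person type holding all pets plus an optional favorite.
public class Person
{
    public Person() { Pets = new List<Pet>(); }
    public List<Pet> Pets { get; private set; }
    public Pet Favorite { get; set; }
}

public class PopulationManager
{
    private readonly Dictionary<string, Person> _people = new Dictionary<string, Person>();

    public void Register(string name, Person person) { _people[name] = person; }

    // Invokes exactly one callback; each callback is parameterized by
    // whatever data makes sense for its case.
    public T GetFavoritePet<T>(
        string name,
        Func<Pet, T> onFavorite,
        Func<IList<Pet>, T> onNoFavorite,
        Func<T> onNoPets,
        Func<T> onNoSuchUser)
    {
        Person person;
        if (!_people.TryGetValue(name, out person)) return onNoSuchUser();
        if (person.Pets.Count == 0) return onNoPets();
        return person.Favorite != null
            ? onFavorite(person.Favorite)
            : onNoFavorite(person.Pets);
    }
}
```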

The calling code for this is a lot like the switch statement above, but with the boilerplate syntax trimmed off:
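A sketch of such a call site. The callback-based lookup is restated compactly so the example compiles on its own; all names other than GetFavoritePet and Pet are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Pet { public string Name { get; set; } }

public class Person
{
    public Person() { Pets = new List<Pet>(); }
    public List<Pet> Pets { get; private set; }
    public Pet Favorite { get; set; }
}

// Compact restatement of the hypothetical callback-based lookup.
public class PopulationManager
{
    private readonly Dictionary<string, Person> _people = new Dictionary<string, Person>();

    public void Register(string name, Person person) { _people[name] = person; }

    public T GetFavoritePet<T>(string name, Func<Pet, T> onFavorite,
        Func<IList<Pet>, T> onNoFavorite, Func<T> onNoPets, Func<T> onNoSuchUser)
    {
        Person person;
        if (!_people.TryGetValue(name, out person)) return onNoSuchUser();
        if (person.Pets.Count == 0) return onNoPets();
        return person.Favorite != null ? onFavorite(person.Favorite) : onNoFavorite(person.Pets);
    }
}

public static class Caller
{
    // One lambda per case; no enum, no switch, no out parameter.
    public static string Describe(PopulationManager manager, string name)
    {
        return manager.GetFavoritePet(name,
            onFavorite: pet => name + "'s favorite pet is " + pet.Name + ".",
            onNoFavorite: pets => name + " has no favorite among: "
                + string.Join(", ", pets.Select(p => p.Name)) + ".",
            onNoPets: () => name + " has no pets.",
            onNoSuchUser: () => name + " is not registered.");
    }
}
```

Named arguments are optional here, but they keep each lambda labeled with the case it handles.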

The signature for this version is definitely longer than the Try-Enum version, but the tradeoff is a substantial gain in power (alternative parameterizations by case) and conciseness (at the callsite).

Wrapping it up

There’s more than one way to do most things, and that applies to handling non-exceptional failure cases in C#, too. Just because there is a plethora of ways, however, doesn’t mean they’re all created equal. By avoiding null values, we can prevent difficult-to-debug exceptions down the line, make our signatures state the behavior of our methods more clearly, and reduce heavy syntax that obscures the actual purpose of our code. Try-Out and first class functions both have advantages and disadvantages, but either is preferable to wondering “what does this null mean?”

David Durschlag is a Lead Software Engineer at Vistaprint
