2. Null in C# 8
The story is simple in C# if you forget about anything before C# 8. Neither
value types nor reference types can be null. The only types whose values can be
null are nullable value types and nullable reference types, such as int?,
string?, etc. But, as you might expect, various traps are waiting for you.
Nullable value types and nullable reference types
C# has value types and reference types, both of which have nullable types and non-nullable types.
If T is a non-nullable value type, then T? is a nullable value type†1. The
notation T? is syntactic sugar, and its actual type is Nullable<T> (a struct).
Therefore, T and T? are different types at run time as well as at compile time.
T?, that is, Nullable<T>, is a nullable type. Unlike Java's Optional<T>, both
T? and Nullable<T> prohibit nesting. The C# compiler achieves this by treating
Nullable<T> types specially. Note that instances of nullable value types are
immutable objects.
†1 Nullable value types have been available since C# 2.0.
In contrast, if T is a non-nullable reference type, then T? is a nullable
reference type†2. The compiler only distinguishes between T and T?. At run
time, T and T? are the same thing (that is, T?, which was simply called a
reference type before C# 8). And, of course, T? is a nullable type.
†2 From C# 8, nullable reference types have been available, making traditional reference types non-nullable reference types. However, there are some tricks to keep backward compatibility and the default setting is to do so.
These can be summarized as follows:
| | Non-nullable types | Nullable types |
|---|---|---|
| Value types | Non-nullable value types (e.g., int) | Nullable value types (e.g., int?, Nullable<int>) |
| Reference types | Non-nullable reference types (e.g., string) | Nullable reference types (e.g., string?) |
Null checks for expressions of reference types
The following table summarizes how to check whether the expression expr is null:
| How | is null | is not null |
|---|---|---|
| Equality operators | expr == null | expr != null |
| is pattern matching | expr is null | !(expr is null) |
| Property pattern | !(expr is {}) | expr is {} |
C# 9 also allows expr is not null.
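As a minimal sketch of the checks in the table, the following program applies each form to a string? variable. The class name NullCheckDemo and the helper Describe are hypothetical names chosen for illustration only.

```csharp
#nullable enable
using System;

public static class NullCheckDemo
{
    // Hypothetical helper: uses the property pattern 'is {}' to test for
    // non-null and bind the value to 's' in one step.
    public static string Describe(string? expr)
        => expr is {} s ? $"not null ({s.Length} chars)" : "null";

    public static void Main()
    {
        string? a = null;
        string? b = "abc";

        // The three "is null" forms from the table agree with each other.
        Console.WriteLine(a == null);     // True
        Console.WriteLine(a is null);     // True
        Console.WriteLine(!(a is {}));    // True

        Console.WriteLine(Describe(a));   // null
        Console.WriteLine(Describe(b));   // not null (3 chars)
    }
}
```

The property pattern has the practical advantage that it can introduce a non-nullable variable in the same expression as the check.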
Nullable reference types at compile time and run time
We described that “the compiler only distinguishes” between the nullable and non-nullable reference types. But in practice, it is not so simple. The compiler does not only distinguish between them but also treats them as the same type, depending on the context. For example, the following class results in a compilation error:
public class RaiseCS0111
{
public void M(string name)
{
}
public void M(string? name)
{
}
}
Let R be a non-nullable reference type. When the compiler considers method
signatures, it regards R and R? as the same type. Thus, M(string) and
M(string?) have the same signature, so the compilation fails.
On the other hand, when it comes to overriding methods, where the signatures
are already identical, the compiler does distinguish between R and R?, so it
warns against the following code:
public class Base
{
public virtual void M(string? name)
{
}
}
public sealed class RaiseCS8610 : Base
{
public override void M(string name)
{
}
}
Conversely, if you override M(string) with M(string?), the compiler doesn't warn. This is because the Liskov substitution principle [1] is applied.
Roughly speaking, the compiler does as follows:
- Ignore ? in R? and compile ➜ If there are errors, output them
- Then, run the data flow analysis on nullability, considering ? in R? ➜ If there are warnings, output them
If you find this hard to understand, consider that ? in R? does not represent a
type, but an attribute that exists only at compile time (e.g., [MaybeNull]). In
other words, convert the following code:
string? foo;
to the following, just in your head:
[MaybeNull] string foo;
In that case, it is easier to understand that ? is an annotation added to
variables and parameters rather than a part of the type.
The trouble is that there is no distinction between R and R? at run time.
For example, let's consider the following code:
public static void Main() {
var array = new[]
{
"abc", null, "def",
};
var all = array.OfType<string?>()
.ToArray();
foreach (var s in all)
{
Console.WriteLine(s);
}
}
If you look at the result, there are just two lines of output, so all does not
contain null. In other words, OfType<R>() and OfType<R?>() have the same
result. Looking at the reference implementation, OfType<T>() only returns
elements that match T with is pattern matching.
In the first place, specifying R? with is pattern matching results in a
compilation error as follows:
public class RaiseCS8650
{
public void M(object? o)
{
if (o is string?)
{
}
}
}
Now, let's try is pattern matching with the type parameter T as follows:
public class C
{
public static void M<T>(object? o)
{
if (o is T)
{
Console.WriteLine("true");
}
else
{
Console.WriteLine("false");
}
}
public static void Main()
{
M<string?>("a");
M<string?>(null);
}
}
The result is true and false. It shows that specifying R? for the type
parameter T with is pattern matching is equivalent to specifying R. This is a
natural consequence since is pattern matching determines the type at run time,
and ? disappears at run time by type erasure [2].
However, as shown in the example of OfType<T>(), there are some cases that are
easy to misunderstand. If you write your own methods, of course, you can
prevent R? from being specified for T with type constraints, as described
below. On the other hand, for APIs in the standard library such as LINQ, the
user needs to check carefully whether they determine T at run time.
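To make the difference concrete, here is a small sketch that contrasts OfType<string?>() with Cast<string?>() on the same array. The class and method names are hypothetical. It relies on Cast<T>() not filtering elements, whereas OfType<T>() tests each element with is.

```csharp
#nullable enable
using System;
using System.Linq;

public static class OfTypeDemo
{
    // OfType<T>() tests each element with 'is', so null never matches,
    // even when T is written as 'string?'.
    public static int CountWithOfType(string?[] array)
        => array.OfType<string?>().Count();

    // Cast<T>() only converts; it keeps null elements.
    public static int CountWithCast(string?[] array)
        => array.Cast<string?>().Count();

    public static void Main()
    {
        var array = new string?[] { "abc", null, "def" };
        Console.WriteLine(CountWithOfType(array)); // 2
        Console.WriteLine(CountWithCast(array));   // 3
    }
}
```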
Default values
Let's consider the null object pattern, discussed in the Java part, again in C#:
...
public sealed class Program
{
private static readonly Action DoNothing = () => {};
private readonly Func<char, Action> toAction;
public Program()
{
var map = new Dictionary<char, Action>()
{
['h'] = MoveCursorLeft,
['j'] = MoveCursorDown,
['k'] = MoveCursorUp,
['l'] = MoveCursorRight,
};
toAction = c => map.TryGetValue(c, out var action)
? action
: DoNothing;
}
public void HandleKeyInput()
{
var c = GetPressedKey();
var action = toAction(c);
action();
}
...
Like the example in Java, null does not appear. It's great, but there is a
small concern.
When the TryGetValue(TKey, out TValue) method of the Dictionary<TKey, TValue>
class returns false, what is the value of action? As Action is a non-nullable
type, you might assume that action should never be null. But the answer is
null.
The API reference states the following:
value TValue
When this method returns, contains the value associated with the specified key, if the key is found; otherwise, the default value for the type of the value parameter. This parameter is passed uninitialized.
The default value of a reference type is null, so it behaves as the
specification says. To justify this, or rather to teach it to the compiler,
.NET Core 3.0 added a new special attribute to TryGetValue†3. Specifically, the
second parameter is annotated with the MaybeNullWhenAttribute as follows:
public bool TryGetValue(TKey key, [MaybeNullWhen(false)] out TValue value)
†3 Confirmed with .NET Core SDK 3.0.100-preview9.
[MaybeNullWhen(false)] tells the compiler that “When the return value of the
method is false, the parameter can be null even if the type of the parameter is
a non-nullable reference type.” This allows the compiler to warn about code
that accesses parameters of non-nullable reference types without a null check,
on the path where the return value of the method is false.
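The same attribute can be applied to your own methods. The following is a minimal sketch, assuming a hypothetical wrapper named TryGetName around a dictionary lookup. Only MaybeNullWhenAttribute (in System.Diagnostics.CodeAnalysis) and Dictionary.TryGetValue come from the standard library; everything else is invented for illustration.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;

public static class KeyMap
{
    private static readonly Dictionary<char, string> Names =
        new Dictionary<char, string> { ['h'] = "left", ['l'] = "right" };

    // Hypothetical wrapper: 'name' may be null only when the method
    // returns false, mirroring the annotation on TryGetValue.
    public static bool TryGetName(char key, [MaybeNullWhen(false)] out string name)
        => Names.TryGetValue(key, out name);

    // 'name' is used only on the path where TryGetName returned true,
    // so the flow analysis treats it as non-null there.
    public static string NameOrNone(char key)
        => TryGetName(key, out var name) ? name : "none";

    public static void Main()
    {
        Console.WriteLine(NameOrNone('h')); // left
        Console.WriteLine(NameOrNone('x')); // none
    }
}
```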
In short, the standard library, which debuted before non-nullable reference
types were born, is designed with the assumption that values of reference types
can be null, so it is somewhat incompatible with non-nullable reference types.
Therefore, the C# language designers bravely added new attributes to the
standard library that correspond to the annotations of the Checker Framework in
Java and updated the compiler to track nullability.
To solve that, at first glance, it seems that you could change the signature as follows:
public bool TryGetValue(TKey key, out TValue? value)
However, this doesn't work. The reason is explained in detail in the article on Microsoft Developer Blogs, Try out Nullable Reference Types [3].
To summarize:
| Types | Examples | Can be null? | default |
|---|---|---|---|
| V | int, bool | Never | 0, false, etc. |
| V?, Nullable<V> | int?, Nullable<int> | Yes | null |
| R | string | Yes | null |
| R? | string? | Yes | null |
where V and R are used as the non-nullable value and reference types,
respectively.
LINQ
Now consider an operation to retrieve the first element in an array that meets the criteria. Take a look at the following code:
var firstFavorite = new[] { "foo", "bar", "baz" }
.Where(matchesFavorite)
.FirstOrDefault();
// 'firstFavorite' can be 'null' ... Hmm!?
if (firstFavorite is {})
{
...
}
// Or
if (firstFavorite is string s)
{
...
}
The FirstOrDefault() method returns default(T) if the IEnumerable<T> that this
represents is empty; otherwise, it returns the first element. The type of its
return value is T. In this example, T is a non-nullable reference type, string,
so it returns null for an empty IEnumerable<string>. This was usual before
C# 8, but is now unusual: although the return value may be null if its type is
string?, it must not be null if its type is string.
The return value of FirstOrDefault() should be annotated with the
MaybeNullAttribute to allow the return value to be null as follows:
[return: MaybeNull]
public static TSource FirstOrDefault<TSource>(this IEnumerable<TSource> source)
[return: MaybeNull] tells the compiler that “The return value of a method can
be null, even if its type is a non-nullable reference type.” This allows the
compiler to warn about code that accesses the return value of the method
without a null check.
Of course, you can use DefaultIfEmpty(defaultValue).First() instead of
FirstOrDefault(). However, unless you can use the return value as it is, as
with the null object pattern, you still need to compare the return value with
defaultValue after all.
Again, let's dare to write as follows:
var firstFavorite = new[] { "foo", "bar", "baz" }
.Where(matchesFavorite)
.Take(1);
foreach (var s in firstFavorite)
{
...
}
// Or...
var firstOrEmpty = new[] { "foo", "bar", "baz" }
.Where(matchesFavorite)
.Take(1)
.ToArray();
if (firstOrEmpty.Length > 0)
{
var s = firstOrEmpty[0];
...
}
To digress a little from the topic of null, Take(1) is sometimes more useful
than FirstOrDefault(), namely when the element type T is a value type. If the
value of default(T) is unwieldy, you may be happier dealing with an
IEnumerable<T> containing at most one element rather than the default value.
(It was no more than a way of thinking to get a better understanding in the
Java part, but it's practical in C#.)
The options for dealing with at most one (i.e., zero or one) instance in C# are as follows:
| Number of instances | Nullable types/Comprehensions | null/Alternatives |
|---|---|---|
| At most 1 | T? | null |
| 0 or more | T[] | Array.Empty<T>() |
| 0 or more | IEnumerable<T> | Enumerable.Empty<T>() |
Implicit and explicit operators for nullable value types
With nullable value types, you can assign not only an expression of type T? and
null but also an expression of type T to a variable of type T?. This is because
the implicit operator of type Nullable<T> representing the conversion from T to
T? applies. For example, if T is int:
int? v1 = null;
int? v2 = 123;
Values on the right side are implicitly converted to type Nullable<int>. (The
implicit operator to convert from int to int? applies.) That is, it is
equivalent to:
var v1 = new Nullable<int>();
var v2 = new Nullable<int>(123);
Conversely, assigning an expression of type T? to a variable of type T requires
an explicit type conversion (i.e., the explicit operator to convert from T? to
T). So, the following example results in a compilation error:
int? maybeInt = 123;
int intValue = maybeInt;
It compiles without errors if you use an explicit type conversion (i.e., a cast to int) as follows:
int? maybeInt = 123;
var intValue = (int)maybeInt;
However, this is equivalent to the following code:
int? maybeInt = 123;
var intValue = maybeInt.Value;
The Value property of Nullable<T> returns the value if the object has one.
Otherwise, it throws an InvalidOperationException. Thus, be careful: the
explicit type conversion to type T also throws an exception if there is no
value.
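The following sketch demonstrates that behavior. UnwrapDemo, CastThrows, and OrZero are hypothetical names. GetValueOrDefault() is the parameterless method of Nullable<T> and is shown as one way to avoid the exception.

```csharp
using System;

public static class UnwrapDemo
{
    // Returns true if casting 'maybeInt' to int throws
    // InvalidOperationException, i.e., if it has no value.
    public static bool CastThrows(int? maybeInt)
    {
        try
        {
            _ = (int)maybeInt; // Same as reading maybeInt.Value
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    // Falls back to default(int), which is 0, instead of throwing.
    public static int OrZero(int? maybeInt) => maybeInt.GetValueOrDefault();

    public static void Main()
    {
        Console.WriteLine(CastThrows(123));  // False
        Console.WriteLine(CastThrows(null)); // True
        Console.WriteLine(OrZero(null));     // 0
    }
}
```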
Checking the presence of a value for nullable value types
You can check whether an expression of type Nullable<T> has a value, using the
HasValue property as follows:
int? maybeInt = ...
if (maybeInt.HasValue) {
var intValue = maybeInt.Value;
...
}
Alternatively, you can check it by comparing an expression of type T? with null
as follows:
int? maybeInt = ...
// The next statement is equivalent to 'if (maybeInt is {}) {'
// or 'if (!(maybeInt is null)) {'
if (maybeInt != null) {
var intValue = maybeInt.Value;
...
}
You can also apply pattern matching:
int? maybeInt = ...
// The next statement is equivalent to 'if (maybeInt is int intValue) {'
if (maybeInt is {} intValue) {
...
}
Note that there is no Value or HasValue property for nullable reference types.
Again, for nullable reference types, this is obvious because T and T? are
virtually the same type. However, of course, comparison with null and pattern
matching can be applied in the same way.
Lifted operators for nullable value types
The nullable value type allows you to apply operators of type T as is to type
T?. The operators of type T that are applied to type T? are called lifted
operators.
The result of the operation is an object of type T?, which has no value (i.e.,
null) if either or both operands have no value, or which has the value of the
result of the operation if both operands have a value. For example, if a and b
are objects of type int?, the following code:
var c = a + b;
is almost equivalent to the following:
// You can also use '&&' instead of '&'.
var c = (a.HasValue & b.HasValue)
? new Nullable<int>(a.Value + b.Value)
: new Nullable<int>();
However, only the operations of type bool? are subject to special rules. See
Nullable Boolean logical operators for more information.
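As a short sketch of both points, the following program shows a lifted + on int? and the three-valued behavior of & and | on bool?. LiftedDemo and Add are hypothetical names.

```csharp
using System;

public static class LiftedDemo
{
    // The lifted '+' yields null if either operand has no value.
    public static int? Add(int? a, int? b) => a + b;

    public static void Main()
    {
        Console.WriteLine(Add(1, 2) == 3);           // True
        Console.WriteLine(Add(1, null).HasValue);    // False

        bool? unknown = null;
        // For bool?, a known operand can decide the result on its own:
        // false & null is false, and true | null is true.
        Console.WriteLine((false & unknown) == false); // True
        Console.WriteLine((true | unknown) == true);   // True
        // Otherwise the result has no value.
        Console.WriteLine((true & unknown).HasValue);  // False
    }
}
```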
Boxing and unboxing nullable value types
For nullable value types, boxing an object of type T? results in null if the
object has no value; otherwise, it results in the boxed value of type T that
the object has. Here is an example:
int? maybeInt = ...
object boxedInt = maybeInt;
This is equivalent to the following code:
int? maybeInt = ...
var boxedInt = (maybeInt.HasValue)
? (object)maybeInt.Value
: null;
An object boxing a value of type T can also be unboxed into an object of type
T?. Here is an example:
int intValue = ...
object boxedInt = intValue;
int? maybeInt = (int?)boxedInt;
Note that there is no boxing or unboxing for nullable reference types, of course, because they are reference types.
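The boxing rules above can be observed with a small program like the following sketch. BoxingDemo, Box, and Unbox are hypothetical names.

```csharp
#nullable enable
using System;

public static class BoxingDemo
{
    // Boxes an int?; the result is a null reference when there is no value.
    public static object? Box(int? maybeInt) => maybeInt;

    // Unboxes into int?; a null reference becomes an int? without a value.
    public static int? Unbox(object? boxed) => (int?)boxed;

    public static void Main()
    {
        Console.WriteLine(Box(null) is null);    // True
        // The box holds an int, not a Nullable<int>.
        Console.WriteLine(Box(5)?.GetType());    // System.Int32
        Console.WriteLine(Unbox(null).HasValue); // False
        Console.WriteLine(Unbox(5) == 5);        // True
    }
}
```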
?. and ?[] operators
?. and ?[] operators are the null-conditional operators†4.
†4 According to Wikipedia [4], an operator with the same meaning as this operator is defined in other programming languages, but for now, its terminology varies from language to language.
When expr is an expression of a nullable type, expr?.Member means “access
Member of expr only if expr is not null.” If expr is null, there is no access
to Member, and the expression results in null unless Member is of type void.
Similarly, expr?[index] means “access the indexer this[int] of expr only if
expr is not null.” If expr is null, the expression results in null without
access to this[int].
Both result in compilation errors if expr is of a non-nullable value type.
A compilation error also occurs if it is unknown at compile time whether the
type of Member or this[int] is a reference or value type (for example, if it is
a generic type parameter and it has no type constraint).
To summarize:
| The type of expr | expr | The result of expr?.Member, expr?[int] |
|---|---|---|
| Nullable types or non-nullable reference types | null | Nothing if the type is void, null otherwise |
| | not null | expr.Member, expr[int] |
| Non-nullable value types | never null | A compilation error |
For example, if Member represents invoking “a method that returns void,” or “a
delegate such as Action,” the result is similar to the following†5:
if (expr is {})
{
expr.Member();
}
Otherwise, as long as Member returns a value or reference, it is null if expr
is null, or expr.Member if expr is not null. Assuming that R and V are
non-nullable reference types and non-nullable value types respectively, the
results are as follows:
| The type of Member | The result similar to expr?.Member†5 | The type of the expression |
|---|---|---|
| R/R? | (expr is null) ? null : expr.Member | R? |
| V | (expr is null) ? (V?)null : expr.Member | V? |
| V? | (expr is null) ? null : expr.Member | V? |
For an indexer, it is null if expr is null, or expr[index] if expr is not null.
Similarly, with R and V, the results are as follows:
| The type of this[int] | The result similar to expr?[index]†5 | The type of the expression |
|---|---|---|
| R/R? | (expr is null) ? null : expr[index] | R? |
| V | (expr is null) ? (V?)null : expr[index] | V? |
| V? | (expr is null) ? null : expr[index] | V? |
†5 In both expr?.Member and expr?[index], expr is evaluated only once.
However, unlike Swift, if you use ?. and ?[] as l-values, you get compilation
errors as follows:
public sealed class Program
{
public static void Main()
{
var m = new Program();
// Sets values with 'SetBar()' and 'SetChar()'
// if 'MaybeFoo' is not 'null'
m.MaybeFoo?.SetBar("hello");
m.MaybeFoo?.SetChar(0, 'h');
// The next statements result in compilation errors,
// even if we wish the same result as above.
m.MaybeFoo?.Bar = "hello";
m.MaybeFoo?[0] = 'h';
}
public Foo? MaybeFoo { get; set; }
public sealed class Foo
{
private char[] table = {};
public string Bar { get; set; } = "";
public char this[int k]
{
get => table[k];
set => table[k] = value;
}
public void SetBar(string newBar)
=> Bar = newBar;
public void SetChar(int k, char c)
=> this[k] = c;
}
}
In particular, when the type of Member or this[int] is non-nullable, the ?. and
?[] operators introduce null into a result that could not be null before. In
other words, these operators can spread ? like an infection, so that R? and V?,
which did not exist before, are newly born. Therefore, they are operations that
defer the disposition of null. They are similar to the misuse of the null
object pattern, catching NullReferenceException (NRE), etc., so they might
cause null to slip through null checks in unexpected places. Hence, we should
not casually use ?. and ?[] (instead of . and []).
?? and ??= operators
?? and ??= operators are the null-coalescing operator†6 and null-coalescing
assignment operator, respectively.
†6 Like null-conditional operators, other programming languages also have operators defined with the same meaning. See Wikipedia [5].
expr1 ?? expr2†7 is equivalent to the following:
(expr1 is {}) ? expr1 : expr2
However, expr1 is evaluated only once in the case of expr1 ?? expr2.
If expr1 is of a nullable value type, the ?? operator does an operation similar
to the GetValueOrDefault(T) method of Nullable<T>, except that expr2 is
evaluated only if expr1 is null.
expr1 ?? throw new Exception()†7 throws an exception when expr1 is null;
otherwise, it is an expression that returns expr1.
variable ??= expr†7 is equivalent to the following:
if (variable is null)
{
variable = expr;
}
†7 If expr1 or variable is of a non-nullable value type, a compilation error occurs.
In particular, if expr2 is of a non-nullable type, ?? operators prevent null
from landing. They dispose of null. Conversely, if expr2 is of a nullable type,
we should not casually use ?? operators, just like the ?. and ?[] operators.
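The following sketch combines the operators described above: ?. produces a nullable result, and ?? with a non-nullable right operand disposes of null immediately. CoalesceDemo and its members are hypothetical names.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

public static class CoalesceDemo
{
    private static List<string>? cache;

    // '??=' assigns only when 'cache' is null, so the list is created once.
    public static List<string> GetCache()
    {
        cache ??= new List<string>();
        return cache;
    }

    // 's?.Length' is of type int?; '?? 0' turns it back into int.
    public static int LengthOrZero(string? s) => s?.Length ?? 0;

    // '?? throw' rejects null at the boundary instead of deferring it.
    public static string Require(string? s)
        => s ?? throw new ArgumentNullException(nameof(s));

    public static void Main()
    {
        Console.WriteLine(LengthOrZero("abc"));                      // 3
        Console.WriteLine(LengthOrZero(null));                       // 0
        Console.WriteLine(ReferenceEquals(GetCache(), GetCache())); // True
    }
}
```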
! postfix operators
Null-forgiveness operators, or null-forgiving operators, treat an expression of
a nullable reference type (e.g., null itself or an expression that can be null)
as that of a non-nullable reference type. (It is substantially a directive to
the compiler not to issue the warnings.) You can postfix ! to an expression of
a nullable reference type to suppress the warnings, as follows:
public sealed class Foo
{
public Foo()
{
// 'NonNullable = null;' causes the warning.
NonNullable = null!;
}
public string NonNullable { get; }
public void RaiseNoWarnings()
{
string? maybeNull = null;
// 'string t = maybeNull;' causes the warning.
string t = maybeNull!;
// 'var n = maybeNull.Length;' causes the warning.
var n = maybeNull!.Length;
}
}
Just to be sure, don't postfix ! if you don't understand the compiler's warning
and just want to clear it. If you do that, you will be surprised later.
The compiler ignores ! if you postfix ! to an expression of a non-nullable type. Note also that postfix ! on a nullable value type just suppresses warnings (unlike Unconditional Unwrapping in Swift). If you postfix ! as if it were the Value property of Nullable<T> to get a value (or throw an exception if there is no value), you will not get the value but just a compilation error.
Constraints of generic type parameters
Type parameters are troublesome. Suppose there is a class Foo<T>. Its type
parameter T may be not only a non-nullable type such as int or string but also
a nullable type such as int? or string?. Therefore, parameters and return
values of type T? go wrong. Types that have a generic type parameter T and
contain T? result in compilation errors as follows:
public sealed class Foo<T>
// Uncommenting the next line will remove the errors:
// where T : class
// or:
// where T : struct
{
public T? Default { get; } = default;
public void DoSomething(T? t)
{
}
public void DoAnything()
{
T? bar;
}
}
Limiting the type parameter T to a non-nullable reference type (class) or a
non-nullable value type (struct) with type constraints resolves these errors.
That is, we must make it clear whether T? is in truth a reference type T or a
value type Nullable<T>. It's like an oath or a curse in C#.
There is also the notnull constraint, representing that the type parameter T is
a non-nullable type. This is something like “class or struct.”
If T has a notnull constraint, Foo<T> also cannot have T? types because it is
unclear whether T is a reference type or a value type. And if you restrict the
T of Foo<T> to be notnull, of course, you should no longer specify a nullable
type for T. Let's check it out in the following code:
public sealed class Foo<T>
where T : notnull
{
public Foo(T t)
{
}
}
public static class Foo
{
public static Foo<T> NewFoo<T>(T t)
where T : notnull
=> new Foo<T>(t);
public static void RaiseWarnings(string? maybeNull)
{
_ = NewFoo(1);
_ = NewFoo("a");
int? i = 1;
string? notNull = "a";
_ = NewFoo(notNull);
// The next two lines cause the warnings.
_ = NewFoo(i);
_ = NewFoo(maybeNull);
}
}
It is an interesting feature of nullable reference types that NewFoo(notNull)
does not cause a warning. The type inference treats even expressions of type
string? as of type string, as long as the data flow analysis believes that the
value should never be null. On the other hand, this never happens for value
types.
Finally, although it is complicated, there is also a class? constraint. If T
has a class? constraint, Foo<T> cannot contain any T? type because T itself is
a nullable type. However, you can assign a reference type to T regardless of
whether it is nullable or not.
The above is summarized as follows:
| Constraint of T | Foo<T> contains T? | Foo<V> | Foo<V?> | Foo<R> | Foo<R?> |
|---|---|---|---|---|---|
| Nothing | A compilation error | ○ | ○ | ○ | ○ |
| where T : class | OK | | | ○ | |
| where T : class? | A compilation error | | | ○ | ○ |
| where T : struct | OK | ○ | | | |
| where T : notnull | A compilation error | ○ | | ○ | |
where R is a non-nullable reference type, and V is a non-nullable value type.
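As a minimal sketch of the two "OK" rows, the following methods use T? under a class constraint and under a struct constraint. ConstraintDemo and both method names are hypothetical. The two methods need different names here because T? means T in the first and Nullable<T> in the second.

```csharp
#nullable enable
using System;

public static class ConstraintDemo
{
    // With 'where T : class', T? is a nullable reference type.
    public static T? FirstOrNull<T>(T[] items) where T : class
        => items.Length > 0 ? items[0] : null;

    // With 'where T : struct', T? is Nullable<T>.
    public static T? FirstOrNullValue<T>(T[] items) where T : struct
        => items.Length > 0 ? items[0] : (T?)null;

    public static void Main()
    {
        Console.WriteLine(FirstOrNull(new string[0]) is null);      // True
        Console.WriteLine(FirstOrNull(new[] { "a" }));               // a
        Console.WriteLine(FirstOrNullValue(new int[0]).HasValue);   // False
        Console.WriteLine(FirstOrNullValue(new[] { 7 }) == 7);      // True
    }
}
```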
typeof operator
The typeof operator cannot be used on a nullable reference type. This topic is
similar to is pattern matching and, for example, typeof(string?) results in a
compilation error.
In contrast, typeof(int?) returns the Type object representing Nullable<int>.
With the results of object.GetType(), to summarize:
// 'Console.WriteLine(typeof(string?));'
// results in a compilation error.
// System.Nullable`1[System.Int32]
Console.WriteLine(typeof(int?));
string? s = "a";
// System.String
Console.WriteLine(s.GetType());
int? i = 1;
// System.Int32
Console.WriteLine(i.GetType());
Again, if you evaluate typeof(T) while specifying R? for the type parameter T,
the result is typeof(R).
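Because nullable value types do survive at run time, they can be detected through reflection. The following sketch uses Nullable.GetUnderlyingType, which returns the underlying type for a closed Nullable<T> and null otherwise. TypeofDemo and IsNullableValueType are hypothetical names.

```csharp
using System;

public static class TypeofDemo
{
    // True when 'type' is a closed Nullable<T>, such as typeof(int?).
    public static bool IsNullableValueType(Type type)
        => Nullable.GetUnderlyingType(type) != null;

    public static void Main()
    {
        Console.WriteLine(IsNullableValueType(typeof(int?)));   // True
        Console.WriteLine(IsNullableValueType(typeof(int)));    // False
        // There is no run-time counterpart for 'string?'.
        Console.WriteLine(IsNullableValueType(typeof(string))); // False
        Console.WriteLine(Nullable.GetUnderlyingType(typeof(int?))); // System.Int32
    }
}
```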