How is null + true a string?

Since true is not a string type, how is null + true a string?

string s = true;  //Cannot implicitly convert type 'bool' to 'string'
bool b = null + true; //Cannot implicitly convert type 'string' to 'bool'

What is the reason behind this?


The reason for this is convenience (concatenating strings is a common task).

As BoltClock said, the '+' operator is defined on numeric types, strings, and can be defined for our own types as well (operator overloading).

If neither operand's type provides an overloaded '+' operator and the operands are not numeric types, the compiler falls back to the predefined string concatenation operators, provided one of them is applicable.

The compiler inserts a call to String.Concat(...) when you concatenate using '+', and the implementation of Concat calls ToString on each object passed into it.
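As a rough illustration of that equivalence, the sketch below compares the '+' form with a direct call to string.Concat. The class and method names (ConcatDemo, ViaOperator, ViaConcat) are made up for this example, and the exact Concat overload the compiler picks can vary between compiler versions; only the observable result is compared here.

```csharp
using System;

public static class ConcatDemo
{
    // Concatenation written with '+': the left operand is a null string,
    // the right operand is a boxed bool.
    public static string ViaOperator()
    {
        string prefix = null;
        object flag = true;
        return prefix + flag;
    }

    // Roughly what the '+' form amounts to: Concat treats a null argument
    // as an empty string and calls ToString on the non-null one.
    public static string ViaConcat()
    {
        object flag = true;
        return string.Concat((object)null, flag);
    }

    public static void Main()
    {
        Console.WriteLine(ViaOperator());                 // True
        Console.WriteLine(ViaConcat());                   // True
        Console.WriteLine(ViaOperator() == ViaConcat());  // True
    }
}
```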

The reason is that once you introduce the +, the C# operator binding rules come into play. The compiler considers the set of + operators available and selects the best overload. One of those operators is the following:

string operator +(string x, object y)

This overload is compatible with the argument types in the expression null + true. Hence it is selected as the operator and is evaluated as essentially ((string)null) + true which evaluates to the value "True".

Section 7.7.4 of the C# language spec contains the details of this resolution.
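A minimal sketch of that binding in action, assuming nothing beyond the standard library. The names NullPlusTrueDemo and Evaluate are hypothetical; using var lets the compiler choose the expression type, which can then be inspected at run time.

```csharp
using System;

public static class NullPlusTrueDemo
{
    // 'var' takes whatever type the compiler infers for null + true.
    public static object Evaluate()
    {
        var result = null + true; // binds to string operator +(string x, object y)
        return result;
    }

    public static void Main()
    {
        object value = Evaluate();
        Console.WriteLine(value.GetType()); // System.String
        Console.WriteLine(value);           // True
    }
}
```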

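null is converted to a null string, and true is implicitly converted (boxed) to object, not to string; there is no implicit conversion from bool to string. The + operator is then applied, and the concatenation calls ToString on the boxed value, so the result is the same as: string str = "" + true.ToString();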

If you check it with Ildasm:

string str = null + true;

it's as below:

.locals init ([0] string str)
IL_0000:  nop
IL_0001:  ldc.i4.1
IL_0002:  box        [mscorlib]System.Boolean
IL_0007:  call       string [mscorlib]System.String::Concat(object)
IL_000c:  stloc.0

var b = (null + DateTime.Now); // String
var b = (null + 1);            // System.Nullable<Int32> | same with System.Single, System.Double, System.Decimal, System.TimeSpan etc
var b = (null + new Object()); // String | same with any ref type

Crazy?? No, there must be a reason behind it.

Someone call Eric Lippert...

The compiler goes out hunting for an operator + that can take a null argument first. None of the non-lifted operators on the standard value types qualify, since null is not a valid value for them. The one and only applicable match is the predefined string concatenation operator, so there's no ambiguity.

The second parameter of that operator is object, so true is boxed and the result is a string. What goes kapooey is the assignment: cannot implicitly convert type 'string' to 'bool'.

Interestingly, using Reflector to inspect what is generated, the following code:

string b = null + true;
Console.WriteLine(b);

is transformed into this by the compiler:

Console.WriteLine(true);

The reasoning behind this "optimization" is a bit weird I must say, and does not rhyme with the operator selection I would expect.

Also, the following code:

var b = null + true;
var sb = new StringBuilder(b);

is transformed into

string b = true;
StringBuilder sb = new StringBuilder(b);

where string b = true; is actually not accepted by the compiler.

Bizarre as this may seem, it's simply following the rules from the C# language spec.

From section 7.3.4:

An operation of the form x op y, where op is an overloadable binary operator, x is an expression of type X, and y is an expression of type Y, is processed as follows:

  • The set of candidate user-defined operators provided by X and Y for the operation operator op(x, y) is determined. The set consists of the union of the candidate operators provided by X and the candidate operators provided by Y, each determined using the rules of §7.3.5. If X and Y are the same type, or if X and Y are derived from a common base type, then shared candidate operators only occur in the combined set once.
  • If the set of candidate user-defined operators is not empty, then this becomes the set of candidate operators for the operation. Otherwise, the predefined binary operator op implementations, including their lifted forms, become the set of candidate operators for the operation. The predefined implementations of a given operator are specified in the description of the operator (§7.8 through §7.12).
  • The overload resolution rules of §7.5.3 are applied to the set of candidate operators to select the best operator with respect to the argument list (x, y), and this operator becomes the result of the overload resolution process. If overload resolution fails to select a single best operator, a binding-time error occurs.

So, let's walk through this in turn.

X is the null type here - or not a type at all, if you want to think of it that way. It's not providing any candidates. Y is bool, which doesn't provide any user-defined + operators. So the first step finds no user-defined operators.

The compiler then moves on to the second bullet point, looking through the predefined binary operator + implementations and their lifted forms. These are listed in section 7.8.4 of the spec.

If you look through those predefined operators, the only one which is applicable is string operator +(string x, object y). So the candidate set has a single entry. That makes the final bullet point very simple... overload resolution picks that operator, giving an overall expression type of string.

One interesting point is that this will occur even if there are other user-defined operators available on unmentioned types. For example:

// Foo defines Foo operator+(Foo foo, bool b)
Foo f = null;
Foo g = f + true;

That's fine, but it's not used for a null literal, because the compiler doesn't know to look in Foo. It only knows to consider string because it's a predefined operator explicitly listed in the spec. (In fact, it's not an operator defined by the string type... [1]) That means that this will fail to compile:

// Error: Cannot implicitly convert type 'string' to 'Foo'
Foo f = null + true;
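To make the Foo case concrete, here is one possible self-contained version. The Label property and the way the operator builds its result are invented for this sketch; the original only says that Foo defines Foo operator+(Foo foo, bool b).

```csharp
using System;

// Hypothetical Foo type: only the operator signature comes from the text above.
public sealed class Foo
{
    public string Label { get; }

    public Foo(string label) { Label = label; }

    // User-defined operator, found only when an operand is statically typed as Foo.
    public static Foo operator +(Foo foo, bool b)
    {
        return new Foo((foo?.Label ?? "null") + "+" + b);
    }
}

public static class FooDemo
{
    public static void Main()
    {
        Foo f = null;
        Foo g = f + true;            // X is Foo, so Foo.operator+ is a candidate
        Console.WriteLine(g.Label);  // null+True

        string s = null + true;      // null literal: only predefined operators are considered
        Console.WriteLine(s);        // True

        // Foo h = null + true;      // error CS0029: cannot implicitly convert 'string' to 'Foo'
    }
}
```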

Other second-operand types will use some other operators, of course:

var x = null + 0; // x is Nullable<int>
var y = null + 0L; // y is Nullable<long>
var z = null + DayOfWeek.Sunday; // z is Nullable<DayOfWeek>
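A small sketch to check those inferred types, under the assumption that a generic helper is an acceptable way to capture the static type of an expression. StaticTypeOf and LiftedOperatorDemo are hypothetical names. The pragma silences warning CS0458, which the compiler raises because the result of these expressions is always null.

```csharp
using System;

public static class LiftedOperatorDemo
{
    // Captures the compile-time type of the argument expression via type inference.
    public static Type StaticTypeOf<T>(T value)
    {
        return typeof(T);
    }

#pragma warning disable CS0458 // result of the expression is always 'null'
    public static Type TypeOfNullPlusInt()  { return StaticTypeOf(null + 0); }
    public static Type TypeOfNullPlusLong() { return StaticTypeOf(null + 0L); }

    public static bool NullPlusIntHasValue()
    {
        int? x = null + 0; // lifted int? operator +(int? x, int? y)
        return x.HasValue;
    }
#pragma warning restore CS0458

    public static void Main()
    {
        Console.WriteLine(TypeOfNullPlusInt() == typeof(int?));   // True
        Console.WriteLine(TypeOfNullPlusLong() == typeof(long?)); // True
        Console.WriteLine(NullPlusIntHasValue());                 // False
    }
}
```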

[1] You may be wondering why there isn't a string + operator. It's a reasonable question, and I'm only guessing at the answer, but consider this expression:

string x = a + b + c + d;

If string had no special-casing in the C# compiler, this would end up as effectively:

string tmp0 = (a + b);
string tmp1 = tmp0 + c;
string x = tmp1 + d;

So that's created two unnecessary intermediate strings. However, because there's special support within the compiler, it's actually able to compile the above as:

string x = string.Concat(a, b, c, d);

which can create just a single string of exactly the right length, copying all the data exactly once. Nice.
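For comparison, the two spellings can be placed side by side as below. This sketch only shows that they produce the same string; it does not demonstrate what the compiler emits, which would need a look at the generated IL. The names ChainedConcatDemo, JoinWithOperator, and JoinWithConcat are made up for the example.

```csharp
using System;

public static class ChainedConcatDemo
{
    // Chained '+', as written in source.
    public static string JoinWithOperator(string a, string b, string c, string d)
    {
        return a + b + c + d;
    }

    // The single-call form described above.
    public static string JoinWithConcat(string a, string b, string c, string d)
    {
        return string.Concat(a, b, c, d);
    }

    public static void Main()
    {
        string viaOperator = JoinWithOperator("a", "b", "c", "d");
        string viaConcat = JoinWithConcat("a", "b", "c", "d");
        Console.WriteLine(viaOperator);              // abcd
        Console.WriteLine(viaOperator == viaConcat); // True
    }
}
```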