JavaScript string concatenation behavior with null or undefined values

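As you may know, in JavaScript '' + null = "null" and '' + undefined = "undefined" (in most browsers I can test: Firefox, Chrome and IE). I would like to know the origin of this oddity (what the heck was in Brendan Eich's head?!) and whether there is any aim to change it in a future version of ECMA. It is indeed pretty frustrating to have to write 'sthg' + (var || '') to concatenate strings with variables, and using a third-party framework such as Underscore for this is like using a hammer to pound jelly nails.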

Edit:

To meet the criteria required by StackOverflow and to clarify my question, it is threefold:

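  • What is the history behind the oddity that makes JS convert null or undefined to their string value in String concatenation?
  • Is there any chance of a change in this behavior in future ECMAScript versions?
  • What is the prettiest way to concatenate a String with a potentially null or undefined object without falling into this problem (getting some "null" or "undefined" in the middle of the String)? By the subjective criterion "prettiest", I mean: short, clean and effective. Needless to say, '' + (obj ? obj : '') is not really pretty...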

To add null and '', the two operands need to be brought to a common type, which in this case is the string type.

For this reason null is converted to "null", and since both operands are then strings, the two are concatenated.

The same happens with numbers:

4 + '' = '4'

Because one of the operands is a string, + performs concatenation, so the 4 is converted to a string rather than the '' being converted to a number.
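
A few cases you can paste into a console make the rule concrete. This is only an illustrative sketch, and the variable names are invented for the example.

```javascript
// When either operand is a string, + converts the other side to a string and concatenates.
const numberPlusString = 4 + '';            // '4'
const nullPlusString = null + '';           // 'null'
const undefinedPlusString = undefined + ''; // 'undefined'

// With no string involved, + converts both sides to numbers instead.
const nullPlusNumber = null + 1;            // 1, because Number(null) is 0
const undefinedPlusNumber = undefined + 1;  // NaN, because Number(undefined) is NaN

console.log(numberPlusString, nullPlusString, undefinedPlusString);
console.log(nullPlusNumber, undefinedPlusNumber);
```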

You can use Array.prototype.join to ignore undefined and null:

['a', 'b', void 0, null, 6].join(''); // 'ab6'

According to the spec:

If element is undefined or null, Let next be the empty String; otherwise, let next be ToString(element).
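
To make the quoted step concrete, here is a rough sketch of a join written by hand. The function name joinSkippingNullish is hypothetical, and the sketch assumes a plain array of primitives; it is meant to illustrate the spec wording, not to replace Array.prototype.join.

```javascript
// Hypothetical helper that mirrors the quoted spec step for plain arrays.
function joinSkippingNullish(elements, separator = ',') {
  let result = '';
  for (let i = 0; i < elements.length; i++) {
    // The separator is emitted between every pair of elements, even skipped ones.
    if (i > 0) result += separator;
    const element = elements[i];
    // "If element is undefined or null, let next be the empty String;
    //  otherwise, let next be ToString(element)."
    const next = (element === undefined || element === null) ? '' : String(element);
    result += next;
  }
  return result;
}

console.log(joinSkippingNullish(['a', 'b', void 0, null, 6], '')); // 'ab6'
console.log(joinSkippingNullish(['a', null, 'b'], '-'));           // 'a--b'
```

Note that with a non-empty separator the skipped elements still leave their separators behind, which is also what the built-in join does.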


Given that,

  • What is the history behind the oddity that makes JS convert null or undefined to their string value in String concatenation?

    In fact, in some cases, the current behavior makes sense.

    function showSum(a, b) {
        alert(a + ' + ' + b + ' = ' + (+a + +b));
    }

    For example, if the function above is called without arguments, displaying "undefined + undefined = NaN" is probably better than displaying " +  = NaN".

    In general, I think that if you want to insert some variables in a string, displaying undefined or null makes sense. Probably, Eich thought that too.

    Of course, there are cases in which ignoring those would be better, such as when joining strings together. But for those cases you can use Array.prototype.join.

  • Is there any chance for a change in this behavior in future ECMAScript versions?

    Most likely not.

    Since Array.prototype.join already exists, modifying the behavior of string concatenation would bring disadvantages and no advantages. Moreover, it would break old code, so it wouldn't be backwards compatible.

  • What is the prettiest way to concatenate String with potential null or undefined?

    Array.prototype.join seems to be the simplest one. Whether it's the prettiest or not may be opinion-based.

Is there any chance for a change in this behavior in future ECMAScript versions?

I would say the chances are very slim. And there are several reasons:

We already know what ES5 and ES6 look like

The future ES versions are already done or in draft. Neither one, afaik, changes this behavior. And the thing to keep in mind here is that it will take years for these standards to be established in browsers in the sense that you can write applications with these standards without relying on proxy tools that compile it to actual Javascript.

Just try to estimate the duration. Not even ES5 is fully supported by the majority of browsers out there, and it will probably take another few years. ES6 is not even fully specified yet. As a rough guess, we are looking at at least another five years.

Browsers do their own things

Browsers are known to make their own decisions on certain topics. You don't know whether all browsers will fully support this feature in exactly the same way. Of course you would know once it is part of the standard, but as of now, even if it was announced to become part of ES7, it would only be speculation at best.

And browsers may make their own decision here especially because:

This change is breaking

One of the biggest things about standards is that they usually try to be backwards compatible. This is especially true for the web, where the same code has to run on all kinds of environments.

If the standard introduces a new feature and it's not supported in old browsers, that's one thing. Tell your client to update their browser to use the site. But if you update your browser and suddenly half the internet breaks for you, that's a big no-no.

Sure, this particular change is unlikely to break a lot of scripts. But that's usually a poor argument, because a standard is universal and has to take every case into account. Just consider

"use strict";

as the instruction to switch to strict mode. It goes to show how much effort a standard puts into trying to keep everything compatible, because they could have made strict mode the default (or even the only) mode. But with this clever instruction, old code runs without a change and new code can still take advantage of the stricter mode.

Another example of backwards compatibility: the === operator. == is fundamentally flawed (though some people disagree), and the standard could have just changed its meaning. Instead, === was introduced, which lets old code run without breaking while allowing new programs to use a stricter check.

And for a standard to break compatibility, there has to be a very good reason. Which brings us to

There is just no good reason

Yes, it bugs you. That's understandable. But ultimately, it is nothing that can't be solved very easily. Use ||, write a function, whatever. You can make it work at almost no cost. So what is really the benefit of investing all the time and effort into analyzing a change which we know is breaking anyway? I just don't see the point.

JavaScript has several weak points in its design, and they have become a bigger issue as the language has become more important and powerful. But while there are very good reasons to change a lot of its design, other things just aren't meant to be changed.


Disclaimer: This answer is partly opinion-based.

What is the prettiest way to concatenate String with potential null or undefined object without falling into this problem [...]?

There are several ways, and you partly mentioned them yourself. To make it short, the only clean way I can think of is a function:

const Strings = {};
Strings.orEmpty = function (entity) {
    return entity || "";
};

// usage
const message = "This is a " + Strings.orEmpty(test);

Of course, you can (and should) change the actual implementation to suit your needs. And this is exactly why I think this method is superior: it introduces encapsulation.

Really, you only have to ask what the "prettiest" way is if you don't have encapsulation. You ask yourself this question because you already know that you are going to get yourself into a place where you cannot change the implementation anymore, so you want it to be perfect right away. But that's the thing: requirements, views and even environments change. They evolve. So why not allow yourself to change the implementation with as little as adapting one line and perhaps one or two tests?

You could call this cheating, because it doesn't really answer how to implement the actual logic. But that's my point: it doesn't matter. Well, maybe a little. But really, there is no need to worry because of how simple it would be to change. And since it's not inlined, it also looks a lot prettier – whether or not you implement it this way or in a more sophisticated way.

If, throughout your code, you keep repeating the || inline, you run into two problems:

  • You duplicate code.
  • And because you duplicate code, you make it hard to maintain and change in the future.

And these are two points commonly known to be anti-patterns when it comes to high-quality software development.

Some people will say that this is too much overhead; they will talk about performance. That's nonsense. For one, this barely adds overhead. If this is what you are worried about, you chose the wrong language. Even jQuery uses functions. People need to get over micro-optimization.

The other thing is: you can use a code "compiler" (that is, a minifier). Good tools in this area will try to detect which statements to inline during the compilation step. This way, you keep your code clean and maintainable and can still get that last drop of performance if you still believe in it or really do have an environment where this matters.

Lastly, have some faith in browsers. They will optimize code and they do a pretty darn good job at it these days.

The ECMA Specification

Just to flesh out the reason it behaves this way in terms of the spec: this behavior has been present since version one. The definitions there and in 5.1 are semantically equivalent, so I'll show the 5.1 definitions.

Section 11.6.1: The Addition operator ( + )

The addition operator either performs string concatenation or numeric addition.

The production AdditiveExpression : AdditiveExpression + MultiplicativeExpression is evaluated as follows:

  1. Let lref be the result of evaluating AdditiveExpression.
  2. Let lval be GetValue(lref).
  3. Let rref be the result of evaluating MultiplicativeExpression.
  4. Let rval be GetValue(rref).
  5. Let lprim be ToPrimitive(lval).
  6. Let rprim be ToPrimitive(rval).
  7. If Type(lprim) is String or Type(rprim) is String, then
    a. Return the String that is the result of concatenating ToString(lprim) followed by ToString(rprim)
  8. Return the result of applying the addition operation to ToNumber(lprim) and ToNumber(rprim). See the Note below 11.6.3.

So, if either value ends up being a String, then ToString is used on both arguments (step 7) and the results are concatenated (step 7a). ToPrimitive returns all non-object values unchanged, so null and undefined are untouched:

Section 9.1 ToPrimitive

The abstract operation ToPrimitive takes an input argument and an optional argument PreferredType. The abstract operation ToPrimitive converts its input argument to a non-Object type ... Conversion occurs according to Table 10:

For all non-Object types, including both Null and Undefined, [t]he result equals the input argument (no conversion). So ToPrimitive does nothing here.

Finally, Section 9.8 ToString

The abstract operation ToString converts its argument to a value of type String according to Table 13:

Table 13 gives "undefined" for the Undefined type and "null" for the Null type.
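
The quoted steps can be approximated in a few lines. This is a simplified sketch, not the engine's implementation: the names toPrimitiveSketch and addSketch are hypothetical, only the primitive types covered by ES5.1 are handled (objects are rejected instead of being converted), and the built-in String() and Number() functions stand in for the abstract ToString and ToNumber operations.

```javascript
// Simplified ToPrimitive: non-Object values are returned unchanged.
// Objects and functions are out of scope for this sketch.
function toPrimitiveSketch(value) {
  if (value !== null && (typeof value === 'object' || typeof value === 'function')) {
    throw new TypeError('This sketch only covers primitive operands');
  }
  return value;
}

function addSketch(lval, rval) {
  const lprim = toPrimitiveSketch(lval); // step 5
  const rprim = toPrimitiveSketch(rval); // step 6
  if (typeof lprim === 'string' || typeof rprim === 'string') {
    // step 7a: convert both sides to strings and concatenate
    return String(lprim).concat(String(rprim));
  }
  // step 8: numeric addition on the number conversions
  return Number(lprim) + Number(rprim);
}

console.log(addSketch('', null));             // 'null'
console.log(addSketch(undefined, ''));        // 'undefined'
console.log(addSketch(null, 1));              // 1
console.log(addSketch(undefined, undefined)); // NaN
```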

Will it change? Is it even an "oddity"?

As others have pointed out, this is very unlikely to change, as it would break backward compatibility (and bring no real benefit), even more so given that this behavior has been the same since the 1997 version of the spec. I would also not really consider it an oddity.

If you were to change this behavior, would you change the definition of ToString for null and undefined, or would you special-case the addition operator for these values? ToString is used in many, many places throughout the spec, and "null" seems like an uncontroversial choice for representing null. Just to give a couple of examples, in Java "" + null is the string "null" and in Python str(None) is the string "None".

Workaround

Others have given good workarounds, but I would add that I doubt you want to use entity || "" as your strategy since it resolves true to "true" but false to "". The array join in this answer has the more expected behavior, or you could change the implementation of this answer to check entity == null (both null == null and undefined == null are true).
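
As a sketch of that suggestion, the Strings.orEmpty helper from the earlier answer could be rewritten with an == null check. The sample values below are made up for illustration.

```javascript
const Strings = {};

// Returns "" only for null and undefined; every other value passes through unchanged.
Strings.orEmpty = function (entity) {
  return entity == null ? "" : entity;
};

// Falsy values that are not null or undefined are preserved.
const countMessage = "Count: " + Strings.orEmpty(0);     // 'Count: 0'
const flagMessage = "Flag: " + Strings.orEmpty(false);   // 'Flag: false'
const nameMessage = "Name: " + Strings.orEmpty(null);    // 'Name: '

// For comparison, the || version drops the 0.
const countWithOr = "Count: " + (0 || "");               // 'Count: '

console.log(countMessage, flagMessage, nameMessage, countWithOr);
```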

Why not filter to check for truthiness and then join?

const concatObject = (obj, separator) =>
  Object.values(obj)
    .filter((val) => val)
    .join(separator);

let myAddress = {
  street1: '123 Happy St',
  street2: undefined,
  city: null,
  state: 'DC',
  zip: '20003'
};

concatObject(myAddress, ', ')
> "123 Happy St, DC, 20003"

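One caveat with a truthiness filter is that it also drops 0, false and the empty string. If only null and undefined should be skipped, a variant along these lines may be closer to what you want; concatDefined and the order object are hypothetical names used only for this sketch.

```javascript
// Keeps every value except null and undefined (val != null is false for both).
const concatDefined = (obj, separator) =>
  Object.values(obj)
    .filter((val) => val != null)
    .join(separator);

const order = { item: 'Widget', quantity: 0, note: undefined, coupon: null };

const keptZero = concatDefined(order, ', ');
const truthyOnly = Object.values(order).filter((val) => val).join(', ');

console.log(keptZero);   // 'Widget, 0'
console.log(truthyOnly); // 'Widget'
```

An empty string survives this filter, so it still contributes a separator to the result.
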
Use the nullish coalescing operator; its syntax is ??

Examples you can test in your browser console:

  • "Hello " + (null ?? "")
  • "Hello " + (undefined ?? "")

Both will yield: 'Hello '
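
Two details are worth checking when you try this, shown in the sketch below with made-up variable names. The parentheses matter because + binds more tightly than ??, and unlike ||, the ?? operator keeps falsy values such as 0. The operator was added in ES2020, so this assumes a reasonably recent browser or Node.js version.

```javascript
// With parentheses, the fallback applies to the null value itself.
const withParens = "Hello " + (null ?? "");   // 'Hello '

// Without them, the concatenation happens first and the result is never nullish.
const withoutParens = "Hello " + null ?? "";  // 'Hello null'

// ?? only replaces null and undefined, whereas || replaces every falsy value.
const zeroKept = "Count: " + (0 ?? "");       // 'Count: 0'
const zeroDropped = "Count: " + (0 || "");    // 'Count: '

console.log(withParens, withoutParens, zeroKept, zeroDropped);
```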