JavaScript 中对象/数组的性能如何? (特别是针对 Google V8)

在 JavaScript (特别是 GoogleV8)中与数组和对象相关的性能将非常有趣。我在互联网上找不到任何关于这个主题的综合性文章。

我知道有些对象使用类作为它们的底层数据结构。如果有很多属性,它有时会被视为一个哈希表?

我也理解有时数组被当作 C + + 数组(即快速随机索引,慢速删除和调整大小)。而且,在其他时候,它们更像对象(快速索引、快速插入/删除、更多内存)。而且,也许有时候它们被存储为链表(即缓慢的随机索引,快速删除/插入在开始/结束)

在 JavaScript 中,Array/Object 检索和操作的精确性能如何? (特别是针对 Google V8)

更具体地说,它对性能的影响是什么:

  • 向对象添加属性
  • 从 Object 中移除属性
  • 对象中的属性进行索引
  • 向 Array 添加项
  • 从 Array 中移除项
  • 为数组中的项建立索引
  • 调用 Array.pop ()
  • 调用 Array.push ()
  • 调用 Array.shift ()
  • 调用 Array.unshift ()
  • 调用 Array.slice ()

任何文章或链接,以获得更多的细节,也将不胜感激。 :)

编辑: 我真的很好奇 JavaScript 数组和对象是如何在底层工作的。另外,在 背景中,V8引擎“知道”到“切换”到另一个数据结构?

例如,假设我创建一个数组,其中包含..。

var arr = [];
arr[10000000] = 20;
arr.push(21);

这到底是怎么回事?

或者... 这个怎么样?

var arr = [];
//Add lots of items
for(var i = 0; i < 1000000; i++)
arr[i] = Math.random();
//Now I use it like a queue...
for(var i = 0; i < arr.length; i++)
{
var item = arr[i].shift();
//Do something with item...
}

对于传统的数组,性能会非常糟糕; 然而,如果使用 LinkedList... ... 也没有那么糟糕。

47366 次浏览

At a basic level that stays within the realms of JavaScript, properties on objects are much more complex entities. You can create properties with setters/getters, with differing enumerability, writability, and configurability. An item in an array isn't able to be customized in this way: it either exists or it doesn't. At the underlying engine level this allows for a lot more optimization in terms of organizing the memory that represents the structure.

In terms of identifying an array from an object (dictionary), JS engines have always made explicit lines between the two. That's why there's a multitude of articles on methods of trying to make a semi-fake Array-like object that behaves like one but allows other functionality. The reason this separation even exists is because the JS engines themselves store the two differently.

Properties can be stored on an array object but this simply demonstrates how JavaScript insists on making everything an object. The indexed values in an array are stored differently from any properties you decide to set on the array object that represents the underlying array data.

Whenever you're using a legit array object and using one of the standard methods of manipulating that array you're going to be hitting the underlying array data. In V8 specifically, these are essentially the same as a C++ array so those rules will apply. If for some reason you're working with an array that the engine isn't able to determine with confidence is an array, then you're on much shakier ground. With recent versions of V8 there's more room to work though. For example, it's possible to create a class that has Array.prototype as its prototype and still gain efficient access to the various native array manipulation methods. But this is a recent change.

Specific links to recent changes to array manipulation may come in handy here:

As a bit of extra, here's Array Pop and Array Push directly from V8's source, both implemented in JS itself:

function ArrayPop() {
if (IS_NULL_OR_UNDEFINED(this) && !IS_UNDETECTABLE(this)) {
throw MakeTypeError("called_on_null_or_undefined",
["Array.prototype.pop"]);
}


var n = TO_UINT32(this.length);
if (n == 0) {
this.length = n;
return;
}
n--;
var value = this[n];
this.length = n;
delete this[n];
return value;
}




function ArrayPush() {
if (IS_NULL_OR_UNDEFINED(this) && !IS_UNDETECTABLE(this)) {
throw MakeTypeError("called_on_null_or_undefined",
["Array.prototype.push"]);
}


var n = TO_UINT32(this.length);
var m = %_ArgumentsLength();
for (var i = 0; i < m; i++) {
this[i+n] = %_Arguments(i);
}
this.length = n + m;
return this.length;
}

I created a test suite, precisely to explore these issues (and more) (archived copy).

And in that sense, you can see the performance issues in this 50+ test case tester (it will take a long time).

Also as its name suggest, it explores the usage of using the native linked list nature of the DOM structure.

(Currently down, rebuilt in progress) More details on my blog regarding this.

The summary is as followed

  • V8 Array is Fast, VERY FAST
  • Array push / pop / shift is ~approx 20x+ faster than any object equivalent.
  • Surprisingly Array.shift() is fast ~approx 6x slower than an array pop, but is ~approx 100x faster than an object attribute deletion.
  • Amusingly, Array.push( data ); is faster than Array[nextIndex] = data by almost 20 (dynamic array) to 10 (fixed array) times over.
  • Array.unshift(data) is slower as expected, and is ~approx 5x slower than a new property adding.
  • Nulling the value array[index] = null is faster than deleting it delete array[index] (undefined) in an array by ~approx 4x++ faster.
  • Surprisingly Nulling a value in an object is obj[attr] = null ~approx 2x slower than just deleting the attribute delete obj[attr]
  • Unsurprisingly, mid array Array.splice(index,0,data) is slow, very slow.
  • Surprisingly, Array.splice(index,1,data) has been optimized (no length change) and is 100x faster than just splice Array.splice(index,0,data)
  • unsurprisingly, the divLinkedList is inferior to an array on all sectors, except dll.splice(index,1) removal (Where it broke the test system).
  • BIGGEST SURPRISE of it all [as jjrv pointed out], V8 array writes are slightly faster than V8 reads =O

Note: These metrics applies only to large array/objects which v8 does not "entirely optimise out". There can be very isolated optimised performance cases for array/object size less then an arbitrary size (24?). More details can be seen extensively across several google IO videos.

Note 2: These wonderful performance results are not shared across browsers, especially *cough* IE. Also the test is huge, hence I yet to fully analyze and evaluate the results : please edit it in =)

Updated Note (dec 2012): Google representatives have videos on youtubes describing the inner workings of chrome itself (like when it switches from a linkedlist array to a fixed array, etc), and how to optimize them. See GDC 2012: From Console to Chrome for more.

While running under node.js 0.10 (built on v8) I was seeing CPU usage that seemed excessive for the workload. I traced one performance problem to a function that was checking for the existence of a string in an array. So I ran some tests.

  • loaded 90,822 hosts
  • loading config took 0.087 seconds (array)
  • loading config took 0.152 seconds (object)

Loading 91k entries into an array (with validate & push) is faster than setting obj[key]=value.

In the next test, I looked up every hostname in the list one time (91k iterations, to average the lookup time):

  • searching config took 87.56 seconds (array)
  • searching config took 0.21 seconds (object)

The application here is Haraka (a SMTP server) and it loads the host_list once at startup (and after changes) and subsequently performs this lookup millions of times during operation. Switching to an object was a huge performance win.

I'd like to complement existing answers with an investigation to the question of how implementations behave regarding growing arrays: If they implement them the "usual" way, one would see many quick pushes with rare, interspersed slow pushes at which point the implementation copies the internal representation of the array from one buffer to a larger one.

You can see this effect very nicely, this is from Chrome:

16: 4ms
40: 8ms 2.5
76: 20ms 1.9
130: 31ms 1.7105263157894737
211: 14ms 1.623076923076923
332: 55ms 1.5734597156398105
514: 44ms 1.5481927710843373
787: 61ms 1.5311284046692606
1196: 138ms 1.5196950444726811
1810: 139ms 1.5133779264214047
2731: 299ms 1.5088397790055248
4112: 341ms 1.5056755767118273
6184: 681ms 1.5038910505836576
9292: 1324ms 1.5025873221216042

Even though each push is profiled, the output contains only those that take time above a certain threshold. For each test I customized the threshold to exclude all the pushes that appear to be representing the fast pushes.

So the first number represents which element has been inserted (the first line is for the 17th element), the second is how long it took (for many arrays the benchmark is done for in parallel), and the last value is the division of the first number by that of the one in the former line.

All lines that have less than 2ms execution time are excluded for Chrome.

You can see that Chrome increases array size in powers of 1.5, plus some offset to account for small arrays.

For Firefox, it's a power of two:

126: 284ms
254: 65ms 2.015873015873016
510: 28ms 2.0078740157480315
1022: 58ms 2.003921568627451
2046: 89ms 2.0019569471624266
4094: 191ms 2.0009775171065494
8190: 364ms 2.0004885197850513

I had to put the threshold up quite a bit in Firefox, that's why we start at #126.

With IE, we get a mix:

256: 11ms 256
512: 26ms 2
1024: 77ms 2
1708: 113ms 1.66796875
2848: 154ms 1.6674473067915691
4748: 423ms 1.6671348314606742
7916: 944ms 1.6672283066554338

It's a power of two at first and then it moves to powers of five thirds.

So all common implementations use the "normal" way for arrays (instead of going crazy with ropes, for example).

Here's the benchmark code and here's the fiddle it's in.

var arrayCount = 10000;


var dynamicArrays = [];


for(var j=0;j<arrayCount;j++)
dynamicArrays[j] = [];


var lastLongI = 1;


for(var i=0;i<10000;i++)
{
var before = Date.now();
for(var j=0;j<arrayCount;j++)
dynamicArrays[j][i] = i;
var span = Date.now() - before;
if (span > 10)
{
console.log(i + ": " + span + "ms" + " " + (i / lastLongI));
lastLongI = i;
}
}