为什么缓存区域性对数组性能很重要?

在下面的 博客中,有一个关于数组优于链表的陈述:

数组具有更好的缓存区域性,可以在性能上产生相当大的差异。

这是什么意思? 我不明白缓存本地化如何能提供巨大的性能优势。

35550 次浏览

See my answer about spatial and temporal locality.

In particular, arrays are contiguous memory blocks, so large chunks of them will be loaded into the cache upon first access. This makes it comparatively quick to access future elements of the array. Linked lists on the other hand aren't necessarily in contiguous blocks of memory, and could lead to more cache misses, which increases the time it takes to access them.

Consider the following possible memory layouts for an array data and linked list l_data of large structs

Address      Contents       | Address      Contents
ffff 0000    data[0]        | ffff 1000    l_data
ffff 0040    data[1]        |   ....
ffff 0080    data[2]        | ffff 3460    l_data->next
ffff 00c0    data[3]        |   ....
ffff 0100    data[4]        | ffff 8dc0    l_data->next->next
| ffff 8e00    l_data->next->next->next
|   ....
| ffff 8f00    l_data->next->next->next->next

If we wanted to loop through this array, the first access to ffff 0000 would require us to go to memory to retrieve (a very slow operation in CPU cycles). However, after the first access the rest of the array would be in the cache, and subsequent accesses would be much quicker. With the linked list, the first access to ffff 1000 would also require us to go to memory. Unfortunately, the processor will cache the memory directly surrounding this location, say all the way up to ffff 2000. As you can see, this doesn't actually capture any of the other elements of the list, which means that when we go to access l_data->next, we will again have to go to memory.

Typically, when using an array you access items that are near each other. This is especially true when accessing an array sequentially.

When you access memory, a chunks of it are cached at various levels. Cache locality refers to the likelihood of successive operations being in the cache and thus being faster. In an array, you maximize the chances of sequential element access being in the cache.

With lists, by counter-example, there's no guarantee that items which appear sequentially in the list are actually arranged near eachother in memory. This means fewer cache hits, and degraded performance.