为什么 Python 列表在排序时要慢一些？

小开

最佳答案

Cache misses. When N int objects are allocated back-to-back, the memory reserved to hold them tends to be in a contiguous chunk. So crawling over the list in allocation order tends to access the memory holding the ints' values in sequential, contiguous, increasing order too.

Shuffle it, and the access pattern when crawling over the list is randomized too. Cache misses abound, provided there are enough different int objects that they don't all fit in cache.

At r==1, and r==2, CPython happens to treat such small ints as singletons, so, e.g., despite that you have 10 million elements in the list, at r==2 it contains only (at most) 100 distinct int objects. All the data for those fit in cache simultaneously.

Beyond that, though, you're likely to get more, and more, and more distinct int objects. Hardware caches become increasingly useless then when the access pattern is random.

Illustrating:

>>> from random import randint, seed
>>> seed(987987987)
>>> for x in range(1, 9):
...     r = 10 ** x
...     js = [randint(1, r) for _ in range(10_000_000)]
...     unique = set(map(id, js))
...     print(f"{r:12,} {len(unique):12,}")
...
10           10
100          100
1,000    7,440,909
10,000    9,744,400
100,000    9,974,838
1,000,000    9,997,739
10,000,000    9,999,908
100,000,000    9,999,998

小开

The answer is likely locality of data. Integers above a certain size limit are allocated dynamically. When you create the list, the integer objects are allocated from (mostly) nearby memory. So when you loop through the list, things tend to be in cache and the hardware prefetcher can put them there.

In the sorted case, the objects get shuffled around, resulting in more cache misses.

小开

As the others said, cache misses. Not the values/sortedness. The same sorted values, but with freshly sequentially created objects, is fast again (actually even a bit faster than the not case):

s_new = [--x for x in s_yes]

Just picking one size:

For rand 1000000
yes 3.6270992755889893
not 1.198620080947876
new 1.02010178565979

Looking at address differences from one element to the next (just 10⁶ elements) shows that especially for s_new, the elements are nicely sequentially arranged in memory (99.2% of the time the next element came 32 bytes later), while for s_yes they're totally not (just 0.01% came 32 bytes later):

s_yes:
741022 different address differences occurred. Top 5:
Address difference 32 occurred 102 times.
Address difference 0 occurred 90 times.
Address difference 64 occurred 37 times.
Address difference 96 occurred 17 times.
Address difference 128 occurred 9 times.


s_not:
1048 different address differences occurred. Top 5:
Address difference 32 occurred 906649 times.
Address difference 96 occurred 8931 times.
Address difference 64 occurred 1845 times.
Address difference -32 occurred 1816 times.
Address difference -64 occurred 1812 times.


s_new:
19 different address differences occurred. Top 5:
Address difference 32 occurred 991911 times.
Address difference 96 occurred 7825 times.
Address difference -524192 occurred 117 times.
Address difference 0 occurred 90 times.
Address difference 64 occurred 37 times.

Code for that:

from collections import Counter


for s in 's_yes', 's_not', 's_new':
print(s + ':')
ids = list(map(id, eval(s)))
ctr = Counter(j - i for i, j in zip(ids, ids[1:]))
print('   ', len(ctr), 'different address differences occurred. Top 5:')
for delta, count in ctr.most_common(5):
print(f'    Address difference {delta} occurred {count} times.')
print()