How to find the cumulative sum of numbers in a list?

time_interval = [4, 6, 12]

I want to sum up the numbers like [4, 4+6, 4+6+12] in order to get the list t = [4, 10, 22].

I tried the following:

t1 = time_interval[0]
t2 = time_interval[1] + t1
t3 = time_interval[2] + t2
print(t1, t2, t3)  # -> 4 10 22

In Python 2, you can define your own generator function like this:

def accumu(lis):
    total = 0
    for x in lis:
        total += x
        yield total


In [4]: list(accumu([4,6,12]))
Out[4]: [4, 10, 22]

In Python 3.2+, you can use itertools.accumulate():

In [1]: lis = [4,6,12]


In [2]: from itertools import accumulate


In [3]: list(accumulate(lis))
Out[3]: [4, 10, 22]
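
As a small sketch beyond the basic call: accumulate also accepts a two-argument function as its second argument (Python 3.3+) and an initial keyword (Python 3.8+). The variable names below are illustrative only and not part of the answer above.

```python
from itertools import accumulate

time_interval = [4, 6, 12]

# Default behaviour: running sum.
running_sum = list(accumulate(time_interval))

# Passing a binary function changes the operation (Python 3.3+).
running_max = list(accumulate([3, 1, 4, 1, 5], max))

# initial= prepends a starting value, so the result has one extra item (Python 3.8+).
with_start = list(accumulate(time_interval, initial=0))

print(running_sum)  # [4, 10, 22]
print(running_max)  # [3, 3, 4, 4, 5]
print(with_start)   # [0, 4, 10, 22]
```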

First, you want a running list of subsequences:

subseqs = (seq[:i] for i in range(1, len(seq)+1))

Then you just call sum on each subsequence:

sums = [sum(subseq) for subseq in subseqs]

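(This isn't the most efficient way to do it, because you're adding all of the prefixes repeatedly. But that probably won't matter for most use cases, and it's easier to understand if you don't have to think about the running totals.)

If you're using Python 3.2 or newer, you can use itertools.accumulate to do it for you: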

sums = itertools.accumulate(seq)

If you're using 3.1 or earlier, you can just copy the "equivalent to" source straight out of the docs (except for changing next(it) to it.next() for 2.5 and earlier).
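
For reference, a pure-Python stand-in might look roughly like the sketch below. It is written from memory in the spirit of the "equivalent to" recipe rather than copied from the docs, and the name accumulate_compat is made up for this example.

```python
import operator


def accumulate_compat(iterable, func=operator.add):
    """Yield running totals; a hypothetical stand-in for itertools.accumulate."""
    it = iter(iterable)
    try:
        total = next(it)  # on Python 2.5 and earlier this would be it.next()
    except StopIteration:
        return  # empty input yields nothing
    yield total
    for element in it:
        total = func(total, element)
        yield total


print(list(accumulate_compat([4, 6, 12])))  # [4, 10, 22]
```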

Try this:

result = []
acc = 0
for i in time_interval:
    acc += i
    result.append(acc)

values = [4, 6, 12]
total  = 0
sums   = []


for v in values:
    total = total + v
    sums.append(total)


print 'Values: ', values
print 'Sums:   ', sums

Running this code gives:

Values: [4, 6, 12]
Sums:   [4, 10, 22]

lst = [4, 6, 12]


[sum(lst[:i+1]) for i in xrange(len(lst))]

If you're looking for a more efficient solution (bigger lists?), a generator could be a good call (or just use numpy if you really care about performance).

def gen(lst):
    acu = 0
    for num in lst:
        yield num + acu
        acu += num


print list(gen([4, 6, 12]))
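
One practical consequence of the generator approach is that it is lazy, so it can also consume unbounded input. The sketch below restates gen in Python 3 syntax and pairs it with itertools.count and itertools.islice; that pairing is an illustration, not part of the answer above.

```python
from itertools import count, islice


def gen(lst):
    acu = 0
    for num in lst:
        acu += num
        yield acu


# count(1) yields 1, 2, 3, ... forever; islice takes only the first five sums.
first_five = list(islice(gen(count(1)), 5))
print(first_five)  # [1, 3, 6, 10, 15]
```
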
In [42]: a = [4, 6, 12]


In [43]: [sum(a[:i+1]) for i in xrange(len(a))]
Out[43]: [4, 10, 22]

For small lists, this is slightly faster than the generator method by @Ashwini above:

In [48]: %timeit list(accumu([4,6,12]))
100000 loops, best of 3: 2.63 us per loop


In [49]: %timeit [sum(a[:i+1]) for i in xrange(len(a))]
100000 loops, best of 3: 2.46 us per loop

For larger lists, the generator is definitely the way to go:

In [50]: a = range(1000)


In [51]: %timeit [sum(a[:i+1]) for i in xrange(len(a))]
100 loops, best of 3: 6.04 ms per loop


In [52]: %timeit list(accumu(a))
10000 loops, best of 3: 162 us per loop
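
To repeat this comparison outside IPython on Python 3, something like the following sketch could be used. The function name prefix_sums and the repetition count are arbitrary choices for this example, and the timings will differ from machine to machine.

```python
from timeit import timeit


def accumu(lis):
    total = 0
    for x in lis:
        total += x
        yield total


def prefix_sums(a):
    # Re-sums every prefix, so the work grows quadratically with len(a).
    return [sum(a[:i + 1]) for i in range(len(a))]


a = list(range(1000))
assert prefix_sums(a) == list(accumu(a))  # both approaches agree

t_quadratic = timeit(lambda: prefix_sums(a), number=20)
t_linear = timeit(lambda: list(accumu(a)), number=20)
print(f"prefix sums: {t_quadratic:.4f}s, generator: {t_linear:.4f}s")
```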

If you're doing a lot of numerical computation with arrays like this, I'd suggest numpy, which comes with a cumulative sum function, cumsum:

import numpy as np


a = [4,6,12]


np.cumsum(a)
#array([4, 10, 22])

Numpy is often faster than pure Python for this kind of thing; see the comparison with @Ashwini's accumu:

In [136]: timeit list(accumu(range(1000)))
10000 loops, best of 3: 161 us per loop


In [137]: timeit list(accumu(xrange(1000)))
10000 loops, best of 3: 147 us per loop


In [138]: timeit np.cumsum(np.arange(1000))
100000 loops, best of 3: 10.1 us per loop

But of course, if this is the only place you'd use numpy, it might not be worth taking on the dependency.

A bit hacky, but it seems to work:

def cumulative_sum(l):
    y = [0]
    def inc(n):
        y[0] += n
        return y[0]
    return [inc(x) for x in l]

I did think that the inner function would be able to modify the y declared in the outer lexical scope, but that didn't work, so we play some nasty hacks with structure modification instead. It is probably more elegant to use a generator.
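
In Python 3, the nonlocal statement lets an inner function rebind a name from the enclosing scope, so the one-element list is no longer needed. A minimal sketch of the same idea (nonlocal does not exist in Python 2):

```python
def cumulative_sum(values):
    total = 0

    def inc(n):
        nonlocal total  # rebind the variable of the enclosing function
        total += n
        return total

    return [inc(x) for x in values]


print(cumulative_sum([4, 6, 12]))  # [4, 10, 22]
```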

Without using Numpy, you can loop directly over the array and accumulate the sum along the way. For example:

a=list(range(10))
i=1
while((i>0) & (i<10)):
    a[i]=a[i-1]+a[i]
    i=i+1
print(a)

Result:

[0, 1, 3, 6, 10, 15, 21, 28, 36, 45]

Take a look at this:

a = [4, 6, 12]
reduce(lambda c, x: c + [c[-1] + x], a, [0])[1:]

Output (as expected):

[4, 10, 22]
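
On Python 3, reduce is no longer a builtin and has to be imported from functools. A sketch of the same expression with that import is below; note that c + [...] builds a new list at every step, so the cost grows quadratically with the length of the input.

```python
from functools import reduce

a = [4, 6, 12]

# The accumulator starts as [0]; each step appends (last total + x) to a new list.
result = reduce(lambda c, x: c + [c[-1] + x], a, [0])[1:]
print(result)  # [4, 10, 22]
```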

This would be the Haskell-style way:

def wrand(vtlg):

    def helpf(lalt,lneu):
        if not lalt==[]:
            return helpf(lalt[1::],[lalt[0]+lneu[0]]+lneu)
        else:
            lneu.reverse()
            return lneu[1:]

    return helpf(vtlg,[0])

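I benchmarked the top two answers with Python 3.4 and found that itertools.accumulate is faster than numpy.cumsum in many circumstances, often much faster. However, as you can see from the comments, this may not always be the case, and it's difficult to explore all the options exhaustively. (Feel free to add a comment or edit this post if you have further benchmark results of interest.)

Some timings...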

For short lists accumulate is about 4 times faster:

from timeit import timeit


def sum1(l):
    from itertools import accumulate
    return list(accumulate(l))


def sum2(l):
    from numpy import cumsum
    return list(cumsum(l))


l = [1, 2, 3, 4, 5]


timeit(lambda: sum1(l), number=100000)
# 0.4243644131347537
timeit(lambda: sum2(l), number=100000)
# 1.7077815784141421

For longer lists accumulate is about 3 times faster:

l = [1, 2, 3, 4, 5]*1000
timeit(lambda: sum1(l), number=100000)
# 19.174508565105498
timeit(lambda: sum2(l), number=100000)
# 61.871223849244416

If the numpy array is not converted to a list, accumulate is still about 2 times faster:

from timeit import timeit


def sum1(l):
    from itertools import accumulate
    return list(accumulate(l))


def sum2(l):
    from numpy import cumsum
    return cumsum(l)


l = [1, 2, 3, 4, 5]*1000


print(timeit(lambda: sum1(l), number=100000))
# 19.18597290944308
print(timeit(lambda: sum2(l), number=100000))
# 37.759664884768426

If you put the imports outside of the two functions and still return a numpy array, accumulate is still nearly 2 times faster:

from timeit import timeit
from itertools import accumulate
from numpy import cumsum


def sum1(l):
    return list(accumulate(l))


def sum2(l):
    return cumsum(l)


l = [1, 2, 3, 4, 5]*1000


timeit(lambda: sum1(l), number=100000)
# 19.042188624851406
timeit(lambda: sum2(l), number=100000)
# 35.17324400227517

If you want a Pythonic way that works in 2.7 without numpy, this would be my way of doing it:

l = [1,2,3,4]
_d={-1:0}
cumsum=[_d.setdefault(idx, _d[idx-1]+item) for idx,item in enumerate(l)]

Now let's try it and test it against all the other implementations:

import timeit, sys, functools
L=list(range(10000))
if sys.version_info >= (3, 0):
    reduce = functools.reduce
    xrange = range


def sum1(l):
    cumsum=[]
    total = 0
    for v in l:
        total += v
        cumsum.append(total)
    return cumsum


def sum2(l):
    import numpy as np
    return list(np.cumsum(l))


def sum3(l):
    return [sum(l[:i+1]) for i in xrange(len(l))]


def sum4(l):
    return reduce(lambda c, x: c + [c[-1] + x], l, [0])[1:]


def this_implementation(l):
    _d={-1:0}
    return [_d.setdefault(idx, _d[idx-1]+item) for idx,item in enumerate(l)]


# sanity check
sum1(L)==sum2(L)==sum3(L)==sum4(L)==this_implementation(L)
>>> True


# PERFORMANCE TEST
timeit.timeit('sum1(L)','from __main__ import sum1,sum2,sum3,sum4,this_implementation,L', number=100)/100.
>>> 0.001018061637878418


timeit.timeit('sum2(L)','from __main__ import sum1,sum2,sum3,sum4,this_implementation,L', number=100)/100.
>>> 0.000829620361328125


timeit.timeit('sum3(L)','from __main__ import sum1,sum2,sum3,sum4,this_implementation,L', number=100)/100.
>>> 0.4606760001182556


timeit.timeit('sum4(L)','from __main__ import sum1,sum2,sum3,sum4,this_implementation,L', number=100)/100.
>>> 0.18932826995849608


timeit.timeit('this_implementation(L)','from __main__ import sum1,sum2,sum3,sum4,this_implementation,L', number=100)/100.
>>> 0.002348129749298096

Assignment expressions from PEP 572 (new in Python 3.8) offer yet another way to solve this:

time_interval = [4, 6, 12]


total_time = 0
cum_time = [total_time := total_time + t for t in time_interval]
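
One detail worth knowing: under PEP 572, a name assigned with := inside a comprehension is bound in the containing scope, so after the comprehension the variable holds the final total. A small sketch, with a function name (cumulative) that is made up for this example:

```python
def cumulative(values):
    total = 0
    # := rebinds the function-level `total` on every iteration (Python 3.8+).
    running = [total := total + v for v in values]
    return running, total


running, grand_total = cumulative([4, 6, 12])
print(running)      # [4, 10, 22]
print(grand_total)  # 22
```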

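A pure Python one-liner for the cumulative sum:

cumsum = lambda X: X[:1] + cumsum([X[0]+X[1]] + X[2:]) if X[1:] else X

This is a recursive version inspired by recursive cumulative sums. Some explanations:

  1. The first term X[:1] is a list containing the previous element and is almost the same as [X[0]] (which would complain for empty lists).
  2. The recursive cumsum call in the second term processes the current element [1] and the remaining list, whose length is reduced by one.
  3. if X[1:] is shorter than if len(X)>1.

Test: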

cumsum([4,6,12])
#[4, 10, 22]


cumsum([])
#[]

And similarly for the cumulative product:

cumprod = lambda X: X[:1] + cumprod([X[0]*X[1]] + X[2:]) if X[1:] else X

Test:

cumprod([4,6,12])
#[4, 24, 288]
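
A caveat with the recursive one-liners: they make one nested call per element, so an input longer than the interpreter's recursion limit fails. The sketch below illustrates this, assuming the default CPython behaviour of raising RecursionError (Python 3.5+); the variable names too_long and hit_limit are only for illustration.

```python
import sys

cumsum = lambda X: X[:1] + cumsum([X[0] + X[1]] + X[2:]) if X[1:] else X

print(cumsum([4, 6, 12]))  # [4, 10, 22]

# Build a list slightly longer than the current recursion limit.
too_long = [1] * (sys.getrecursionlimit() + 100)
try:
    cumsum(too_long)
    hit_limit = False
except RecursionError:
    hit_limit = True
print(hit_limit)  # True
```

For long inputs, an iterative approach such as a plain loop or itertools.accumulate avoids this limit.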

You can compute the cumulative sum list in linear time with a simple for loop:

def csum(lst):
    s = lst.copy()
    for i in range(1, len(s)):
        s[i] += s[i-1]
    return s


time_interval = [4, 6, 12]
print(csum(time_interval))  # [4, 10, 22]

The standard library's itertools.accumulate may be a faster alternative (since it's implemented in C):

from itertools import accumulate
time_interval = [4, 6, 12]
print(list(accumulate(time_interval)))  # [4, 10, 22]

l = [1,-1,3]


def sum_list(input_list):
    cum_list = list(input_list)  # copy, so the caller's list is not modified
    index = 1
    for i in input_list[1:]:
        cum_list[index] = i + cum_list[index-1]
        index = index + 1
    return cum_list


print(sum_list(l))

In Python 3, to find the cumulative sum of a list where the ith element is the sum of the first i+1 elements of the original list, you can do:

a = [4 , 6 , 12]
b = []
for i in range(0,len(a)):
    b.append(sum(a[:i+1]))
print(b)

OR you may use list comprehension:

b = [sum(a[:x+1]) for x in range(0,len(a))]

Output:

[4,10,22]

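There could be many answers to this, depending on the length of the list and the performance required. One very simple way, which I can think of without worrying about performance, is this: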

a = [1, 2, 3, 4]
a = [sum(a[0:x]) for x in range(1, len(a)+1)]
print(a)

[1, 3, 6, 10]

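This uses a list comprehension and may work fairly well; the only drawback is that it sums over the subarray many times. You could improve on this and make it simpler!

Cheers to your endeavor!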

Here's another fun solution. This takes advantage of the locals() dict of a comprehension, i.e. local variables generated inside the list comprehension scope:

>>> [locals().setdefault(i, (elem + locals().get(i-1, 0))) for i, elem
in enumerate(time_interval)]
[4, 10, 22]

Here's what locals() looks like at each iteration:

>>> [[locals().setdefault(i, (elem + locals().get(i-1, 0))), locals().copy()][1]
for i, elem in enumerate(time_interval)]
[{'.0': <enumerate at 0x21f21f7fc80>, 'i': 0, 'elem': 4, 0: 4},
{'.0': <enumerate at 0x21f21f7fc80>, 'i': 1, 'elem': 6, 0: 4, 1: 10},
{'.0': <enumerate at 0x21f21f7fc80>, 'i': 2, 'elem': 12, 0: 4, 1: 10, 2: 22}]

Performance is not terrible for tiny lists:

>>> %timeit list(accumulate([4, 6, 12]))
387 ns ± 7.53 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


>>> %timeit np.cumsum([4, 6, 12])
5.31 µs ± 67.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


>>> %timeit [locals().setdefault(i, (e + locals().get(i-1,0))) for i,e in enumerate(time_interval)]
1.57 µs ± 12 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

And it obviously falls flat for larger lists.

>>> l = list(range(1_000_000))
>>> %timeit list(accumulate(l))
95.1 ms ± 5.22 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


>>> %timeit np.cumsum(l)
79.3 ms ± 1.07 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


>>> %timeit np.cumsum(l).tolist()
120 ms ± 1.23 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


>>> %timeit [locals().setdefault(i, (e + locals().get(i-1, 0))) for i, e in enumerate(l)]
660 ms ± 5.14 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Even though this method is ugly and impractical, it sure is fun.

Since Python 3.8 it's possible to use assignment expressions, so things like this became easier to implement:

nums = list(range(1, 10))
print(f'array: {nums}')


v = 0
cumsum = [v := v + n for n in nums]
print(f'cumsum: {cumsum}')

Produces

array: [1, 2, 3, 4, 5, 6, 7, 8, 9]
cumsum: [1, 3, 6, 10, 15, 21, 28, 36, 45]

The same technique can be applied to find the cumulative product, mean, etc.

p = 1
cumprod = [p := p * n for n in nums]
print(f'cumprod: {cumprod}')


s = 0
c = 0
cumavg = [(s := s + n) / (c := c + 1) for n in nums]
print(f'cumavg: {cumavg}')

results in

cumprod: [1, 2, 6, 24, 120, 720, 5040, 40320, 362880]
cumavg: [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
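
For comparison, the same cumulative product and mean can also be written without assignment expressions by building on itertools.accumulate. This is only an alternative sketch, reusing the nums list from above:

```python
from itertools import accumulate
import operator

nums = list(range(1, 10))

# Running product: accumulate with a multiplication function.
cumprod = list(accumulate(nums, operator.mul))

# Running mean: divide each running sum by the count of items seen so far.
cumavg = [s / n for n, s in enumerate(accumulate(nums), 1)]

print(f'cumprod: {cumprod}')  # cumprod: [1, 2, 6, 24, 120, 720, 5040, 40320, 362880]
print(f'cumavg: {cumavg}')    # cumavg: [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
```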

I think the code below is the simplest:

a=[1,1,2,1,2]
b=[a[0]]+[sum(a[0:i]) for i in range(2,len(a)+1)]