Averaging over every n elements of a numeric array

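I have a numeric array. I want to create a new array holding the average of each consecutive group of three elements, so the new array is one third the size of the original.

For example: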

 np.array([1,2,3,1,2,3,1,2,3])

should return the array:

 np.array([2,2,2])

Can anyone suggest an efficient way to do this? I'm drawing a blank.

If your array arr has a length divisible by 3:

np.mean(arr.reshape(-1, 3), axis=1)

Reshaping to a higher-dimensional array and then applying a reduction (mean, sum, max, ...) along one of the added dimensions is a staple of NumPy programming.
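
If the length is not divisible by the group size, the reshape above raises an error. A minimal sketch of one way around that, assuming it is acceptable to drop the incomplete trailing group (the helper name mean_every_n is made up for illustration):

```python
import numpy as np

def mean_every_n(arr, n):
    """Average consecutive groups of n elements.

    Assumption: an incomplete trailing group is simply dropped.
    """
    arr = np.asarray(arr)
    usable = (arr.size // n) * n  # largest multiple of n that fits
    # Each row of the reshaped view is one group of n elements.
    return arr[:usable].reshape(-1, n).mean(axis=1)

result = mean_every_n([1, 2, 3, 1, 2, 3, 1, 2, 3], 3)   # the example from the question
trimmed = mean_every_n(np.arange(10), 3)                # last element (9) is dropped
```

Whether to drop, pad, or average the leftover elements separately depends on what the downsampled data is used for.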

For anyone arriving from a search and looking for a simple generalisation to arrays with multiple dimensions: use the function block_reduce from scikit-image (skimage.measure).

It has a very simple interface to downsample arrays by applying a function such as numpy.mean, but can also use others (maximum, median, ...). The downsampling can be done by different factors for different axes by supplying a tuple with different sizes for the blocks. Here's an example with a 2D array; downsampling only axis 1 by 5 using the mean:

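import numpy as np
from skimage.measure import block_reduce

# Two rows of 19 values each; 19 is not divisible by the block width of 5.
arr = np.stack((np.arange(1,20), np.arange(20,39)))

# array([[ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
#        [20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38]])

# Block size 1 along axis 0 (rows untouched), 5 along axis 1.
# The last, incomplete block of each row is padded with cval.
arr_reduced = block_reduce(arr, block_size=(1,5), func=np.mean, cval=np.mean(arr))

# array([[ 3. ,  8. , 13. , 17.8],
#        [22. , 27. , 32. , 33. ]])
# Note: the last column matches a pad value of 19, i.e. np.mean(arr) = 19.5
# truncated to the integer dtype of arr.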

As discussed in the comments on the other answer: if the array's length along the reduced dimension is not divisible by the block size, the missing values are padded with the value given by the argument cval (0 by default).
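
If a constant pad value would skew the last block, one alternative is to average only the elements that are actually present. The following is a rough 1D sketch in plain NumPy, not something block_reduce does itself; the helper name block_mean_partial is hypothetical, and it assumes the input contains no NaN values of its own:

```python
import numpy as np

def block_mean_partial(arr, n):
    """Mean over blocks of n elements; a short last block is averaged
    over the elements it actually contains.

    Assumption: the input has no NaNs, since NaN is used as the pad marker.
    """
    arr = np.asarray(arr, dtype=float)   # float is needed to hold NaN
    pad = (-arr.size) % n                # elements missing from the last block
    padded = np.concatenate([arr, np.full(pad, np.nan)])
    # nanmean ignores the NaN padding when averaging each row.
    return np.nanmean(padded.reshape(-1, n), axis=1)

partial = block_mean_partial(np.arange(1, 20), 5)
# last block is the mean of 16, 17, 18, 19 only, i.e. 17.5
```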

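To apply the accepted answer to a 2D array, averaging every downsample_ratio rows within each column/feature (the number of rows must be divisible by downsample_ratio):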

arr.reshape(-1, downsample_ratio, arr.shape[1]).mean(axis = 1)
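
As a small illustration of the line above, here is a sketch with made-up data: 6 rows (samples) and 2 columns (features), with downsample_ratio set to 3.

```python
import numpy as np

downsample_ratio = 3
# Illustrative data: 6 samples (rows) x 2 features (columns).
arr = np.arange(12).reshape(6, 2)

# Shape becomes (groups, downsample_ratio, features); averaging over axis 1
# collapses each group of 3 consecutive rows into one row.
reduced = arr.reshape(-1, downsample_ratio, arr.shape[1]).mean(axis=1)
# reduced has shape (2, 2): rows 0-2 give [2., 3.], rows 3-5 give [8., 9.]
```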