如何规范化一个二维麻木数组在 Python 少冗长?

给定一个3乘以3的数组

a = numpy.arange(0,27,3).reshape(3,3)


# array([[ 0,  3,  6],
#        [ 9, 12, 15],
#        [18, 21, 24]])

来规范化我想到的二维数组的行

row_sums = a.sum(axis=1) # array([ 9, 36, 63])
new_matrix = numpy.zeros((3,3))
for i, (row, row_sum) in enumerate(zip(a, row_sums)):
new_matrix[i,:] = row / row_sum

肯定有更好的办法,不是吗?

也许需要澄清一下: 通过规范化,我的意思是,每行条目的总和必须是一。但我想大多数人都会明白的。

233736 次浏览

Broadcasting is really good for this:

row_sums = a.sum(axis=1)
new_matrix = a / row_sums[:, numpy.newaxis]

row_sums[:, numpy.newaxis] reshapes row_sums from being (3,) to being (3, 1). When you do a / b, a and b are broadcast against each other.

You can learn more about broadcasting here or even better here.

I think this should work,

a = numpy.arange(0,27.,3).reshape(3,3)


a /=  a.sum(axis=1)[:,numpy.newaxis]

Scikit-learn offers a function normalize() that lets you apply various normalizations. The "make it sum to 1" is called L1-norm. Therefore:

from sklearn.preprocessing import normalize


matrix = numpy.arange(0,27,3).reshape(3,3).astype(numpy.float64)
# array([[  0.,   3.,   6.],
#        [  9.,  12.,  15.],
#        [ 18.,  21.,  24.]])


normed_matrix = normalize(matrix, axis=1, norm='l1')
# [[ 0.          0.33333333  0.66666667]
#  [ 0.25        0.33333333  0.41666667]
#  [ 0.28571429  0.33333333  0.38095238]]

Now your rows will sum to 1.

In case you are trying to normalize each row such that its magnitude is one (i.e. a row's unit length is one or the sum of the square of each element in a row is one):

import numpy as np


a = np.arange(0,27,3).reshape(3,3)


result = a / np.linalg.norm(a, axis=-1)[:, np.newaxis]
# array([[ 0.        ,  0.4472136 ,  0.89442719],
#        [ 0.42426407,  0.56568542,  0.70710678],
#        [ 0.49153915,  0.57346234,  0.65538554]])

Verifying:

np.sum( result**2, axis=-1 )
# array([ 1.,  1.,  1.])

it appears that this also works

def normalizeRows(M):
row_sums = M.sum(axis=1)
return M / row_sums

Or using lambda function, like

>>> vec = np.arange(0,27,3).reshape(3,3)
>>> import numpy as np
>>> norm_vec = map(lambda row: row/np.linalg.norm(row), vec)

each vector of vec will have a unit norm.

You could also use matrix transposition:

(a.T / row_sums).T

I think you can normalize the row elements sum to 1 by this: new_matrix = a / a.sum(axis=1, keepdims=1). And the column normalization can be done with new_matrix = a / a.sum(axis=0, keepdims=1). Hope this can hep.

You could use built-in numpy function: np.linalg.norm(a, axis = 1, keepdims = True)

Here is one more possible way using reshape:

a_norm = (a/a.sum(axis=1).reshape(-1,1)).round(3)
print(a_norm)

Or using None works too:

a_norm = (a/a.sum(axis=1)[:,None]).round(3)
print(a_norm)

Output:

array([[0.   , 0.333, 0.667],
[0.25 , 0.333, 0.417],
[0.286, 0.333, 0.381]])

We can achieve the same effect by premultiplying with the diagonal matrix whose main diagonal is the reciprocal of the row sums.

A = np.diag(A.sum(1)**-1) @ A