Convert NaN values to zero

I have a 2D numpy array. Some of the values in this array are NaN. I want to perform certain operations using this array. For example consider the array:

[[   0.   43.   67.    0.   38.]
[ 100.   86.   96.  100.   94.]
[  76.   79.   83.   89.   56.]
[  88.   NaN   67.   89.   81.]
[  94.   79.   67.   89.   69.]
[  88.   79.   58.   72.   63.]
[  76.   79.   71.   67.   56.]
[  71.   71.   NaN   56.  100.]]

I am trying to take each row, one at a time, sort it in reversed order to get max 3 values from the row and take their average. The code I tried is:

# nparr is a 2D numpy array
for entry in nparr:
    sortedentry = sorted(entry, reverse=True)
    highest_3_values = sortedentry[:3]
    avg_highest_3 = float(sum(highest_3_values)) / 3

This does not work for rows containing NaN. Is there a quick way to convert all NaN values to zero in the 2D numpy array, so that I have no problems with sorting and the other things I am trying to do?


Where A is your 2D array:

import numpy as np
A[np.isnan(A)] = 0

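The function isnan produces a bool array indicating where the NaN values are. A boolean array can be used to index an array of the same shape. Think of it as a mask.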

This should work:

import numpy as np

a = np.array([[1, 2, 3], [0, 3, np.nan]])
where_are_NaNs = np.isnan(a)
a[where_are_NaNs] = 0

In the above case where_are_NaNs is:

In [12]: where_are_NaNs
Out[12]:
array([[False, False, False],
       [False, False,  True]], dtype=bool)

For your purposes, if all the items are stored as str, you can sort them the way you already do, then check the first element and replace it with '0':

>>> l1 = ['88','NaN','67','89','81']
>>> n = sorted(l1, reverse=True)
>>> n
['NaN', '89', '88', '81', '67']
>>> import math
>>> if math.isnan(float(n[0])):
...     n[0] = '0'
...
>>> n
['0', '89', '88', '81', '67']

nan is never equal to nan:

if z!=z:z=0

So for a 2D array:

for entry in nparr:
    # entry is a view of one row, so this modifies nparr in place
    entry[entry != entry] = 0
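
The self-inequality trick relies on IEEE 754 floating-point semantics, where NaN compares unequal to everything, including itself. Here is a minimal sketch, assuming a small made-up array `arr` (not the asker's data), suggesting that the comparison `arr != arr` selects the same positions as `np.isnan(arr)`:

```python
import numpy as np

# Hypothetical sample data, for illustration only.
arr = np.array([[1.0, np.nan, 3.0],
                [np.nan, 5.0, 6.0]])

mask_self = arr != arr        # True only where the value is NaN
mask_isnan = np.isnan(arr)    # the explicit equivalent

same_mask = bool(np.array_equal(mask_self, mask_isnan))

cleaned = arr.copy()          # work on a copy so arr keeps its NaNs
cleaned[mask_self] = 0        # replace NaN with zero
```

Both masks should agree, so `np.isnan` is probably the more readable choice in real code.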

A code example for Drake's answer, using nan_to_num:

>>> import numpy as np
>>> A = np.array([[1, 2, 3], [0, 3, np.nan]])
>>> A = np.nan_to_num(A)
>>> A
array([[ 1.,  2.,  3.],
       [ 0.,  3.,  0.]])

You could use numpy.nan_to_num:

numpy.nan_to_num(x): Replace nan with zero and inf with finite numbers.

Example (see the docs):

>>> np.set_printoptions(precision=8)
>>> x = np.array([np.inf, -np.inf, np.nan, -128, 128])
>>> np.nan_to_num(x)
array([  1.79769313e+308,  -1.79769313e+308,   0.00000000e+000,
        -1.28000000e+002,   1.28000000e+002])
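
If the very large default replacements for inf are not what you want, newer NumPy releases (1.17 and later, as far as I know) let you pick the replacement values through the `nan`, `posinf` and `neginf` keyword arguments. A small sketch, where the values 1e6 and -1e6 are arbitrary choices for illustration:

```python
import numpy as np

x = np.array([np.inf, -np.inf, np.nan, -128.0, 128.0])

# Choose the replacements explicitly instead of relying on the defaults.
# 0.0, 1e6 and -1e6 are arbitrary picks for this example.
y = np.nan_to_num(x, nan=0.0, posinf=1e6, neginf=-1e6)

# copy=False asks NumPy to overwrite x in place instead of returning a copy.
np.nan_to_num(x, copy=False)
```

The in-place form avoids allocating a second array, which may matter for large inputs.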

You can use np.where to find where you have NaN:

import numpy as np

a = np.array([[   0,   43,   67,    0,   38],
              [ 100,   86,   96,  100,   94],
              [  76,   79,   83,   89,   56],
              [  88,   np.nan,   67,   89,   81],
              [  94,   79,   67,   89,   69],
              [  88,   79,   58,   72,   63],
              [  76,   79,   71,   67,   56],
              [  71,   71,   np.nan,   56,  100]])

b = np.where(np.isnan(a), 0, a)


In [20]: b
Out[20]:
array([[   0.,   43.,   67.,    0.,   38.],
       [ 100.,   86.,   96.,  100.,   94.],
       [  76.,   79.,   83.,   89.,   56.],
       [  88.,    0.,   67.,   89.,   81.],
       [  94.,   79.,   67.,   89.,   69.],
       [  88.,   79.,   58.,   72.,   63.],
       [  76.,   79.,   71.,   67.,   56.],
       [  71.,   71.,    0.,   56.,  100.]])
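
Once the NaNs are zeroed, the "average of the three highest values per row" from the question can be computed without a Python loop. This is one possible sketch, not necessarily what the asker ended up using; it reuses only the two NaN-containing rows of the sample data to stay short:

```python
import numpy as np

# The two rows from the question that contain NaN.
a = np.array([[88, np.nan, 67, 89, 81],
              [71, 71, np.nan, 56, 100]], dtype=float)

b = np.where(np.isnan(a), 0, a)   # NaN -> 0, leaving a untouched

# Sort each row in ascending order, keep the last three columns
# (the three largest values), then average them row by row.
top3 = np.sort(b, axis=1)[:, -3:]
avg_highest_3 = top3.mean(axis=1)
```

Note that a zero standing in for NaN only enters the average when a row has fewer than three larger values.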

You can use a lambda function. An example for a 1D array:

import numpy as np
a = [np.nan, 2, 3]
# In Python 3, map returns an iterator, so wrap it in list() to see the values
list(map(lambda v: 0 if np.isnan(v) else v, a))

This will give you the result:

[0, 2, 3]