Elegant way to create an empty pandas DataFrame with NaN of type float

I want to create a pandas DataFrame filled with NaNs. During my research I found an answer:

import pandas as pd


df = pd.DataFrame(index=range(0,4),columns=['A'])

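This code produces a DataFrame filled with NaNs of type "object", so they cannot be used later with, for example, the interpolate() method. I therefore created the DataFrame with this more complicated code (inspired by this answer):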

import pandas as pd
import numpy as np


dummyarray = np.empty((4,1))
dummyarray[:] = np.nan


df = pd.DataFrame(dummyarray)

This results in a DataFrame filled with NaNs of type "float", so it can be used later with interpolate(). Is there a more elegant way to achieve the same result?


You could specify the dtype directly when constructing the DataFrame:

>>> df = pd.DataFrame(index=range(0,4),columns=['A'], dtype='float')
>>> df.dtypes
A    float64
dtype: object

Specifying the dtype makes pandas try to create the DataFrame with that type, rather than inferring one from the (empty) data.
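As a minimal sketch of the difference, the following compares the dtype with and without the argument and then interpolates the float column. The names df_default, df_float and filled, and the end-point values 0.0 and 3.0, are only illustrative and do not come from the question.

```python
import numpy as np
import pandas as pd

# Without dtype, the all-NaN column is reported as object in the
# behaviour the question describes.
df_default = pd.DataFrame(index=range(0, 4), columns=['A'])
print(df_default.dtypes)

# With dtype='float', the same constructor call yields float64 NaNs.
df_float = pd.DataFrame(index=range(0, 4), columns=['A'], dtype='float')
print(df_float.dtypes)

# Illustrative values only: set the two end points, then let
# interpolate() fill the gap linearly.
df_float.loc[0, 'A'] = 0.0
df_float.loc[3, 'A'] = 3.0
filled = df_float.interpolate()
print(filled)
```

Because the column is already float64, interpolate() can fill the two missing rows without any prior type conversion.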

Simply pass the desired fill value as the first argument, such as 0, math.inf or, as here, np.nan. The constructor then creates the value array at the size given by the index and columns arguments and fills it with that value:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.nan, index=[0, 1, 2, 3], columns=['A', 'B'])


>>> df
    A   B
0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN


>>> df.dtypes
A    float64
B    float64
dtype: object

Hope this can help!

 pd.DataFrame(np.nan, index = np.arange(<num_rows>), columns = ['A'])
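If this pattern is needed in several places, it could be wrapped in a small helper. The sketch below is one possible way to do that; the function name nan_frame and its parameters n_rows and columns are hypothetical, not part of pandas.

```python
import numpy as np
import pandas as pd


def nan_frame(n_rows, columns):
    """Return a float64 DataFrame of shape (n_rows, len(columns)) filled with NaN.

    Hypothetical helper: it only wraps the scalar-broadcast constructor call.
    """
    return pd.DataFrame(np.nan, index=np.arange(n_rows), columns=list(columns))


# Example usage with illustrative sizes and column names.
df = nan_frame(4, ['A', 'B'])
print(df.dtypes)
```

Passing the scalar np.nan lets the constructor do the broadcasting, so no temporary array has to be built by hand.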

You can try this line of code:

pdDataFrame = pd.DataFrame([np.nan] * 7)

This creates a single-column pandas DataFrame with 7 rows of NaN of type float.

If you print pdDataFrame, the output will be:

     0
0   NaN
1   NaN
2   NaN
3   NaN
4   NaN
5   NaN
6   NaN

The output of pdDataFrame.dtypes is:

0    float64
dtype: object

For multiple columns you can do:

df = pd.DataFrame(np.zeros([nrow, ncol])*np.nan)
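An alternative sketch for the multi-column case uses np.full, which fills the array with NaN directly instead of multiplying zeros by NaN. The values of nrow and ncol below are placeholders chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative dimensions; substitute the sizes you actually need.
nrow, ncol = 3, 2

# np.full builds a (nrow, ncol) float64 array in which every entry is NaN.
df = pd.DataFrame(np.full((nrow, ncol), np.nan))
print(df.dtypes)
```

Both variants give a float64 DataFrame; np.full simply states the intent more directly.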