Merging indexed DataFrames with pandas

I have two DataFrames, each with two index columns, and I would like to merge them. For example, the first DataFrame looks like this:

                   V1

A      1/1/2012    12
       2/1/2012    14
B      1/1/2012    15
       2/1/2012    8
C      1/1/2012    17
       2/1/2012    9

The second DataFrame looks like this:

                   V2

A      1/1/2012    15
       3/1/2012    21
B      1/1/2012    24
       2/1/2012    9
D      1/1/2012    7
       2/1/2012    16

As a result, I would like to get the following:

                   V1   V2

A      1/1/2012    12   15
       2/1/2012    14   N/A
       3/1/2012    N/A  21
B      1/1/2012    15   24
       2/1/2012    8    9
C      1/1/2012    17   N/A
       2/1/2012    9    N/A
D      1/1/2012    N/A  7
       2/1/2012    N/A  16

I have tried a few versions using the pd.merge and .join methods, but nothing seems to work. Do you have any suggestions?


You can do this with merge:

df_merged = df1.merge(df2, how='outer', left_index=True, right_index=True)

The keyword argument how='outer' keeps all index entries from both frames and fills the missing values with NaN. The left_index and right_index keyword arguments tell merge to join on the indices rather than on columns. If a column comes out entirely NaN after a merge, a useful troubleshooting step is to verify that the indices of the two frames have the same dtypes.

The merge code above produces the following output for me:

                V1    V2
A 2012-01-01  12.0  15.0
  2012-02-01  14.0   NaN
  2012-03-01   NaN  21.0
B 2012-01-01  15.0  24.0
  2012-02-01   8.0   9.0
C 2012-01-01  17.0   NaN
  2012-02-01   9.0   NaN
D 2012-01-01   NaN   7.0
  2012-02-01   NaN  16.0
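
For anyone who wants to reproduce this, here is a minimal sketch of the whole thing. The question does not show how the frames were built, so the make_frame helper, the index level names, and the reading of the dates as month/day/year are all assumptions on my part. It also prints the dtype of each index level on both frames, which is the check mentioned above.

```python
import pandas as pd


def make_frame(rows, column):
    """Build a frame indexed by (group, date) from (group, date, value) rows.

    Hypothetical helper: the question does not say how the frames were created.
    """
    index = pd.MultiIndex.from_tuples(
        [(group, pd.Timestamp(date)) for group, date, _ in rows],
        names=["group", "date"],  # assumed names; the question gives none
    )
    return pd.DataFrame({column: [value for _, _, value in rows]}, index=index)


df1 = make_frame(
    [("A", "2012-01-01", 12), ("A", "2012-02-01", 14),
     ("B", "2012-01-01", 15), ("B", "2012-02-01", 8),
     ("C", "2012-01-01", 17), ("C", "2012-02-01", 9)],
    "V1",
)
df2 = make_frame(
    [("A", "2012-01-01", 15), ("A", "2012-03-01", 21),
     ("B", "2012-01-01", 24), ("B", "2012-02-01", 9),
     ("D", "2012-01-01", 7), ("D", "2012-02-01", 16)],
    "V2",
)


def level_dtypes(frame):
    """Return the dtype of every index level, in order."""
    return [frame.index.get_level_values(i).dtype for i in range(frame.index.nlevels)]


# Troubleshooting step: both frames should report the same dtypes per level.
print(level_dtypes(df1))
print(level_dtypes(df2))

# Outer merge on the (group, date) index of both frames.
df_merged = df1.merge(df2, how="outer", left_index=True, right_index=True)
print(df_merged)
```

The integer columns come back as floats because NaN cannot be stored in a plain integer column.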

You should be able to use join, which joins on the index by default. Given your desired result, you need to use outer as the join type.

>>> df1.join(df2, how='outer')
            V1  V2
A 1/1/2012  12  15
  2/1/2012  14 NaN
  3/1/2012 NaN  21
B 1/1/2012  15  24
  2/1/2012   8   9
C 1/1/2012  17 NaN
  2/1/2012   9 NaN
D 1/1/2012 NaN   7
  2/1/2012 NaN  16

Signature: _.join(other, on=None, how='left', lsuffix='', rsuffix='', sort=False)
Docstring:
Join columns with other DataFrame either on index or on a key column.
Efficiently Join multiple DataFrame objects by index at once by
passing a list.
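
To illustrate the last sentence of the docstring, here is a rough sketch that joins the two frames from the question and then joins several frames at once by passing a list. The make_frame helper and the third frame df3 (column V3) are hypothetical and only there for the demonstration, and the dates are assumed to be month/day/year. As far as I know, the column names must not overlap when a list is passed.

```python
import pandas as pd


def make_frame(rows, column):
    """Hypothetical helper: index a frame by (group, date) tuples."""
    index = pd.MultiIndex.from_tuples(
        [(group, pd.Timestamp(date)) for group, date, _ in rows]
    )
    return pd.DataFrame({column: [value for _, _, value in rows]}, index=index)


df1 = make_frame(
    [("A", "2012-01-01", 12), ("A", "2012-02-01", 14),
     ("B", "2012-01-01", 15), ("B", "2012-02-01", 8),
     ("C", "2012-01-01", 17), ("C", "2012-02-01", 9)],
    "V1",
)
df2 = make_frame(
    [("A", "2012-01-01", 15), ("A", "2012-03-01", 21),
     ("B", "2012-01-01", 24), ("B", "2012-02-01", 9),
     ("D", "2012-01-01", 7), ("D", "2012-02-01", 16)],
    "V2",
)
# Hypothetical third frame, not part of the question.
df3 = make_frame([("A", "2012-01-01", 1), ("D", "2012-02-01", 2)], "V3")

# join aligns on the index by default; how="outer" keeps rows from both sides.
joined = df1.join(df2, how="outer")

# The same result expressed with merge on the indices.
merged = df1.merge(df2, how="outer", left_index=True, right_index=True)
print(joined.equals(merged))

# Passing a list joins several frames by index in a single call.
joined_many = df1.join([df2, df3], how="outer").sort_index()
print(joined_many)
```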