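Putting many Python pandas DataFrames into one Excel worksheet

Adding many pandas DataFrames to an Excel workbook is easy as long as each one goes on its own worksheet. Putting several DataFrames on a single worksheet is trickier if you want to use pandas' built-in df.to_excel functionality.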

# Creating Excel Writer Object from Pandas
writer = pd.ExcelWriter('test.xlsx',engine='xlsxwriter')
workbook=writer.book
worksheet=workbook.add_worksheet('Validation')
df.to_excel(writer,sheet_name='Validation',startrow=0 , startcol=0)
another_df.to_excel(writer,sheet_name='Validation',startrow=20, startcol=0)

The code above does not work. It fails with this error:

 Sheetname 'Validation', with case ignored, is already in use.

After a fair amount of experimenting, I found a way to make it work.

writer = pd.ExcelWriter('test.xlsx',engine='xlsxwriter')   # Creating Excel Writer Object from Pandas
workbook=writer.book
df.to_excel(writer,sheet_name='Validation',startrow=0 , startcol=0)
another_df.to_excel(writer,sheet_name='Validation',startrow=20, startcol=0)

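This works. So my purpose in posting this question on Stack Overflow is twofold. First, I hope it helps anyone who is trying to put many DataFrames into a single Excel worksheet.

Second, can someone help me understand the difference between these two blocks of code? They look almost identical to me, except that the first block creates a worksheet called 'Validation' in advance and the second does not. I get that part.

What I don't understand is why that should make any difference. Even if I don't create the worksheet in advance, this line, the one before the last,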

 df.to_excel(writer,sheet_name='Validation',startrow=0 , startcol=0)

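will create a worksheet. So by the time we reach the last line, the worksheet 'Validation' has already been created in the second block of code too. My question, basically, is: why does the second block of code work while the first one doesn't?

Please also share if there is another way to put many dataframes into excel using the built-in df.to_excel functionality !!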


user3817518: "Please also share if there is another way to put many dataframes into excel using the built-in df.to_excel functionality !!"

Here's my attempt:

Easy way to put together a lot of dataframes on just one sheet or across multiple tabs. Let me know if this works!

-- To test, just run the sample dataframes and the second and third portion of code.

Sample dataframes

import pandas as pd
import numpy as np


# Sample dataframes
randn = np.random.randn
df = pd.DataFrame(randn(15, 20))
df1 = pd.DataFrame(randn(10, 5))
df2 = pd.DataFrame(randn(5, 10))

Put multiple dataframes into one xlsx sheet

# function
def multiple_dfs(df_list, sheets, file_name, spaces):
    writer = pd.ExcelWriter(file_name, engine='xlsxwriter')
    row = 0
    for dataframe in df_list:
        dataframe.to_excel(writer, sheet_name=sheets, startrow=row, startcol=0)
        # advance past the data rows, the header row, and the blank spacer rows
        row = row + len(dataframe.index) + spaces + 1
    writer.close()  # writer.save() was removed in pandas 2.0


# list of dataframes
dfs = [df,df1,df2]


# run function
multiple_dfs(dfs, 'Validation', 'test1.xlsx', 1)
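
The row arithmetic inside multiple_dfs can be checked on its own, without writing a file. The sketch below is only an illustration: stacked_start_rows is a made-up helper name, and it assumes each frame is written with a single header row, which is what to_excel does by default for plain column labels.

```python
def stacked_start_rows(row_counts, spaces=1, header_rows=1):
    """Return the startrow for each frame when stacking them vertically.

    row_counts  -- number of data rows in each frame, e.g. len(df.index)
    spaces      -- blank rows to leave between consecutive frames
    header_rows -- rows taken by the column header (assumed to be 1)
    """
    starts = []
    row = 0
    for n in row_counts:
        starts.append(row)
        # Same step as in multiple_dfs: data rows + header + spacing.
        row += n + header_rows + spaces
    return starts


# Row counts of the three sample frames above: 15, 10 and 5.
starts = stacked_start_rows([15, 10, 5], spaces=1)
print(starts)  # [0, 17, 29]
```

So with spaces=1 the second frame starts at row 17 and the third at row 29, leaving one empty row between blocks.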

Put multiple dataframes across separate tabs/sheets

# function
def dfs_tabs(df_list, sheet_list, file_name):
    writer = pd.ExcelWriter(file_name, engine='xlsxwriter')
    # pair each dataframe with its sheet name and write it to its own tab
    for dataframe, sheet in zip(df_list, sheet_list):
        dataframe.to_excel(writer, sheet_name=sheet, startrow=0, startcol=0)
    writer.close()  # writer.save() was removed in pandas 2.0


# list of dataframes and sheet names
dfs = [df, df1, df2]
sheets = ['df','df1','df2']


# run function
dfs_tabs(dfs, sheets, 'multi-test.xlsx')

I would be more inclined to concatenate the dataframes first and then turn that dataframe into an excel format. To put two dataframes together side-by-side (as opposed to one above the other) do this:

writer = pd.ExcelWriter('test.xlsx',engine='xlsxwriter')   # Creating Excel Writer Object from Pandas
# join the two frames column-wise, then write the combined frame once
new_df = pd.concat([df, another_df], axis=1)
new_df.to_excel(writer,sheet_name='Validation',startrow=0 , startcol=0)

To create the Worksheet in advance, you need to add the created sheet to the sheets dict:

writer.sheets['Validation'] = worksheet

Using your original code:

# Creating Excel Writer Object from Pandas
writer = pd.ExcelWriter('test.xlsx',engine='xlsxwriter')
workbook=writer.book
worksheet=workbook.add_worksheet('Validation')
writer.sheets['Validation'] = worksheet
df.to_excel(writer,sheet_name='Validation',startrow=0 , startcol=0)
another_df.to_excel(writer,sheet_name='Validation',startrow=20, startcol=0)

Explanation

If we look at the pandas function to_excel, it uses the writer's write_cells function:

excel_writer.write_cells(formatted_cells, sheet_name, startrow=startrow, startcol=startcol)

So looking at the write_cells function for xlsxwriter:

def write_cells(self, cells, sheet_name=None, startrow=0, startcol=0):
    # Write the frame cells using xlsxwriter.
    sheet_name = self._get_sheet_name(sheet_name)
    if sheet_name in self.sheets:
        wks = self.sheets[sheet_name]
    else:
        wks = self.book.add_worksheet(sheet_name)
        self.sheets[sheet_name] = wks

Here we can see that it checks for sheet_name in self.sheets, and so it needs to be added there as well.
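
To make that concrete, here is a toy model of the lookup. It is not the real pandas or xlsxwriter code: ToyBook and ToyWriter are made-up names, and ValueError is just a stand-in for whatever exception the real library raises. It only mirrors the "check self.sheets, otherwise call add_worksheet" logic quoted above.

```python
class ToyBook:
    """Stand-in for the workbook: rejects duplicate sheet names."""

    def __init__(self):
        self.names = set()

    def add_worksheet(self, name):
        # Names are compared case-insensitively, as the error message suggests.
        if name.lower() in self.names:
            raise ValueError(
                f"Sheetname '{name}', with case ignored, is already in use."
            )
        self.names.add(name.lower())
        return f"<worksheet {name}>"


class ToyWriter:
    """Stand-in for the pandas writer: keeps its own dict of known sheets."""

    def __init__(self):
        self.book = ToyBook()
        self.sheets = {}

    def write_cells(self, sheet_name):
        # Reuse the sheet only if the writer itself knows about it.
        if sheet_name in self.sheets:
            return self.sheets[sheet_name]
        wks = self.book.add_worksheet(sheet_name)
        self.sheets[sheet_name] = wks
        return wks


# First block from the question: sheet created on the book, never registered.
writer = ToyWriter()
writer.book.add_worksheet("Validation")
try:
    writer.write_cells("Validation")
    outcome = "ok"
except ValueError as exc:
    outcome = str(exc)
print(outcome)

# The fix: register the pre-created sheet in writer.sheets first.
fixed = ToyWriter()
fixed.sheets["Validation"] = fixed.book.add_worksheet("Validation")
print(fixed.write_cells("Validation"))
```

In the first case the writer does not find 'Validation' in its own dict, so it asks the book to add the sheet a second time and the book refuses. In the second case the dict lookup succeeds and the existing sheet is reused.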

The answer by Adrian can be simplified as follows

writer = pd.ExcelWriter('test.xlsx',engine='xlsxwriter')
df.to_excel(writer,sheet_name='Validation',startrow=0 , startcol=0)
another_df.to_excel(writer,sheet_name='Validation',startrow=20, startcol=0)

Works for pandas 0.25.3 with python 3.7.6

Use a with block, so you don't have to call writer.save() or writer.close() explicitly.

Also, it automatically manages workbook.close(), if you use workbook=writer.book.
(Other answers forgot to do this, and this happens quite often because we are human ;)

import pandas as pd


df = pd.DataFrame(data={'col1':[9,3,4,5,1,1,1,1], 'col2':[6,7,8,9,5,5,5,5]})
df2 = pd.DataFrame(data={'col1':[25,35,45,55,65,75], 'col2':[61,71,81,91,21,31]})


with pd.ExcelWriter('test.xlsx', engine='xlsxwriter') as writer:
    df.to_excel(writer, sheet_name='testSheetJ', startrow=1, startcol=0)
    df2.to_excel(writer, sheet_name='testSheetJ', startrow=1+len(df)+3, startcol=0)
