How do I open an Excel file with Python?

How do I open an Excel file for reading in Python?

I've already opened text files, for example sometextfile.txt, using the read command.


This isn't as straightforward as opening a plain text file: Python has nothing built in for reading Excel files, so you'll need some sort of external module. Here are some options:

http://www.python-excel.org/

If possible, you may want to consider exporting the Excel spreadsheet as a CSV file and then using the built-in Python csv module to read it:

http://docs.python.org/library/csv.html
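
If you go the CSV route, reading the exported file takes only the standard library. A minimal sketch, assuming an export named exported.csv (a made-up name) with a header row; the file is written first here only so the snippet runs on its own:

```python
import csv
import os
import tempfile

# Stand-in for a spreadsheet exported from Excel as CSV (hypothetical data).
csv_path = os.path.join(tempfile.mkdtemp(), "exported.csv")
with open(csv_path, "w", newline="") as f:
    f.write("name,age\njohn,31\nmary,28\n")

# Read it back: DictReader uses the first row as the column names.
with open(csv_path, newline="") as f:
    rows = list(csv.DictReader(f))

for row in rows:
    print(row["name"], row["age"])
```

Note that every value comes back as a string, so numbers need converting yourself.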

Try the xlrd library.

[Edit] - from what I can see from your comment, something like the snippet below might do the trick. I'm assuming here that you're just searching one column for the word 'john', but you could add more or make this into a more generic function.

from xlrd import open_workbook

book = open_workbook('simple.xls', on_demand=True)
for name in book.sheet_names():
    if name.endswith('2'):
        sheet = book.sheet_by_name(name)

        # Attempt to find a matching row (search the first column for 'john')
        rowIndex = -1
        for i, cell in enumerate(sheet.col(0)):
            # str() guards against numeric cells, which are not searchable
            if 'john' in str(cell.value):
                rowIndex = i
                break

        # If we found the row, print it
        if rowIndex != -1:
            cells = sheet.row(rowIndex)
            for cell in cells:
                print(cell.value)

        book.unload_sheet(name)
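
To make the search more generic, the matching logic can be pulled out into a small function. This is only a sketch: find_row is a made-up helper name, and it works on plain lists of cell values so it can be tried without an .xls file. With xlrd you could feed it sheet.row_values(i) for each i in range(sheet.nrows).

```python
def find_row(rows, needle, column=0):
    """Return the index of the first row whose cell in `column` contains
    `needle`, or -1 if no row matches."""
    for index, row in enumerate(rows):
        # str() lets numeric cells be searched without raising TypeError
        if column < len(row) and needle in str(row[column]):
            return index
    return -1


# Plain lists standing in for the rows of a sheet (hypothetical data).
rows = [
    ["id", "name"],
    [1.0, "mary"],
    [2.0, "john smith"],
]

match = find_row(rows, "john", column=1)
if match != -1:
    print(rows[match])
```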

Edit:
In newer versions of pandas, you can pass the sheet name as a parameter.

file_name =  # path to file + file name
sheet =  # sheet name or sheet number or list of sheet numbers and names


import pandas as pd
df = pd.read_excel(io=file_name, sheet_name=sheet)
print(df.head(5))  # print first 5 rows of the dataframe

Check the docs for examples on how to pass sheet_name:
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_excel.html
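
One thing to watch with sheet_name: read_excel returns a single DataFrame for a sheet name or number, but a dict of DataFrames for a list (or None, which loads every sheet). Below is a small sketch that evens this out. read_sheets and as_sheet_dict are hypothetical helper names, and pandas is imported inside the function so the normalising step can be tried on its own:

```python
def as_sheet_dict(result, sheet):
    """Wrap a single-sheet result in a dict so callers always get a mapping."""
    if isinstance(result, dict):
        return result
    return {sheet: result}


def read_sheets(file_name, sheet=0):
    """Read one or more sheets and always return {sheet: DataFrame}."""
    import pandas as pd  # needs pandas plus an Excel engine such as openpyxl

    result = pd.read_excel(io=file_name, sheet_name=sheet)
    return as_sheet_dict(result, sheet)


# The normalising step, shown with placeholder strings instead of DataFrames:
print(as_sheet_dict("frame for Sheet1", "Sheet1"))
print(as_sheet_dict({"Sheet1": "frame 1", 1: "frame 2"}, ["Sheet1", 1]))
```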

Old version:
You can use the pandas package as well.

When you are working with an excel file with multiple sheets, you can use:

import pandas as pd
xl = pd.ExcelFile(path + filename)
xl.sheet_names
# [u'Sheet1', u'Sheet2', u'Sheet3']

df = xl.parse("Sheet1")
df.head()

df.head() returns the first 5 rows of the sheet; an interactive session displays them, while in a script you need print(df.head()).
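
If you want every sheet rather than just "Sheet1", you can loop over sheet_names and parse each one. A sketch, assuming xl is a pd.ExcelFile as above; sheet_shapes is a made-up name, and the stand-in classes exist only so the example runs without a real workbook:

```python
def sheet_shapes(xl):
    """Map each sheet name to the (rows, columns) shape of its DataFrame."""
    return {name: xl.parse(name).shape for name in xl.sheet_names}


class FakeFrame:
    """Stand-in for a DataFrame; only the .shape attribute is needed here."""

    def __init__(self, shape):
        self.shape = shape


class FakeExcelFile:
    """Stand-in for pd.ExcelFile with the two members used above."""

    sheet_names = ["Sheet1", "Sheet2"]

    def parse(self, name):
        return FakeFrame((3, 2) if name == "Sheet1" else (0, 0))


print(sheet_shapes(FakeExcelFile()))
```

With a real file you would call sheet_shapes(pd.ExcelFile(path + filename)) instead.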

If you're working with an Excel file with a single sheet, you can simply use:

import pandas as pd
df = pd.read_excel(path + filename)
print(df.head())

You can use the xlpython package, which requires only xlrd. Find it at https://pypi.python.org/pypi/xlpython and its documentation at https://github.com/morfat/xlpython

There's the openpyxl package:

>>> from openpyxl import load_workbook
>>> wb2 = load_workbook('test.xlsx')
>>> print(wb2.sheetnames)  # newer replacement for get_sheet_names()
['Sheet2', 'New Title', 'Sheet1']

>>> worksheet1 = wb2['Sheet1'] # one way to load a worksheet
>>> worksheet2 = wb2.get_sheet_by_name('Sheet2') # another way (deprecated in newer openpyxl in favor of wb2['Sheet2'])
>>> print(worksheet1['D18'].value)
3
>>> for row in worksheet1.iter_rows():
...     print(row[0].value)  # value is an attribute, not a method
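
If you'd rather work with plain Python values than cell objects, iter_rows(values_only=True) (available in newer openpyxl releases) yields one tuple of values per row. Here is a sketch that turns such rows into dicts keyed by the header row. rows_to_dicts is a hypothetical helper, and the tuples below stand in for what the worksheet would yield:

```python
def rows_to_dicts(rows):
    """Treat the first row as headers and return one dict per remaining row."""
    rows = iter(rows)
    try:
        headers = next(rows)
    except StopIteration:
        return []  # empty sheet
    return [dict(zip(headers, row)) for row in rows]


# Stand-in for worksheet1.iter_rows(values_only=True) (hypothetical data).
sample_rows = [
    ("name", "score"),
    ("john", 3),
    ("mary", 5),
]

records = rows_to_dicts(sample_rows)
print(records)
```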
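
You can also pick a file out of a directory listing and load it with pandas:

import os
import pandas as pd

directory = 'path/to/files/directory/'
files = os.listdir(directory)
i = 0  # index of the file you want in the listing (example value, pick your own)
desiredFile = files[i]
Ofile = os.path.join(directory, desiredFile)
xls_import = pd.read_excel(Ofile)  # use pd.read_csv instead if the file is a CSV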

Now you can use the power of pandas DataFrames!
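
Rather than indexing into the raw directory listing, it can help to keep only the Excel files first. A sketch using just the standard library; list_excel_files is a made-up name and the extension list is an assumption you may want to adjust:

```python
import os
import tempfile


def list_excel_files(directory, extensions=(".xls", ".xlsx")):
    """Return sorted full paths of files in `directory` with an Excel extension."""
    paths = []
    for entry in sorted(os.listdir(directory)):
        full_path = os.path.join(directory, entry)
        if os.path.isfile(full_path) and entry.lower().endswith(extensions):
            paths.append(full_path)
    return paths


# Demo on a temporary directory holding empty placeholder files.
demo_dir = tempfile.mkdtemp()
for file_name in ("b.xlsx", "a.xls", "notes.txt"):
    open(os.path.join(demo_dir, file_name), "w").close()

excel_files = list_excel_files(demo_dir)
print([os.path.basename(p) for p in excel_files])
```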

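This code worked for me with Python 3.5.2. It opens and saves a CSV file that Excel can read. I am currently working on how to save data into the file, but this is the code:

import csv
# Python 3 needs text mode with newline="" here; "wb" only works in Python 2
with open("file1.csv", "w", newline="") as f:
    excel = csv.writer(f)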
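
For the part about saving data into the file: csv.writer has writerow and writerows for that. A minimal sketch with made-up rows, written to a temporary directory so it does not clobber anything. Excel can open the resulting .csv, though it is not a real .xlsx workbook:

```python
import csv
import os
import tempfile

# Hypothetical data: a header row followed by two records.
data = [
    ["name", "age"],
    ["john", 31],
    ["mary", 28],
]

out_path = os.path.join(tempfile.mkdtemp(), "file1.csv")

# newline="" stops the csv module from writing blank lines on Windows.
with open(out_path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(data)

# Read it back to confirm what was written (values come back as strings).
with open(out_path, newline="") as f:
    written = list(csv.reader(f))

print(written)
```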

 

This may help:

This creates a node that takes a 2D list (a list of lists) and pushes it into the Excel spreadsheet. Make sure all of the IN[] inputs are present, or it will throw an exception.

This is a rewrite of the Revit Excel Dynamo node for Excel 2013, as the default prepackaged node kept breaking. I also have a similar read node. The Excel syntax in Python is touchy.

thnx @CodingNinja - updated : )

### Export Excel - intended to replace malfunctioning excel node

import clr

clr.AddReferenceByName('Microsoft.Office.Interop.Excel, Version=15.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c')
## AddReferenceGUID("{00020813-0000-0000-C000-000000000046}") ''Excel    C:\Program Files\Microsoft Office\Office15\EXCEL.EXE
## Need to verify interop for version 2015 is 15 and node attachment for it.
from Microsoft.Office.Interop import *  ## Excel
################################ Initialize FP and Sheet ID
## Same functionality as the excel node
strFileName = IN[0]     ## Filename
sheetName = IN[1]       ## Sheet
RowOffset = IN[2]       ## Row offset
ColOffset = IN[3]       ## Column offset
Data = IN[4]            ## Data
Overwrite = IN[5]       ## Check for auto-overwrite
XLVisible = False       # IN[6]  ## XL visible for operation or not?


RowOffset = 0
if IN[2] > 0:
    RowOffset = IN[2]   ## Row offset

ColOffset = 0
if IN[3] > 0:
    ColOffset = IN[3]   ## Column offset

if IN[6] != False:
    XLVisible = True    # IN[6]  ## XL visible for operation or not?


################################ Initialize FP and Sheet ID
xlCellTypeLastCell = 11                 ##### Define special cells value constant
################################
xls = Excel.ApplicationClass()          #### Connect with application
xls.Visible = XLVisible                 ## Visible yes/no
xls.DisplayAlerts = False               ### Alerts

import os.path

if os.path.isfile(strFileName):
    wb = xls.Workbooks.Open(strFileName, False)     #### Open the file
else:
    wb = xls.Workbooks.Add()                        #### No such file yet: create a workbook
    wb.SaveAs(strFileName)
wb.application.visible = XLVisible      #### Show Excel
try:
    ws = wb.Worksheets(sheetName)       #### Get the sheet in the WB base
except:
    ws = wb.sheets.add()                #### If it doesn't exist, add it. Use () for object method
    ws.Name = sheetName






#################################
# lastRow for iterating rows
lastRow = ws.UsedRange.SpecialCells(xlCellTypeLastCell).Row
# lastCol for iterating columns
lastCol = ws.UsedRange.SpecialCells(xlCellTypeLastCell).Column
#######################################################################
out = []                                ### Message gathering

c = 0
r = 0
val = ""
if Overwrite == False:                  #### Look ahead for non-empty cells to throw error
    for r, row in enumerate(Data):      #### Base 0: each row of data in the 2D array
        for c, col in enumerate(row):   #### Base 0: each column in each row is a cell with data
            # Check the target cell in the sheet, not the data item itself
            if ws.Cells[r+1+RowOffset, c+1+ColOffset].Value2 is not None:
                OUT = "ERROR- Cannot overwrite"
                raise ValueError("ERROR- Cannot overwrite")
                ## out.append(Data[0]) ## append message for error
############################################################################

for r, row in enumerate(Data):          #### Base 0: each row of data in the 2D array
    for c, col in enumerate(row):       #### Base 0: each column in each row is a cell with data
        ws.Cells[r+1+RowOffset, c+1+ColOffset].Value2 = col.__str__()

## Run macro disabled for debugging excel macro
## xls.Application.Run("Align_data_and_Highlight_Issues")
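
The offset and overwrite logic is easier to follow without the Excel interop in the way. Here is a plain-Python sketch of the same idea, with a dict keyed by 1-based (row, column) standing in for the worksheet. write_block is a hypothetical name, not part of the node above, and it runs in ordinary Python rather than Dynamo:

```python
def write_block(cells, data, row_offset=0, col_offset=0, overwrite=False):
    """Write a 2D list into `cells`, a dict keyed by 1-based (row, column).

    With overwrite=False, every target cell is checked first and a
    ValueError is raised before anything is written.
    """
    targets = [
        ((r + 1 + row_offset, c + 1 + col_offset), value)
        for r, row in enumerate(data)
        for c, value in enumerate(row)
    ]
    if not overwrite:
        for position, _ in targets:
            if cells.get(position) not in (None, ""):
                raise ValueError("ERROR- Cannot overwrite %r" % (position,))
    for position, value in targets:
        cells[position] = str(value)  # the node also stores values as strings
    return cells


sheet = {(1, 1): "existing"}
write_block(sheet, [[1, 2], [3, 4]], row_offset=1, col_offset=0)
print(sorted(sheet.items()))
```

Checking all targets before writing means a refused write leaves the sheet untouched, which is the point of the look-ahead loop.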