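Best method for reading newline-delimited files and discarding the newlines?

I am trying to determine the best way to get rid of newlines when reading newline-delimited files in Python.

What I've come up with is the following code, including throwaway code to test it.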

import os


def getfile(filename, results):
    # Read every line, strip its trailing newline, and append it to results
    f = open(filename)
    filecontents = f.readlines()
    for line in filecontents:
        foo = line.strip('\n')
        results.append(foo)
    return results


blahblah = []


getfile('/tmp/foo', blahblah)


for x in blahblah:
    print(x)

Read the whole file and split it into lines:

lines = open(filename).read().splitlines()

Loop over the file and strip the newline from each line:

for line in open('/tmp/foo'):
    print(line.strip('\n'))
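
For the read().splitlines() one-liner, here is a small sketch of how it differs from a naive split('\n'). It uses Python 3 syntax, and the sample string is made up for illustration rather than taken from a real file:

```python
# Hypothetical file contents: mixed line endings and a trailing newline.
text = "alpha\nbeta\r\ngamma\n"

# splitlines() drops the line terminators and adds no trailing empty string.
via_splitlines = text.splitlines()

# split('\n') leaves the '\r' in place and yields an extra '' after the last newline.
via_split = text.split("\n")

print(via_splitlines)  # ['alpha', 'beta', 'gamma']
print(via_split)       # ['alpha', 'beta\r', 'gamma', '']
```

Keep in mind that read() pulls the entire file into memory, and that splitlines() on a text string also splits on a few less common boundaries (form feed, vertical tab, and some Unicode line separators), which may or may not be what you want.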

I'd do it like this:

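f = open('test.txt')
# Keep only non-blank lines, with surrounding whitespace (including the newline) removed
lines = [line.strip() for line in f if line.strip()]
f.close()
print(lines)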

Here's a generator that does what you requested. In this case, using rstrip is sufficient and slightly faster than strip.

lines = (line.rstrip('\n') for line in open(filename))

However, you'll most likely want to strip trailing whitespace as well, in which case call rstrip with no argument.

lines = (line.rstrip() for line in open(filename))
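
To see what each variant actually removes, here is a quick sketch in Python 3 syntax. The sample line is invented for the demonstration:

```python
# Hypothetical sample line: leading spaces, a trailing tab and space, then a newline.
line = "  value\t \n"

only_newline = line.rstrip("\n")  # removes trailing '\n' characters only
trailing_ws = line.rstrip()       # removes all trailing whitespace
both_ends = line.strip()          # removes whitespace on both ends

print(repr(only_newline))  # '  value\t '
print(repr(trailing_ws))   # '  value'
print(repr(both_ends))     # 'value'
```

If leading indentation or trailing spaces carry meaning in your data, rstrip('\n') is the safe choice; otherwise rstrip() or strip() saves a later cleanup step.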

I use this

def cleaned( aFile ):
    for line in aFile:
        yield line.strip()

Then I can do things like this.

lines = list( cleaned( open("file","r") ) )

Or, I can extend cleaned with extra functions to, for example, drop blank lines or skip comment lines or whatever.
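
As a rough sketch of what such extensions could look like (Python 3 syntax): the helper names non_blank and non_comment, the '#' comment marker, and the sample data are all assumptions made up for this example, and io.StringIO stands in for a real file.

```python
import io


def cleaned(a_file):
    """Yield each line with surrounding whitespace removed."""
    for line in a_file:
        yield line.strip()


def non_blank(lines):
    """Hypothetical helper: skip lines that are empty."""
    return (line for line in lines if line)


def non_comment(lines, marker="#"):
    """Hypothetical helper: skip lines that start with the comment marker."""
    return (line for line in lines if not line.startswith(marker))


# io.StringIO stands in for an open file object here.
sample = io.StringIO("# settings\n\nname = foo\n  size = 3  \n")
result = list(non_comment(non_blank(cleaned(sample))))
print(result)  # ['name = foo', 'size = 3']
```

Each stage is itself a generator, so the stages can be stacked in any order and nothing is read until the final list() call consumes them.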

Just use generator expressions:

blahblah = (l.rstrip() for l in open(filename))
for x in blahblah:
    print(x)

I'd also advise against reading the whole file into memory: looping over a generator handles one line at a time, which is much more memory-efficient on big datasets.
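
A small sketch of that laziness, in Python 3 syntax. The io.StringIO object is a stand-in for a large file, and its contents are made up:

```python
import io
from itertools import islice

# io.StringIO stands in for a (potentially huge) open file.
source = io.StringIO("one\ntwo\nthree\nfour\n")

# Nothing is read yet; the generator pulls lines only when asked.
lines = (l.rstrip() for l in source)

first_two = list(islice(lines, 2))  # consumes exactly two lines
third = next(lines)                 # picks up where islice stopped

print(first_two)  # ['one', 'two']
print(third)      # three
```

Only the lines you actually consume are ever held in memory, so stopping early costs nothing extra.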

What do you think about this approach?

with open(filename) as data:
    datalines = (line.rstrip('\r\n') for line in data)
    for line in datalines:
        ...do something awesome...

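The generator expression avoids loading the whole file into memory, and the with statement ensures the file gets closed. Note that the loop has to stay inside the with block, because the generator reads from the file lazily.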
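
If you want this pattern in reusable form, one possible sketch (Python 3 syntax) is a generator function. The name iter_lines and the temporary demo file are assumptions for illustration only:

```python
import os
import tempfile


def iter_lines(path):
    """Hypothetical helper: yield the lines of path without their line endings."""
    with open(path) as data:
        for line in data:
            yield line.rstrip("\r\n")


# Write a small throwaway file to demonstrate.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\n\nlast")
    tmp_path = tmp.name

collected = list(iter_lines(tmp_path))
os.remove(tmp_path)
print(collected)  # ['first', 'second', '', 'last']
```

The file is closed once the generator is exhausted or closed. In Python 3 text mode, open() already translates '\r\n' to '\n' by default, so stripping '\r' here is mostly a defensive measure.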