CSV new-line character seen in unquoted field error

First, try running dos2unix on the files you imported from Windows.

Best answer

It would help to see the CSV file itself, but this might work for you. Try replacing:

file_read = csv.reader(self.file)

with:

file_read = csv.reader(self.file, dialect=csv.excel_tab)

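Or open the file in universal newline mode and pass it to csv.reader, as shown next. Note that the 'U' mode flag is a Python 2 idiom and was removed in Python 3.11; on Python 3, open the file with newline='' instead.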

reader = csv.reader(open(self.file, 'rU'), dialect=csv.excel_tab)

Or use splitlines(), like this:

def read_file(self):
    with open(self.file, 'r') as f:
        data = [row for row in csv.reader(f.read().splitlines())]
    return data
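To illustrate why the splitlines() approach works, here is a minimal, self-contained sketch that uses an in-memory string rather than a file. The function name `parse_csv_text` and the sample data are made up for illustration. One caveat: splitlines() also splits quoted fields that contain embedded line breaks, so this only suits files without multi-line fields.

```python
import csv


def parse_csv_text(text):
    """Parse CSV text whose rows may end in CR, LF, or CRLF."""
    # str.splitlines() splits on all of these terminators and drops them,
    # so csv.reader never sees a stray '\r' inside an unquoted field.
    return list(csv.reader(text.splitlines()))


# Hypothetical sample using old Mac-style CR-only line endings.
old_mac_text = "name,qty\rapple,3\rpear,5\r"
for row in parse_csv_text(old_mac_text):
    print(row)
```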

I realize this is an old post, but I ran into the same problem and did not see the correct answer, so I will give it a try.

Python error:

_csv.Error: new-line character seen in unquoted field

This is caused by trying to read Macintosh (pre-OS X format) CSV files. These are text files that use CR as the end-of-line character. If you are using MS Office, make sure you select either the plain CSV format or CSV (MS-DOS). Do not use CSV (Macintosh) as the save-as type.

My preferred EOL style is LF (Unix/Linux/Apple), but I don't think MS Office offers an option to save in that format.
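If you are unsure which line-ending convention a file uses, a quick check along these lines may help. This is only a sketch: `count_line_endings` is a hypothetical helper, and in practice you would pass it the bytes read from the file opened in binary mode ('rb').

```python
def count_line_endings(data):
    """Count CRLF, bare CR, and bare LF terminators in a bytes object."""
    crlf = data.count(b"\r\n")
    return {
        "CRLF": crlf,
        # Every CRLF contains one CR and one LF, so subtract those out.
        "CR": data.count(b"\r") - crlf,
        "LF": data.count(b"\n") - crlf,
    }


# Hypothetical sample: a file saved as "CSV (Macintosh)" has only bare CRs.
sample = b"a,b\r1,2\r3,4\r"
print(count_line_endings(sample))
```

A file that reports only bare CR terminators is the kind that triggers the error described above.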

On Mac OS X, save the CSV file in the "Windows Comma Separated (.csv)" format.

If this happens to you (as it did to me):

Save the file as CSV (MS-DOS Comma-Separated)

Then run the following script:

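import csv
# The original Python 2 code opened the file with mode 'rU' (universal newlines).
# That mode was removed in Python 3.11; newline='' is what the csv module expects on Python 3.
with open(csv_filename, newline='') as csvfile:
    csvreader = csv.reader(csvfile)
    for row in csvreader:
        print(', '.join(row))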

This worked for me on OS X.

import csv

# allow a string variable to be read like a file
from io import StringIO

# library to transliterate other strange (accented) characters to plain ASCII
from unidecode import unidecode

# cleanse an input file with Windows formatting into a plain text string
with open(filename, 'rb') as fID:
    uncleansedBytes = fID.read()
    # decode the file using the correct encoding scheme
    # (probably this old Windows one)
    uncleansedText = uncleansedBytes.decode('Windows-1252')

# replace carriage returns with newlines
cleansedText = uncleansedText.replace('\r', '\n')

# transliterate any remaining non-ASCII characters to ASCII
# (computed here but not used below; pass it to StringIO instead of cleansedText if you want it)
asciiText = unidecode(cleansedText)

# read each line of the CSV file as a dict,
# using the first line as the field names for each dict
reader = csv.DictReader(StringIO(cleansedText))
for line_entry in reader:
    # do something with your read data
    pass

I ran into this error too. I had saved the .csv file on Mac OS X.

Saving it as "Windows Comma Separated Values (.csv)" instead resolved the problem.

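I know this question was answered a long time ago, but none of the answers solved my problem. Because of some other complications, I am reading the CSV with DictReader and StringIO. I was able to solve the problem more simply by explicitly replacing the line terminators: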

import urllib.request

with urllib.request.urlopen(q) as response:
    raw_data = response.read()
    encoding = response.info().get_content_charset('utf8')
    data = raw_data.decode(encoding)
    if '\r\n' not in data:
        # no CRLF present, so any bare CRs are probably the line endings; convert them
        data = data.replace('\r', '\r\n')

This may not be sensible for huge CSV files, but it worked well for my use case.
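For larger inputs, one possible alternative (not part of the answer above) is to avoid loading the whole body into memory and let io.TextIOWrapper translate line endings while streaming. The sketch below assumes Python 3; `iter_csv_rows` is a hypothetical helper, and io.BytesIO stands in for the HTTP response object.

```python
import csv
import io


def iter_csv_rows(binary_stream, encoding="utf-8"):
    """Yield one dict per CSV row from a binary stream, whatever its line endings."""
    # newline=None turns on universal newlines: '\r', '\n' and '\r\n'
    # are all translated to '\n' as the text is decoded.
    text_stream = io.TextIOWrapper(binary_stream, encoding=encoding, newline=None)
    for row in csv.DictReader(text_stream):
        yield row


# Hypothetical stand-in for a response body with CR-only line endings.
fake_response = io.BytesIO(b"name,qty\rapple,3\rpear,5\r")
for row in iter_csv_rows(fake_response):
    print(row["name"], row["qty"])
```

Because universal newline translation also rewrites line breaks inside quoted fields, this trades exact preservation of multi-line fields for lower memory use.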

An alternative, quick solution: I faced the same error. I reopened the "weird" CSV file in Gnumeric on my Lubuntu machine and exported it as a CSV file again. This fixed the problem.