CSV new-line character seen in unquoted field error

The following code worked until today, when I imported a file from a Windows machine and got this error:

new-line character seen in unquoted field - do you need to open the file in universal-newline mode?

import csv


class CSV:

    def __init__(self, file=None):
        self.file = file

    def read_file(self):
        data = []
        file_read = csv.reader(self.file)
        for row in file_read:
            data.append(row)
        return data

    def get_row_count(self):
        return len(self.read_file())

    def get_column_count(self):
        new_data = self.read_file()
        return len(new_data[0])

    def get_data(self, rows=1):
        data = self.read_file()
        return data[:rows]

How can I fix this issue?

def upload_configurator(request, id=None):
    """
    A view that allows the user to configure the uploaded CSV.
    """
    upload = Upload.objects.get(id=id)
    csvobject = CSV(upload.filepath)

    upload.num_records = csvobject.get_row_count()
    upload.num_columns = csvobject.get_column_count()
    upload.save()

    form = ConfiguratorForm()

    row_count = csvobject.get_row_count()
    colum_count = csvobject.get_column_count()
    first_row = csvobject.get_data(rows=1)
    first_two_rows = csvobject.get_data(rows=5)

First, try running dos2unix on the imported Windows file.
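If dos2unix is not installed, the same kind of cleanup can be roughed out in Python. This is only a sketch: `normalize_newlines` and `convert_file` are names made up for illustration, and it assumes the file fits in memory and has no quoted fields with embedded line breaks (those would be rewritten too). Unlike dos2unix in its default mode, as far as I know, it also converts bare CR line endings.

```python
def normalize_newlines(data: bytes) -> bytes:
    """Convert CRLF (Windows) and bare CR (classic Mac) line endings to LF."""
    # Replace CRLF first so that it does not end up as two LFs.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def convert_file(src_path: str, dst_path: str) -> None:
    """Write a copy of src_path with LF-only line endings (hypothetical helper)."""
    with open(src_path, "rb") as src:
        cleaned = normalize_newlines(src.read())
    with open(dst_path, "wb") as dst:
        dst.write(cleaned)
```

Working on bytes keeps the sketch independent of the file's text encoding, as long as that encoding is ASCII-compatible.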

It would be good to see the CSV file itself, but this might work for you. Try replacing:

file_read = csv.reader(self.file)

with:

file_read = csv.reader(self.file, dialect=csv.excel_tab)

Or, open the file in universal newline mode and pass it to csv.reader, like this:

reader = csv.reader(open(self.file, 'rU'), dialect=csv.excel_tab)

Or, use splitlines(), like this:

def read_file(self):
    with open(self.file, 'r') as f:
        data = [row for row in csv.reader(f.read().splitlines())]
    return data
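A note for Python 3 readers: the 'U' flag used in the universal newline suggestion above was deprecated in Python 3 and removed in 3.11. The csv documentation recommends opening the file with newline='' instead, which leaves the line endings for the csv module to interpret. A minimal sketch, assuming `path` is a filesystem path and the file is in the platform's default encoding (`read_rows` is just an illustrative name):

```python
import csv


def read_rows(path):
    """Return all rows of the CSV file at `path` as a list of lists."""
    # newline="" passes line endings through untranslated; the csv reader
    # itself accepts "\r", "\n" and "\r\n" as row terminators.
    with open(path, newline="") as f:
        return list(csv.reader(f))
```

Inside the class from the question, the same `open(..., newline="")` call could replace the bare `csv.reader(self.file)`, provided `self.file` really is a path rather than an already opened file object.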

I realize this is an old post, but I ran into the same problem and didn't see the correct answer, so I'll give it a try.

Python error:

_csv.Error: new-line character seen in unquoted field

This is caused by trying to read Macintosh (pre-OS X formatted) CSV files. These are text files that use CR as the end-of-line character. If you are using MS Office, make sure you select either the plain CSV format or CSV (MS-DOS). Do not use CSV (Macintosh) as the save-as type.

My preferred EOL style is LF (Unix/Linux/Apple), but I don't think MS Office offers an option to save in that format.

On Mac OS X, save the CSV file in the "Windows Comma Separated (.csv)" format.
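To check which of these line-ending styles a particular file actually uses before re-saving it, a small diagnostic along these lines may help. It is only a sketch: `count_line_endings` is a made-up helper name, and it reads the whole file into memory.

```python
def count_line_endings(path):
    """Count CRLF, bare CR and bare LF line endings in the file at `path`."""
    with open(path, "rb") as f:
        data = f.read()
    crlf = data.count(b"\r\n")
    # Every CRLF pair contains one CR and one LF, so subtract the pairs
    # to get the number of bare occurrences.
    cr = data.count(b"\r") - crlf
    lf = data.count(b"\n") - crlf
    return {"CRLF": crlf, "CR": cr, "LF": lf}
```

A file saved as CSV (Macintosh) should show a non-zero "CR" count and zeros elsewhere, which is the case that triggers the error.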

If this happens to you (as it did to me):

  1. Save the file as CSV (MS-DOS Comma-Separated)
  2. Run the following script

    import csv

    with open(csv_filename, 'rU') as csvfile:
        csvreader = csv.reader(csvfile)
        for row in csvreader:
            print(', '.join(row))
    

This worked for me on OS X.

import csv

# allow a string variable to be opened like a file
from io import StringIO

# library to transliterate other strange (accented) characters to plain ASCII
from unidecode import unidecode

# cleanse an input file with Windows formatting into a plain text string
with open(filename, 'rb') as fID:
    uncleansedBytes = fID.read()
    # decode the file using the correct encoding scheme
    # (probably this old Windows one)
    uncleansedText = uncleansedBytes.decode('Windows-1252')

# replace carriage returns with newlines
cleansedText = uncleansedText.replace('\r', '\n')

# transliterate any remaining non-ASCII characters to ASCII
asciiText = unidecode(cleansedText)

# read each line of the csv file and store as an array of dicts,
# use first line as field names for each dict.
reader = csv.DictReader(StringIO(asciiText))
for line_entry in reader:
    # do something with your read data
    pass

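This is an error I ran into. I had saved the .csv file on Mac OS X.

When saving, I chose "Windows Comma Separated Values (.csv)", which resolved the issue.

I know this question was answered a long time ago, but none of the answers solved my problem. Because of some other complications, I am using DictReader and StringIO to read the CSV. I was able to solve the problem more simply by explicitly replacing the line delimiters: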

import urllib.request

with urllib.request.urlopen(q) as response:
    raw_data = response.read()
    encoding = response.info().get_content_charset('utf8')
    data = raw_data.decode(encoding)
    if '\r\n' not in data:
        # no CRLF found; convert any bare CR line endings to CRLF
        data = data.replace('\r', '\r\n')

This may not be reasonable for huge CSV files, but it worked well for my use case.
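For larger inputs, one possible way to avoid loading and rewriting the whole body is to wrap the binary stream in io.TextIOWrapper with newline='' and let the csv module deal with the line endings row by row. This is a sketch under assumptions rather than a drop-in fix: `iter_csv_dicts` is a hypothetical helper, and the example feeds it an in-memory io.BytesIO instead of a real HTTP response.

```python
import csv
import io


def iter_csv_dicts(binary_stream, encoding="utf-8"):
    """Yield one dict per CSV row from a binary file-like object."""
    # newline="" keeps the original line endings, so the csv reader can
    # treat "\r", "\n" and "\r\n" as row terminators itself.
    text = io.TextIOWrapper(binary_stream, encoding=encoding, newline="")
    yield from csv.DictReader(text)


# Example with CR-only line endings, standing in for a downloaded file.
sample = io.BytesIO(b"name,qty\rapple,3\rpear,5\r")
rows = list(iter_csv_dicts(sample))
```

The object returned by urllib.request.urlopen is a binary file-like object, so it should be usable in place of the BytesIO stand-in, though that is not exercised here.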

An alternative, quick solution: I faced the same error. On my Lubuntu machine, I reopened the "weird" CSV file in Gnumeric and exported it as a CSV file again. This corrected the problem.