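Is there a built-in way to use read_csv to read only the first n lines of a file without knowing the length of the lines ahead of time? I have a large file that takes a long time to read, and occasionally I only want the first, say, 20 lines to get a sample of it (and would rather not load the whole thing and then take its head).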
If I knew the total number of lines I could do something like footer_lines = total_lines - n and pass this to the skipfooter keyword arg. My current solution is to manually grab the first n lines with python and StringIO it to pandas:
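import pandas as pd
from io import StringIO  # Python 3; on Python 2 this was "from StringIO import StringIO"
from itertools import islice

n = 20
with open('big_file.csv', 'r') as f:
    # islice yields the first n lines; f.readlines(n) would treat n as a byte-size hint
    head = ''.join(islice(f, n))
df = pd.read_csv(StringIO(head))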
It's not that bad, but is there a more concise, 'pandasic' (?) way to do it with keywords or something?
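For reference, read_csv accepts an nrows keyword that stops parsing after the given number of rows, which seems to be the kind of keyword being asked about. A minimal sketch, assuming the file has a header line; the helper name read_head is hypothetical and only there for illustration:

```python
import pandas as pd


def read_head(path, n=20):
    """Return only the first n data rows of a CSV file (hypothetical helper)."""
    # nrows counts data rows, so the header line is not included in n.
    return pd.read_csv(path, nrows=n)
```

Calling read_head('big_file.csv', 20) should return a 20-row DataFrame without parsing the rest of the file. Note the small difference from the manual approach above: there the header line counts as one of the n lines, whereas nrows counts data rows only.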