Reading from a frequently updated file

Answer

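I'm no expert on this, but I think you will have to use some kind of observer pattern to passively watch the file and then fire an event that reopens it when a change occurs. As for how to actually implement this, I have no idea.

I don't think open() will read the file in real time the way you suggest.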
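One way the observer idea above could be approximated with only the standard library is to poll the file's metadata and call a callback when it changes. This is a minimal sketch, not a real event-driven watcher: the names `snapshot`, `watch`, `on_change`, `interval` and `max_checks` are all made up for illustration, and `max_checks` exists only so the loop can be stopped.

```python
import os
import time


def snapshot(path):
    """Return (size, mtime_ns) for path, or None if the file does not exist."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return None
    return (st.st_size, st.st_mtime_ns)


def watch(path, on_change, interval=0.5, max_checks=None):
    """Poll path and call on_change(path) each time its snapshot changes.

    max_checks is a hypothetical knob that ends the loop after that many
    polls; leave it as None to keep watching forever.
    """
    last = snapshot(path)
    checks = 0
    while max_checks is None or checks < max_checks:
        time.sleep(interval)
        current = snapshot(path)
        if current != last:
            on_change(path)
            last = current
        checks += 1
```

Something like `watch("/tmp/workfile", print)` would then print the path every time the file grows, shrinks, or is touched. Polling trades latency for simplicity; a notification API such as inotify avoids the sleep.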

Answer

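Since you're targeting a Linux system, you can use Pyinotify to notify you when the file changes.

There's also this trick, which may work well for you. It uses file.seek to do what tail -f does.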

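Best answer

I would recommend looking at David Beazley's Generator Tricks for Python, especially Part 5: Processing Infinite Data. It handles the Python equivalent of a tail -f logfile command in real time.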

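# follow.py
#
# Follow a file like tail -f.

import os
import time


def follow(thefile):
    # Start at the end of the file so only newly appended lines are yielded.
    thefile.seek(0, os.SEEK_END)
    while True:
        line = thefile.readline()
        if not line:
            # Nothing new yet: wait briefly, then poll again.
            time.sleep(0.1)
            continue
        yield line


if __name__ == '__main__':
    logfile = open("run/foo/access-log", "r")
    loglines = follow(logfile)
    for line in loglines:
        # Each line already ends with a newline.
        print(line, end="")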
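Because the follower is a generator, it can feed other generators, which is the pipeline style the linked material is about. The sketch below is only an illustration: `grep` is a hypothetical helper, and this variant of `follow` adds two assumed parameters, `poll` and `idle_timeout`, so that the loop can end once the file has been quiet for a while.

```python
import os
import time


def follow(thefile, poll=0.1, idle_timeout=None):
    """Yield lines appended to thefile, like tail -f.

    idle_timeout (an assumption, not part of the original) stops the
    generator after roughly that many seconds without new data.
    """
    thefile.seek(0, os.SEEK_END)
    idle = 0.0
    while idle_timeout is None or idle < idle_timeout:
        line = thefile.readline()
        if not line:
            time.sleep(poll)
            idle += poll
            continue
        idle = 0.0
        yield line


def grep(pattern, lines):
    """Pass through only the lines that contain pattern."""
    for line in lines:
        if pattern in line:
            yield line
```

With these, `for line in grep("GET", follow(logfile)): ...` processes matching lines as they arrive, without ever loading the whole log.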

Answer

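If you have the code that reads the file running in a while loop:

f = open('/tmp/workfile', 'r')
while True:
    # readline() returns '' at end of file, so the loop simply keeps polling.
    line = f.readline()
    if line.find("ONE") != -1:
        print("Got it")

and you are writing to that same file (in append mode) from another program, then as soon as "ONE" is appended to the file you will get the print. You can take whatever action you want at that point. In short, you don't have to reopen the file at regular intervals.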

>>> f = open('/tmp/workfile', 'a')
>>> f.write("One\n")
>>> f.close()
>>> f = open('/tmp/workfile', 'a')
>>> f.write("ONE\n")
>>> f.close()

Answer

"An interactive session is worth 1000 words"

>>> f1 = open("bla.txt", "wt")
>>> f2 = open("bla.txt", "rt")
>>> f1.write("bleh")
>>> f2.read()
''
>>> f1.flush()
>>> f2.read()
'bleh'
>>> f1.write("blargh")
>>> f1.flush()
>>> f2.read()
'blargh'

In other words: yes, a single "open" will do.
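The same experiment can be written as a small script. This is a sketch that assumes default buffering on a regular file; `demo_shared_file` and the temporary directory are made up for illustration.

```python
import os
import tempfile


def demo_shared_file():
    """Write through one handle and read through another, flushing in between."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "bla.txt")
        with open(path, "w") as writer, open(path, "r") as reader:
            writer.write("bleh")
            before_flush = reader.read()   # data is still in the writer's buffer
            writer.flush()
            after_flush = reader.read()    # now visible to the reader
            writer.write("blargh")
            writer.flush()
            second = reader.read()         # only the newly appended part
    return before_flush, after_flush, second
```

The point to notice is that the reader keeps its position between calls, so each read() returns only what was flushed since the previous one.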

Answer

Here is a slightly modified version of Jeff Bauer's answer which is resistant to file truncation. It is very useful if your file is being processed by logrotate.

import os
import time


def follow(name):
    current = open(name, "r")
    curino = os.fstat(current.fileno()).st_ino
    while True:
        # Drain everything that is currently available.
        while True:
            line = current.readline()
            if not line:
                break
            yield line

        try:
            # A different inode means the file was replaced: reopen it.
            if os.stat(name).st_ino != curino:
                new = open(name, "r")
                current.close()
                current = new
                curino = os.fstat(current.fileno()).st_ino
                continue
        except IOError:
            pass
        time.sleep(1)


if __name__ == '__main__':
    fname = "test.log"
    for l in follow(fname):
        print("LINE: {}".format(l))
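The code above reacts when the path starts pointing at a different inode, which is what happens when a log is rotated by renaming. If the file is instead emptied in place (logrotate's copytruncate option does this), the inode stays the same. A rough way to cover that case, sketched here under the assumption that the file is opened in binary mode so that tell() is a byte offset, is to compare the current size with the read position. `read_new_lines` is a hypothetical helper.

```python
import os


def read_new_lines(handle):
    """Return lines added since the previous call on a file opened with 'rb'.

    If the file is now shorter than our read position, assume it was
    truncated in place and start again from the beginning.
    """
    size = os.fstat(handle.fileno()).st_size
    if size < handle.tell():
        handle.seek(0)
    return [raw.decode("utf-8", errors="replace") for raw in handle.readlines()]
```

One limitation: if the file is truncated and then grows past the old position before the next call, the size check cannot tell, and the start of the new content is skipped.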

Answer

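I had a similar use case and wrote the following snippet for it. Some may argue that this is not the most optimal way to do it, but it gets the job done and is easy to understand.

import time


def reading_log_files(filename):
    # Read the whole file and return its lines without line endings.
    with open(filename, "r") as f:
        data = f.read().splitlines()
    return data


def log_generator(filename, period=1):
    data = reading_log_files(filename)
    while True:
        time.sleep(period)
        new_data = reading_log_files(filename)
        # Yield only the lines that were not there on the previous pass.
        yield new_data[len(data):]
        data = new_data


if __name__ == '__main__':
    x = log_generator("/path/to/log/file.log")
    for lines in x:
        print(lines)
        # lines will be a list of new lines added at the end

Hope you find this useful.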
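Re-reading the whole file on every pass gets slower as the log grows. A possible refinement, sketched here as an assumption rather than as part of the answer above, is to remember a byte offset and read only what lies beyond it. `read_appended` is a hypothetical helper; it assumes UTF-8 text and only consumes complete lines.

```python
def read_appended(filename, offset=0):
    """Return (new_lines, new_offset) for data found after byte offset."""
    with open(filename, "rb") as f:
        f.seek(offset)
        chunk = f.read()
    # Stop at the last newline so a half-written line is left for next time.
    end = chunk.rfind(b"\n") + 1
    lines = chunk[:end].decode("utf-8").splitlines()
    return lines, offset + end
```

A caller would keep the returned offset and pass it back in on the next poll, for example `lines, offset = read_appended(path, offset)` inside a loop with a sleep.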

Answer

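It depends on what exactly you want to do with the file. There are two potential use cases:

1. Reading appended content from a continuously updated file, such as a log file.
2. Reading content from a file that is continuously overwritten (such as the network statistics files on *nix systems).

As others have already explained in detail how to handle scenario #1, I would like to help those who need scenario #2. Basically, you need to reset the file pointer to 0 using seek(0) (or to whatever position you want to read from) before calling read() for the (n+1)th time.

Your code could look something like the function below.

import time


def generate_network_statistics(iface='wlan0'):
    with open('/sys/class/net/' + iface + '/statistics/' + 'rx' + '_bytes', 'r') as rx:
        with open('/sys/class/net/' + iface + '/statistics/' + 'tx' + '_bytes', 'r') as tx:
            with open('/proc/uptime', 'r') as uptime:
                while True:
                    receive = int(rx.read())
                    rx.seek(0)
                    transmit = int(tx.read())
                    tx.seek(0)
                    # /proc/uptime holds two floats; the first is the uptime in seconds.
                    uptime_seconds = float(uptime.read().split()[0])
                    uptime.seek(0)
                    print("Receive: %i, Transmit: %i" % (receive, transmit))
                    time.sleep(1)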
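The rewind-then-read step can also be isolated in a small helper, which makes it easy to try on any file rather than only on /sys entries. This is a sketch: `read_counter` is a hypothetical name, and it assumes the file's entire content is a single integer.

```python
def read_counter(handle):
    """Rewind an already open text file and parse its content as an integer."""
    handle.seek(0)
    return int(handle.read())
```

Calling it twice on the same handle, one second apart, and subtracting the results gives a rough bytes-per-second figure for counters such as rx_bytes.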

Answer

Keep the file handle open even when an empty string is returned at the end of the file, and try reading from it again after some sleep time.

import time

syslog = '/var/log/syslog'
sleep_time_in_seconds = 1

try:
    with open(syslog, 'r', errors='ignore') as f:
        while True:
            # Iterating stops at end of file; the next pass picks up new lines.
            for line in f:
                if line:
                    print(line.strip())
                    # do whatever you want to do on the line
            time.sleep(sleep_time_in_seconds)
except IOError as e:
    print('Cannot open the file {}. Error: {}'.format(syslog, e))