Saving and loading multiple objects in a pickle file?

小开

Best answer

Using a list, tuple, or dict is by far the most common way to do this:

import pickle

PIK = "pickle.dat"

data = ["A", "b", "C", "d"]
with open(PIK, "wb") as f:
    pickle.dump(data, f)
with open(PIK, "rb") as f:
    print(pickle.load(f))

That prints:

['A', 'b', 'C', 'd']

However, a pickle file can contain any number of pickles. Here's code producing the same output. But note that it's harder to write and to understand:

with open(PIK, "wb") as f:
    # Write the item count first so the reader knows how many pickles follow.
    pickle.dump(len(data), f)
    for value in data:
        pickle.dump(value, f)
data2 = []
with open(PIK, "rb") as f:
    for _ in range(pickle.load(f)):
        data2.append(pickle.load(f))
print(data2)

If you do this, you're responsible for knowing how many pickles are in the file you write out. The code above does that by pickling the number of list objects first.
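
The count-prefix approach above can be wrapped in two small helpers. This is only a sketch: `dump_counted`, `load_counted`, and the temporary file path are hypothetical names chosen for illustration, not part of the `pickle` module.

```python
import os
import pickle
import tempfile


def dump_counted(path, items):
    """Write the number of items first, then one pickle per item."""
    items = list(items)
    with open(path, "wb") as f:
        pickle.dump(len(items), f)
        for item in items:
            pickle.dump(item, f)


def load_counted(path):
    """Read the leading count, then load exactly that many pickles."""
    with open(path, "rb") as f:
        count = pickle.load(f)
        return [pickle.load(f) for _ in range(count)]


# Hypothetical demo file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "counted.pkl")
dump_counted(path, ["A", "b", "C", "d"])
restored = load_counted(path)
print(restored)
```

One consequence of this layout is that items cannot simply be appended later, because the count at the start of the file would then be stale.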

小开

Two additions to Tim Peters' accepted answer.

First, you need not store the number of items you pickled separately if you stop loading when you hit the end of the file:

def loadall(filename):
    with open(filename, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                break


items = loadall(myfilename)

This assumes the file contains only pickles; if there's anything else in there, the generator will try to treat whatever else is in there as pickles too, which could be dangerous.
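
If the file might not come from a trusted source, one possible mitigation is the technique the pickle documentation describes for restricting globals: subclass `pickle.Unpickler` and override `find_class`. The sketch below combines that with the end-of-file loop. The names `SAFE_BUILTINS`, `RestrictedUnpickler`, and `loadall_restricted` are hypothetical, and the allow-list is only an example; this reduces the risk but does not make loading arbitrary untrusted data safe.

```python
import builtins
import pickle

# Hypothetical allow-list: the only globals a pickle may reference.
SAFE_BUILTINS = {"list", "dict", "set", "frozenset", "tuple", "range", "complex"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever a pickle refers to a global (class or function).
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name)
        )


def loadall_restricted(filename):
    """Yield each pickle in the file, refusing globals not on the allow-list."""
    with open(filename, "rb") as f:
        while True:
            try:
                # A fresh unpickler per object, as pickle.load() does internally.
                yield RestrictedUnpickler(f).load()
            except EOFError:
                break
```

Plain containers, strings, and numbers load as usual; a pickle that references any other function or class raises `pickle.UnpicklingError` instead of importing it.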

Second, this way, you do not get a list but rather a generator. This will load only one item into memory at a time, which is useful if the dumped data is very large -- one possible reason why you may have wanted to pickle multiple items separately in the first place. You can still iterate over items with a for loop as if it were a list.
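
As a rough illustration of the lazy behaviour described above, the sketch below writes a few records to a temporary file (a hypothetical path, chosen only for the demo) and then consumes the generator twice: once completely, and once stopping early with `itertools.islice`.

```python
import itertools
import os
import pickle
import tempfile


def loadall(filename):
    """Yield each pickled object in the file, one at a time."""
    with open(filename, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                break


# Hypothetical demo file holding five separately pickled records.
path = os.path.join(tempfile.mkdtemp(), "records.pkl")
with open(path, "wb") as f:
    for i in range(5):
        pickle.dump({"id": i}, f)

# Iterating consumes the generator one object at a time.
ids = [record["id"] for record in loadall(path)]

# islice stops after two objects; the remaining pickles are never loaded.
first_two = list(itertools.islice(loadall(path), 2))
```

When iteration stops early like this, the file stays open until the generator is closed or garbage-collected, so long-running programs may want to call `close()` on the generator explicitly.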

小开

Here is an object-oriented demo that uses pickle to store and restore one or more objects:

import pickle


class Worker(object):

    def __init__(self, name, addr):
        self.name = name
        self.addr = addr

    def __str__(self):
        return '[<Worker> name:%s addr:%s]' % (self.name, self.addr)


# write one item
with open('testfile.bin', 'wb') as f:
    w1 = Worker('tom1', 'China')
    pickle.dump(w1, f)

# read one item
with open('testfile.bin', 'rb') as f:
    w1_restore = pickle.load(f)
print('item: %s' % w1_restore)

# write multiple items as a single list
with open('testfile.bin', 'wb') as f:
    w1 = Worker('tom2', 'China')
    w2 = Worker('tom3', 'China')
    pickle.dump([w1, w2], f)

# read the list back
with open('testfile.bin', 'rb') as f:
    w_list = pickle.load(f)

for w in w_list:
    print('item-list: %s' % w)

output:

item: [<Worker> name:tom1 addr:China]
item-list: [<Worker> name:tom2 addr:China]
item-list: [<Worker> name:tom3 addr:China]

小开

It's easy if you use klepto, which gives you the ability to transparently store objects in files or databases. It uses a dict API, and allows you to dump and/or load specific entries from an archive (in the case below, serialized objects stored one entry per file in a directory called scores).

>>> import klepto
>>> scores = klepto.archives.dir_archive('scores', serialized=True)
>>> scores['Guido'] = 69
>>> scores['Fernando'] = 42
>>> scores['Polly'] = 101
>>> scores.dump()
>>> # access the archive, and load only one
>>> results = klepto.archives.dir_archive('scores', serialized=True)
>>> results.load('Polly')
>>> results
dir_archive('scores', {'Polly': 101}, cached=True)
>>> results['Polly']
101
>>> # load all the scores
>>> results.load()
>>> results['Guido']
69
>>>

小开

Try this:

import pickle

obj_1 = ['test_1', {'ability', 'mobility'}]
obj_2 = ['test_2', {'ability', 'mobility'}]
obj_3 = ['test_3', {'ability', 'mobility'}]

# Dump the objects one after another into the same file.
with open('test.pkl', 'wb') as file:
    pickle.dump(obj_1, file)
    pickle.dump(obj_2, file)
    pickle.dump(obj_3, file)

# Load them back in the same order they were written.
with open('test.pkl', 'rb') as file:
    obj_1 = pickle.load(file)
    obj_2 = pickle.load(file)
    obj_3 = pickle.load(file)

print(obj_1)
print(obj_2)
print(obj_3)

小开

If you're dumping it iteratively, you'd have to read it iteratively as well.

You can run a loop (as the accepted answer shows) to keep unpickling rows until you reach the end-of-file (at which point an EOFError is raised).

data = []
with open("data.pickle", "rb") as f:
    while True:
        try:
            data.append(pickle.load(f))
        except EOFError:
            break

Minimal Verifiable Example

import pickle

# Dumping step
data = [{'a': 1}, {'b': 2}]
with open('test.pkl', 'wb') as f:
    for d in data:
        pickle.dump(d, f)

# Loading step
data2 = []
with open('test.pkl', 'rb') as f:
    while True:
        try:
            data2.append(pickle.load(f))
        except EOFError:
            break

data2
# [{'a': 1}, {'b': 2}]

data == data2
# True

Of course, this assumes that your objects have to be pickled individually. You can also store your data as a single list of objects and then use a single pickle/unpickle call (no loops needed).

data = [{'a': 1}, {'b': 2}]  # list of dicts as an example
with open('test.pkl', 'wb') as f:
    pickle.dump(data, f)

with open('test.pkl', 'rb') as f:
    data2 = pickle.load(f)

data2
# [{'a': 1}, {'b': 2}]
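
One situation where individual pickles are a natural fit is when records arrive over time and are appended to an existing file. The sketch below assumes that scenario; `append_record`, `load_records`, and the temporary path are hypothetical names used only for illustration.

```python
import os
import pickle
import tempfile


def append_record(path, obj):
    """Append one pickle to the end of the file, creating it if needed."""
    with open(path, "ab") as f:
        pickle.dump(obj, f)


def load_records(path):
    """Load every pickle in the file, in the order they were written."""
    records = []
    with open(path, "rb") as f:
        while True:
            try:
                records.append(pickle.load(f))
            except EOFError:
                break
    return records


# Hypothetical demo file; each call opens and closes the file separately.
path = os.path.join(tempfile.mkdtemp(), "log.pkl")
append_record(path, {'a': 1})
append_record(path, {'b': 2})
records = load_records(path)
print(records)
```

With the single-list approach, adding a record would instead mean loading the whole list, appending to it, and writing the whole list back.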

小开

Here is how to dump two (or more) dictionaries using pickle and load them back:

import pickle

dict_1 = {1: 'one', 2: 'two'}
dict_2 = {1: {1: 'one'}, 2: {2: 'two'}}

with open('data_file1.pkl', 'wb') as F:
    pickle.dump(dict_1, F)
    pickle.dump(dict_2, F)

=========================================

import pickle

# The dictionaries come back in the order they were dumped.
with open('data_file1.pkl', 'rb') as F:
    G = pickle.load(F)
    print(G)
    H = pickle.load(F)
    print(H)

小开

Suppose we have saved objects of an Employee class to a file. Here is the code to read all of them back, one by one:

# Employee and filename are assumed to be defined elsewhere.
e = Employee()

with open(filename, 'rb') as a:
    while True:
        try:
            e = pickle.load(a)
            e.ShowRecord()
        except EOFError:
            break