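How do I save/restore a model after training?

After you train a model in TensorFlow:

  1. How do you save the trained model?
  2. How do you later restore this saved model?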

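For TensorFlow versions < 0.11.0RC1:

A saved checkpoint contains the values of the Variables in your model, not the model/graph itself, which means the graph must be the same when you restore the checkpoint.

Here's an example for a linear regression with a training loop that saves variable checkpoints and an evaluation section that restores the variables saved in a prior run and computes predictions. Of course, you can also restore the variables and continue training if you'd like.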

x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)

w = tf.Variable(tf.zeros([1, 1], dtype=tf.float32))
b = tf.Variable(tf.ones([1, 1], dtype=tf.float32))
y_hat = tf.add(b, tf.matmul(x, w))

...more setup for optimization and what not...

saver = tf.train.Saver()  # defaults to saving all variables - in this case w and b

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    if FLAGS.train:
        for i in xrange(FLAGS.training_steps):
            ...training loop...
            if (i + 1) % FLAGS.checkpoint_steps == 0:
                saver.save(sess, FLAGS.checkpoint_dir + 'model.ckpt',
                           global_step=i+1)
    else:
        # Here's where you're restoring the variables w and b.
        # Note that the graph is exactly as it was when the variables were
        # saved in a prior training run.
        ckpt = tf.train.get_checkpoint_state(FLAGS.checkpoint_dir)
        if ckpt and ckpt.model_checkpoint_path:
            saver.restore(sess, ckpt.model_checkpoint_path)
        else:
            ...no checkpoint found...

        # Now you can run the model to get predictions
        batch_x = ...load some data...
        predictions = sess.run(y_hat, feed_dict={x: batch_x})

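The Variables documentation covers saving and restoring, and the Saver documentation has the API details.

There are two parts to the model: the model definition, saved by Supervisor as graph.pbtxt in the model directory, and the numerical values of the tensors, saved into checkpoint files such as model.ckpt-1003418.

The model definition can be restored using tf.import_graph_def, and the weights can be restored using Saver.

However, Saver uses a special collection, attached to the model Graph, that holds the list of variables, and this collection is not initialized by import_graph_def, so you can't use the two together at the moment (it's on our roadmap to fix). For now, you have to use Ryan Sepassi's approach: manually construct a graph with identical node names, and use Saver to load the weights into it.

(Alternatively, you could hack it by using import_graph_def, creating the variables manually, calling tf.add_to_collection(tf.GraphKeys.VARIABLES, variable) for each variable, and then using Saver.)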

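As Yaroslav said, you can hack restoring from a graph_def and a checkpoint by importing the graph, manually creating the variables, and then using a Saver.

I implemented this for my personal use, so I thought I'd share the code here.

(This is, of course, a hack, and there is no guarantee that models saved this way will remain readable in future versions of TensorFlow.)

You can also check out the examples in TensorFlow/skflow, which offers save and restore methods that can help you manage your models easily. It has parameters that also let you control how frequently your model is backed up.

If it is an internally saved model, you just specify a restorer for all variables as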

restorer = tf.train.Saver(tf.all_variables())

and use it to restore the variables in the current session:

restorer.restore(self._sess, model_file)

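For an external model, you need to specify the mapping from its variable names to your variable names. You can view the model's variable names with the command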

python /path/to/tensorflow/tensorflow/python/tools/inspect_checkpoint.py --file_name=/path/to/pretrained_model/model.ckpt

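The inspect_checkpoint.py script can be found in the ./tensorflow/python/tools folder of the TensorFlow source.

To specify the mapping, you can use my Tensorflow-Worklab, which contains a set of classes and scripts for training and retraining different models. It includes an example of retraining ResNet models.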
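
As a rough illustration of what such a mapping looks like, here is a minimal framework-free sketch. It assumes the external checkpoint keeps its variables under one scope prefix and your graph uses another; the function name build_name_map and the example names are hypothetical. In TensorFlow 1.x, tf.train.Saver(var_list=...) accepts a dict of this shape, keyed by checkpoint name, with the variable objects (rather than the name strings used here) as values.

```python
def build_name_map(model_var_names, model_prefix, ckpt_prefix):
    """Map checkpoint variable names to the names used in your own graph.

    Only names starting with `model_prefix` are mapped; the prefix is
    swapped for `ckpt_prefix` to obtain the name stored in the checkpoint.
    """
    name_map = {}
    for name in model_var_names:
        if name.startswith(model_prefix):
            ckpt_name = ckpt_prefix + name[len(model_prefix):]
            name_map[ckpt_name] = name
    return name_map


# Hypothetical variable names, in the style inspect_checkpoint.py lists them.
my_vars = ["my_net/conv1/weights", "my_net/conv1/biases", "head/logits/weights"]
mapping = build_name_map(my_vars, model_prefix="my_net/", ckpt_prefix="resnet_v1_50/")
for ckpt_name, my_name in sorted(mapping.items()):
    print(ckpt_name, "->", my_name)
```

Variables that do not match the prefix (the new head in this example) are left out of the mapping, so they keep their freshly initialized values.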

In TensorFlow 0.11.0RC1, you can save and restore your model directly by calling tf.train.export_meta_graph and tf.train.import_meta_graph, as described at https://www.tensorflow.org/programmers_guide/meta_graph.

Save the model

w1 = tf.Variable(tf.truncated_normal(shape=[10]), name='w1')
w2 = tf.Variable(tf.truncated_normal(shape=[20]), name='w2')
tf.add_to_collection('vars', w1)
tf.add_to_collection('vars', w2)
saver = tf.train.Saver()
sess = tf.Session()
sess.run(tf.global_variables_initializer())
saver.save(sess, 'my-model')
# `save` method will call `export_meta_graph` implicitly.
# you will get saved graph files:my-model.meta

Restore the model

sess = tf.Session()
new_saver = tf.train.import_meta_graph('my-model.meta')
new_saver.restore(sess, tf.train.latest_checkpoint('./'))
all_vars = tf.get_collection('vars')
for v in all_vars:
    v_ = sess.run(v)
    print(v_)

As described in issue 6255:

Use './model_name.ckpt' (note the leading ./):
saver.restore(sess, './my_model_final.ckpt')

instead of

saver.restore(sess, 'my_model_final.ckpt')

You can also do it in a simpler way.

Step 1: Initialize all your variables

W1 = tf.Variable(tf.truncated_normal([6, 6, 1, K], stddev=0.1), name="W1")
B1 = tf.Variable(tf.constant(0.1, tf.float32, [K]), name="B1")


Similarly, W2, B2, W3, .....

Step 2: Create a model Saver and save the session with it

model_saver = tf.train.Saver()


# Train the model and save it in the end
model_saver.save(session, "saved_models/CNN_New.ckpt")

Step 3: Restore the model

with tf.Session(graph=graph_cnn) as session:
    model_saver.restore(session, "saved_models/CNN_New.ckpt")
    print("Model restored.")
    print('Initialized')

Step 4: Check your variable

W1 = session.run(W1)
print(W1)

When running in a different Python instance, use

with tf.Session() as sess:
    # Restore latest checkpoint
    saver.restore(sess, tf.train.latest_checkpoint('saved_model/.'))

    # Do not run tf.global_variables_initializer() after restoring:
    # it would overwrite the restored values with the initial ones.

    # Get default graph (supply your custom graph if you have one)
    graph = tf.get_default_graph()

    # It will give tensor object
    W1 = graph.get_tensor_by_name('W1:0')

    # To get the value (numpy array)
    W1_value = sess.run(W1)

In most cases, saving to and restoring from disk using a tf.train.Saver is your best option:

... # build your model
saver = tf.train.Saver()

with tf.Session() as sess:
    ... # train the model
    saver.save(sess, "/tmp/my_great_model")

with tf.Session() as sess:
    saver.restore(sess, "/tmp/my_great_model")
    ... # use the model

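You can also save/restore the graph structure itself (see the MetaGraph documentation for details). By default, the Saver saves the graph structure into a .meta file. You can call import_meta_graph() to restore it. It restores the graph structure and returns a Saver that you can use to restore the model's state:

saver = tf.train.import_meta_graph("/tmp/my_great_model.meta")

with tf.Session() as sess:
    saver.restore(sess, "/tmp/my_great_model")
    ... # use the model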

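However, there are cases where you need something much faster. For example, if you implement early stopping, you want to save a checkpoint every time the model improves during training (as measured on the validation set), and then, if there is no progress for some time, roll back to the best model. If you save the model to disk every time it improves, training slows down tremendously. The trick is to save the variable states to memory and restore them later:

... # build your model

# get a handle on the graph nodes we need to save/restore the model
graph = tf.get_default_graph()
gvars = graph.get_collection(tf.GraphKeys.GLOBAL_VARIABLES)
assign_ops = [graph.get_operation_by_name(v.op.name + "/Assign") for v in gvars]
init_values = [assign_op.inputs[1] for assign_op in assign_ops]

with tf.Session() as sess:
    ... # train the model

    # when needed, save the model state to memory
    gvars_state = sess.run(gvars)

    # when needed, restore the model state
    feed_dict = {init_value: val
                 for init_value, val in zip(init_values, gvars_state)}
    sess.run(assign_ops, feed_dict=feed_dict)

A quick explanation: when you create a variable X, TensorFlow automatically creates an assignment operation X/Assign to set the variable's initial value. Instead of creating placeholders and extra assignment ops (which would just clutter the graph), we reuse these existing assignment ops. The first input of each assignment op is a reference to the variable it is supposed to initialize, and the second input (assign_op.inputs[1]) is the initial value. So, to set any value we want (instead of the initial value), we use a feed_dict and replace the initial value. Yes, TensorFlow lets you feed a value for any op, not just for placeholders, so this works fine.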
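
The control flow around that trick can be sketched without TensorFlow. In this minimal example, which is only an illustration under stated assumptions, the model state is a plain dict, copy.deepcopy stands in for sess.run(gvars), and returning the best snapshot stands in for running the assign ops with a feed_dict. The names train_with_early_stopping, train_step and validate are hypothetical.

```python
import copy


def train_with_early_stopping(state, train_step, validate, max_steps, patience):
    """Keep the best state in memory and stop when progress stalls."""
    best_state = copy.deepcopy(state)
    best_loss = validate(state)
    steps_without_progress = 0
    for _ in range(max_steps):
        train_step(state)
        loss = validate(state)
        if loss < best_loss:
            best_loss = loss
            best_state = copy.deepcopy(state)  # in-memory snapshot, no disk I/O
            steps_without_progress = 0
        else:
            steps_without_progress += 1
            if steps_without_progress >= patience:
                break
    return best_state, best_loss


# Toy problem: each "training" step moves w by 1.0, so it overshoots the optimum at 3.0.
state = {"w": 0.0}


def train_step(s):
    s["w"] += 1.0


def validate(s):
    return (s["w"] - 3.0) ** 2


best_state, best_loss = train_with_early_stopping(
    state, train_step, validate, max_steps=20, patience=2)
print(best_state, best_loss)
```

The live state keeps drifting after the optimum, while the snapshot taken at the best validation loss is what gets rolled back to.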

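Here's my simple solution for the two basic cases, which differ in whether you want to load the graph from a file or build it at runtime.

This answer holds for TensorFlow 0.12+ (including 1.0).

Rebuilding the graph in code

Saving

graph = ... # build the graph
saver = tf.train.Saver()  # create the saver after the graph
with ... as sess:  # your session object
    saver.save(sess, 'my-model')

Loading

graph = ... # build the graph
saver = tf.train.Saver()  # create the saver after the graph
with ... as sess:  # your session object
    saver.restore(sess, tf.train.latest_checkpoint('./'))
    # now you can use the graph, continue training or whatever

Loading the graph from a file as well

When using this technique, make sure all your layers/variables have explicitly set unique names. Otherwise TensorFlow will make the names unique itself, and they will then differ from the names stored in the file. This is not a problem with the previous technique, because the names are "mangled" the same way when loading and when saving.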
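
To see why automatically uniquified names drift, here is a small framework-free sketch of the idea. It is a simplified stand-in, not TensorFlow's actual implementation: the class NameScope is hypothetical and only mimics the familiar pattern in which a repeated name gets a numeric suffix such as w1_1.

```python
class NameScope:
    """Hand out unique names by appending a counter to repeated names."""

    def __init__(self):
        self._counts = {}

    def unique_name(self, name):
        count = self._counts.get(name, 0)
        self._counts[name] = count + 1
        return name if count == 0 else "{}_{}".format(name, count)


scope = NameScope()
# Building the same two variables twice in one graph yields different names.
first_build = [scope.unique_name("w1"), scope.unique_name("w2")]
second_build = [scope.unique_name("w1"), scope.unique_name("w2")]
print(first_build)
print(second_build)
```

A checkpoint written from the first build would not match the names of the second build, which is why the restored graph should be the only copy in the process.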

Saving

graph = ... # build the graph

for op in [ ... ]:  # operators you want to use after restoring the model
    tf.add_to_collection('ops_to_restore', op)

saver = tf.train.Saver()  # create the saver after the graph
with ... as sess:  # your session object
    saver.save(sess, 'my-model')

Loading

with ... as sess:  # your session object
    saver = tf.train.import_meta_graph('my-model.meta')
    saver.restore(sess, tf.train.latest_checkpoint('./'))
    ops = tf.get_collection('ops_to_restore')  # here are your operators in the same order in which you saved them to the collection

I am improving my answer to add more details on saving and restoring models.

In (and after) TensorFlow version 0.11:

Save the model:

import tensorflow as tf

# Prepare to feed input, i.e. feed_dict and placeholders
w1 = tf.placeholder("float", name="w1")
w2 = tf.placeholder("float", name="w2")
b1 = tf.Variable(2.0, name="bias")
feed_dict = {w1: 4, w2: 8}

# Define a test operation that we will restore
w3 = tf.add(w1, w2)
w4 = tf.multiply(w3, b1, name="op_to_restore")
sess = tf.Session()
sess.run(tf.global_variables_initializer())

# Create a saver object which will save all the variables
saver = tf.train.Saver()

# Run the operation by feeding input
print(sess.run(w4, feed_dict))
# Prints 24.0, which is (w1 + w2) * b1

# Now, save the graph
saver.save(sess, 'my_test_model', global_step=1000)

Restore the model:

import tensorflow as tf

sess = tf.Session()
# First, load the meta graph and restore the weights
saver = tf.train.import_meta_graph('my_test_model-1000.meta')
saver.restore(sess, tf.train.latest_checkpoint('./'))

# Access saved variables directly
print(sess.run('bias:0'))
# This will print 2.0, which is the value of bias that we saved

# Now, access the placeholders and
# create a feed_dict to feed new data
graph = tf.get_default_graph()
w1 = graph.get_tensor_by_name("w1:0")
w2 = graph.get_tensor_by_name("w2:0")
feed_dict = {w1: 13.0, w2: 17.0}

# Now, access the op that you want to run
op_to_restore = graph.get_tensor_by_name("op_to_restore:0")

print(sess.run(op_to_restore, feed_dict))
# This will print 60.0, which is (13 + 17) * 2

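This and some more advanced use cases are explained very well in this tutorial:

A quick complete tutorial to save and restore TensorFlow models

If you use tf.train.MonitoredTrainingSession as your default session, you don't need to add extra code to save/restore things. Just pass a checkpoint directory name to MonitoredTrainingSession's constructor, and it will use session hooks to handle this.

All the answers here are great, but I want to add two things.

First, to elaborate on @user7505159's answer, the "./" can be important to add to the beginning of the file name that you are restoring.

For example, you can save a graph with no "./" in the file name like so: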

# Some graph defined up here with specific names

saver = tf.train.Saver()
save_file = 'model.ckpt'

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.save(sess, save_file)

But in order to restore the graph, you may need to prepend a "./" to the file name:

# Same graph defined up here

saver = tf.train.Saver()
save_file = './' + 'model.ckpt' # String addition used for emphasis

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.restore(sess, save_file)

You will not always need the "./", but leaving it out can cause problems depending on your environment and version of TensorFlow.
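
One way to avoid thinking about this each time is a tiny helper that adds the prefix only to bare file names. This is a minimal sketch, not part of TensorFlow; the helper name explicit_relative is hypothetical.

```python
import os


def explicit_relative(path):
    """Prefix bare file names with "./" and leave all other paths untouched."""
    if os.path.isabs(path) or os.path.dirname(path):
        return path
    return "./" + path


print(explicit_relative("model.ckpt"))
print(explicit_relative("saved_models/CNN_New.ckpt"))
```

Paths that already carry a directory component, including ones that start with "./", pass through unchanged, so the helper is safe to apply to any save_file before calling saver.restore.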

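I also want to mention that sess.run(tf.global_variables_initializer()) can be important before restoring the session.

If you receive an error about uninitialized variables when trying to restore a saved session, make sure you include sess.run(tf.global_variables_initializer()) before the saver.restore(sess, save_file) line. It can save you a headache.

My environment: Python 3.6, TensorFlow 1.3.0

Though there are many solutions, most of them are based on tf.train.Saver. When we load a .ckpt saved by Saver, we have to either redefine the TensorFlow network or use some weird and hard-to-remember names, e.g. 'placehold_0:0' or 'dense/Adam/Weight:0'. Here I recommend using tf.saved_model. The simplest example is given below; you can learn more from Serving a TensorFlow Model:

Save the model: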

import tensorflow as tf

# define the tensorflow network and do some training
x = tf.placeholder("float", name="x")
w = tf.Variable(2.0, name="w")
b = tf.Variable(0.0, name="bias")

h = tf.multiply(x, w)
y = tf.add(h, b, name="y")
sess = tf.Session()
sess.run(tf.global_variables_initializer())

# save the model
export_path = './savedmodel'
builder = tf.saved_model.builder.SavedModelBuilder(export_path)

tensor_info_x = tf.saved_model.utils.build_tensor_info(x)
tensor_info_y = tf.saved_model.utils.build_tensor_info(y)

prediction_signature = (
    tf.saved_model.signature_def_utils.build_signature_def(
        inputs={'x_input': tensor_info_x},
        outputs={'y_output': tensor_info_y},
        method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME))

builder.add_meta_graph_and_variables(
    sess, [tf.saved_model.tag_constants.SERVING],
    signature_def_map={
        tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
            prediction_signature
    },
)
builder.save()

Load the model:

import tensorflow as tf

sess = tf.Session()
signature_key = tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY
input_key = 'x_input'
output_key = 'y_output'

export_path = './savedmodel'
meta_graph_def = tf.saved_model.loader.load(
    sess,
    [tf.saved_model.tag_constants.SERVING],
    export_path)
signature = meta_graph_def.signature_def

x_tensor_name = signature[signature_key].inputs[input_key].name
y_tensor_name = signature[signature_key].outputs[output_key].name

x = sess.graph.get_tensor_by_name(x_tensor_name)
y = sess.graph.get_tensor_by_name(y_tensor_name)

y_out = sess.run(y, {x: 3.0})

Tensorflow 2 Docs

Saving Checkpoints

Adapted from the docs

# -------------------------
# -----  Toy Context  -----
# -------------------------
import tensorflow as tf


class Net(tf.keras.Model):
    """A simple linear model."""

    def __init__(self):
        super(Net, self).__init__()
        self.l1 = tf.keras.layers.Dense(5)

    def call(self, x):
        return self.l1(x)


def toy_dataset():
    inputs = tf.range(10.0)[:, None]
    labels = inputs * 5.0 + tf.range(5.0)[None, :]
    return (
        tf.data.Dataset.from_tensor_slices(dict(x=inputs, y=labels)).repeat().batch(2)
    )


def train_step(net, example, optimizer):
    """Trains `net` on `example` using `optimizer`."""
    with tf.GradientTape() as tape:
        output = net(example["x"])
        loss = tf.reduce_mean(tf.abs(output - example["y"]))
    variables = net.trainable_variables
    gradients = tape.gradient(loss, variables)
    optimizer.apply_gradients(zip(gradients, variables))
    return loss


# ----------------------------
# -----  Create Objects  -----
# ----------------------------

net = Net()
opt = tf.keras.optimizers.Adam(0.1)
dataset = toy_dataset()
iterator = iter(dataset)
ckpt = tf.train.Checkpoint(
    step=tf.Variable(1), optimizer=opt, net=net, iterator=iterator
)
manager = tf.train.CheckpointManager(ckpt, "./tf_ckpts", max_to_keep=3)

# ----------------------------
# -----  Train and Save  -----
# ----------------------------

ckpt.restore(manager.latest_checkpoint)
if manager.latest_checkpoint:
    print("Restored from {}".format(manager.latest_checkpoint))
else:
    print("Initializing from scratch.")

for _ in range(50):
    example = next(iterator)
    loss = train_step(net, example, opt)
    ckpt.step.assign_add(1)
    if int(ckpt.step) % 10 == 0:
        save_path = manager.save()
        print("Saved checkpoint for step {}: {}".format(int(ckpt.step), save_path))
        print("loss {:1.2f}".format(loss.numpy()))


# ---------------------
# -----  Restore  -----
# ---------------------

# In another script, re-initialize objects
opt = tf.keras.optimizers.Adam(0.1)
net = Net()
dataset = toy_dataset()
iterator = iter(dataset)
ckpt = tf.train.Checkpoint(
    step=tf.Variable(1), optimizer=opt, net=net, iterator=iterator
)
manager = tf.train.CheckpointManager(ckpt, "./tf_ckpts", max_to_keep=3)

# Re-use the manager code above ^

ckpt.restore(manager.latest_checkpoint)
if manager.latest_checkpoint:
    print("Restored from {}".format(manager.latest_checkpoint))
else:
    print("Initializing from scratch.")

for _ in range(50):
    example = next(iterator)
    # Continue training or evaluate etc.

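More links

  • Exhaustive and useful tutorial on saved_model

  • Keras detailed guide to saving models

Checkpoints capture the exact value of all parameters (tf.Variable objects) used by a model. Checkpoints do not contain any description of the computation defined by the model and thus are typically only useful when source code that will use the saved parameter values is available.

The SavedModel format, on the other hand, includes a serialized description of the computation defined by the model in addition to the parameter values (checkpoint). Models in this format are independent of the source code that created the model. They are thus suitable for deployment via TensorFlow Serving, TensorFlow Lite, TensorFlow.js, or programs in other programming languages (the C, C++, Java, Go, Rust, C# etc. TensorFlow APIs).

(Highlights are my own)


TensorFlow < 2


From the docs:

Save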

# Create some variables.
v1 = tf.get_variable("v1", shape=[3], initializer=tf.zeros_initializer)
v2 = tf.get_variable("v2", shape=[5], initializer=tf.zeros_initializer)

inc_v1 = v1.assign(v1+1)
dec_v2 = v2.assign(v2-1)

# Add an op to initialize the variables.
init_op = tf.global_variables_initializer()

# Add ops to save and restore all the variables.
saver = tf.train.Saver()

# Later, launch the model, initialize the variables, do some work, and save the
# variables to disk.
with tf.Session() as sess:
    sess.run(init_op)
    # Do some work with the model.
    inc_v1.op.run()
    dec_v2.op.run()
    # Save the variables to disk.
    save_path = saver.save(sess, "/tmp/model.ckpt")
    print("Model saved in path: %s" % save_path)

Restore

tf.reset_default_graph()

# Create some variables.
v1 = tf.get_variable("v1", shape=[3])
v2 = tf.get_variable("v2", shape=[5])

# Add ops to save and restore all the variables.
saver = tf.train.Saver()

# Later, launch the model, use the saver to restore variables from disk, and
# do some work with the model.
with tf.Session() as sess:
    # Restore variables from disk.
    saver.restore(sess, "/tmp/model.ckpt")
    print("Model restored.")
    # Check the values of the variables
    print("v1 : %s" % v1.eval())
    print("v2 : %s" % v2.eval())

Many good answers; for completeness I'll add my 2 cents: simple_save. There is also a standalone code example using the tf.data.Dataset API.

Python 3; TensorFlow 1.x

import tensorflow as tf
from tensorflow.saved_model import tag_constants

with tf.Graph().as_default():
    with tf.Session() as sess:
        ...

        # Saving
        inputs = {
            "batch_size_placeholder": batch_size_placeholder,
            "features_placeholder": features_placeholder,
            "labels_placeholder": labels_placeholder,
        }
        outputs = {"prediction": model_output}
        tf.saved_model.simple_save(
            sess, 'path/to/your/location/', inputs, outputs
        )

Restoring:

graph = tf.Graph()
with graph.as_default():
    with tf.Session() as sess:
        tf.saved_model.loader.load(
            sess,
            [tag_constants.SERVING],
            'path/to/your/location/',
        )
        batch_size_placeholder = graph.get_tensor_by_name('batch_size_placeholder:0')
        features_placeholder = graph.get_tensor_by_name('features_placeholder:0')
        labels_placeholder = graph.get_tensor_by_name('labels_placeholder:0')
        prediction = graph.get_tensor_by_name('dense/BiasAdd:0')

        sess.run(prediction, feed_dict={
            batch_size_placeholder: some_value,
            features_placeholder: some_other_value,
            labels_placeholder: another_value
        })

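Standalone example

Original blog post: http://vict0rsch.github.io/2018/05/17/restore-tf-model-dataset/

For demonstration purposes, the following code generates random data.

  1. We start by creating the placeholders. They will hold the data at runtime. From them, we create the Dataset and then its Iterator. We get the iterator's generated tensor, called input_tensor, which will serve as input to our model.
  2. The model itself is built from input_tensor: a GRU-based bidirectional RNN followed by a dense classifier. Because why not.
  3. The loss is a softmax_cross_entropy_with_logits, optimized with Adam. After 2 epochs (of 2 batches each), we save the "trained" model with tf.saved_model.simple_save. If you run the code as is, the model will be saved in a folder called simple/ in your current working directory.
  4. In a new graph, we then restore the saved model with tf.saved_model.loader.load. We grab the placeholders and logits with graph.get_tensor_by_name and the Iterator initializing operation with graph.get_operation_by_name.
  5. Lastly, we run inference on both batches in the dataset and check that the saved and the restored model yield the same values. They do!

Code: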

import os
import shutil
import numpy as np
import tensorflow as tf
from tensorflow.python.saved_model import tag_constants


def model(graph, input_tensor):
    """Create the model which consists of
    a bidirectional rnn (GRU(10)) followed by a dense classifier

    Args:
        graph (tf.Graph): Tensors' graph
        input_tensor (tf.Tensor): Tensor fed as input to the model

    Returns:
        tf.Tensor: the model's output layer Tensor
    """
    cell = tf.nn.rnn_cell.GRUCell(10)
    with graph.as_default():
        ((fw_outputs, bw_outputs), (fw_state, bw_state)) = tf.nn.bidirectional_dynamic_rnn(
            cell_fw=cell,
            cell_bw=cell,
            inputs=input_tensor,
            sequence_length=[10] * 32,
            dtype=tf.float32,
            swap_memory=True,
            scope=None)
        outputs = tf.concat((fw_outputs, bw_outputs), 2)
        mean = tf.reduce_mean(outputs, axis=1)
        dense = tf.layers.dense(mean, 5, activation=None)

        return dense




def get_opt_op(graph, logits, labels_tensor):
    """Create optimization operation from model's logits and labels

    Args:
        graph (tf.Graph): Tensors' graph
        logits (tf.Tensor): The model's output without activation
        labels_tensor (tf.Tensor): Target labels

    Returns:
        tf.Operation: the operation performing a step of Adam optimizer
    """
    with graph.as_default():
        with tf.variable_scope('loss'):
            loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
                logits=logits, labels=labels_tensor, name='xent'),
                name="mean-xent"
            )
        with tf.variable_scope('optimizer'):
            opt_op = tf.train.AdamOptimizer(1e-2).minimize(loss)
        return opt_op




if __name__ == '__main__':
    # Set random seed for reproducibility
    # and create synthetic data
    np.random.seed(0)
    features = np.random.randn(64, 10, 30)
    labels = np.eye(5)[np.random.randint(0, 5, (64,))]

    graph1 = tf.Graph()
    with graph1.as_default():
        # Random seed for reproducibility
        tf.set_random_seed(0)
        # Placeholders
        batch_size_ph = tf.placeholder(tf.int64, name='batch_size_ph')
        features_data_ph = tf.placeholder(tf.float32, [None, None, 30], 'features_data_ph')
        labels_data_ph = tf.placeholder(tf.int32, [None, 5], 'labels_data_ph')
        # Dataset
        dataset = tf.data.Dataset.from_tensor_slices((features_data_ph, labels_data_ph))
        dataset = dataset.batch(batch_size_ph)
        iterator = tf.data.Iterator.from_structure(dataset.output_types, dataset.output_shapes)
        dataset_init_op = iterator.make_initializer(dataset, name='dataset_init')
        input_tensor, labels_tensor = iterator.get_next()

        # Model
        logits = model(graph1, input_tensor)
        # Optimization
        opt_op = get_opt_op(graph1, logits, labels_tensor)

        with tf.Session(graph=graph1) as sess:
            # Initialize variables
            tf.global_variables_initializer().run(session=sess)
            for epoch in range(3):
                batch = 0
                # Initialize dataset (could feed epochs in Dataset.repeat(epochs))
                sess.run(
                    dataset_init_op,
                    feed_dict={
                        features_data_ph: features,
                        labels_data_ph: labels,
                        batch_size_ph: 32
                    })
                values = []
                while True:
                    try:
                        if epoch < 2:
                            # Training
                            _, value = sess.run([opt_op, logits])
                            print('Epoch {}, batch {} | Sample value: {}'.format(epoch, batch, value[0]))
                            batch += 1
                        else:
                            # Final inference
                            values.append(sess.run(logits))
                            print('Epoch {}, batch {} | Final inference | Sample value: {}'.format(epoch, batch, values[-1][0]))
                            batch += 1
                    except tf.errors.OutOfRangeError:
                        break
            # Save model state
            print('\nSaving...')
            cwd = os.getcwd()
            path = os.path.join(cwd, 'simple')
            shutil.rmtree(path, ignore_errors=True)
            inputs_dict = {
                "batch_size_ph": batch_size_ph,
                "features_data_ph": features_data_ph,
                "labels_data_ph": labels_data_ph
            }
            outputs_dict = {
                "logits": logits
            }
            tf.saved_model.simple_save(
                sess, path, inputs_dict, outputs_dict
            )
            print('Ok')
    # Restoring
    graph2 = tf.Graph()
    with graph2.as_default():
        with tf.Session(graph=graph2) as sess:
            # Restore saved values
            print('\nRestoring...')
            tf.saved_model.loader.load(
                sess,
                [tag_constants.SERVING],
                path
            )
            print('Ok')
            # Get restored placeholders
            labels_data_ph = graph2.get_tensor_by_name('labels_data_ph:0')
            features_data_ph = graph2.get_tensor_by_name('features_data_ph:0')
            batch_size_ph = graph2.get_tensor_by_name('batch_size_ph:0')
            # Get restored model output
            restored_logits = graph2.get_tensor_by_name('dense/BiasAdd:0')
            # Get dataset initializing operation
            dataset_init_op = graph2.get_operation_by_name('dataset_init')

            # Initialize restored dataset
            sess.run(
                dataset_init_op,
                feed_dict={
                    features_data_ph: features,
                    labels_data_ph: labels,
                    batch_size_ph: 32
                }
            )
            # Compute inference for both batches in dataset
            restored_values = []
            for i in range(2):
                restored_values.append(sess.run(restored_logits))
                print('Restored values: ', restored_values[i][0])

    # Check if original inference and restored inference are equal
    valid = all((v == rv).all() for v, rv in zip(values, restored_values))
    print('\nInferences match: ', valid)

This will print:

$ python3 save_and_restore.py


Epoch 0, batch 0 | Sample value: [-0.13851789 -0.3087595   0.12804556  0.20013677 -0.08229901]
Epoch 0, batch 1 | Sample value: [-0.00555491 -0.04339041 -0.05111827 -0.2480045  -0.00107776]
Epoch 1, batch 0 | Sample value: [-0.19321944 -0.2104792  -0.00602257  0.07465433  0.11674127]
Epoch 1, batch 1 | Sample value: [-0.05275984  0.05981954 -0.15913513 -0.3244143   0.10673307]
Epoch 2, batch 0 | Final inference | Sample value: [-0.26331693 -0.13013336 -0.12553    -0.04276478  0.2933622 ]
Epoch 2, batch 1 | Final inference | Sample value: [-0.07730117  0.11119192 -0.20817074 -0.35660955  0.16990358]


Saving...
INFO:tensorflow:Assets added to graph.
INFO:tensorflow:No assets to write.
INFO:tensorflow:SavedModel written to: b'/some/path/simple/saved_model.pb'
Ok


Restoring...
INFO:tensorflow:Restoring parameters from b'/some/path/simple/variables/variables'
Ok
Restored values:  [-0.26331693 -0.13013336 -0.12553    -0.04276478  0.2933622 ]
Restored values:  [-0.07730117  0.11119192 -0.20817074 -0.35660955  0.16990358]


Inferences match:  True

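Use tf.train.Saver to save a model. Remember to specify var_list if you want to reduce the model size; var_list accepts a list of variables or a dict mapping names to variables.

In newer TensorFlow versions, tf.train.Checkpoint is the preferable way of saving and restoring a model:

Checkpoint.save and Checkpoint.restore write and read object-based checkpoints, in contrast to tf.train.Saver, which writes and reads variable.name based checkpoints. Object-based checkpointing saves a graph of dependencies between Python objects (Layers, Optimizers, Variables, etc.) with named edges, and this graph is used to match variables when restoring a checkpoint. It can be more robust to changes in the Python program, and helps to support restore-on-create for variables when executing eagerly. Prefer tf.train.Checkpoint over tf.train.Saver for new code.

Here is an example: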

import tensorflow as tf
import os

tf.enable_eager_execution()

checkpoint_directory = "/tmp/training_checkpoints"
checkpoint_prefix = os.path.join(checkpoint_directory, "ckpt")

checkpoint = tf.train.Checkpoint(optimizer=optimizer, model=model)
status = checkpoint.restore(tf.train.latest_checkpoint(checkpoint_directory))
for _ in range(num_training_steps):
    optimizer.minimize( ... )  # Variables will be restored on creation.
status.assert_consumed()  # Optional sanity checks.
checkpoint.save(file_prefix=checkpoint_prefix)


Wherever you want to save the model,

self.saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    ...
    self.saver.save(sess, filename)

Make sure all your tf.Variable objects have names, because you may want to restore them later by name. And where you want to predict,

saver = tf.train.import_meta_graph(filename + '.meta')  # the Saver writes the graph to <filename>.meta
name = 'name given when you saved the file'
with tf.Session() as sess:
    saver.restore(sess, name)
    print(sess.run('W1:0'))  # example to retrieve by variable name

Make sure the saver runs inside the corresponding session. Remember that if you use tf.train.latest_checkpoint('./'), only the latest checkpoint will be used.
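
For context, tf.train.latest_checkpoint finds that checkpoint through a small text file named checkpoint that the Saver keeps in the save directory. The sketch below parses such a file by hand so that an older checkpoint can be picked instead; it is only an illustration, the function name read_checkpoint_state is hypothetical, and the example contents are an assumption about the usual layout of that file.

```python
import re


def read_checkpoint_state(text):
    """Return (latest, all_paths) from the text of a `checkpoint` state file."""
    latest = None
    all_paths = []
    for line in text.splitlines():
        match = re.match(r'\s*(\w+):\s*"(.*)"\s*$', line)
        if not match:
            continue
        key, value = match.groups()
        if key == "model_checkpoint_path":
            latest = value
        elif key == "all_model_checkpoint_paths":
            all_paths.append(value)
    return latest, all_paths


# Example contents, assumed to follow the layout tf.train.Saver writes.
state_text = '''model_checkpoint_path: "my_test_model-1000"
all_model_checkpoint_paths: "my_test_model-500"
all_model_checkpoint_paths: "my_test_model-1000"
'''
latest, all_paths = read_checkpoint_state(state_text)
print(latest)
print(all_paths)
```

To restore something other than the latest checkpoint, one of the listed prefixes can be passed to saver.restore directly.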

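You can save the variables in the network using

saver = tf.train.Saver()
saver.save(sess, 'path of save/fileName.ckpt')

To restore the network for reuse later or in another script, use:

saver = tf.train.Saver()
saver.restore(sess, tf.train.latest_checkpoint('path of save/'))
sess.run(....)

Important points:

  1. The graph that sess runs must have the same structure in the first run and in later runs.
  2. tf.train.latest_checkpoint needs the path of the folder that holds the saved files, not the path of an individual file; saver.restore then receives the checkpoint prefix it returns.

For TensorFlow 2.0, it is as simple as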

# Save the model
model.save('path_to_my_model.h5')

To restore:

new_model = tensorflow.keras.models.load_model('path_to_my_model.h5')

I am on version:

tensorflow (1.13.1)
tensorflow-gpu (1.13.1)

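The simple way is

Save:

model.save("model.h5")

Restore:

model = tf.keras.models.load_model("model.h5")

In the new version, TensorFlow 2.0, saving and loading a model is a lot easier, thanks to the Keras API, a high-level API for TensorFlow.

To save a model (check the documentation for reference):

tf.keras.models.save_model(model_name, filepath, save_format)

To load a model:

https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/models/load_model

model = tf.keras.models.load_model(filepath)

Saving tf.keras models with TF 2.0

I see great answers for saving models with TF 1.x. I want to provide a couple more pointers on saving tensorflow.keras models, which is a little complicated because there are many ways to save a model.

Here I provide an example of saving a tensorflow.keras model to the model_path folder under the current directory. This works well with the most recent TensorFlow (TF 2.0). I will update this description if anything changes in the near future.

Saving and loading the entire model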

import tensorflow as tf
from tensorflow import keras
mnist = tf.keras.datasets.mnist

# import data
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# create a model
def create_model():
    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax)
    ])
    # compile the model
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model

# Create a basic model instance
model = create_model()

model.fit(x_train, y_train, epochs=1)
loss, acc = model.evaluate(x_test, y_test, verbose=1)
print("Original model, accuracy: {:5.2f}%".format(100*acc))

# Save entire model to a HDF5 file
model.save('./model_path/my_model.h5')

# Recreate the exact same model, including weights and optimizer.
new_model = keras.models.load_model('./model_path/my_model.h5')
loss, acc = new_model.evaluate(x_test, y_test)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))

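Saving and loading model weights only

If you are interested in saving only the model weights and then loading the weights to restore the model, then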

model.fit(x_train, y_train, epochs=5)
loss, acc = model.evaluate(x_test, y_test,verbose=1)
print("Original model, accuracy: {:5.2f}%".format(100*acc))


# Save the weights
model.save_weights('./checkpoints/my_checkpoint')


# Restore the weights
model = create_model()
model.load_weights('./checkpoints/my_checkpoint')


loss,acc = model.evaluate(x_test, y_test)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))

Saving and restoring using the Keras checkpoint callback

import os

# include the epoch in the file name. (uses `str.format`)
checkpoint_path = "training_2/cp-{epoch:04d}.ckpt"
checkpoint_dir = os.path.dirname(checkpoint_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(
    checkpoint_path, verbose=1, save_weights_only=True,
    # Save weights, every 5-epochs.
    period=5)

model = create_model()
model.save_weights(checkpoint_path.format(epoch=0))
model.fit(train_images, train_labels,
          epochs=50, callbacks=[cp_callback],
          validation_data=(test_images, test_labels),
          verbose=0)

latest = tf.train.latest_checkpoint(checkpoint_dir)

new_model = create_model()
new_model.load_weights(latest)
loss, acc = new_model.evaluate(test_images, test_labels)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))

Saving a model with custom metrics

import tensorflow as tf
from tensorflow import keras
mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Custom Loss1 (for example)
@tf.function()
def customLoss1(yTrue, yPred):
    return tf.reduce_mean(yTrue-yPred)

# Custom Loss2 (for example)
@tf.function()
def customLoss2(yTrue, yPred):
    return tf.reduce_mean(tf.square(tf.subtract(yTrue, yPred)))

def create_model():
    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax)
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy', customLoss1, customLoss2])
    return model

# Create a basic model instance
model = create_model()

# Fit and evaluate model
model.fit(x_train, y_train, epochs=1)
loss, acc, loss1, loss2 = model.evaluate(x_test, y_test, verbose=1)
print("Original model, accuracy: {:5.2f}%".format(100*acc))

model.save("./model.h5")

new_model = tf.keras.models.load_model(
    "./model.h5",
    custom_objects={'customLoss1': customLoss1, 'customLoss2': customLoss2})

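Saving a Keras model with custom ops

When we have custom ops, as in the following case (tf.tile), we need to create a function and wrap it with a Lambda layer. Otherwise, the model cannot be saved.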

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Lambda
from tensorflow.keras import Model

def my_fun(a):
    out = tf.tile(a, (1, tf.shape(a)[0]))
    return out

a = Input(shape=(10,))
#out = tf.tile(a, (1, tf.shape(a)[0]))
out = Lambda(lambda x : my_fun(x))(a)
model = Model(a, out)

x = np.zeros((50,10), dtype=np.float32)
print(model(x).numpy())

model.save('my_model.h5')

#load the model
new_model = tf.keras.models.load_model("my_model.h5")

I think I have covered a few of the many ways of saving a tf.keras model. However, there are many other ways. If you see that your use case is not covered above, please comment below. Thanks!

Following @Vishnuvardhan Janapati's answer, here is another way to save and reload a model with custom layers/metrics/losses under TensorFlow 2.0.0

import tensorflow as tf
from tensorflow.keras.layers import Layer
from tensorflow.keras.utils.generic_utils import get_custom_objects

# custom loss (for example)
def custom_loss(y_true, y_pred):
    return tf.reduce_mean(y_true - y_pred)
get_custom_objects().update({'custom_loss': custom_loss})

# custom layer (for example)
class CustomLayer(Layer):
    def __init__(self, ...):
        ...
    # define custom layer and all necessary custom operations inside custom layer

get_custom_objects().update({'CustomLayer': CustomLayer})

This way, once you have executed such code and saved your model with tf.keras.models.save_model, model.save, or a ModelCheckpoint callback, you can reload your model without passing the custom objects explicitly, as simply as

new_model = tf.keras.models.load_model("./model.h5")
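
The reason registration helps can be pictured with a small lookup table: the saved file refers to custom objects by name, and the loader has to resolve those names to Python objects. The sketch below is a conceptual illustration only, not Keras's actual implementation; register, resolve and saved_config are hypothetical names.

```python
_REGISTRY = {}


def register(name, obj):
    """Make `obj` resolvable by `name` when a saved config is loaded."""
    _REGISTRY[name] = obj


def resolve(name, custom_objects=None):
    """Look up a name, preferring explicitly passed objects over the registry."""
    if custom_objects and name in custom_objects:
        return custom_objects[name]
    if name in _REGISTRY:
        return _REGISTRY[name]
    raise ValueError("Unknown object: {}".format(name))


def custom_loss(y_true, y_pred):
    return sum(t - p for t, p in zip(y_true, y_pred)) / len(y_true)


register("custom_loss", custom_loss)
saved_config = {"loss": "custom_loss"}  # the config holds a name, not the function
loss_fn = resolve(saved_config["loss"])
print(loss_fn([1.0, 2.0], [0.5, 1.0]))
```

Registering once at import time plays the role of get_custom_objects().update(...), while the custom_objects argument of resolve plays the role of the custom_objects argument of load_model.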

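TensorFlow 2.0

It's very simple.

import tensorflow as tf

Save

model.save("model_name")

Restore

model = tf.keras.models.load_model('model_name')

Here is a simple example using the TensorFlow 2.0 SavedModel format (the recommended format, according to the docs) for a simple MNIST classifier, built with the Keras functional API and without too much fancy going on: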

# Imports
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, Flatten
from tensorflow.keras.models import Model
import matplotlib.pyplot as plt

# Load data
mnist = tf.keras.datasets.mnist # 28 x 28
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Normalize pixels [0,255] -> [0,1]
x_train = tf.keras.utils.normalize(x_train, axis=1)
x_test = tf.keras.utils.normalize(x_test, axis=1)

# Create model
input = Input(shape=(28,28), dtype='float64', name='graph_input')
x = Flatten()(input)
x = Dense(128, activation='relu')(x)
x = Dense(128, activation='relu')(x)
output = Dense(10, activation='softmax', name='graph_output', dtype='float64')(x)
model = Model(inputs=input, outputs=output)

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train
model.fit(x_train, y_train, epochs=3)

# Save model in SavedModel format (Tensorflow 2.0)
export_path = 'model'
tf.saved_model.save(model, export_path)

# ... possibly another python program

# Reload model
loaded_model = tf.keras.models.load_model(export_path)

# Get image sample for testing
index = 0
img = x_test[index] # I normalized the image on a previous step

# Predict using the signature definition (Tensorflow 2.0)
predict = loaded_model.signatures["serving_default"]
prediction = predict(tf.constant(img))

# Show results
print(np.argmax(prediction['graph_output']))  # prints the class number
plt.imshow(x_test[index], cmap=plt.cm.binary)  # prints the image


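serving_default is the name of the signature def of the tag you selected (in this case, the default serve tag was selected). The saved_model_cli tool can be used to find the tags and signatures of a model.

Disclaimers

This is just a basic example for when you simply want to get it up and running, but it is by no means a complete answer; maybe I can update it in the future. I just wanted to give a simple example using SavedModel in TF 2.0, because I haven't seen one, even this simple, anywhere.

@Tom's answer is a SavedModel example, but it will not work on TensorFlow 2.0, because unfortunately there are some breaking changes.

@Vishnuvardhan Janapati's answer targets TF 2.0, but it is not for the SavedModel format.

TensorFlow 2.6: it has become much simpler now; you can save the model in 2 formats

  1. SavedModel (TF Serving compatible)
  2. H5 or HDF5

Saving the model in both formats: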

import tensorflow as tf

inputs = tf.keras.Input(shape=(224, 224, 3))
y = tf.keras.layers.Conv2D(24, 3, activation='relu')(inputs)
outputs = tf.keras.layers.Dense(5, activation=tf.nn.softmax)(y)
model = tf.keras.Model(inputs=inputs, outputs=outputs)
model.save("saved_model/my_model")  # save in SavedModel format
model.save("my_model.h5")  # save in H5 (HDF5) format

Loading the model in both formats

import tensorflow as tf
h5_model = tf.keras.models.load_model("my_model.h5") # loading model in h5 format
h5_model.summary()
saved_m = tf.keras.models.load_model("saved_model/my_model") #loading model in saved_model format
saved_m.summary()

The easiest way is to use the Keras API: one line to save the model and one line to load it

from keras.models import load_model


my_model.save('my_model.h5')  # creates a HDF5 file 'my_model.h5'


del my_model  # deletes the existing model




my_model = load_model('my_model.h5') # returns a compiled model identical to the previous one

You can use the Saver object in TensorFlow to save your trained model. This object provides methods to save and restore the model.

To save a trained model in TensorFlow:

tf.train.Saver.save(sess, save_path, global_step=None, latest_filename=None,
                    meta_graph_suffix='meta', write_meta_graph=True,
                    write_state=True, strip_default_attrs=False,
                    save_debug_info=False)

To restore a saved model in TensorFlow:

tf.train.Saver.restore(sess, save_path)