What's the difference between sparse_softmax_cross_entropy_with_logits and softmax_cross_entropy_with_logits?


Best answer

Having two different functions is a convenience, as they produce the same result.

The difference is simple:

For sparse_softmax_cross_entropy_with_logits, labels must have the shape [batch_size] and the dtype int32 or int64. Each label is an int in range [0, num_classes-1].
For softmax_cross_entropy_with_logits, labels must have the shape [batch_size, num_classes] and the dtype float32 or float64.

Labels used in softmax_cross_entropy_with_logits are the one-hot version of labels used in sparse_softmax_cross_entropy_with_logits.
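To make the relationship between the two label formats concrete, here is a minimal pure-Python sketch. It is not TensorFlow's implementation, and the helper names (`log_softmax`, `one_hot`, `dense_xent`, `sparse_xent`) are made up for illustration. It builds the one-hot labels from the integer labels and computes the loss both ways.

```python
import math

def log_softmax(row):
    """Numerically stable log-softmax of one row of logits."""
    m = max(row)
    log_sum = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_sum for x in row]

def one_hot(index, num_classes):
    """Dense float label row with 1.0 at `index` and 0.0 elsewhere."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]

def dense_xent(logits, labels):
    """Per-row cross entropy; labels has shape [batch_size, num_classes]."""
    return [-sum(y * lp for y, lp in zip(lab, log_softmax(row)))
            for row, lab in zip(logits, labels)]

def sparse_xent(logits, labels):
    """Per-row cross entropy; labels has shape [batch_size] (int class ids)."""
    return [-log_softmax(row)[k] for row, k in zip(logits, labels)]

logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]  # [batch_size=2, num_classes=3]
sparse_labels = [0, 2]                         # shape [batch_size]
dense_labels = [one_hot(k, 3) for k in sparse_labels]  # [batch_size, num_classes]

loss_dense = dense_xent(logits, dense_labels)
loss_sparse = sparse_xent(logits, sparse_labels)
print(loss_dense)
print(loss_sparse)
```

The two printed lists should agree up to floating-point rounding, since the one-hot row simply selects the log-probability of the true class.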

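Another tiny difference is that with sparse_softmax_cross_entropy_with_logits, you can reportedly give -1 as a label to get a loss of 0 for that label. Note, though, that the TF documentation requires every label to be an index in [0, num_classes), so I would not rely on this behavior.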


I would just like to add 2 things to the accepted answer that you can also find in the TF documentation.

First:

softmax_cross_entropy_with_logits

NOTE: While the classes are mutually exclusive, their probabilities need not be. All that is required is that each row of labels is a valid probability distribution. If they are not, the computation of the gradient will be incorrect.

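Second:

sparse_softmax_cross_entropy_with_logits

NOTE: For this operation, the probability of a given label is considered exclusive. That is, soft classes are not allowed, and the labels vector must provide a single specific index for the true class for each row of logits (each minibatch entry).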
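As an illustration of the two notes above, the plain-Python sketch below computes the cross entropy for a one-hot row and for a soft label row. The helper names are hypothetical and this is not the TF implementation. The explicit distribution check is my own addition for illustration; the documentation only warns that the gradient will be incorrect for invalid rows.

```python
import math

def log_softmax(row):
    """Numerically stable log-softmax of one row of logits."""
    m = max(row)
    log_sum = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_sum for x in row]

def is_distribution(row, tol=1e-6):
    """True if all entries are non-negative and sum to 1 (within tol)."""
    return all(p >= 0.0 for p in row) and abs(sum(row) - 1.0) <= tol

def soft_xent(logits_row, label_row):
    """Cross entropy -sum_i y_i * log softmax(logits)_i for one row."""
    if not is_distribution(label_row):
        raise ValueError("label row is not a valid probability distribution")
    return -sum(y * lp for y, lp in zip(label_row, log_softmax(logits_row)))

logits_row = [2.0, 0.5, -1.0]
hard = [1.0, 0.0, 0.0]  # one-hot row: equivalent to the sparse label 0
soft = [0.7, 0.2, 0.1]  # soft label: only expressible in the dense format

loss_hard = soft_xent(logits_row, hard)
loss_soft = soft_xent(logits_row, soft)
print(loss_hard, loss_soft)
```

A sparse label is a single integer per row, so it can only describe the `hard` case; the `soft` row has no sparse equivalent.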


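Both functions compute the same result; sparse_softmax_cross_entropy_with_logits computes the cross entropy directly on the sparse labels instead of converting them with one-hot encoding.

You can verify this by running the following program: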

import tensorflow as tf  # TensorFlow 1.x API (tf.Session, tf.random_uniform)
from random import randint

dims = 8
pos = randint(0, dims - 1)  # index of the true class

# Random logits vector and the matching one-hot label
logits = tf.random_uniform([dims], maxval=3, dtype=tf.float32)
labels = tf.one_hot(pos, dims)

res1 = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels)
res2 = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=tf.constant(pos))

with tf.Session() as sess:
    # Evaluate both ops in one run so they see the same random logits
    a, b = sess.run([res1, res2])
    print(a, b)
    print(a == b)

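Here I create a random logits vector of length dims and generate one-hot encoded labels (where the element at pos is 1 and the others are 0).

After that I calculate the softmax cross entropy and the sparse softmax cross entropy and compare their outputs. Try rerunning it a few times to make sure that it always produces the same output.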
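The agreement is not a coincidence. As a short derivation (the notation is mine: z is the logits vector, y is the one-hot label, and k is the index of the true class, so y_k = 1 and all other y_i = 0):

```latex
\begin{aligned}
p_i &= \frac{e^{z_i}}{\sum_j e^{z_j}}, \\
L_{\text{dense}} &= -\sum_i y_i \log p_i = -\log p_k, \\
L_{\text{sparse}} &= -\log p_k = -z_k + \log \sum_j e^{z_j}.
\end{aligned}
```

With a one-hot y, every term of the dense sum except the k-th vanishes, so both losses reduce to the negative log-probability of the true class.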