It's been cited by many users as the reason for switching to PyTorch, but I've yet to find a justification/explanation for sacrificing the most important practical quality, speed, for eager execution.
Below is code benchmarking performance, TF1 vs. TF2, with TF1 running anywhere from 47% to 276% faster.
My question is: what is it, at the graph or hardware level, that yields such a significant slowdown?
Looking for a detailed answer; I'm already familiar with the broad concepts.
Specs: CUDA 10.0.130, cuDNN 7.4.2, Python 3.7.4, Windows 10, GTX 1070
UPDATE: Disabling eager execution per the code below does not help. The behavior is inconsistent, however: sometimes running in graph mode helps considerably, other times it runs slightly slower relative to eager.
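(The exact snippet referred to isn't shown here; a minimal sketch, assuming the standard TF2 compat API was used to disable eager execution:)

import tensorflow as tf
# assumption: eager execution was turned off via the standard TF2 compat call
tf.compat.v1.disable_eager_execution()
print(tf.executing_eagerly())  # should print False once eager is disabled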
# use tensorflow.keras... to benchmark tf.keras; used GPU for all above benchmarks
from keras.layers import Input, Dense, LSTM, Bidirectional, Conv1D
from keras.layers import Flatten, Dropout
from keras.models import Model
from keras.optimizers import Adam
import keras.backend as K
import numpy as np
from time import time
batch_shape = (32, 400, 16)
X, y = make_data(batch_shape)
model_small = make_small_model(batch_shape)
model_small.train_on_batch(X, y) # skip first iteration which builds graph
timeit(model_small.train_on_batch, 200, X, y)
K.clear_session() # in my testing, kernel was restarted instead
model_medium = make_medium_model(batch_shape)
model_medium.train_on_batch(X, y) # skip first iteration which builds graph
timeit(model_medium.train_on_batch, 10, X, y)
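(For the tf.keras comparison mentioned in the comment at the top of the script, presumably only the standalone keras imports are swapped for their tensorflow.keras equivalents; a sketch, assuming the same layer and optimizer names exist under tensorflow.keras:)

# tf.keras variant of the imports above (assumption: identical names under tensorflow.keras)
from tensorflow.keras.layers import Input, Dense, LSTM, Bidirectional, Conv1D
from tensorflow.keras.layers import Flatten, Dropout
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
import tensorflow.keras.backend as K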
def timeit(func, iterations, *args):
    t0 = time()
    for _ in range(iterations):
        func(*args)
    print("Time/iter: %.4f sec" % ((time() - t0) / iterations))

def make_small_model(batch_shape):
    ipt = Input(batch_shape=batch_shape)
    x = Conv1D(128, 400, strides=4, padding='same')(ipt)
    x = Flatten()(x)
    x = Dropout(0.5)(x)
    x = Dense(64, activation='relu')(x)
    out = Dense(1, activation='sigmoid')(x)
    model = Model(ipt, out)
    model.compile(Adam(lr=1e-4), 'binary_crossentropy')
    return model

def make_medium_model(batch_shape):
    ipt = Input(batch_shape=batch_shape)
    x = Bidirectional(LSTM(512, activation='relu', return_sequences=True))(ipt)
    x = LSTM(512, activation='relu', return_sequences=True)(x)
    x = Conv1D(128, 400, strides=4, padding='same')(x)
    x = Flatten()(x)
    x = Dense(256, activation='relu')(x)
    x = Dropout(0.5)(x)
    x = Dense(128, activation='relu')(x)
    x = Dense(64, activation='relu')(x)
    out = Dense(1, activation='sigmoid')(x)
    model = Model(ipt, out)
    model.compile(Adam(lr=1e-4), 'binary_crossentropy')
    return model

def make_data(batch_shape):
    return np.random.randn(*batch_shape), np.random.randint(0, 2, (batch_shape[0], 1))