How do I stack multiple LSTMs in Keras?

I am using the deep learning library Keras and trying to stack multiple LSTM layers, with no success. Below is my code:

model = Sequential()
model.add(LSTM(100, input_shape=(time_steps, vector_size)))
model.add(LSTM(100))

The code above fails at the third line with: Exception: Input 0 is incompatible with layer lstm_28: expected ndim=3, found ndim=2

The input X is a tensor of shape (100, 250, 50). I am running Keras on the TensorFlow backend.


You need to add return_sequences=True to the first layer so that its output tensor has ndim=3 (i.e. batch size, timesteps, hidden state).

Please see the following example:

# expected input data shape: (batch_size, timesteps, data_dim)
model = Sequential()
model.add(LSTM(32, return_sequences=True,
               input_shape=(timesteps, data_dim)))  # returns a sequence of vectors of dimension 32
model.add(LSTM(32, return_sequences=True))  # returns a sequence of vectors of dimension 32
model.add(LSTM(32))  # return a single vector of dimension 32
model.add(Dense(10, activation='softmax'))

From: https://keras.io/getting-started/sequential-model-guide/ (search for "stacked lstm")
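
Applied to the code in the question, the minimal fix keeps the original layer sizes and input_shape and just adds the flag to the first layer:

model = Sequential()
model.add(LSTM(100, return_sequences=True,
               input_shape=(time_steps, vector_size)))  # output shape: (batch, time_steps, 100)
model.add(LSTM(100))                                    # output shape: (batch, 100)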

A detailed explanation of @DanielAdiwardana's answer: we need to add return_sequences=True for all LSTM layers except the last one.

Setting this flag to True tells Keras that the LSTM should output its hidden state at every timestep (a 3D tensor), so the next LSTM layer can process the full sequence.

If this flag is False, the LSTM returns only its last output (a 2D tensor), which is not a valid input for another LSTM layer.

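To make the shape difference concrete, here is a minimal sketch (assuming the tf.keras API and arbitrary sizes) that prints the output shape with and without return_sequences:

import tensorflow as tf

x = tf.random.normal((4, 25, 8))  # (batch_size, timesteps, features)

print(tf.keras.layers.LSTM(32, return_sequences=True)(x).shape)  # (4, 25, 32) -- 3D, one output per timestep
print(tf.keras.layers.LSTM(32)(x).shape)                         # (4, 32)     -- 2D, last output only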

As a side note: the last Dense layer is added to produce output in the format the user needs. Here Dense(10) gives a one-hot-style output for a classification task with 10 classes; it generalises to 'n' neurons for a classification task with 'n' classes.

In case you are using the LSTM for regression (or time series forecasting), you may use Dense(1), so that only one numeric value is output.

Example code like this should work:

from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense

regressor = Sequential()

# Stacked LSTM layers: every layer except the last needs return_sequences=True
regressor.add(LSTM(units=50, return_sequences=True, input_shape=(33, 1)))
regressor.add(Dropout(0.2))

regressor.add(LSTM(units=50, return_sequences=True))
regressor.add(Dropout(0.2))

regressor.add(LSTM(units=50, return_sequences=True))
regressor.add(Dropout(0.2))

# Last LSTM layer returns only its final output (2D)
regressor.add(LSTM(units=50))
regressor.add(Dropout(0.2))

# Single output neuron for regression
regressor.add(Dense(units=1))

regressor.compile(optimizer='adam', loss='mean_squared_error')

regressor.fit(X_train, y_train, epochs=10, batch_size=4096)
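
For reference, X_train must be a 3D array whose last two dimensions match input_shape=(33, 1). A quick sanity check with hypothetical dummy data:

import numpy as np

# Hypothetical dummy data: 1000 samples, 33 timesteps, 1 feature per timestep
X_train = np.random.rand(1000, 33, 1)
y_train = np.random.rand(1000, 1)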

There's another way to do this using layers.RepeatVector.

Here is an example:

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential()
model.add(layers.LSTM(32, input_shape=(15, 1)))    # returns only the last output (2D)
model.add(layers.RepeatVector(10))                 # repeats that vector 10 times (3D again)
model.add(layers.LSTM(10, return_sequences=True))

Here RepeatVector repeats the output of the previous LSTM layer as many times as needed. This was helpful for me when building an encoder-decoder model where I did not want return_sequences on the first layer.

You can read more in this blog: https://machinelearningmastery.com/encoder-decoder-long-short-term-memory-networks/
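
For illustration, a minimal sketch of the encoder-decoder pattern described there (all layer sizes are hypothetical), using tf.keras:

from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical LSTM autoencoder: encode a (15, 1) sequence into one vector,
# repeat it once per output timestep, then decode it back into a sequence.
model = keras.Sequential([
    layers.LSTM(32, input_shape=(15, 1)),       # encoder -> (batch, 32)
    layers.RepeatVector(15),                    # -> (batch, 15, 32)
    layers.LSTM(32, return_sequences=True),     # decoder -> (batch, 15, 32)
    layers.TimeDistributed(layers.Dense(1)),    # -> (batch, 15, 1)
])
model.compile(optimizer='adam', loss='mse')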