
Last time I only gave Keras a quick try; this time I want to explore it in more depth.
Since Keras has been absorbed into TensorFlow as its high-level API, I use it through tensorflow without hesitation.

I. The basic steps of using Keras


The list of steps looks long, but only the first two involve any real work; each of the remaining steps is a single line of code.

1. Define the network

First create a Sequential model, then add one or more layers to it:

import tensorflow as tf
model = tf.keras.Sequential()

Add the layers:

model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(512, activation=tf.nn.relu))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(10, activation=tf.nn.softmax))

Don't forget a Flatten layer before the fully connected (Dense) layers; this is a classic pitfall.
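To see why, here is a small shape check with random stand-in data: a Dense layer applied to a 3-D input only maps the last axis, so without Flatten you get one prediction per image row instead of per image.

```python
import numpy as np
import tensorflow as tf

x = np.random.rand(4, 28, 28).astype("float32")  # a batch of 4 "images"

# Without Flatten, Dense maps only the last axis: one 10-vector per ROW.
no_flatten = tf.keras.Sequential([tf.keras.layers.Dense(10)])
print(no_flatten.predict(x).shape)  # (4, 28, 10)

# With Flatten, each image first becomes a single 784-vector.
with_flatten = tf.keras.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10),
])
print(with_flatten.predict(x).shape)  # (4, 10)
```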

2. Compile the model

Specify the loss function and optimizer for the model:

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

3. Fit the network: train it on the training set

model.fit(x_train, y_train, epochs=5, batch_size=32)

You can also train on batches by hand (though given the batch_size argument above, hardly anyone bothers):

model.train_on_batch(x_batch, y_batch)
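A minimal sketch of such a manual loop, using random stand-in data for x_train / y_train so it runs on its own:

```python
import numpy as np
import tensorflow as tf

# Random stand-in data: 100 samples with 20 features, 10 classes.
x_train = np.random.rand(100, 20).astype("float32")
y_train = np.random.randint(0, 10, size=(100,))

model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')

# Slice the data into batches manually and update once per batch.
batch_size = 32
for epoch in range(2):
    for i in range(0, len(x_train), batch_size):
        loss = model.train_on_batch(x_train[i:i + batch_size],
                                    y_train[i:i + batch_size])
print(float(loss))  # scalar loss of the last batch
```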

4. Evaluate the network on the test set

model.evaluate(x_test, y_test)

5. Make predictions

Run the model on new data:

model.predict(x_test, batch_size=32)
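Putting the five steps together, a complete runnable sketch (random stand-in data instead of a real dataset, so nothing needs to be downloaded):

```python
import numpy as np
import tensorflow as tf

# Random stand-in for a real dataset: 28x28 "images", 10 classes.
x_train = np.random.rand(100, 28, 28).astype("float32")
y_train = np.random.randint(0, 10, size=(100,))
x_test = np.random.rand(20, 28, 28).astype("float32")
y_test = np.random.randint(0, 10, size=(20,))

# 1. define the network
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation=tf.nn.relu),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax),
])
# 2. compile
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# 3. fit
model.fit(x_train, y_train, epochs=1, batch_size=32, verbose=0)
# 4. evaluate: returns the loss plus each compiled metric
loss, acc = model.evaluate(x_test, y_test, verbose=0)
# 5. predict: one probability distribution over the 10 classes per sample
probs = model.predict(x_test, batch_size=32)
print(probs.shape)  # (20, 10)
```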

II. A simple LSTM network in Keras for movie review sentiment

Given a review's text, predict whether its sentiment is positive or negative.
The built-in IMDB dataset is already in a format suited to deep learning: every word has been replaced by an integer representing its frequency rank in the dataset.
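The effect of the zero-padding used below can be seen on two toy sequences; by default pad_sequences both pads and truncates at the front:

```python
import tensorflow as tf

seqs = [[1, 2, 3], [4, 5, 6, 7, 8]]
padded = tf.keras.preprocessing.sequence.pad_sequences(seqs, maxlen=4)
print(padded)
# [[0 1 2 3]
#  [5 6 7 8]]
```

The short sequence gets a leading 0, and the long one loses its leading elements; pass padding='post' / truncating='post' to change that.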

import numpy
import tensorflow as tf

tf.reset_default_graph()  # TF 1.x: start from a fresh default graph
# load the dataset but only keep the top n words, zero the rest
top_words = 5000
imdb=tf.keras.datasets.imdb
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=top_words)
# Keras requires all inputs in a batch to share the same length, while real reviews vary,
# so zero-pad: cap every review at 500 words, padding shorter ones with 0 and truncating longer ones
max_review_length = 500
X_train = tf.keras.preprocessing.sequence.pad_sequences(X_train, maxlen=max_review_length)
X_test = tf.keras.preprocessing.sequence.pad_sequences(X_test, maxlen=max_review_length)

# create the model
embedding_vector_length = 32
model = tf.keras.Sequential()
model.add(tf.keras.layers.Embedding(top_words, embedding_vector_length, input_length=max_review_length))

model.add(tf.keras.layers.LSTM(100, return_sequences=True))  # the Embedding layer already defines the input shape
# remember the Flatten layer before the Dense layer, otherwise the shapes do not match and it keeps erroring
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()  # print the layers and parameter counts

model.fit(X_train, y_train, epochs=3, batch_size=64)

# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)

print("Accuracy: %.2f%%" % (scores[1]*100))

The loss and accuracy reported during training evolve clearly.

III. Adding a convolutional layer before the LSTM

The convolutional layer handles feature extraction across space (neighbouring words), while the LSTM handles feature extraction along the time dimension.
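A quick shape walkthrough with the same hyperparameters as the code below (a sketch assuming TF 2-style eager execution; MaxPooling1D halves the sequence length with its default pool size of 2):

```python
import numpy as np
import tensorflow as tf

x = np.random.randint(0, 5000, size=(2, 500))  # 2 padded reviews of 500 word ids

h = tf.keras.layers.Embedding(5000, 32)(x)  # (2, 500, 32): one 32-vector per word
h = tf.keras.layers.Conv1D(32, 3, padding="same", activation="relu")(h)  # (2, 500, 32)
h = tf.keras.layers.MaxPooling1D()(h)       # (2, 250, 32): length halved
h = tf.keras.layers.LSTM(100)(h)            # (2, 100): final hidden state
print(h.shape)
```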

import numpy
import tensorflow as tf

tf.reset_default_graph()
# load the dataset but only keep the top n words, zero the rest
top_words = 5000
imdb=tf.keras.datasets.imdb
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=top_words)
# truncate and pad input sequences
max_review_length = 500
X_train = tf.keras.preprocessing.sequence.pad_sequences(X_train, maxlen=max_review_length)
X_test = tf.keras.preprocessing.sequence.pad_sequences(X_test, maxlen=max_review_length)

# create the model
embedding_vector_length = 32
model = tf.keras.Sequential()
model.add(tf.keras.layers.Embedding(top_words, embedding_vector_length, input_length=max_review_length))
# add a 1D convolution: 32 filters, each spanning 3 timesteps
model.add(tf.keras.layers.Conv1D(padding="same", activation="relu", kernel_size=3, filters=32))
model.add(tf.keras.layers.MaxPooling1D())  # max pooling layer (default pool size 2 halves the sequence length)

model.add(tf.keras.layers.LSTM(100, return_sequences=True))  # LSTM layer with 100 units
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()

model.fit(X_train, y_train, epochs=3, batch_size=64)

# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)

print("Accuracy: %.2f%%" % (scores[1]*100))

Training proceeds normally here as well.

IV. Several LSTM layers after the convolution

import numpy
import tensorflow as tf

tf.reset_default_graph()
# load the dataset but only keep the top n words, zero the rest
top_words = 5000
imdb=tf.keras.datasets.imdb
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=top_words)
# truncate and pad input sequences
max_review_length = 500
X_train = tf.keras.preprocessing.sequence.pad_sequences(X_train, maxlen=max_review_length)
X_test = tf.keras.preprocessing.sequence.pad_sequences(X_test, maxlen=max_review_length)

# create the model
embedding_vector_length = 32
model = tf.keras.Sequential()
model.add(tf.keras.layers.Embedding(top_words, embedding_vector_length, input_length=max_review_length))
model.add(tf.keras.layers.Conv1D(padding="same", activation="relu", kernel_size=3, filters=32))
model.add(tf.keras.layers.MaxPooling1D())

model.add(tf.keras.layers.LSTM(32, return_sequences=True))  # intermediate LSTMs must return full sequences
model.add(tf.keras.layers.LSTM(24, return_sequences=True))
model.add(tf.keras.layers.LSTM(1, return_sequences=False))  # the last LSTM returns only its final output
model.add(tf.keras.layers.Flatten())  # a no-op here, since the output above is already 2D
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()

model.fit(X_train, y_train, epochs=3, batch_size=64)

# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)

print("Accuracy: %.2f%%" % (scores[1]*100))

Training proceeds normally here as well.
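The role of return_sequences in the stack above: every LSTM that feeds another LSTM must return its full sequence, while return_sequences=False keeps only the last timestep. A small shape check with random input (assuming eager execution):

```python
import numpy as np
import tensorflow as tf

x = np.random.rand(2, 10, 8).astype("float32")  # (batch, timesteps, features)

full = tf.keras.layers.LSTM(4, return_sequences=True)(x)   # one output per timestep
last = tf.keras.layers.LSTM(4, return_sequences=False)(x)  # only the final step
print(full.shape)  # (2, 10, 4)
print(last.shape)  # (2, 4)
```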

V. An error hit while debugging

Re-running the code repeatedly in the same file and session produces this error:

---> model.fit(X_train, y_train, epochs=3, batch_size=64)
TypeError: Cannot interpret feed_dict key as Tensor: Tensor Tensor("embedding_1_input:0", shape=(?, 500), dtype=float32) is not an element of this graph.     

The failure occurs at sess.run(): the value fed for a placeholder refers to a tensor that is not an element of the current graph.
Moving the code to a fresh file (a fresh process) makes it run fine.
My understanding is that it is a graph/namespace problem: the first run works, but after a manual interrupt the old graph's resources are not released, so the next run's model refers to tensors from a graph that is no longer the default.
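A cleaner fix than restarting in a new file is to reset Keras's global state before rebuilding the model: tf.keras.backend.clear_session() discards the old graph, so newly created layers attach to a fresh one.

```python
import tensorflow as tf

# Reset Keras's global state (old graph, layer name counters) before rebuilding.
tf.keras.backend.clear_session()

# Any model built afterwards lives in the fresh graph and can be fit normally.
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
print(len(model.layers))  # 1
```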

