Article directory
- Code display
- Code interpretation
- Introduction to Bidirectional LSTM (BiLSTM)
Code display
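import pandas as pd
import tensorflow as tf

tf.random.set_seed(1)  # fix the TensorFlow seed for reproducibility

df = pd.read_csv("../data/Clothing Reviews.csv")
print(df.info())

# Cast to str so the tokenizer accepts every row; missing reviews (NaN) become the string "nan"
df['Review Text'] = df['Review Text'].astype(str)
x_train = df['Review Text']
y_train = df['Rating']
print(y_train.unique())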
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 23486 entries, 0 to 23485
Data columns (total 11 columns):
 #   Column                   Non-Null Count  Dtype
---  ------                   --------------  -----
 0   Unnamed: 0               23486 non-null  int64
 1   Clothing ID              23486 non-null  int64
 2   Age                      23486 non-null  int64
 3   Title                    19676 non-null  object
 4   Review Text              22641 non-null  object
 5   Rating                   23486 non-null  int64
 6   Recommended IND          23486 non-null  int64
 7   Positive Feedback Count  23486 non-null  int64
 8   Division Name            23472 non-null  object
 9   Department Name          23472 non-null  object
 10  Class Name               23472 non-null  object
[4 5 3 2 1]
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

dict_size = 14848  # vocabulary size limit passed to the tokenizer
tokenizer = Tokenizer(num_words=dict_size)
tokenizer.fit_on_texts(x_train)
print(len(tokenizer.word_index), tokenizer.index_word)

# Convert each review into a list of integer word indices
x_train_tokenized = tokenizer.texts_to_sequences(x_train)

# Pad or truncate every review to exactly 120 tokens
max_comment_length = 120
x_train = pad_sequences(x_train_tokenized, maxlen=max_comment_length)
for v in x_train[:10]:
    print(v, len(v))
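To make the padding step concrete, here is a minimal pure-Python sketch of what pad_sequences does with its default settings (padding and truncating both at the front, fill value 0). The helper name pad_pre and the toy reviews are made up for illustration; this is not the Keras implementation.

```python
def pad_pre(sequences, maxlen, value=0):
    """Left-pad with `value` and keep only the last `maxlen` tokens."""
    padded = []
    for seq in sequences:
        kept = list(seq)[-maxlen:]  # too long: drop tokens from the front
        padded.append([value] * (maxlen - len(kept)) + kept)  # too short: pad the front
    return padded


short_review = [12, 7, 3]          # hypothetical token indices
long_review = [5, 9, 2, 8, 4, 6]

result_rows = pad_pre([short_review, long_review], maxlen=4)
print(result_rows)  # [[0, 12, 7, 3], [2, 8, 4, 6]]
```

Every row ends up with the same length, which is what allows the reviews to be stacked into a single array for the Embedding layer.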
# Build the recurrent neural network
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, SimpleRNN, Embedding, LSTM, Bidirectional
import tensorflow as tf

rnn = Sequential()
# First map each word index to a 60-dimensional word vector
rnn.add(Embedding(input_dim=dict_size, output_dim=60, input_length=max_comment_length))
# For comparison: SimpleRNN(100) here would have 16,100 parameters and a
# single LSTM(100) 64,400, each with output shape (None, 100)
rnn.add(Bidirectional(LSTM(units=100)))  # 100 LSTM units per direction
rnn.add(Dense(units=10, activation=tf.nn.relu))
# Output layer: class probabilities (6 units so that the rating labels 1-5 are valid class indices)
rnn.add(Dense(units=6, activation=tf.nn.softmax))
rnn.compile(loss='sparse_categorical_crossentropy', optimizer="adam", metrics=['accuracy'])
print(rnn.summary())
result = rnn.fit(x_train, y_train, batch_size=64, validation_split=0.3, epochs=10)
print(result)
print(result.history)
Code interpretation
First, let’s summarize the flow of this code:
- The necessary TensorFlow Keras modules are imported.
- A Sequential model is initialized, which means that the model stacks layers in the order they are added.
- An Embedding layer is added to convert integer word indices into dense vectors.
- A bidirectional LSTM layer with 100 units per direction is added.
- Two fully connected Dense layers are added, containing 10 and 6 neurons respectively.
- The model is compiled using the sparse_categorical_crossentropy loss function.
- A summary of the model is printed.
- The model is trained, with part of the data held out for validation.
- The training results are printed.
Now, let’s unpack the code line by line:
- Import dependencies:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, SimpleRNN, Embedding, LSTM, Bidirectional
import tensorflow as tf
You imported the TensorFlow Keras library required to create and train RNN models.
- Initialize model:
rnn = Sequential()
You’ve chosen a sequential model, which means you can simply add layers in sequence.
- Add Embedding layer:
rnn.add(Embedding(input_dim=dict_size,output_dim=60,input_length=max_comment_length))
This layer converts integer indices into fixed-size vectors. dict_size is the size of the vocabulary, output_dim=60 is the length of each word vector, and max_comment_length is the length of each padded input comment.
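Conceptually, an embedding layer is a trainable lookup table with one row per word index. The sketch below illustrates that idea in plain Python with toy sizes; the names embedding_table and embed are hypothetical, and the random initial values only stand in for weights that Keras would learn during training.

```python
import random

random.seed(0)
vocab_size, vector_size = 5, 3  # toy stand-ins for dict_size and output_dim

# One row of `vector_size` numbers per word index
embedding_table = [
    [random.uniform(-0.05, 0.05) for _ in range(vector_size)]
    for _ in range(vocab_size)
]


def embed(token_ids):
    """Replace every integer index with its row of the table."""
    return [embedding_table[i] for i in token_ids]


vectors = embed([0, 0, 4, 2])  # a padded toy "review" of 4 tokens
print(len(vectors), len(vectors[0]))  # 4 3
```

A review of 120 indices therefore becomes a 120 x 60 matrix in the article's model, and the same index always maps to the same vector.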
- Add LSTM layer:
rnn.add(Bidirectional(LSTM(units=100)))
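You chose a bidirectional LSTM, which reads the sequence both forward and backward, so it takes into account both past and future context. Each direction has 100 units, and the two outputs are concatenated into a 200-dimensional vector.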
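The bidirectional idea can be sketched without any deep learning library. The example below is a deliberate simplification: it uses a one-unit tanh recurrence rather than a real LSTM cell, and the weights are arbitrary made-up numbers. It only shows how one pass reads the sequence first to last, a second pass with its own weights reads it last to first, and the two final states are concatenated.

```python
import math


def run_rnn(inputs, w_in, w_rec):
    """Toy one-unit recurrence: h_t = tanh(w_in * x_t + w_rec * h_(t-1))."""
    h = 0.0
    for x in inputs:
        h = math.tanh(w_in * x + w_rec * h)
    return h


def bidirectional(inputs):
    forward = run_rnn(inputs, w_in=0.5, w_rec=0.8)         # reads first -> last
    backward = run_rnn(inputs[::-1], w_in=0.3, w_rec=0.6)  # reads last -> first
    return [forward, backward]                             # concatenated final states


sequence = [1.0, -2.0, 0.5]
output = bidirectional(sequence)
print(len(output))  # 2: twice the size of a single direction
```

With 100 units per direction instead of one, the same concatenation yields the 200 output values of the layer above.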
- Add fully connected layer:
rnn.add(Dense(units=10,activation=tf.nn.relu))
rnn.add(Dense(units=6,activation=tf.nn.softmax))
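These two Dense layers produce the output of the model, and the last layer uses the softmax activation function to output probabilities over 6 classes. The ratings only take the values 1 to 5, but sparse_categorical_crossentropy treats each label as a class index starting at 0, so 6 output units are needed for the label 5 to be valid; index 0 is simply never used.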
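The link between the label values and the size of the output layer can be checked with a few lines of plain Python. The second option shown here, shifting the labels down by one so that 5 output units suffice, is a common alternative and not what the article's code does.

```python
ratings = [4, 5, 3, 2, 1]  # the values printed by y_train.unique()

# Approach used in the article: keep labels 1-5, so indices 0-5 must exist
units_needed_as_is = max(ratings) + 1

# Alternative (assumption, not in the original code): shift labels to 0-4
shifted = [r - 1 for r in ratings]
units_needed_shifted = max(shifted) + 1

print(units_needed_as_is, units_needed_shifted)  # 6 5
```

With shifted labels, a predicted class index has to be incremented by one to get back the rating.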
- Compile model:
rnn.compile(loss='sparse_categorical_crossentropy',optimizer="adam",metrics=['accuracy'])
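You chose sparse_categorical_crossentropy, a loss function for multi-class classification that takes integer labels directly (no one-hot encoding needed), together with the adam optimizer. Accuracy is tracked as an additional metric.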
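For a single sample, sparse categorical cross-entropy is the negative log of the probability that the model assigns to the true class. The sketch below computes it by hand; the probability vector is invented for illustration, and details that Keras handles internally (clipping probabilities away from zero, averaging over the batch) are left out.

```python
import math


def sparse_categorical_crossentropy(label, probabilities):
    """Loss for one sample: minus the log of the probability of the true class."""
    return -math.log(probabilities[label])


# Hypothetical softmax output over class indices 0-5 for one review
probs = [0.0, 0.05, 0.05, 0.1, 0.2, 0.6]

loss_if_rating_5 = sparse_categorical_crossentropy(5, probs)  # confident and right
loss_if_rating_1 = sparse_categorical_crossentropy(1, probs)  # confident and wrong
print(round(loss_if_rating_5, 4), round(loss_if_rating_1, 4))
```

The loss is small when the true class receives a high probability and grows quickly when it receives a low one, which is what the optimizer pushes against during training.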
- Show model summary:
print(rnn.summary())
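This will show the structure and number of parameters of the model. Because summary() prints the table itself and returns None, wrapping it in print() adds the trailing None to the output.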
Model: "sequential"
_________________________________________________________________
 Layer (type)                   Output Shape          Param #
=================================================================
 embedding (Embedding)          (None, 120, 60)       890880
 bidirectional (Bidirectional)  (None, 200)           128800
 dense (Dense)                  (None, 10)            2010
 dense_1 (Dense)                (None, 6)             66
=================================================================
Total params: 1,021,756
Trainable params: 1,021,756
Non-trainable params: 0
_________________________________________________________________
None
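The parameter counts in this summary can be reproduced by hand. The following sketch uses the standard formulas (an LSTM has four gates, each with input weights, recurrent weights and a bias); the variable names are chosen here for illustration.

```python
dict_size, embed_dim, lstm_units = 14848, 60, 100

# Embedding: one 60-dimensional vector per vocabulary entry
embedding = dict_size * embed_dim

# A simple RNN has one set of input weights, recurrent weights and biases
simple_rnn = lstm_units * (embed_dim + lstm_units) + lstm_units

# An LSTM has four such sets (three gates plus the candidate cell state)
lstm_one_direction = 4 * simple_rnn
bilstm = 2 * lstm_one_direction  # forward and backward copies

# Dense layers: inputs * outputs + one bias per output
dense = (2 * lstm_units) * 10 + 10
dense_1 = 10 * 6 + 6

total = embedding + bilstm + dense + dense_1
print(embedding, bilstm, dense, dense_1)  # 890880 128800 2010 66
print(total)                              # 1021756
```

Most of the parameters sit in the embedding table, so the vocabulary size and the vector length dominate the size of this model.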
- Training model:
result = rnn.fit(x_train,y_train,batch_size=64,validation_split=0.3,epochs=10)
You trained the model for 10 epochs with a batch size of 64. validation_split=0.3 holds out the last 30% of the samples for validation, and the remaining 70% are used for training.
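The 257 steps per epoch shown in the training log follow directly from these settings. A small arithmetic sketch, using the 23486 rows reported by df.info():

```python
import math

n_samples, validation_split, batch_size = 23486, 0.3, 64

n_train = int(n_samples * (1 - validation_split))  # samples used for training
n_val = n_samples - n_train                        # samples held out for validation
steps_per_epoch = math.ceil(n_train / batch_size)  # the last batch is smaller

print(n_train, n_val, steps_per_epoch)  # 16440 7046 257
```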
Epoch 1/10
257/257 [==============================] - 74s 258ms/step - loss: 1.2142 - accuracy: 0.5470 - val_loss: 1.0998 - val_accuracy: 0.5521
Epoch 2/10
257/257 [==============================] - 57s 221ms/step - loss: 0.9335 - accuracy: 0.6293 - val_loss: 0.9554 - val_accuracy: 0.6094
Epoch 3/10
257/257 [==============================] - 59s 229ms/step - loss: 0.8363 - accuracy: 0.6616 - val_loss: 0.9321 - val_accuracy: 0.6168
Epoch 4/10
257/257 [==============================] - 61s 236ms/step - loss: 0.7795 - accuracy: 0.6833 - val_loss: 0.9812 - val_accuracy: 0.6089
Epoch 5/10
257/257 [==============================] - 56s 217ms/step - loss: 0.7281 - accuracy: 0.7010 - val_loss: 0.9559 - val_accuracy: 0.6043
Epoch 6/10
257/257 [==============================] - 56s 219ms/step - loss: 0.6934 - accuracy: 0.7156 - val_loss: 1.0197 - val_accuracy: 0.5999
Epoch 7/10
257/257 [==============================] - 57s 220ms/step - loss: 0.6514 - accuracy: 0.7364 - val_loss: 1.1192 - val_accuracy: 0.6080
Epoch 8/10
257/257 [==============================] - 57s 222ms/step - loss: 0.6258 - accuracy: 0.7486 - val_loss: 1.1350 - val_accuracy: 0.6100
Epoch 9/10
257/257 [==============================] - 57s 220ms/step - loss: 0.5839 - accuracy: 0.7749 - val_loss: 1.1537 - val_accuracy: 0.6019
Epoch 10/10
257/257 [==============================] - 57s 222ms/step - loss: 0.5424 - accuracy: 0.7945 - val_loss: 1.1715 - val_accuracy: 0.5744
<keras.callbacks.History object at 0x00000244DCE06D90>
- Show training results:
print(result)
<keras.callbacks.History object at 0x0000013AEAAE1A30>
print(result.history)
{
 'loss': [1.2142471075057983, 0.9334620833396912, 0.8363043069839478, 0.7795010805130005, 0.7280740141868591, 0.693393349647522, 0.6514003872871399, 0.6257606744766235, 0.5839114189147949, 0.5423741340637207],
 'accuracy': [0.5469586253166199, 0.6292579174041748, 0.6616179943084717, 0.6833333373069763, 0.7010340690612793, 0.7156326174736023, 0.7363746762275696, 0.748600959777832, 0.7748783230781555, 0.7944647073745728],
 'val_loss': [1.0997602939605713, 0.9553984999656677, 0.932131290435791, 0.9812102317810059, 0.9558586478233337, 1.019730806350708, 1.11918044090271, 1.1349923610687256, 1.1536787748336792, 1.1715185642242432],
 'val_accuracy': [0.5520862936973572, 0.609423816204071, 0.6168038845062256, 0.6088560819625854, 0.6043145060539246, 0.5999148488044739, 0.6080045700073242, 0.6099914908409119, 0.6019017696380615, 0.574368417263031]
}
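This shows the loss and accuracy for each epoch on both the training and the validation data. Here the training loss falls steadily, while the validation loss reaches its minimum at epoch 3 (about 0.932) and then rises, which suggests that the model starts to overfit after the third epoch.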
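The history dictionary is a convenient place to locate the best epoch programmatically. The sketch below copies the val_loss values printed above into a literal list so that it runs on its own; in the actual script the list would come from result.history['val_loss'].

```python
val_loss = [
    1.0997602939605713, 0.9553984999656677, 0.932131290435791,
    0.9812102317810059, 0.9558586478233337, 1.019730806350708,
    1.11918044090271, 1.1349923610687256, 1.1536787748336792,
    1.1715185642242432,
]

# Index of the smallest validation loss; epochs are numbered from 1
best_index = min(range(len(val_loss)), key=val_loss.__getitem__)
best_epoch = best_index + 1

print(f"lowest val_loss {val_loss[best_index]:.4f} at epoch {best_epoch}")
# lowest val_loss 0.9321 at epoch 3
```

One possible follow-up, not part of the original code, is to pass Keras's EarlyStopping callback to fit() so that training stops once val_loss no longer improves.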
Introduction to Bidirectional LSTM (BiLSTM)
example: