Well, hello there — Citizen of Jupyter! It hasn’t been too long since our last meeting, but we’re here again.

This time though, it’s a bit more interesting; we’re going to take a dive into the world of RNNs. Today we’re going to work on a model to generate text from some text-based data. Let’s get started.

    # As usual, let's start by importing the packages we need.
    # Please be sure to run: pip install tensorflow numpy
    import tensorflow as tf
    import numpy as np

Alright, now we can load in our data. The data set in this case is quite different from everything else we’ve dealt with. It’s a text file from Project Gutenberg that contains work by Shakespeare. Let’s take a look at it.

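    # Read the whole file into one big string
    with open('shakespere_data.txt', 'r') as f:
        text = f.read()
    # Peek at the first few hundred characters
    print(text[:250])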

The complete source code and data for all four parts of this series can be found on my GitHub page.

Alright, now we have to do some modification of the data for our model to understand it. This time, though, it will be slightly trickier than what we did for the model in Part 3.

We’re going to need to convert all unique characters in our data to numbers. This is because our models can only analyze and produce numerical data. Once we’ve converted the data into numbers, we can train our model, and it will also output numbers which we shall need to decode into the corresponding characters.

To assist in this, we shall create two dictionaries: one which maps characters to numbers and another which maps numbers back into their corresponding characters.

    # Pick out the unique characters and place them in a sorted list
    characters = sorted(list(set(text)))
    # Character -> number
    char_to_int = {ch: i for i, ch in enumerate(characters)}
    # Number -> character
    int_to_char = {i: ch for i, ch in enumerate(characters)}
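To see the two dictionaries in action, here’s a tiny round trip on a toy string. This is only a sketch: `sample_text`, `encode` and `decode` are names I’ve made up for illustration, and the toy string stands in for the real Shakespeare file.

```python
# Toy corpus standing in for the Shakespeare text (an assumption for illustration)
sample_text = "to be or not to be"

# Same construction as in the tutorial: sorted unique characters, then two lookup tables
sample_chars = sorted(set(sample_text))
sample_char_to_int = {ch: i for i, ch in enumerate(sample_chars)}
sample_int_to_char = {i: ch for i, ch in enumerate(sample_chars)}


def encode(s, table):
    """Turn a string into a list of integer ids (hypothetical helper)."""
    return [table[ch] for ch in s]


def decode(ids, table):
    """Turn a list of integer ids back into a string (hypothetical helper)."""
    return "".join(table[i] for i in ids)


encoded = encode("not to be", sample_char_to_int)
decoded = decode(encoded, sample_int_to_char)

print(sample_chars)  # the vocabulary
print(encoded)       # the phrase as numbers
print(decoded)       # and back to text again
```

Whatever we encode with one dictionary, we can always decode with the other, which is exactly what we’ll rely on when the model starts handing numbers back to us.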

Alright, now we have our dictionaries. Next we need to create our X and Y. To do this, we need to get a little bit creative. We want our model to be able to produce Shakespeare-like text, where the input is a starting phrase and the output is what the model thinks Shakespeare would write next.

To do this, we’re going to split our data into overlapping groups of characters. The input will be the current group, and the output will be the character that comes right after it. With this method, our model can learn what Shakespeare would say next, one character at a time.

    # Use the length of the longest line in the text as our group (window) size
    max_len = max(len(line) for line in text.splitlines())
    # Now we slide over the text and save the inputs and outputs
    X = []
    Y = []
    for i in range(0, len(text) - max_len, 1):
        # Input: max_len characters starting at position i
        X.append([char_to_int[ch] for ch in text[i:i + max_len]])
        # Output: the single character that follows them
        Y.append(char_to_int[text[i + max_len]])

Cool, now X contains a list of phrases, and the character that follows each phrase sits at the same index in Y. We’re getting close, guys!
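If the sliding idea feels a bit abstract, here’s the same loop on a tiny string with a group size of 4. Treat it as a sketch: `make_windows`, `toy_text` and the window size are my own picks for illustration.

```python
def make_windows(ids, window):
    """Slide a fixed-size window over ids; the target is the id right after each window."""
    inputs, targets = [], []
    for i in range(len(ids) - window):
        inputs.append(ids[i:i + window])
        targets.append(ids[i + window])
    return inputs, targets


# Toy text and lookup tables (assumptions for illustration)
toy_text = "hello world"
toy_table = {ch: i for i, ch in enumerate(sorted(set(toy_text)))}
toy_back = {i: ch for ch, i in toy_table.items()}
toy_ids = [toy_table[ch] for ch in toy_text]

toy_X, toy_Y = make_windows(toy_ids, window=4)

# Print the first few pairs as characters so they are easy to read
for xs, y in zip(toy_X[:3], toy_Y[:3]):
    print(repr("".join(toy_back[i] for i in xs)), "->", repr(toy_back[y]))
```

Each step moves the window one character to the right, so an 11-character string gives us 7 training pairs.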

Now, normally in machine learning, we expect all items to have the same features. Because we sliced fixed-size groups, every phrase in X already has the same number of characters, but shorter inputs (like the starting text we’ll feed the model later) won’t.

To handle this, we pad our data to make sure that we always have equal-length phrases. For X, this step mostly just turns our list of lists into a NumPy array.

Also, because at the moment our Y is simply a list of character numbers, we want to use one-hot encoding for its data. This turns each number into a vector with a single 1 in it, which matches the probability distribution our model will output.

    # Pad the examples
    X = tf.keras.preprocessing.sequence.pad_sequences(X, maxlen=max_len, padding='post')
    # Convert labels to categorical (one-hot) format, one slot per unique character
    Y = tf.keras.utils.to_categorical(Y, num_classes=len(characters))
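If you’re curious what padding and one-hot encoding actually do to the data, here’s a rough pure-Python sketch. The names `pad_post` and `one_hot` are made up for illustration and are not part of TensorFlow; they only imitate the behaviour for a single sequence and a single label.

```python
def pad_post(seq, length, value=0):
    """Pad a list at the end up to `length`; if it is too long, keep the last `length` items."""
    seq = list(seq)[-length:]
    return seq + [value] * (length - len(seq))


def one_hot(index, num_classes):
    """Return a vector of zeros with a single 1.0 at position `index`."""
    vec = [0.0] * num_classes
    vec[index] = 1.0
    return vec


short_phrase = [5, 6]
long_phrase = [1, 2, 3, 4, 5]
print(pad_post(short_phrase, 4))  # zeros are added after the phrase
print(pad_post(long_phrase, 4))   # the oldest item is dropped

labels = [2, 0, 3]
encoded_labels = [one_hot(i, 4) for i in labels]
for row in encoded_labels:
    print(row)
```

Notice that a one-hot row has exactly as many slots as there are classes, which is why the label vectors need one slot per unique character.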

Alright, now it’s time to create our model. We’re going to use TensorFlow’s Sequential API to build an LSTM model with three layers.

- The embedding layer will translate characters into continuous vector representations, enabling our model to capture character semantics and relationships.
- We’ll add an LSTM layer that will model sequential dependencies in the input data, allowing us to detect patterns in our data.
- Finally, the dense layer with softmax activation will be the output layer responsible for predicting the next character in a sequence. It will produce a probability distribution over characters, allowing the model to generate text probabilistically.

    # Creating our model
    model = tf.keras.Sequential()
    # Add an embedding layer: one 64-dimensional vector per unique character
    model.add(tf.keras.layers.Embedding(input_dim=len(characters), output_dim=64))
    # Add the LSTM layer
    model.add(tf.keras.layers.LSTM(units=128))
    # Dense output layer: one probability per unique character
    model.add(tf.keras.layers.Dense(units=len(characters), activation='softmax'))
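The softmax part of that last layer is easy to try by hand. Here’s a minimal sketch in plain Python, assuming three made-up raw scores; it is not how TensorFlow implements it internally, just the same maths on a small list.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that are positive and sum to 1."""
    # Subtracting the max keeps exp() from overflowing on large scores
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up raw scores for three characters (an assumption for illustration)
raw_scores = [2.0, 1.0, 0.1]
probs = softmax(raw_scores)

# The most likely character is the one with the highest probability
best = max(range(len(probs)), key=probs.__getitem__)

print(probs)
print(best)
```

The bigger the raw score, the bigger its share of the probability, and the shares always add up to 1.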

Alright, cool, now that we’ve created our model, let’s compile our model and get started with training it.

    # Compile our model
    model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=['accuracy'])
    # Train our model
    model.fit(X, Y, epochs=100, batch_size=64)

Well, it’s our last day here and I think I’m getting a bit emotional. Nonetheless, let’s finalize our journey on Jupyter. Now that we’ve trained our model, we need to ask it to generate some Shakespeare-like text for us.

We need to give the model a starting point from which it can come up with new characters. To do this we shall take in our starting point, encode it for the model and then keep prompting the model, one character at a time, until we’re satisfied. Because this process is quite long, we’re going to create a function to help us with it.

    def generate_text(seed, num_chars):
        # Start the text with our seed
        result = seed
        # Encode our seed
        e_seed = [char_to_int[ch] for ch in seed]
        # Add padding so the seed has the length the model was trained on
        padded_seed = tf.keras.preprocessing.sequence.pad_sequences([e_seed], maxlen=max_len, padding='post')
        # Generate from the model, one character at a time
        for i in range(num_chars):
            # Get the next character probabilities
            probs = model.predict(padded_seed, verbose=0)[0]
            # Pick the most likely character and add it to our text
            index = np.argmax(probs)
            result += int_to_char[index]
            # Update our seed: drop the oldest character and append the new one
            updated_seed = np.append(padded_seed[0][1:], index)
            padded_seed = tf.keras.preprocessing.sequence.pad_sequences([updated_seed], maxlen=max_len, padding='post')
        return result

Perfect, now we can use our model to generate some text for us.

    # Generate text
    generated_text = generate_text('As usual, in the morning ', 100)
    print(generated_text)
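One thing worth knowing: picking the most likely character every time (the `np.argmax` step) tends to produce repetitive text. A common alternative, which is not what the function above does, is to sample from the probabilities, often with a "temperature" knob. Here’s a hedged sketch in plain Python; `sample_with_temperature`, the example probabilities and the temperature values are all my own assumptions.

```python
import math
import random


def sample_with_temperature(probs, temperature=1.0, rng=random):
    """Draw an index from probs after sharpening or flattening them with a temperature."""
    # Work in log space; the small constant avoids log(0)
    logits = [math.log(p + 1e-12) / temperature for p in probs]
    # Subtract the max before exp() for numerical stability
    m = max(logits)
    weights = [math.exp(l - m) for l in logits]
    # choices() draws one index in proportion to the weights
    return rng.choices(range(len(probs)), weights=weights, k=1)[0]


# Made-up next-character probabilities (an assumption for illustration)
example_probs = [0.7, 0.2, 0.1]

# Greedy choice: always the single most likely index
greedy_index = max(range(len(example_probs)), key=example_probs.__getitem__)

# Sampled choices: usually index 0, but sometimes the others
rng = random.Random(0)
samples = [sample_with_temperature(example_probs, 0.8, rng) for _ in range(1000)]

print(greedy_index)
print({i: samples.count(i) for i in range(len(example_probs))})
```

Lower temperatures behave more and more like the greedy choice, while higher ones give the less likely characters a better chance, so it’s a nice dial to play with if you decide to swap this in.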

There we go! We’ve come to the end of our journey, everyone. It’s been a super fun ride and I hope you’ve familiarised yourself with Jupyter enough to travel on without me. Hope to see you in my next series, catch you soon!