Generating inspirational quotes with Markov chains

31-12-2017 · Machine Learning


“Don't think of the overwhelming majority of the impossible.”
“Grew up your bliss and the world.”
“what we would end create, creates the ground and you are the one to warm it”
“look and give up in miracles”

All the quotes above were generated by a computer, using a program that consists of fewer than 20 lines of Python code.

When it comes to natural language generation, people normally think of advanced AI systems built on complex mathematics; however, that is not always the case. In this post, I will be using the idea of Markov chains and a small dataset of quotes to generate new quotes.

Markov chain

A Markov chain is a stochastic model that predicts an event solely based on the previous event. A simple example is the state transitions of my cat. I have a cat, and she is always either eating, sleeping or playing with her toys. She is mostly sleeping; however, she occasionally wakes up for some food. Normally, after eating she gets energized and starts playing with her toys, and then she either goes back to sleep or back to eating.

My cat’s state can easily be modeled with a Markov chain because she decides what to do based only on her previous state. It is quite unlikely for her to wake up and start playing straight away, but after her meal it is very likely. These state transitions can also be illustrated on a state transition diagram.


Each circle is a state, each arrow points to a possible next state, and the number beside each arrow is the probability of transitioning from one state to the other. As you can see, the probability of a transition depends solely on the previous state.
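To make this concrete, here is a minimal sketch that simulates such a chain in Python. The transition probabilities below are made-up placeholders for illustration, not the actual numbers from the diagram.

import random

# Hypothetical transition probabilities for the cat's three states
# (illustrative values only, not taken from the diagram above)
transitions = {
    'sleeping': {'sleeping': 0.7, 'eating': 0.3},
    'eating': {'playing': 0.6, 'sleeping': 0.4},
    'playing': {'sleeping': 0.5, 'eating': 0.5},
}

state = 'sleeping'
for _ in range(10):
    next_states = list(transitions[state])
    weights = [transitions[state][s] for s in next_states]
    # the next state depends only on the current state
    state = random.choices(next_states, weights=weights)[0]
    print(state)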

Text generation with Markov chains

Text generation with Markov chains uses the same idea and tries to find the probability of one word appearing after another. To identify the probabilities of the transitions, we train the model on some sample sentences.

For instance, we can train a model using the following sentences.

I like to eat apples
You eat oranges.

From this training data alone, we can conclude that ‘I’, ‘like’, ‘to’ and ‘eat’ always appear in that sequence, and that ‘you’ is always followed by ‘eat’; however, there is an equal chance of either ‘oranges’ or ‘apples’ appearing after the word ‘eat’. The transition diagram below illustrates this.


To generate text, we start from one of the starting nodes [‘I’ or ‘you’], then randomly pick the next word and follow a path to an ending node [‘oranges’ or ‘apples’]. From the two sentences above we can generate two additional sentences.

I like to eat oranges.
You eat apples.
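To make the ‘equal chance’ of ‘oranges’ or ‘apples’ concrete, here is a minimal sketch that counts the word transitions in the two training sentences (hard-coded below for illustration) and turns the counts into probabilities:

from collections import Counter, defaultdict

sentences = ["i like to eat apples", "you eat oranges"]

# count how often each word follows another word
counts = defaultdict(Counter)
for sentence in sentences:
    words = sentence.split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1

# turn the counts into transition probabilities
for current, followers in counts.items():
    total = sum(followers.values())
    for following, count in followers.items():
        print(f"P({following} | {current}) = {count / total}")

# the output includes P(apples | eat) = 0.5 and P(oranges | eat) = 0.5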

The two training sentences were capable of generating two new sentences, but that is not always the case. I trained another model with the following four sentences, and the results were very different.

my friend makes the best raspberry pies in town
i think apple pies are the best pies
steve thinks apple makes the best computers in the world
I own two computers and they're not apple because I am not steve or rich

The transition diagram for a model trained on these four sentences is much bigger.

Even though the diagram looks very different from a typical Markov chain transition diagram, the main idea behind it is the same. A path starts at the ‘START’ node and randomly picks the following words until it reaches an end node. The probability of each word is illustrated by the width of the connections.

The model above is capable of generating hundreds of unique sentences, even though it has only been trained on four sentences.


The code

The code for the generator is incredibly simple and does not require any additional modules or libraries apart from Python’s random module. It consists of two parts, one for training and one for generating.

Training

The training code constructs the model that we will later use to generate the quotes. I have used a dictionary as the model; it has words as the keys and lists of potential following words as the corresponding values. For example, the dictionary for the model trained on the first two sentences, ‘I like to eat apples’ and ‘You eat oranges’, looks like the following:

{'START': ['i', 'you'], 'i': ['like'], 'like': ['to'], 'to': ['eat'], 'you': ['eat'], 'eat': ['apples', 'oranges'], 'END': ['apples', 'oranges']}

We do not need to calculate the probability of each potential next word, because words with a higher chance of appearing will simply appear more times in the lists of potential following words. For instance, if we were to add the additional training sentence ‘we eat apples’, the word ‘apples’ would now appear after the word ‘eat’ in two sentences and would thus have a higher probability. This higher probability is worked into the model’s dictionary by ‘apples’ appearing twice in the ‘eat’ list.

{'START': ['i', 'we', 'you'], 'i': ['like'], 'like': ['to'], 'to': ['eat'], 'you': ['eat'], 'we': ['eat'], 'eat': ['apples', 'oranges', 'apples'], 'END': ['apples', 'oranges', 'apples']}

Furthermore, there are two additional keys in the model dictionary above, ‘START’ and ‘END’; they indicate the potential starting and ending words of a generated sentence/quote.
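These duplicate entries act as weights because random.choice picks uniformly from a list, so a word that appears twice is twice as likely to be chosen. A quick standalone sketch of that effect:

import random

# 'apples' appears twice in the list, so it is picked about twice as often
followers = ['apples', 'oranges', 'apples']
picks = [random.choice(followers) for _ in range(10000)]
print(picks.count('apples') / len(picks))   # roughly 0.67
print(picks.count('oranges') / len(picks))  # roughly 0.33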


# 'quotes.txt' is a placeholder path: any text file with one training quote per line
dataset_file = open('quotes.txt')
model = {}    # maps each word to a list of the words that can follow it

for line in dataset_file:
    line = line.lower().split()
    for i, word in enumerate(line):
        if i == len(line) - 1:
            # last word of the quote: record it as a potential ending word
            model['END'] = model.get('END', []) + [word]
        else:
            if i == 0:
                # first word of the quote: record it as a potential starting word
                model['START'] = model.get('START', []) + [word]
            # record the word that follows the current word in this quote
            model[word] = model.get(word, []) + [line[i + 1]]
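As a quick sanity check (this snippet is only an illustration, not part of the generator itself), running the same loop over the two example sentences from earlier reproduces the dictionary shown above:

model = {}
for line in ["I like to eat apples", "You eat oranges"]:
    line = line.lower().split()
    for i, word in enumerate(line):
        if i == len(line) - 1:
            model['END'] = model.get('END', []) + [word]
        else:
            if i == 0:
                model['START'] = model.get('START', []) + [word]
            model[word] = model.get(word, []) + [line[i + 1]]

print(model)
# prints the same keys and lists as the dictionary shown earlier
# (only the order of the keys may differ)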

Generation

The generation section consists of a loop. It commences by picking a random starting word and appending it to a list. It then looks up the list of potential next words for the word it just picked, randomly chooses one of them, and appends the new word to the list. It keeps choosing random potential next words until it reaches an ending word, at which point it stops the loop and outputs the generated word sequence, or ‘quote’.


import random

generated = []
while True:
    if not generated:
        # first iteration: pick from the potential starting words
        words = model['START']
    elif generated[-1] in model['END']:
        # the last picked word can end a quote, so stop here
        break
    else:
        # otherwise pick from the words that can follow the last picked word
        words = model[generated[-1]]
    generated.append(random.choice(words))

print(' '.join(generated))    # the generated quote
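If you want several quotes in one run, the same loop can be wrapped in a small helper function and called repeatedly; generate_quote below is a hypothetical name used for illustration, not part of the original script.

import random

def generate_quote(model):
    # walk the chain from a random starting word until an ending word is reached
    generated = [random.choice(model['START'])]
    while generated[-1] not in model['END']:
        generated.append(random.choice(model[generated[-1]]))
    return ' '.join(generated)

for _ in range(5):
    print(generate_quote(model))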

I have used Markov chains to generate quotes; however, when it comes to their application as text generators, any text can be provided as input, and similar text will be generated.

Another cool thing that can also be done with Markov chain text generators is mixing different types of text. For instance in one of my favorite TV shows, Rick and Morty, there is a character called Abradolf Lincler who is a mix of Abraham Lincoln and Adolf Hitler. You can generate something that Abradolf Lincler would say by feeding speeches of both leaders as the training data to a Markov chain text generator.

The Markov chain is an incredible tool with loads of applications in all sorts of fields. Text generation isn’t its most useful application, but I think it certainly is one of the funnest.

This post has also been cross-published on my Medium.


About the author

Ramtin Alami is a 19-year-old computer science student at Monash University, Clayton. He is interested in AI and natural language processing. He is also a hardcore Pythonista and tries to use Python anywhere possible.

6 Comments


Basil

1-1-2018

Nice one


Allan Wind

1-1-2018

Hi Ramtin, I encourage you to expand the model and put it up as a web site (service). Just In Time Motivation, or Motivational Services. What is there not to love about that?


Guo

2-1-2018

Nice post. I am also 19 and study CS in China. I will watch your blog, for learning something interesting and enhancing my English. I also had built a blog but closed that for some reasons. Now I have a new one waiting for occupying.


Rick Bradford

8-1-2018

There's a way to do this without building a dictionary. Let's assume our source text has 1000 words.

1. Pick a random number between 1 and 1000. Take the word at that index to be the first word of our Target Text, and remember it as our Current Word.
2. Pick a random number between 1 and 1000. From this new index, crawl forward through the text until we get a match with our Current Word. Take the following word, add it to our Target Text, and mark it as our Current Word.

Repeat Step 2 as long as you like, depending on how much text you want. You can create pages and pages of almost understandable nonsense with this method.


Ramtin Alami

9-1-2018

Hi Rick! That is actually a really cool idea. The idea behind that algorithm is somewhat similar to Markov Chains. The only problem I see with that algorithm is that it might generate less random text if the words aren't distributed evenly throughout the source text. For instance, suppose the 1000-word source text has only three repetitions of the word 'I', very close to each other. Let's say at index 900-906 we have "I eat, I work, thus I live" and besides index 900, 903 and 906 the word 'I' is not repeated anywhere else in the source text. If the algorithm picks the current word as "I", there is a 99.5% chance of the next word chosen to be "eat" even though it should be 33.3%. This is because for the word "eat" to be chosen as the next word the random index picked must be between 1-899 or 905-1000. The other words ["work", "live"] are chosen only if the random index picked is between 900-903. Nonetheless, the algorithm is still really cool for its simplicity.


Ramtin Alami

9-1-2018

Hi Allan! That's a great idea. I will work on expanding and improving the model, and perhaps even turn it into an online web service.


