In a similar vein, this is one of the most amusing uses of Markov chains IMO. It is trained on the King James Bible and several computer science textbooks:
I built a few generative Twitter bots with character level language models[1] (@steam_gaems for instance) and one problem is that they spit out phrases from the source text relatively frequently. So I included a function to re-generate the output if it matched any input phrases. Brute-force but effective.
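That re-generation loop is simple to sketch. This is a hypothetical minimal version, not the actual bot's code (the function and parameter names here are made up for illustration): regenerate candidates until one is not a verbatim substring of the training text, and give up after a bounded number of tries.

```python
import random

def generate_until_novel(generate, source_text, max_tries=20):
    """Regenerate until the output is not a verbatim phrase from the source.

    `generate` is any zero-argument function returning a candidate string.
    Names here are hypothetical sketches, not the bot's real API.
    """
    for _ in range(max_tries):
        candidate = generate()
        if candidate not in source_text:
            return candidate
    return None  # every attempt parroted the source; give up

# usage with a toy generator that sometimes copies the source
source = "the cat sat on the mat"
phrases = ["the cat sat", "a dog ran by", "the mat purred"]
make_phrase = lambda: random.choice(phrases)
result = generate_until_novel(make_phrase, source)
```

Brute force indeed, but for short tweet-length outputs a substring check against the whole source is cheap.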
This is a great article from a young person who is clearly learning by writing. Having good communication skills along with technical skills will make you invaluable in the workforce. We need this kind of popular science writing for computer science. Keep up the great work!
In our research group we mixed our PhD theses with Lovecraft fiction. While sometimes fun, it usually does not mix well, because Lovecraft writes in past tense and theses are in present tense.
My biggest problem with Markov chain generated text is the amount of grammatically nonsensical output. For example, "Grew up your bliss and the world." as far as I can tell just doesn't parse.
There is a slight issue with the provided code: if one sentence ends with a given word, then this word will always "terminate" any sentence, even if it is almost always inside a sentence.
Ex: "I like the letter i" will put "i" in the END dictionary, and as soon as this "word" gets picked up the sentence will be terminated, leading to incomplete / low-quality results :/
I like the trick of not using a weight or probability but just the number of occurrences of the word for in-sentence words (that doesn't scale well but is totally useful with small training sets). Maybe there is a way to reuse the same trick so sentences don't always terminate on such END words.
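One way to reuse the occurrence-count trick for this, sketched here as an assumption rather than the article's actual code: count for each word how often it ends a sentence versus how often it continues one, and terminate with probability proportional to the end count. Then the "i" in "I like the letter i" only ends the output as often as it ended sentences in the training text.

```python
import random
from collections import defaultdict

# Count, for each word, how often it ends a sentence vs. continues one.
end_counts = defaultdict(int)
cont_counts = defaultdict(int)

# toy training data, echoing the "letter i" example above
sentences = ["i like the letter i", "i like cats", "cats like i guess naps"]
for sentence in sentences:
    words = sentence.split()
    for w in words[:-1]:
        cont_counts[w] += 1  # appeared mid-sentence
    end_counts[words[-1]] += 1  # appeared as the final word

def should_terminate(word):
    """End the sentence with probability (times word ended) / (times word appeared)."""
    ends = end_counts[word]
    total = ends + cont_counts[word]
    if total == 0:
        return False
    return random.random() < ends / total
```

A word that only ever appears mid-sentence never terminates, a word that only ever appears last always does, and "i" falls somewhere in between.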
Recently on a slow Friday we trained Markov chains with our Slack history. We generated a separate chain for each Slack member. It culminated with a Slack channel emulator generating full chat logs of what members would typically say.
The programming in Python was surprisingly quick and easy, but we ended up wasting more time than expected by amusing ourselves with the results.
So if anyone’s thinking of trying out Markov chains for text, I recommend taking Slack logs for training.
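The "one chain per member" setup can be sketched in a few lines of Python. This is a minimal word-level first-order chain under assumed names and a made-up log format (a dict of member to message list), not the actual Friday-afternoon code:

```python
import random
from collections import defaultdict

def train(messages):
    """Build a first-order word-level Markov chain from one member's messages."""
    chain = defaultdict(list)
    for msg in messages:
        words = msg.split()
        for i in range(len(words) - 1):
            chain[words[i]].append(words[i + 1])  # duplicates act as weights
    return chain

def generate(chain, start, max_words=20):
    """Walk the chain from a start word until a dead end or the length cap."""
    word = start
    out = [word]
    for _ in range(max_words - 1):
        followers = chain.get(word)
        if not followers:
            break
        word = random.choice(followers)
        out.append(word)
    return " ".join(out)

# one chain per member, as described above (toy logs)
logs = {"alice": ["deploy is done", "deploy failed again"],
        "bob": ["lunch anyone", "anyone seen my mug"]}
chains = {member: train(msgs) for member, msgs in logs.items()}
```

A channel emulator then just picks a member at random, picks a start word from their chain, and interleaves the generated lines.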
I actually made a Slack bot to do this exact thing: https://github.com/ridhoq/ditto. It was a quick side project to learn Elixir so it’s a bit rough around the edges. I'm actually in the process of refactoring to use the new Events API. But it’s been pretty hilarious to see the responses get better and better over time. If anybody is interested and wants to put it on their Slack, I may try to put it in the Slack App directory thing.
Would be nice to use it in combination with fortune, or if I could use it as a login greeting. Load it up with the best quotes from the Stoics, Psalms, Jesus Sirach, the Gospels, Paul's letters, Ecclesiastes and all the existing library. Put a good RNG in front of it and you have a real Oracle greeting you whenever you want!
http://kingjamesprogramming.tumblr.com/
So you get some gold like:
"And this I pray, that your love may abound yet more and more like a controlled use of shared memory."