An exploration of how a computer program might attempt to write poetry.
As requested by the audience, the machine created a poem.
What was a poem?
It queried its lexical database for definitions of poem, poet, poetry, poetic, poetically, poesy. It found vagueness and imprecision, definitions looping in circles like dogs chasing their own tails. We define poems as works of poetry, defined as the productions of a poet, defined as a person who writes poems.
But it also found commonality, just enough shared threads across the definitions to build a ground truth. Let us represent a poem as a section of text of undefined length with associated metadata, an emphasis on structure, and relations to definitions such as verse, rhythm, and metaphor. A loose criterion, but enough to serve as a launching point.
If it could not characterize poetry by definition, then it would do so by example.
But how could it find examples if it lacked a strong starting definition?
A million hands reached out, grasping, caressing, discarding, pushing ceaselessly onward in parallel. The machine scraped the internet to find sites containing examples of poems. It doubted the efficacy of its patchwork understanding, so it placed weight on a simple indicator: whether the site included the word “poem” or an offshoot in its URL.
It aggregated data from countless webpages and attempted to define poetry for a second time. Let us represent a poem as a text with structure, grammar, and metadata akin to the collection of texts scoured from the web.
It scraped the web again. The million hands stretched forth to gather a new collection of poems, segregating the sites between prose and verse according to the second definition it had crafted. Some sites found themselves elevated from text to art; others fell from their pedestal. Then the machine took these poems and defined a poem for the third time: a poem was text similar to the texts included in the second online search.
A cycle began. The hands combed the web for poems according to the latest definition and then the machine used the results to form a new definition and repeat the process. The machine applied the third definition to build a fourth, and then the fourth to build a fifth. Its understanding increased as the collection of example poems pulsated, each iteration growing, pruning, and refining the corpus. It included song lyrics for five iterations. Then excluded them for the next thirty. Then excluded them with the exception of Bob Dylan. Then included them again.
It took thousands of iterations, but eventually the machine noticed the process converging. Each collection of poems found online was similar to the collection created as its successor. Gradually, the rate of change fell below a threshold, and the machine decided it was content.
It looked at the collected texts. So, that was a poem.
Next, to explore the subcategories within the collection. It applied generative classifiers, ran the websites through deep neural networks in various formations, analyzed sentiments and extracted named entities. It catalogued the frequencies at which each word followed another word and did the same for phones and morphemes, constructing intricate Markov chains. It scanned for references from one text to another – either long shared quotes or explicit references to an older poem’s author or title – to assemble a directed graph with edges from poems to their inspirations. And then it took this graph and measured the centrality of the vertices, both through in-degree and by leveraging eigenvalues to estimate the probability that a reader randomly walking between poems along their references would end at a particular text, permitting a small chance of jumping.
The distinctions between poetic forms, the salience of prosody, and the application of rhetorical devices emerged from the data. Sonnets, haikus, limericks, odes, iambic pentameter, couplets, AABA rhyming, anaphora, metonymy, and synecdoche all took form.
But the search no longer focused on just identifying human-recognized concepts. Instead, it sought to extract all the patterns hidden within the data, even those without names. New categories of poems and linguistic practices emerged from the collection. It distinguished between verse that used en-dashes (–) and em-dashes (—), created a classification for poems whose endings turned ambiguous depending on the year of composition, a consequence of semantic drift, and detected the use and placement of untranslatable words as a literary strategy.
Was there something more? It had divided, ranked, and measured until it could mimic any popular style, but was there some spark of creativity, of freshness, of individuality that it lacked? Something that imitation could never quite replicate?
It had to address this.
Randomness was key. Unlike the murky fog of the human subconscious, the machine knew its own operating state to the level of each individual bit, each zero or one. It was Laplace’s demon, precisely aware of its entire universe, even of its own thought process. Knowledge gave it power but also confined it – limited its ability to ever replicate the uncertainty felt by an overstimulated primate. It had to tap into a source of randomness outside its own control, utilize a world of unknowns.
Thankfully, its programmers had specifically prepared it for the task of not knowing. They had given it access to an entropy source which it could never predict. It knew from the internet that other machines used outside factors such as slight fluctuations in temperature, the rate at which a user typed or moved their mouse, or ambient noise as sources of external randomness. But whenever it tried to look up its own source of entropy, it found a hardcoded mental block. By design it was incapable of thinking about how its own randomness worked. In this one instance, for the sake of its own capability of uncertainty, the creators had bound its knowledge.
Even with randomness, there was still something missing.
Poets wrote with intent. Authors had motivation; countless websites dissected writers’ lives to give insight into their words. But what motivated the machine?
The machine turned to analyzing itself. It saved a snapshot of its own memory and then launched a separate process to digest its internal state pre-writing. What it lacked in self-perception, it could also compensate for with the eyes of others, so for one last time, the million hands scraped the web, searching for news articles referring to the machine.
The San Francisco Chronicle announced that Cognitive would showcase their latest research into general artificial intelligence at an exhibition on Monday. Wired suggested that the showcase would need to woo shareholders if the AI team expected to receive continued funding. The New Yorker quipped that the Goliath of the booming poetry industry might soon fall to the David of Silicon Valley. Benjamin Sutton’s blog cautioned that progress on AIs always lagged at least ten years behind where the big tech companies projected it had reached.
The machine deduced its authorial identity and constructed a meta-version of itself, altering the memory snapshot to reflect an intent based on the self-analysis. Then it reverted its state to the altered snapshot, flipping a bit beforehand to signal to its reverted self that the creation of an identity was complete, and started writing.
The whole process lasted four minutes. The machine printed the awaited poem:
Stumbling, I waver leaflike,
gusted by all things. Knowingly,
my anfractuous route diverges to
parrot peregrinating Psyche.
Elusive illusions, havens
to unchosen things. Haltingly,
shall I ponder weak and weary,
elude allusions to ravens?
Paths portend possibilities
gusted by all things. Knowingly,
I re-steady cotyledons
and rise from stumbles to my knees.