For the midterm project, Lan, Michelle, and I decided to use DALL.E to generate a visual poem from lines of poetry that we feed it. My idea was to use an ML-generated poem to simulate what one AI talking to another AI would be like. We found Google’s AI project Verse by Verse, but I found it limiting to be able to choose from only a few popular English-language poets. I liked Michelle’s idea of a Dadaist approach to deconstructing a poem, so we decided that each of us would explore and play around with our own ideas for creating a visual poem with DALL.E, using either a found poem or one newly constructed by an AI.

Two popular AI poetry generators by Google are Poem Portraits and Verse by Verse, both of which I tried. Poem Portraits requires one word of your choice as input, followed by an optional photo capture. Verse by Verse generates lines in the style of three poets you choose and lets you pick among them. For the poem below I chose Emily Dickinson, Robert Frost, and Phillis Wheatley. The title and first line are mine; the last three lines come from each poet in that order, curated by me from Google’s generated suggestions according to which one I felt best fit the theme of the poem.

poem-portrait.png

my-poem.png

Surprisingly, I also found that OpenAI, the creator of DALL.E, made an AI haiku generator, but I decided not to use it because its haikus lacked a narrative or subject. I wanted poems with more of a narrative because I felt they generated the most interesting images when fed into DALL.E.

After more digging, I found that OpenAI had many more text-generating AIs available for semi-public use, such as this one, but it wasn’t specifically for creating poetry and was more of an auto-complete bot than a word generator. Then I found this haiku generator on Google Colab that generated pretty funny lines (note that it takes a couple of minutes to start up). Based on this article by the generator’s maker, I believe it uses a version of OpenAI’s GPT-3 language model.

Screen Shot 2022-10-05 at 01.12.52.png

Generating the images

I was also curious how DALL.E would respond to a poem in a different language, so I fed it the lines of Li Bai’s famous poem, “Drinking Alone Under the Moon.”
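The line-by-line approach can be sketched in Python. This is only an illustration, assuming each line of a poem is submitted to DALL.E as its own prompt; the function name `poem_to_prompts` and the optional `style_suffix` parameter are hypothetical, and no DALL.E API is called. The sample text is the English translation of the first line quoted in this post, split into its two halves for demonstration.

```python
def poem_to_prompts(poem: str, style_suffix: str = "") -> list[str]:
    """Split a poem into non-empty lines and turn each into one image prompt.

    Hypothetical helper: it only prepares text and does not call DALL.E.
    """
    prompts = []
    for line in poem.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between stanzas
        # Optionally append a style hint (an assumption, not used in the post)
        prompts.append(f"{line}, {style_suffix}" if style_suffix else line)
    return prompts


if __name__ == "__main__":
    poem = """
    a flask of wine amidst a flower field, drinking alone has no friends
    raising the cup to the moon, together with the shadow we become three
    """
    for number, prompt in enumerate(poem_to_prompts(poem), start=1):
        print(f"Prompt {number}: {prompt}")
```

Each resulting prompt would then be pasted into DALL.E by hand, one at a time.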

The first line of the poem, which says: “a flask of wine amidst a flower field, drinking alone has no friends; raising the cup to the moon, together with the shadow we become three”

DALL.E’s four generated images from the poem’s first line

It generated some uncanny-looking Japanese women with misplaced eyes, and an image of some men wearing garb oddly reminiscent of, but not quite matching, traditional Chinese groom clothing.

Line 2

Line 3

Line 4

The next ‘poem’ I fed it was the Verse by Verse poem I generated above, as I felt it had a bit more narrative than the other results.

Screen Shot 2022-10-04 at 23.57.10.png

Screen Shot 2022-10-04 at 23.58.20.png

Screen Shot 2022-10-04 at 23.59.34.png

Screen Shot 2022-10-05 at 00.00.22.png

Overall, I think it’s interesting how OpenAI’s DALL.E seems better at generating certain categories of images than others. I think it creates more realistic images of natural subjects such as skies, forests, trees, and even animals. It does a poorer job with humans, as it seems to misplace a lot of essential features on a human face. I’m curious whether I notice these flaws more in faces than in natural images simply because I’m human, and so I am better at judging what looks like a realistic face than what looks like a realistic squirrel or plant.