Coarse is Better

When DALL-E came out, it took me a couple of weeks to pick my jaw up from the floor. I would go to sleep excited to wake up to a full quota, with a backlog of prompts to try. It was magical, miraculous. Like discovering a new universe. I compiled the best art in this post.

The other day a friend ran some of my old prompts through Nano Banana Pro (NBP), and put the old models side by side with the new. It’s interesting how after years of progress, the models are much better at making images, but infinitely worse at making art.

Electron Contours

Electron contours in the style of Italian futurism, oil on canvas, 1922, trending on ArtStation.

The old Midjourney v2 renders this:

Red and gold abstract shapes on a dark blue background.

NBP renders this:

Muted, red and blue ellipses against a machine background, in a golden frame.

Admittedly MJ’s output doesn’t look quite like futurism. But it looks like something. It looks compelling. The colours are bright and vivid. NBP’s output is studiously in the style of Italian futurism, but the colours are so muted and dull.

Maybe the “trending on ArtStation” is a bit of an archaism and impairs performance. Let’s try again without:

Red, gold, yellow circles intersect, thick impasto, oil on canvas, the word ELETTRONICO written in black across the frame.

Meh.

The Kowloon Walled City

Painting of an alley in the Kowloon Walled City, Eugène Boudin, 1895, trending on ArtStation.

MJ gave me this:

An impressionistic painting of an alley in a city, with a tree canopy above.

And it looks nothing like the Kowloon Walled City. But it’s beautiful. It’s coarse, impressionistic, vague, evocative, contradictory. It’s brimming with mystery. And it is, in fact, in the style of Eugène Boudin. This, by contrast, is the NBP output:

A muted painting of a commercial street in a Chinese city.

Sigh. It looks like every modern movie: so desaturated you feel you’re going colourblind. Let’s try forcing it:

Painting of an alley in the Kowloon Walled City, Eugène Boudin, 1895. Make it coarse, impressionistic, vague, evocative, contradictory, brimming with mystery.

A dark, muted painting of a commercial street in a Chinese city in the rain.

This is somewhat better, but why is it so drab and colourless? Is the machine trying to make me depressed?

The Dream Garden of the Poets

Attar and Ferdowsi in a dream garden, Persian miniature, circa 1300, from the British Museum.

Midjourney v2:

A man wearing a green robe, and a shorter man wearing a golden robe, on a floating island of green, over a landscape of cobalt blue.

It doesn’t quite look like anything. But it is beautiful, and evocative. I like to imagine that little splotch of paint on the upper right is a hoopoe. The NBP output:

A photograph of a generic Persian miniature in a display case.

Well, it looks like a Persian miniature. The “from the British Museum” bit, I meant that to be interpreted evocatively, rather than literally. The prompt cites a fictional object, bringing it into existence. But NBP reads this as: no, this is a photograph of a Persian miniature in the British Museum.

The Sack of Merv

The Burning of Merv by John William Waterhouse, 1896, from the British Museum.

Midjourney v2:

A woman in a dress, surrounded by flames, by black water, by a watching crowd.

It does look like Waterhouse. Semantically there’s room to argue: it looks like a woman being burnt at the stake, not the sack of a city. But aesthetically: it’s gorgeous. The flames are gorgeous, the reds of the dress are gorgeous. Look at the reeds in the background, and the black water, that looks like tarnished silver or pewter. The faces of the crowd. Is that a minotaur on the lower left, or a flower? What is she holding on her bent left arm? A crucifix, a dagger? You could find entire universes in this image, in this 1024x1024 frame.

By contrast, this is the NBP output:

A photograph of a painting of horse-mounted warriors outside a burning city. The photograph shows the painting is in a display room in a museum.

What can one say? It doesn’t look like Waterhouse. The horsemen wear Arab or Central Asian dress, but Merv was sacked in the year 1221 by the Mongol Empire. And, again, the “British Museum” line is taken literally rather than evocatively.

Lady Lovelace

Portrait of Ada Lovelace by Dante Gabriel Rossetti, 1859, auctioned by Christie’s.

Midjourney:

A portrait of Ada Lovelace against a circle of dark green.

This is beautiful. It is beautiful because the coarse, impressionistic brushstroke is more evocative than literal. And it actually looks like a woman drawn by Rossetti. And look at the greens! Gorgeously green. The palette is so narrow, and the painting is so beautiful.

The NBP output:

A photograph of a generic 19th century realist painting of a woman, in a gilt frame, taken at an angle inside a gallery. A Christie's auction book is seen on a table.

Pure philistinism. “Auctioned by Christie’s”, again, is meant to be evocative: “this is the kind of painting that would be sold at auction”. But NBP makes it a photograph of a painting at an auction house. Fine, I suppose I got what I asked for.

But the woman doesn’t look like Rossetti! This is absurd. How can a model from 2022 get this right, while the SOTA image generation model gives us generic oil painting slop?

The Cosmic Microwave Background

A Persian miniature of the cosmic microwave background, from Herat circa 1600, trending on ArtStation

Midjourney v2:

A golden disk, surrounded by concentric circles of Perso-Arabic lettering, against a dark blue background.

NBP:

The standard depiction of the CMB in the frame of a Persian miniature.

Again: what can one say?

Dream Story

Dream Story, 1961, blurry black and white photograph, yellow tint, from the Metropolitan Museum of Art.

This is one of my favourite DALL-E 2 outputs:

A photograph of two trees illuminated by a sepia glow in a dark forest. On the bottom-right corner, two people can be seen watching the scene.

A sepia photograph, showing two girls on a bed, and three people standing around them.

A vague, blurry sepia photograph of an indistinct man and woman.

Sepia photograph: three vague, almost alien-looking figures look at what might be a sculpture or painting.

They remind me of The King in Yellow. I love these because of how genuinely creepy and mysterious they are. You could pull a hundred horror stories from these.

It is hard to believe how bad the NBP output is:

A black and white photograph of people walking in a park. On the bottom left, a legend says: "Dream Story, 1961 - Metropolitan Museum of Art Archive".

What are we doing here? The old models were beautiful and compelling because the imperfections, vagueness, mistakes, and contradictions all create these little gaps through which your imagination can breathe life into the art. The images are not one fixed, static thing: they can be infinitely many things.

The new models—do I even need to finish this sentence? They’re too precise and high-resolution, so they cannot make abstract, many-faced things, they can only make specific, concrete things.

We need to make AI art weird again.