AI-art isn't art
DALL-E and other AI artists offer only the imitation of art
It is a tale told by an idiot, full of sound and fury, signifying nothing. —Macbeth
In Japan there’s a museum that collects rocks that happen to look like faces. Big rocks, small rocks, all rocks. Two holes and a line are generally enough for humans to perceive a face, and the rocks squint and smile and frown and frame their faces for all occasions to the humorous oooos and ahhhhs of visitors.
But, of course, the rocks are not smiling. There are no faces in rocks; the scientific term for seeing them anyway is pareidolia. No matter how face-like a rock looks, it’s an illusion. No one carved the face, no one cogitated over the face, no one intended a face. It just sort of happened that way. The key difference is that a sculptor is conscious, whereas the wind or rain or the deep fermentation of geological processes, when acting as sculptor, is not. Lacking consciousness, they lack intentionality, and therefore their products lack meaning.
The running joke in the rock-art museum is that everyone, from the visitors to the staff, knows the faces are unintentional—it would be a category error to think that these rocks are part of an art museum in the way that, say, the exhibits at the Mori Art Museum in Tokyo are part of an art museum.
AI-generated artwork is the same as a gallery of rock faces. It is pareidolia, an illusion of art, and if culture falls for that illusion we will lose something irreplaceable. We will lose art as an act of communication, and with it, the special place of consciousness in the production of the beautiful.
First, I am no Gary Marcus—I can tell when talent is in front of my face. AI-art (as it’s called) by the new DALL-E and its recent ilk is impressive. Over the last few months it has become clear we’re at the point where AIs can, with minimal prompting, produce an incredible array of images in all sorts of styles. Which has led to legitimate concern for working artists.
The talent of the new DALL-E is most obvious in its surprising displays of understanding: when asked to draw the creature Gollum, DALL-E does not simply create a simulacrum of the movie-version of Gollum (and surely such movie posters were in its training set)—incredibly, the result looks like what an illustrator might draw from the written description of Gollum after reading The Hobbit.
This uncanny intelligence can be seen in DALL-E’s response to ambiguity: when asked to give a robot hand drawing, it makes sure all possible interpretations are covered.
This is an Earth-shifting technology, and as the landscape moves under artists’ feet it’s worth considering the best-case and the worst-case scenarios.
In the best-case scenario, AI-art will always be limited in fundamental ways, just like how the current version of DALL-E is indeed limited: it has trouble once there are multiple characters in the same frame, it can’t count, it has difficulty placing objects inside other objects, it can’t follow extremely complex directions, it still has trouble with fine details like hands or splashing water, and occasionally it outputs images that break the laws of physics or object permanence in obvious ways. Perhaps these limitations will never be overcome, reflecting AI-art’s nature as simply a statistical mimic of real art. If so, high-level artistic output will generally still need the guiding hands of real artists (and their real consciousness) somewhere in the process. In this best-case scenario human artists use AI-artists mostly as a tool, like a more complex paintbrush.
But it’s also possible that the technology evolves beyond problems like “it messes up some fine-details” or “it can’t put things inside other things.” Already, many of its products are so good they simply don’t need any further input from human artists. There are now fully-rendered images of naturalistic paintings from a thing that never saw flowers, nor tasted rain.
In this worst-case scenario, DALL-E-esque AI-art will replace many working human artists, like illustrators or digital artists, and need no guiding human hand.
Such a pure replacement would affect so many. As a response over the coming years, you will likely hear things like “prompting is an artform in itself” to lessen the impact. Yet let’s look at how incredibly, gobsmackingly easy it is. In response to the prompt:
a painting by Grant Wood of a robot head with flowers growing out of the top
DALL-E did all these:
You might think that asking it for the style of an artist is cheating. But it’s just as easy without that. You just say what you want.
A cyberpunk style illustration of a happy robot head with flowers growing out of the top with a rainbow in the background, digital art
generated all this:
Given how simple this is, only the most self-congratulatory would refer to themselves as an “artist” simply for typing a one-sentence description of an image and letting DALL-E do the rest. Imagine someone commissioning an artist to do an artwork and then referring to themselves, in strict seriousness, as “the real artist” since they were the one who paid the commission (in fact, it’s exactly the same, as DALL-E will require fees to whatever Big Tech company OpenAI licenses it out to. Probably Microsoft).1
Which means the worst-case scenario, wherein AI-art replaces a significant portion of human-art, will herald the arrival of a world in which much of the “art” that you see, especially online (where most eyeballs are) is generated by non-conscious machines with minimal human input.2 Behind the entire aesthetic of our civilization there will be a vast emptiness, a void communicating nothing. In such a world the art isn’t art, in the same way that a photograph of a hurricane doesn’t get anything wet.
I’m well aware that since the dawn of time the latest wave of art, whatever it is, has been described as “not art” by some crotchety elder. But all those previous times the artist wasn’t being replaced by gigantic spreadsheets trained as autoencoders, so perhaps historical analogies aren’t very useful here.
And the close connection between art and human consciousness is older than civilization itself. The early hand paintings from places like Chauvet cave, dating back to around 30,000 BC, are not merely images, but communications: “I was here.”
This communicative property of art is irreducible to its extrinsic properties (like color, form, etc), and instead concerns the significance of the artwork to the artist and their communication of that significance to the viewer. To steal some terminology from analytic philosophy, let us call these latter properties the “intrinsic” properties of art to separate them from the merely extrinsic properties of art.
The highlighting of an intrinsic quality to art is an ancient view, e.g., as historian Will Durant writes about Aristotle in his The Story of Philosophy:
Artistic creation, says Aristotle, springs from the formative impulse and the craving for emotional expression. . . the aim of art is to represent not the outward appearance of things, but their inward significance; for this, and not the external mannerism and detail, is their reality.
It is an idea that has held for two millennia. The 20th-century art critic John Berger, for example, wrote in his seminal work Landscapes that:
The function of the work of art is to lead us from the work to the process of creation which it contains. . . By looking at it, we are, in effect, looking through an artist’s eyes, entering into a concretized instance of their gaze. We are looking at a looking. And from within an artist’s looking, we learn about the capacities of our kind and the possibilities of our future. . .
It’s a view perhaps most clearly articulated in Tolstoy’s 1897 classic monograph What is Art? Tolstoy, after clinically dissecting theories of art that focus merely on aesthetics and finding them unsatisfactory as definitions (for him, aesthetic theories of art are those that focus on extrinsic properties, or reactions to those extrinsic properties), arrives instead at the definition that:
. . . the aim of works of art is to infect people with the emotion the artist has experienced.
For Tolstoy, art is a virus, a contagion of what-it-is-likeness, or what a contemporary philosopher would call “qualia.” Although it is not always a veridical representation of the original consciousness, for the shared consciousness of art may mutate in the sharing, almost like a meme propagating online.
Art is a human activity, consisting in this, that one person consciously, by certain external signs, conveys to others feelings he has experienced, and other people are affected by these feelings and live them over in themselves.
With AI-art, this communication, this living over, this looking at a looking, vanishes. With the loss of this critical intrinsic property, what was a dialog becomes a lecture.
The response of AI-art enthusiasts, like those at the tech companies standing to make trillions off AI, or those who simply enjoy nerding out over new technological toys, will be to suggest that what matters in art is solely its extrinsic properties. That is, a painting is a painting, an image an image—in this view the definition of art is expanded to be such that if an image strikes someone as art, then it’s art. Case closed.
Consider the banality of this view, a view that reduces art to merely what’s in front of us. Without taking into account the consciousness of the artist, the word “art” loses all meaning, becoming merely a synonym for “beautiful.” We may find something pretty, or interesting, or striking, or pleasing, but none of these mean that it is art. We may find a natural vista affecting, we may even weep, but to say that it is “art” implies a cosmic consciousness working behind the scenes, an intrinsic property like a teleological origin, or a purpose that goes beyond the mere material. Without the intentionality of the artist taken into account, the definition of “art” is bled of all meaning, all differentiation, all usefulness as a term—such a move is really a defeat claimed in the name of victory. This is why deflationary theories of art, like how art is “whatever is in an art gallery” or “whatever anyone says” or “whatever pleases the senses” are all unsatisfying as definitions, for they strip the word “art” of all capacity to do the job of a word, which is to differentiate.
And it’s also worth pointing out to the AI-art enthusiast: a definition of art based solely on its extrinsic properties cannot be right, since changes in intrinsic properties do affect how art is perceived, its impact and nature. In his classic 1935 essay The Work of Art in the Age of Mechanical Reproduction, Walter Benjamin wrote:
. . . that which withers in the age of mechanical reproduction is the aura of the work of art.
By “aura” Walter Benjamin meant an intrinsic property, something irreducible to extrinsic properties like merely form, or color, i.e., the way something looks on its surface. His main example of aura is whether the art is the original or a facsimile.
Even the most perfect reproduction of a work of art is lacking in one element: its presence in time and space, its unique existence at the place where it happens to be. . .
That is, even if there’s no discernible difference in the extrinsic properties between an original work of art and a mechanical reproduction of it (like a photograph of it, or a postcard of it, or even an indistinguishable forgery, like the fake scaled replica of Chauvet cave that one must tour instead of the real one), the original is still the one we value and covet for its aura (which is the intrinsic property of indexical identity).
And the loss of this aura changes the nature of art, and has consequences w.r.t. art’s ultimate ends:
From a photographic negative, for example, one can make any number of prints; to ask for the ‘authentic’ print makes no sense. But the instant the criterion of authenticity ceases to be applicable to artistic production, the total function of art is reversed. Instead of being based on ritual, it begins to be based on another practice—politics.
Just as how something being either an original Da Vinci or a forgery does matter, even if side-by-side you couldn’t tell them apart, so too with two paintings, one made by a human and the other by an AI. Even if no one could tell them apart, one lacks all intentionality. It is a forgery, not of a specific work of art, but of the meaning behind art.
That may seem a highly abstract and perhaps merely philosophical point, but art’s aura is not an epiphenomenon: it has real functional effects in how we treat, value, and create technology around art. Consider how aptly Walter Benjamin’s essay predicts today:
A film operator shooting a scene in the studio captures the images at the speed of an actor’s speech. Just as lithography virtually implied the illustrated newspaper, so did photography foreshadow the sound film. The technical reproduction of sound was tackled at the end of the last century. These convergent endeavors made predictable a situation which Paul Valéry pointed up in this sentence: “Just as water, gas, and electricity are brought into our houses from far off to satisfy our needs in response to a minimal effort, so we shall be supplied with visual or auditory images, which will appear and disappear at a simple movement of the hand, hardly more than a sign.”
Such precognition! And the production of art via AI, and the concomitant loss of art’s aura of intentionality, accelerate these trends. In the future world, at the movement of a hand or the giving of a sign, images or audio or text will appear as if pulled from the non-conscious ether. Therefore, rather than talking about the “mechanical reproduction” of art, in our new age we must be concerned with the “non-conscious production” of art.
Art, having lost its aura of originality through mechanical reproduction and now its aura of communication through AI-art, will change, will warp, under the influence of these AI-artists. What should we expect art’s future to be under this new regime?
Some film scores are instantly memorable: Jurassic Park, the Harry Potter series, Star Wars. You could hum them aloud. Others, like those of the Marvel movies (which are among the best-performing movies of all time, btw), you couldn’t hum to save your life. Marvel movies fail here for the most common reason that movies end up with unmemorable scores: an industry practice called “imitating the temp,” with “temp” referring to the temporary film score used for the first cut of the movie. See, new digital editing tools have allowed editors and directors to easily click and drag scores from previous successful and similar movies to match them to scenes. The danger is that the director can no longer imagine scenes without the emotionality and tone of scores copied from other movies. They make a demand of the composer: imitate the temp. Circumscribed by the bounds of what came before, copying the original with just enough changes to avoid copyright infringement, the result is always lukewarm mediocrity.
AI-art imitates the temp. It’s a movie scored to a different movie’s music. It’s art in the style of someone long dead. It’s writing that’s purely parroting what came before. Consider how the AI is trained: all art is weighted equally, and nothing is picked out as special, or interesting, or of value. It is learning without a point-of-view. In What is Art? Tolstoy names such imitation, i.e., artistic mimicry, as the enemy of true art.
Many conditions must be fulfilled to enable a man to produce a real work of art. It is necessary that he should stand on the level of the highest life-conception of his time, that he should experience feeling and have the desire and capacity to transmit it, and that he should, moreover, have a talent for some one of the forms of art. It is very seldom that all these conditions necessary to the production of true art are combined. But in order—aided by the customary methods of borrowing, imitating, introducing effects, and interesting—unceasingly to produce counterfeits of art which pass for art in our society and are well paid for, it is only necessary to have a talent for some branch of art. . . To produce such counterfeits, definite rules or recipes exist in each branch of art. So that the talented man, having assimilated them, may produce such works à froid, cold drawn, without any feeling.
For Tolstoy, only returning to the well of original conscious experience allows for the renewal of art, since if
the young artist sets to work to copy those who are held up for his imitation, and he produces not only feeble works, but false works, counterfeits of art.
In Tolstoy’s time the biggest culprits of “counterfeit art” were art schools.
The one thing these schools can teach is how to transmit feelings experienced by other artists in the way those other artists transmitted them. And this is just what the professional schools do teach; and such instruction not only does not assist the spread of true art, but, on the contrary, by diffusing counterfeits of art, does more than anything else to deprive people of the capacity to understand true art.
What these new technologies will lead to (and we should never forget that AI-art comes from for-profit corporations) is an explosion of such counterfeit art. Even if they are not better than real human illustrators or digital artists, they will be cheaper. And the flood of cheap counterfeit art will drive out good art, eventually creating a situation similar to the Grossman-Stiglitz paradox from finance. A paradox which goes: if markets are efficient, you should just buy index funds, but if everyone just buys index funds, then how can markets be efficient, since no one is left picking winners or losers based on their relative strengths? Once everyone who creates culture is put out of business, where does the culture for training data come from?
The answer, of course, is from other AI-artists. Normally it is considered a faux pas to train your AI using data it itself generated, since this contaminates the training set. But if it does come to pass that AI-artists, by dint of being cheaper, drive out original artists, such contamination will become unavoidable in the coming years and decades when so many images (and music, video, text, etc.) are produced by AIs.3 Bestselling music and art and writing will be AI-art, or at least supplemented by AI inputs, which will lead to AIs being trained on the outputs of other AIs, even their own previous iterations, generating counterfeits of counterfeits of counterfeits.
Now, perhaps this worst-case replacement scenario won’t come to pass. Perhaps AI-art will merely be a tool for human consciousness to wield. I hope so.
But if the replacement does come to pass then it will exsanguinate so much meaning from our lives. For you were born into a world where most things were made by human consciousness. You may die in a world where nothing is made by human consciousness. AI-art, in the future summoned non-consciously at the wave of a hand, expressing no original emotionality or feeling, transmitting nothing, will create only an overwhelming surplus of counterfeit art. A score imitating the temp endlessly. Corporations will photocopy our culture over and over until it becomes only a grainy image.
1. Some might argue that super-famous artists like Damien Hirst occasionally get away with taking credit for their atelier’s work, and therefore are basically artists-as-commissioners, but (a) all these people have done a huge amount of other original work themselves, and (b) once everyone does Damien Hirst’s schtick, it loses its luster. The readymades of Duchamp were art because of their audacity, because of the context of the other art around them, and because of their arrangements. When everything is a readymade, it’s not an art gallery, it’s a scrapyard.
2. Current machine learning techniques almost certainly lack all consciousness. Although we cannot prove this for certain, there is likely nothing it is like to be DALL-E. This is certainly what the leading neuroscientific theories of consciousness would tell us: there is no “complex” of activity that forms a global workspace, no reentrant connections, no information integration, nor is there any “fame in the brain,” nor a “remembered present.” Unless one adopts the loosest of panpsychist theories, artificial neural networks will need to become a lot more brain-like in their architecture before we start ascribing to them a stream of consciousness (consider, for instance, how even the description of consciousness as a “stream” doesn’t fit DALL-E, since it is only active when queried with an input, which triggers an instantaneous feed-forward cascade, and then it falls into its eternal quiescence once again). And even if by some small chance such AIs did have a primitive consciousness, it would not involve the intentionality of human consciousness.
3. I’m aware there are current methods to distinguish AI-art from human art based on statistical signs of pixel distribution, but (a) once human-modified AI-art takes off it seems likely these signs will be obscured, sometimes purposefully, (b) it’s likely that as the programs get better their signatures will be harder to detect, (c) each program is different and there will be thousands of programs, and (d) this is even harder to keep straight for things like music, harder still for text generated by GPT-3.