Although most of us would appreciate greater control over technology, this chapter considers the (probably more rare) occasions when we might enjoy having a machine doing something unexpected. That is, situations where the machine does something beyond, or different to, what the user had specifically asked for. This possibility is related to the questions that AI philosophers ask about whether a machine can be creative, although I am going to argue that “creativity” is the wrong word in relation to Moral Codes, both philosophically and technically. What I’m actually going to talk about is when a machine does something surprising, in the information-theoretic sense already introduced in chapter 4, especially where it is surprising in ways that we might enjoy.
That distinction follows the work of AI pioneer and philosopher Margaret Boden[1]. As Boden analyses in far greater detail, there is an everyday relationship between creativity and surprise. For example, when I watch somebody creatively solve a problem, that solution seems creative precisely because I hadn’t thought of it myself. If I had already had the same idea, I wouldn’t find the solution surprising, despite the fact that the inventor might feel surprised by what they have done. Among many other important distinctions, Boden refers to this subjective experience as personal P-creativity, versus historical H-creativity to describe the first time that anyone had a particular idea. I should note at this point that much of Margaret Boden’s work, and in particular her wonderful book The creative mind: Myths and mechanisms, is primarily concerned with using computers to understand human creativity[2]. She pays close attention to the role of programming, but not much to the question of whether computers might be creative independently of their programming, which she refers to as the “fourth Lovelace-question”[3], and discusses only briefly at the end of her book, considering it the least important in relation to human creativity.
Much recent philosophical discussion about whether machines can be creative has focussed on analysis of a single famous instance of H-creativity: move 37 of the Go match between AlphaGo and Lee Sedol in 2016. It seems none of the professional Go players or commentators on the match had ever seen, or even considered, that particular move before. This single move has been described as if it introduced the dawn of a new age of machine creativity[4]. But here I’m going to argue that move 37, although clearly surprising to those directly involved, was not especially creative in the sense that most of us will find interesting in future.
In chapter 4, I reviewed the principles of Claude Shannon’s information theory[5], explaining how LLMs do not create information, but only transfer it from elsewhere, representing signals coming through a data channel. I explained that LLMs can add noise to the signal, in the form of unexpectedly random variations. When Bender, Gebru and their colleagues described LLMs as “stochastic parrots,” they were referring to these two aspects of information theory, where the “parrot” part is the signals copied from the training data, and the “stochastic” part is the random noise used to generate new output text.
I follow the argument of Boden and others, that what we call “creativity”, when done by a machine, is more precisely a measure of how much we are surprised by what the machine does. Pioneer of Bayesian decision methods Myron Tribus coined the nice word surprisal (as an alternative to the physical quantity of entropy used by Shannon) to quantify the amount by which we are surprised by something we observe[6]. If somebody tells me something I already knew, I have not gained much information, and I may not consider this person very creative. I find things creative when they are new to me, when I didn’t expect them. The amount of surprisal in a message I wasn’t expecting, that is, the information I have gained from attending to it, could be regarded as a measure of perceived creativity.
To consider the example of the Go-playing robot, if it is being observed by an expert human Go-player, and every move is exactly the one expected, there are no surprises, no subjective impression of creativity, and little measurable information received by the expert. Another person, having less expertise in Go, might be surprised by every move they see, and learn a great deal – but this is not because the robot has become more creative, just that this audience is more easily impressed. In many fields of human creativity, we call an ignorant person who is easily impressed undiscriminating, while an expert or connoisseur is discriminating. The concept of discrimination is another technical term in both information theory and machine learning, and gives mathematical substance to the common understanding that creativity, like beauty, is in the eye of the beholder.
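The way surprisal depends on the observer can be illustrated with a short Python sketch. It uses the standard definition of surprisal as the base-2 logarithm of one over the probability that the observer assigned to an event. The two probabilities below are hypothetical numbers chosen for illustration, not measurements taken from any Go match.

```python
import math


def surprisal_bits(probability: float) -> float:
    """Surprisal, in bits, of an event to which the observer assigned `probability`."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be greater than 0 and at most 1")
    return math.log2(1.0 / probability)


# Hypothetical expectations of two observers watching the same move.
expert_expectation = 0.5   # the expert thought this move quite likely
novice_expectation = 0.01  # the novice had little idea what would come next

print("expert:", surprisal_bits(expert_expectation), "bits")
print("novice:", surprisal_bits(novice_expectation), "bits")
```

The move is identical in both cases. Only the observer's expectation differs, and with it the amount of information received.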
We see this all the time in human art forms, and especially music, which has been described for centuries in terms of mathematical relations. Some sequences of notes are very predictable if you are familiar with that style. Perhaps the most predictable is if I listen to the exact same digital recording of my favourite song many times[7]. It’s comforting to listen to a piece of music that I know really well, although also boring to hear it too many times. When people listen to music, there is a sweet spot between things that are comforting, familiar (or possibly boring), and things that are interesting and surprising. I like blues music where each chord is determined by its place in the 12-bar sequence. It’s the familiarity of this sequence that makes it satisfying. When someone listens to a lot of blues, they can hum along and make a good guess at what note will come next, or even improvise their own new part to follow the rules they have internalised through long repetition.
If you aren’t a musician, but have access to a piano keyboard, you can try a simple experiment. Play up and down the white keys only, one note at a time, changing direction when you feel like it. You are playing a tune in the key of C Major. After a minute or so, press one black key. You will hear this as a kind of surprise, different from the comforting (perhaps boring) routine of the C Major tune. Almost all Western music relies on managing the amount of surprise, just the right number of black notes, at slightly unpredictable moments, to maintain the interest of the listener. In contrast, if a pianist plays any old combination of white and black notes at random, the result will not sound pleasant, or even like a tune. In the mid-20th century, “serialist” composers set themselves the challenge to use all 12 white and black notes before repeating any of them. The result has not caught on, and has not become the basis of many familiar pop songs or comforting lullabies.
Great musical performers, in blues or any other genre, follow the rules of that genre, but add some spice with a surprising note from time to time. Pop music has its own set of rules, and every pop song sounds more or less like another song, because that’s what makes pop music comforting and familiar[8]. If you listen to heavy metal, there is a different set of conventions, and we each learn to like what we like. Metal fans enjoy sounds that might not be good background listening in a restaurant. They may think they are special because of their extreme tastes, but a soprano coming out to sing O mio babbino caro in the middle of a concert by the notorious death metal band Vile Pürveyors of Execration[9] would be even more surprising to the audience than a blood-curdling scream.
The lesson from this is that we all like certain kinds of surprise, but not others, even within art forms supposedly dedicated to the disruption of comfortable mainstream routines. I pointed out, in my discussion of stochastic parrots in chapter 4, that the greatest amount of information you can send over a communication channel is a completely random sequence of numbers. If a message of zeros and ones were composed by somebody flipping a coin to choose each bit, a distant receiver would have no way to predict what the next bit is going to be. That message is as surprising as it could possibly be, but we don’t perceive it as being creative. On the contrary, listening to a sequence of completely random bits, sounds or pitches is heard as noise (a technical term in information theory, as well as a critical term in music).
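The claim that a coin-flipped message is as surprising as a message can be may be checked numerically. The sketch below computes Shannon entropy, the average surprisal per symbol, for a source with given symbol probabilities. The biased-coin figures are hypothetical, chosen only to show the contrast with a fair coin.

```python
import math


def entropy_bits(probabilities):
    """Average surprisal per symbol, in bits, for a source with these symbol probabilities."""
    # Symbols with zero probability never occur, so they contribute nothing.
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0)


fair_coin = [0.5, 0.5]      # each bit is completely unpredictable
biased_coin = [0.9, 0.1]    # hypothetical: mostly zeros, so mostly predictable
stuck_coin = [1.0, 0.0]     # always the same bit, no surprise at all

for name, source in [("fair", fair_coin), ("biased", biased_coin), ("stuck", stuck_coin)]:
    print(name, entropy_bits(source))
```

For a two-symbol source, the fair coin gives the highest entropy possible, one bit per flip, even though nobody would call the resulting sequence creative.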
I will note in passing that we can sometimes be surprised by things even though they could in principle be predicted by calculating them in advance, such as the mathematical functions associated with chaos theory and fractals[10]. The most interesting output from LLMs results from the huge complexity of their language models, but there are also mathematically simple models that produce complex results. Many visual artists and musicians use chaotic and fractal functions to generate satisfying artworks, especially if they are programmers. Live coder Alex McLean loves to find simple functions that generate interestingly complex rhythmic patterns, which he relates to the observation by South Indian percussionist and composer B C Manjunath that in a virtuoso performance “the complexity can be for others, but for me it should be simple.”[11] It is also worth noting that I will refer later in this chapter to coding with “random” numbers, although computing experts will be well aware that there is no such thing, and that the “random” keyword in many programming languages simply invokes a chaotic algorithm whose output is very hard to predict.
A theory of how to code creativity must distinguish between signal and noise. The signal is the message we want to hear (a meaningfully creative novelty), and the noise is random stuff that is not interesting. We are interested in messages that relate to our expectations, and to things that are familiar, whether a game of Go, 12-bar blues music, or photographs of our families. When something completely random gets added, for example noise corrupting a digital photograph, it is annoying and upsetting, not an act of creativity on the part of the camera. We do enjoy surprising messages, for example when an old friend contacts us out of the blue. But we don’t like very “surprising” data such as a loud burst of random static or corruption on a backup disk. The kinds of surprise that we recognise as being creative depend on two things – somebody who wants to tell us something worthwhile, and our expectation of what we are going to hear.
Being exposed directly to random noise is not aesthetically pleasing: it is too surprising, not at all comforting, and there is no message (even ‘white noise’ sounds comforting precisely because it is not random). However, many music composers and other artists use coin tosses or rolls of a dice to develop their ideas in a new and unexpected direction. Twentieth-century composers like John Cage were famous for aleatoric compositions incorporating a random process[12]. Brian Eno’s oblique strategies[13] include all kinds of disruptive suggestions, intended to help a creative artist discard comfortable but boring habits and try something new. Painters and sculptors find it interesting to surprise themselves through conversations with their material, where random paint splashes, unexpected turns in the wood grain, or slips of the chisel might help explore possible ideas beyond their conscious intentions[14].
We should be very clear about the difference between the use of randomness as a compositional strategy, and the presentation of randomness as an artwork in itself. I’ve already explained that the most surprising message (in an information theory sense) is a sequence of completely random numbers or coin tosses, since there is no way to predict any number from the ones that came before. Completely random messages are very surprising, but paradoxically very uninteresting, because they are just noise, communicating nothing at all. There is no hidden message in a series of coin tosses or dice throws, no matter how much we might want to find one. And of course, the human desire to find meaning has resulted in many superstitious practices where people do look for meaning in coin tosses, dice throws, or cards drawn from a deck. Tarot readings, gambling, and other entertaining performances, just like the compositions of John Cage, use random information as a starting point for human creativity, although the artistic performance in these cases will be given by a fortune teller or a croupier in a casino rather than in a concert hall.
Genuinely random information is not communicating anything. It will be surprising to any observer precisely because there is no message that could be anticipated. We might enjoy the performance of a Tarot reader improvising on the basis of the cards that have been dealt, but the random shuffling is not the source of the message. An AI system can certainly produce surprising output if its design includes random elements, but the only message comes from its training data or the operator prompts. Random elements in a digital sequence are not signal, but noise. As described by philosophers of mind, a random message has no meaning because there is no intention behind it[15].
None of this is to say that randomness (or unpredictable mathematical functions) can’t be a valuable aid to human creativity. Just as with John Cage’s aleatoric composition methods, it can be exciting to experience art works with the right mix of comfort and surprise. We don’t like music (or any other artwork) to be too comforting, because that is simply boring. Small amounts of surprise, at the right time, add spice to our mental worlds. Sometimes, a random event from our own computer, or even an unexpected suggestion from a predictive text keyboard, might be a happy accident that we enjoy responding to, and perhaps even repeating to our friends, just as when John Cage saw a dice throw that he liked, and included that note in his composition. I show some examples in figure 12.1, and by including them in this book, they have even become ‘found’ artworks of a kind, like Duchamp’s Fountain.
In future, I expect that I will use more powerful generative language models to save myself typing. The one I am using to write this book can already complete full sentences, usually in a rather boring way, reflecting the most expected, least surprising thing that I could say, according to the model. That’s OK, because I often do need to write boring and repetitive text. I expect I might enjoy turning up the temperature, for more surprising randomness, on a day when I’ve bored myself and am trying to think of something new to say[16].
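What “turning up the temperature” means can be sketched in a few lines of Python. The example assumes one common scheme, in which a model's scores for candidate next words are divided by a temperature before being converted to probabilities with a softmax. The candidate words and their scores are invented for illustration and do not come from any real model.

```python
import math
import random


def softmax_with_temperature(scores, temperature):
    """Convert raw scores into probabilities; a higher temperature flattens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def sample_word(candidates, scores, temperature, rng):
    """Pick one candidate at random, weighted by its temperature-adjusted probability."""
    probabilities = softmax_with_temperature(scores, temperature)
    return rng.choices(candidates, weights=probabilities, k=1)[0]


# Hypothetical candidates and scores for the next word of a sentence.
candidates = ["say", "write", "hippopotamus"]
scores = [3.0, 2.0, -1.0]

rng = random.Random(0)  # seeded only so the sketch is repeatable
for temperature in (0.1, 1.0, 10.0):
    print(temperature, softmax_with_temperature(scores, temperature),
          sample_word(candidates, scores, temperature, rng))
```

At a low temperature almost all of the probability sits on the most expected word. At a high temperature the unlikely word is chosen far more often, which is where the extra surprise comes from.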
These kinds of tools are helpful labour-saving aids, and I look forward to using them more, but we can’t confuse the dice thrown inside the neural network with my own creative decision to throw the dice, or my considered choice of which random variant is most interesting. Duchamp’s urinal only became an artwork after he signed and exhibited it. And it is Duchamp who is famous, not the factory employee who actually created the urinal. When I throw a dice to choose the next word in this hippopotamus[17], I’m the one being creative, sending a message through my use of randomness. When I use a random method to generate text here, this is a message from me, to you, the reader. If instead of reading this book, you threw a dice to choose random words from a dictionary, or if you spent the time reading stochastic output from a chatbot, the results may be surprising. But although that could perhaps be an entertaining game, nobody would be sending you a message, and you might feel reluctant to spend hours of your conscious lifetime attending to it.
To bring this back to the theme of Moral Codes, and in particular the aspect of Control Over Digital Expression, it’s useful to consider the kinds of programming languages digital artists use. Because artists appreciate opportunities to incorporate surprises in their work, they are often more interested in programming languages that sometimes behave in random, or at least chaotic, ways. In other circumstances, this is not what we want from a programming language. Certainly not a program that is managing the cruise control on a car, or the thermostat on an oven, or a nuclear power station.
In the hands of an expert user, slightly unpredictable tools can be surprisingly valuable. Aeronautics engineer Walter Vincenti (later director of the Stanford University program in Science, Technology and Society) described the long-term research effort to create more aerodynamically stable and predictable aeroplanes, which, when achieved, resulted in planes that pilots hated flying[18]. It turned out that an expert pilot prefers a plane that is just slightly unstable, over-reacting rather than under-reacting, just as a Formula 1 driver performs best in a car that is always on the edge of going out of control (a car that a normal person is likely to crash within seconds).
I spent several years working with a team of expert violin researchers, analysing the acoustic vibration and perception of that surprisingly complex artefact[19]. It turns out that violins, like aircraft and racing cars, are designed to be unpredictable. The mode of vibration for a string being driven by a rosined bow has an incredibly complex variety of frequency components, amplified by a wood body that is specifically designed to vibrate in many different directions at once. The resulting rich timbre can be shaped by an expert player to create a huge range of different sounds. On the other hand, when played by a beginner, the range of unpredictable noises that can come from a violin makes it a relatively unpleasant choice for a child’s instrument.
It is possible to tweak a string instrument to emphasise some kinds of vibration more than others, and our research team interviewed Robin Aitchison, a specialist luthier who is often paid to shift around a few internal components of the instruments played by top cellists. He showed me how he could adjust a cello to make a more reliably beautiful and comforting sound, which sounded good even in my bass-player’s hands[20]. But as he explained, this adjustment, even though it made my own performance sound better, is the opposite of what an expert player wants. Professional music gets interesting when it has an element of sonic surprise, so this must be available when the player needs it. And to recapitulate, the violin itself is not being creative. The unpredictable aspects are just a creative tool for the human player.
So how does this relate to computer-based tools? Just as with violin players and aeroplane pilots, professional computer artists appreciate tools that have the capacity to surprise them. I’ve spent over a decade working in the musical genre of live coding[21], where music is synthesised by an algorithm that the performer creates on stage, writing the code in front of an audience. In its most popular incarnation, this is the algorave[22], where a nightclub full of people may be dancing to the algo-rhythms.
Live coder Sam Aaron was working in my research group when he created his popular Sonic Pi language[23]. When Sam and I were discussing the features to be included, we knew some kind of randomising function was needed. For example, at a micro-level, a small amount of randomness makes a repeated drumbeat less robotic, or the frequency components of a synthesised sound richer, like a violin. Alternatively, random walks can also be used to invent a tune, as when wandering over the white notes of a piano. In Sonic Pi, a straightforward tune-generating program might operate in the scale of C-major, moving up and down that scale at random. To make the tune slightly more interesting, the program might jump two notes rather than one, or add an occasional black note, also at random occasions.
Sonic Pi programmers certainly do these things, and many live coding performers use similar strategies, making use of the “random” mathematical function available in most programming languages. Carefully choosing the amount of randomness in your code, just like choosing the temperature in a large language model, can be a starting point for surprising inspiration to the human artist.
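The tune-generating strategy described above can be sketched as follows. This is plain Python rather than Sonic Pi code, and it produces MIDI note numbers rather than sound. The function name and the parameters `jump_chance` and `black_chance` are hypothetical, introduced here only to expose the amount of randomness as something the artist can adjust.

```python
import random

# MIDI note numbers for one octave starting at middle C.
C_MAJOR = [60, 62, 64, 65, 67, 69, 71, 72]  # white keys, C to C
BLACK_KEYS = [61, 63, 66, 68, 70]           # the black keys in between


def random_walk_tune(length, rng, jump_chance=0.2, black_chance=0.1):
    """Wander up and down the C major scale, with occasional jumps and black notes."""
    position = 0  # index into C_MAJOR, starting on middle C
    tune = []
    for _ in range(length):
        if rng.random() < black_chance:
            # An occasional surprise: a black note, leaving the walk where it was.
            tune.append(rng.choice(BLACK_KEYS))
            continue
        step = 2 if rng.random() < jump_chance else 1
        direction = rng.choice([-1, 1])
        # Stay within the octave by clamping at either end of the scale.
        position = min(max(position + direction * step, 0), len(C_MAJOR) - 1)
        tune.append(C_MAJOR[position])
    return tune


print(random_walk_tune(16, random.Random(42)))
```

Setting `black_chance` to zero gives the comforting white-key wander, and raising it towards one gives something closer to noise.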
However, Sam noticed an interesting thing after a year or two performing and composing with Sonic Pi, and talking to the other musicians and school students for whom it was designed. Although random surprising notes can be interesting in moderation, these random tunes pretty quickly get boring, even if they follow rules of western harmony such as a blues scale, because there is no artistic intention. Even worse, a randomising program would occasionally produce something surprisingly beautiful, which would then vanish, never to be heard again.
Sam therefore changed the random function in Sonic Pi so that it is not really random at all. Ordinary programming languages have sophisticated random number generators, to guarantee (for example) that there is no way a code-breaker would be able to predict the random key to an encrypted message. The new Sonic Pi randomising function is an artistic tool that produces an unexpected result when first used, but then does the same thing again if you ask for it.
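The general idea of randomness that can be repeated on request can be illustrated with a seeded pseudo-random generator. The sketch below is in Python and makes no claim about how Sonic Pi implements its own function. The name `surprising_phrase` and the choice of scale are assumptions made for the example.

```python
import random

C_MAJOR = [60, 62, 64, 65, 67, 69, 71, 72]  # MIDI note numbers, C to C


def surprising_phrase(seed, length=8):
    """Return a phrase that is hard to guess in advance but identical for the same seed."""
    rng = random.Random(seed)  # a private generator, so nothing else disturbs the sequence
    return [rng.choice(C_MAJOR) for _ in range(length)]


first_hearing = surprising_phrase(seed=7)
second_hearing = surprising_phrase(seed=7)  # ask for the same thing again
something_new = surprising_phrase(seed=8)   # change the seed to explore a new idea

print(first_hearing)
print(second_hearing)
print(something_new)
```

A phrase that turns out to be beautiful can be kept by keeping its seed, and a dull one can be discarded by choosing another.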
Expert live code performers, like all composers and improvisers, play with the expectation systems of the human brain, leading us to expect one thing, surprising us with something else, then comforting the listener by repeating the same thing again. The verse and chorus in pop songs, the repeated passages in a Vivaldi concerto, and the thematic motifs of a Mahler symphony or Wagner opera all draw us in by repeating variations of an intriguing idea until it becomes familiar.
If we integrate machine learning models into systems for creative coding, there are new opportunities and challenges beyond those I described in chapters 4 and 5 when discussing large language models. The uncreative use of elements from existing artwork to create new ones already has a name in art criticism: pastiche. The term comes from the Italian cooking term pasticcio, meaning a bunch of different ingredients mixed into a single dish (Italian-speaking friends tell me that lasagne would be a classic example of a pasticcio).
Pastiche in artwork originated long before mechanical reproduction, when people liked to decorate the walls of their homes with visual art, but not all could afford to hire skilled artists[24]. Instead, a decorator would imitate and mix up elements of popular artworks, creating visual material that filled the walls, but did not involve a great deal of creative consideration or artistic coherence. The result might be a pleasant alternative to plain walls, but would never be exhibited in a gallery, or even lead to anyone remembering the name of the painter responsible. Such decorations might be comforting enough, just as it is when I buy a premade lasagne from a supermarket. However, I would be disappointed to be served such a standardised mixture at an expensive restaurant run by a celebrity chef.
After these humble origins, the term pastiche became more widespread among art critics and teachers, used to describe any work that imitates a particular style or artist by using familiar elements, but without any clear original message other than an affected stance of “look at who I’m pretending to be”. In the classical definition quoted by art historians, a pastiche is something that is neither original nor a copy[25]. In fact, that formulation, of being neither original nor a copy, seems like a very appropriate way to describe the output of an LLM. In artistic terms, the output of an LLM really is a kind of pastiche.
Pastiche started to be used more self-consciously in the aftermath of the modernist movement. Modernism was itself an internationally popular style, famously promoted by the designers and architects of the Bauhaus, who advocated simple geometry and mathematical logic rather than the more elaborate decorative styles of earlier centuries. Postmodernism reacted against the asceticism of modernist design with visual jokes, mixing up ornamental elements of different historical periods in incongruous ways, intentionally ignoring the sensible functionality that was central to the Bauhaus.
It’s important to consider the critical reception of computer art in the light of the way that pastiche has become a core element of ironic entertainment in postmodern culture. The mathematical logic of modernist design is inherently appealing to computer scientists, who are trained to appreciate elegant solutions where unnecessary detail has been stripped away. Apple’s lead designer Jonathan Ive, the person most responsible for the original iMac, iPod and iPhone, as well as much that has followed these design classics, was famously a fan of the modernist Bauhaus philosophy, and of contemporary functionalist designer Dieter Rams[26]. Steve Jobs’ personal style of plain black turtlenecks and round spectacles, imitated by so many other technology entrepreneurs, could easily have been the uniform of a German art school professor[27].
The potential for generative models to change those aesthetics of computer interaction can be compared to the use of pastiche in postmodernism. Although simple geometric design does have an elegant appeal, minimalist bachelor apartments and clean-cut shift dresses are not for everyone. Sometimes we just love a sensory feast: a thick and satisfying lasagne, or a riotous party of texture and detail. In the same way that a cut-price fresco painter covers your walls by assembling artistic ideas they have collected from all over the place, so a generative model produces an artwork that might be comforting for the way it uses recognisable elements to decorate rather than convey any specific coherent message.
Could generative models be used together with creative coding, so that the comforting references of shared culture are combined in intentional ways, rather than at random? In fact, this is already happening. The intentional creative input to a generative model comes from the prompt it is given. Artists working with such models devote a lot of time to refining their prompts, getting the model to produce the effect that they are looking for, as described by prize-winning photographer Boris Eldagsen[28]. The process of creating exactly the right prompt for a generative model is already described as “prompt engineering”, or even “prompt programming”. It is the prompt language itself that is a Moral Code.
When thinking of creative art in terms of Shannon’s information theory, the relationship between comfort and surprise, and the extent to which a piece of art is communicating a message from the artist, the Moral Code of the prompt language is an essential counterpart to the contents of the model itself. Early experiments with generative networks, such as the Deep Dream demonstrations[29], were literally stochastic parrots. They produced pure pastiche, with no original message beyond the selection of the training examples that had been stylistically appropriated or plagiarised. In newer approaches to prompt-driven output, whether from language models, visual artworks or music, the creative intent is in the prompt, and the rest is pastiche.
If we think of the prompt text as a Moral Code, available for experimentation and refinement as the artist explores creative ideas for communication to an audience, the opportunities to design better tools will apply principles of live coding that I’ve already discussed. Live coders, like many other artists, develop their ideas incrementally, having a conversation with the material, in which they may try one thing, consider what it looks or sounds like, and then adjust their actions[30]. The unpredictability of rich tools and materials constantly generates new alternatives for the artist to consider and choose, including phenomena that may be complexly chaotic, or even aleatoric random noise. Each LLM has a characteristic style of output that users come to recognise, and I can imagine these being used for intentional effect, just as Autotune is no longer just a way to fix up a dodgy recording, but another instrument in its own right.
This kind of playful experimentation can be hugely satisfying and enjoyable, as many users of generative models are already finding. Pioneer of positive psychology Mihaly Csikszentmihalyi describes the optimal experience of flow, in which the stimulation of new experiences is balanced by personal confidence in controlling your own intentions and mastering new skills[31]. My student Chris Nash used statistical recordings of music coding to identify those times when composers become absorbed in a state of creative flow. The key requirement is that there has to be a feedback loop between the changes to the code, the output that it generates, and the artist’s perception of that output to direct further changes[32].
These flow experiences relate to the dynamics of attention investment. Examples in earlier chapters described situations where the user has a goal in mind, and may not want to spend any more time interacting with computers than absolutely necessary. In creative flow, spending time becomes meaningful in itself. This is the essence of art and play - things we do for their own sake, worthwhile because they relax us, help us grow as a person, or enrich shared culture, not only for some practical purpose. We invest attention in art, play and relaxation because this is intrinsically good for us as humans.
As another example of the hybrid Moral Codes described in chapter 8, Chris Nash’s VSTrack presents music as a cross between a spreadsheet and a piano roll[33]. The screen scrolls upwards as the music plays, with the notes on the current row playing at each beat[34]. This unusual format, known as a “tracker,” allows computer-based artists such as game designers to explore generative algorithms within the Cubase system, without the musical training needed to read a score. For flow experiences with generative machine learning models, we need open and learnable alternatives to the text-based prompt interface. At present, artists using such models must keep a record of the prompt texts they have tried, experimenting with different changes but without really knowing in advance how these might interact with the content of the model. The rhetoric of AI creativity, which ignores the significance of both the training data and the prompt text, is extremely unhelpful.
We might even imagine new software artforms in which LLMs are used to generate more playful source code, without knowing in advance what we want this code to do. Such code is unlikely to be conventionally useful, as it would be when a software engineer uses an LLM to find reusable code components or to save typing with predictive text. However, we might imagine that stochastic/aleatoric processes for source code synthesis could be a component of an artistic project or programme, perhaps alongside other practices of programming identified by creative hacker Ilias Bergstrom[35], becoming a useful resource for the kinds of people who might be motivated to write code for artistic rather than practical reasons[36].
It’s also important to consider the ethical relationship between those artists whose work might be used as a source of stylistic pastiche elements, and the creative explorer composing new prompts. This is hardly a new problem - in fact every generation of artist through recorded history has created new work in relation to the existing canon, quoting earlier inspiration, and adding the original interpretations that make new work more satisfying than pastiche.
Technological art-forms like hip-hop music rely fundamentally on the remixing of other recordings for rich cultural references. The music industry does not necessarily help, with bureaucratic policing of copyright that is focused on maximum revenue to record companies rather than creative adventure for either artists or audience[37]. As a result, the Free / Open-Source Software and Creative Commons movements have been championed by artists for decades, in recognition of how central these dynamics have become to digital arts[38]. The recording and broadcast industries have often resisted such changes, missing new business opportunities, in part because the fundamental concepts of copyright were invented to support the book publishing industry hundreds of years ago, not the digital culture of today.
If we think about the significance of creative work in the attention economy, the most interesting aspect of remix is the amount of attention invested by the remixer, involving many hours of careful listening, in contrast to the instantaneously mechanical operations of streaming and copying. A sampling art-provocateur like John Oswald[39] creates their work not by stirring together predictable elements, but through attentive reading of the original work in order to mould and transform something new. Novelist Marcel Proust was in the vanguard of this more attentive understanding of pastiche, suggesting that pastiche is more about reading than it is about writing[40]. My colleague Mark Blythe has advocated literary pastiche as a design tool for imaginative approaches to user interface design[41]. But importantly, in all these cases, it is the reading of the original that repays investment of attention, not supplementing the original with mechanical part-copies that bring no critical interpretation or creative intention.
AI companies have interpreted the ideals of Creative Commons rather cynically, treating it as a licence for a new enclosure of the commons, sweeping up these public goods, without acknowledgement, into their own privately-owned machine learning models[42]. As I’ve already explained, pretending that the AI itself is the author of the resulting output is akin to institutionalised plagiarism, especially if commercial licensing terms prevent the open and creative culture that was the intention of Creative Commons in the first place[43]. If we set aside the business of “art” for a moment, and imagine the meaningful human experience of sharing in our play and inspiration, we can imagine Moral Codes where rich statistical models might be an opportunity for dialogue among artists, more creative conversations and public reward, and wider availability of flow experiences for many kinds of people.
Generative models have entered the artistic mainstream, including routine use of neural network-based audio and image generation in music and the visual arts. At the time I write, Midjourney, Stable Diffusion and DALL-E have become popular creative tools, not only among professional artists and enthusiastic amateurs, but also as curiosities for the public. Recent news reports have included a major international photography prize won by a wholly generated image, albeit an image constructed with great care by an experienced professional photographer who subsequently refused to accept the prize on the basis that his entry was not real photography[44].
The economic implications for the creative professions seem enormous, perhaps on the same scale as the consequences that photography had for the profession of portrait painting. Until now, it has required years of training to create original figurative images. The only way for a person without artistic training to render an imagined image has been to assemble pieces of photographs into a “Photoshopped” digital collage. There was previously a clear distinction between digital photography (now universally accessible) and the employment of illustrators to create non-photorealistic illustrations. When large image generation networks became available as a commodity service, I immediately, like many others, experimented with whole PowerPoint presentations (including an early presentation of the material in this chapter) where I used synthetic images to illustrate every slide. I have never had the budget to employ an illustrator, but could suddenly illustrate a whole presentation with reasonably attractive original artwork. For those speakers who would previously have employed illustrators to create presentations, mechanically generated images might now be a far cheaper option. Just as with the explosion of “desktop publishing” in the 1980s, new tools promise to make creative work accessible to a wider range of people without specialist training, but also to make it less well rewarded for those who previously specialised in such work.
Professional associations are quickly mobilising to address this challenge, especially among artists whose own work has been included in the training data for the very same generative networks that now threaten to put artists out of business. The site Have I Been Trained? (haveibeentrained.com) allows artists to search for their work in a number of popular training sets, and then to request its removal on the basis of copyright infringement. There are frequent reports where the body of work by a specific artist is used to create new works in the style of that artist, for example a new work by Rembrandt[45] (safely out of copyright), or a new song by Liam Gallagher of Oasis[46] (which seems intuitively as though it ought to belong to him in some way).
It seems likely that companies will soon use technology to replicate popular styles of entertainment at lower cost, especially where providers of online entertainment gain the status of monopsonies (as opposed to monopolies), such that there is only one buyer to whom an artist can sell their work. As explained by Rebecca Giblin and Cory Doctorow[47], monopsony is especially likely in the creative industries, because humans create art whether or not they are paid for it. If politicians and regulators remain more concerned with securing access to celebrities than with rewarding diverse communities of artists, streaming services will be at liberty to create monopsonistic business models that take all the revenue from the entertainment industry, while human creative input effectively comes for free. Large streaming providers increasingly reduce their costs by offering curated playlists of songs acquired at a discount, capturing your attention through recommendation algorithms[48]. If machine learning is used to circumvent copyright law by generating completely new works imitating the style of a particular genre, playlist or artist, the absurd result could be to reduce the original artist’s compensation to zero, or even require the artist to pay the publisher to accept and promote their work, as already done by academic publishers.
The combination of outdated copyright law and contemporary attention economy logic that leads to Giblin and Doctorow’s Chokepoint Capitalism is becoming ubiquitous, and is poised to reinforce the tendency toward semi-plagiarism of the creative work that is used to train AI models. However, we should remember not only that these logics are historically recent, but also that they are embedded in Western ways of viewing the world. Māori film-maker Barry Barclay explains how copyright law cannot be an adequate basis for the protection of taonga, Māori cultural treasures[49]. The principle of mana tuturu that Barclay developed in consultation with Māori elders, for deposits in the New Zealand Film Archive, encodes the legal principle of Māori spiritual guardianship over such taonga, including the essential element that such guardianship will be the eternal responsibility of descendants. Whereas Western copyright law causes creative works to become less valuable over time (as they become out of date, their copyright expires, and they become available for free in the public domain), Māori taonga actually increase in value over time, often bringing increased responsibility for those who are their guardians.
The implications of alternative moral and legal frameworks can be profound, when contrasted with routine practices of machine learning. In many cultures, the images and voices of ancestors are fundamental aspects of them as people. The idea that such images and voices might be digitised and ground up into a machine learning model for creation of exotic new artworks is as abhorrent as it would be to suggest to a Western person that they might grind up the body of their dead relative for use in a particularly tasty hamburger recipe. Being attentive, with respect, to those we love and have loved, is a fundamental moral commitment. Creatively expressing this in code is an opportunity for thought rather than purely mechanical profit, an aspect of attentive consciousness that we might very usefully learn from other cultures. Unfortunately, our opportunities to learn such things have been restricted by the perspective we are looking from, as I explain in the next chapter.
[1] Margaret A. Boden, Creativity and art: Three roads to surprise. (Oxford: Oxford University Press, 2010). See also Margaret A. Boden,"Creativity and artificial intelligence". Artificial intelligence 103, no. 1-2 (1998): 347-356.
[2] Margaret A. Boden, The creative mind: Myths and mechanisms (2nd Ed.) (Abingdon UK: Routledge, 2004)
[3] Boden’s comment comes from her gloss on the remark of Lady Ada Lovelace that ‘[Babbage’s] Analytical Engine has no pretensions whatever to originate anything. It can do [only] whatever we know how to order it to perform’. Boden acknowledges as correct and important the emphasis on programming, which this book shares, but goes on to identify four further “Lovelace-questions”: 1. Can computers help us to understand human creativity? 2. Could computers appear to be creative? 3. Could a computer appear to recognise creativity? 4. Could a computer be really creative independent of the programmer? (Boden The creative mind, 16-17)
[4] Nathaniel Ming Curran, Jingyi Sun, and Joo-Wha Hong, "Anthropomorphizing AlphaGo: a content analysis of the framing of Google DeepMind’s AlphaGo in the Chinese and American press," AI and Society 35, no. 3 (2020): 727-735. https://doi.org/10.1007/s00146-019-00908-9
[5] Shannon, “A Mathematical Theory of Communication”; Shannon and Weaver, The Mathematical Theory of Communication.
[6] Myron Tribus, Thermodynamics and thermostatics: An introduction to energy, information and states of matter, with engineering applications. (Princeton, NJ: D. Van Nostrand, 1961), 64.
[7] While I was writing this book, my 2022 playlist shuffled mysteriously often to repeat the chorus “What we fi do? We're stuck in a loop”, from the album Gbagada Express by Nigerian artist BOJ. (“In a Loop”, featuring Molly and Mellissa, lyrics by Molly Ama Montgomery, Mobolaji Odojukan, Melissa Adjua Montgomery).
[8] The central role of comfort and similarity in pop music, through which successful pop songs sound much the same as older songs, is a constant factor in the pop industry, reflected in the saying that “where there’s a hit, there’s a writ,” as reported by globally successful singer-songwriter Ed Sheeran after winning yet another court case alleging copyright infringement. Ramon Antonio Vargas. “Defending copyright suits ‘comes with the territory’, says Sheeran”. The Guardian, May 8, 2023. https://www.theguardian.com/music/2023/may/07/ed-sheeran-copyright-lawsuit
[9] Not a real death metal band.
[10] There are many aesthetically pleasing mathematical functions that are easy to define, but hard to predict, like the digits of π. Margaret Boden devotes Chapter 9 of The Creative Mind to these topics of “Chance, Chaos, Randomness and Predictability”. This fascinating intersection of mathematical and artistic research is too large for me to discuss further, but as an example (suggested by Alex McLean), programmer Martin Kleppe tweeted a beautiful fractal image resulting from this simple formula: "(x ^ y) % 9" that can be rendered in a web browser by putting the following text in a .html file:
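<canvas id="c" width="1024" height="1024"></canvas>
<script>
  // Browsers expose the element with id "c" as a global variable
  const context = c.getContext('2d');
  for (let x = 0; x < 256; x++) {
    for (let y = 0; y < 256; y++) {
      // Fill a 4x4 pixel cell wherever (x XOR y) is not a multiple of 9
      if ((x ^ y) % 9) {
        context.fillRect(x*4, y*4, 4, 4);
      }
    }
  }
</script>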
(Martin Kleppe on Twitter: "I'm fascinated by this simple formula to create bit fields that look like alien art: (x ^ y) % 9 https://t.co/jZL15xzEDL" / @aemkei, 2 April 2021)
[11] Alex McLean, Manjunath B C - Vocal patterns in Konnakol. Recorded video interview, streamed live on June 14, 2022. https://algorithmicpattern.org/events/manjunath-b-c/
The extract containing this quote can be found at the time code: https://youtu.be/6kYwZ8S-qBQ?t=1762
[12] Tuck Wah Leong, Frank Vetere, and Steve Howard, "Randomness as a resource for design," in Proceedings of the 6th ACM conference on Designing Interactive Systems (DIS) (2006),132-139.
[13] Brian Eno and Peter Schmidt. Oblique Strategies: Over One Hundred Worthwhile Dilemmas (Limited edition, boxed set of cards) (London: printed by the authors, 1975); See also Kingsley Marshall and Rupert Loydell, "Control and surrender: Eno remixed – Collaboration and oblique strategies," in Brian Eno: Oblique music, ed. Sean Albiez and David Pattie (London: Bloomsbury Academic, 2016), 175-192.
[14] The Japanese aesthetic of Wabi-Sabi is another example of material imperfection that might be used as a counter to computational predictability. Vasiliki Tsaknaki and Ylva Fernaeus, "Expanding on Wabi-Sabi as a design resource in HCI," in Proceedings of the 2016 ACM CHI Conference on Human Factors in Computing Systems (2016), 5970-5983.
[15] Dennett, "Intentional Systems" and The Intentional Stance
[16] In another completely unsurprising footnote, I am unwilling to reveal which of the sentences in this book demonstrate either one, or the other, of those two approaches.
[17] This word was generated by the GPT-3 model text-davinci-002, temperature setting 0.7, in response to the prompt "Please give me a random word" on 12 September 2022. Accessed on that date via https://beta.openai.com/playground (site no longer supported).
[18] Walter G. Vincenti, What engineers know and how they know it. (Baltimore: Johns Hopkins University Press, 1990).
[19] Claudia Fritz, Alan F. Blackwell, Ian Cross, Jim Woodhouse, and Brian C.J. Moore, "Exploring violin sound quality: Investigating English timbre descriptors and correlating resynthesized acoustical modifications with perceptual properties." The Journal of the Acoustical Society of America 131, no. 1 (2012): 783-794.
[20] Robin was actually moving the position of the sound post. For a technical reflection on my own experience as a bass player, see Alan F. Blackwell, "Too Cool to Boogie: Craft, culture and critique in computing". In Sound Work: Composition as Critical Technical Practice, ed. Jonathan Impett. Leuven: Leuven University Press / Orpheus Institute, 2022, 15-33.
[21] Nick Collins, Alex McLean, Julian Rohrhuber, and Adrian Ward. "Live coding in laptop performance." Organised sound 8, no. 3 (2003): 321-330; Alan F. Blackwell and Nick Collins. "The programming language as a musical instrument". In Proceedings of the Psychology of Programming Interest Group (PPIG) (2005), 120-130; Alan F. Blackwell, Alex McLean, James Noble, and Julian Rohrhuber, edited in cooperation with Jochen Arne Otto. "Collaboration and learning through live coding (Dagstuhl Seminar 13382)." Dagstuhl Reports 3, no. 9 (2014), 130-168; Blackwell, Cocker et al Live Coding: A user’s manual
[22] Nick Collins and Alex McLean, "Algorave: Live performance of algorithmic electronic dance music," in Proceedings of the International Conference on New Interfaces for Musical Expression (NIME), (2014), 355-358.
[23] Aaron and Blackwell, “From Sonic Pi to Overtone”; Samuel Aaron, Alan F. Blackwell, and Pamela Burnard, "The development of Sonic Pi and its use in educational partnerships: Co-creating pedagogies for learning computer programming," Journal of Music, Technology and Education 9, no. 1 (2016): 75-94.
[24] Ingeborg Hoesterey, Pastiche: Cultural memory in art, film, literature. (Bloomington and Indianapolis: Indiana University Press, 2001).
[25] Attributed to Roger de Piles, apparently in a treatise originally dated 1677, although with a complex history of citation and attribution as discussed by Hoesterey Pastiche, pp. 4-6.
[26] Jackson Arn, “How Jony Ive Remade Visual Culture in Apple’s Image”. Artsy.net, Jul 11, 2019
https://www.artsy.net/article/artsy-editorial-jony-remade-visual-culture-apples-image
[27] Abigail Cain, “What Steve Jobs Learned from the Bauhaus”. Artsy.net, Oct 10, 2017
https://www.artsy.net/article/artsy-editorial-steve-jobs-learned-bauhaus
[28] Jamie Grierson, “Photographer admits prize-winning image was AI-generated”. The Guardian, Mon 17 April 2023
[29] Alexander Mordvintsev, Christopher Olah, and Mike Tyka. DeepDream - a code example for visualizing neural networks. (blog entry, 2015). https://ai.googleblog.com/2015/07/deepdream-code-example-for-visualizing.html
[30] Schön and Bennett, "Reflective conversation with materials"
[31] Mihaly Csikszentmihalyi, Flow: The psychology of happiness. (New York: Random House, 2013)
[32] Chris Nash and Alan F. Blackwell, "Flow of creative interaction with digital music notations," in The Oxford Handbook of Interactive Audio, ed. Karen Collins, Bill Kapralos, and Holly Tessler. (Oxford University Press, 2014), 387-404; Chris Nash and Alan F. Blackwell, "Liveness and Flow in Notation Use," in Proceedings of the International Conference on New Interfaces for Musical Expression (NIME) (2012), 76-81.
[33] Chris Nash and Alan F. Blackwell, "Tracking virtuosity and flow in computer music," in Proceedings of the International Computer Music Conference (ICMC) (2011), 575-582.
[34] Notation and visualisation enthusiasts are intrigued by the distinctive tracker representation, which represents time flowing down the screen and different instruments arranged in columns across it. The result can be thought of as an orchestra score turned on its side, with the time direction changed from horizontal to vertical, and the musical staves for each instrument changed from lines to columns.
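For readers who think in code, the tracker layout described in this note can be sketched as a simple grid in which rows are beats and columns are instruments. This is only an illustration under assumptions of my own: the instrument names, the note values and the helper functions notesAtBeat and renderRow are hypothetical, and are not taken from Chris Nash's software.

```javascript
// A minimal sketch of a tracker pattern. Time runs down the rows,
// instruments run across the columns, and null means "no new note".
// All names and values are hypothetical.
const instruments = ['bass', 'lead', 'drums'];
const pattern = [
  ['C2', 'E4', 'kick'],
  [null, 'G4', null],
  ['G2', null, 'snare'],
  [null, 'C5', null],
];

// Collect the notes that sound on a given beat, looping the pattern
// when the beat number runs past its last row.
function notesAtBeat(pattern, instruments, beat) {
  const row = pattern[beat % pattern.length];
  const events = [];
  row.forEach((note, column) => {
    if (note !== null) {
      events.push({ instrument: instruments[column], note });
    }
  });
  return events;
}

// Draw one row as fixed-width columns, with '---' marking an empty cell.
function renderRow(row) {
  return row.map((note) => (note ?? '---').padEnd(5)).join('|');
}

// Print the whole pattern, one numbered row per beat.
for (let beat = 0; beat < pattern.length; beat++) {
  console.log(String(beat).padStart(2, '0'), renderRow(pattern[beat]));
}
```

Reading across a row gives everything that happens on one beat, and reading down a column gives one instrument's part, which is the sense in which the format resembles both a spreadsheet and a score turned on its side.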
[35] Ilias Bergström and Alan F. Blackwell, "The practices of programming," in Proceedings of the 2016 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC) (2016), 190-198.
[36] Saeed Aghaee, Alan F. Blackwell, David Stillwell, and Michal Kosinski, "Personality and intrinsic motivational factors in end-user programming," in Proceedings of the 2015 IEEE Symposium on Visual Languages and Human Centric Computing (VL/HCC) (2015), 29-36; Alan F. Blackwell, "End-user developers - what are they like?" in New Perspectives in End-User Development, ed. Fabio Paternò and Volker Wulf (Berlin: Springer, 2017), 121-135.
[37] Rebecca Giblin and Cory Doctorow, Chokepoint Capitalism: How big tech and big content captured creative labor markets and how we'll win them back. (Boston, MA: Beacon Press, 2022)
[38] Rishab Ghosh, ed. CODE: Collaborative ownership and the digital economy. (Cambridge, MA: MIT Press, 2006).
[39] Kevin Holm-Hudson, "Quotation and context: sampling and John Oswald's Plunderphonics," Leonardo Music Journal 7 (1997): 17-25. https://doi.org/10.2307/1513241
[40] Hoesterey, Pastiche, p.9
[41] Mark Blythe, "Pastiche scenarios," Interactions 11, no. 5 (2004): 51-53.
[42] Blackwell, "The Age of ImageNet’s Discovery"; Couldry and Mejias, The costs of connection.
[43] Creative Commons Announced, 16 May 2002. https://creativecommons.org/2002/05/16/creativecommonsannounced-2/
[44] A newspaper report led with "German artist Boris Eldagsen says entry to Sony world photography awards was designed to provoke debate". Jamie Grierson (2023). Photographer admits prize-winning image was AI-generated. The Guardian, Mon 17 April 2023 https://www.theguardian.com/technology/2023/apr/17/photographer-admits-prize-winning-image-was-ai-generated
[45] Mark Brown, “New Rembrandt to be Unveiled in Amsterdam,” The Guardian, April 5, 2016.
https://www.theguardian.com/artanddesign/2016/apr/05/new-rembrandt-to-be-unveiled-in-amsterdam
[46] Rich Pelley, “‘We got bored waiting for Oasis to re-form’: AIsis, the band fronted by an AI Liam Gallagher”. The Guardian, April 18, 2023. https://www.theguardian.com/music/2023/apr/18/oasis-aisis-band-fronted-by-an-ai-liam-gallagher
[47] Giblin and Doctorow, Chokepoint Capitalism
[48] Seaver, Computing Taste
[49] Barry Barclay, Mana Tuturu: Maori Treasures and Intellectual Property Rights. (Auckland: Auckland University Press, 2005).