I often return to an observation that the philosopher Janne Vanhanen (of the University of Helsinki's Department of Philosophy, History and Art Research) once put forward when we were discussing music, especially its more experimental edges. We were both admirers of the late Finnish composer and sound artist Mika Vainio and of his pioneering electronic-industrial duo Pan Sonic (with Ilpo Väisänen), and we were pondering what made such austere, seemingly unmusical abstraction and noise as that of Vainio and Pan Sonic musically and sonically captivating. Vanhanen’s conclusion was this: their sonic explorations managed to sound interesting because in them one could observe the process of musically unpromising elements resolving into a coherent aesthetic experience; they were trying to find a solution to the very question of what music is – or what it could be.
Around the same time I came up with the idea that music that sounds like music is not music. An obviously playful take on all things musical, it was inspired by an album titled I AM AI (2018), which the American artist and ‘creative technologist’ Taryn Southern had just released to considerable fanfare in the media and tech communities: the idea and novelty behind the album was that the music – apart from the vocals, vocal melodies and lyrics – was created entirely by AI, more precisely by software called Amper Music in combination with other similar tools including IBM’s Watson Beat, AIVA, and Google Magenta (Southern, n.d.-a). The music on this “world’s first AI solo pop album”[6] (Southern, n.d.-b) sounded impeccably composed, performed and produced, as if made by humans – in fact the album hardly differed from most human-produced albums out there: AI had shown that it was able to produce human-like music like any real composer, musician and music producer.
Yet there seemed to be something uncanny about the music, the vocals partly aside. Unlike some other AI music generators at the time that tried to ‘invent’ music from rather limited datasets of musical samples by generating raw audio autoregressively[7] through an unconditional, unsupervised neural network[8] (Carr & Zukowski, 2018) – and thus ended up sounding clearly artificial (and consequently rather interesting) – Amper Music uses a sample library along with its datasets as its compositional material, a library consisting of “over one million individual samples and thousands of unique instruments” (Welcome AI, n.d.) collected from recordings of performances of real human musicians, recorded by hand and “sculpted with meticulous attention to detail” (Welcome AI, n.d.). “Through the fusion of music theory and AI innovation” (Welcome AI, n.d.), the software creates coherent-sounding music from this real musical source material[9]; its output, in other words, is music that sounds exactly like music.
Here is the irony. Whereas the music ‘imagined’ by those other early AI composers might have sounded clearly unnatural and even hilariously crude due to their less sophisticated neural networks and the still unrefined hyperparameters guiding them, the Amper AI music on the I AM AI album ended up sounding unnatural precisely because it so perfectly resembled ‘natural’ music: it emulated existing human-made music through an advanced new technology, but without the presence of humans – or of the advanced new technology, for that matter. The technology might have passed the Turing Test[10], its machine-generated output indistinguishable from that produced by humans, but it would have failed a hypothetical Original Musical Thinking Test[11], a yet-to-be-invented method for measuring an agent’s capacity to create its own kind of ‘advanced’ music, clearly distinguishable from that of other agents.[12] The music was neither advanced nor human but simply an automated digital reconstruction of a once novel human artefact (e.g. an electronically produced piece of music); the sound of a machine haunted by human nostalgia, a ghost from the past trapped in the machine. Attempting to “explore the future of humans and machines” (Southern, n.d.-a), I AM AI instead came to represent what Franco Berardi referred to as “the slow cancellation of the future” (Berardi, 2011, as cited in Fisher, 2014, p. 16): a sense that the social and cultural progress promised by post-Second World War modernity gradually diverged from the trajectories of technology and global finance, eventually regressing into nostalgia, consumer monoculture and outdated ideologies of the past (e.g. capitalism). Here we had an advanced AI technology (Amper Music) producing the kind of music that humans had already been perfectly capable of producing themselves for quite some time; the creative process had simply been normalised, automated and scaled for more productive and cost-effective music-industry manufacturing.
[6] The French pop artist SKYGGE (Benoit Carré) also claims that his album Hello World (2018) is “the first music album composed with the help of an AI technology”, using software called Flow Machines (see https://www.helloworldalbum.net).
[7] A method where the prediction of the next output in a sequence is based on the previously generated outputs.
[8] E.g. SampleRNN (a recurrent neural network). ‘Unconditional’ and ‘unsupervised’ mean that the network has been trained without any metadata such as music theory or MIDI; the model instead wanders through its state space and the raw audio dataset, conditioned only on the state of its previous timestep (see the sketch after these notes).
[9] Amper Music has since been acquired by Shutterstock, Inc. The original citations were taken from Amper’s own, now defunct, website in 2018.
[10] A test proposed (1950) by the English mathematician Alan M. Turing to determine whether or not a machine is capable of thinking like a human being.
[12] While several tests for measuring AI creativity and innovation do exist and are frequently used, e.g. the Torrance Tests of Creative Thinking (TTCT) and the Alternative Uses Task (AUT), these fail to objectively and expertly capture the complex historical-cultural-aesthetic considerations of ‘musical and sonic thinking’.
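To make the method described in notes 7 and 8 concrete, here is a minimal, purely illustrative sketch of unconditional, autoregressive generation – my own toy example, with a simple linear predictor standing in for the trained neural network, not SampleRNN’s actual architecture:

```python
# A toy sketch (not SampleRNN) of unconditional, autoregressive audio
# generation: each new sample is predicted only from previously generated
# samples. A real system would replace toy_predict_next() with a trained
# recurrent or transformer network.
import numpy as np

def toy_predict_next(history: np.ndarray) -> float:
    # Hypothetical stand-in for a neural network: a weighted sum of the
    # last three samples plus a little noise.
    weights = np.array([0.5, 0.3, 0.2])
    return float(history[-3:] @ weights + np.random.normal(scale=0.01))

def generate(n_samples: int, seed: list) -> np.ndarray:
    audio = list(seed)                               # start from a short seed
    for _ in range(n_samples):
        nxt = toy_predict_next(np.array(audio))
        audio.append(float(np.clip(nxt, -1.0, 1.0)))  # keep within audio range
    return np.array(audio)

waveform = generate(16000, seed=[0.0, 0.1, -0.05])    # roughly 1 second at 16 kHz
```

Because each step depends only on the model’s own previous outputs (and a little noise), such systems easily drift into the clearly artificial territory described above – which is precisely what made those early raw-audio experiments sound interesting.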
This particular ‘fontness’ of I AM AI, the music-that-sounds-like-music churned out by Amper Music, was not entirely the fault of the machine or of the artist herself. Amper Music, like the numerous similar AI music generators that have been evolving and proliferating since – AIVA, Amadeus Code, Beatoven, Boomy, Ecrett Music, Hydra II, Landr, Loudly, Moises, Mubert, Mureka, MuseNet, MusicLM, Soundful, Soundraw, Suno, Udio (to name the most popular ones)[13] – is a generative AI application, an advanced machine (deep) learning model that creates new content based on patterns and structures learned from existing data. The models behind these AI music generators are trained on vast datasets of existing music – massive collections of songs and their associated metadata – to learn patterns and relationships within the data.[14] In other words, the concept and idea these AI music generators have of music is based on whichever existing pieces of music their respective training datasets have contained; these generators did not move to Berlin (Bowie), walk among lava fields (Björk), rebel against military dictatorship (Fela Kuti) or dive into New York City’s street life (Steve Reich) in order to develop their ideas about music – they remained static in their server farms while being made to harness the creative achievements and lived lives of others (humans). Add to this the text-to-music interface (prompts) used to instruct these AI models, as well as the limited, often culturally conditioned (e.g. genre-specific) language available for describing music, and it is easy to see why this AI-generated music ends up sounding just like music (Fisher’s “particular typographical font”) and not really music (e.g. Berlin, lava fields, rebellion, street life).
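To illustrate this dependence on training data in the simplest possible terms – a deliberately naive sketch of my own, not how Amper, Suno or any of the generators named above actually work – consider a first-order Markov model ‘trained’ on two short melodies; whatever it generates can only ever recombine transitions already present in that data:

```python
# A deliberately simplified illustration: a first-order Markov model learns
# which note follows which in a tiny 'dataset' of melodies, then generates
# new melodies by recombining only those learned transitions.
import random
from collections import defaultdict

training_melodies = [
    ["C", "E", "G", "E", "C"],   # stand-ins for existing songs
    ["C", "D", "E", "D", "C"],
]

transitions = defaultdict(list)
for melody in training_melodies:
    for current, following in zip(melody, melody[1:]):
        transitions[current].append(following)

def generate(start, length):
    melody = [start]
    for _ in range(length - 1):
        options = transitions.get(melody[-1])
        if not options:              # dead end: nothing learned from here
            break
        melody.append(random.choice(options))
    return melody

print(generate("C", 8))   # never contains a note absent from the training data
```

A note or transition that never occurs in the training melodies can never appear in the output; the model’s entire ‘concept’ of music is its dataset – Berlin, lava fields, rebellion and street life remain out of reach.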
As writer Adam Clair points out in his article What AI in music can – and can’t – do, “any music AI generator is only as good as the data it’s trained on” (2024). This also means that any biases and shortcomings in the training data will be reproduced by the model in its output. If the data has excluded, for example, musical traditions predating recording technology or music of non-Western origins, it is highly unlikely that the model will be able to generate music in these veins, even if prompted to do so; similarly, any notion of the ‘futuristic’ in music is limited to those past musical examples that have been labelled ‘futuristic’ – the model cannot conceptualise possible future musics or think outside the box it operates in. As currently designed, today’s AI models “are more likely to produce stereotypical sounds within a genre or style than they are to produce anything peculiar, let alone innovative or interesting. Generative AI systems have a bias toward mediocrity, but transcendent [as in ‘pre-eminent’, ‘supreme’, ‘unparalleled’, ‘unique’, ‘extraordinary’, ‘superior’ or ‘sublime’] music is found on the margins” (Clair, 2024).
This aesthetic limitation of ‘autonomously creative music systems’ is also noted by composer Artemi-Maria Gioti: referencing earlier research on AI by Xu, Wang and Bhattacharya, which found that “design research on artificially intelligent systems has focused primarily on goal-oriented problem-solving, ignoring the problem creation phase that should precede problem-solving” (Xu, Wang, & Bhattacharya, 2010, as cited in Gioti, 2021, p. 38), Gioti notes a similar approach in later research on automatic composition systems, where “composition is considered as problem-solving – the ‘problem’ being one of style imitation – rather than problem creation” (Gioti, 2021, p. 38). This emphasis on imitational rather than transformational possibilities results in the tendency of such systems “to produce outputs with limited aesthetic value and virtually no innovation potential” (2021, p. 38) – a view also shared by artist and scholar Sofian Audry, who argues that the machine learning/engineering approach to creativity has often framed art-making as a problem to be solved, idealising optimisation and reproducibility over open-endedness and diversity; by reducing creativity to a form of computation, it has also failed to take into account the wider cultural and environmental context in which art operates (Audry, 2021). Gioti calls for a more ‘ecosystemic’ (McCormack, 2012, as cited in Gioti, 2021, p. 38) application of AI in music, away from the mere automation and simulation of human creativity and towards co-creative human-machine assemblages, in which AI is used to augment and diversify (rather than replace) human output and to interact with its user and environment more thoughtfully. In such human-computer co-exploration, AI acts as a complementary rather than competitive agent, a tool for experimentation and discovery instead of optimisation (Gioti, 2021). This approach is similar to the ecological enmeshing of AI proposed by Bridle (the oracle of the machine being the outside world), to the creative adaptation and ‘hijacking’ of machine learning processes by Audry (see Audry, 2021), and to my own future application, in which AI will act as an agential system in a greater environmental field.
[13] See the following links, for example: https://workmind.ai/blog/best-ai-music-generators/; https://www.digitalocean.com/resources/articles/ai-music-generators; https://www.overtune.com/blog/5-best-ai-music-generators-2024.
However, the aforementioned margins of the AI music generators may also begin to expand and move towards the centre, the more the deep learning neural networks running the generative AI models evolve and the more advanced the AI music applications become. “Transcendent music” could emerge more frequently – it might even become possible to train the models not just on musical data but on all kinds of data, so that when a user one day prompts the system to move to Berlin, walk among lava fields, rebel against military dictatorship, or draw from New York City street life, it will generate the music accordingly. ‘Platform Musicking’ – a term coined by Jennifer Walshe to describe the ever-increasing entanglement of AI, streaming platforms and user data – could become a dominant form of future music, in which the roles of listener and composer blur and music, instead of being an independent artistic entity, simply emerges through the user’s engagement with the platform: “music on demand, music as an artform which requires only the user’s data or imagination or funny idea, music as an endless, highly personalised flow, fleeting and non-linear” (Walshe, 2024). One already popular domain of this Platform Music, which could acquire even more prominence in the future, is functional music – music designed to soundtrack and optimise, for instance, concentration, motivation and relaxation – provided by platforms such as Endel and Brain.fm. Through the advancing use of AI, neuroscience and psychology, future music could become highly individualised and optimised, based on the most granular personal information of the user. “On these platforms,” Walshe writes, “the user enters willingly and enthusiastically into a deeply entangled state with the platform, in return for optimisation. Sound exists beyond the artist, album or playlist; sound is instead precisely designed, ‘scientifically proven to increase focus’, audited first and foremost for mood management, for assistance, for comfort” (2024). The listener becomes the composer, the personal data the musical material, and the AI platform the copyright owner (what becomes of composers and musicians will form part of my future research).
AI music might also finally be able to exit its Goldilocks zone – well-trained and comfortable, music-that-sounds-like-music, at just the right distance between the past and the future – and voyage further and deeper into the musical space: to conceive music that is only possible with the help of AI, music that could not have existed prior to the generative AI enabled by the current transformer (deep learning) architecture.
At the moment, however, the underlying principle and function of these AI music generators and platforms seem geared toward the “slow cancellation of the future”, their design, musical aesthetic and business model serving the technocapitalist pursuit of profit more than any advancement (or liberation) of culture. By enabling us to produce more of the music-sounding music of the past – faster, easier, cheaper – they continue to perpetuate what cultural theorist Mark Fisher described as “a crushing sense of finitude and exhaustion” oppressing our 21st-century music culture (2014, p. 18); compared to the 20th-century experimental culture “seized by a recombinatorial delirium, which made it feel as if newness was infinitely available”, to Fisher the 21st century “doesn’t feel like the future” (2014, p. 18). The increasing reliance on ‘nostalgia mode’ (a yearning for a form and technique rather than for any historical period)[15] and on retro styles by contemporary popular musicians made Fisher feel as if the 21st century was yet to start – and he had not even encountered the dawn of generative AI, which would go on to magnify this “crushing sense of finitude and exhaustion” exponentially through ever-accelerating computational power and resource consumption.
It was in the evolution of popular music culture that Fisher most clearly observed the gradual erosion of the future: the exhilarating ‘future shocks’ experienced between 1950 and 2000, followed by a kind of reversion to anachronism from 2000 to the present (see also Reynolds, 2012). One could argue that this trajectory mirrors the evolution of music technology, where the biggest leaps – multitrack recording, sound synthesis, digitalisation – occurred between 1950 and 2000, while 2001 saw the introduction of Ableton Live, a popular digital audio workstation (DAW) that went on to democratise (electronic) music production by making its techniques and processes more accessible, streamlined and automated for everyone. Economic and social theorist Jacques Attali had already noted in the 1970s the degrading and stifling effect that recording technology, through its inexhaustible ability to replicate and repeat, could have on musical innovation and vitality (Attali, 2009). For music and technology have always progressed in an intertwined fashion, and while I would argue that many of the earlier technological innovations, in their indeterminate and unrefined novelty, still offered new ways to problematise the concept of music, Ableton Live, in a sense, solved the problem of music by modelling such indeterminacies into a highly refined product: it became an idealised representation, an abstraction, of the novelty-producing processes observed and experiments conducted earlier – a virtual modelling of Berlin/lava fields/rebellion/street life, if you will, similar to the generative AI music generators now. Due to its optimised design, it was now possible for almost anyone to produce music that sounded like music; in conjunction with the increasing profitability demands of a music industry that began to prioritise familiarity and predictability, nostalgia and old music (in contrast to the vigorous research and development ventures of earlier decades) (Gioia, 2022), these developments unavoidably led to the kind of plateauing of musical thinking, and saturation of the musical landscape, that Fisher referred to.
While the point here is not to criticise Ableton Live or the similar DAWs that came to evolve alongside it – nor to criticise the democratisation of music, which has allowed a greater diversity of voices and new stylistic innovations to flourish on the margins of both pop and other forms of music, a progress that has, however, become more difficult to perceive due to our ever-fragmenting and ever-saturating cultural landscape – the question this leads us to, especially in the age of generative AI, is: what do we expect of technology in terms of creativity and musical thinking? A more intelligently idealised representation and generation of the music-that-is-solved-and-sounds-like-music, or perhaps a more intelligent tool with which to further problematise the concept and creation of music? As Audry has demonstrated, while the former has largely been the approach taken by scientific and engineering (and now corporate) interests – treating the artistic process as an optimisation problem that, when solved, would better “reproduce what already exists (i.e. the expected)” (Audry, 2021, p. 26) – the latter has been the interest of artists (and the art world), especially those working with new technologies, who often approach their practice as one of further problematisation, challenging existing preferences and seeking to create the unexpected (Audry, 2021).
This observation by Brian Eno from 1995 – that “by the time a whole technology exists for something it probably isn’t the most interesting thing to be doing” (Eno, 1996, p. 250) – is capable of arousing nostalgia in someone who started making electronic music, often through experimentation and failure, in the 1990s: ah, to be faced with the technological limitations of the 90s still! If the music publishing landscape felt saturated in 1995, how might one even begin to describe the scenery now? A whole technology exists for generating millions of songs and tracks in any style in a matter of seconds – and simply by typing a few keywords or logging into one’s personal Functional Music account. The problem of music has been solved: the result is an endless ocean of music-sounding music.
While such a hyper-commodified, automated and 'future-cancelling' repetition of the culture – “the banalization of the message”, as Attali referred to it (2009, p. 109) – is unavoidable, and perhaps even preferred by the majority of consumers in our society driven by the (techno)capitalist pursuit of profit and optimised escapism, we can nevertheless foster and build cultures and designs that are more dynamic, heterogeneous and future-affirming/opening; that enable those lines of flight to the outside and the cosmic and unleash the minoritarian and molecular becomings which, according to Deleuze and Guattari, are the aim of the arts (2004). If we follow Attali’s argument that music, as a cultural form, is prophetic, anticipating and prefiguring through its forms and practices social change while making “audible the new world that will gradually become visible” (Attali, 2009, p. 11), then what kind of future society might we be able to envision through our AI-assisted musical culture? More of the late capitalist repetition through music-that-is-solved-and-sounds-like-music? Or perhaps something more potential, experimental and transformative[16], aided by music-that-is-unsolved-and-sounds-...-actually-more-like-noise? For Attali, the Repetition of post-industrial capitalism would be succeeded by the aleatoric and unfinished art of post-capitalist ‘utopianism’, where the production and consumption of music merge into one ever-evolving entity (Composition) and become more individuated and communal, empowering and transformative, emphasising process over product (Attali, 2009); ‘utopianism’, for Attali was aware that for Composition to truly work, society would need to “stop confusing well-being with the production of demand” (2009, p. 146) and “conceive of other systems of economic organization” (p. 145) to support such experimental, participatory and open-ended production (otherwise creators would have no means to make a living). It is interesting to note how similar Attali’s ideal of Composition, envisioned in the 1970s, sounds to the current generative AI mode of production and consumption (e.g. Platform Musicking) – but without the upgraded economic system that would enable the participants to earn a living from it.
The answer, for me at least, lies somewhere in-between: in the minoritarian, molecular and rhizomatic topologies between repetition and aleatoricism, artificial and environmental, digital and analogue, music and noise. For repetition can be a form of change while aleatoricism can produce randomness that is just monotonous; the oracle will remain outside in the world, enabling intelligence and creativity to emerge from the interactions of its various agents – cities, plants, lava fields, humans, rebellions, penguins, algorithms, machines, hyperobjects; and when one listens to the sonic explorations of Mika Vainio and Pan Sonic, what one hears in their becoming-music of noise and becoming-noise of music is a new kind of haecceity[17], a “dawn of the world” (see Deleuze & Guattari, 2004, p. 309), that emerges only in the de/reterritorialising line between noise and music, a zone where sound becomes simultaneously resolved and problematised. What remains essential in our AI-assisted musical future, in particular, is the centrality of the real-experience-in-the-world for creative practice – for music retains its vitality when it is driven not only by notes and codes, theories and idioms, but by intensities of the real world, processes behind products, which the musicians and other intelligent agents then develop into a musical language. Be it Berlin, lava fields, rebellion, street life, or two taciturn Finns (Pan Sonic) with oscillators and noise generators, trying to imitate Jamaican dub and American rock’n’roll but running into problems.
[16] For one such societal design, see the concept of ‘deep freedom’ by the Brazilian philosopher and politician Roberto Mangabeira Unger.
Vibrant Matter I (2021).
A proposal for a generative sound and light installation for non-ideological and non-commercial spaces of the future.
Artwork by Ilpo Jauhiainen
“Consider the fate of the concept of ‘futuristic’ music. The ‘futuristic’ in music has long since ceased to refer to any future that we expect to be different; it has become an established style, much like a particular typographical font.” (Fisher, 2014, p. 19)
“It’s so easy to make ‘sonic landscapes’ now and there are just millions of people at it. A whole technology exists for it, leading to the thought that by the time a whole technology exists for something it probably isn’t the most interesting thing to be doing.” (Eno, 1996, p. 250)
There is also another, more entertaining thought: that perhaps we should simply let AI automate and simulate all our human culture to date – to act as an interactive and living archive, an intelligent museum of human civilisation up to this point – so that humanity could free itself of the past and move on to devise and create new cultures that better respond to present and future challenges, local and global.
Radiant City IV (2022).
A proposal for an intelligent music system / a generative urban composition.
Image by Ilpo Jauhiainen. The artwork uses an original drawing by Mamadou Cissé (b. 1960 Senegal); the drawing has been digitally processed through several iterations of superimposition and colour grading.
Below are my own experiments with AI music generators.
I had intended to provide more examples from a greater selection of generators, examining differences in their functionalities and outputs; however, within 30 minutes of experimenting I had become utterly bored with the interfaces and outputs of the first few of them, and had to quit in order to retain my interest in music and music-making. For when it comes to creativity and composition, there is nothing more dispiriting and disheartening than reducing those complex (and exhilarating!) processes to prompts, whether text-based (typing) or MIDI/audio files (uploading) – and especially when the results continue to be just more music that sounds like more music. As I argue in this essay, it is the centrality of the real-experience-in-the-world that makes creative practice substantially more transformative and meaningful; it is about the process, of which the product is simply a side effect.
Beatoven
I instructed the application to generate '1 minute of futuristic arabic punk reggae' (as I had often wondered how that might sound). My role in the creative process, after typing the above prompt, was to sit and wait for two minutes. This is the result:
AIVA
I used two of my published, randomly chosen pieces as audio prompts, and asked the application to transform them into different styles of music from a ready-made menu of diverse musical genres and forms that AIVA provides. Since the original tracks represent electronic music, I was more interested in 'orchestrating' them in styles very different from electronic or popular music genres; hence the two orchestral versions, symphonic and jazz, which again were randomly selected from the several options available.
I was positively surprised by the results. The synthetic sound quality notwithstanding, they are aesthetically interesting and affective, and the slightly artificial/mechanical feel only adds to the novelty and 'futurity' of the music – similar to that of Kraftwerk, Steve Reich and Herbie Hancock, for example. These could easily be presented as film scores or soundtracks for other media, released as recordings (with additional production), or used as demos for actual orchestrations with real musicians.
Yet I felt no need to continue with the experimentation: refining the results with tools and suggestions offered by AIVA, or transforming more of my existing pieces into further styles and orchestrations. For I found it problematic (and tiresome) again that my participation as a creator was reduced to that of an administrator: uploading files, ticking boxes, sitting back and waiting. There was no intellectual, creative, emotional or social engagement for me in the process.
Had I experienced this process of transforming the pieces myself – notating and arranging, travelling to Berlin, collaborating with musicians and engineers, taking a stroll among lava fields, resisting outdated norms and forms, celebrating the results with the whole team in NYC's nightlife – I might have learned something about music and the world, established new connections, evolved as a human being and artist. Now I had simply been looking at my screen, waiting.
Future Forest Space (2017).
A site-specific generative sound installation and music composition for the Radio Forest pavilion in Klankenbos, Neerpelt BE.
This environmental artwork became a precursor of my current research project outlined in this essay. In the work, a composition made of the sounds of the local forest is generated by an algorithmic system with a stochastic and emergent behaviour; however, this seeming complexity is achieved through simpler procedural and rule-determined transformation processes (symbolic AI) as well as by inviting the actual environment and architecture of the site into the composition.
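As a purely hypothetical sketch of the kind of rule-determined yet stochastic process described above (not the installation’s actual code), one might imagine recorded forest sounds being chained and transformed by a handful of symbolic rules, with the environment – reduced here to the time of day – biasing the outcome:

```python
# A hypothetical sketch of a rule-determined, stochastic composition process:
# recorded forest sounds are selected and transformed by simple symbolic
# rules, and the apparent complexity emerges from chance and from the
# changing environment (here just the hour of the day).
import random

forest_sounds = ["wind", "birdsong", "rain", "branches", "silence"]

rules = {
    "wind":     {"next": ["branches", "silence"], "transform": ["stretch", "filter"]},
    "birdsong": {"next": ["silence", "wind"],     "transform": ["pitch_shift", "echo"]},
    "rain":     {"next": ["wind", "rain"],        "transform": ["loop", "filter"]},
    "branches": {"next": ["birdsong", "rain"],    "transform": ["reverse", "echo"]},
    "silence":  {"next": forest_sounds,           "transform": ["none"]},
}

def compose(start, events, hour_of_day):
    score, current = [], start
    for _ in range(events):
        rule = rules[current]
        transform = random.choice(rule["transform"])
        # Let the environment bias the process: quieter before dawn.
        if hour_of_day < 6 and "silence" in rule["next"]:
            nxt = "silence"
        else:
            nxt = random.choice(rule["next"])
        score.append(f"{current} ({transform})")
        current = nxt
    return score

print(compose("wind", events=6, hour_of_day=14))
```

The complexity of the result comes less from the rules themselves than from chance and from whatever the site feeds into the system.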
In my current and future research, I intend to develop this ecosystemic and ecological approach to composition into a more intelligent, elaborate and versatile (rhizomatic) composer-environment-machine assemblage.
For more information, please click on the image (opens up a new window to the presentation of the artwork on my official site).
Artwork by Ilpo Jauhiainen
Lightcubism (2022).
A proposal for an intelligent music system / an environmental composition / a generative sound and light installation for non-ideological and non-commercial spaces of the future.
Artwork by Ilpo Jauhiainen
[above and below: click on the image to play and pause the slideshow]
Territories (2021).
A proposal for an intelligent music system / an environmental composition / a generative sound and light installation for non-ideological and non-commercial spaces of the future.
Artwork by Ilpo Jauhiainen
Floating Cities (2022).
A proposal for an intelligent music system / an environmental composition / a generative sound and light installation.
Artwork by Ilpo Jauhiainen
The original proposal for my Terrestrial (a working title) AI music application, as presented in Spring 2024.
The diagrams outline the overall concept and operation behind the proposed system. These will be refined further and expanded upon in my dissertation; the practical application will also be developed over the course of my research project and presented as part of the dissertation.
Images by Ilpo Jauhiainen