Chapter 3 Theory
This chapter introduces electroacoustic music and presents the theory used in this master’s project. The theory covers: macro, meso, sound object and micro form; spectromorphology and notation of electroacoustic music; organizing form; algorithmic, aleatoric, generative and process composition; moment form; and the Fibonacci sequence and Golden Section proportions.
3.1 What is electroacoustic music?
Electroacoustic music gives the composer of today the possibility to work with music in ways far beyond what the old masters could ever have dreamed of, both in the sense of musical material and the organization of music, and in the workflow and technical wonders of today. “Electroacoustic music opens access to all sounds, a bewildering
sonic array ranging from the real to the surreal and beyond” (Smalley, 1997, p. 107).
So, what is electroacoustic music? Any sound that can be recorded could be used in an electroacoustic composition, for example nature sounds, acoustic instruments, voice, electronic instruments, oscillators1 and much more. If it makes sound, it can be recorded and used in an electroacoustic composition. Within this genre the compositional work is done in the recording studio or, as today, simply with a computer using a DAW2 or other programs such as Max/MSP or SuperCollider.
Electroacoustic music, or acousmatic music as it is also called, is music composed to be performed through loudspeakers: “music which is partly or wholly acousmatic, that is, music where (in live performance) the sources and causes of the sounds are invisible – a music for loudspeakers alone, or music which mixes live performance with an acousmatic, loudspeaker element” (Smalley, 1997, p. 109). That the sources and causes of the sounds are invisible, as Smalley so neatly puts it, is the primordial difference between electroacoustic music and other genres of recorded music. Every one of us listens to recorded music through loudspeakers, either by choice or unwillingly. When listening to the intro guitar of the recording of Led Zeppelin’s Stairway to Heaven, the source of the sound is invisible, but the cause of the sound is not abstract or unclear; it is very clear that a guitar is performing the music. Here the cause of the sound is not abstract, and in a live performance we would see the performer, while at a concert of electroacoustic music we would only see loudspeakers. Music in the genre of electroacoustic music is written for loudspeakers and not for performers.
In electroacoustic music, many composers work outside the traditional framework of western music. “Gone are the familiar articulations of instruments and vocal utterance; gone is the stability of note and interval; gone too is the reference of beat and meter” (Smalley, 1997, p. 107). But in electroacoustic music you can still use the harmonic and rhythmic framework of western music, or of any other type of music for that matter, in a composition, or switch between different realms. What made all the new sound material possible for composers were the inventions in sound and electronic technology during the 20th century. Manning (1985) explains that innovations in the new field of electronics made the devices for generating synthetic sound less costly and more compact. The direct current arc oscillator was invented in 1900, and in 1906, the same year as the Dynamophone was first demonstrated, Lee De Forest patented the vacuum-tube triode amplifier valve. Manning (1985) continues that progress was slow but steady; by the end of the first world war the industry was well established, and several engineers were able to investigate the new electronic technology for possible use in electronic musical instruments.
1 An electronic oscillator is a circuit that generates a periodic, oscillating or alternating current (AC) signal, typically a sine wave, square wave, or triangle wave, powered by a direct current (DC) source. Oscillators are integral to many electronic devices, including radio receivers, televisions, radio and TV broadcast transmitters, computers, computer peripherals, cell phones, radar systems, and numerous other applications.
Oscillators are often categorized by the frequency of their output signal:
• A low-frequency oscillator (LFO) generates frequencies below approximately 20 Hz; the term is used in audio synthesizers to distinguish it from audio-frequency oscillators.
• An audio oscillator produces frequencies within the audio range, from 20 Hz to 20 kHz.
• A radio frequency (RF) oscillator generates signals above the audio range, typically between 100 kHz and 100 GHz. (Electronic oscillator, 2024, 4 December).
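The waveform names in the footnote can be illustrated with a short sketch. This is my own minimal illustration, not drawn from the cited sources; the 44.1 kHz sample rate and unit amplitude are assumptions chosen for the example.

```python
import math

def waveform_sample(shape: str, phase: float) -> float:
    """One sample of a unit-amplitude waveform; phase is in [0, 1)."""
    if shape == "sine":
        return math.sin(2.0 * math.pi * phase)
    if shape == "square":
        return 1.0 if phase < 0.5 else -1.0
    if shape == "triangle":
        return 1.0 - 4.0 * abs(phase - 0.5)  # -1 at phase 0, peak of 1 at phase 0.5
    raise ValueError(shape)

def oscillate(shape: str, freq: float, duration: float, sr: int = 44100):
    """Render `duration` seconds of the waveform at `freq` Hz."""
    n = int(duration * sr)
    return [waveform_sample(shape, (i * freq / sr) % 1.0) for i in range(n)]

one_period = oscillate("sine", 440.0, 1.0 / 440.0)  # one cycle of A4
```

An audio oscillator in the footnote’s sense would run such a function between 20 Hz and 20 kHz; an LFO would run it below roughly 20 Hz.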
2 A DAW, short for Digital Audio Workstation, is a software application that allows you to record, edit, and produce music on your computer. It encompasses every step of the music creation process, from recording audio and crafting beats or melodies with virtual instruments to adding effects and fine-tuning your final mix. Essentially, DAWs are all-in-one tools designed to support every aspect of your musical journey. (Steinberg, n.d.).
3.2 Macro, meso, sound object and microform
The composer Curtis Roads develops an analytical approach to electroacoustic music, arguing that “musical meaning is embedded in layers and encoded in many simultaneous musical parameters or dimensions” (Roads, 2015, p. 285). To lay the ground for such an analysis of the different layers that simultaneously contribute to our perception of electroacoustic music as meaningful, Roads (2001) explains the different musical structures as follows.
Macro: The time scale of overall musical architecture of form, measured in minutes or hours,
or in extreme cases, days.
Meso: Divisions of form. Groupings of sound objects into hierarchies of phrase structures of
various sizes, measured in minutes or seconds.
Sound object: A basic unit of musical structure, generalizing the traditional concept of the note to include complex and mutating sound events on a time scale ranging from a fraction of a second to several seconds.
Micro: Sound particles on a time scale that extends down to the threshold of auditory
perception (measured in thousandths of a second or milliseconds).
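Roads’ four time scales can be read as rough duration bands. The sketch below, my own and not from Roads, classifies a duration in seconds; the boundary values are approximations drawn from the descriptions above, not fixed limits.

```python
def time_scale(duration_s: float) -> str:
    """Roughly classify a duration into Roads' (2001) time scales."""
    if duration_s < 0.1:      # below ~100 ms: sound particles
        return "micro"
    if duration_s < 8.0:      # roughly note-length events
        return "sound object"
    if duration_s < 60.0:     # phrase structures, measured in seconds
        return "meso"
    return "macro"            # overall form, minutes or more

assert time_scale(0.005) == "micro"
assert time_scale(2.0) == "sound object"
assert time_scale(300.0) == "macro"
```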
3.2.1 Macroform
Macroform, or Macro time scale, as Roads (2001) calls it, is the large-scale form in a music
composition. On this level, we experience the architecture of the composition, and how the
different sections in a composition are joined together. This time scale is measured most often
in minutes. For instance, if someone asks how long a song is, the answer we give would be in the macro time scale.
Macroform is the top hierarchy when it comes to form. “Just as musical time can be viewed in terms of a hierarchy of time scales, so it is possible to imagine musical structure as a tree in the mathematical sense” (Roads, 2001, p. 12). The trunk of the tree would be the entire work, and the roots, going down and branching into smaller and smaller sections, portray the formal hierarchy.
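Roads’ tree metaphor can be sketched as a nested data structure, where the root is the entire work and the leaves are the smallest sections. The section names and durations below are hypothetical, chosen only to make the hierarchy concrete.

```python
# Inner nodes hold a list of children; leaves hold a duration in seconds.
form = ("work", [
    ("section A", [("phrase a1", 20.0), ("phrase a2", 25.0)]),
    ("section B", [("phrase b1", 30.0), ("phrase b2", 15.0)]),
])

def total_duration(node) -> float:
    """Macro-level duration is the sum of all lower-level durations."""
    name, content = node
    if isinstance(content, list):
        return sum(total_duration(child) for child in content)
    return content  # leaf: the phrase's own duration

assert total_duration(form) == 90.0  # the whole work, in the macro time scale
```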
3.2.2 Mesoform
Mesoform, or meso time scale, is the level in the form hierarchy where we would place the “theme”, and this time scale is measured in seconds. Roads (2001) points out that the mesoform is local, as opposed to the macroform, which is global. “The mesostructural level groups sound objects into a quasi-hierarchy of phrase structures of durations measured in seconds” (Roads, 2001, p. 14).
Further, Roads (2001) explains that on this local level the composition of a piece is
comprehensible to us listeners in real time while the macroform is something we perceive in
retrospect. In the mesoform we have the different themes, variations, developments, harmony,
rhythm and melodic ideas. “In electronic music, the meso layer presents timbre3
melodies, simultaneities (chord analogies), spatial4 interplay, and all manner of textural
evolutions” (Roads, 2001, p. 14).
3 Timbre is our perception of the "color" of sound, allowing us to distinguish between different instruments, even
when they play the same note. According to the American Standards Association, timbre is defined as "that
attribute of sensation by which a listener can perceive that two sounds of the same loudness and pitch are
different." It is primarily influenced by the spectrum of the sound, but also by factors such as the waveform,
sound pressure, frequency distribution, and the temporal characteristics of the sound. (Ballou, 2008).
4 The term spatialization is particularly associated with electroacoustic music and refers to the projection and
localization of sound sources in physical or virtual space, as well as the movement of sound within that space.
(Spatial music, 2024, 4 December).
3.2.3 Sound object
The notion of the sound object derives from Pierre Schaeffer and is closely related to Schaeffer’s experimentation with reel-to-reel tape recorders, which not only allowed for editing sound into smaller units but also invited repeated listening. Through these
technological advancements emerged not only the pioneering compositional movement of
musique concrète, but also Schaeffer’s exploration of sound through his long-term research
into acousmatic listening (Schaeffer, 2017). According to Roads (2001), the sound object time scale is any sound, drawn from any source, and usually lasts from 100 ms to several seconds.
“The sound object time scale encompasses events of a duration associated with the elementary
unit of composition in scores: the note” (Roads, 2001, p. 16). In a score the note is performed by an instrument or vocalist; in electroacoustic music any sound could be the source, and therefore sound object is a better term than note in this genre of music. “Any sound within stipulated temporal limits is a sound object” (Roads, 2001, p. 17).
3.2.4 Microform
Micro time scale, also called microform, refers to sounds that are very short in duration.
Roads defines this layer as embracing “transient audio phenomena, a broad class of sounds
that extends from the threshold of timbre perception (several hundred microseconds) up to the
duration of short sound object (~100ms)” (Roads, 2001, p. 20-21). We are exposed to
microsounds all around us in the natural world daily.
We experience the interactions of microsounds in the sound of a spray of water
droplets on a rocky shore, the gurgling of a brook, the pitter-patter of rain, the
crunching of gravel being walked upon, the snapping of burning embers, the
humming of a swarm of bees, the hissing of rice grains poured into a bowl, and
the crackling of ice melting. (Roads, 2001, p. 21)
3.2.5 Example of how to look at classical form in multistructural thinking
In its most basic form, the classical sonata consists of three parts in its macroform: exposition, development (Schoenberg calls this part elaboration), and recapitulation. Looking further into the exposition, marked as A in figure 3.2, we can see that Schoenberg (1967) has marked up three layers of mesoform, each dividing into smaller and smaller sections. The top layer consists of two parts, A (Tonic region) and B (Related region). Going deeper into the mesoform, we can see that the next layer elaborates in more detail what the A and B regions consist of. The next layer takes this even further, showing the basic idea and phrases. Schoenberg (1967) does not go deeper than the mesoform when explaining the structural relations of a sonata, but motives and characteristic intervals would be considered part of the sound object layer.
Microform is not applicable to an acoustic piece of music and is therefore not a possible layer of formal analysis there. As Roads (2001) explains, microevents touch the extreme time limits of what a human can perceive and perform. To examine and manipulate these events with precision, we need digital audio software and hardware that can act as a microscope, magnifying down to the micro time scale so we can operate on it.
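As a minimal illustration of operating on the micro time scale, the sketch below scatters short enveloped sine grains into a buffer, in the spirit of the granular techniques Roads describes. It is my own sketch, not software used in this project; the 30 ms grain length and 8 kHz sample rate are assumptions chosen to keep the example small.

```python
import math, random

SR = 8000          # assumed sample rate; real tools work at full audio rates
GRAIN_S = 0.030    # 30 ms: a duration inside Roads' micro time scale

def grain(freq: float) -> list[float]:
    """One sine grain shaped by a Hann window to smooth attack and decay."""
    n = int(GRAIN_S * SR)
    return [math.sin(2 * math.pi * freq * i / SR)
            * 0.5 * (1 - math.cos(2 * math.pi * i / (n - 1)))
            for i in range(n)]

def cloud(duration_s: float, n_grains: int, seed: int = 1) -> list[float]:
    """Mix randomly pitched, randomly placed grains into one buffer."""
    rng = random.Random(seed)
    out = [0.0] * int(duration_s * SR)
    for _ in range(n_grains):
        g = grain(rng.uniform(200.0, 2000.0))
        start = rng.randrange(len(out) - len(g))
        for i, s in enumerate(g):        # overlapping grains simply sum;
            out[start + i] += s          # a real tool would normalize
    return out

texture = cloud(1.0, 40)  # one second of grain cloud
```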
3.3 Spectromorphology and notation of electroacoustic music
With electroacoustic music, new ways of notation and analysis are needed to be able to discuss this genre of music intellectually. “The art of music is no longer limited to the sounding models of instruments and voices. Electroacoustic music opens access to all sounds, a bewildering sonic array ranging from the real to the surreal and beyond” (Smalley, 1997, p. 107). This thesis and my analysis are based mostly on spectromorphological thinking, but I have also looked at how others have worked with analysis and notation of electroacoustic music.
3.3.1 Spectromorphology
Spectromorphology was developed by Denis Smalley but has its base in the teachings of Pierre Schaeffer, as acknowledged in Smalley’s observation of how “The development of spectromorphological thinking owes most to Pierre Schaeffer’s Treatise on Musical Objects”5 (Smalley, 1997, p. 107). Spectromorphology is a method of analysis for describing the aural
experience of music. “The two parts of the term refer to the interaction between sound spectra
(spectro-) and the ways they change and are shaped through time (-morphology)” (Smalley,
1997, p. 107). This method creates a way to understand structural relationships in
electroacoustic music. Smalley (1997) writes that a spectromorphological approach sets out
spectral and morphological models and processes and provides the researcher a framework for
understanding structural relations and behaviors as they are experienced in the temporal flux
of music.
Smalley (1997) describes spectromorphology not as a method for composition but rather as a descriptive tool based on aural perception; its intention is to help the listener and to be able to explain electroacoustic music. Further, Smalley argues that “Although spectromorphology is not a compositional theory, it can influence compositional methods since once the composer becomes conscious of concepts and words to diagnose and describe, then compositional thinking can be influenced, as I am sure my own composing has been” (Smalley, 1997, p. 107).
Setting aside electroacoustic and computer technology in spectromorphological thinking is something that Smalley (1997) points out as important. He continues that it is difficult, but necessary and logical, to ignore the desire to understand the mechanics behind the sounds in electroacoustic music, even though this desire to know is natural: every culture has knowledge of how sounds are made through listening and observation, through seeing and hearing another person physically playing music. He continues that electroacoustic music is not the same as playing an instrument, since it is acousmatic; a sound-texture or event in a composition is seldom the result of a single, quasi-instrumental, real-time, physical gesture. Further in his explanation, Smalley says, “Therefore, while in traditional music, sound-making and the perception of sound are interwoven, in electroacoustic music they are often not connected. Not that gesture, sources and causes are unimportant in electroacoustic music” (Smalley, 1997, p. 109).
In spectromorphology there is a term called technological listening (Smalley, 1997), and it refers to how a listener perceives the technology or technique behind the music rather than the music itself, to such a length that the meaning the composer wants to portray is perhaps blocked. Several devices and methods can easily impose their own character and clichés on music, and according to Smalley (1997) the technology should be transparent, or at least the qualities of the music should overshadow the tendency to listen to the music in a primarily technological manner. Smalley (1997) continues by observing that for the composer it is difficult to adopt a purer spectromorphological ear untainted by technological listening, and further, that technical preoccupations interfere with the creative stream and cloud perceptual judgement.
According to Smalley (1997), spectromorphological thinking bases its criteria on the possibility that it can potentially be apprehended by all listeners, and it concentrates on the fundamental features needed to describe sound. “That is, it is an aid to describing sound events and their relationships as they exist within a piece of music” (Smalley, 1997, p. 110).
Music is not a closed autonomous artefact; it refers not only to itself but relates to experiences outside of the composition, says Smalley (1997). He sees music as a cultural construct, arguing that, in culture, an extrinsic foundation is necessary for the intrinsic to have meaning. The intrinsic and extrinsic are interactive.
Smalley (1997) explains that electroacoustic music, the wide-open sonic world that it is, encourages imaginative and imagined extrinsic connections for the composer and listener, since the musical material is varied and ambiguous, uses motions of colorful spectral energy, and explores spatial perspectives. He gives this example:
There is quite a difference in identification level between a statement which says of a
texture, ‘It is stones falling’, a second which says, ‘It sounds like stones falling’, and a
third which says, ‘It sounds as if it’s behaving like falling stones’. All three statements
are extrinsic connections but in increasing stages of uncertainty and remoteness from
reality. If a listener, elaborating on either statements two or three, comments on
qualities and features of the texture as heard within the musical context, then attention
turns away from the primarily extrinsic towards special intrinsic features and therefore
moves more deeply into the particular musical experience. It is thus that this listener
starts to engage in spectromorphology. (Smalley, 1997, p. 110)
Another term that Smalley (1997) has coined is source bonding, which he defines as the “natural tendency to relate sounds to supposed sources and causes, and to relate sounds to each other because they appear to have shared or associated origins” (Smalley, 1997, p. 110). He says that this term represents the intrinsic-to-extrinsic link, from the inside of the work to the sounding world outside.
Another concept that Smalley (1997) has drawn from Schaeffer (2017) is that of reduced listening. For a composer it entails focused and repeated listening to a sound event, an activity that is common in the process of composing electroacoustic music. It is an investigative process in which detailed spectromorphological characteristics and relationships are discovered. Reduced listening demands that the distractions of source bonding and intrinsic-extrinsic threads be blocked out in order to concentrate on refining spectromorphological detail and sound quality. It is an abstract and relatively objective process, microscopic in its focus on details and intrinsic listening. There are concerns with reduced listening, and it is as dangerous as it is useful, for two reasons. First, once one has discovered an aural interest in the more detailed spectromorphological features, it is very hard to restore the extrinsic threads to their rightful place. Second, reduced listening tends to highlight less important, low-level, intrinsic detail to such an extent that the composer–listener can easily focus too much on the background at the expense of the foreground of the music. While repeated listening has the advantage of deeper exploration and the discovery of finer details in the music, it also causes perceptual distortions. Smalley’s experience with teaching composers has often shown that these kinds of perceptual distortions are frequent among composers. In electroacoustic music, reduced listening mechanisms lie behind the development of concepts and are a necessity for a full analysis of electroacoustic music, particularly on the lowest levels of structure within the music.
Smalley (1997) states that the basic gesture of traditional instrumental music is of a sound-producing nature. While such gestures are not part of electronic studio-based compositions, in both electronic and acoustic music the embodied experience of gestures shapes our perception of the music. He goes on to argue that in tonal music, notes form a consistent low-level unit and are grouped into higher-level gestural contours, and into phrases that traditionally are based on breath-groups. He continues that, in electroacoustic music, the scale of gestural impetus is also variable, from the smallest attack-morphology to the broad sweep of a much longer gesture, continuous in its motion and flexible in its pacing. Smalley further states that gestures are a forming principle for propelling time forwards:
The notion of gesture as a forming principle is concerned with propelling time
forwards, with moving away from one goal towards the next goal in the
structure – the energy of motion expressed through spectral and morphological
change. Gestural music, then, is governed by a sense of forward motion, of
linearity, of narrativity. The energy–motion trajectory of gesture is therefore not
only the history of an individual event, but can also be an approach to the
psychology of time. (Smalley, 1997, p. 113)
He continues by saying that most music is a mix of both texture and gesture: “but most musics are texture–gesture mixtures, either in that focus shifts between them, or because they exist in some kind of collaborative equilibrium. Where one or the other dominates in a work or part of a work, we can refer to the context as gesture-carried or texture-carried” (Smalley, 1997, p. 114). Furthermore, Smalley (1997) observes how individual gestures can have textured interiors; in that case gestural motion frames the texture, and while the gestural contour dominates, one is conscious of both gesture and texture. This is an example of gesture-framing. With texture-carried structures, the environments are not always democratic interiors where every microevent is equal and individuals are incorporated in collective activity. Gestures can be in the foreground, sculpturally in relief against the texture. This basic framework is an example of texture-setting: a texture that provides the individual gesture a stage to act within.
Smalley writes much more about the technical details of spectromorphology in his article and uses many more terms. These more technical details and terms are not something I use in my analysis or in this thesis, so I do not deem it necessary to cover them in this theory chapter. If this is of interest, I would highly recommend reading his article Spectromorphology: explaining sound-shapes in full. Some final remarks from Smalley:
Spectromorphology is concerned with perceiving and thinking in terms of
spectral energies and shapes in space, their behaviour, their motion and growth
processes, and their relative functions in a musical context. Although the detail
of spectromorphological description may sometimes not be easy to follow,
particularly without an extensive experience of electroacoustic music repertory,
it is far from being an esoteric activity. Spectromorphological thinking is basic
and easily understood in principle because it is founded on experience of
sounding and non-sounding phenomena outside music, a knowledge everyone
has – there is a strong extrinsic–intrinsic link. In this sense spectromorphology
derives from a common, shared, natural base which provides a framework for
the individual, cultural works of electroacoustic music. (Smalley, 1997, p. 124-
125)
5 The original was published in French in 1966 titled Traité des objets musicaux (Schaeffer, 1996)
The composer Lasse Thoresen, with the assistance of Andreas Hedman, made an adaptation of
Schaeffer’s typomorphology that they present in the article Spectromorphological Analysis of
Sound Objects, An adaptation of Pierre Schaeffer’s Typomorphology (2007). Their adaptation
serves to develop graphical symbols to represent sonic structures, with the aim of providing a
system which can create a representation of the listening experience.
A development of Thoresen & Hedman’s system is the system of Sound Notation by Mattias
Sköld (2020) and it is the first major adaptation of SASO6. Sköld’s Sound Notation places
symbols in a hybrid frequency-staff system where specific pitches can easily be recognized,
while the corresponding frequency-scale provides an aid to relate spectral data for the actual
frequency7 content of the sound.
In his thesis, Sköld (2023) states that Sound Notation is a newly developed notation system for composition, analysis, and transcription that makes it possible to describe all types of sounds. Standard staff notation is combined with analysis of electroacoustic music to form a hybrid system. All symbols in this notation system are related to auditive qualities in the sound object. This makes it possible for a person or a computer to identify the symbols from their sonification or musical interpretation.
6 SASO is an abbreviation of Thoresen & Hedman’s (2007) article Spectromorphological Analysis of Sound
Objects.
7 Audio practitioners work with waves, which are created when a medium is disturbed. This medium can be air,
water, steel, the earth, or other substances. The disturbance causes a fluctuation in the medium's normal state,
which then propagates outward as a wave from the source. When using one second as a reference time span, the
frequency of the event is the number of fluctuations per second, measured in cycles per second, or Hertz.
Humans can hear frequencies ranging from 20 Hz to 20,000 Hz (20 kHz). In audio circuits, the primary focus is
usually the electrical voltage, while in acoustical circuits, it is the deviation in air pressure from the ambient
atmospheric pressure. When air pressure fluctuations occur within the 20 Hz to 20 kHz range, they become
audible to humans. (Ballou, 2008).
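The relations in the footnote, frequency as cycles per second and the 20 Hz to 20 kHz audible range, can be stated directly. This is my own small illustration, not from Ballou (2008).

```python
def period_s(freq_hz: float) -> float:
    """Frequency is cycles per second, so one cycle lasts its reciprocal."""
    return 1.0 / freq_hz

def audible(freq_hz: float) -> bool:
    """The nominal human hearing range from the footnote."""
    return 20.0 <= freq_hz <= 20_000.0

assert abs(period_s(20.0) - 0.05) < 1e-12  # a 20 Hz wave repeats every 50 ms
assert audible(440.0)
assert not audible(15.0)
```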
3.4 Organizing form
How, then, can coherent musical structures be created in electroacoustic music? Roads (2015) argues that any composition, not only an algorithmically generated one, could be seen as a structure that is the product of a set of operations constrained by a set of rules or grammar: “The concept of compositional organization is an abstraction––a mental plan for ordering sounds and spawning sound patterns” (Roads, 2015, p. 283). This might not even be clear to the composer, who could see it as elements of style within a genre or as simply working intuitively. The sonic result of a composition rarely illuminates the grammar or process that created the piece of music, and still, it constitutes a foundation that lays the ground for the compositional work. A map of different approaches to the organization of musical form in the compositional process, drawn from Roads (2015), is found below in Fig. 3.6.
3.4.1 Macroform
Macroform is the top structural hierarchy of a composition and a major component in making a composition work structurally. “More than once, I have lost a composition on the emergency operation table of formal organization” (Roads, 2015, p. 290). For planning out the macroform in a composition, Roads (2015) presents three strategies:
The top-down strategy, where you start with a predefined macroform to use as a template, such as sonata or rondo, or design your own macroform and use it in the same fashion. The next step in this strategy would be to design the mesoform and finally to create the sonic material to use in the higher-level structures. A possible problem with this approach, which Roads (2015) points out, is that strict top-down planning can put too much emphasis on the higher hierarchical structure and neglect the sound material. To mitigate this, the sound material must be molded to fit within the predefined macro- and mesoform.
The next strategy is the bottom-up strategy, and it is the opposite of the top-down strategy. “It constructs form as the final result of a process of internal development produced by interactions on low levels of structure––like a seed growing into a mature plant” (Roads, 2015, p. 294). A problem with this strategy, which Roads (2015) points out, is that the surface structure can be quite complicated while lacking a rich hierarchical structure. Compositions made in this manner can lack a clear sense of beginnings, middles and endings. “These do not simply “emerge” out of most bottom-up strategies” (Roads, 2015, p. 298).
The final strategy that Roads (2015) presents he calls multiscale planning. In simple terms, it is a strategy where you work on all the different time scales within a composition at the same time and modify the form as the composition takes shape. “Multiscale planning can begin from either a top-down or bottom-up starting point. For example, one might start from a high-level conception and then modify it as specific sounds are mapped onto it” (Roads, 2015, p. 300). This has benefits, as argued by Roads: “The core virtue of multiscale planning is flexibility; it mediates between abstract high-level concepts and unanticipated opportunities and imperatives emerging from the lower levels of sound structure” (p. 299). With the multiscale strategy, composers must be analytical when they construct their compositions. “Ongoing analysis of all levels of musical form and function is important in the multiscale process, especially in the late stages of construction. Problems in a composition must be confronted directly through analysis, a process akin to debugging software” (Roads, 2015, p. 305).
Roads (2015) writes that the multiscale approach is flexible and opportunistic as a compositional strategy, mixing top-down and bottom-up strategies for structural organization. This method of organization can be compared to a heterarchy of partial systems that come into and go out of being. The multiscale approach can employ generative processes but gives the composer the right to interact, intervene, edit, and transform the material at any time.
3.4.2 Mesoform
The structures within the meso plane of the hierarchy can be designed in various forms. According to Roads (2015), common mesostructures for both instrumental and electroacoustic music are:
Repetitions––the most basic musical structure: iterations of a single sound or
group of sounds. If the iteration is regular, it forms a pulse or loop.
Melodies––sequential strings of varying sound objects forming melodies, not
just of pitch, but also of timbre, amplitude8, duration, or spatial position.
Variations––iterations of figure groups under various transformations, so that
subsequent iterations vary.
Polyphonies––parallel sequences, where the interplay between the sequences is
either closely correlated (as in harmony), loosely correlated (as in counterpoint),
or independent; the sequences can articulate pitch, timbre, amplitude, duration,
or spatial position. (Roads, 2015, p. 306)
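Roads’ four mesostructures can be sketched as operations on sequences of sound objects. Representing a sound object as a (pitch, duration) pair is my own hypothetical simplification for illustration; as Roads notes, timbre, amplitude or spatial position could equally be the varied parameter.

```python
# A short motive of sound objects as (pitch_in_semitones, duration_s) pairs.
motive = [(60, 0.5), (63, 0.5), (67, 1.0)]

def repetition(seq, times):
    """Iterate a group of sounds; regular iteration forms a pulse or loop."""
    return seq * times

def variation(seq, transpose=0, stretch=1.0):
    """Iterate the figure under a transformation, so the iteration varies."""
    return [(p + transpose, d * stretch) for p, d in seq]

def polyphony(*sequences):
    """Parallel sequences, here left independent of one another."""
    return list(sequences)

loop = repetition(motive, 4)                          # a 4x loop
answer = variation(motive, transpose=5, stretch=2.0)  # varied restatement
layers = polyphony(motive, answer)                    # two parallel lines
```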
Repetition, and variation within those repetitions, is a basic way of creating mesoform. Schoenberg states this about repetition and variation: “Intelligibility in music seems to be impossible without repetition. While repetition without variation can easily produce monotony, juxtaposition of distantly related elements can easily degenerate into nonsense, especially if unifying elements are omitted. Only so much variation as character, length and tempo required should be admitted: the coherence of motive-forms should be emphasized” (Schoenberg, 1967, p. 20). The motive that Schoenberg stresses as very important would fall in the category of the sound object time scale according to Roads (2009). Schoenberg (1967) writes that the motive should bring unity, relationship, coherence, logic, comprehensibility and fluency to a composition. Connecting “the motive” to the term sound object gives us the possibility that any sound can act as the motive and be treated with repetitions and variations just as a motive in acoustic composition, but transferred to the realm of electroacoustic composition. “The concept of sound object extends this to allow any sound, from any source” (Roads, 2001, p. 17).
It is important, in the mesoform, to give variations to the sound object (the motive). As
observed by Schoenberg (1967) “repetition alone often give rise to monotony. Monotony can
only be overcome by variation” (p. 8). A melody or a theme is part of the mesoform in a
composition, and Schoenberg points out that the structure within the theme is important. “The
organization cannot be so loose that one might feel a lack of structure” (Schoenberg, 1967, p.
103). Further he says that: “A melody, classical or contemporary, tends toward regularity,
simple repetitions and even symmetry. Hence, it generally reveals distinct phrasing. Of
course, the length of a singer’s breath is no measure for the length of a phrase in an
instrumental melody, but the number of measures in moderate tempo is likely to be about the
same as in a vocal melody” (Schoenberg, 1967, p. 103). Hence, according to Schoenberg,
when organizing the mesoform in a composition, the mesoform needs to be in several layers.
The top layer of the mesoform points out the themes in a composition and the next level in the
mesoform displays the structure within the theme itself.
Polyphony and counterpoint are fundamental in instrumental and chorale music. Schoenberg
describes counterpoint as: “the study of the art of voice leading with respect to motivic
combination (and ultimately the study of the ‘contrapuntal forms’)” (Schoenberg, 1978, p.
13). Roads (2015) proposes that polyphony or counterpoint in traditional instrumental and
vocal music refers to the use of two or more melodic lines that are independent from each
other but still work in conjunction, giving rise to a pattern of note oppositions. He further
argues that, in electroacoustic music, polyphony and counterpoint can differ from their
instrumental and vocal counterparts:
Polyphony takes different forms in electronic music. We can categorize these
according to the timescale. Polyphony on a micro timescale (as in granular
synthesis9) results in cloud10, stream11, or sound mass12 textures. While
traditional polyphony depends on a counterpoint of stable notes, a texturally rich
and mutating sound mass does not necessarily require a contrasting line to
maintain interest. Of course, one or more independent lines (such as a bass line)
playing against a sound mass can also be effective. (Roads, 2015, p. 306)
On the structural level of sound objects, polyphony is more closely connected to that of
instrumental music. “Polyphony on the level of sound objects is analogous to traditional note-
against-note-polyphony” (Roads, 2015, p. 306). Further Roads (2015) adds that polyphony in
electronic music on a sound object level is not only pitch contra pitch but also timbre contra
timbre and pitch contra noise.13 There are other types of polyphony common in
electroacoustic music that are harder or near impossible to create in instrumental music
without the help of electronics. “Other kinds of polyphony frequently heard in electronic
music include crossfading voices, repeating echoes, or reverberations that carry over other
sounds” (Roads, 2015, p. 307). Both fission and fusion of sounds are part of the possible
polyphony in electroacoustic music. “Polyphony in electronic music is also related to
processes of fission (splitting of a sound) and fusion (merging of a sound)” (Roads, 2015, p.
307).
8 Amplitude refers to the extent of change in a periodic variable within a single cycle, whether in time or space.
For a non-periodic signal, amplitude represents its magnitude relative to a reference value. (Amplitude, 2024, 5
December).
9 Granular synthesis is a type of sound synthesis that uses a technique called granulation. This process involves
dividing an audio sample into tiny fragments known as “grains,” which are usually between 1 and 100
milliseconds long. Granular synthesizers give users the ability to manipulate these grains, allowing them to
reshape and transform the original sound into unique and often surprising results. (Native Instruments, n.d.).
10 Closely related to streams are sound clouds—collections of hundreds or thousands of sound particles
controlled statistically, first described by Xenakis in 1960. Cloud textures suggest an alternative approach to
musical organization, focusing on the unfolding of musical mesostructures through processes of statistical
evolution. Cloud evolutions can occur in various domains, including amplitude (crescendo/decrescendo), internal
tempo(accelerando/rallentando), grain density (increasing/decreasing), harmonicity (pitch/chord/noise, etc.), and
spectrum (high/mid/low, etc.). (Roads, 2015).
11 A sound mass is a solid block of sound that evolves gradually, while streaming mesostructures seem to flow
quickly, like liquids. Within these fluid-like processes, sound is conceptualized as a continuous emission of
microsonic particles. (Roads, 2015).
12 Sound mass is a unified texture or monolith of sound formed by the layering of multiple sources. Its density
and opacity set it apart from the stream and cloud sound morphologies. (Roads, 2015).
13 The dictionary defines noise as an unwanted disturbance. In engineering, noise typically refers to unwanted
interference in a signal channel, such as buzz, hum, or hiss, that disrupts a meaningful message. Noise is a
natural part of many sounds, which are often a combination of pitch and noise. Noise in music is neither new nor
exclusive to electroacoustic music. Unpitched percussion such as snare drum, tom-tom, cymbals, and woodblock
produce noise, as does the scraping of a cello bow or breathy tones in woodwinds or brass (Roads,
2015).
3.5 Algorithmic, Aleatoric, Generative, and Process composition
Eigenfeldt (2016) describes generative art as works created using a system. He goes on to
write that what distinguishes generative artworks is that each run
of the system yields a new and changed result. Generative music can take many
forms and provides means through which a composition may take a different sounding form in
each performance. During the 20th century new methods of composing arose in the form of
algorithmic14, aleatoric, generative, and process driven composition. “A composition
algorithm serves as a generative engine for music creation” (Roads, 2015, p. 339). These
techniques are based on other rules than harmony and counterpoint. For the composer
working with these techniques, the algorithm or rules set up by them, can be just as much part
of the piece as the finished score itself. “The concept and rules make the work. Some would
say that the algorithms are the art” (Roads, 2015, p. 348).
There are several ways to work with algorithmic music, for instance using different serial
techniques, building on the techniques developed for structural permutation of the 12-tone
row by Arnold Schoenberg, but applied not only to pitch but also to other parameters, such as
duration, dynamics, rhythm, etc., as developed by Anton Webern, Olivier Messiaen, Pierre
Boulez and others (Roads, 2015).
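The classic row operations behind serial permutation can be sketched in a few lines of code. The tone row and the duration series below are invented for illustration; they are not taken from any of the composers named.

```python
# Toy sketch of serial permutation: the classic row operations
# (retrograde, inversion, transposition) applied to a 12-tone row,
# and the same idea reused for a duration series.

row = [0, 11, 7, 8, 3, 1, 2, 10, 6, 5, 4, 9]  # pitch classes (hypothetical row)

def retrograde(series):
    # Play the series backwards.
    return series[::-1]

def inversion(series):
    # Mirror each interval around the first pitch class, mod 12.
    first = series[0]
    return [(first - (p - first)) % 12 for p in series]

def transpose(series, n):
    # Shift every pitch class by n semitones, mod 12.
    return [(p + n) % 12 for p in series]

# The same permutations can govern other parameters, e.g. durations:
durations = [1, 2, 4, 8]            # quarter-note multiples (hypothetical)
print(retrograde(durations))        # [8, 4, 2, 1]
print(retrograde(row))
print(inversion(row))
print(transpose(row, 5))
```

Applying `retrograde` to the duration series instead of the row is exactly the extension to non-pitch parameters that Webern, Messiaen and Boulez explored.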
A different approach is to use aleatoric techniques, which means that you use chance or
randomness to create your composition or generate your musical material. This can be done
in different ways, as Roads (2015) explains: certain details of a piece would be left open to
the interpreter (in the case of instrumental music), or else they were composed according to
random or chance operations like throwing dice or coins and then mapping the outcome to a
list of corresponding notes or note patterns. Aleatoric techniques are not new to the 20th
century; they were already in use during the 18th century. “A Musikalisches Würfelspiel (German
for “musical dice game”) was a system for using dice to randomly generate music from
precomposed options” (Musikalisches Würfelspiel, 2024, 28 November).
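A minimal sketch of such a dice game follows; the table mapping each possible two-dice total to a precomposed measure is a made-up placeholder, not one of the historical tables.

```python
import random

# Toy Musikalisches Würfelspiel: two dice select one of several
# precomposed measures for each slot of an eight-bar phrase.
# The measure labels are invented placeholders.

measure_table = {total: f"measure_{total}" for total in range(2, 13)}

def roll_phrase(bars=8, rng=random):
    phrase = []
    for _ in range(bars):
        total = rng.randint(1, 6) + rng.randint(1, 6)  # roll two dice
        phrase.append(measure_table[total])
    return phrase

print(roll_phrase())  # e.g. ['measure_7', 'measure_4', ...]
```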
Lejaren Hiller, one of the pioneers of algorithmically generated music using a computer, used
mathematical models to create music, already in the 1950s (Roads, 2015). Two famous
examples are the Illiac Suite for String Quartet, written in 1956, a landmark in the history of
musical composition, as well as The Computer Cantata written in 1963. Together with John
Cage, Hiller also created HPSCHD (1969) for harpsichord and computer. Hiller and Isaacson
(1956) cited in Roads (2015), argued, with reference to the creation of the Illiac Suite that:
Music is . . . governed by laws of organization, which permit fairly exact codification.
(. . . it has even been claimed that the content of music is nothing other than its
organization.) From this proposition, it follows that computer-composed music which
is “meaningful” is conceivable to the extent to which the laws of musical organization
are codifiable. (Hiller and Isaacson, 1956, cited in Roads, 2015, p. 342-343)
Hiller’s algorithms could generate different kinds of music with the computer. As Roads
(2015) points out, musical ideas ranging from the canons of traditional counterpoint to the
principles of serial technique, using both deterministic and stochastic methods, could be coded to
create algorithmic music.
Today, composers creating algorithms to generate music work either with
graphical programs or directly with code. Roads (2015) gives these
examples of programs in use: Max/MSP, SuperCollider, PD and OpenMusic.
What is the difference between deterministic and stochastic algorithms? Roads (2015)
explains that deterministic procedures generate musical material by carrying out a fixed, rule-
based compositional task that does not make use of random selection. The supplied variables
in a deterministic procedure are called the seed data. Seed data could be a set of pitches, a
musical phrase, or some other constraints that the algorithm must obey. An example of a
deterministic procedure would be a program to harmonize a chorale melody in the style of J.
S. Bach. The seed data would be the melody, while harmonization rules taken from a textbook
would ensure that the program uses only legal chord sequences. The deterministic algorithm
would look for a solution that ticks off all the rules of the harmonization textbook. A
deterministic program is more like a calculator: if fed the same seed data, the result should
be the same every time.
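A full chorale harmonizer is far beyond a short sketch, but the defining property of a deterministic procedure, that identical seed data always yields identical output, can be illustrated with a toy rule of my own invention:

```python
# Minimal stand-in for a deterministic procedure: a fixed rule
# (transpose the seed phrase up a fifth, then retrograde it)
# applied to seed data. Same seed in, same result out -- no
# randomness anywhere. The rule and pitches are illustrative only.

def deterministic_variant(seed_pitches):
    transposed = [(p + 7) % 12 for p in seed_pitches]  # up a fifth
    return transposed[::-1]                            # retrograde

seed = [0, 4, 7, 4]  # hypothetical seed phrase (C major outline)
print(deterministic_variant(seed))  # [11, 2, 11, 7] -- every time
```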
Further, Roads (2015) explains that stochastic algorithms work by integrating random
choice into the decision-making process. A basic stochastic generator produces a random
number and compares it to values in a probability table; the algorithm then creates the event
associated with that range. By weighting the probability of certain events over others, one can
ensure an overall trend while keeping the local events unpredictable.
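The basic stochastic generator described here can be sketched as follows; the event names and weights in the probability table are hypothetical:

```python
import random

# Basic stochastic generator: draw a random number, compare it
# against the ranges of a probability table, and emit the event
# whose range it falls in.

probability_table = [("low_drone", 0.5), ("mid_texture", 0.3), ("high_grain", 0.2)]

def next_event(rng=random):
    r = rng.random()          # uniform draw in [0, 1)
    cumulative = 0.0
    for event, weight in probability_table:
        cumulative += weight
        if r < cumulative:
            return event
    return probability_table[-1][0]  # guard against rounding error

# Over many draws the weights shape the overall trend,
# while each individual event stays unpredictable.
events = [next_event() for _ in range(1000)]
print(events[:10])
```

Roughly half of the generated events will be `low_drone`, yet the local order of events cannot be predicted, which is exactly the trade-off Roads describes.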
There are several strategies that a composer can choose between when working with
generative algorithmic compositions, one being the Batch mode composition approach. Roads
(2015) explains batch mode composition in this fashion: using this method, the composer will
start by coding the algorithm. The next step in the process will be to enter the seed data,
followed by executing the program. Now the composer has a choice to either accept or reject
the output; the output of the algorithm is the composition. If they reject, they must go back
and modify the algorithm and redo the process until they are satisfied with the output of the
program. “We cannot simply correct the offending events without generating a new score.
This is because in a batch approach to automated composition, the unit of composition and
interaction are an entire piece” (Roads, 2015, p. 349). Many composers on the other hand are
not that strict and will make edits and changes to the generated material. “Xenakis treated the
output of his composition programs flexibly. In particular, he edited, rearranged, and refined
the raw data emitted by his Stochastic Music Program” (Roads, 2015, p. 350).
Heuristic algorithms are another approach that could be used. “To produce wonderful forms,
what is needed is a hybrid formal/informal approach, combining the computational power of
algorithmic control with the magical influence of heuristics. What is heuristic influence?
Heuristics is the art of experience-based strategies for problem-solving” (Roads, 2015, p.
351). The difference between heuristic and brute-force computer models is explained this way
by Roads (2015): the heuristic method stands in contrast to brute-force computer models that
count and search millions of possibilities and then make their choice based on short-term
statistics. A brute-force approach works well if the task is of a fixed-rule nature, for
example games like checkers and chess. In art, where the rules are not as fixed, a brute-force
method simply doesn’t work that well. The heuristic method makes use of rules of thumb,
educated guesses, intuitive judgements, and common sense, all of which is based on experience.
The heuristic approach understands context; the algorithm is programmed to understand the
context in a game, a composition, or even the context of a culture.
According to Roads (2015), some composers use generative methods to break away from
conventional musical structure, in the search for a fresh alternative compared to traditional
narrative formulas. But narrative is hard to escape, as Roads puts it, “the mind relates what it
has heard to the previous context and anticipates subsequent events; we inevitably react
emotionally according to how expectations are met or denied. The construction of narrative is
the human mind’s innate response to perceived process and structure” (Roads, 2015, p. 361).
One of the challenges with algorithmic compositions is to create a musical structure that is
coherent on multilayered planes. “The design of form is the ultimate test of a composition.
Many generative systems employ bottom-up strategies that do not consider the meso and
macro layers of form” (Roads, 2015, p. 361). Along similar lines, Eigenfeldt makes the
following observations regarding how form is generated in algorithmic compositions:
This difficulty is multiplied exponentially when applied to generative music:
how can one codify structural decisions when many of these decisions are
aesthetic in nature? For example, interactive systems allow the composer to
determine when to move to the next section, or when to alter a process, based
upon choices informed by context – how long has the current section been going
on? – and aesthetics – is the material starting to lose interest? Different surface
features (i.e. the musical context) will engender different decisions; codifying
such processes suggests the need for computational aesthetic evaluation, a
highly complex notion that remains an open problem. (Eigenfeldt, 2016, p. 1-2)
An algorithmic method can be of help for the composer in the composition process. “Any
thought experiment involving musical process can be designed, coded, and tested” (Roads,
2015, p. 365). Continuing on the subject of algorithmic composition programs, Roads (2015)
writes that such a program could handle more organizational detail than would be possible for
a human composer to handle. This gives the composer the possibility to pay less attention to
arcane details that could be handled by the program according to the instructions from the
composer and lets the composer focus on a higher level of abstraction. “At this level, the
composer manages the meta-creation of the piece in terms of its process model” (Roads,
2015, p. 365).
14 The term "algorithm" is commonly used today to refer to a set of rules or procedures a machine, particularly a
computer, follows to achieve a specific goal. However, it is not limited to computer-related tasks. The term can
just as accurately describe the steps involved in making a pizza or solving a Rubik's Cube as it does in computer-
based data analysis.
Algorithms are often paired with terms that specify the activity for which they are designed. For instance, a
search algorithm is a procedure used to determine what kind of information is retrieved from a large data set,
while an encryption algorithm refers to the rules used to encode information so that unauthorized individuals
cannot read it.
Though "algorithm" first appeared in the early 20th century and was primarily used in mathematics and
computing, the term has a surprisingly deep history. It originates from "algorism," which refers to the system of
Arabic numerals. This word, which dates to Middle English, ultimately comes from the name of the 9th-century
Persian mathematician, Abu Jaʽfar Mohammed ibn-Mūsa al-Khuwārizmi, who made significant contributions to
algebra and numeric systems. (Merriam-Webster, n.d.).
3.6 Moment form
Moment form is a form concept introduced by Stockhausen with his pieces Kontakte (1960),
Carré (1960) and Momente (1965). Roads (2015) explains that with these pieces Stockhausen
introduced a paradigm called moment form, where a piece unfolds as a succession of
unrelated episodes or moments.
Chang (n.d.) explains that in moment form the moments are regarded as free-standing from
each other. As a consequence, the composition does not have to be based on the
forward development of a basic thematic moment, which makes the sequencing of the
moments non-linear.
In his lecture Four Criteria of Electronic Music (1972, cited in Roads, 2015), Stockhausen
observes how these forms do not aim to reach a climax, and the structure in moment form does
not contain the usual development that can be expected of a conventional composition: the
introductory, rising, transitional, and fading stages. The difference is that these new forms are
intense already from the start; they are forms in a state of having already been originated.
Every present moment is of importance, and a given moment is neither the consequence of the
previous one nor the prelude to the next moment. They are individual, independent, and
centered in themselves.
Regarding the macroform in moment form, Roads (2015) explains that there is no overarching
formal direction, from the start of the composition, which sounds like the middle of a piece and
has no introductory function, to its ending. “Each episode is a kind of
non sequitur––a separate vignette that does not particularly follow from the previous vignette”
(Roads, 2015, p. 314). Eigenfeldt adds further perspectives by suggesting that, in moment
form compositions:
A moment is comprised of a static entity – for example, a single harmony;
moments avoid development and goal-directed behaviour, although the potential
for processes to provide variation in the surface design is possible. Subsequent
moments are contrasting, often dramatically, with one another, as their internal
organisation and concerns must be different; as a result, changes between
moments result in what Kramer refers to as discontinuity. (Eigenfeldt, 2016)
Further, Eigenfeldt (2016) writes that each moment contains its own structure and can
contain a great deal of variation on its surface. But these variations are not
allowed to contribute to a behavior that is directed towards a goal. The combination of
contrasting moments is what gives the overarching structure in moment form.
3.7 Fibonacci sequence and the Golden Section Proportions
Hultqvist (2013) explains that the Fibonacci sequence, invented by Leonardo of Pisa (ca.
1170-1240), is a number sequence following the principle of adding the two previous numbers
to obtain the next. Hence, if the sequence starts on 0, then you add 0+1 and get 1, move
forward with 1+1 and get 2, then 1+2 and you will get 3 and so on. The initial sequence is
then: 0 1 1 2 3 5 8 13 21 34 55 89 144 233 etc. Livio (2002) shows that, higher up in the
Fibonacci sequence, the ratio between successive numbers comes very close to the Golden
Section proportions.
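The additive rule, and the convergence of the ratio toward the Golden Ratio, can be verified with a short sketch:

```python
# Fibonacci sequence by the additive rule described above: each
# number is the sum of the two previous ones. The ratio between
# successive terms approaches the Golden Ratio (about 1.618).

def fibonacci(n):
    seq = [0, 1]
    while len(seq) < n:
        seq.append(seq[-1] + seq[-2])
    return seq

seq = fibonacci(15)
print(seq)                # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377]
print(seq[-1] / seq[-2])  # ~1.618, close to the Golden Ratio
```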
One of the interesting parts of the Fibonacci sequence is its fractal properties that Livio (2002)
explains as self-similarity, referring to symmetry across size scales. He describes how the
Steinhardt-Jeong model for quasi-crystals produces long-range order without resulting in a
fully periodic crystal, a general property which is also found in the Fibonacci sequence. Livio
(2002) continues by showing an algorithm for the creation of a sequence known as the Golden
Sequence. It starts with the number 1, which is then replaced by 10. From there on, each 1 is
replaced by 10 and each 0 by 1 (see Fig. 3.9).
As pointed out above, the Golden Sequence is closely related to the Fibonacci sequence, and
the logarithmic spiral displays self-similarity in the algorithm since it looks exactly the same
under any magnification. The fractal properties in the Golden Sequence are easy to see and
the sequence is easy to continue. By adding the first row 10 after the second row 101 we get
the third row of 10110. In the same way if we add the second row 101 after the third row
10110, we get the fourth row of 10110101. In this way the sequence can go on forever while
still keeping its fractal properties. The ratio of the numbers will approach the Golden Ratio as
the sequence is extended.
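The rewriting algorithm can be sketched directly as string substitution, which also makes the Fibonacci row lengths and the concatenation property easy to check:

```python
# Golden Sequence by string rewriting: start with "1", then on each
# pass replace every 1 with 10 and every 0 with 1. The row lengths
# follow the Fibonacci numbers, and each row equals the previous
# row followed by the one before it.

def golden_sequence(rows):
    s = "1"
    out = [s]
    for _ in range(rows - 1):
        s = "".join("10" if c == "1" else "1" for c in s)
        out.append(s)
    return out

for row in golden_sequence(6):
    print(row)
# 1
# 10
# 101
# 10110
# 10110101
# 1011010110110
```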
The Golden ratio can also be seen as a line, and Livio (2002) describes how the line (see Fig.
3.10) that stretches from A to B is without doubt longer than the segment AC, but the segment
AC is longer than CB. If the ratio of the length of AC to that of CB is the same as the ratio of AB
to AC, then the line has been divided in the Golden Ratio.
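Writing the division out algebraically, with CB normalized to 1 and AC = x, makes the Golden Ratio appear directly:

```latex
\frac{AB}{AC} = \frac{AC}{CB}
\;\Longrightarrow\;
\frac{x+1}{x} = \frac{x}{1}
\;\Longrightarrow\;
x^{2} - x - 1 = 0
\;\Longrightarrow\;
x = \frac{1+\sqrt{5}}{2} \approx 1.618
```

The positive root is the Golden Ratio, Phi; its inverse, approximately 0.618, is the multiplier used in the calculations below.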
Further Livio (2002) says that the logarithmic spiral and the Golden Ratio go hand in hand.
This can be seen by connecting the successive points where these whirling squares divide the
sides in Golden Ratios. A logarithmic spiral that coils inward toward the pole will be created.
This is also true for a Golden Triangle “an isosceles triangle in which the side is in Golden
Ratio to the base” (Livio, 2002, p. 119).
Livio (2002) also shows that the Golden Ratio can be used to create musical form.
Referencing the Hungarian musicologist Ernő Lendvai, he describes how the Golden Ratio
appears in Béla Bartók’s piece Music for Strings, Percussion and Celesta. This piece has 89
measures in total and is divided into two parts of 55 and 34 measures. The first part is divided
into 34 and 21 by the removal of the strings’ mutes. The second part is divided into 13 and 21
by the strings putting on the mutes. All of these measures correspond to the Fibonacci
sequence. However, he also adds that other musicologists do not support Lendvai’s analysis,
and the musicologist László Somfai entirely discounts that Bartók would have used the
Golden Ratio consciously when composing.
The web series Sound Field (2019) presents a method through which the large-scale
proportions of a musical work, or, as they call it, its Phi moment, can be calculated. They
explain that the Golden Ratio in a musical work can be found by multiplying the length of
the song by 0.618, that is, the inverse of Phi. As an example of a Phi moment, they present
the song Under Pressure by Queen and David Bowie. The song is 246 seconds long;
multiplied by 0.618, this gives us that the Golden Ratio of the song occurs at 152 seconds, at
which point the song reaches a big climax. Sound Field (2019) do not claim that this was
calculated by Queen and David Bowie, but rather that we as humans inherently feel where
this moment should be.
This equation works with both time and measure counts to find the proportions of a length
divided by the Golden Ratio. This can be shown by applying the equation to the Fibonacci
numbers found in figure 3.11; using the equation in this way is my own idea.
89 x 0.618 = 55.002
55 x 0.618 = 33.99
34 x 0.618 = 21.012
21 x 0.618 = 12.978
13 x 0.618 = 8.034
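The repeated multiplications above can be expressed as a small function; the same rule covers both the Sound Field example in seconds and the Fibonacci measure counts:

```python
# Phi-moment calculation as described by Sound Field (2019):
# multiply the total length by 0.618, the inverse of Phi.
# Works for lengths in seconds or in measures alike.

def phi_moment(total_length):
    return total_length * 0.618

print(round(phi_moment(246)))  # 152 -- Under Pressure, in seconds
for measures in (89, 55, 34, 21, 13):
    print(measures, "->", round(phi_moment(measures), 3))
```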
The Golden Ratio can be used to create balance in the form of the music. According to Livio
(2002), the Golden Ratio can in principle contribute to the satisfaction in a piece of music
through the concept of proportional balance. In music this is trickier than in the visual arts. A
painting with clumsy proportions will stick out instantly at an exhibition, but in music we
must hear the entire piece before we can make a judgment about its proportional balance. An
experienced composer designs the form so that it is in perfect balance in each part and so that
the individual parts in themselves provide a balanced musical argument.