Chapter 3 Theory
This chapter introduces electroacoustic music and presents the theory used in this master’s project. The theory covers: macro, meso, sound object and micro form; spectromorphology and notation of electroacoustic music; organizing form; algorithmic, aleatoric, generative and process composition; moment form; and the Fibonacci sequence and Golden Section proportions.
3.1 What is electroacoustic music?
Electroacoustic music gives the composer of today the possibility to work with music in ways far beyond what the old masters could ever have dreamed of, both in the sense of musical material and the organization of music, and in the workflow and technical wonders of today. “Electroacoustic music opens access to all sounds, a bewildering
sonic array ranging from the real to the surreal and beyond” (Smalley, 1997, p. 107).
So, what is electroacoustic music? Any sound that can be recorded could be used in an electroacoustic composition, for example nature sounds, acoustic instruments, voice, electronic instruments, oscillators1 and much more. If it makes sound, it can be recorded and used in an electroacoustic composition. Within this genre the compositional work is done in the recording studio or, as today, simply with a computer using a DAW2 or other programs such as Max/MSP or SuperCollider.
Electroacoustic music, or acousmatic music as it is also called, is music composed to be performed through loudspeakers: “music which is partly or wholly acousmatic, that is, music where (in live performance) the sources and causes of the sounds are invisible – a music for loudspeakers alone, or music which mixes live performance with an acousmatic, loudspeaker element” (Smalley, 1997, p. 109). That the sources and causes of the sounds are invisible, as Smalley so neatly puts it, is the primordial difference between electroacoustic music and other genres of recorded music. Every one of us listens to recorded music through loudspeakers, either by choice or unwillingly. When listening to the intro guitar of the recording of Led Zeppelin’s Stairway to Heaven, the source of the sound is invisible, but the cause of the sound is not abstract or unclear; it is very clear that a guitar is performing the music. Here the cause of the sound is not abstract, and in a live performance we would see the performer, while at a concert of electroacoustic music we would only see loudspeakers. Music in the genre of electroacoustic music is written for loudspeakers and not for performers.
In electroacoustic music, many composers work outside the traditional framework of western music. “Gone are the familiar articulations of instruments and vocal utterance; gone is the stability of note and interval; gone too is the reference of beat and meter” (Smalley, 1997, p. 107). But in electroacoustic music you can still use the harmonic and rhythmic framework of western music, or of any other type of music for that matter, in a composition, or switch between different realms. What made all the new sound material possible for composers were the inventions in sound and electronic technology during the 20th century. Manning (1985) explains that innovations in the new field of electronics made the devices for generating synthetic sound less costly and more compact. The direct current arc oscillator was invented in 1900, and in 1906, the same year as the Dynamophone was first demonstrated, Lee De Forest patented the vacuum-tube triode amplifier valve. Manning (1985) continues that progress was slow but steady; by the end of the first world war the industry was well established, and several engineers were able to investigate the new electronic technology for possible use in electronic musical instruments.
1 An electronic oscillator is a circuit that generates a periodic, oscillating or alternating current (AC) signal, typically a sine wave, square wave, or triangle wave, powered by a direct current (DC) source. Oscillators are integral to many electronic devices, including radio receivers, televisions, radio and TV broadcast transmitters, computers, computer peripherals, cell phones, radar systems, and numerous other applications.
Oscillators are often categorized by the frequency of their output signal:
• A low-frequency oscillator (LFO) generates frequencies below approximately 20 Hz; the term is used in audio synthesizers to distinguish it from audio-frequency oscillators.
• An audio oscillator produces frequencies within the audio range, from 20 Hz to 20 kHz.
• A radio frequency (RF) oscillator generates signals above the audio range, typically between 100 kHz and 100 GHz. (Electronic oscillator, 2024, 4 December).
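The waveform names in the footnote can be illustrated with a short sketch. This is my own minimal illustration, not drawn from the cited sources; the 44.1 kHz sample rate and unit amplitude are assumptions chosen for the example.

```python
import math

def waveform_sample(shape: str, phase: float) -> float:
    """One sample of a unit-amplitude waveform; phase is in [0, 1)."""
    if shape == "sine":
        return math.sin(2.0 * math.pi * phase)
    if shape == "square":
        return 1.0 if phase < 0.5 else -1.0
    if shape == "triangle":
        return 1.0 - 4.0 * abs(phase - 0.5)  # -1 at phase 0, peak of 1 at phase 0.5
    raise ValueError(shape)

def oscillate(shape: str, freq: float, duration: float, sr: int = 44100):
    """Render `duration` seconds of the waveform at `freq` Hz."""
    n = int(duration * sr)
    return [waveform_sample(shape, (i * freq / sr) % 1.0) for i in range(n)]

one_period = oscillate("sine", 440.0, 1.0 / 440.0)  # one cycle of A4
```

An audio oscillator in the footnote’s sense would run such a function between 20 Hz and 20 kHz; an LFO would run it below roughly 20 Hz.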
2 A DAW, short for Digital Audio Workstation, is a software application that allows you to record, edit, and produce music on your computer. It encompasses every step of the music creation process, from recording audio and crafting beats or melodies with virtual instruments to adding effects and fine-tuning your final mix. Essentially, DAWs are all-in-one tools designed to support every aspect of your musical journey. (Steinberg, n.d.).
3.2 Macro, meso, sound object and microform
The composer Curtis Roads develops an analytical approach to electroacoustic music, arguing that “musical meaning is embedded in layers and encoded in many simultaneous musical parameters or dimensions” (Roads, 2015, p. 285). To lay the ground for such an analysis of the different layers that simultaneously contribute to our perception of electroacoustic music as meaningful, Roads (2001) explains the different musical structures as follows.
Macro: The time scale of overall musical architecture of form, measured in minutes or hours,
or in extreme cases, days.
Meso: Divisions of form. Groupings of sound objects into hierarchies of phrase structures of
various sizes, measured in minutes or seconds.
Sound object: A basic unit of musical structure, generalizing the traditional concept of the note to include complex and mutating sound events on a time scale ranging from a fraction of a second to several seconds.
Micro: Sound particles on a time scale that extends down to the threshold of auditory
perception (measured in thousandths of a second or milliseconds).
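Roads’ four time scales can be read as rough duration bands. The sketch below, my own and not from Roads, classifies a duration in seconds; the boundary values are approximations drawn from the descriptions above, not fixed limits.

```python
def time_scale(duration_s: float) -> str:
    """Roughly classify a duration into Roads' (2001) time scales."""
    if duration_s < 0.1:      # below ~100 ms: sound particles
        return "micro"
    if duration_s < 8.0:      # roughly note-length events
        return "sound object"
    if duration_s < 60.0:     # phrase structures, measured in seconds
        return "meso"
    return "macro"            # overall form, minutes or more

assert time_scale(0.005) == "micro"
assert time_scale(2.0) == "sound object"
assert time_scale(300.0) == "macro"
```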
3.2.1 Macroform
Macroform, or Macro time scale, as Roads (2001) calls it, is the large-scale form in a music
composition. On this level, we experience the architecture of the composition, and how the
different sections in a composition are joined together. This time scale is measured most often
in minutes. For instance, if someone asks how long a song is, the answer we give would be in the macro time scale.
Macroform is the top hierarchy when it comes to form. “Just as musical time can be viewed in terms of a hierarchy of time scales, so it is possible to imagine musical structure as a tree in the mathematical sense” (Roads, 2001, p. 12). The trunk of the tree would be the entire work, and the roots, going down and branching into smaller and smaller sections, portray the formal hierarchy.
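Roads’ tree metaphor can be sketched as a nested data structure, where the root is the entire work and the leaves are the smallest sections. The section names and durations below are hypothetical, chosen only to make the hierarchy concrete.

```python
# Inner nodes hold a list of children; leaves hold a duration in seconds.
form = ("work", [
    ("section A", [("phrase a1", 20.0), ("phrase a2", 25.0)]),
    ("section B", [("phrase b1", 30.0), ("phrase b2", 15.0)]),
])

def total_duration(node) -> float:
    """Macro-level duration is the sum of all lower-level durations."""
    name, content = node
    if isinstance(content, list):
        return sum(total_duration(child) for child in content)
    return content  # leaf: the phrase's own duration

assert total_duration(form) == 90.0  # the whole work, in the macro time scale
```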
3.2.2 Mesoform
Mesoform, or meso time scale, is the level in the form hierarchy where we would place the “theme”, and this time scale is measured in seconds. Roads (2001) points out that the mesoform is local, as opposed to the macroform, which is global. “The mesostructural level groups sound objects into a quasi-hierarchy of phrase structures of durations measured in seconds” (Roads, 2001, p. 14).
Further, Roads (2001) explains that on this local level the composition of a piece is
comprehensible to us listeners in real time while the macroform is something we perceive in
retrospect. In the mesoform we have the different themes, variations, developments, harmony,
rhythm and melodic ideas. “In electronic music, the meso layer presents timbre3
melodies, simultaneities (chord analogies), spatial4 interplay, and all manner of textural
evolutions” (Roads, 2001, p. 14).
3 Timbre is our perception of the "color" of sound, allowing us to distinguish between different instruments, even
when they play the same note. According to the American Standards Association, timbre is defined as "that
attribute of sensation by which a listener can perceive that two sounds of the same loudness and pitch are
different." It is primarily influenced by the spectrum of the sound, but also by factors such as the waveform,
sound pressure, frequency distribution, and the temporal characteristics of the sound. (Ballou, 2008).
4 The term spatialization is particularly associated with electroacoustic music and refers to the projection and
localization of sound sources in physical or virtual space, as well as the movement of sound within that space.
(Spatial music, 2024, 4 December).
3.2.3 Sound object
The notion of the sound object derives from Pierre Schaeffer and is closely related to Schaeffer’s experimentation with reel-to-reel tape recorders, which not only allowed for editing sound into smaller units but also invited repeated listening. Through these
technological advancements emerged not only the pioneering compositional movement of
musique concrète, but also Schaeffer’s exploration of sound through his long-term research
into acousmatic listening (Schaeffer, 2017). According to Roads (2001), the sound object time scale is any sound, drawn from any source, and usually lasts from 100 ms to several seconds.
“The sound object time scale encompasses events of a duration associated with the elementary
unit of composition in scores: the note” (Roads, 2001, p. 16). In a score the note is performed by an instrument or vocalist; in electroacoustic music any sound could be the source, and therefore sound object is a better term than note in this genre of music. “Any sound within stipulated temporal limits is a sound object” (Roads, 2001, p. 17).
3.2.4 Microform
Micro time scale, also called microform, refers to sounds that are very short in duration.
Roads defines this layer as embracing “transient audio phenomena, a broad class of sounds
that extends from the threshold of timbre perception (several hundred microseconds) up to the
duration of short sound object (~100ms)” (Roads, 2001, p. 20-21). We are exposed to
microsounds all around us in the natural world daily.
We experience the interactions of microsounds in the sound of a spray of water
droplets on a rocky shore, the gurgling of a brook, the pitter-patter of rain, the
crunching of gravel being walked upon, the snapping of burning embers, the
humming of a swarm of bees, the hissing of rice grains poured into a bowl, and
the crackling of ice melting. (Roads, 2001, p. 21)
3.2.5 Example of how to look at classical form in multistructural thinking
In its most basic form, the classical sonata consists of three parts in its macroform: exposition, development (Schoenberg calls this part elaboration), and recapitulation. Looking further into the exposition, marked as A in figure 3.2, we can see that Schoenberg (1967) has marked up three layers of mesoform, each dividing into smaller and smaller sections. The top layer consists of two parts, A (Tonic region) and B (Related region). Going deeper into the mesoform, we can see that the next layer elaborates in more detail what the A and B regions consist of. The next layer takes this even further, showing the basic idea and phrases. Schoenberg (1967) does not go deeper than the mesoform when explaining the structural relations of a sonata, but motives and characteristic intervals would be considered part of the sound object layer.
Microform is not applicable to an acoustic piece of music and is therefore not a possible layer of formal analysis there. As Roads (2001) explains, microevents touch the extreme time limits of what a human can perceive and perform. To examine and manipulate these events with precision, we need digital audio software and hardware that can act as a microscope, magnifying down to the micro time scale so we can operate on it.
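As a minimal illustration of operating on the micro time scale, the sketch below scatters short enveloped sine grains into a buffer, in the spirit of the granular techniques Roads describes. It is my own sketch, not software used in this project; the 30 ms grain length and 8 kHz sample rate are assumptions chosen to keep the example small.

```python
import math, random

SR = 8000          # assumed sample rate; real tools work at full audio rates
GRAIN_S = 0.030    # 30 ms: a duration inside Roads' micro time scale

def grain(freq: float) -> list[float]:
    """One sine grain shaped by a Hann window to smooth attack and decay."""
    n = int(GRAIN_S * SR)
    return [math.sin(2 * math.pi * freq * i / SR)
            * 0.5 * (1 - math.cos(2 * math.pi * i / (n - 1)))
            for i in range(n)]

def cloud(duration_s: float, n_grains: int, seed: int = 1) -> list[float]:
    """Mix randomly pitched, randomly placed grains into one buffer."""
    rng = random.Random(seed)
    out = [0.0] * int(duration_s * SR)
    for _ in range(n_grains):
        g = grain(rng.uniform(200.0, 2000.0))
        start = rng.randrange(len(out) - len(g))
        for i, s in enumerate(g):        # overlapping grains simply sum;
            out[start + i] += s          # a real tool would normalize
    return out

texture = cloud(1.0, 40)  # one second of grain cloud
```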
3.3 Spectromorphology and notation of electroacoustic music
With electroacoustic music, new ways of notation and analysis are needed to be able to discuss this genre of music intellectually. “The art of music is no longer limited to the sounding models of instruments and voices. Electroacoustic music opens access to all sounds, a bewildering sonic array ranging from the real to the surreal and beyond” (Smalley, 1997, p. 107). This thesis and my analysis are based mostly on spectromorphological thinking, but I have also looked at how others have worked with analysis and notation of electroacoustic music.
3.3.1 Spectromorphology
Spectromorphology was developed by Denis Smalley but has its base in the teachings of Pierre Schaeffer, as acknowledged in Smalley’s observation of how “The development of spectromorphological thinking owes most to Pierre Schaeffer’s Treatise on Musical Objects”5 (Smalley, 1997, p. 107). Spectromorphology is a method of analysis for describing the aural
experience of music. “The two parts of the term refer to the interaction between sound spectra
(spectro-) and the ways they change and are shaped through time (-morphology)” (Smalley,
1997, p. 107). This method creates a way to understand structural relationships in
electroacoustic music. Smalley (1997) writes that a spectromorphological approach sets out
spectral and morphological models and processes and provides the researcher a framework for
understanding structural relations and behaviors as they are experienced in the temporal flux
of music.
Smalley (1997) describes spectromorphology not as a method for composition but rather as a descriptive tool based on aural perception; its intention is to help the listener and to be able to explain electroacoustic music. Further, Smalley argues that “Although spectromorphology is not a compositional theory, it can influence compositional methods since once the composer becomes conscious of concepts and words to diagnose and describe, then compositional thinking can be influenced, as I am sure my own composing has been” (Smalley, 1997, p. 107).
Setting aside electroacoustic and computer technology in spectromorphological thinking is something that Smalley (1997) points out as important. He continues that it is difficult, but necessary and logical, to ignore the desire to understand the mechanics behind the sounds in electroacoustic music, even though this desire to know is natural: every culture has knowledge of how sounds are made through listening and observation, through seeing and hearing another person physically playing music. He continues that electroacoustic music is not the same as playing an instrument, since it is acousmatic; a sound-texture or event in a composition is seldom the result of a single, quasi-instrumental, real-time, physical gesture. Further in his explanation, Smalley says, “Therefore, while in traditional music, sound-making and the perception of sound are interwoven, in electroacoustic music they are often not connected. Not that gesture, sources and causes are unimportant in electroacoustic music” (Smalley, 1997, p. 109).
In spectromorphology there is a term called technological listening (Smalley, 1997), and it refers to how a listener perceives the technology or technique behind the music rather than the music itself, to such a length that the meaning the composer wants to portray is perhaps blocked. Several devices and methods can easily impose their own character and clichés on music, and according to Smalley (1997) the technology should be transparent, or at least the qualities of the music should overshadow the tendency to listen to the music in a primarily technological manner. Smalley (1997) continues by observing that for the composer it is difficult to adopt a purer spectromorphological ear untainted by technological listening, and further, that technical preoccupations interfere with the creative stream and cloud perceptual judgement.
According to Smalley (1997), spectromorphological thinking bases its criteria on the possibility that it can potentially be apprehended by all listeners, and it concentrates on the fundamental features needed to describe sound. “That is, it is an aid to describing sound events and their relationships as they exist within a piece of music” (Smalley, 1997, p. 110).
Music is not a closed autonomous artefact; it refers not only to itself but relates to experiences outside of the composition, says Smalley (1997). He sees music as a cultural construct, arguing that, in culture, an extrinsic foundation is necessary for the intrinsic to have meaning. The intrinsic and extrinsic are interactive.
Smalley (1997) explains that electroacoustic music, the wide-open sonic world that it is, encourages imaginative and imagined extrinsic connections for the composer and listener, since the musical material is varied and ambiguous, uses motions of colorful spectral energy, and explores spatial perspectives. He gives this example:
There is quite a difference in identification level between a statement which says of a
texture, ‘It is stones falling’, a second which says, ‘It sounds like stones falling’, and a
third which says, ‘It sounds as if it’s behaving like falling stones’. All three statements
are extrinsic connections but in increasing stages of uncertainty and remoteness from
reality. If a listener, elaborating on either statements two or three, comments on
qualities and features of the texture as heard within the musical context, then attention
turns away from the primarily extrinsic towards special intrinsic features and therefore
moves more deeply into the particular musical experience. It is thus that this listener
starts to engage in spectromorphology. (Smalley, 1997, p. 110)
Another term that Smalley (1997) has coined is source bonding, which he defines as the “natural tendency to relate sounds to supposed sources and causes, and to relate sounds to each other because they appear to have shared or associated origins” (Smalley, 1997, p. 110). He says that this term represents the intrinsic-to-extrinsic link, from the inside of the work to the sounding world outside.
Another concept that Smalley (1997) has drawn from Schaeffer (2017) is that of reduced listening. For a composer it entails focused and repeated listening to a sound event, an activity that is common in the process of composing electroacoustic music. It is an investigative process in which detailed spectromorphological characteristics and relationships are discovered. Reduced listening demands that the distractions of source bonding and intrinsic-extrinsic threads be blocked out in order to concentrate on refining spectromorphological detail and sound quality. It is an abstract and relatively objective process, microscopic in its focus on details and intrinsic listening. There are concerns with reduced listening, and it is as dangerous as it is useful, for two reasons. First, once one has discovered an aural interest in the more detailed spectromorphological features, it is very hard to restore the extrinsic threads to their rightful place. Second, reduced listening tends to highlight less important, low-level, intrinsic detail to such an extent that the composer–listener can easily focus too much on the background at the expense of the foreground of the music. While repeated listening has the advantage of deeper exploration and the discovery of finer details in the music, it also causes perceptual distortions. Smalley’s experience with teaching composers has often shown that these kinds of perceptual distortions are frequent among composers. In electroacoustic music, reduced listening mechanisms lie behind the development of concepts and are a necessity for a full analysis of electroacoustic music, particularly on the lowest levels of structure within the music.
Smalley (1997) states that the basic gesture of traditional instrumental music is of a sound-producing nature. While such gestures are not part of electronic studio-based compositions, in both electronic and acoustic music the embodied experience of gestures shapes our perception of the music. He goes on to argue that in tonal music, notes form a consistent low-level unit and are grouped into higher-level gestural contours, and into phrases that traditionally are based on breath-groups. He continues that, in electroacoustic music, the scale of gestural impetus is also variable, from the smallest attack-morphology to the broad sweep of a much longer gesture, continuous in its motion and flexible in its pacing. Smalley further states that gestures are a forming principle for propelling time forwards:
The notion of gesture as a forming principle is concerned with propelling time
forwards, with moving away from one goal towards the next goal in the
structure – the energy of motion expressed through spectral and morphological
change. Gestural music, then, is governed by a sense of forward motion, of
linearity, of narrativity. The energy–motion trajectory of gesture is therefore not
only the history of an individual event, but can also be an approach to the
psychology of time. (Smalley, 1997, p. 113)
He continues by saying that most music is a mix of both texture and gesture: “but most musics are texture–gesture mixtures, either in that focus shifts between them, or because they exist in some kind of collaborative equilibrium. Where one or the other dominates in a work or part of a work, we can refer to the context as gesture-carried or texture-carried” (Smalley, 1997, p. 114). Furthermore, Smalley (1997) observes how individual gestures can have textured interiors; in that case gestural motion frames the texture, and while the gestural contour dominates, one is conscious of both gesture and texture. This is an example of gesture-framing. With texture-carried structures, the environments are not always democratic interiors where every microevent is equal and individuals are incorporated in collective activity. Gestures can be in the foreground, sculpturally in relief against the texture. This basic framework is an example of texture-setting: a texture that provides the individual gesture a stage to act within.
Smalley writes much more about the technical details of spectromorphology in his article and uses many more terms. These more technical details and terms are not something I use in my analysis or in this thesis, so I do not deem it necessary to cover them in this theory chapter. If this is of interest, I would highly recommend reading his article Spectromorphology: explaining sound-shapes in full. Some final remarks from Smalley:
Spectromorphology is concerned with perceiving and thinking in terms of
spectral energies and shapes in space, their behaviour, their motion and growth
processes, and their relative functions in a musical context. Although the detail
of spectromorphological description may sometimes not be easy to follow,
particularly without an extensive experience of electroacoustic music repertory,
it is far from being an esoteric activity. Spectromorphological thinking is basic
and easily understood in principle because it is founded on experience of
sounding and non-sounding phenomena outside music, a knowledge everyone
has – there is a strong extrinsic–intrinsic link. In this sense spectromorphology
derives from a common, shared, natural base which provides a framework for
the individual, cultural works of electroacoustic music. (Smalley, 1997, p. 124-
125)
5 The original was published in French in 1966 titled Traité des objets musicaux (Schaeffer, 1996)
The composer Lasse Thoresen, with the assistance of Andreas Hedman, made an adaptation of
Schaeffer’s typomorphology that they present in the article Spectromorphological Analysis of
Sound Objects, An adaptation of Pierre Schaeffer’s Typomorphology (2007). Their adaptation
serves to develop graphical symbols to represent sonic structures, with the aim of providing a
system which can create a representation of the listening experience.
A development of Thoresen & Hedman’s system is the system of Sound Notation by Mattias
Sköld (2020) and it is the first major adaptation of SASO6. Sköld’s Sound Notation places
symbols in a hybrid frequency-staff system where specific pitches can easily be recognized,
while the corresponding frequency-scale provides an aid to relate spectral data for the actual
frequency7 content of the sound.
In his thesis, Sköld (2023) states that Sound Notation is a newly developed notation system for composition, analysis, and transcription that makes it possible to describe all types of sounds. Standard staff notation is combined with analysis of electroacoustic music to form a hybrid system. All symbols in this notation system are related to auditive qualities in the sound object. This makes it possible for a person or a computer to identify the symbols from their sonification or musical interpretation.
6 SASO is an abbreviation of Thoresen & Hedman’s (2007) article Spectromorphological Analysis of Sound
Objects.
7 Audio practitioners work with waves, which are created when a medium is disturbed. This medium can be air,
water, steel, the earth, or other substances. The disturbance causes a fluctuation in the medium's normal state,
which then propagates outward as a wave from the source. When using one second as a reference time span, the
frequency of the event is the number of fluctuations per second, measured in cycles per second, or Hertz.
Humans can hear frequencies ranging from 20 Hz to 20,000 Hz (20 kHz). In audio circuits, the primary focus is
usually the electrical voltage, while in acoustical circuits, it is the deviation in air pressure from the ambient
atmospheric pressure. When air pressure fluctuations occur within the 20 Hz to 20 kHz range, they become
audible to humans. (Ballou, 2008).
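The relations in the footnote, frequency as cycles per second and the 20 Hz to 20 kHz audible range, can be stated directly. This is my own small illustration, not from Ballou (2008).

```python
def period_s(freq_hz: float) -> float:
    """Frequency is cycles per second, so one cycle lasts its reciprocal."""
    return 1.0 / freq_hz

def audible(freq_hz: float) -> bool:
    """The nominal human hearing range from the footnote."""
    return 20.0 <= freq_hz <= 20_000.0

assert abs(period_s(20.0) - 0.05) < 1e-12  # a 20 Hz wave repeats every 50 ms
assert audible(440.0)
assert not audible(15.0)
```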
3.4 Organizing form
How, then, can coherent musical structures be created in electroacoustic music? Roads (2015) argues that any composition, not only an algorithmically generated one, could be seen as a structure that is the product of a set of operations constrained by a set of rules or grammar: “The concept of compositional organization is an abstraction––a mental plan for ordering sounds and spawning sound patterns” (Roads, 2015, p. 283). This might not even be clear to the composer, who could see it as elements of style within a genre or as simply working intuitively. The sonic result of a composition rarely illuminates the grammar or process that created the piece of music, and still, it constitutes a foundation that lays the ground for the compositional work. A map of different approaches to the organization of musical form in the compositional process, drawn from Roads (2015), is found below in Fig. 3.6.
3.4.1 Macroform
Macroform is the top structural hierarchy of a composition and a major component in making a composition work structurally. “More than once, I have lost a composition on the emergency operation table of formal organization” (Roads, 2015, p. 290). For planning out the macroform in a composition, Roads (2015) presents three strategies:
The top-down strategy, where you start with a predefined macroform to use as a template, such as sonata or rondo, or design your own macroform and use it in the same fashion. The next step in this strategy would be to design the mesoform and finally to create the sonic material to use in the higher-level structures. A possible problem with this approach, which Roads (2015) points out, is that strict top-down planning can put too much emphasis on the higher hierarchical structure and neglect the sound material. To mitigate this, the sound material must be molded to fit within the predefined macro- and mesoform.
The next strategy is the bottom-up strategy, and it is the opposite of the top-down strategy. “It constructs form as the final result of a process of internal development produced by interactions on low levels of structure––like a seed growing into a mature plant” (Roads, 2015, p. 294). A problem with this strategy, which Roads (2015) points out, is that the surface structure can be quite complicated while lacking a rich hierarchical structure. Compositions made in this manner can lack a clear sense of beginnings, middles and endings. “These do not simply “emerge” out of most bottom-up strategies” (Roads, 2015, p. 298).
The final strategy that Roads (2015) presents he calls multiscale planning. In simple terms, it is a strategy where you work on all the different time scales within a composition at the same time and modify the form as the composition takes shape. “Multiscale planning can begin from either a top-down or bottom-up starting point. For example, one might start from a high-level conception and then modify it as specific sounds are mapped onto it” (Roads, 2015, p. 300). This has benefits, as argued by Roads: “The core virtue of multiscale planning is flexibility; it mediates between abstract high-level concepts and unanticipated opportunities and imperatives emerging from the lower levels of sound structure” (p. 299). With the multiscale strategy, composers must be analytical when they construct their compositions. “Ongoing analysis of all levels of musical form and function is important in the multiscale process, especially in the late stages of construction. Problems in a composition must be confronted directly through analysis, a process akin to debugging software” (Roads, 2015, p. 305).
Roads (2015) writes that the multiscale approach is flexible and opportunistic as a compositional strategy, mixing top-down and bottom-up strategies for structural organization. This method of organization can be compared to a heterarchy of partial systems that come into and go out of being. The multiscale approach can employ generative processes but gives the composer the right to interact, intervene, edit, and transform the material at any time.
3.4.2 Mesoform
The structures within the meso plane of the hierarchy can be designed in various forms. According to Roads (2015), common mesostructures for both instrumental and electroacoustic music are:
Repetitions––the most basic musical structure: iterations of a single sound or
group of sounds. If the iteration is regular, it forms a pulse or loop.
Melodies––sequential strings of varying sound objects forming melodies, not
just of pitch, but also of timbre, amplitude8, duration, or spatial position.
Variations––iterations of figure groups under various transformations, so that
subsequent iterations vary.
Polyphonies––parallel sequences, where the interplay between the sequences is
either closely correlated (as in harmony), loosely correlated (as in counterpoint),
or independent; the sequences can articulate pitch, timbre, amplitude, duration,
or spatial position. (Roads, 2015, p. 306)
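Roads’ four mesostructures can be sketched as operations on sequences of sound objects. Representing a sound object as a (pitch, duration) pair is my own hypothetical simplification for illustration; as Roads notes, timbre, amplitude or spatial position could equally be the varied parameter.

```python
# A short motive of sound objects as (pitch_in_semitones, duration_s) pairs.
motive = [(60, 0.5), (63, 0.5), (67, 1.0)]

def repetition(seq, times):
    """Iterate a group of sounds; regular iteration forms a pulse or loop."""
    return seq * times

def variation(seq, transpose=0, stretch=1.0):
    """Iterate the figure under a transformation, so the iteration varies."""
    return [(p + transpose, d * stretch) for p, d in seq]

def polyphony(*sequences):
    """Parallel sequences, here left independent of one another."""
    return list(sequences)

loop = repetition(motive, 4)                          # a 4x loop
answer = variation(motive, transpose=5, stretch=2.0)  # varied restatement
layers = polyphony(motive, answer)                    # two parallel lines
```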
Repetition, and variation within those repetitions, is a basic way of creating mesoform. Schoenberg states this about repetition and variation: “Intelligibility in music seems to be impossible without repetition. While repetition without variation can easily produce monotony, juxtaposition of distantly related elements can easily degenerate into nonsense, especially if unifying elements are omitted. Only so much variation as character, length and tempo required should be admitted: the coherence of motive-forms should be emphasized” (Schoenberg, 1967, p. 20). The motive that Schoenberg stresses as very important would fall in the category of the sound object time scale according to Roads (2009). Schoenberg (1967) writes that the motive should bring unity, relationship, coherence, logic, comprehensibility and fluency to a composition. Connecting “the motive” to the term sound object gives us the possibility that any sound can act as the motive and be treated with repetitions and variations just as a motive in acoustic composition, but transferred to the realm of electroacoustic composition. “The concept of sound object extends this to allow any sound, from any source” (Roads, 2001, p. 17).
It is important, in the mesoform, to give variations to the sound object (the motive). As
observed by Schoenberg (1967) “repetition alone often give rise to monotony. Monotony can
only be overcome by variation” (p. 8). A melody or a theme is part of the mesoform in a
composition, and Schoenberg points out that the structure within the theme is important. “The
organization cannot be so loose that one might feel a lack of structure” (Schoenberg, 1967, p.
103). Further he says that: “A melody, classical or contemporary, tends toward regularity,
simple repetitions and even symmetry. Hence, it generally reveals distinct phrasing. Of
course, the length of a singer’s breath is no measure for the length of a phrase in an
instrumental melody, but the number of measures in moderate tempo is likely to be about the
same as in a vocal melody” (Schoenberg, 1967, p. 103). Hence, according to Schoenberg,
when organizing the mesoform in a composition, the mesoform needs to be in several layers.
The top layer of the mesoform points out the themes in a composition and the next level in the
mesoform displays the structure within the theme itself.
Polyphony and counterpoint are fundamental in instrumental and chorale music. Schoenberg
describes counterpoint as: “the study of the art of voice leading with respect to motivic
combination (and ultimately the study of the ‘contrapuntal forms’)” (Schoenberg, 1978, p.
13). Roads (2015) proposes that polyphony or counterpoint in traditional instrumental and
vocal music refers to the use of two or more melodic lines that are independent from each
other but still work in conjunction, giving rise to a pattern of note oppositions. He further
argues that, in electroacoustic music, polyphony and counterpoint can differ from their
instrumental and vocal counterparts:
Polyphony takes different forms in electronic music. We can categorize these
according to the timescale. Polyphony on a micro timescale (as in granular
synthesis9) results in cloud10, stream11, or sound mass12 textures. While
traditional polyphony depends on a counterpoint of stable notes, a texturally rich
and mutating sound mass does not necessarily require a contrasting line to
maintain interest. Of course, one or more independent lines (such as a bass line)
playing against a sound mass can also be effective. (Roads, 2015, p. 306)
On the structural level of sound objects, polyphony is more closely connected to that of
instrumental music. “Polyphony on the level of sound objects is analogous to traditional note-
against-note-polyphony” (Roads, 2015, p. 306). Further Roads (2015) adds that polyphony in
electronic music on a sound object level is not only pitch contra pitch but also timbre contra
timbre and pitch contra noise.13 There are other types of polyphony common in
electroacoustic music that are harder or near impossible to create in instrumental music
without the help of electronics. “Other kinds of polyphony frequently heard in electronic
music include crossfading voices, repeating echoes, or reverberations that carry over other
sounds” (Roads, 2015, p. 307). Both fission and fusion of sounds are part of the possible
polyphony in electroacoustic music. “Polyphony in electronic music is also related to
processes of fission (splitting of a sound) and fusion (merging of a sound)” (Roads, 2015, p.
307).
8 Amplitude refers to the extent of change in a periodic variable within a single cycle, whether in time or space.
For a non-periodic signal, amplitude represents its magnitude relative to a reference value. (Amplitude, 2024, 5
December).
9 Granular synthesis is a type of sound synthesis that uses a technique called granulation. This process involves
dividing an audio sample into tiny fragments known as “grains,” which are usually between 1 and 100
milliseconds long. Granular synthesizers give users the ability to manipulate these grains, allowing them to
reshape and transform the original sound into unique and often surprising results. (Native Instruments, n.d.).
10 Closely related to streams are sound clouds—collections of hundreds or thousands of sound particles
controlled statistically, first described by Xenakis in 1960. Cloud textures suggest an alternative approach to
musical organization, focusing on the unfolding of musical mesostructures through processes of statistical
evolution. Cloud evolutions can occur in various domains, including amplitude (crescendo/decrescendo), internal
tempo(accelerando/rallentando), grain density (increasing/decreasing), harmonicity (pitch/chord/noise, etc.), and
spectrum (high/mid/low, etc.). (Roads, 2015).
11 A sound mass is a solid block of sound that evolves gradually, while streaming mesostructures seem to flow
quickly, like liquids. Within these fluid-like processes, sound is conceptualized as a continuous emission of
microsonic particles. (Roads, 2015).
12 Sound mass is a unified texture or monolith of sound formed by the layering of multiple sources. Its density
and opacity set it apart from the stream and cloud sound morphologies. (Roads, 2015).
13 The dictionary defines noise as an unwanted disturbance. In engineering, noise typically refers to unwanted
interference in a signal channel, such as buzz, hum, or hiss, that disrupts a meaningful message. Noise is a
natural part of many sounds, which are often a combination of pitch and noise. Noise in music is neither new nor
exclusive to electroacoustic music. Unpitched percussion such as snare drum, tom-tom, cymbals, and woodblock
produce noise, as does the scraping of a cello bow or breathy tones in woodwinds or brass (Roads,
2015).
3.5 Algorithmic, Aleatoric, Generative, and Process composition
Eigenfeldt (2016) describes generative art as works created using a system. He goes on to
write that what distinguishes generative artworks is that each run
of the system yields a new and changed result. Generative music can take many
forms and provides means through which a composition may take a different sounding form in
each performance. During the 20th century new methods of composing arose in the form of
algorithmic14, aleatoric, generative, and process driven composition. “A composition
algorithm serves as a generative engine for music creation” (Roads, 2015, p. 339). These
techniques are based on other rules than harmony and counterpoint. For the composer
working with these techniques, the algorithm or rules set up by them, can be just as much part
of the piece as the finished score itself. “The concept and rules make the work. Some would
say that the algorithms are the art” (Roads, 2015, p. 348).
There are several ways to work with algorithmic music, for instance using different serial
techniques, building on the techniques developed for structural permutation of the 12-tone
row by Arnold Schoenberg, but applied not only to pitch but also to other parameters, such as
duration, dynamics, rhythm, etc., as developed by Anton Webern, Olivier Messiaen, Pierre
Boulez and others (Roads, 2015).
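The classic row operations behind serial permutation can be sketched in a few lines of code. The tone row and the duration series below are invented for illustration; they are not taken from any of the composers named.

```python
# Toy sketch of serial permutation: the classic row operations
# (retrograde, inversion, transposition) applied to a 12-tone row,
# and the same idea reused for a duration series.

row = [0, 11, 7, 8, 3, 1, 2, 10, 6, 5, 4, 9]  # pitch classes (hypothetical row)

def retrograde(series):
    # Play the series backwards.
    return series[::-1]

def inversion(series):
    # Mirror each interval around the first pitch class, mod 12.
    first = series[0]
    return [(first - (p - first)) % 12 for p in series]

def transpose(series, n):
    # Shift every pitch class by n semitones, mod 12.
    return [(p + n) % 12 for p in series]

# The same permutations can govern other parameters, e.g. durations:
durations = [1, 2, 4, 8]            # quarter-note multiples (hypothetical)
print(retrograde(durations))        # [8, 4, 2, 1]
print(retrograde(row))
print(inversion(row))
print(transpose(row, 5))
```

Applying `retrograde` to the duration series instead of the row is exactly the extension to non-pitch parameters that Webern, Messiaen and Boulez explored.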
A different approach is to use aleatoric techniques, which means that you use chance or
randomness to create your composition or generate your musical material. This can be done
in different ways, as Roads (2015) explains: certain details of a piece would be left open to
the interpreter (in the case of instrumental music), or else they were composed according to
random or chance operations like throwing dice or coins and then mapping the outcome to a
list of corresponding notes or note patterns. Aleatoric techniques are not new to the 20th
century; they were already in use during the 18th century. “A Musikalisches Würfelspiel (German
for “musical dice game”) was a system for using dice to randomly generate music from
precomposed options” (Musikalisches Würfelspiel, 2024, 28 November).
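A minimal sketch of such a dice game follows; the table mapping each possible two-dice total to a precomposed measure is a made-up placeholder, not one of the historical tables.

```python
import random

# Toy Musikalisches Würfelspiel: two dice select one of several
# precomposed measures for each slot of an eight-bar phrase.
# The measure labels are invented placeholders.

measure_table = {total: f"measure_{total}" for total in range(2, 13)}

def roll_phrase(bars=8, rng=random):
    phrase = []
    for _ in range(bars):
        total = rng.randint(1, 6) + rng.randint(1, 6)  # roll two dice
        phrase.append(measure_table[total])
    return phrase

print(roll_phrase())  # e.g. ['measure_7', 'measure_4', ...]
```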
Lejaren Hiller, one of the pioneers of algorithmically generated music using a computer, used
mathematical models to create music, already in the 1950s (Roads, 2015). Two famous
examples are the Illiac Suite for String Quartet, written in 1956, a landmark in the history of
musical composition, as well as The Computer Cantata written in 1963. Together with John
Cage, Hiller also created HPSCHD (1969) for harpsichord and computer. Hiller and Isaacson
(1956) cited in Roads (2015), argued, with reference to the creation of the Illiac Suite that:
Music is . . . governed by laws of organization, which permit fairly exact codification.
(. . . it has even been claimed that the content of music is nothing other than its
organization.) From this proposition, it follows that computer-composed music which
is “meaningful” is conceivable to the extent to which the laws of musical organization
are codifiable. (Hiller and Isaacson, 1956, cited in Roads, 2015, p. 342-343)
Hiller’s algorithms could generate different kinds of music with the computer. As Roads
(2015) points out, musical ideas ranging from the canons of traditional counterpoint to the
principles of serial technique, using both deterministic and stochastic methods, could be coded to
create algorithmic music.
Today, composers creating algorithms to generate music work either with
graphical programs or directly with code. Roads (2015) gives these
examples of programs in use: Max/MSP, SuperCollider, PD and OpenMusic.
What is the difference between deterministic and stochastic algorithms? Roads (2015)
explains that deterministic procedures generate musical material by carrying out a fixed, rule-
based compositional task that does not make use of random selection. The supplied variables
in a deterministic procedure are called the seed data. Seed data could be a set of pitches, a
musical phrase, or some other constraints that the algorithm must obey. An example of a
deterministic procedure would be a program to harmonize a chorale melody in the style of J.
S. Bach. The seed data would be the melody, while harmonization rules taken from a textbook
would ensure that the program uses only legal chord sequences. The deterministic algorithm
would look for a solution that ticks off all the rules of the harmonization textbook. A
deterministic program is more like a calculator: if fed the same seed data, the result should
be the same every time.
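A full chorale harmonizer is far beyond a short sketch, but the defining property of a deterministic procedure, that identical seed data always yields identical output, can be illustrated with a toy rule of my own invention:

```python
# Minimal stand-in for a deterministic procedure: a fixed rule
# (transpose the seed phrase up a fifth, then retrograde it)
# applied to seed data. Same seed in, same result out -- no
# randomness anywhere. The rule and pitches are illustrative only.

def deterministic_variant(seed_pitches):
    transposed = [(p + 7) % 12 for p in seed_pitches]  # up a fifth
    return transposed[::-1]                            # retrograde

seed = [0, 4, 7, 4]  # hypothetical seed phrase (C major outline)
print(deterministic_variant(seed))  # [11, 2, 11, 7] -- every time
```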
Further, Roads (2015) explains that stochastic algorithms work by integrating random
choice into the decision-making process. A basic stochastic generator produces a random
number and compares it to values in a probability table; the algorithm then creates the event
associated with that range. By weighting the probability of certain events over others, one can
ensure an overall trend while keeping the local events unpredictable.
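The basic stochastic generator described here can be sketched as follows; the event names and weights in the probability table are hypothetical:

```python
import random

# Basic stochastic generator: draw a random number, compare it
# against the ranges of a probability table, and emit the event
# whose range it falls in.

probability_table = [("low_drone", 0.5), ("mid_texture", 0.3), ("high_grain", 0.2)]

def next_event(rng=random):
    r = rng.random()          # uniform draw in [0, 1)
    cumulative = 0.0
    for event, weight in probability_table:
        cumulative += weight
        if r < cumulative:
            return event
    return probability_table[-1][0]  # guard against rounding error

# Over many draws the weights shape the overall trend,
# while each individual event stays unpredictable.
events = [next_event() for _ in range(1000)]
print(events[:10])
```

Roughly half of the generated events will be `low_drone`, yet the local order of events cannot be predicted, which is exactly the trade-off Roads describes.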
There are several strategies that a composer can choose between when working with
generative algorithmic compositions, one being the Batch mode composition approach. Roads
(2015) explains batch mode composition in this fashion: using this method, the composer will
start by coding the algorithm. The next step in the process will be to enter the seed data,
followed by executing the program. Now the composer has a choice to either accept or reject
the output; the output of the algorithm is the composition. If they reject, they must go back
and modify the algorithm and redo the process until they are satisfied with the output of the
program. “We cannot simply correct the offending events without generating a new score.
This is because in a batch approach to automated composition, the unit of composition and
interaction are an entire piece” (Roads, 2015, p. 349). Many composers on the other hand are
not that strict and will make edits and changes to the generated material. “Xenakis treated the
output of his composition programs flexibly. In particular, he edited, rearranged, and refined
the raw data emitted by his Stochastic Music Program” (Roads, 2015, p. 350).
Heuristic algorithms are another approach that could be used. “To produce wonderful forms,
what is needed is a hybrid formal/informal approach, combining the computational power of
algorithmic control with the magical influence of heuristics. What is heuristic influence?
Heuristics is the art of experience-based strategies for problem-solving” (Roads, 2015, p.
351). The difference between heuristic and brute-force computer models is explained this way
by Roads (2015): the heuristic method stands in contrast to brute-force computer models that
count and search millions of possibilities and then make their choice based on short-term
statistics. A brute-force approach works well if the task is of a fixed-rule nature, for
example games like checkers and chess. In art, where the rules are not as fixed, a brute-force
method simply doesn’t work that well. The heuristic method makes use of rules of thumb,
educated guesses, intuitive judgements, and common sense, all of which is based on experience.
The heuristic approach understands context; the algorithm is programmed to understand the
context in a game, a composition, or even the context of a culture.
According to Roads (2015), some composers use generative methods to break away from
conventional musical structure, in the search for a fresh alternative compared to traditional
narrative formulas. But narrative is hard to escape, as Roads puts it, “the mind relates what it
has heard to the previous context and anticipates subsequent events; we inevitably react
emotionally according to how expectations are met or denied. The construction of narrative is
the human mind’s innate response to perceived process and structure” (Roads, 2015, p. 361).
One of the challenges with algorithmic compositions is to create a musical structure that is
coherent on multilayered planes. “The design of form is the ultimate test of a composition.
Many generative systems employ bottom-up strategies that do not consider the meso and
macro layers of form” (Roads, 2015, p. 361). Along similar lines, Eigenfeldt makes the
following observations regarding how form is generated in algorithmic compositions:
This difficulty is multiplied exponentially when applied to generative music:
how can one codify structural decisions when many of these decisions are
aesthetic in nature? For example, interactive systems allow the composer to
determine when to move to the next section, or when to alter a process, based
upon choices informed by context – how long has the current section been going
on? – and aesthetics – is the material starting to lose interest? Different surface
features (i.e. the musical context) will engender different decisions; codifying
such processes suggests the need for computational aesthetic evaluation, a
highly complex notion that remains an open problem. (Eigenfeldt, 2016, p. 1-2)
An algorithmic method can be of help for the composer in the composition process. “Any
thought experiment involving musical process can be designed, coded, and tested” (Roads,
2015, p. 365). Continuing on the subject of algorithmic composition programs, Roads (2015)
writes that such a program could handle more organizational detail than would be possible for
a human composer to handle. This gives the composer the possibility to pay less attention to
arcane details that could be handled by the program according to the instructions from the
composer and lets the composer focus on a higher level of abstraction. “At this level, the
composer manages the meta-creation of the piece in terms of its process model” (Roads,
2015, p. 365).
14 The term "algorithm" is commonly used today to refer to a set of rules or procedures a machine, particularly a
computer, follows to achieve a specific goal. However, it is not limited to computer-related tasks. The term can
just as accurately describe the steps involved in making a pizza or solving a Rubik's Cube as it does in computer-
based data analysis.
Algorithms are often paired with terms that specify the activity for which they are designed. For instance, a
search algorithm is a procedure used to determine what kind of information is retrieved from a large data set,
while an encryption algorithm refers to the rules used to encode information so that unauthorized individuals
cannot read it.
Though "algorithm" first appeared in the early 20th century and was primarily used in mathematics and
computing, the term has a surprisingly deep history. It originates from "algorism," which refers to the system of
Arabic numerals. This word, which dates to Middle English, ultimately comes from the name of the 9th-century
Persian mathematician, Abu Jaʽfar Mohammed ibn-Mūsa al-Khuwārizmi, who made significant contributions to
algebra and numeric systems. (Merriam-Webster, n.d.).
3.6 Moment form
Moment form is a form concept introduced by Stockhausen with his pieces Kontakte (1960),
Carré (1960) and Momente (1965). Roads (2015) explains that with these pieces Stockhausen
introduced a paradigm called moment form, where a piece unfolds as a succession of
unrelated episodes or moments.
Chang (n.d.) explains that in moment form the moments are regarded as free-standing from
each other. As a consequence, the composition does not have to be based on the
forward development of a basic thematic moment, which makes the sequencing of the
moments non-linear.
In his lecture Four Criteria of Electronic Music (1972, cited in Roads, 2015), Stockhausen
observes how these forms do not aim to reach a climax, and the structure in moment form does
not contain the usual development that can be expected of a conventional composition: the
introductory, rising, transitional, and fading stages. The difference is that these new forms are
intense already from the start; they are forms in a state of having already been originated.
Every present moment is of importance, and a given moment is neither the consequence of the
previous one nor the prelude to the next moment. They are individual, independent, and
centered in themselves.
Regarding the macroform in moment form, Roads (2015) explains that there is no overarching
formal direction, from the start of the composition, which sounds like the middle of a piece and
has no introductory function, to its ending. “Each episode is a kind of
non sequitur––a separate vignette that does not particularly follow from the previous vignette”
(Roads, 2015, p. 314). Eigenfeldt adds further perspectives by suggesting that, in moment
form compositions:
A moment is comprised of a static entity – for example, a single harmony;
moments avoid development and goal-directed behaviour, although the potential
for processes to provide variation in the surface design is possible. Subsequent
moments are contrasting, often dramatically, with one another, as their internal
organisation and concerns must be different; as a result, changes between
moments result in what Kramer refers to as discontinuity. (Eigenfeldt, 2016)
Further, Eigenfeldt (2016) writes that each moment contains its own structure and can
contain a great deal of variation on its surface. But these variations are not
allowed to contribute to a behavior that is directed towards a goal. The combination of
contrasting moments is what gives the overarching structure in moment form.
3.7 Fibonacci sequence and the Golden Section Proportions
Hultqvist (2013) explains that the Fibonacci sequence, invented by Leonardo of Pisa (ca.
1170-1240), is a number sequence following the principle of adding the two previous numbers
to obtain the next. Hence, if the sequence starts on 0, then you add 0+1 and get 1, move
forward with 1+1 and get 2, then 1+2 and you will get 3 and so on. The initial sequence is
then: 0 1 1 2 3 5 8 13 21 34 55 89 144 233 etc. Livio (2002) shows that, higher up in the
Fibonacci sequence, the ratio between successive numbers comes very close to the Golden
Section proportions.
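The additive rule, and the convergence of the ratio toward the Golden Ratio, can be verified with a short sketch:

```python
# Fibonacci sequence by the additive rule described above: each
# number is the sum of the two previous ones. The ratio between
# successive terms approaches the Golden Ratio (about 1.618).

def fibonacci(n):
    seq = [0, 1]
    while len(seq) < n:
        seq.append(seq[-1] + seq[-2])
    return seq

seq = fibonacci(15)
print(seq)                # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377]
print(seq[-1] / seq[-2])  # ~1.618, close to the Golden Ratio
```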
One of the interesting parts of the Fibonacci sequence is its fractal properties that Livio (2002)
explains as self-similarity, referring to symmetry across size scales. He describes how the
Steinhardt-Jeong model for quasi-crystals produces long-range order without resulting in a
fully periodic crystal, a general property which is also found in the Fibonacci sequence. Livio
(2002) continues by showing an algorithm for the creation of a sequence known as the Golden
Sequence. It starts with the number 1, which is then replaced by 10. From there on, each 1 is
replaced by 10 and each 0 by 1 (see Fig. 3.9).
As pointed out above, the Golden Sequence is closely related to the Fibonacci sequence, and
the logarithmic spiral displays self-similarity in the algorithm since it looks exactly the same
under any magnification. The fractal properties in the Golden Sequence are easy to see and
the sequence is easy to continue. By adding the first row 10 after the second row 101 we get
the third row of 10110. In the same way if we add the second row 101 after the third row
10110, we get the fourth row of 10110101. In this way the sequence can go on forever while
still keeping its fractal properties. The ratio of the numbers will approach the Golden Ratio as
the sequence is extended.
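The rewriting algorithm can be sketched directly as string substitution, which also makes the Fibonacci row lengths and the concatenation property easy to check:

```python
# Golden Sequence by string rewriting: start with "1", then on each
# pass replace every 1 with 10 and every 0 with 1. The row lengths
# follow the Fibonacci numbers, and each row equals the previous
# row followed by the one before it.

def golden_sequence(rows):
    s = "1"
    out = [s]
    for _ in range(rows - 1):
        s = "".join("10" if c == "1" else "1" for c in s)
        out.append(s)
    return out

for row in golden_sequence(6):
    print(row)
# 1
# 10
# 101
# 10110
# 10110101
# 1011010110110
```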
The Golden ratio can also be seen as a line, and Livio (2002) describes how the line (see Fig.
3.10) that stretches from A to B is without doubt longer than the segment AC, but the segment
AC is longer than CB. If the ratio of the length of AC to that of CB is the same as the ratio of AB
to AC, then the line has been divided in the Golden Ratio.
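Writing the division out algebraically, with CB normalized to 1 and AC = x, makes the Golden Ratio appear directly:

```latex
\frac{AB}{AC} = \frac{AC}{CB}
\;\Longrightarrow\;
\frac{x+1}{x} = \frac{x}{1}
\;\Longrightarrow\;
x^{2} - x - 1 = 0
\;\Longrightarrow\;
x = \frac{1+\sqrt{5}}{2} \approx 1.618
```

The positive root is the Golden Ratio, Phi; its inverse, approximately 0.618, is the multiplier used in the calculations below.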
Further Livio (2002) says that the logarithmic spiral and the Golden Ratio go hand in hand.
This can be seen by connecting the successive points where these whirling squares divide the
sides in Golden Ratios. A logarithmic spiral that coils inward toward the pole will be created.
This is also true for a Golden Triangle “an isosceles triangle in which the side is in Golden
Ratio to the base” (Livio, 2002, p. 119).
Livio (2002) also shows that the Golden Ratio can be used to create musical form.
Referencing the Hungarian musicologist Ernő Lendvai, he describes how the Golden Ratio
appears in Béla Bartók’s piece Music for Strings, Percussion and Celesta. This piece has 89
measures in total and is divided into two parts of 55 and 34 measures. The first part is divided
into 34 and 21 by the removal of the strings’ mutes. The second part is divided into 13 and 21
by the strings putting on the mutes. All of these measures correspond to the Fibonacci
sequence. However, he also adds that other musicologists do not support Lendvai’s analysis,
and the musicologist László Somfai entirely discounts that Bartók would have used the
Golden Ratio consciously when composing.
The web series Sound Field (2019) presents a method through which the large-scale
proportions of a musical work, or, as they call it, its Phi moment, can be calculated. They
explain that the Golden Ratio in a musical work can be found by multiplying the length of
the song by 0.618, that is, the inverse of Phi. As an example of a Phi moment, they present
the song Under Pressure by Queen and David Bowie. The song is 246 seconds long;
multiplied by 0.618, this gives us that the Golden Ratio of the song occurs at 152 seconds, at
which point the song reaches a big climax. Sound Field (2019) do not claim that this was
calculated by Queen and David Bowie, but rather that we as humans inherently feel where
this moment should be.
This equation works with both time and measure counts to find the proportions of a length
divided by the Golden Ratio. This can be shown by applying the equation to the Fibonacci
numbers found in figure 3.11; using the equation in this way is my own idea.
89 x 0.618 = 55.002
55 x 0.618 = 33.99
34 x 0.618 = 21.012
21 x 0.618 = 12.978
13 x 0.618 = 8.034
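The repeated multiplications above can be expressed as a small function; the same rule covers both the Sound Field example in seconds and the Fibonacci measure counts:

```python
# Phi-moment calculation as described by Sound Field (2019):
# multiply the total length by 0.618, the inverse of Phi.
# Works for lengths in seconds or in measures alike.

def phi_moment(total_length):
    return total_length * 0.618

print(round(phi_moment(246)))  # 152 -- Under Pressure, in seconds
for measures in (89, 55, 34, 21, 13):
    print(measures, "->", round(phi_moment(measures), 3))
```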
The Golden Ratio can be used to create balance in the form of the music. According to Livio
(2002), the Golden Ratio can in principle contribute to the satisfaction in a piece of music
through the concept of proportional balance. In music this is trickier than in the visual arts. A
painting with clumsy proportions will stick out instantly at an exhibition, but in music we
must hear the entire piece before we can make a judgment about its proportional balance. An
experienced composer designs the form so that it is in perfect balance in each part and so that
the individual parts in themselves provide a balanced musical argument.