There are (at least) two notions of nonlinearity related to this text. The first is the notion of narrative nonlinearity outlined in the texts on time-states and memorylessness.
The second notion of nonlinearity refers to nonlinear dynamics and is borrowed from physics. Nonlinear systems, commonly referred to as complex systems or dynamical systems, are a broad category that encompasses the majority of the systems encountered in real life. Their common characteristic is their inherent complexity, which renders their long-term behaviour unpredictable, as small variations in their initial conditions can lead to qualitatively different outcomes. As a whole, they are most easily characterized by contrast with linear systems. The standard textbook Nonlinear Dynamics and Chaos: With Applications to Physics, Biology, Chemistry, and Engineering (1994) by Steven H. Strogatz attempts exactly that:
“Why are nonlinear systems so much harder to analyze than linear ones? The essential difference is that linear systems can be broken down into parts. Then each part can be solved separately and finally recombined to get the answer. In this sense a linear system is precisely equal to the sum of its parts.
But many things in nature don’t act this way. Whenever parts of a system interfere, or cooperate, or compete, there are nonlinear interactions going on. Most of everyday life is nonlinear, and the principle of superposition fails spectacularly. If you listen to your two favorite songs at the same time, you won’t double the pleasure!”
Some examples of nonlinear systems presented by Strogatz include the pendulum, biological oscillators (such as neurons and heart cells), chaos, neural networks, economics, earthquakes, general relativity, turbulent fluids and life itself. Some examples of linear systems, according to Strogatz, include growth, decay or equilibrium processes (such as exponential growth, radioactive decay and RC circuits), solid-state physics, molecular dynamics, equilibrium statistical mechanics, the wave equation and acoustics, heat and diffusion, electromagnetism, and quantum mechanics.
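To make the sensitivity to initial conditions concrete, here is a minimal sketch (my own illustration, not taken from Strogatz) using the logistic map, one of the simplest textbook nonlinear systems: two trajectories that start almost identically soon become qualitatively different.

```python
# Minimal sketch: the logistic map x_{n+1} = r * x_n * (1 - x_n) with r in the
# chaotic regime. Two nearly identical initial conditions diverge completely.

def logistic_map(x0, r=3.9, steps=50):
    """Iterate the logistic map and return the full trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

a = logistic_map(0.200000)   # one initial condition
b = logistic_map(0.200001)   # a difference of one part in a million
for n in (0, 10, 25, 50):
    print(n, round(a[n], 4), round(b[n], 4))  # the trajectories decorrelate within a few dozen steps
```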
Complex dynamical systems are characterized by constant change, which occurs both through interaction with the external environment and through self-organization. Self-organization takes the form of feedback loops, meaning that the output of the system is fed back into the system as input. But nonlinear systems do not necessarily involve feedback: even a simple system that involves a threshold is nonlinear.
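As a minimal sketch of this last point (the threshold value below is arbitrary and purely illustrative), a system that only responds above a threshold already violates superposition, the defining property of linear systems:

```python
# Minimal sketch: a threshold system is nonlinear because superposition fails,
# i.e. f(a + b) is not equal to f(a) + f(b).

def threshold(x, level=1.0):
    """Return 1.0 if the input reaches the threshold, otherwise 0.0."""
    return 1.0 if x >= level else 0.0

a, b = 0.6, 0.6
print(threshold(a) + threshold(b))  # 0.0 -- each input alone stays below the threshold
print(threshold(a + b))             # 1.0 -- together they cross it
```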
Manuel DeLanda, in A New Philosophy of Society: Assemblage Theory and Social Complexity, identifies two types of causality within social mechanisms, namely nonlinear causality, 'defined by thresholds below or above which external causes fail to produce an effect', and probabilistic causality.
“social mechanisms must include the full variety of causal interactions, that is, they must take into account that the thresholds characterizing nonlinear causality may vary from one actor to another (so that the same external cause may affect one but not the other) and that causal regularities in the behaviour of individual actors are, as Weber himself argued, only probabilistic. Statistical causality is even more important when we consider populations of actors. Thus, in the case of explanation by motives, we may acknowledge that individual actors are capable of making intentional choices, and that in some cases such intentional action leads to the creation of social institutions (such as the written constitutions of some modern nation-states), while at the same time insist that the synthesis of larger social assemblages is many times achieved as the collective unintended consequence of intentional action, that is, as a kind of statistical result."
These ideas related to the behaviour of nonlinear systems are mirrored in my experience of working with improvising ensembles and composing music for improvisation. I find the sensitivity to initial conditions, as well as the notions of determinism and unpredictability, to be inherent in improvised music. In my piano works, I have been placing materials such as chopsticks, knitting needles and mikado sticks between the piano strings, where they resonate by means of friction. By modifying the pressure of my fingers while moving along the sticks in an improvisatory manner, I can morph between noisy states of disorder and different equilibrium positions, highlighting different overtones of the strings.
I have worked systematically with chaotic systems in Athroa, a large-scale piece for prepared piano, electronics and light installation, written for my duo with synthesist Egil Kalman. The light installation was programmed to follow the trajectories of chaotic systems. In each iteration of the piece, the chaotic systems produced patterns that were similar in their ‘musical quality’ but never identical. No two performances of the work were ever the same, although one could identify the same elements and sections in the overall form. The systems were thus very well suited to improvisation, and the work was based on the improvisatory interaction between the duo and the light installation.
My work is informed by traditional practices in electronic music, and especially by composing with analogue synthesizers, where creating chaotic systems by means of feedback is a common and well-established technique. In the liner notes of the album Homage to Dick Raaijmakers by Thomas Ankersmit, one reads that 'he plays his Serge Modular synthesizer as a kind of weather system.'
At this point the reader may wonder where exactly the nonlinearity comes into play, given that the wave equation, and acoustics in general, are considered to be linear systems. Here we have two separate systems: the sound propagation system and the sound generation system. While the sound propagation system is linear, obeying the principle of superposition, whereby the result of two waves propagating in the air is simply their sum, the sound generation system is nonlinear. The medium is linear, but the source of the sound is nonlinear: it involves interaction, thresholds and/or feedback.
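A minimal sketch of this distinction (the clipping function below is only a stand-in for the friction and threshold behaviour of a real source, not a physical model of the prepared piano): a linear operation applied to the sum of two waves equals the sum of the operations, while a nonlinear source does not.

```python
# Minimal sketch: superposition holds for a linear medium but fails for a
# nonlinear source (here represented by hard clipping).

import numpy as np

t = np.linspace(0, 1, 1000)
wave1 = np.sin(2 * np.pi * 5 * t)
wave2 = np.sin(2 * np.pi * 7 * t)

def propagate(w):
    """Linear propagation, e.g. simple attenuation."""
    return 0.5 * w

def source(w):
    """Nonlinear source behaviour, represented here by a hard clip (threshold)."""
    return np.clip(w, -0.8, 0.8)

# Linear: the response to the sum equals the sum of the responses.
print(np.allclose(propagate(wave1 + wave2), propagate(wave1) + propagate(wave2)))  # True
# Nonlinear: the clipped sum is not the sum of the clipped waves.
print(np.allclose(source(wave1 + wave2), source(wave1) + source(wave2)))           # False
```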
Nonlinearity and predictability
Nonlinearity breaks proportional causality and makes the notion of prediction local. You cannot ‘aim’ if you cannot foresee, and teleology loses its foundation. The system’s sensitivity to initial conditions is what breaks path dependence and renders the process inherently unpredictable. This sensitivity lies at the core of my practice, which rests on foreseeing the next improvisatory micro-gesture from the slightest variation in the initial conditions (the previous micro-gesture). While I attempt this prediction of the next three to five seconds of gestural sonic content, the system is already unstable, producing artefacts that alter the space of predictions in real time.
Here nonlinearity is related to predictability in both senses, namely narrative nonlinearity and dynamical nonlinearity. It emanates from the nonlinearities inherent in the sound sources (the materials I attach inside the piano) in relation to my performative gesture, while at the same time contributing to the dismantling of anticipation and of the linear narrative.
It is a continuous process of iteratively predicting that which is inherently unpredictable, and this is central to both my practice and this project. I aspire to create spaces for empathetic listening, where one becomes attentive to the complexities of sound, reflects on how subtle changes in the conditions of complex systems can give rise to unexpected events, and, while in this state of memorylessness, experiences multiple and expanded temporalities.
Nonlinearity and Machine Learning
Machine learning models based on neural networks are trained to learn complex relationships between inputs and outputs (mappings) and to learn complex distributions of training data. To model such complex structures, these models rely on nonlinearities built in across all their components, from individual nodes to higher-level mechanisms such as attention. For example, threshold-like activation functions (recalling the threshold mentioned in the text by DeLanda above) are among the most common nonlinear functions added to each layer of a neural network. Without such nonlinearities, these models would be able to learn only simple linear mappings, which very rarely correspond to the complexity of real-life data. So neural networks are inherently nonlinear and unpredictable, while at the same time they are often used to predict the future behaviour of complex systems.
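The point about linear mappings can be made concrete with a minimal sketch (toy dimensions and random weights, purely illustrative): without a nonlinearity between them, two stacked linear layers collapse into a single linear map, whereas inserting a threshold-like activation such as the ReLU breaks that collapse.

```python
# Minimal sketch: stacked linear layers without an activation are still linear.

import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 8))
W2 = rng.standard_normal((8, 3))
x = rng.standard_normal(4)

two_layers = (x @ W1) @ W2        # two linear layers, no nonlinearity in between
one_layer = x @ (W1 @ W2)         # a single linear layer with the combined weights
print(np.allclose(two_layers, one_layer))  # True -- the network is equivalent to one linear map

def relu(z):
    """A threshold-like nonlinearity: pass positive values, zero out the rest."""
    return np.maximum(z, 0.0)

print(np.allclose(relu(x @ W1) @ W2, one_layer))  # False -- the nonlinearity breaks the collapse
```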
In attention blocks, the core building blocks of transformers (the dominant modern neural network architecture), nonlinear functions such as the softmax break proportionality and introduce competition between previous states.
More in depth on Nonlinearity and Attention (optional reading)
Moreover, the attention mechanism breaks linearity at a higher conceptual and architectural level, and it is precisely the nonlinearity it introduces that makes transformers significantly more powerful than previous architectures. In earlier architectures, namely Recurrent Neural Networks (RNNs), and within the context of large language model training, long-term dependencies - that is, dependencies between two distant words - could only be learned by propagating information linearly across all intervening words. This introduces what can be called a linear interaction distance between the two words. This linear interaction distance creates computational constraints that prevent the model from effectively learning long-term dependencies. At the same time, this linear indexing of the sequence requires that the computations for all previous states be completed before the computation for the current state can occur. This problem is known as dependence on time, and it makes any notion of parallelizability in the computations inapplicable.
[Note: in computer science, parallelizability is the extent to which a computation can be split into parts that are performed simultaneously, for example on parallel hardware.]
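A minimal sketch of this sequential bottleneck (toy dimensions and untrained weights, purely illustrative): in an RNN, each state can only be computed after the previous one, so information between distant words must pass through every intervening step.

```python
# Minimal sketch: an RNN must process the sequence strictly in order.

import numpy as np

rng = np.random.default_rng(1)
W_h = 0.1 * rng.standard_normal((8, 8))   # state-to-state weights
W_x = 0.1 * rng.standard_normal((8, 8))   # input-to-state weights
sequence = rng.standard_normal((20, 8))   # 20 "words", each an 8-dimensional vector

h = np.zeros(8)
for x_t in sequence:                      # strictly sequential: no parallelism across time
    h = np.tanh(W_h @ h + W_x @ x_t)      # the nonlinearity (tanh) is applied at every step
# The final state depends on the first word only through 20 successive transformations.
print(h.shape)
```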
Attention within transformers breaks the notion of dependence on time by introducing a mechanism in which memory is relevance-based rather than time-based, and operates across multiple timescales at the same time. For each element in a sequence, the system can ‘attend to’ any previous element based not on temporal adjacency but on semantic similarity. Attention replaces temporal decay with semantic relevance, allowing elements to carry different degrees of significance across contexts, enabling parallelizability and the coexistence of multiple simultaneous temporalities.
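A minimal sketch of this mechanism (a single attention head with random vectors and no learned projections, purely illustrative): the softmax turns pairwise similarities into competing weights, each position mixes all allowed previous positions according to relevance rather than adjacency, and every position can be computed in parallel.

```python
# Minimal sketch: causal scaled dot-product attention over a toy sequence.

import numpy as np

def softmax(z):
    """Row-wise softmax: the nonlinearity that makes attention weights compete."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(2)
n, d = 20, 8
queries = rng.standard_normal((n, d))
keys = rng.standard_normal((n, d))
values = rng.standard_normal((n, d))

scores = queries @ keys.T / np.sqrt(d)              # similarity between every pair of positions
mask = np.triu(np.ones((n, n), dtype=bool), k=1)    # block attention to future positions
scores = np.where(mask, -np.inf, scores)
weights = softmax(scores)                           # relevance-based weights, each row sums to 1
output = weights @ values                           # each position: a weighted mixture of previous ones
print(output.shape)                                 # (20, 8) -- computed for all positions at once
```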
For definitions of the ideas of linear interaction distance and dependence on time, as well as a detailed exposition of attention and transformers, I refer the reader to the outstanding lecture Self-Attention and Transformers, Lecture 8 in the 2023 iteration of the course CS224N: NLP with Deep Learning, given at Stanford University and accessible via Stanford Online:
https://www.youtube.com/watch?v=LWMzyfvuehA&list=PLoROMvodv4rOaMFbaqxPDoLWjDaRAdP9D&index=11
Autoregressive inference
When I wrote my project description, I deliberately chose to talk broadly about nonlinear systems, knowing that the term encompasses a huge set of systems. My initial sense was that I would work mostly on chaotic systems, in the lineage of the electronic music tradition and as a natural continuation of my previous work. Nevertheless, I purposefully wanted to keep the field of possibilities very broad, thus not excluding other nonlinear systems. Indeed, very early in my process, I identified connecting threads between my project and algorithms for machine learning and inference that shifted my interest towards machine learning. Looking back now, it feels to me that investigating complexity through the lens of present-day inference was a natural evolution of my project.
This goes back again to the granular nature of my improvisational practice already described above. For each gesture or event, I am working in real time within a precarious and stochastic state, where slight gestural variations give rise to unexpected sonic artefacts. In this context I need to imagine multiple potential futures and make real-time decisions about the next incremental gesture, orienting myself toward one of these possible outcomes. I find this process of prediction to be closely related to the process of autoregressive inference in machine learning models.
Autoregressive inference in machine learning can be understood as the generative process in which each new element (for example, each new token) is generated solely on the basis of the current context representation. Rather than accumulating information through a linear history, the current state distills all the information necessary for predicting the next state, made available through attention. All previous relevant relations are folded into the present state. In this sense the model is conditionally memoryless, in that the prediction of the next state depends only on the current state.
[Note: The prediction of the next state depends only on the current state due to the Markov property assumption. All memoryless processes are Markovian but not all Markov processes are memoryless.]
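A minimal sketch of such a generation loop (the 'model' below is a stand-in that returns random probabilities rather than a trained network, and the toy vocabulary is my own): at every step, the next element is sampled from a distribution conditioned only on the current context, which is then folded into the new current state.

```python
# Minimal sketch: autoregressive generation, one element at a time.

import numpy as np

rng = np.random.default_rng(3)
vocabulary = ["rest", "attack", "friction", "overtone", "silence"]  # toy vocabulary

def model(context):
    """Stand-in for a trained network: map the current context to a
    probability distribution over the next element."""
    logits = rng.standard_normal(len(vocabulary))
    return np.exp(logits) / np.exp(logits).sum()

context = ["attack"]
for _ in range(6):
    probs = model(context)                          # predict from the current state only
    next_element = rng.choice(vocabulary, p=probs)  # sample the next element
    context.append(str(next_element))               # fold it into the new current state
print(context)
```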
This structure mirrors the idea of time-states improvisation, where the past does not persist as a linear history but as a configuration of constraints, tendencies, and affordances that shape the present moment, much like semantic similarity within attention. In this sense, it relates to the type of improvisation that is central to my practice, one which unfolds not through the linear accumulation of memory, but through successive present states.
