s l o w i n g   d o w n   ~   z o o m i n g   i n   ~   l i s t e n i n g

Listening to the one second of the idle kernel in real 1:1 time is a burst of noise.

It's quite a nice burst, but only because it was sonified so. In this sonification, each event in the dataset is represented by an extremely short burst of filtered noise. With our limited sample rate for digital sounds and our limited perception of very high frequencies, we humans cannot fully appreciate the dataset in real time when it is represented this way. Most of the information is lost to us, both because the sampling frequency of 44 100 Hz that we used is much lower than the sampling frequency of the dataset, 1 000 000 Hz, and because sounds this short are impossible for the human ear to separate. We need the playback to slow down so that we have time to appreciate it, and so that the digital sounds have enough samples to form before they disappear.

Using only the timestamp, the process id and the CPU core number from the original data, we can listen to it being played back slower and slower starting at 10 times the original duration and halving the speed every time. Every event is still represented by the same kind of filtered noise burst as before. A continuous sound means a number of microseconds saw the same activity. A bell sound, which as previously discussed has been used to mark, emphasise or enlarge silences in music, meditation and worship, marks each time we start over from the beginning.
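This playback scheme can be sketched in a few lines of Python. This is an illustration only, not the code used for the piece: the 5 ms burst length and the raised-cosine envelope are our assumptions here, and the bell marking each restart is left out.

```python
import numpy as np

SR = 44_100             # output sample rate, Hz
TRACE_RATE = 1_000_000  # the dataset has microsecond resolution

def render_pass(event_times_us, stretch, burst_ms=5.0, seed=0):
    """Render one playback pass: each event becomes a short noise
    burst placed at its time-stretched position."""
    rng = np.random.default_rng(seed)
    burst_len = int(SR * burst_ms / 1000)
    # a raised-cosine envelope keeps the bursts free of extra clicks
    env = 0.5 * (1 - np.cos(2 * np.pi * np.arange(burst_len) / burst_len))
    out_len = int(max(event_times_us) / TRACE_RATE * stretch * SR) + burst_len
    out = np.zeros(out_len)
    for t_us in event_times_us:
        start = int(t_us / TRACE_RATE * stretch * SR)
        out[start:start + burst_len] += env * rng.uniform(-1, 1, burst_len)
    return out

# start at 10x the original duration, then halve the speed each pass
stretches = [10 * 2 ** k for k in range(8)]   # 10, 20, 40, ..., 1280
```

Rendering the trace once per entry in `stretches`, with a bell between passes, gives the slower-and-slower sequence described above.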

The timescale of 170 000 times the original duration seems to be the point where every event, now 170 ms long, can be heard, and where the sequence of events is not so hectic as to be stressful in itself. At this pace, the whole trace of the original one second of inactivity will take almost two whole days to play.

The synthesis process consists of filtered noise (for the <idle> process, see a trace of a system doing nothing for more details) and sine waves (for any other process). The pitch is a result of the sum of all of the processes responsible for the function calls within that microsecond, allowing us to hear patterns of activity in the pitch content. Loudness and the sharpness of attacks are modulated by the number of function calls within the microsecond. The 8 CPU cores are panned from left to right and are all fed into a high feedback delay network acting as a reverb to give a sense of the vast magnification being performed.
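A sketch of this mapping, for illustration only: the exact frequency formula, the amplitude scaling and the pid of the <idle> process are our assumptions here, not details taken from the piece.

```python
NUM_CORES = 8
IDLE_PID = 0  # assumption: the <idle> process reported as pid 0

def bucket_params(calls):
    """Map one microsecond's worth of (pid, core) function-call
    records to synthesis parameters."""
    pids = {pid for pid, _ in calls}
    idle = pids == {IDLE_PID}
    # pitch from the sum of responsible pids: few pids -> few, quantised pitches
    freq = min(100 + sum(pids), 4000)       # cap keeps clear of piercing highs
    amp = min(1.0, len(calls) / 50)         # more calls -> louder, sharper
    # spread the cores across the stereo field, -1 (left) .. +1 (right)
    pan = sum(2 * c / (NUM_CORES - 1) - 1 for _, c in calls) / len(calls)
    return {"source": "noise" if idle else "sine",
            "freq": freq, "amp": amp, "pan": pan}
```

Feeding each bucket's parameters to a sine or filtered-noise voice, panned per core into a shared delay network, reproduces the overall mapping described above.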

With these qualities of the digital in mind, we might expect to find a clear-cut and simple experience of undisturbed software at the other end of our sonic microscope. Listening from activity burst to activity burst, however, we hear both recurrence and variation in the details. This is a small manifestation of the complexity of software, which has grown so vast that it is impossible to understand through reasoning alone, giving rise to branches of software development such as chaos engineering (Basiri et al., 2016).

Coming back to the questions in the introduction, if we accept that software can be calm based on the lack of work it is asked to perform, we have shown that there are ways to experience this. In order to collect the data for this experiment, we used a software tool for monitoring software, showing that the Linux kernel does have tools built in especially for analysing its own activity.

In the section a trace of a system doing nothing we described how a dataset representing the activity of a laptop during one second was collected, the activity measured being Linux kernel function calls. In this section we will present ways to experience that dataset through sonification.

More specifically, we will search for a way for us to experience the dataset, which describes a situation outside of human reference points, as tranquil in itself.

Do you hear the tranquillity of the data yet? As the trace becomes more stretched out, what was originally just a click is revealed to be a whole sequence of short clicks. Silence is revealed between bursts of activity. Yet even at the maximum magnification, 1280 times the original duration, the intensity of the sudden bursts of sound makes it challenging to hear the sonification as tranquil.

"[The computer's] operations occur at a rate and in a space vastly different to the realm of our direct perceptual experience."


Jon McCormack and Alan Dorin


Human hearing of pitch is often said to go down to 20 Hz, meaning that the time between two oscillations is 50 ms, but we have to go down to around 8 Hz for short repeated sound events to stop sounding continuous. Pulse, meter and rhythm happen between around 8 and 0.12 Hz (Roads 2004, p. 17). The effect of forward masking means that the onset of one sound masks another if they are closer together than ca 200 ms, but the ear is more sensitive in certain spectra and we can have slightly shorter events if we keep our pitches within such spectra (Roads 2004, p. 23).
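These thresholds are easiest to grasp as periods rather than frequencies; a trivial conversion makes the numbers concrete:

```python
def period_ms(freq_hz):
    """Length of one cycle, in milliseconds."""
    return 1000.0 / freq_hz

period_ms(20)    # 50.0  ms: the usual lower limit of pitch perception
period_ms(8)     # 125.0 ms: slower than this, repetitions separate into rhythm
period_ms(0.12)  # ~8333 ms: the slow end of pulse and meter
```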

The sounds generated have a number of features that sound noticeably digital. Their pitch, timbre and duration are precise, consistent, discrete and quantised. There is a clear boundary between non-continuous events. In reality, this boundary is on a far smaller time scale than the microsecond (CPU clock speeds generally being in the order of GHz), but the concept is an important feature of the digital, and the nature of software execution. Impulses join impulses of sound at a constant precise rate, hour after hour, relentlessly ticking time away. Pitches are limited to a tiny subset of all possible frequencies, quantised to the few process id numbers and their combinations. The timbre is consistently the same mathematically simple sine or noise based ‘beep’, with the exception of the sounds representing the special <idle> process which are given a noisier timbre.

Some of the sonic features have been softened for the benefit of the intended human listener: the pitches avoid the extreme high piercing range, a smooth attack and decay is added to each event, and there is a long reverb-like delay network creating the impression of a great church. This last feature has the symbolic function of creating a sense of magnification, and perhaps of the larger software and hardware context in which the function calls in the dataset exist. In addition, it smooths out the sonic experience, which could otherwise have felt quite hard and unnatural, as all sounds do without a space in which to sound.
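A reverb-like delay network of this kind can be approximated by a classic parallel comb-filter design. A toy version follows; the delay lengths and feedback gain here are invented values, not those of the piece.

```python
import numpy as np

SR = 44_100

def comb(x, delay, feedback, tail):
    """Feedback comb filter: y[n] = x[n] + feedback * y[n - delay]."""
    y = np.zeros(len(x) + tail)
    y[:len(x)] = x
    for n in range(delay, len(y)):
        y[n] += feedback * y[n - delay]
    return y

def toy_reverb(x, feedback=0.85, tail=2 * SR):
    """Parallel combs with mutually prime delay lengths; the high
    feedback gives a long, churchlike tail."""
    delays = [1601, 1687, 2053, 2251]   # in samples; invented values
    return sum(comb(x, d, feedback, tail) for d in delays) / len(delays)
```

Mutually prime delay lengths keep the echoes from piling up on a common period, so the tail decays as a diffuse wash rather than a flutter.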

Having written all the code, dialled in the numbers and filled the buffers, hearing this come out of the speakers was awe-inspiring. The sections are unexpectedly rich in rhythmically interesting material and there are clear recurring patterns with variations. Listening to these sounds felt like being granted access to a new hidden realm inside the computer, and all of this richness took place in just a single second.

It is up to each listener to discover if they hear tranquillity in any of these sonifications of an idle computer system, and how they choose to listen to them. With that in mind, the activity and richness found within one of the most silent and tranquil states of a normal laptop software system has some similarities to the meaningful and active silences described in silence as we hear it. There is a lot to listen to in a software silence. Likewise, at the time scale where we can hear every event there are long periods of absolutely nothing. When listening to the slowest sonification, where one microsecond is expanded into 170 milliseconds, long stretches of acoustic silence are framed by sections of software activity. A two-day-long combined analogue/digital meditation.

Clicks and silence here allow for the greatest temporal accuracy, but great temporal accuracy is not necessarily the best way to hear the trace. Maybe we are more interested in the overall level of activity. As discussed in silence as we hear it there is no absolute silence for a human being to hear. We, the authors, often find our silences in calm sounds of nature: water flowing through a creek, sea waves or the wind through trees, all different kinds of filtered noise. What if we apply a method somewhat analogous to dithering where filtered noise is made to follow the level of activity in the trace? In this version, a synthesised wind slowly takes over from the clicks as duration is increased, covering one silence with another.
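One way to realise this wind is sketched below. This is only an approximation under assumptions: the smoothing time, the one-pole low-pass coefficient and the normalisation are invented here, and real wind synthesis would be richer than a single filtered noise source.

```python
import numpy as np

SR = 44_100

def wind_from_activity(calls_per_us, stretch, smooth_s=0.5, seed=0):
    """Filtered noise whose loudness follows a smoothed version of the
    per-microsecond call counts, instead of a click per event."""
    rng = np.random.default_rng(seed)
    n_out = int(len(calls_per_us) / 1_000_000 * stretch * SR)
    # stretch the activity curve onto the output timeline
    env = np.interp(np.linspace(0, len(calls_per_us) - 1, n_out),
                    np.arange(len(calls_per_us)), calls_per_us)
    k = max(1, min(int(smooth_s * SR), n_out))  # moving-average smoothing
    env = np.convolve(env, np.ones(k) / k, mode="same")
    # a crude one-pole low-pass turns white noise into a wind-like rumble
    noise = rng.uniform(-1, 1, n_out)
    out = np.empty(n_out)
    acc = 0.0
    for i, s in enumerate(noise):
        acc += 0.02 * (s - acc)
        out[i] = acc
    return out * env / max(env.max(), 1e-9)
```

The activity envelope does the work of the dithering analogy: the trace is no longer heard event by event, but as a level of rustle rising and falling with the kernel's busyness.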

Activity level and the magnitude of different bursts both seemed more apparent using the wind sonification technique while also making it, subjectively, more tranquil. However, we shouldn't give up on being able to appreciate the details of the (in)activity.

Fundamentally, pitch and rhythm are different ways that our ears and brains interpret the same phenomenon where very slow repetitions of a sound are heard as rhythm whereas fast repetitions are heard as pitch. To truly hear the details we need to extend the duration of a microsecond, in the original, to a duration that allows our ears to perceive it, not as frequency, but as rhythm. We have already slowed the sonification down by a factor of 1280. Now, we will zoom orders of magnitude further in.
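The arithmetic behind the required magnification is simple:

```python
def stretched_us(stretch):
    """Duration, in seconds, that one original microsecond occupies
    after time-stretching."""
    return 1e-6 * stretch

stretched_us(1_280)    # 0.00128 s: still fused together by the ear
stretched_us(170_000)  # 0.17 s: in the rhythm range, near the ~200 ms
                       # forward-masking window
```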

What follows is the same excerpt at different levels of magnification:

The first peak in the waveform, 2 min 13 s

10 000 times the original duration

The trace is sonified at a slower and slower tempo starting at 10x the original duration and going to 1280x. Starting over is marked by a bell sound.

bursts of activity in the trace with the silence between them removed

Sonified slower and slower just like the previous example above, but a synthetic wind gradually takes over.

20 000 times the original duration

s i l e n c e   w i t h o u t   a b s e n c e

t h e   d e e p e s t   l i s t e n i n g

Waveform of 47 hours 13 minutes and 30 seconds of sound. What looks like solid blocks are really varied rhythmical structures.

c o n c l u s i o n s

z o o m i n g   i n

r e a l   t i m e

70 000 times the original duration

McCormack, J., & Dorin, A. (2001, December). Art, emergence and the computational sublime. In Proceedings of Second Iteration: A Conference on Generative Systems in the Electronic Arts. Melbourne: CEMA (pp. 67-81).

Roads, C. (2004). Microsound. MIT Press.

Basiri, A., Behnam, N., De Rooij, R., Hochstein, L., Kosewski, L., Reynolds, J., & Rosenthal, C. (2016). Chaos engineering. IEEE Software, 33(3), 35-41.

170 000 ×

170 ms per event

1 005 611 microseconds [slightly more than a second, a testament to the difficulty of counting wall-clock time precisely in a computer]

47 hours 13 minutes and 30 seconds

41.9 GiB of uncompressed sound data

130 000 times the original duration

170 000 times the original duration

The end of the trace, 38 min 38 s