3. Sound Synthesis and Processing Methods
3.1. Design considerations
Sonic information design for the display of proteomic data must take into account the perceptual attributes that will vary with the changes in the multivariate proteomic data matrices that accompany neuro-degradation. The most straightforward approach to this design problem would seem to involve identifying a few select components of variation in the data, whose values can then be used to modulate sound synthesis parameters associated with variation in selected perceptual attributes. If this approach is taken, then the design would be identified as a parameter-mapping sonification (Grond and Berger 2011), where ideally a one-to-one mapping is established between the data domain and the parameter domain describing the synthesis of sound to be presented to the listener. Whether the variation in the parameter domain is associated with orthogonal variation in the perceptual domain is difficult to determine without solid foundational knowledge in psychoacoustics. Typically, the manipulated perceptual attributes are chosen because they are easily distinguished, such as pitch ranging from low to high while timbre ranges from dark to bright. Such variation, while not strictly orthogonal, nonetheless admits of two perceptually salient dimensions with easily identifiable anchoring points. If a single sound is isolated at each of these anchoring points, it can be identified in terms of pitch-timbre pairings such as low-dark, low-bright, high-dark, and high-bright. The analogy to musical performance on a trumpet might be appreciated here, as playing from low to high pitch while muting and then unmuting the instrument moves through these two perceptually salient dimensions.
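The pitch-timbre mapping just described can be sketched in code. The following is a minimal illustration only, not the system built for this work: two hypothetical normalised data values, x and y, drive fundamental frequency (low to high) and harmonic spectral slope (dark to bright) of a simple additive tone, yielding the four anchoring points low-dark, low-bright, high-dark, and high-bright.

```python
import numpy as np

def synthesize(x, y, sr=44100, dur=0.25):
    """Hypothetical one-to-one parameter mapping for illustration.
    x in [0, 1] -> fundamental frequency, 220 Hz (low) to 880 Hz (high).
    y in [0, 1] -> harmonic amplitude slope, steep (dark) to shallow (bright)."""
    f0 = 220.0 * 2.0 ** (2.0 * x)        # two octaves of pitch range
    slope = 3.0 - 2.5 * y                # amplitude of harmonic k falls as k**-slope
    t = np.arange(int(sr * dur)) / sr
    tone = sum(k ** -slope * np.sin(2.0 * np.pi * k * f0 * t) for k in range(1, 11))
    return tone / np.max(np.abs(tone))   # normalise peak amplitude
```

A "low-dark" anchor would then be `synthesize(0.0, 0.0)` and a "high-bright" anchor `synthesize(1.0, 1.0)`; whether these two synthesis parameters map onto perceptually orthogonal attributes is, as noted above, an empirical question.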
Note that pitch and timbre are not strictly independent perceptual dimensions, since pitch has been shown to influence judgements of timbral differences. Although timbral relationships between recorded musical instrument tones are similar at different pitches within an octave range (Marozeau, de Cheveigné, McAdams and Winsberg 2003), if pitch is allowed to vary over a wider range, the higher-pitched tones will be heard as brighter than the lower-pitched tones (Marozeau and de Cheveigné 2007). It might be thought that such interactions between two perceptual attributes would be less likely if one were a timbral attribute and the other a spatial attribute, such as the apparent direction of a spectrally rich tone (e.g., a plucked string sound). However, in everyday listening, if the incidence angle of a tone varies in azimuth from a frontal incidence angle (0 degrees) to an extreme lateral angle (90 degrees), the tone will naturally increase in perceived brightness because of the increased high-frequency emphasis in the acoustical response of the head (as measured by the head-related transfer function [HRTF]). This influence of the HRTF on the timbre of sound sources usually goes unnoticed, most likely because it is a common feature of direction-dependent spectral variation that is habitually interpreted in a spatial mode in everyday listening. HRTF-based sound processing is often used to control the apparent direction of sound sources (see Martens 2003) and is particularly useful in headphone-based auditory display. However, potential difficulties arise here, especially when synthetic rather than recorded natural sounds are displayed. A demonstration of the interaction between timbral modulation and spatial processing of synthetic string sounds can be found in section 3.5.
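The brightening effect of high-frequency emphasis can be illustrated numerically. The sketch below is a toy stand-in, not a measured HRTF: a first-order pre-emphasis filter crudely mimics a lateral-incidence high-frequency boost, and the spectral centroid (a common acoustical correlate of brightness) of a broadband signal rises after filtering.

```python
import numpy as np

def spectral_centroid(sig, sr=44100):
    """Amplitude-weighted mean frequency, a common correlate of brightness."""
    spec = np.abs(np.fft.rfft(sig))
    freqs = np.fft.rfftfreq(len(sig), 1.0 / sr)
    return (freqs * spec).sum() / spec.sum()

sr = 44100
rng = np.random.default_rng(0)
# broadband test signal standing in for a spectrally rich tone
frontal = rng.standard_normal(sr // 2)
# crude stand-in for lateral-incidence high-frequency emphasis (NOT a real HRTF):
# first-order pre-emphasis, y[n] = x[n] - 0.9 * x[n-1]
lateral = frontal - 0.9 * np.concatenate(([0.0], frontal[:-1]))
```

Computing `spectral_centroid` for both signals shows the filtered ("lateral") version has a higher centroid, i.e. it measures as brighter, even though nothing about the source itself has changed.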
3.2. Data preparation prior to sonification
A critical aspect of parameter-mapping sonification identified by Grond and Berger (2011) is data preparation. This is because the complex datasets of interest (such as proteomic data) are typically not ready for direct input to a sonification system in their raw form. It is quite rare to find good results with the “drag and drop” method of data input. Therefore, the data are often subjected to preliminary processing that makes them more amenable to sonification (of course, the same would be true if the data were to be submitted for visualization). For the sake of the current discussion of design considerations, it will suffice to say that the multivariate complexity of the data to be sonified needs to be reduced to variation in terms of the principal components that can potentially distinguish between the three cell types being examined here. Describing this data reduction process is beyond the scope of this essay; the reader may refer to the process description that appeared in the ICAD paper by Martens et al. (2016). What is pertinent here is that values on 1815 variables could be reduced to values along three principal dimensions. Moreover, the variation on these dimensions not only captured a large proportion of the total variance in the multivariate data but was also found to be potentially revealing of gross differences between the three cell types to be compared.
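As an illustration of the kind of data reduction just described (the published pipeline in Martens et al. (2016) may differ in its normalisation and component selection), a minimal principal component analysis via the singular value decomposition projects a samples-by-variables matrix onto a few principal dimensions and reports the proportion of variance they capture:

```python
import numpy as np

def reduce_to_components(X, n_components=3):
    """Minimal PCA via SVD (illustrative sketch, not the published pipeline).
    X: (n_samples, n_variables) matrix, e.g. cell cases x 1815 protein measures.
    Returns the sample coordinates on the top principal dimensions and the
    fraction of total variance those dimensions capture."""
    Xc = X - X.mean(axis=0)                       # centre each variable
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T             # coordinates on the top axes
    explained = (S ** 2)[:n_components].sum() / (S ** 2).sum()
    return scores, explained
```

The three columns of `scores` would then be the values handed to the parameter-mapping stage, with `explained` serving as a sanity check that the reduction preserves most of the multivariate variance.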
3.3. Sound synthesis for the sonifications
To generate a sonification for the available proteomic data of interest, a parameter-mapping strategy for synthesis was formulated that took into account the complexity of the large multivariate dataset. For nine distinct cases, an assembly of short-duration, temporally-overlapping “grains” of sound was created, the timing parameters of which were selected to approach the minimum perceivable event time for distinct percepts of duration, frequency, and amplitude (i.e., approaching the limits of human auditory resolution in discriminating between identifiable attributes of loudness, pitch, and those component auditory attributes generally regarded as belonging to one of two collections termed timbral or spatial attributes). The “hypothesis-driven” design approach taken here required sound synthesis technology that could offer independent variation of many sound synthesis parameters so as to provide identifiable variation in distinct auditory attributes. In the initial stage of this work, synthesis based upon a simple physical model (Karplus and Strong 1983) was tested for its versatility in producing a wide range of short sounds exhibiting audibly identifiable timbral variations, each showing potential for evoking physical referents in the minds of the listeners (such as the “plucked-string” tones presented in Audio Object 1).
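The Karplus-Strong model cited above can be realised in a few lines: a buffer of noise (the "pluck") circulates through a delay line whose length sets the pitch, while a damped two-point average progressively low-passes the circulating signal, producing the characteristic decaying string timbre. This sketch shows the basic algorithm only; the grain timing and the particular parameter mappings used in the actual sonifications are not reproduced here.

```python
import numpy as np

def karplus_strong(f0, dur=0.5, sr=44100, damping=0.996, seed=0):
    """Basic Karplus-Strong plucked-string synthesis.
    f0: desired fundamental frequency in Hz (delay-line length = sr / f0).
    damping: loop gain below 1.0; smaller values give faster decay."""
    n = int(sr * dur)
    delay = max(2, int(round(sr / f0)))            # delay-line length sets pitch
    rng = np.random.default_rng(seed)
    buf = rng.uniform(-1.0, 1.0, delay)            # initial noise burst ("pluck")
    out = np.empty(n)
    for i in range(n):
        out[i] = buf[i % delay]
        # damped two-point average: low-passes the loop on every pass
        buf[i % delay] = damping * 0.5 * (buf[i % delay] + buf[(i + 1) % delay])
    return out
```

Varying `damping` (and, in extended versions of the model, the loop filter itself) is what yields the audibly identifiable timbral variations mentioned above, from dull, quickly-decaying thuds to bright, sustained plucks.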