Interactive Composing with Granular Time-shifting of Sound

Agostino Di Scipio

LMS - Laboratorio Musica e Sonologia

L'Aquila, Italy



This paper describes methods of real-time granular processing of sound adopted in recent compositions by the author. Also discussed are the interactive control structures, either readily available or specially designed, used in those compositions on two computer music systems, namely KYMA-CAPYBARA33 (at LMS) and PODX (at Simon Fraser University). Special emphasis is given to polyphonic processes of time-shifting and to recursive granulation. Besides some technical details, the paper also shows the role of such strategies in the particular context of microcomposition.

1 Introduction

In this paper I would like to describe methods of granular processing of sound utilized in two recent works of mine, Hybris and Essai du Vide. Schweigen. In both cases I employed time-shifting of sound (altering the sound duration without - in principle - side-effects in the frequency domain) by means of granular processing techniques.

1.1 Granular time-shifting as a technique for microcomposition

The particular time-shifting methods discussed here should be seen as instances of micro-time sonic design, or microcomposition (to borrow a term from electroacoustic music jargon). The common feature of the many different approaches to microcomposition lies in the possibility of manipulating microtemporal relations among myriads of minimal sonic quanta. For this reason, these methods are often operated by means of compositional control structures which instantiate global strategies of a statistical nature. I assume, however, that any strategy of microcomposing deals with a specific problem domain: how minimal, time-limited units should be composed in order to give rise to high-level structural properties. This is a problem of perception (Bregman, 1990, p.118 and passim) and, at the same time, a problem of music theory - hence of computer music software design - the discussion of which involves issues of relevance to the aesthetic potential of electroacoustic and computer music (Di Scipio, 1995; 1994a).

1.2 A notion of interactivity

Another central point in this paper relates to the notion of interactivity. In principle, interactive control structures embody knowledge about the continuing exchange between concept and percept in the course of composition and/or performance, knowledge about the way in which the composer/performer adjusts his/her own actions and goals during the creative process. Interactivity requires the possibility of exerting real-time controls over various parts of a program such that both the sonic and syntactic levels can be accessed by the user in an attempt to establish a link between them. Hence "interactivity" cannot simplistically mean "immediate audible output" - it also means that the user can address his/her action to different rates of change in the musical flow, from audio-rate through event-rate and higher.

We should make a distinction, first, between interactive composition systems and interactive performance systems and, second, between interactive programs and interactive programming environments. Accordingly, there are 4 general classes of interactive music systems (for a more detailed taxonomy, see Rowe, 1993):

                 program    environment
  composition       1            2
  performance       3            4

The distinction between program and environment is primarily a question of the particular computational paradigm at work. A classical notion of computer science is that a program is a theory, resulting in a procedural description leading to the solution of a specific problem. The user thus applies a theory put forth (or relied upon) by the designer (Simon & Newell, 1970). An environment, on the other hand, is a meta-theory, a theory about how to instantiate and verify either new or existing theories. In recent years, this has often taken the form of object-oriented representations.

The following two sections describe my recent experiences with granular processing using a programming environment for composition and live performance (cases 2 and 4, above) and with a program for interactive composition (case 1). These experiences revealed to me the relevance of interactive controls for perspectives of micro-time sonic design, a most peculiar and fertile terrain for electroacoustic composition (Di Scipio, 1994b).

2 Polyphonic time-shifting

Granular representations of sound (also wavelet representations) lend themselves to techniques operating in the frequency- and the time-domain independently. Various authors have demonstrated theoretical and practical aspects of granular time-shifting of sound (Jones & Parks, 1988; and various articles in De Poli et al., 1991). My interest lies not so much in the mere compression or dilation of the duration of sound; rather, it led me to devise processes possibly functional to the emergence of dynamical gestures and polyphonic textures within and through the sound of instruments played live. Two facets must be taken into account: (2.1) the real-time generation of streams of grains captured from the live sound, and (2.2) the algorithmic control structure which allows me to handle different streams of grains, each with its specific parameters.

2.1 Granulation of live sound

The input signal is continuously recorded into an n-second-long wavetable memory, wrapping around the same chunk of memory m times. While the cyclic recording runs, the wavetable is also cyclically looked up and its samples picked up. The current look-up memory position is itself a function of time, either another wavetable or a virtual slider on the computer screen. A minimum look-up sampling increment is prescribed whose duration in samples equals the grain duration, here ranging from 10 to 70 msec. Hence the pointer to the input memory buffer proceeds linearly only within the grain-time (from the first to the last grain sample), then jumps elsewhere in the buffer as dictated by the function or real-time slider controlling it. If a ramp is used, the linear direction of time is preserved, though the duration can be shortened or lengthened. If, for example, a gaussian function is used, the direction of time is initially preserved, with an s-shaped accelerando, and then reversed, with a reversed s-shaped ritardando. There is no prescription as to how the look-up pointer should move through the input memory buffer; hence not only dilation but also compression of the sound duration can be achieved.
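
The pointer behavior described above can be sketched in a few lines of Python. This is a schematic reconstruction, not the KYMA or PODX implementation; the function and parameter names are hypothetical, and the circular recording is replaced by a static buffer for clarity:

```python
import numpy as np

def granulate(buffer, grain_len, n_grains, position_fn):
    """Read fixed-length grains from a recorded buffer.  Within a grain
    the pointer advances linearly, sample by sample; between grains it
    jumps to wherever position_fn (a map from grain index to a
    normalized buffer position in [0, 1]) dictates - a ramp, a gaussian
    curve, or a screen slider."""
    out = np.zeros(n_grains * grain_len)
    for i in range(n_grains):
        start = int(position_fn(i) * (len(buffer) - grain_len))
        out[i * grain_len:(i + 1) * grain_len] = buffer[start:start + grain_len]
    return out

# A linear ramp traversing the whole buffer over twice as many grains
# as the buffer holds yields a 2x time dilation, direction preserved.
src = np.arange(1000, dtype=float)
stretched = granulate(src, grain_len=50, n_grains=40,
                      position_fn=lambda i: i / 39)
```

With a non-monotonic position_fn the same loop produces the local time reversals described above, since nothing constrains successive grain start positions to increase.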

Samples picked up from the input buffer are enveloped with gaussian or trapezium curves. What I call a "stream of grains" is actually made of two "granulators" (look-up + envelope + allpass filter), shifted by half a grain-duration with respect to one another. The signal of both granulators is allpass filtered with an average delay time equal to half the grain-duration. Intergrain delay and allpass delay are given random values within prescribed limits.
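
A minimal sketch of such a dual-granulator stream follows; it shows only the gaussian enveloping and the half-grain offset between the two streams (the allpass stage and the randomized delays are omitted, and all names are illustrative):

```python
import numpy as np

def gaussian_env(n, width=0.4):
    # Bell-shaped envelope applied to every grain
    t = np.linspace(-1.0, 1.0, n)
    return np.exp(-0.5 * (t / width) ** 2)

def dual_stream(x, grain_len):
    """Two identical granulators, the second shifted by half a
    grain-duration, summed so that enveloped grains cross-fade into a
    continuous output rather than a chain of isolated bursts."""
    env = gaussian_env(grain_len)
    out = np.zeros(len(x))
    for offset in (0, grain_len // 2):              # the two granulators
        for start in range(offset, len(x) - grain_len + 1, grain_len):
            out[start:start + grain_len] += x[start:start + grain_len] * env
    return out

y = dual_stream(np.ones(4410), grain_len=441)       # 10 ms grains at 44.1 kHz
```

The half-grain offset is what keeps the summed amplitude from dipping to near zero between grains, which is why the paper treats the pair, not the single granulator, as one stream.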

This can be seen as a form of "asynchronous" granulation. However, micro-level time parameters (intergrain delay, grain duration, onset delay) can be made functions of the pitch of the input signal (if any pitch is tracked) and set in synchronous relation to the detected periodicity.
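
As a sketch of that synchronous option, grain-level time parameters might be derived from a tracked pitch as follows (a hypothetical helper, not part of either system; the choice of four periods per grain is arbitrary):

```python
def synchronous_params(f0_hz, sr=44100, periods_per_grain=4):
    """Lock grain timing to a detected pitch: the grain duration spans
    a whole number of periods of the input, and grain onsets fall one
    period apart, keeping grains phase-locked to the source."""
    period = sr / f0_hz                     # detected period, in samples
    grain_dur = round(periods_per_grain * period)
    intergrain_delay = round(period)
    return grain_dur, intergrain_delay

dur, delay = synchronous_params(220.0)      # e.g. an A3 tracked on the input
```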

2.2 Algorithmic control in Hybris

For the performance of Hybris (g-flute, bass clarinet and computer processing, 1994), I designed a control structure which activates the following: the input signal is continuously recorded into a 5" wavetable, wrapping around 4 times; 4 different processes of granulation and time-shifting start at 5", 10", 15" and 20"; each process has its own low-level parameters and time-shift ratio - respectively 5, 4, 3 and 2 times slower than real time. The look-up is driven by linear ramps (time direction preserved).

Depending on the shift ratio and the performers' nuances of timing, some input material eventually will not be granulated; this happens when wavetable locations are not looked up before new material is written there (the case with large shift ratios). Other material, however, may undergo more than one process of granulation and time-shift; this happens when locations are looked up by two or more granulators at the same time. A clear picture of the overall process can be drawn only by studying the phase relationship between the write and read pointer functions in relation to the 5" wavetable memory buffer.
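
Under the simplifying assumption that a read pointer is a single constant-slope ramp and that recording starts at time zero (ignoring the wrap count and the staggered start times of the four processes), the moment after which input material is overwritten before ever being read can be computed directly. This is a back-of-the-envelope model, not taken from the piece:

```python
def last_readable_second(buffer_s, ratio, read_start_s=0.0):
    """Input written at time t is overwritten at t + buffer_s.  A read
    pointer stretching time by `ratio` (and starting at read_start_s)
    reaches input position t at wall time read_start_s + ratio * t.
    Material is lost once read_start_s + ratio * t > t + buffer_s,
    i.e. for all input after t = (buffer_s - read_start_s) / (ratio - 1)."""
    return (buffer_s - read_start_s) / (ratio - 1.0)

# A 5x process over a 5-second buffer: in this simplified model, input
# recorded after t = 1.25 s would be overwritten before being granulated.
t_lost = last_readable_second(buffer_s=5.0, ratio=5.0)
```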

The process lasts 30 seconds and represents one single formal unit in the first section of Hybris (made of 4 such units). It yields a polyphony of gestures and textures emerging from the instrumental performance. Usually the density of sound grows as the process approaches its end point. The average density is of a few hundred grains/sec, depending on the amount of feedback in the allpass filters.

2.3 Implementation in KYMA

At the time of the first performance of Hybris (with M.Zurria, flute, and P.Ravaglia, bass clarinet, Rome, December 1994) the above method had been implemented using the KYMA2.0 software (under MS-Windows) controlling the CAPYBARA DSP system (Scaletti & Hebel, 1991); it was then ported to KYMA4.0 for a recent performance (Montreal, ISEA95). A representation of the HYBRIS1 algorithm is found below, with the KYMA objects (or Sounds) hierarchically ordered from top down.

In this graph both the HYBRIS1 Sound and its subSound RTPTM (real-time polyphonic time modifications) are actually quite simple and short Smalltalk-80 programs, the first scheduling 4 instances of the second, the second scheduling 4 instances of the INSTR Sound. Together they represent, in effect, the highest-level control structure by which the variables deemed of primary compositional interest are evaluated during performance.

"script in HYBRIS1"

1 to: 4 do: [ :i | RTPTM start: ((i*30)-30) seconds.].

"script in RTPTM"

1 to: 4 do: [ :i | instr start: ((i*5)-5) seconds grainDur1: 0.04 stretchFactor: (6-i).].

A single granulator corresponds to the abstraction level of the INSTR Sound. After debugging and optimizing, I collapsed the whole substructure of INSTR into a single object, a new Sound class, aSample&ShiftWithAllPass, with its own icon and parameters - those expected to change during rehearsal and performance (all other parameters being hard-wired). With this new prototype (see graph below), HYBRIS1 could be re-shaped into a more compact and efficient representation.

2.4 Some considerations

A look at the parameters of this new prototype would show that its functionality is primarily a question of time variables (indeed this is the case with Hybris' first section, but not with sections 2 and 3, not described here). The sounding results include phase-related effects especially interesting when the live sounds extend to breathing, hisses, whistles, noise bursts (e.g. tonguing), etc.

As a new class, aSample&ShiftWithAllPass features custom spatial trajectories which aid the perceptual segregation of grains into separate auditory streams and the perception of a polyphony of distinct gestures in the stereophonic space - rather than a single "wall" of sound.

Interactivity in Hybris is observable at two levels. In the compositional process, the KYMA user interface allowed me to search for the appropriate models of sound transformation. This is not just a matter of editing, testing and optimizing the algorithms in a straightforward deterministic manner. It is more like analysing the link among acoustic, perceptual and more abstract, conceptual aspects of the sound processing technique: exploring how the organization of low level details could achieve auditory images of peculiar timbral properties.

At another level, interactivity is paramount in the performance of Hybris. The sounding effects of the sound processing method are particularly sensitive to the details of the input signal; the performers cannot always predict the (sometimes dramatic) effects caused by the magnification of rather imperceptible differences in their performance. Urged to dynamically adapt the performance in timing and timbral quality, the performer becomes here a source of feedback and self-regulation within a dynamical system. Far from being that of transforming written symbols into audible sounds, his/her task is more one of self-regulating the whole "performance system" in order to avoid totally uncontrolled results as much as strictly periodic behaviors. (I took this approach also for my 7 short variations on the cold, a piece for trumpet and interactive system premiered by Russel Whitehead at the ICMC95, in Banff.)

3 Recursive processing

I would like to conclude with a short description of a particular extension of granular processing and time-shifting that I find very rich in implications. I refer to it as recursive granular processing. To give the reader a concrete idea of its functionality, I describe how recursive granulation was used in the composition of Essai du vide. Schweigen, a solo tape work composed in 1993 while I was visiting composer at Simon Fraser University (Burnaby BC). Among the many programs of B.Truax's PODX set of programs for interactive composition and sound synthesis, the resource I utilized the most was GSAMX, which allows for real-time granular processing of sound. I refer the reader to (Truax, 1991; 1994) for a discussion of the technical details.

3.1 Details of Essai du vide. Schweigen

I devised a strategy involving two steps: 1) a sampled sound is submitted to granular time-shifting and the result is recorded on hard disk; 2) the time-shifted sound is randomly accessed in order to select short fragments of it, resulting in a new pattern made of chunks of the sampled sound. The result of step 2 is then passed back through step 1 and time-shifted, then again randomly accessed, and so on. In other words, the strategy involves a nonlinear transformation comparable to the iterated application of two transfer functions to an initial datum:

x_(n+1) = f_b(f_a(x_n))

x_0 : original source sound;

x_n : granulated sound after the n-th iteration;

f_a : time-shift function;

f_b : random access rule
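
The iteration can be mocked up in Python. Here f_a is only a crude stand-in for granular time-stretching (plain grain repetition, no enveloping), and f_b uses uniform rather than Poisson random access, so this is a structural sketch with invented names, not a reconstruction of GSAMX:

```python
import numpy as np

rng = np.random.default_rng(0)

def f_a(x, ratio=4, grain=64):
    """Stand-in time-shift: repeat every grain `ratio` times, dilating
    the duration without transposition."""
    grains = [x[i:i + grain] for i in range(0, len(x) - grain + 1, grain)]
    return np.concatenate([g for g in grains for _ in range(ratio)])

def f_b(x, frag=512):
    """Random-access rule: rebuild the sound from randomly selected
    fragments (uniform here; the piece used a Poisson distribution)."""
    starts = rng.integers(0, len(x) - frag, size=len(x) // frag)
    return np.concatenate([x[s:s + frag] for s in starts])

x = rng.standard_normal(2048)           # x_0, the source sound
for _ in range(3):                      # x_(n+1) = f_b(f_a(x_n))
    x = f_b(f_a(x))
```

Each pass dilates the material and then scrambles its macro-order, which is the mechanism behind the "vaporization" effect described below.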

I had already utilized similar strategies in Zeitwerk (tape, 1992) and Kairós (soprano sax and tape, 1992); see the discussion of technical details in (Tisato & Di Scipio, 1993; Di Scipio, 1994b). In those works, however, they were used to transform synthetic sounds, while in Essai du vide the original source was "concrete": the sound of scratching with nails and jazz brushes against the lower strings of a harp, followed by a short pause. At the time of writing, a real-time implementation of recursive granulation had already been made (with KYMA) for a more recent work, Variazioni sul ritmo del vento (that is, variations on the rhythm of the wind, for contrabass recorder and interactive system, premiered by Antonio Politano at the Sibelius Academy, Helsinki, Summer 1995).

3.2 An exploratory process

The control variables in the realization of Essai du vide were: average grain duration, duration range, average intergrain delay (or density, i.e. number of grains/second), intergrain delay range and time-shift ratio. Interactive control took place via a single line on the screen of the PDP Micro11 computer controlling the DMX-1000 signal processor. The control variables could be accessed one at a time during synthesis; they could also be "ramped", allowing for a simple automated control. All steps in the composition involved grains shorter than 50 msec. The time-shift ratio in f_a was cyclically ramped, with an average shift of 30 times slower than real time. The random access rule f_b was a Poisson distribution, and the fragments recursively picked up were 170 msec long.

I could explore the audible effects of micro-level modulations - interesting auditory properties due to the changing phase relationships among overlapping grains - and shape extended structures including both continuous sound textures and more dynamical gestures, also featuring silent pauses (the magnified effect of the short pause in the source material). The iterated transformations produced a "vaporization" of the original source and brought forth gestural patterns that were absolutely unpredictable at the outset.

Recursive granulation in Essai du Vide. Schweigen provides an example of a creative metaphor made concretely utilizable. It is a method for subtracting energy from the sound, as opposed to methods of granular synthesis: in the former case one takes quanta of energy off the sound, in the latter one puts quanta of sound into the otherwise silent flow of time. In this way the sense of void, internal silence and solitude which lies behind the title was given both an actual significance and, I think, a chance of being concretely conveyed to listeners.


References

Bregman, A. Auditory Scene Analysis, MIT Press, 1990

De Poli, G., A. Piccialli & C. Roads (eds.) Representations of Musical Signals, MIT Press, 1991

Di Scipio, A. "Formal Processes of Timbre Composition Challenging the Dualistic Paradigm of Computer Music", in Proceedings ICMC, 1994a

Di Scipio, A. "Micro-time sonic design and the formation of timbre", Contemporary Music Review,10(2), 1994b

Di Scipio, A. "Inseparable models of material and of musical design in electroacoustic and computer music", Journal of New Music Research, 24(1), 1995

Jones, D. & T.Parks "Generation and organization of grains for music synthesis", Computer Music Journal, 12(2), 1988

Rowe, R. Interactive Music Systems: Machine Listening and Composing, MIT Press, 1993

Scaletti, C. & K.Hebel "An Object-based Representation of Audio Signals", in (De Poli et al.) Representations of Musical Signals, MIT Press, 1991

Simon, H. & A.Newell "Information Processing in Computer and Man", in (Pylyshyn, Z. ed.) Perspectives on the Computer Revolution. Prentice-Hall, 1970

Tisato, G. & A.Di Scipio "Granular synthesis with the Interactive Computer Music System", in Proceedings CIM, AIMI/LIM, 1993

Truax, B. Real-time Granular Synthesis with the DMX-1000 Computer. Software documentation. 1991

Truax, B. "Discovering Inner Complexity: Time-shifting and Transposition with a Real-time Granulation Technique", Computer Music Journal, 18(2), 1994