
Proc. of the 6th Int. Conference on Digital Audio Effects (DAFX-03), Queen Mary, London, September 8-11, 2003

A MIXED PHYSICAL AND PERCEPTUAL APPROACH TO CONTROL SPATIALIZATION IN AUDIO AUGMENTED REALITIES

Olivier Delerue
IRCAM - Room Acoustics Team
[email protected]

1. INTRODUCTION

The LISTEN project [1], “Augmented Everyday Environment through interactive soundscapes”, is a generalization of the audio guide concept: a visitor walks through a museum exhibition wearing a wireless device that tracks his position and delivers binaural audio content according to the path he follows during his visit. This audio augmented reality system can deliver didactic content, as in an audio guide / museum application, as well as artistic content, for instance in multimedia installations. One of the challenges in audio augmented reality applications is to provide a 3D-audio technology accurate enough to create an impression of immersion, for instance convincing the visitor of the presence of a virtual sound source at a defined position in the physical space.

IRCAM contributes to this project at two different levels. First, IRCAM is in charge of rendering the virtual soundscape: the spatialization technology [2] is adapted to meet the demanding requirements of audio augmented realities, such as handling a large number of sound sources with a sufficient level of quality. For these aspects, an important part of the work has focused on multichannel binaural technologies ([3] and [4]). Second, IRCAM is in charge of designing an ad hoc authoring tool dedicated to the scene description. We first describe this tool, ListenSpace, then show the advantages of adding a physical level of description to the Spatialisateur for the control of spatialization in the particular context of audio augmented realities. We finally describe the combined perceptual and physical approach that has been built, focusing on the particular case of early reflection calculation.

2. LISTENSPACE

The proposed environment, named ListenSpace [5], consists of a graphical user interface for representing and editing the auditory scene, including the physical elements of the room as well as the virtual elements that belong to it: virtual sound sources, zones used for interaction, descriptions of room acoustic parameters, and listeners. Figure 1 gives a view of a virtual sound scene using the floor plan of the museum of contemporary art in Bonn: two sound sources have been added to this scene, surrounded by polygonal zones meant to trigger audio events as the visitor enters them.

This application communicates with the other software components of the project via real-time network messages based on the UDP / Open Sound Control (OSC) standards. ListenSpace is a lightweight, standalone application written in Java and can run on an ultra-portable tablet PC for “on-site” authoring and fine adjustment of the scene.
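As a minimal illustration of this communication layer, the sketch below hand-encodes an OSC message and sends it over UDP using only the Python standard library. The address pattern `/source/1/xyz`, the coordinates and the port number are hypothetical placeholders, not the actual LISTEN message names.

```python
import socket
import struct

def osc_pad(data: bytes) -> bytes:
    """Null-terminate and pad a byte string to a multiple of 4 bytes,
    as the OSC encoding requires for strings."""
    return data + b"\x00" * (4 - len(data) % 4)

def osc_message(address: str, *floats: float) -> bytes:
    """Encode an OSC message whose arguments are all 32-bit floats."""
    packet = osc_pad(address.encode("ascii"))
    packet += osc_pad(("," + "f" * len(floats)).encode("ascii"))
    for value in floats:
        packet += struct.pack(">f", value)  # OSC floats are big-endian
    return packet

# Hypothetical scene update: send a source position to a local renderer.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(osc_message("/source/1/xyz", 2.5, 0.0, 1.2), ("127.0.0.1", 57120))
```

UDP keeps the update path connectionless and low-latency, which suits the continuous position streams produced by the tracking system.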

Figure 1: Partial view of the floor plan of the Bonn museum of contemporary art as a virtual sound scene in ListenSpace.

3. CONTRIBUTION OF A MIXED PERCEPTUAL AND PHYSICAL APPROACH

Although IRCAM's Spatialisateur is essentially based on a perceptual control interface, the addition of ListenSpace makes it possible to also consider a physical approach to controlling sound spatialization. We believe that this physical description of the scene can significantly enrich the quality of the result in the particular context of audio augmented realities. We nevertheless insist on maintaining the flexibility of the perceptual approach, which is far better adapted to the needs of the author.

More precisely, the physical approach can bring relevant information into the source-listener auditory channel. For instance, the physical description is necessary to take into account the occlusion or obstruction of sound sources. Another example is the use of the physical geometry to generate meaningful trajectories of sound sources, e.g. trajectories that remain in the room and do not cross walls. Finally, an important aspect concerns the relations that a sound source virtually maintains with the physical space: a typical example of such a property is the reflection of the source on the different surfaces of the room. We choose this third aspect as a case study of the combined approach and describe it in the following section, with details of its implementation.
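As a minimal illustration of such wall-aware trajectories, the sketch below clamps a 2D source position into an axis-aligned rectangular room. This is only a simplified stand-in: the museum floor plans handled by ListenSpace are general polygons, and the margin value is an arbitrary choice.

```python
def clamp_to_room(position, room, margin=0.1):
    """Keep a trajectory point inside an axis-aligned room of size
    (lx, ly) metres, at least `margin` metres away from the walls."""
    x, y = position
    lx, ly = room
    return (min(max(x, margin), lx - margin),
            min(max(y, margin), ly - margin))

# A point outside the room is pulled back to the nearest admissible spot.
print(clamp_to_room((-1.0, 2.0), (6.0, 4.0)))
```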


4. EARLY REFLECTIONS

The importance of early reflections in the perception of sound source localization and room acoustics no longer needs to be demonstrated [3]: although early reflections can sometimes be cumbersome and have a negative effect on the auditory result (for instance when delivering sound through loudspeakers in natural environments), the accuracy of their simulation appears to be one of the decisive criteria, in the context of audio augmented realities, for producing the sensation of immersion the user is expected to experience. We therefore added to ListenSpace a reflection calculation algorithm based on the source-image method in order to evaluate the parameters of reflections of the first, second and third order: typically, for each reflection in a given source-listener auditory channel, its delay as well as its amplitude, azimuth and elevation with respect to the listener's orientation.

Taking all these reflections into account in the impulse response raises two different problems. First, even when limited to the first three orders, a large number of reflections occur, and rendering each of them with an accurate binaural filter would be technically impossible because of computational cost and real-time requirements. As an example, Figure 2 displays reflections of the first three orders between a source and a listener: even for such a simple room model, 6 reflections are expected at the first order, 30 at the second order, and up to 180 at the third order. These numbers further increase as the geometry gets more complex.
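The first-order step of the source-image method can be sketched as follows for an axis-aligned "shoebox" room (the room models above are more general polygons, and all function and variable names here are illustrative):

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, at roughly 20 degrees Celsius

def first_order_images(source, room):
    """Mirror the source across each of the six walls of a box room
    with walls at x=0, x=lx, y=0, y=ly, z=0, z=lz."""
    sx, sy, sz = source
    lx, ly, lz = room
    return [
        (-sx, sy, sz), (2 * lx - sx, sy, sz),
        (sx, -sy, sz), (sx, 2 * ly - sy, sz),
        (sx, sy, -sz), (sx, sy, 2 * lz - sz),
    ]

def reflection_params(image, listener, orientation_deg=0.0):
    """Delay, 1/r amplitude, azimuth and elevation of an image source,
    azimuth taken relative to the listener's horizontal orientation."""
    dx = image[0] - listener[0]
    dy = image[1] - listener[1]
    dz = image[2] - listener[2]
    r = math.sqrt(dx * dx + dy * dy + dz * dz)
    delay = r / SPEED_OF_SOUND
    azimuth = math.degrees(math.atan2(dy, dx)) - orientation_deg
    elevation = math.degrees(math.asin(dz / r))
    return delay, 1.0 / r, azimuth, elevation

room = (6.0, 4.0, 3.0)  # hypothetical shoebox dimensions in metres
source, listener = (1.0, 2.0, 1.5), (4.0, 2.0, 1.5)
for img in first_order_images(source, room):
    print(reflection_params(img, listener))
```

Higher orders repeat the mirroring on each image, which is where the combinatorial growth comes from: the 6 first-order images spawn up to 30 second-order and 180 third-order images.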

Figure 2: Reflections of the first, second and third order in a simple room model.

Second, rendering all these reflections would mean adopting a fully physical approach to sound spatialization and losing all possibility of control from the perceptual point of view: the author would, for instance, lose control over other important aspects of the sound rendering such as the reverberation time. The solution we propose consists in selecting only a small number of these reflections and using their properties to replace the corresponding values in the early part of the Spatialisateur's impulse response, which were previously based on a statistical approach.

To handle the selection of reflections, we introduced the notion of an evaluator that ranks each reflection along a given factor: so far, the factors that have been implemented include “earliness”, “direction of the source” and “anti-direction of the source”. The user then assigns arbitrary weights to each of these factors, and the selected reflections are the ones that gather the best results from the fitting functions of the evaluators. Figure 3 shows, for a given virtual scene, three different selections of early reflections according to which evaluator is given the most importance. A corresponding improvement has been made in the Spatialisateur in order to render a fixed order of reflections with a full-fledged binaural filter.

Figure 3: Different possible selections of early reflections, emphasizing time aspects (1), the direction of the source (2) or its “anti-direction” (3).

The resulting spatialization system thus keeps all the perceptual controls available in the Spatialisateur, but is fed with parameters extracted from the physical space.
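A possible sketch of this evaluator mechanism is given below. The scoring functions are hypothetical (the exact fitting functions are not specified here); only the overall scheme of weighted factor scores followed by a top-n selection reflects the approach described above.

```python
def earliness(refl):
    """Favor reflections that arrive soon after the direct sound."""
    return 1.0 / (1.0 + refl["delay"])

def direction(refl, source_azimuth):
    """Favor reflections arriving from near the direct-sound azimuth."""
    diff = abs((refl["azimuth"] - source_azimuth + 180.0) % 360.0 - 180.0)
    return 1.0 - diff / 180.0

def anti_direction(refl, source_azimuth):
    """Favor reflections arriving from opposite the direct sound."""
    return 1.0 - direction(refl, source_azimuth)

def select_reflections(reflections, weights, source_azimuth, n=4):
    """Rank reflections by a weighted sum of evaluator scores,
    then keep the n best-scoring ones."""
    def score(r):
        return (weights.get("earliness", 0.0) * earliness(r)
                + weights.get("direction", 0.0) * direction(r, source_azimuth)
                + weights.get("anti_direction", 0.0) * anti_direction(r, source_azimuth))
    return sorted(reflections, key=score, reverse=True)[:n]

# Toy reflection list: delays in seconds, azimuths in degrees.
refl = [
    {"delay": 0.004, "azimuth": 10.0},
    {"delay": 0.012, "azimuth": 170.0},
    {"delay": 0.020, "azimuth": -30.0},
]
print(select_reflections(refl, {"earliness": 1.0}, source_azimuth=0.0, n=2))
```

Because the weights are free parameters, the author keeps a perceptual-style handle on an otherwise physical computation, which is the point of the mixed approach.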

5. DISCUSSION

We are now conducting a series of perceptual tests, such as an “Audio Augmented Reality” version of the well-known “Localization of sound in rooms” experiment carried out by Hartmann in 1983 [6], in order to assess the impact of this combined physical and perceptual approach. This first test aims at evaluating the accuracy of the audio rendering in terms of angular precision and resolution when the listener's task is to estimate the azimuth of a sound source. We reproduced for this test the experimental conditions set up by Hartmann in order to compare our results with real physical listening conditions. Further investigations will allow evaluating our system under other aspects of perception, such as the notion of distance, which seems to be one of the aspects of spatial sound simulation that have to be handled particularly carefully for audio augmented realities.

6. REFERENCES

[1] G. Eckel, “Immersive Audio-Augmented Environments: The LISTEN Project”, in Proceedings of the 5th International Conference on Information Visualization (IV2001), IEEE Computer Society Press, Los Alamitos, CA, USA, 2001.

[2] J.-M. Jot, “Real-time spatial processing for music, multimedia and interactive human-computer interfaces”, Multimedia Systems, vol. 7, pp. 55-69, Springer-Verlag, 1999.

[3] V. Larcher et al., “Study and Comparison of efficient methods for 3D Audio Spatialization based on linear decomposition of HRTF Data”, Preprint of the 108th AES Convention, Paris, Feb. 2000.

[4] E. Rio, G. Vandernoot, O. Warusfel, “Perceptual evaluation of weighted multichannel binaural format”, submitted to the DAFx-03 conference.

[5] O. Delerue, O. Warusfel, “Authoring of virtual sound scenes in the context of the LISTEN project”, in Proceedings of the 22nd Audio Engineering Society Conference (AES 22), Espoo, Finland, June 2002.

[6] W. M. Hartmann, “Localization of sound in rooms”, Journal of the Acoustical Society of America, vol. 74, pp. 1380-1391, 1983.
