Proposal for a post-doctoral research position (contract researcher)

Title: Analysis-synthesis of environmental sounds with sparse decompositions.
Laboratory: Laboratory of Mechanics and Acoustics (LMA), CNRS UPR 7051, Marseille, France.
Duration: 12 months with possible extension, available immediately.
Funding: ANR Contint PHYSIS.
Supervisor: Olivier DERRIEN – [email protected]

Profile: The contract researcher will hold a PhD in applied mathematics or signal processing, with experience in sparse decompositions and preferably also in dictionary learning. Knowledge of audio signal processing will be appreciated, but is not mandatory.

Detailed specification

1. The ANR PHYSIS project

PHYSIS is a research project funded by the French National Research Agency (ANR). It is centered on the modeling, transformation and synthesis of sounds for interactive virtual worlds (video games, simulations, serious games...) and augmented reality. The recent emergence of complex video games and virtual universes has increased the need for new and efficient solutions that generate sounds automatically, rather than recording all sounds in advance and playing them back in an almost non-interactive way. Sound in today's video games is still produced from pre-recorded audio, whereas images are synthesized entirely in real time. The rise in available processing power now makes it possible to simulate precisely the audio and acoustic properties of everyday sounds from physical parameters.

2. The atomic approach for synthesizing environmental sounds

The so-called atomic approach is an efficient method for synthesizing some classes of environmental sounds. It consists in combining elementary waveforms (called atoms), taken from a relatively small dictionary, in a pseudo-random order.
It makes it possible to synthesize realistic sounds such as wind, rain, fire and waves. However, this approach requires the atoms and their combination laws to be defined a priori, so the intervention of an expert supervisor is necessary. The PHYSIS project aims at extending the possibilities of atomic synthesis through a better understanding of the structure of real environmental sounds.

3. Synthesis-by-analysis

Our approach is centered on the analysis of recorded (real) environmental sounds. We aim at characterizing the atoms and combination laws that allow the most efficient resynthesis. As a first step, we only consider deterministic atoms and linear combinations. From an academic point of view, this approach is related to the so-called sparse decomposition methods with redundant dictionaries. This includes the decomposition algorithms themselves, but also the learning of the dictionary. These methods have been studied extensively in recent years and have given rise to a large body of literature.
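As an illustration of the linear, deterministic model described in section 3, the following minimal sketch decomposes a toy signal over a small redundant dictionary with plain Matching Pursuit. The dictionary of damped sinusoids, its frequency and decay values, and the helper names (`dot`, `damped_sinusoid`, `matching_pursuit`) are assumptions made for the example only; they are not taken from the PHYSIS project.

```python
import math

def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def damped_sinusoid(n, freq, decay):
    """Unit-norm atom exp(-decay*t) * sin(2*pi*freq*t), t = 0..n-1 (hypothetical choice)."""
    atom = [math.exp(-decay * t) * math.sin(2 * math.pi * freq * t) for t in range(n)]
    norm = math.sqrt(dot(atom, atom))
    return [a / norm for a in atom]

def matching_pursuit(signal, dictionary, n_iter):
    """Greedy decomposition: at each step, pick the atom most correlated with the residual."""
    residual = list(signal)
    picks = []  # list of (atom index, coefficient)
    for _ in range(n_iter):
        corr = [dot(residual, atom) for atom in dictionary]
        k = max(range(len(dictionary)), key=lambda i: abs(corr[i]))
        # Subtract the projection of the residual onto the selected unit-norm atom.
        residual = [r - corr[k] * a for r, a in zip(residual, dictionary[k])]
        picks.append((k, corr[k]))
    return picks, residual

N = 64
# Redundant (non-orthogonal) dictionary: 4 frequencies x 2 decay rates.
dictionary = [damped_sinusoid(N, f, d) for f in (0.05, 0.1, 0.2, 0.3) for d in (0.02, 0.1)]
# Toy "recorded" signal: a linear combination of two atoms.
signal = [2.0 * a + 0.5 * b for a, b in zip(dictionary[1], dictionary[6])]

picks, residual = matching_pursuit(signal, dictionary, n_iter=5)
print("selected (atom, coefficient):", picks)
print("residual energy:", dot(residual, residual))
```

Because the atoms are not orthogonal, the selected atoms need not coincide with the two used to build the signal. The residual energy nevertheless cannot increase from one iteration to the next, and the selected atoms plus the residual always add up to the original signal.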

4. Sparse decompositions with redundant dictionaries

Sparse decompositions can be classified in two main categories:
- Decompositions on a priori dictionaries. Only the combination laws have to be determined (typically using algorithms derived from Matching Pursuit). This approach can be efficient for classes of signals for which prior knowledge is available from physical models.
- Decompositions on a posteriori dictionaries. The dictionary is learned, for instance with the K-SVD algorithm. Both the optimal set of atoms and the combination laws then have to be determined. Recent studies have shown that efficient methods are available to solve this class of problems, but this does not guarantee an efficient and intuitive control of the synthesis process, especially when there is no straightforward relationship between the waveforms of the atoms and timbre descriptors.

Another approach consists of defining parametric dictionaries: the analytic expression of the waveforms is set (in our case from physical models), but the parameter values are unknown. We plan to use learning algorithms to determine the best parameter values, according to a perceptual criterion for the quality of the synthesized sound. This approach seems preferable to the full a posteriori approach because atom parameters can be related to timbre descriptors, which preserves the possibility of an intuitive control of the synthesizer.

5. Goals of the study

The study that we propose has three main goals:
- In a first step, we plan to apply existing sparse decomposition methods to several classes of environmental sounds for which physical models are already available. Preferably, a perceptual quality criterion should be optimized. Our goal is to improve our understanding of the phenomena involved in the production of environmental sounds.
ñ In a second step, we plan to use this knowledge to design relevant parametric dictionaries and optimize the parameter values with learning algorithms. ñ Finally, we aim at extending the sparse decomposition methods to dictionaries of stochastic atoms, i.e. which are not defined by their exact waveform, but though the shape of their energetic envelope, both in time and frequency domains.
