Perceptual evaluation of vibrato features: the case of saxophone sounds
Prof. Catherine Guastavino & Vincent Verfaille
School of Information Studies; Schulich School of Music, SPCL/IDMIL
CIRMMT, Centre for Interdisciplinary Research in Music Media and Technology
SMPC07, Concordia University, Montréal, Qc, Canada, July 30th – Aug 3rd, 2007
V. Verfaille & C. Guastavino
Perceptual evaluation of vibrato features: the case of saxophone sounds
Outline of the talk
1 introduction
  – vibrato definition
  – spectral envelope modulations?
2 rationale
  – vibrato features
  – signal processing model(s)
3 perceptual evaluations
  – previous experiment
  – new experiment: procedure, preference results
4 conclusion & future work
Introduction
Definition: vibrato is
– a “pulsation in pitch, intensity and timbre” [Seashore, 1932]
– a vibrating quality of musical sounds, corresponding to:
  i) fundamental frequency modulations (FM)
  ii) global amplitude modulations (AM)
  iii) spectral envelope modulations (SEM)
  which alone or in combination enrich timbre
– various studies on vibrato perception: rate, deviation, mean pitch perception, vibrato waveform, temporal evolution
– several signal processing models
– some questions about spectral envelope modulations:
  – are they real? for which instruments? (observed for voice by [Maher & Beauchamp, 1990])
  – what is their impact on vibrato perception?
Rationale Vibrato features
Fundamental Frequency Modulations (FM)
Changes of fundamental frequency (and perceived pitch)
⇒ the harmonics’ amplitudes are modulated as the harmonics sweep the spectral envelope (SE)
Ex: voice [Sundberg, 1987], bowed strings [Mathews & Kohut, 1973]
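The way FM alone produces harmonic-specific amplitude modulations can be sketched as follows. This is only an illustration: the single-resonance envelope, the 6 Hz rate, the 3% deviation and all function names are assumptions made for the example, not values or code from the study.

```python
import math

def spectral_envelope(freq_hz, centre_hz=1000.0, width_hz=300.0):
    """Hypothetical fixed spectral envelope: a single Gaussian-shaped resonance."""
    return math.exp(-0.5 * ((freq_hz - centre_hz) / width_hz) ** 2)

def fm_track(mean_f0_hz=220.0, rate_hz=6.0, deviation=0.03,
             frame_rate_hz=100, duration_s=1.0):
    """Sinusoidal f0 trajectory (illustrative rate and relative deviation)."""
    n_frames = int(frame_rate_hz * duration_s)
    return [mean_f0_hz * (1.0 + deviation *
                          math.sin(2.0 * math.pi * rate_hz * i / frame_rate_hz))
            for i in range(n_frames)]

def modulation_depths(f0_track, n_harmonics=8):
    """Per-harmonic AM depth, (max - min) / (max + min), induced by FM alone."""
    depths = []
    for k in range(1, n_harmonics + 1):
        # Harmonic k sits at k * f0 and takes its amplitude from the fixed envelope.
        amps = [spectral_envelope(k * f0) for f0 in f0_track]
        depths.append((max(amps) - min(amps)) / (max(amps) + min(amps)))
    return depths

depths = modulation_depths(fm_track())
for k, depth in enumerate(depths, start=1):
    print(f"harmonic {k}: AM depth {depth:.3f}")
```

Because the envelope is fixed, each harmonic ends up with its own modulation depth, depending on the local slope of the envelope where it sweeps.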
Global Amplitude Modulations (AM)
Changes of sound intensity (and loudness)
⇒ same amplitude modulations for all harmonics
Ex: woodwinds (saxophone, flute)
Spectral Envelope Modulations (SEM) with implicit AM
Spectral enrichment / spectral centroid (and brightness)
⇒ second source of harmonics’ amplitude modulations
Ex: brass (trumpet), woodwinds and voice
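To make the distinction between global AM and SEM concrete, here is a minimal sketch using the spectral centroid as a brightness descriptor. The power-law envelope, its `tilt` parameter and the function names are hypothetical choices for the illustration only.

```python
def spectral_centroid(freqs_hz, amps):
    """Amplitude-weighted mean frequency of a set of harmonics."""
    return sum(f * a for f, a in zip(freqs_hz, amps)) / sum(amps)

def tilted_amps(tilt, n_harmonics=10):
    """Hypothetical envelope: the amplitude of harmonic k decays as k ** (-tilt)."""
    return [k ** (-tilt) for k in range(1, n_harmonics + 1)]

f0_hz = 220.0
freqs = [k * f0_hz for k in range(1, 11)]

base = tilted_amps(2.0)
louder = [1.5 * a for a in base]   # global AM: one gain applied to every harmonic
brighter = tilted_amps(1.5)        # SEM: the envelope itself gets flatter

c_base = spectral_centroid(freqs, base)
c_louder = spectral_centroid(freqs, louder)
c_brighter = spectral_centroid(freqs, brighter)
print(f"base {c_base:.1f} Hz, global AM {c_louder:.1f} Hz, SEM {c_brighter:.1f} Hz")
```

A global gain leaves the centroid unchanged, whereas a change of envelope shape moves it, which is why SEM is associated with brightness and acts as a second source of harmonic amplitude modulations.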
[Figure: vibrato features on a saxophone sound]
Rationale Signal processing model(s)
Vibrato models:
– allow for analysis, transformation, synthesis
– are traditionally based on instrument-specific features
– only consider FM & AM, except:
  – two-level sinusoidal model (TLSM) [Marchand & Raspaud, 2004]: harmonics’ (a_i[n], f_i[n]) as sums of sinusoids (implicit SEM)
  – panned-wavetable synthesis (PWS) technique [Maher & Beauchamp, 1990]: explicit SEM
We developed a vibrato model [Verfaille, Guastavino & Depalle, 2005] that:
– combines TLSM (vibrato analysis & synthesis) and PWS (SEM computation)
– accounts for a great diversity of vibrato behaviors
Differences:
– hypothesis: only the 1st sinusoidal component of the vibrato is perceived
– cross-synthesis between flat and vibrato sounds:
  + jitter of (a_i[n], f_i[n]) in the flat sound
  + AM, FM = first component from the vibrato sound
  + SEM modulated by interpolation from the vibrato sound
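The hypothesis that only the first sinusoidal component of the vibrato matters can be illustrated by extracting the strongest non-DC sinusoid from a control curve such as f_i[n]. The sketch below uses a plain DFT and a hypothetical function name; it is not the TLSM analysis itself, whose details are not given here.

```python
import cmath
import math

def first_sinusoidal_component(track, frame_rate_hz):
    """Return (mean, rate_hz, amplitude, phase) of the strongest non-DC DFT bin.

    The component is mean + amplitude * cos(2*pi*rate_hz*t + phase).
    """
    n = len(track)
    mean = sum(track) / n
    centred = [x - mean for x in track]
    best_bin, best_coeff = 1, 0j
    for k in range(1, n // 2):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centred))
        if abs(coeff) > abs(best_coeff):
            best_bin, best_coeff = k, coeff
    rate_hz = best_bin * frame_rate_hz / n
    amplitude = 2.0 * abs(best_coeff) / n
    return mean, rate_hz, amplitude, cmath.phase(best_coeff)

# Illustrative f0 track: a 6 Hz vibrato plus a weaker 12 Hz component.
frame_rate_hz = 100
track = [220.0
         + 5.0 * math.sin(2.0 * math.pi * 6.0 * i / frame_rate_hz)
         + 1.0 * math.sin(2.0 * math.pi * 12.0 * i / frame_rate_hz)
         for i in range(200)]
mean, rate_hz, amplitude, phase = first_sinusoidal_component(track, frame_rate_hz)
print(f"mean {mean:.1f} Hz, rate {rate_hz:.1f} Hz, deviation {amplitude:.2f} Hz")
```

Keeping only this component discards the weaker 12 Hz term, which is the kind of simplification the hypothesis implies. The direct DFT is quadratic in the track length, which is acceptable for short control curves.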
Perceptual evaluations Previous experiment
Previous experiments about SEM
– no SEM: prominence of harmonics’ AM over FM for the perceived quality of vibrato (violin) [Mellody & Wakefield, 2000]
– question: does SEM also contribute to perceived quality?
– saxophone sounds
– AB comparison task on pairs of sounds:
  – original AM/FM + constant average SE (same as [Mellody & Wakefield, 2000])
  – original AM/FM/SEM
– evaluation: which one sounds the most natural?
– result (8 listeners): sounds more natural with SEM (p < 0.001) [Verfaille, Guastavino & Depalle, 2005]
Perceptual evaluations New experiment: procedure
Procedure for the new experiment
– questions:
  – How does each type of modulation contribute to the perceived naturalness of saxophone sounds?
  – Which one sounds the most natural?
– 4 versions considered for each note (F3, B3, D4, G4, C5):

  vibrato feature(s)                      notes
  original AM/FM/SEM                      F3  B3  D4  G4  C5
  AM only (cross-synthesis)               F3  B3  D4  G4  C5
  FM only (cross-synthesis)               F3  B3  D4  G4  C5
  modeled AM/FM/SEM (cross-synthesis)     F3  B3  D4  G4  C5
  original flat sound                     F3  B3  D4  G4  C5

– subjectively matched for loudness (pre-test, 3 experts)
– AB comparison for all possible pairs: 14 subjects, 2 × 6 pairs × 5 notes = 60 trials / subject
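The trial count can be checked with a short enumeration. This sketch assumes that the factor 2 stands for the two presentation orders (A-B and B-A) of each pair, which the slide does not state explicitly; the variable names are illustrative.

```python
from itertools import combinations

versions = ["original AM/FM/SEM", "AM only", "FM only", "modeled AM/FM/SEM"]
notes = ["F3", "B3", "D4", "G4", "C5"]

# All unordered pairs of the 4 versions: C(4, 2) = 6.
pairs = list(combinations(versions, 2))

# Assumption: each pair is presented in both orders for every note.
trials = [(note, a, b)
          for note in notes
          for (x, y) in pairs
          for (a, b) in ((x, y), (y, x))]

n_subjects = 14
total_judgements = n_subjects * len(trials)
print(len(pairs), "pairs,", len(trials), "trials per subject,",
      total_judgements, "judgements in total")
```

The same arithmetic gives 6 × 60 = 360 judgements for the 6 novices and 8 × 60 = 480 for the 8 experts, matching the n values reported with the results.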
Perceptual evaluations New experiment: results
Binomial test: 14 subjects (n = 840)
[Figure: bar chart of the number of times each version was selected (# times selected) for the six pairs: orig. SEM vs. FM, orig. SEM vs. AM, orig. SEM vs. mod. SEM, FM vs. AM, FM vs. mod. SEM, AM vs. mod. SEM; all six comparisons marked n.s.]
No significant difference... Really? ⇒ not if we take into account the expertise!
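For reference, a binomial test on one pair can be sketched as below. With 840 judgements over 6 pairs, each pair is judged 140 times across all subjects; under the null hypothesis both versions are equally likely to be chosen. The exact two-sided variant and the counts used here are assumptions for illustration, not the authors' procedure or data.

```python
from math import comb

def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial test: sum the probabilities of all outcomes
    that are no more likely than the observed count k."""
    pmf = [comb(n, i) * p ** i * (1.0 - p) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1.0 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(q for q in pmf if q <= threshold))

n_per_pair = 840 // 6  # 140 judgements per pair over all 14 subjects

# Hypothetical counts of how often the first version of a pair was chosen.
for k in (70, 78, 90):
    print(f"{k}/{n_per_pair} selections: p = {binomial_two_sided_p(k, n_per_pair):.4f}")
```

An even split gives a p-value of 1, and the p-value shrinks as the split becomes more lopsided, which is how a preference within a pair is flagged as significant or n.s.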
Binomial test: novices
6 novices (n = 360)
[Figure: bar chart of the number of times each version was selected (# times selected) for the six pairs: orig. SEM vs. FM, orig. SEM vs. AM, orig. SEM vs. mod. SEM, FM vs. AM, FM vs. mod. SEM, AM vs. mod. SEM; three comparisons marked n.s.]
⇒ novices significantly prefer FM (p = 0.001), AM (p < 0.025) and modeled AM/FM/SEM (p = 0.001) over original AM/FM/SEM.
Comments from novices: “Too much vibrato, exaggerated” / “I’m just used to sax without vibrato!” / “I chose the one with less obvious vibrato”
Binomial test: experts
8 experts (n = 480)
[Figure: bar chart of the number of times each version was selected (# times selected) for the six pairs: orig. SEM vs. FM, orig. SEM vs. AM, orig. SEM vs. mod. SEM, FM vs. AM, FM vs. mod. SEM, AM vs. mod. SEM; one comparison marked n.s.]
– experts significantly prefer original AM/FM/SEM over FM (p = 0.005), AM (p < 0.0001) and modeled AM/FM/SEM (p = 0.001)
– experts selected FM significantly less often than AM (p = 0.001) or modeled AM/FM/SEM (p = 0.002)
Chi-square: differences between groups (effect of expertise)
[Figure: bar chart of % time selected by experts and novices for each version (FM, AM, mod. AM/FM/SEM, orig. AM/FM/SEM); two comparisons marked n.s.]
over all pairs: χ²(3) = 42.11 (p < 0.001) ⇒ very significant effect of expertise!
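A chi-square test of this kind compares how often each of the 4 versions was selected in the two groups, giving (2 − 1) × (4 − 1) = 3 degrees of freedom. The sketch below uses a hypothetical contingency table, since the raw counts are not given here, and a closed form of the chi-square survival function that is valid for 3 degrees of freedom only.

```python
import math

def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

def chi2_sf_3dof(x):
    """P(X > x) for a chi-square variable with 3 degrees of freedom (closed form)."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))

# Hypothetical selection counts (FM, AM, mod. AM/FM/SEM, orig. AM/FM/SEM).
table = [[90, 120, 110, 160],   # experts: 480 judgements
         [105, 95, 100, 60]]    # novices: 360 judgements
stat, dof = chi_square_statistic(table)
print(f"hypothetical table: chi2({dof}) = {stat:.2f}, p = {chi2_sf_3dof(stat):.2g}")

# p-value implied by the reported statistic.
p_reported = chi2_sf_3dof(42.11)
print(f"reported chi2(3) = 42.11: p = {p_reported:.2g}")
```

Evaluating the survival function at the reported statistic confirms that the associated p-value lies well below 0.001.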
Conclusion and future research
Conclusion
– everyone can hear the difference between AM/FM/SEM and AM or FM only ⇒ we need to model SEM
– SEM: more depth, irregular, more pronounced
– but “naturalness” depends on expertise:
  – experts:
    – original SEM sounds the most natural
    – FM sounds the least natural
  – novices:
    – original SEM sounds the least natural
    – FM sounds the most natural
Future research
– perception:
  – experiments for flute, trumpet, voice, violin
  – dissimilarity ratings, investigate perceptual dimensions
– signal model:
  – improve the attack modelling
  – consider the vibrato waveform (not mono-sinusoidal)
  – develop an SEM model other than interpolation (PWS technique)
  – provide clearer control parameters
Selective bibliography
Maher, R. C., & Beauchamp, J. 1990. An Investigation of Vocal Vibrato for Synthesis. Applied Acoustics, 30, 219–45.
Marchand, S., & Raspaud, M. 2004. Enhanced Time-Stretching Using Order-2 Sinusoidal Modeling. Pages 76–82 of: Proc. Int. Conf. on Digital Audio Effects (DAFx-04), Naples, Italy.
Mathews, M., & Kohut, J. 1973. Electronic Simulation of Violin Resonances. J. Acoust. Soc. Am., 53(6), 1620–6.
Sundberg, J. 1987. The Science of the Singing Voice. Dekalb, IL: Northern Illinois University Press.
Verfaille, V., Guastavino, C., & Depalle, Ph. 2005. Perceptual Evaluation of Vibrato Models. Colloquium on Interdisciplinary Musicology, Montréal (CIM’05).