Are young male speakers losing Tone 3 breathiness in Shanghai Chinese? An acoustic and electroglottographic study.
SRPP – LPP, November 8th 2013
Jiayin Gao1 & Pierre Hallé1,2
1. Laboratoire de Phonétique et Phonologie 2. Laboratoire Mémoire et Cognition
Shanghai Chinese
Nearly 14 million speakers.
Belongs to the Wu family (spoken in southern Jiangsu, Shanghai, Zhejiang, etc.; nearly 80 million speakers).
Characterized phonologically by its three-way laryngeal contrast:
  voiceless unaspirated (全清: fully clear)
  voiceless aspirated (次清: secondary clear)
  voiced (全濁: fully muddy)
* 次濁 (secondary muddy) is used to describe ‘sonorants’.
Three-way laryngeal contrast in SH
Comparison with other dialects.

Character   Meaning      Mandarin    Cantonese    Shanghai
攤          spread out   /tʰan/1     /tʰa:n/1     /tʰɛ/1
丹          cinnabar     /tan/1      /ta:n/1      /tɛ/1
蛋          egg          /tan/4      /ta:n/6      /dɛ/3

* The numbers represent tone categories, not tone values.
Phonological association between onsets and tones in Shanghai Chinese.
Onsets: p, pʰ, t, tʰ, ts, tsʰ, tɕ, tɕʰ, k, kʰ, f, s, ɕ, h, l, m, n, Ø
  Tones: 53 (T1), 34 (T2), 5 (T4): yin tones (high)
Onsets: b, d, (dz), dʑ, g, v, z, ʑ, l, m, n, Ø
  Tones: 23 (T3), 2 (T5): yang tones (low)
Voiced onsets with Muddy airflow: impressionistic descriptions
Les b des dialectes Wou, comme les autres occlusives sonores de ces dialectes – explosives tant qu’affriquées – sont accompagnés, à la détente, d’une aspiration sonore. En réalité, celle-ci est tout à fait identique au phonème initial du sanscrit bharati. Cependant l’aspiration des dialectes Wou est, à mon avis, trop faible pour mériter d’être désignée. (Karlgren, 1915-1926: 260)
Translation: The b of the Wu dialects, like the other voiced stops of these dialects (plosives as well as affricates), is accompanied at the release by a voiced aspiration. In fact, this aspiration is exactly identical to the initial phoneme of Sanskrit bharati. However, the aspiration in the Wu dialects is, in my opinion, too weak to merit being marked.
The ancient sonants 並, 定, 擎, 牀, etc. (or aspirated sonants, according to Karlgren) remain as sonants or apparent sonants. The real nature of these initials, as was first noticed by Dr. Liou Fuh (“Fu Liu”), and later verified experimentally by the present writer, is that they begin with a quite voiceless sound and only finish with a voiced glide, usually quite aspirated, in the form of a voiced h. In the case of fricatives and affricatives (順, 騎), the second half may be voiced; in plosives (旁), there is usually no voice at all until the explosion takes place (…) However, in intervocalic positions, all the quasi-voiced initials become true voiced sounds. (Chao, 1928: xii)
在這次所做過的方音裡,大多數把這個讀成‘清音濁流’的音。例如‘bh’讀如[下圈b加彎頭h],或[p加彎頭h],又如‘z’母讀如德文派的[sz],或[s加彎頭h]。(Chao, 1928: 21)
Translation: In this dialectological study, most speakers pronounce (the voiced obstruents) as ‘clear sounds with muddy airflow’. For example, ‘bh’ is pronounced as [b̥ɦ] or [pɦ], and ‘z’ as [sz], as in German, or [sɦ].
Question 1: How can we describe this ‘muddy airflow’?
“Muddy”: used in Chinese phonology to describe what we call today “voiced” or “sonant”, in contrast with “clear”.
  Perhaps “muddy” implies some special voice quality?
“Muddy airflow” has been described as:
  breathy voice, breathiness (e.g., Sherard, 1972; Norman, 1988), the term used in this study
  slack voice (Ladefoged & Maddieson, 1996; Chen & Downing, 2011)
  depression, “depressor” (Rose, 1989, 2001; Chen & Downing, 2011)
Voiced onsets with Muddy airflow: experimental investigations
Yes, breathier!
  H1-H2 (Cao & Maddieson, 1992; Gao & Hallé, 2012): higher for “muddy”
  H1-A1 (Cao & Maddieson, 1992): higher for “muddy”
  Fiberoptic transillumination (1 speaker) (Ren, 1988): maximal glottal openness (MGO): aspirated > “muddy” > voiceless
No evidence for breathiness
  ePGG (1 speaker) (Gao et al., 2011): MGO: aspirated > “muddy” ≈ voiceless
  AF/AP (ratio of air flow to air pressure) (Cao & Maddieson, 1992): no difference between “muddy” and voiceless
  Positional effect (on second syllable): H1-H2 (Chen, 2011; Gao & Hallé, 2012): no or reversed difference between “muddy” and voiceless
Question 2: Where does ‘muddy airflow’ come from?
Old Chinese (voiceless onset / voiced onset) > Early Middle Chinese tone (loss of coda > contour tones):
  p-V, p-N / b-V, b-N > 平 even tone
  p-ʔ / b-ʔ > 上 rising tone
  p-s > p-h / b-s > b-h > 去 departing tone
  p-p,t,k / b-p,t,k > 入 entering tone
Early Middle Chinese > Late Middle Chinese (loss of onsets’ voicing > register tones; TONE SPLIT: Haudricourt 1954):
  平 even > 阴平 Yin even, 阳平 Yang even
  上 rising > 阴上 Yin rising, 阳上 Yang rising
  去 departing > 阴去 Yin departing, 阳去 Yang departing
  入 entering > 阴入 Yin entering, 阳入 Yang entering
Question 2: Where does ‘muddy airflow’ come from? PHONATION DIFFERENCE
Old Chinese (p-V, p-N / b-V, b-N; p-ʔ / b-ʔ; p-s > p-h / b-s > b-h; p-p,t,k / b-p,t,k) > Early Middle Chinese (平 even, 上 rising, 去 departing, 入 entering): loss of coda > contour tones.
Early Middle Chinese > Late Middle Chinese: loss of onsets’ voicing > register tones (TONE SPLIT: Haudricourt 1954), with a phonation difference between the two registers:
  modal: 阴平 Yin even, 阴上 Yin rising, 阴去 Yin departing, 阴入 Yin entering
  breathy: 阳平 Yang even, 阳上 Yang rising, 阳去 Yang departing, 阳入 Yang entering
Voicing contrast > tone contrast
F0 lowering of voiced onsets
  aerodynamic (automatic): high air flow after voiceless, low air flow after voiced (Ohala, 1973)
  articulatory (automatic): slack (≠ stiff) vocal folds facilitate voicing and lower the F0 (Halle & Stevens, 1971); larynx height is lower for voiced than voiceless stops (e.g., Ewan, 1976), and the vertical movement of the larynx changes F0 (Honda et al., 1999)
  perceptual (controlled): deliberate use of low F0 enhances the percept of the presence of low-frequency energy in and near the stop and conveys the [voice] information (Kingston & Diehl, 1994, 1995; Kingston et al., 2008)
Phonologization of the F0 difference: voicing contrast reinterpreted as a (register) tone contrast (high vs. low)
Voicing contrast > phonation contrast > tone contrast
Phonetic motivations
  aerodynamic (automatic): in breathy voiced consonants, the vocal folds are not closely adducted and the Bernoulli force is weak => lower subglottal pressure at the stop’s release => lower F0 (Ohala & Ohala, 1972; Hombert et al., 1979)
  articulatory (automatic): breathiness is produced with abducted arytenoid cartilages with less forceful contraction (Ohala, 1973), and is typically (not always) accompanied by a lowering of the larynx that correlates with the lowering of the F0 (Thurgood, 2002)
  perceptual (controlled): deliberate use of breathiness (and low F0) to participate in the integrated percept of the presence of low-frequency energy, i.e. the [voice] information (Kingston, 2011)
Voicing contrast > phonation contrast > tone contrast
Evidence from the development of Middle Chinese and from language contact (Pulleyblank, 1977):
  In the level tone, Middle Chinese breathiness was replaced by voiceless aspiration of the onsets in Mandarin and in Cantonese.
  In Vietnamese, the huyen tone (corresponding to Chinese low level) has a breathy quality and the ngang tone (high level) has a clear quality (Thompson, 1965).
Evidence from other languages (among others):
  Mon-Khmer: voicing contrast > phonation contrast (head register/modal vs. chest register/breathy) (Shorto, 1967; Henderson, 1952; Wayland & Jongman, 2003).
  East Cham: two registers combining pitch, phonation, duration and vowel quality (high tone & modal voice vs. low tone & breathy voice) (Edmondson & Gregerson, 1992; Brunelle, 2005, 2006).
Summary of previous studies
Question 1: articulatory and acoustic descriptions of “muddy airflow”
  Acoustic findings of H1-H2, H1-A1 (higher for “muddy” than voiceless)
  Physiological findings of fiberoptic transillumination (MGO greater for “muddy” than voiceless)
Question 2: origin of the “muddy airflow”
  Historical leftover of breathiness in Middle Chinese (not due to a synchronic coarticulatory effect)
  automatic vs. controlled
Purpose of this study
To provide EGG data for the “muddy” series.
  Electroglottograph (EGG): degree of contact of the vocal folds during phonation => “open quotient” as an indication of voice quality.
  Has been used to study languages with phonation contrasts: e.g., Vietnamese, Naxi (Michaud, 2005), Tamang (Mazaudon & Michaud, 2006; Mazaudon, 2012), White Hmong (Esposito, 2012), etc.
To gain insight into the evolution of “muddy” voice in Shanghai Chinese during the last century.
  Shanghai Chinese as spoken in the urban area has changed the most rapidly of all the Wu dialects since the second half of the 19th century, through contact with migrant dialects and, more recently, with Mandarin Chinese, which lacks breathiness.
Method
Participants: 11 native speakers of Shanghai Chinese, born and raised in the Shanghai urban area
  Elderly group: 3M + 1F, mean age 67.3, range 64-72
  Young group: 4M + 3F, mean age 24.9, range 24-28
Speech materials

Onset       T1               T2               T3
Zero        (ʔ)ɛ 哀          (ʔ)ɛ 爱          ɛ 咸
Stop        pɛ 杯, tɛ 堆     pɛ 板, tɛ 胆     bɛ 办, dɛ 谈
Fricative   fɛ 翻, sɛ 三     fɛ 反, sɛ 伞     vɛ 饭, zɛ 才
Nasal       (ʔ)mɛ 蛮         (ʔ)mɛ 美         mɛ 梅

Frame sentence: /__ gə ə zɨ ŋo nin tə ə/ (“__, this character, I know it”), written in Chinese characters
Method: EGG
Calculate the dEGG signal (derivative of the EGG; Henrich et al., 2003) with a semi-automatic Matlab program, “peakdet.m” (Michaud).
http://voiceresearch.free.fr/egg/
In the case of double or multiple peaks, the largest peak was retained.
Obtain F0 and OQ values for each glottal period in the /ɛ/ vowel.
Interpolate F0 and OQ values every 5 ms.
Compute average F0 and OQ for each consecutive fifth of the vowel’s duration.
=> F0 and OQ at five positions
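The EGG measurement steps (per-cycle F0 and OQ, interpolation every 5 ms, averages over fifths of the vowel) might be sketched as follows. This is a minimal Python illustration, not the “peakdet.m” program used in the study. It assumes that the closing and opening instants have already been detected as the positive and negative dEGG peaks, and that OQ is the open phase (opening instant to next closing instant) divided by the glottal period. The function names `f0_and_oq` and `mean_per_fifth` are hypothetical.

```python
import numpy as np

def f0_and_oq(closing_times, opening_times):
    """Per-cycle F0 (Hz) and open quotient from dEGG peak times (in seconds).

    closing_times: instants of positive dEGG peaks (glottal closure), sorted.
    opening_times: instants of negative dEGG peaks (glottal opening), sorted.
    Assumption: one glottal cycle runs from one closing instant to the next.
    """
    closing = np.asarray(closing_times, dtype=float)
    opening = np.asarray(opening_times, dtype=float)
    times, f0, oq = [], [], []
    for start, end in zip(closing[:-1], closing[1:]):
        inside = opening[(opening > start) & (opening < end)]
        if inside.size != 1:
            continue  # skip cycles with a missing or ambiguous opening peak
        period = end - start
        times.append(start)
        f0.append(1.0 / period)
        oq.append((end - inside[0]) / period)  # open phase / period
    return np.array(times), np.array(f0), np.array(oq)

def mean_per_fifth(times, values, vowel_start, vowel_end, step=0.005):
    """Interpolate every `step` seconds, then average within each fifth."""
    grid = np.arange(vowel_start, vowel_end, step)
    interpolated = np.interp(grid, times, values)  # edges are held constant
    edges = np.linspace(vowel_start, vowel_end, 6)
    # Index (0..4) of the fifth that each grid point falls into.
    fifth = np.clip(np.searchsorted(edges, grid, side="right") - 1, 0, 4)
    return np.array([interpolated[fifth == k].mean() for k in range(5)])
```

For a vowel with closing instants every 10 ms and openings 4 ms after each closure, `f0_and_oq` gives F0 = 100 Hz and OQ = 0.6 for every cycle. The sketch assumes the vowel is long enough for every fifth to contain at least one 5 ms grid point.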
[Figure: from Henrich et al., 2003]
[Figure: dEGG peak detection, with the maximal peak marked]
Exclusion of the data in case of inconsistency between the 4 methods of determination of the peak: maximal peak or barycenter method, with smoothed or unsmoothed dEGG.
Method: H1-H2
Computed on 30 ms windows for each consecutive fifth of the vowel’s duration, with a Praat script (based on Gendrot).
No correction (Iseli, 2004): a consistent non-high vowel was used to avoid the effect of F1 variation on H1 and H2.
=> H1-H2 at five positions
[Figure: from Gordon & Ladefoged, 2001]
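The H1-H2 measure (amplitude of the first harmonic minus that of the second, in dB, uncorrected for formants) might be sketched as follows. This is a minimal Python illustration, not the Praat script used in the study. It assumes the F0 of the window is already known, and it picks each harmonic as the largest spectral peak within a band around the expected frequency. The function name `h1_h2` and the `search` parameter are hypothetical.

```python
import numpy as np

def h1_h2(frame, sample_rate, f0, search=0.2):
    """Uncorrected H1-H2 (dB) of one analysis window (e.g. 30 ms).

    frame: 1-D array of speech samples.
    f0: fundamental frequency of the window in Hz (assumed known).
    search: half-width of the peak search band, as a fraction of f0.
    """
    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hanning(frame.size)
    # Zero-pad to 8x the next power of two for a finer frequency grid.
    n_fft = 8 * int(2 ** np.ceil(np.log2(frame.size)))
    spectrum = np.abs(np.fft.rfft(windowed, n_fft))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    half_band = search * f0

    def harmonic_db(k):
        # Largest spectral magnitude near the k-th harmonic, in dB.
        band = (freqs >= k * f0 - half_band) & (freqs <= k * f0 + half_band)
        return 20.0 * np.log10(spectrum[band].max())

    return harmonic_db(1) - harmonic_db(2)
```

A usage example on a synthetic signal: with a 150 Hz component of amplitude 1.0 and a 300 Hz component of amplitude 0.5, the result is close to 20·log10(2), about 6 dB. On real speech, a higher H1-H2 is read as a breathier voice quality, as in the studies cited above.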
Results: fundamental frequency
[Figure: F0 (Hz) at Positions 1-5 for Tone 1, Tone 2 and Tone 3. Panel 1: 3 elderly males, F0 curves averaged across items and speakers. Panel 2: 1 elderly female, F0 curves averaged across items.]
Males’ mean: 156 Hz vs. female’s mean: 211 Hz
Results: fundamental frequency
[Figure: F0 (Hz) at Positions 1-5 for Tone 1, Tone 2 and Tone 3. 3 young females (solid lines) and 4 young males (dotted lines): F0 curves averaged across items and speakers.]
Males’ mean: 144 Hz vs. females’ mean: 251 Hz
Results: open quotient
[Figure: open quotient (%) at Positions 1-5 for Tone 1, Tone 2 and Tone 3. Panel 1: 3 elderly males, OQ at 5 positions, averaged across items and speakers. Panel 2: 1 elderly female, OQ at 5 positions, averaged across items.]
*: higher T3 than T1 and T2, p < .05.
Results: open quotient
[Figure: open quotient (%) at Positions 1-5 for Tone 1, Tone 2 and Tone 3. Panel 1: 4 young males, OQ at 5 positions, averaged across items and speakers. Panel 2: 3 young females, OQ at 5 positions, averaged across items.]
Some examples
[Figure: spectrograms (0-7000 Hz, time in s). Young female, age 25; elderly female, age 67.]
Some examples
[Figure: spectrograms (0-7000 Hz, time in s). Elderly male, age 64; elderly male, age 66.]
Results: variations in F0 and OQ
[Figure: F0 (Hz) at Positions 1-5 for Tone 1, Tone 2 and Tone 3. Panel 1: averaged F0 curves of two young males with a larger F0 range (about 80 Hz) and a higher mean F0 (163 Hz). Panel 2: averaged F0 curves of two young males with a smaller F0 range (about 50 Hz) and a lower mean F0 (125 Hz).]
Results: variations in F0 and OQ
[Figure: OQ at Positions 1-5 for Tone 1, Tone 2 and Tone 3. Panel 1: averaged OQ of two young males with a larger F0 range and higher F0. Panel 2: averaged OQ of two young males with a smaller F0 range and lower F0.]
Some examples
[Figure: spectrograms (0-7000 Hz, time in s). Young male, age 25, larger F0 range and higher F0; young male, age 28, smaller F0 range and lower F0.]
Summary of EGG results
Tone realization:
  Young female speakers have a greater F0 range and a much higher F0 than male speakers.
  Variation among young male speakers.
Open quotient:
  Positional effect: the difference between modal and “muddy” is more pronounced in the 1st than in the 2nd half of the vowel (consistent with Cao & Maddieson, 1992, but not with Rose, 1989).
  Values at the beginning: “muddy” OQ = 0.58-0.6 for speakers who show a difference between “muddy” and modal; this difference is around 0.1 for elderly males and 0.05 for young males.
  No difference for female speakers.
  Some F0-phonation tradeoff in young male speakers (cf. Risiangku Tamang: Mazaudon, 2012).
Results: H1-H2
[Figure: H1-H2 (dB) at Positions 1-5 for Tone 1, Tone 2 and Tone 3. Panel 1: 3 elderly males, H1-H2 at 5 positions, averaged across items and speakers. Panel 2: 1 elderly female, H1-H2 at 5 positions, averaged across items.]
*: higher T3 than T1 and T2, p < .05.
Results: H1-H2
[Figure: H1-H2 (dB) at Positions 1-5 for Tone 1, Tone 2 and Tone 3. Panel 1: 4 young males, H1-H2 at 5 positions, averaged across items and speakers; NO VARIATIONS between speakers with larger or smaller F0 range. Panel 2: 3 young females, H1-H2 at 5 positions, averaged across items.]
*: higher T3 than T1 and T2, p < .05.
Summary of H1-H2 results
Positional effect: more pronounced difference in the 1st than in the 2nd half of the vowel.
Values at the beginning: “muddy” H1-H2 = 3 dB for elderly males and 7 dB for young males; difference of around 5 dB between “muddy” and modal for both young and elderly males.
No difference for female speakers.
Less variation among male speakers.
General discussion
Cross-age difference:
  Among male speakers, all elderly males produce an OQ difference between “muddy” and “clear”. Among young males, only those with a smaller F0 range and lower F0 produce this difference, and it is smaller than that of the elderly males. => a trend toward loss of breathiness.
  However, H1-H2 results show a more consistent difference for all male speakers.
Cross-gender difference:
  Female speakers, elderly or young, do not use OQ or H1-H2 to distinguish the “muddy” and “clear” voice. F0 seems to be sufficient. (NB: only one elderly female)
  Female speakers usually have large F0 dynamics, which may be why they do not need a phonation difference.
Breathy voice is probably not an automatic effect of low tones in today’s Shanghai Chinese. This redundant feature is more likely to be a historical leftover.
To do list
Additional EGG and H1-H2 data
Additional analyses: H1-A1, A1-A2, etc.
Perception test: is the breathiness perceived by native speakers? And cross-linguistically?
Acknowledgement
We would like to thank Andrea Levitt and Marc Brunelle for correcting this paper, which is to be presented at the International Conference on Phonetics of the Languages in China; Alexis Michaud for adapting “peakdet.m” to our use; our dear participants; and YOU FOR YOUR ATTENTION.