Investigation of a possible process identity between DRM ... - Infoscience

May 19, 1997 - IDIAP Research Report 97-19.
IDIAP Research Report

Dalle Molle Institute for Perceptual Artificial Intelligence (IDIAP)
P.O. Box 592, Martigny, Valais, Switzerland
phone +41 27 721 77 11, fax +41 27 721 77 12
e-mail [email protected], internet http://www.idiap.ch

IDIAP-RR 97-19

Investigation of a possible process identity between DRM and Linear Filtering

Sacha KRSTULOVIC (a)

May 1997

(a) IDIAP

Abstract. The classical analogy between linear filtering and acoustical filtering by tubes is applied in the non-classical case where the tubes are made of unequal-length sections (such as the DRM case). It is shown that the filtering process identity is substantially more complicated than in the case of equal-length sections. In particular, it prevents the use of the Levinson algorithm for inverting the filtering process and recovering the tube characteristics from sound alone.


Contents

1 Introduction
2 Effect of $\mu_i = 0$ in a step of the Levinson recursion
3 Filtering process of an acoustic tube
  3.1 Fluid dynamics roots of the problem
  3.2 From fluids to signals
  3.3 Case 1: the length of the sections is uniform
  3.4 Case 2: the length of the sections is not uniform
  3.5 Relation between polynomial coefficients, reflection coefficients and the Yule-Walker equation system
4 Conclusion
A Application of the Correlation method with the RMS criterion in the case of the DRM


1 Introduction

It is traditionally recognized that the Linear Prediction Coding (LPC) modelling method is related to the acoustical filtering that occurs in a set of connected cylindrical pipes. The purpose of the present study is to work out this relationship in the case of the Distinctive Regions and Modes articulatory model (DRM), which consists precisely of a stack of connected pipes. Such a relationship could then be exploited to design an acoustic-articulatory inversion system that determines the parameters of the tube by means of inverse linear filtering.

The tube-LPC relation is fairly direct for a stack of equal-length tubes, as we show in section 3. But we also show that the expected process identity is much more complicated in the case of the DRM, which is made of unequal-length tubes. In particular, the DRM does not appear to be compatible with a lattice structure for the corresponding inverse filter.

We use two different approaches to address the problem:

- starting from the Levinson-Durbin equations, we try to recover the equations of the acoustical filtering process;
- starting from the acoustical phenomena, we try to recover a recursive algorithm that would allow the implementation of a lattice inverse filter.

Each of these approaches is the subject of one of the following sections.


2 Effect of $\mu_i = 0$ in a step of the Levinson recursion

A natural way of trying to solve our process identification problem is to start from the Levinson recursion and to perturb it by adding the constraints inherited from the special structure of our unequal-length tube model. As a matter of fact, we can consider that the DRM is made of a stack of equal-length tubes, some of them being joined together to form a set of longer sections of unequal length (see figure 1). This amounts to setting some reflection coefficients to zero in the course of our AutoRegressive (AR) predictor design.

The classical form of the Levinson-Durbin algorithm for AR filter design is described in [RJ93] by:

$$E^{(0)} = r_0 \tag{1}$$

$$\mu_{m+1} = \left[ r_{m+1} - \sum_{i=1}^{m} a_i^{(m)} r_{m+1-i} \right] / E^{(m)} \tag{2}$$

$$\begin{cases} a_{m+1}^{(m+1)} = \mu_{m+1} \\ a_i^{(m+1)} = a_i^{(m)} + \mu_{m+1}\, a_{m+1-i}^{(m)}, \quad i = 1, \ldots, m \end{cases} \tag{3}$$

$$E^{(m+1)} = \left(1 - \mu_{m+1}^2\right) E^{(m)} \tag{4}$$

with:

- $a_m$: LPC coefficients
- $\mu_m$: partial correlation or reflection coefficients
- $r_m = \sum_{n=0}^{N-1-m} x(n)\, x(n+m)$: values of the (estimated) autocorrelation function.

What does $\mu_{m+1} = 0$ imply for the correlation and the LPC coefficients? From equation (3), it simply means that the predictor has not changed between step $m$ and step $m+1$ of the algorithm. From equation (4), it means that the energy of the prediction error stays the same. From equation (2), setting $\mu_{m+1} = 0$ induces:

$$r_{m+1} = \sum_{i=1}^{m} a_i^{(m)} r_{m+1-i} \tag{5}$$

Figure 1: The DRM tube as a concatenation of 30 equal-length sections.

Adopting a matrix notation, we have:

$$r_{m+1} = \begin{bmatrix} a_1^{(m)} & a_2^{(m)} & \cdots & a_m^{(m)} \end{bmatrix} \begin{bmatrix} r_m \\ r_{m-1} \\ \vdots \\ r_1 \end{bmatrix} \tag{6}$$

This is a way of constraining the autocorrelation matrix. Given that:

- the autocorrelation matrix is supposed to be estimated from the original signal,
- the LPC coefficients at step $(m)$ are determined and fixed at the previous step,

this constraint amounts to imposing some values in the autocorrelation matrix. We are not merely neglecting some terms in the matrix here. The constraint is therefore equivalent to constraining the original sound signal itself, which is precisely the signal we wish to analyse. Trying to incorporate the DRM constraints in the Levinson recursion thus leads to a contradiction. This shows that the original Levinson recursive algorithm cannot be used to solve our problem.

We therefore have to address the problem from its other end, i.e. start from the fluid dynamics of the tube, deduce the general form of the tube's transfer function, and finally find an estimator for the transfer function's parameters.
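The effect of a zero reflection coefficient can be illustrated numerically. The following is a minimal sketch, not the report's implementation: it uses the common textbook sign convention in which the predictor update carries a minus sign, which may differ in sign from equation (3). The function name `levinson_durbin` and the autocorrelation values are illustrative assumptions. It computes the value of $r_3$ required by equation (5) and shows that the third step then leaves the predictor and the error energy unchanged.

```python
def levinson_durbin(r, order):
    """Run `order` steps of the Levinson-Durbin recursion on autocorrelations r.

    Sign convention (an assumption of this sketch): the predictor is
    x_hat[n] = sum_{i=1..m} alpha[i-1] * x[n-i].
    Returns (alpha, ks, E): final predictor, reflection coefficients, error energy.
    """
    alpha, ks, E = [], [], r[0]
    for m in range(order):
        # Numerator of the reflection coefficient: r_{m+1} - sum_i alpha_i r_{m+1-i}
        num = r[m + 1] - sum(alpha[j] * r[m - j] for j in range(m))
        k = num / E
        # Order update of the predictor, then append the new last coefficient
        alpha = [alpha[j] - k * alpha[m - 1 - j] for j in range(m)] + [k]
        E *= 1.0 - k * k
        ks.append(k)
    return alpha, ks, E


# Arbitrary illustrative autocorrelation values r_0, r_1, r_2 (not from the report)
r = [1.0, 0.5, 0.2]
alpha2, ks2, E2 = levinson_durbin(r, 2)

# Value of r_3 that forces the third reflection coefficient to zero (equation (5))
r3_forced = sum(alpha2[j] * r[2 - j] for j in range(2))
alpha3, ks3, E3 = levinson_durbin(r + [r3_forced], 3)

print(ks3[-1])   # third reflection coefficient
print(alpha3)    # predictor after three steps
print(E2, E3)    # error energies after two and three steps
```

In this sketch, the zero coefficient is obtained only because $r_3$ was chosen from the step-2 predictor, which is the constraint on the signal discussed above.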


3 Filtering process of an acoustic tube

3.1 Fluid dynamics roots of the problem

The following section takes up the mathematical development presented by Wakita in [Wak73], where the emergence of AR filtering equations from the acoustical filtering process is clearly shown in the case of equal-length tubes. The original work is extended here to the case of tube sections of unequal length.

Basic system :

The vocal tract is considered to be an acoustic tube divided in M sections of any (time-independent) length.

Assumptions:

- sound waves are plane fluid waves (see [Fla72], pp. 24-25, or [MI68], p. 467);
- the tube is rigid (no wall impedance);
- losses due to viscosity and heat conduction are neglected.

Equation set :

Posing the problem in terms of fluid dynamics, we can consider that the volume velocity $u_m(t,d)$ and the pressure $p_m(t,d)$ in section $m$ derive from a potential $\Phi_m(t,d)$:

$$u_m(t,d) = -S_m \frac{\partial \Phi_m(t,d)}{\partial d} \tag{7}$$

$$p_m(t,d) = \rho\, \frac{\partial \Phi_m(t,d)}{\partial t} \tag{8}$$

with:

- $t$: time variable
- $d$: distance variable
- $S_m$: cross-sectional area of the $m$th section
- $\rho$: density of air

The evolution of the fluid state is then described by Webster's equation:

$$\frac{\partial^2 \Phi_m(t,d)}{\partial d^2} - \frac{1}{c^2} \frac{\partial^2 \Phi_m(t,d)}{\partial t^2} = 0 \tag{9}$$

where $c$ denotes the sound velocity.

Equation solving:

If we assume that the excitation source (the "glottis" of the tube) delivers a sinusoidal signal, then the solution of this equation is of the classical form:

$$\Phi_m(t,d) = A\, e^{j\omega(t - d/c)} + B\, e^{j\omega(t + d/c)} \tag{10}$$

where $A$ and $B$ are constants¹. Remarking that $u_m(t,d)$ can be decomposed into a forward-travelling wave $u_m^+(t,d)$ and a backward-travelling wave $u_m^-(t,d)$, the above solution can be decomposed in the following way:

$$\begin{cases} u_m(t,d) = u_m^+(t,d) - u_m^-(t,d) \\ p_m(t,d) = \dfrac{\rho c}{S_m} \left\{ u_m^+(t,d) + u_m^-(t,d) \right\} \end{cases} \tag{11}$$

with

$$\begin{cases} u_m^+(t,d) = \dfrac{j\omega S_m}{c}\, A\, e^{j\omega(t - d/c)} \\ u_m^-(t,d) = \dfrac{j\omega S_m}{c}\, B\, e^{j\omega(t + d/c)} \end{cases} \tag{12}$$

¹ If the excitation signal is a linear combination of sine waves, which can be obtained from any signal by applying the Fourier transform, the corresponding solution is a linear combination of the solutions for the individual sinusoidal components. Therefore, the relations developed hereafter do not lose their generality (within the limits of the assumptions made at the beginning) when a non-sinusoidal excitation, such as the wave coming out of the vocal cords, is applied.

At the connection between section $m$ and section $m+1$, the volume velocity and pressure must be continuous. We therefore have the additional relations:

$$\begin{cases} u_{m+1}(t, d_m) = u_m(t, d_m) \\ p_{m+1}(t, d_m) = p_m(t, d_m) \end{cases} \tag{13}$$

with $d_m$ being the distance between the glottis and the connection between sections $m$ and $m+1$ (see figure 2). Since the speed of sound is constant, the distance variable can be related to the time variable and can thus be eliminated. Since there is no loss in a particular section, we also have, inside the limits of a section:

$$\begin{cases} u_m^+(t,d) = u_m^+(t, d - l_m) = u_m^+\!\left(t - \dfrac{l_m}{c}\right) \\ u_m^-(t,d) = u_m^-(t, d - l_m) = u_m^-\!\left(t + \dfrac{l_m}{c}\right) \end{cases} \tag{14}$$

with $l_m$ being the length of the considered piece of tube. Wakita explains that point very clearly in [Wak73]:

"Since no loss is assumed, the volume velocity component $u_{m+1}^+(t, d_m)$ is equal to that component of the volume velocity that started at $d_{m+1}$ at time $l/c$ [or $l_m/c$ in our case] earlier, and the volume velocity component $u_{m+1}^-(t, d_m)$ is equal to that component of the volume velocity that will arrive at $d_{m+1}$ at time $l/c$ [$l_m/c$] later. Thus the solution of the continuous problem can be obtained by knowing only the values at each junction."

This step is very important, as dropping the distance variable allows us to express our problem in terms of time series analysis. Furthermore, the fact that the problem can be solved by considering only the junctions will allow us to work in a discrete world.
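As a check, the decomposition into forward and backward travelling waves, equations (11) and (12), follows from substituting the solution (10) into the definitions (7) and (8). The short derivation below uses only quantities already defined in this section, with $\Phi_m$ denoting the potential and $\rho$ the density of air:

```latex
\begin{align*}
u_m(t,d) &= -S_m \frac{\partial \Phi_m}{\partial d}
          = \frac{j\omega S_m}{c}\left[A\,e^{j\omega(t-d/c)} - B\,e^{j\omega(t+d/c)}\right]
          = u_m^+(t,d) - u_m^-(t,d), \\
p_m(t,d) &= \rho\,\frac{\partial \Phi_m}{\partial t}
          = j\omega\rho\left[A\,e^{j\omega(t-d/c)} + B\,e^{j\omega(t+d/c)}\right]
          = \frac{\rho c}{S_m}\left\{u_m^+(t,d) + u_m^-(t,d)\right\}.
\end{align*}
```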

3.2 From fluids to signals

Starting from fluid dynamics, we end up with the following relations between the forward and backward travelling waves at each junction:

$$\begin{cases} u_{m+1}^+(t - \Delta_m t) - u_{m+1}^-(t + \Delta_m t) = u_m^+(t) - u_m^-(t) \\ \dfrac{\rho c}{S_{m+1}} \left\{ u_{m+1}^+(t - \Delta_m t) + u_{m+1}^-(t + \Delta_m t) \right\} = \dfrac{\rho c}{S_m} \left\{ u_m^+(t) + u_m^-(t) \right\} \end{cases}$$

where:

$$\Delta_m t = \frac{l_m}{c}$$

Now defining the coefficient:

$$\mu_m = \frac{S_m - S_{m+1}}{S_m + S_{m+1}} \tag{15}$$

and applying it to the above equations, we obtain:

Figure 2: Non-uniform acoustic tube model of the vocal tract. The figure plots the cross-sectional area against the distance along the tube, from the glottis to the lips, and shows sections 1, ..., m, m+1, the junction positions $d_0, d_1, \ldots, d_M$ and the forward and backward volume-velocity waves at each junction. (Inspired from [Wak73].)

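$$\begin{cases} u_{m+1}^+(t - \Delta_m t) = \dfrac{1}{1 + \mu_m} \left\{ u_m^+(t) - \mu_m u_m^-(t) \right\} \\ u_{m+1}^-(t + \Delta_m t) = \dfrac{1}{1 + \mu_m} \left\{ -\mu_m u_m^+(t) + u_m^-(t) \right\} \end{cases} \tag{16}$$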
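The step from the junction relations to equation (16) can be checked numerically. The sketch below is only an illustration under assumed values: the function names `reflection_coefficient` and `scatter`, the two areas and the wave amplitudes are hypothetical. It computes the waves in section $m+1$ from equation (16) and verifies that they satisfy the continuity of volume velocity and of pressure.

```python
def reflection_coefficient(s_m, s_next):
    """mu_m = (S_m - S_{m+1}) / (S_m + S_{m+1})."""
    return (s_m - s_next) / (s_m + s_next)


def scatter(u_plus, u_minus, mu):
    """Waves in section m+1 (at their shifted times) from the waves in section m."""
    gain = 1.0 / (1.0 + mu)
    v_plus = gain * (u_plus - mu * u_minus)
    v_minus = gain * (-mu * u_plus + u_minus)
    return v_plus, v_minus


# Hypothetical areas and wave amplitudes, chosen only for illustration
S_m, S_next = 2.0, 1.0
u_plus, u_minus = 1.0, 0.25

mu = reflection_coefficient(S_m, S_next)
v_plus, v_minus = scatter(u_plus, u_minus, mu)

# Continuity of volume velocity and of pressure (the common factor rho*c is dropped)
velocity_gap = (v_plus - v_minus) - (u_plus - u_minus)
pressure_gap = (v_plus + v_minus) / S_next - (u_plus + u_minus) / S_m
print(mu, velocity_gap, pressure_gap)
```

Both gaps vanish up to rounding, whatever positive areas and amplitudes are chosen.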

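Defining a unit length $l_{unit}$ as the greatest common divisor of the lengths $l_m$, so that each section delay corresponds to an integer $n_m = l_m / l_{unit}$, we can apply the Z-transform with $z$ defined as $z = e^{j\omega 2 l_{unit}/c} = e^{j\omega 2 \Delta_{unit} t}$, and we obtain:

$$\begin{cases} z^{-\frac{n_m}{2}}\, U_{m+1}^+(z) = \dfrac{1}{1 + \mu_m} \left[ U_m^+(z) - \mu_m U_m^-(z) \right] \\ z^{\frac{n_m}{2}}\, U_{m+1}^-(z) = \dfrac{1}{1 + \mu_m} \left[ -\mu_m U_m^+(z) + U_m^-(z) \right] \end{cases} \tag{17}$$

i.e.

$$\begin{cases} U_{m+1}^+(z) = \dfrac{z^{\frac{n_m}{2}}}{1 + \mu_m} \left[ U_m^+(z) - \mu_m U_m^-(z) \right] \\ U_{m+1}^-(z) = \dfrac{z^{-\frac{n_m}{2}}}{1 + \mu_m} \left[ -\mu_m U_m^+(z) + U_m^-(z) \right] \end{cases} \tag{18}$$

and, in matrix notation:

$$\begin{bmatrix} U_{m+1}^+(z) \\ U_{m+1}^-(z) \end{bmatrix} = \frac{z^{\frac{n_m}{2}}}{1 + \mu_m} \begin{bmatrix} 1 & -\mu_m \\ -\mu_m z^{-n_m} & z^{-n_m} \end{bmatrix} \begin{bmatrix} U_m^+(z) \\ U_m^-(z) \end{bmatrix} \tag{19}$$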

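If we assume that the lips end is connected to a tube of infinite section, this amounts to the following boundary condition at the front end (or lips end) of our model:

$$S_{-1} = \infty \;\Rightarrow\; \mu_0 = 1$$

Applying this condition, we can write:

$$\begin{bmatrix} U_{m+1}^+(z) \\ U_{m+1}^-(z) \end{bmatrix} = z^{\frac{1}{2} \sum_{k=0}^{m} n_k}\, K_m \begin{bmatrix} D_m^+(z) \\ D_m^-(z) \end{bmatrix} \left( U_0^+(z) - U_0^-(z) \right) \tag{20}$$

with

$$\begin{bmatrix} D_m^+(z) \\ D_m^-(z) \end{bmatrix} = \begin{bmatrix} 1 & -\mu_m \\ -\mu_m z^{-n_m} & z^{-n_m} \end{bmatrix} \begin{bmatrix} 1 & -\mu_{m-1} \\ -\mu_{m-1} z^{-n_{m-1}} & z^{-n_{m-1}} \end{bmatrix} \cdots \begin{bmatrix} 1 & -\mu_1 \\ -\mu_1 z^{-n_1} & z^{-n_1} \end{bmatrix} \begin{bmatrix} 1 \\ -z^{-n_0} \end{bmatrix} \tag{21}$$

and

$$K_m = \prod_{i=0}^{m} \frac{1}{1 + \mu_i} \tag{22}$$

Neglecting the overall delay $z^{\frac{1}{2} \sum_{k=0}^{m} n_k}$ and the gain $K_m$, the true transfer function for the forward travelling volume velocity is then given by $D_m^+(z)$, and can be built recursively by applying:

$$\begin{bmatrix} D_{m+1}^+(z) \\ D_{m+1}^-(z) \end{bmatrix} = \begin{bmatrix} 1 & -\mu_{m+1} \\ -\mu_{m+1} z^{-n_{m+1}} & z^{-n_{m+1}} \end{bmatrix} \begin{bmatrix} D_m^+(z) \\ D_m^-(z) \end{bmatrix} \tag{23}$$

We can also show by mathematical induction from equation (21) that we have:

$$D_m^-(z) = -z^{-\sum_{k=0}^{m} n_k}\, D_m^+(1/z) \tag{24}$$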

Developing (23) and applying (24), we obtain:

$$\begin{cases} D_{m+1}^+(z) = D_m^+(z) + \mu_{m+1}\, z^{-\sum_{k=0}^{m} n_k}\, D_m^+(1/z) \\ D_{m+1}^+(1/z) = \mu_{m+1}\, z^{\sum_{k=0}^{m} n_k}\, D_m^+(z) + D_m^+(1/z) \end{cases} \tag{25}$$

We can remark that changing the variable $z$ to $1/z$ in the first of the above formulae yields the second one. Both formulae are therefore equivalent with regard to the relationship they imply between $D_{m+1}^+(z)$ and $D_m^+(z)$.
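The matrix recursion (23) and the symmetry relation (24) can be checked on a small example. The following is a minimal sketch under stated assumptions: polynomials in $z^{-1}$ are stored as coefficient lists indexed by power, the helper names (`poly_add`, `scale`, `delay`, `grow`) are hypothetical, and the delays and reflection coefficients are arbitrary illustrative values.

```python
def poly_add(p, q):
    """Add two polynomials in z^-1 given as coefficient lists (index = power)."""
    n = max(len(p), len(q))
    p = p + [0.0] * (n - len(p))
    q = q + [0.0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def scale(p, c):
    """Multiply a polynomial by the scalar c."""
    return [c * a for a in p]


def delay(p, n):
    """Multiply a polynomial by z^-n."""
    return [0.0] * n + p


def grow(delays, mus):
    """Return (D_plus, D_minus) after applying the recursion len(mus) times.

    delays = [n_0, n_1, ...] and mus = [mu_1, mu_2, ...]; the boundary
    condition mu_0 = 1 is built into the initial vector [1, -z^-n_0].
    Requires len(delays) >= len(mus) + 1.
    """
    d_plus, d_minus = [1.0], delay([-1.0], delays[0])
    for m, mu in enumerate(mus, start=1):
        new_plus = poly_add(d_plus, scale(d_minus, -mu))
        new_minus = delay(poly_add(scale(d_plus, -mu), d_minus), delays[m])
        d_plus, d_minus = new_plus, new_minus
    return d_plus, d_minus


# Hypothetical delays and reflection coefficients, for illustration only
delays = [3, 2, 4]
mus = [0.5, -0.25]
d_plus, d_minus = grow(delays, mus)

# Right-hand side of the symmetry relation: -z^(-N) * D_plus(1/z), N = sum of delays
N = sum(delays)
padded = d_plus + [0.0] * (N + 1 - len(d_plus))
mirror = [-a for a in reversed(padded)]
print(d_plus)
print(d_minus)
print(mirror)
```

The last two printed lists agree, which is the content of equation (24) for this example.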

We now develop this relationship in order to study more precisely the form and the growth of the transfer function. This development is carried out first for equal-length sections and then for unequal-length sections such as those of the DRM.


3.3 Case 1: the length of the sections is uniform

In this case, the transmission delay induced in every piece of tube is the same. Therefore, we can set $n_k = 1\ \forall k$, i.e. $z^{-n_k} = z^{-1}\ \forall k$ in all the above equations. $z$ is then defined as $z = e^{j\omega 2l/c}$, $l$ being the length of every piece of tube. We know from equation (21) that $D_m^+(z)$ is of the form:

$$D_m^+(z) = \sum_{i=0}^{m} a_i^{(m)} z^{-i} \tag{26}$$

and we also have:

$$D_{m+1}^+(z) = \sum_{i=0}^{m+1} a_i^{(m+1)} z^{-i} \tag{27}$$

The relation (25) gives:

$$\sum_{i=0}^{m+1} a_i^{(m+1)} z^{-i} = \sum_{i=0}^{m} a_i^{(m)} z^{-i} + \mu_{m+1}\, z^{-(m+1)} \sum_{i=0}^{m} a_i^{(m)} z^{i} \tag{28}$$

i.e.

$$\sum_{i=0}^{m+1} a_i^{(m+1)} z^{-i} = \sum_{i=0}^{m} a_i^{(m)} z^{-i} + \mu_{m+1} \sum_{i=0}^{m} a_i^{(m)} z^{i-(m+1)} \tag{29}$$

or, changing the dummy index $i$ to $(m+1-i)$ in the second sum:

$$\sum_{i=0}^{m+1} a_i^{(m+1)} z^{-i} = \sum_{i=0}^{m} a_i^{(m)} z^{-i} + \mu_{m+1} \sum_{i=1}^{m+1} a_{m+1-i}^{(m)} z^{-i} \tag{30}$$

Identifying the coefficients of the polynomials in $z^{-i}$ on each side of the equal sign, we obtain:

$$\begin{cases} a_0^{(m+1)} = a_0^{(m)} \\ a_1^{(m+1)} = a_1^{(m)} + \mu_{m+1}\, a_m^{(m)} \\ \quad \vdots \\ a_m^{(m+1)} = a_m^{(m)} + \mu_{m+1}\, a_1^{(m)} \\ a_{m+1}^{(m+1)} = \mu_{m+1}\, a_0^{(m)} \end{cases} \tag{31}$$

which can be formalized in one line as:

$$a_i^{(m+1)} = a_i^{(m)} + \mu_{m+1}\, a_{m+1-i}^{(m)} \tag{32}$$

These equations are similar to those linking the partial correlation coefficients $\mu_i$ to the prediction coefficients $a_i$ in the Levinson-Durbin algorithm for LPC modelling. If we analyse speech with a sampling frequency of $F_s = \frac{c}{2l}$, and if we estimate the reflection coefficients in a proper way (for instance using Itakura's covariance method, see [Ita71]), the equivalence between an LPC model of order $M$ and the filtering process of the tube needs no further assumptions to hold.
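The one-line update (32) can be turned directly into code. The sketch below is illustrative only: the function names `step_up` and `tube_polynomial` and the reflection coefficient values are assumptions, and the coefficient $a_{m+1}^{(m)}$, which does not exist at step $m$, is taken as zero so that the same formula covers the first and last lines of (31).

```python
def step_up(a, mu):
    """One application of a_i^(m+1) = a_i^(m) + mu * a_(m+1-i)^(m), i = 0..m+1."""
    m = len(a) - 1
    padded = a + [0.0]                  # a_(m+1)^(m) is taken as zero
    return [padded[i] + mu * padded[m + 1 - i] for i in range(m + 2)]


def tube_polynomial(mus):
    """Coefficients of D_m^+(z) for equal-length sections, starting from D_0^+ = 1."""
    a = [1.0]
    for mu in mus:
        a = step_up(a, mu)
    return a


# Hypothetical reflection coefficients mu_1, mu_2 (illustrative values)
mu1, mu2 = 0.5, 0.2
a2 = tube_polynomial([mu1, mu2])

# Closed form for two steps: 1 + (mu1 + mu2*mu1) z^-1 + mu2 z^-2
closed_form = [1.0, mu1 + mu2 * mu1, mu2]
print(a2)
print(closed_form)
```

Each step adds exactly one coefficient, which is the regular growth that the Levinson-Durbin recursion relies on.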


3.4 Case 2 : the length of the sections is not uniform

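If the length of the sections is not uniform, we must deal with the irregular delays and with the $z^{-n_k}$ not being equal to $z^{-1}$. To formalize the growth of the transfer function in a readable way, we borrow the notation of the summation indexes from set theory. Let $\Omega_m$ be the set of all possible indexes $k$ for the discrete delays $n_k$, and let $\Gamma$ be a set containing one of the possible index combinations². We know from equation (21) that $D_m^+(z)$ is a polynomial in $z$ and we can now express its form as:

$$D_m^+(z) = \sum_{\Gamma \subseteq \{0, 1, \ldots, m\}} a_\Gamma^{(m)}\, z^{-\sum_{k \in \Gamma} n_k} \tag{33}$$

or:

$$D_{m+1}^+(z) = \sum_{\Gamma \subseteq \{0, 1, \ldots, m+1\}} a_\Gamma^{(m+1)}\, z^{-\sum_{k \in \Gamma} n_k} \tag{34}$$

The relation (25) now gives:

$$\sum_{\Gamma \subseteq \Omega_{m+1}} a_\Gamma^{(m+1)}\, z^{-\sum_{k \in \Gamma} n_k} = \sum_{\Gamma \subseteq \Omega_m} a_\Gamma^{(m)}\, z^{-\sum_{k \in \Gamma} n_k} + \mu_{m+1}\, z^{-\sum_{k \in \Omega_{m+1}} n_k} \sum_{\Gamma \subseteq \Omega_m} a_\Gamma^{(m)}\, z^{\sum_{k \in \Gamma} n_k} \tag{35}$$

i.e.

$$\sum_{\Gamma \subseteq \Omega_{m+1}} a_\Gamma^{(m+1)}\, z^{-\sum_{k \in \Gamma} n_k} = \sum_{\Gamma \subseteq \Omega_m} a_\Gamma^{(m)}\, z^{-\sum_{k \in \Gamma} n_k} + \mu_{m+1} \sum_{\Gamma \subseteq \Omega_m} a_\Gamma^{(m)}\, z^{\sum_{k \in \Gamma} n_k - \sum_{k \in \Omega_{m+1}} n_k} \tag{36}$$

For a particular subset $\Gamma$ of our index set $\Omega_m$, we can show the following:

$$\sum_{k \in \Gamma} n_k - \sum_{k \in \Omega_{m+1}} n_k = \underbrace{\sum_{k \in \Gamma} n_k - \sum_{k \in \Omega_m} n_k}_{-\sum_{k \in \bar{\Gamma}} n_k} - n_{m+1} = -n_{m+1} - \sum_{k \in \bar{\Gamma}} n_k \tag{37}$$

$\bar{\Gamma}$ being the complementary set of $\Gamma$, so that $\Gamma \cup \bar{\Gamma} = \Omega_m$. Equation (36) then becomes:

$$\sum_{\Gamma \subseteq \Omega_{m+1}} a_\Gamma^{(m+1)}\, z^{-\sum_{k \in \Gamma} n_k} = \sum_{\Gamma \subseteq \Omega_m} a_\Gamma^{(m)}\, z^{-\sum_{k \in \Gamma} n_k} + \mu_{m+1} \sum_{\Gamma \subseteq \Omega_m} a_\Gamma^{(m)}\, z^{-\left( \sum_{k \in \bar{\Gamma}} n_k + n_{m+1} \right)} \tag{38}$$

In this case, the analytical identification of the polynomial coefficients has to be performed on a case-by-case basis. For instance, let us express it in the case of the DRM. In this case, we have 8 sections of unequal length with $l_{unit} = L/30$ ($L$ being the total length of the full tube). The lengths of the sections are distributed as follows from lips to glottis: $l_0 = 3 l_{unit}$, $l_1 = 2 l_{unit}$, $l_2 = 4 l_{unit}$, $l_3 = 6 l_{unit}$, $l_4 = 6 l_{unit}$, $l_5 = 4 l_{unit}$, $l_6 = 2 l_{unit}$, $l_7 = 3 l_{unit}$. The recursion allowing us to compute the DRM transfer function is then of the form:

² We have $\Omega_m = \{0, 1, \ldots, m\}$ and $\Gamma \subseteq \Omega_m$. This means that $\Gamma$ belongs to the set of all subsets of $\Omega_m$. We can remark that this latter set defines a $\sigma$-algebra on the set of delays $n_k$. A measure on this set could be defined as $\sum_{k \in \Gamma} n_k$. We do not know whether such measure theory notions have already been used in the framework of polynomial transfer function analysis, but an expert in measure theory might find here a lead to an alternative way of investigating our problem.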

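$$\begin{bmatrix} D^+(z) \\ D^-(z) \end{bmatrix} = \begin{bmatrix} 1 & -\mu_7 \\ -\mu_7 z^{-3} & z^{-3} \end{bmatrix} \begin{bmatrix} 1 & -\mu_6 \\ -\mu_6 z^{-2} & z^{-2} \end{bmatrix} \cdots \begin{bmatrix} 1 & -\mu_2 \\ -\mu_2 z^{-4} & z^{-4} \end{bmatrix} \begin{bmatrix} 1 & -\mu_1 \\ -\mu_1 z^{-2} & z^{-2} \end{bmatrix} \begin{bmatrix} 1 \\ -z^{-3} \end{bmatrix} \tag{39}$$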

When observing the growth of the transfer function between step 3 and step 4 for instance (see the equations developed in figure 4), and replacing the $\Gamma$ indexes by integer indexes corresponding to the rank of the increasing negative powers of $z$, we obtain the following set of equations:

$$\begin{cases}
a_0^{(4)} = a_0^{(3)} = 1 \\
a_1^{(4)} = a_1^{(3)} \\
a_2^{(4)} = a_2^{(3)} \\
a_3^{(4)} = a_3^{(3)} \\
a_4^{(4)} = a_4^{(3)} \\
a_5^{(4)} = a_5^{(3)} + \mu_4\, a_7^{(3)} \\
a_6^{(4)} = a_6^{(3)} \\
a_7^{(4)} = \mu_4\, a_6^{(3)} \\
a_8^{(4)} = a_7^{(3)} + \mu_4\, a_5^{(3)} \\
a_9^{(4)} = \mu_4\, a_4^{(3)} \\
a_{10}^{(4)} = \mu_4\, a_3^{(3)} \\
a_{11}^{(4)} = \mu_4\, a_2^{(3)} \\
a_{12}^{(4)} = \mu_4\, a_1^{(3)} \\
a_{13}^{(4)} = \mu_4
\end{cases} \tag{40}$$
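The growth between step 3 and step 4 can be reproduced numerically from the first line of equation (25). The sketch below is an illustration under assumptions: the function names `grow_plus` and `nonzero_terms` are hypothetical and the four reflection coefficients are arbitrary positive values (so that no coefficient cancels by accident). Only the delays 3, 2, 4, 6 come from the DRM section lengths given above.

```python
def grow_plus(delays, mus):
    """Coefficient lists (index = power of z^-1) of D_0^+, D_1^+, ..., built with
    D_(m+1)^+(z) = D_m^+(z) + mu_(m+1) * z^(-N_m) * D_m^+(1/z), N_m = n_0 + ... + n_m."""
    d = [1.0]
    history = [d]
    total = 0
    for n, mu in zip(delays, mus):
        total += n                                   # N_m
        padded = d + [0.0] * (total + 1 - len(d))
        d = [padded[j] + mu * padded[total - j] for j in range(total + 1)]
        history.append(d)
    return history


def nonzero_terms(d, tol=1e-12):
    """List of (power, coefficient) pairs for the non-vanishing coefficients."""
    return [(power, c) for power, c in enumerate(d) if abs(c) > tol]


delays = [3, 2, 4, 6]          # l_0..l_3 of the DRM, in units of l_unit
mus = [0.5, 0.4, 0.3, 0.2]     # hypothetical mu_1..mu_4, for illustration only
history = grow_plus(delays, mus)

terms3 = nonzero_terms(history[3])
terms4 = nonzero_terms(history[4])
powers3 = [p for p, _ in terms3]
powers4 = [p for p, _ in terms4]
a3 = [c for _, c in terms3]    # coefficients of step 3, indexed by rank
a4 = [c for _, c in terms4]    # coefficients of step 4, indexed by rank

print(powers3)
print(powers4)
print(a4[5], a3[5] + mus[3] * a3[7])
```

With rank indexing, step 3 has 8 coefficients and step 4 has 14, and the printed pair illustrates the sixth line of equation (40).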

In the general case, we see that if we try to identify the polynomial coefficients starting from equation (25), we cannot recover the Levinson-Durbin relation tying the prediction coefficients $a_i$ to the reflection coefficients $\mu_i$. As the basic idea of the Levinson algorithm is to find a relation between an $m$th order predictor and its $(m+1)$th order successor, we try in the following section to come up with something similar in the case of irregular lengths.



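$$\begin{bmatrix} 1 \\ z^{-1} \end{bmatrix}$$

$$\begin{bmatrix} 1 - \mu_1 z^{-1} \\ -\mu_1 z^{-1} + z^{-2} \end{bmatrix}$$

$$\begin{bmatrix} 1 + (\mu_2\mu_1 - \mu_1) z^{-1} - \mu_2 z^{-2} \\ -\mu_2 z^{-1} + (\mu_2\mu_1 - \mu_1) z^{-2} + z^{-3} \end{bmatrix}$$

$$\begin{bmatrix} 1 + (\mu_2\mu_1 - \mu_1 + \mu_3\mu_2) z^{-1} + (-\mu_2 + \mu_3\mu_1 - \mu_3\mu_2\mu_1) z^{-2} - \mu_3 z^{-3} \\ -\mu_3 z^{-1} + (-\mu_2 + \mu_3\mu_1 - \mu_3\mu_2\mu_1) z^{-2} + (\mu_2\mu_1 - \mu_1 + \mu_3\mu_2) z^{-3} + z^{-4} \end{bmatrix}$$

$$\begin{bmatrix} 1 + (\mu_2\mu_1 - \mu_1 + \mu_3\mu_2 + \mu_4\mu_3) z^{-1} + (-\mu_2 + \mu_3\mu_1 - \mu_3\mu_2\mu_1 + \mu_4\mu_2 - \mu_4\mu_3\mu_1 + \mu_4\mu_3\mu_2\mu_1) z^{-2} + (-\mu_3 + \mu_4\mu_1 - \mu_4\mu_2\mu_1 - \mu_4\mu_3\mu_2) z^{-3} - \mu_4 z^{-4} \\ -\mu_4 z^{-1} + (-\mu_3 + \mu_4\mu_1 - \mu_4\mu_2\mu_1 - \mu_4\mu_3\mu_2) z^{-2} + (-\mu_2 + \mu_3\mu_1 - \mu_3\mu_2\mu_1 + \mu_4\mu_2 - \mu_4\mu_3\mu_1 + \mu_4\mu_3\mu_2\mu_1) z^{-3} + (\mu_2\mu_1 - \mu_1 + \mu_3\mu_2 + \mu_4\mu_3) z^{-4} + z^{-5} \end{bmatrix}$$

$$\vdots$$

Figure 3: Regular tube transfer function growth. Note the regular increase in the polynomial degrees. (Equations computed with the help of a symbolic computation program.)

$$\begin{bmatrix} 1 \\ -z^{-3} \end{bmatrix}$$

$$\begin{bmatrix} 1 + \mu_1 z^{-3} \\ -\mu_1 z^{-2} - z^{-5} \end{bmatrix}$$

$$\begin{bmatrix} 1 + \mu_2\mu_1 z^{-2} + \mu_1 z^{-3} + \mu_2 z^{-5} \\ -\mu_2 z^{-4} - \mu_1 z^{-6} - \mu_2\mu_1 z^{-7} - z^{-9} \end{bmatrix}$$

$$\begin{bmatrix} 1 + \mu_2\mu_1 z^{-2} + \mu_1 z^{-3} + \mu_3\mu_2 z^{-4} + \mu_2 z^{-5} + \mu_3\mu_1 z^{-6} + \mu_3\mu_2\mu_1 z^{-7} + \mu_3 z^{-9} \\ -\mu_3 z^{-6} - \mu_3\mu_2\mu_1 z^{-8} - \mu_3\mu_1 z^{-9} - \mu_2 z^{-10} - \mu_3\mu_2 z^{-11} - \mu_1 z^{-12} - \mu_2\mu_1 z^{-13} - z^{-15} \end{bmatrix}$$

$$\begin{bmatrix} 1 + \mu_2\mu_1 z^{-2} + \mu_1 z^{-3} + \mu_3\mu_2 z^{-4} + \mu_2 z^{-5} + (\mu_4\mu_3 + \mu_3\mu_1) z^{-6} + \mu_3\mu_2\mu_1 z^{-7} + \mu_4\mu_3\mu_2\mu_1 z^{-8} + (\mu_4\mu_3\mu_1 + \mu_3) z^{-9} + \mu_4\mu_2 z^{-10} + \mu_4\mu_3\mu_2 z^{-11} + \mu_4\mu_1 z^{-12} + \mu_4\mu_2\mu_1 z^{-13} + \mu_4 z^{-15} \\ -\mu_4 z^{-6} - \mu_4\mu_2\mu_1 z^{-8} - \mu_4\mu_1 z^{-9} - \mu_4\mu_3\mu_2 z^{-10} - \mu_4\mu_2 z^{-11} - (\mu_4\mu_3\mu_1 + \mu_3) z^{-12} - \mu_4\mu_3\mu_2\mu_1 z^{-13} - \mu_3\mu_2\mu_1 z^{-14} - (\mu_4\mu_3 + \mu_3\mu_1) z^{-15} - \mu_2 z^{-16} - \mu_3\mu_2 z^{-17} - \mu_1 z^{-18} - \mu_2\mu_1 z^{-19} - z^{-21} \end{bmatrix}$$

$$\vdots$$

Figure 4: DRM tube transfer function growth. Note the disturbance in the polynomial degrees.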


3.5 Relation between polynomial coefficients, reflection coefficients and the Yule-Walker equation system

In the case of equal-length tube sections, the degree of the polynomial tube transfer function (in $z^{-1}$) increases by 1 at each step of its growth. When trying to estimate the transfer function by solving the Yule-Walker equations recursively, for instance with the Levinson-Durbin algorithm, the problem is the following: given a set of $m$ polynomial coefficients $a_i^{(m)}$ resulting from solving an $m \times m$ system of Yule-Walker equations at step $m$, and given one reflection coefficient depending upon the $(m+1) \times (m+1)$ correlation matrix, what are the $(m+1)$ polynomial coefficients $a_i^{(m+1)}$ of the transfer function at step $(m+1)$ (or what is the solution of the $(m+1) \times (m+1)$ Yule-Walker system at the next step)?

In the case of unequal-length tubes, the degree of the polynomial increases by a certain amount $p$, very often different from 1. If we want to apply the classical RMS criterion to estimate our predictor at a particular step $m$ (see appendix A), the estimation still corresponds to solving a linear system of the form:

$$\begin{bmatrix} 1 & a_1^{(m)} & \cdots & a_m^{(m)} \end{bmatrix} R_m = \begin{bmatrix} 0 & 0 & \cdots & 0 \end{bmatrix} \tag{41}$$

But here, because irregular delays are applied in the computation of the correlation matrix $R_m$, the matrix loses its Toeplitz structure and, in some cases, its symmetry. The problem is therefore: given a set of $m$ polynomial coefficients resulting from solving Yule-Walker-like, non-Toeplitz equations at step $m$, and given a single reflection coefficient related to new correlation values, what are the $(m+p)$ polynomial coefficients of the transfer function at step $(m+1)$?

This is an ill-posed problem, as we lack $(p-1)$ known parameters to solve our $(m+p) \times (m+p)$ Yule-Walker-like system of equations. Even though we get $p$ new correlation values, they are merged into one reflection coefficient, and we lose $(p-1)$ degrees of freedom. The problem is therefore incompatible with a simple inverse filtering scheme using a simple ("monodimensional") lattice structure. Recursive solutions of another nature may possibly be found in the domain of numerical analysis, but their design and implementation would exceed the scope of the present study.

One could argue that, knowing the structure of the transfer function and given the correlation matrix, we could solve the Yule-Walker-like system at steps $m$ and $m+1$ and then deduce the reflection coefficients $\mu_m$ from the obtained $a_i^{(m)}$ and $a_i^{(m+1)}$. Experimental attempts to do so have led to numerical errors (probably due to ill-conditioned correlation matrices) that make the method intractable. For instance, we have not been able to verify the relation between the predictor at step 3 and the predictor at step 4 in the DRM case illustrated by equation set (40).
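To give an idea of the size of $p$ in the DRM case, the short sketch below tabulates the degree of $D_m^+(z)$ at each growth step. It assumes that the degree grows as implied by the first line of equation (25), i.e. that the degree at step $m$ is $n_0 + \cdots + n_{m-1}$; the variable names are illustrative.

```python
from itertools import accumulate

drm_delays = [3, 2, 4, 6, 6, 4, 2, 3]   # l_0..l_7 in units of l_unit

# Degree of D_m^+ for m = 1..7, assumed equal to n_0 + ... + n_(m-1)
degrees = list(accumulate(drm_delays[:-1]))

# Degree increase p at each growth step, and the (p - 1) parameters left undetermined
increments = [hi - lo for lo, hi in zip([0] + degrees, degrees)]
missing = [p - 1 for p in increments]

print(degrees)
print(increments)
print(missing)
```

Under this assumption, every growth step of the DRM has $p > 1$, so each step leaves at least one parameter undetermined by the single new reflection coefficient.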


4 Conclusion

As we show in the present study, the DRM articulatory model leads to an ill-posed problem when trying to identify its acoustical filtering action with a simple AR linear filtering process. Although an inverse filter might be found in a numerical analysis framework or in an acoustical theory framework, the difficulty of reaching a solution diminishes the interest of using the DRM model in an acoustic-articulatory inversion system that would be based on an inverse filtering scheme.

References

[Fla72] J.L. Flanagan. Speech analysis, synthesis and perception. Springer Verlag, 1972.
[Ita71] F. Itakura. Extraction of feature parameters of speech by statistical method. In Proceedings of the 8th Symposium on Speech Information Processing, volume II, pages 5-2 to 5-12, February 1971.
[MI68] P.M. Morse and K.U. Ingard. Theoretical acoustics. McGraw-Hill, 1968.
[RJ93] L. Rabiner and B.H. Juang. Fundamentals of Speech Recognition. Prentice Hall, 1993.
[Wak73] H. Wakita. Direct estimation of the vocal-tract shape by inverse filtering of acoustic speech waveforms. IEEE Transactions on Audio and Electroacoustics, AU-21:417-427, October 1973.

Acknowledgement: Thanks to E. Mayoraz for allowing some time to the fruitful discussions we have had.


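A Application of the Correlation method with the RMS criterion in the case of the DRM

In the case of the DRM, we know that the inverse filter transfer function should have the form:

$$A(z) = \sum_{\Gamma \subseteq \Omega_m} a_\Gamma\, z^{-\sum_{k \in \Gamma} n_k} = \frac{Y(z)}{X(z)} \tag{42}$$

with $\Omega_m = \{0, 1, \ldots, m\}$ and given $m$ discrete delays $n_k$ related to the geometry of the tube. This form corresponds to the following difference equation:

$$y_n = \sum_{\Gamma \subseteq \Omega_m} a_\Gamma\, x_{n - \sum_{k \in \Gamma} n_k} \tag{43}$$

The input of the speech AR model is defined as an impulse train $\delta_n$. The inverse filter modelling error can therefore be expressed by subtracting the input of the model from the output of the inverse filter. Hence the expression defining the output error:

$$\varepsilon_n = y_n - \delta_n \tag{44}$$

$$\varepsilon_n = \sum_{\Gamma \subseteq \Omega_m} a_\Gamma\, x_{n - \sum_{k \in \Gamma} n_k} - \delta_n \tag{45}$$

Since we want our speech model to have the form $X(z) = \frac{\sigma}{A(z)}$, the impulse train can equivalently be replaced by an impulse of height $\sigma$ in the above equation:

$$\varepsilon_n = \sum_{\Gamma \subseteq \Omega_m} a_\Gamma\, x_{n - \sum_{k \in \Gamma} n_k} - \sigma\, \delta_{n0} \tag{46}$$

Finding the optimal inverse filter parameters corresponds to minimizing the squared error defined as:

$$E = \sum_{n=0}^{N-1-\sum_{k \in \Omega_m} n_k} \varepsilon_n^2 \tag{47}$$

$$E = \sum_{n=0}^{N-1-\sum_{k \in \Omega_m} n_k} \left[ \sum_{\Gamma \subseteq \Omega_m} a_\Gamma\, x_{n - \sum_{k \in \Gamma} n_k} - \sigma\, \delta_{n0} \right]^2 \tag{48}$$

The differentiation of $E$ with respect to each $a_\Gamma$ (except $a_\emptyset$, which is set to 1) gives:

$$\frac{\partial E}{\partial a_\Gamma} = 2 \sum_{\omega \subseteq \Omega_m} a_\omega \left[ \sum_{n=0}^{N-1-\sum_{k \in \Omega_m} n_k} x_{n - \sum_{k \in \omega} n_k}\, x_{n - \sum_{k \in \Gamma} n_k} \right] - 2 \left[ \sum_{n=0}^{N-1-\sum_{k \in \Omega_m} n_k} x_{n - \sum_{k \in \Gamma} n_k}\, \sigma\, \delta_{n0} \right] \tag{49}$$

Using the autocorrelation function usually defined as:

$$R_{i,j} = \sum_{n=0}^{N+M-1} x_{n-i}\, x_{n-j} = \sum_{n=0}^{N-1-|i-j|} x_n\, x_{n+|i-j|} \tag{50}$$

but considering only the terms of the form:

$$R_{\omega\Gamma} = \sum_{n=0}^{N-1-\sum_{k \in \Omega_m} n_k} x_{n - \sum_{k \in \omega} n_k} \cdot x_{n - \sum_{k \in \Gamma} n_k} \tag{51}$$

$$R_{\omega\Gamma} = \sum_{n=0}^{N-1-\sum_{k \in \Omega_m} n_k} x_n \cdot x_{n + \sum_{k \in \Gamma} n_k - \sum_{k \in \omega} n_k} \tag{52}$$

and setting equation (49) to zero, we obtain the linear equation system:

$$\sum_{\omega \subseteq \Omega_m} a_\omega\, R_{\omega\Gamma} = 0 \tag{53}$$

with $\Gamma \subseteq \Omega_m$ and $\Gamma \neq \emptyset$. Considering that for $\omega = \emptyset$ we have $a_\omega = 1$, we can finally express this system as:

$$\sum_{\omega \subseteq \Omega_m,\ \omega \neq \emptyset} a_\omega\, R_{\omega\Gamma} = -R_{\emptyset\Gamma} \tag{54}$$

with $\Gamma \subseteq \Omega_m$, $\Gamma \neq \emptyset$ and $\omega \subseteq \Omega_m$, $\omega \neq \emptyset$. This system does not have a Toeplitz structure. We can also notice that for some given sets of delays $n_k$, some of the values of $R_{\omega\Gamma}$ for different $\Gamma$s and $\omega$s will be the same³. This implies that the equation system (54) contains some duplicate lines and columns. To make the system solvable, duplicate lines have to be removed and duplicate columns merged into one by addition. This operation just amounts to reducing the number of unknowns to make it equal to the order of the polynomial transfer function we want to determine. This is where the system loses its former symmetry.

³ This happens when two different subsets $\Gamma_1$ and $\Gamma_2$ correspond to the same polynomial degree in $A(z)$, i.e. $\sum_{k \in \Gamma_1} n_k = \sum_{k \in \Gamma_2} n_k$. See for instance the case of the DRM at step 4: we have $n_k \in \{3, 2, 4, 6\}$; considering $\Gamma_1 = \{2, 4\}$ and $\Gamma_2 = \{6\}$, we have $R_{\omega\Gamma_1} = R_{\omega\Gamma_2}$.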
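The duplicates mentioned in the footnote can be listed by brute force. The sketch below is an illustration only: the function name `group_subsets_by_delay` is hypothetical, and subsets are reported as tuples of section indexes (0 to 3) rather than as sets of delay values, so the footnote's $\{2, 4\}$ and $\{6\}$ appear as the index tuples `(1, 2)` and `(3,)`.

```python
from itertools import combinations


def group_subsets_by_delay(delays):
    """Map each total delay to the list of index subsets that produce it."""
    groups = {}
    for size in range(len(delays) + 1):
        for subset in combinations(range(len(delays)), size):
            total = sum(delays[k] for k in subset)
            groups.setdefault(total, []).append(subset)
    return groups


delays = [3, 2, 4, 6]                      # DRM delays at step 4, in units of l_unit
groups = group_subsets_by_delay(delays)

# Total delays reached by more than one subset: these give duplicate lines and columns
collisions = {total: subsets for total, subsets in groups.items() if len(subsets) > 1}

print(len(groups))
print(collisions)
```

For these four delays, the 16 subsets produce 14 distinct total delays, which matches the 14 coefficients of equation set (40).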