May 9, 2006 19:56 WSPC/130-JCA

JCA2006

Journal of Computational Acoustics © IMACS

INVESTIGATION OF 3-D ACOUSTICAL EFFECTS USING A MULTIPROCESSING PARABOLIC EQUATION BASED ALGORITHM

K. CASTOR
Laboratoire de Détection et de Géophysique, Département Analyse, Surveillance, Environnement, Commissariat à l'énergie Atomique, BP 12, FR-91680 Bruyères-le-Châtel, France
[email protected]

F. STURM
Laboratoire de Mécanique des Fluides et d'Acoustique, UMR CNRS 5509, École Centrale de Lyon, 36, avenue Guy de Collongue, FR-69134 Ecully Cedex, France
[email protected]

Received (Day Month Year)
Revised (Day Month Year)

A parallelised algorithm based on an existing 3-D Wide Angle Parabolic Equation model is developed to perform numerical simulations of underwater acoustic propagation on a massively parallel computer. The parallelization method is a two-level procedure: a frequency decomposition of the calculations reduces CPU times for broadband signal propagation, and a spatial decomposition reduces CPU times for CW signal propagation. The performance of the parallel computation is examined for the ASA 3-D wedge shaped waveguide. CPU times are presented and both speedup and efficiency are analysed. With the new parallel algorithm, an investigation of significant 3-D effects at higher frequencies and at longer propagation ranges than in earlier works, e.g. [F. Sturm, J. Acoust. Soc. Am. 117 (3) 1058-1079 (2005)], is now accessible with reasonable CPU times. Further, the feasibility of the procedure applied to a realistic environment problem involving both sound speed profiles and bathymetry data sets is also illustrated.

Keywords: Sound propagation modeling; parabolic equation; azimuthal coupling; parallel processing; high-performance computing.

1. Introduction

In some underwater acoustics problems, the horizontal refraction effects are weak enough to allow 2-D models to predict sound propagation accurately. However, for some particular oceanic environments involving bathymetric slopes or horizontal sound speed gradients, it has been demonstrated experimentally and numerically [1,2,3] that, far from the source, significant 3-D effects can be induced. It is then necessary to use fully 3-D models that account for the coupling of the propagating acoustical energy from one vertical plane to another. Among these models, parabolic equation (PE) based models give good results for some benchmark problems. Their main drawback is that they can be very demanding in computation time, especially for broadband calculations and/or for long-range paths where


the 3-D effects are clearly accentuated. The aim of this work is to solve realistic acoustical propagation problems that were unreachable in reasonable CPU time until now. Numerical simulations are carried out using the 3-D parabolic equation based model 3DWAPE [4,5,6] on a massively parallel computer providing a high computational efficiency. The Message-Passing Interface (MPI) communication library is used in the parallel algorithm, which is based on two principal levels dedicated to reducing CPU times for broadband and CW signal propagation, respectively.

The paper is organized as follows: Section 2 deals with the computational complexities and communication issues associated with N×2-D and 3-D computations at a single frequency or for a broadband spectrum source. The parallelization strategy is presented in detail in Sec. 3.1 and the computational performances are analysed in Sec. 3.2. Finally, Sec. 4 shows that the new parallel algorithm makes it possible to investigate significant 3-D effects at higher frequencies and at longer propagation ranges for the ASA 3-D wedge benchmark (Sec. 4.1) and for a realistic oceanic environment (Sec. 4.2).

2. Computational complexity analysis of 3DWAPE

Complexity deals with the resources required during computation to solve a given problem. The most common resources are time (how many steps it takes to solve a problem) and space (how much memory it takes). Here, we only consider the time complexity, which is the number of steps that an algorithm takes to solve a given problem, as a function of the size of the input. Because CPU time is directly related to complexity, it is necessary to analyse complexity in order to optimize an algorithm. An analysis of complexity gives the required number of operations and can then provide a relationship between the CPU time and the number of processors used. Ideally, for a parallel computation, the CPU time is inversely proportional to the number of processors used. In practice, however, this no longer holds when the number of processors increases, since the communications between processors occurring in a parallel computation are non-negligible and tend to lower the efficiency of the parallel algorithm.

Let us now analyse the computational complexity of the 3-D PE code 3DWAPE. This model considers a multi-layered waveguide composed of one water layer and one or several fluid sediment layers. The geometry of each layer is fully three-dimensional. Cylindrical coordinates are used, where r, θ, and z represent respectively the horizontal range from the source, the azimuthal angle, and the depth (increasing downwards) below the ocean surface. The 3DWAPE model uses a parabolic equation based approach and solves the acoustic problem in the frequency domain assuming a harmonic point source emitting at frequency f. It calculates the acoustic field ψ = ψ(r, θ, z; ω) (with ω = 2πf) satisfying the


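following outgoing equation

$$ \frac{\partial \psi}{\partial r}(r,\theta,z;\omega) = i k_0 \left( \sum_{k=1}^{n_p} \frac{a_{k,n_p} X}{1 + b_{k,n_p} X} + \frac{\tfrac{1}{2} Y}{1 + \tfrac{1}{4} Y} \right) \psi(r,\theta,z;\omega) \tag{1} $$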

and satisfying the initial condition ψ(r = r0, θ, z; ω) = ψ^(0)(θ, z; ω), with ψ^(0) simulating the source. In Eq. (1), np is the number of Padé terms, k0 = ω/c0, with c0 a reference sound speed, and X, Y are operators defined by

$$ X = n_\alpha^2 - 1 + \frac{\rho}{k_0^2} \frac{\partial}{\partial z}\left( \frac{1}{\rho} \frac{\partial}{\partial z} \right) \quad \text{and} \quad Y = \frac{1}{k_0^2 r^2} \frac{\partial^2}{\partial \theta^2}, \tag{2} $$

with nα(r, θ, z) = (c0/c(r, θ, z))(1 + iηα) the complex index of refraction (complex to account for lossy layers), and ρ the density. The operator Y handles the azimuthal diffraction. By neglecting this term in Eq. (1), but conserving the azimuthal dependence in nα(r, θ, z), the 3-D model becomes an N×2-D model, or pseudo 3-D model (i.e. without azimuthal coupling). The acoustic field ψ is related to the acoustic pressure P̂ = P̂(r, θ, z; ω) by

$$ \widehat{P}(r,\theta,z;\omega) = H_0^{(1)}(k_0 r)\, \psi(r,\theta,z;\omega). \tag{3} $$

The acoustic field ψ satisfies a pressure-release boundary condition at the ocean interface z = 0, an outgoing radiation condition at infinity, a 2π-periodicity condition in azimuth and appropriate boundary conditions at each sediment interface. The source located at z = zS and r = 0 is represented by its spectrum Ŝ(ω).

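The numerical method used to solve the 3-D parabolic equation (1) is similar to an alternating direction implicit (ADI) method. The first step is to split the 3-D parabolic equation into the following system

$$ \frac{\partial \psi}{\partial r}(r,\theta,z;\omega) = i k_0 \sum_{k=1}^{n_p} \frac{a_{k,n_p} \left( n_\alpha^2 - 1 + \frac{\rho}{k_0^2} \frac{\partial}{\partial z} \frac{1}{\rho} \frac{\partial}{\partial z} \right)}{I + b_{k,n_p} \left( n_\alpha^2 - 1 + \frac{\rho}{k_0^2} \frac{\partial}{\partial z} \frac{1}{\rho} \frac{\partial}{\partial z} \right)}\, \psi(r,\theta,z;\omega), \tag{4} $$

$$ \left( 1 + \frac{1}{4 k_0^2 r^2} \frac{\partial^2}{\partial \theta^2} \right) \frac{\partial \psi}{\partial r}(r,\theta,z;\omega) = \frac{i}{2 k_0 r^2} \frac{\partial^2 \psi}{\partial \theta^2}(r,\theta,z;\omega). \tag{5} $$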

Let Δr, Δθ and Δz be the increments in range, in azimuth and in depth, respectively. Let also Nr, M and N be integers indicating the maximum number of mesh points in range, azimuth and depth, respectively. Equations (4) and (5) are solved successively at each discrete range rn, 1 ≤ n ≤ Nr. Equation (4) is solved using a Crank-Nicolson integration in range and an accurate finite-element Galerkin method in depth. If one wishes to use the N×2-D approach, then only Eq. (4) is solved. For a 3-D computation, both Eq. (4) and Eq. (5) must be solved. The azimuthal coupling part handled by Eq. (5) is then discretized using a (2ℓ + 1)-point stencil scheme in azimuth, coupled with a Crank-Nicolson type range-stepping procedure. This (2ℓ + 1)-point stencil scheme, which corresponds to a higher-order


centered FD scheme and can be seen as an extension of the 2nd-order FD scheme (ℓ = 1), allows one to reduce the required number of points in the azimuthal direction while still obtaining accurate solutions.

At each range step, an N×2-D calculation involves the inversion of M algebraic linear systems of order N (with tridiagonal matrices), which corresponds to the acoustic field calculation at successive adjacent azimuths θ1, θ2, . . . , θM, as shown in Fig. 1(a). Each linear system is solved using a Gaussian elimination algorithm optimized for tridiagonal matrices. (The number of arithmetic operations required to invert such a system is of the order of N.) Each inversion must be repeated np times, i.e. once for each term of the Padé series expansion. Consequently, the total number of operations required to solve Eq. (4) is expressed as

$$ N_{op}^{N \times 2D} = O\left( N_r\, n_p\, M\, N \right). \tag{6} $$
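The O(N) cost quoted above for one tridiagonal inversion can be illustrated with the classical Thomas algorithm, i.e. Gaussian elimination specialised to tridiagonal matrices. The following is only a minimal sketch: the function name `solve_tridiagonal` and its argument layout are hypothetical and are not taken from 3DWAPE, and no pivoting is performed, so a well-conditioned (e.g. diagonally dominant) matrix is assumed.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm in O(N) operations.

    lower[i] multiplies x[i-1] (lower[0] is unused), diag[i] multiplies x[i],
    upper[i] multiplies x[i+1] (upper[-1] is unused). All inputs have length N.
    Hypothetical helper for illustration; no pivoting is done.
    """
    n = len(diag)
    c = [0j] * n  # modified upper diagonal
    d = [0j] * n  # modified right-hand side

    # Forward elimination: one pass over the N rows.
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        if i < n - 1:
            c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom

    # Back substitution: a second pass over the N rows.
    x = [0j] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# Small check: the matrix [[2,1,0],[1,3,1],[0,1,2]] applied to (1, 2, 3)
# gives the right-hand side (4, 10, 8).
solution = solve_tridiagonal([0, 1, 1], [2, 3, 2], [1, 1, 0], [4, 10, 8])
```

Both loops visit each of the N unknowns once, which is where the linear cost per system comes from; repeating this for M azimuths, np Padé terms and Nr range steps gives the product in Eq. (6).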

At each range step, the azimuthal coupling part requires the inversion of N algebraic linear systems of order M. The matrices are banded, with additional entries in the upper right and lower left corners corresponding to the continuity (periodicity) condition in the azimuthal direction. The bandwidth depends on the value of the parameter ℓ. As shown in Fig. 1(b), this now corresponds to the acoustic field calculation at successive fixed depths z1, z2, . . . , zN. For benchmark problems, a symmetry in θ is often considered to simplify the algorithm. As the matrices depend only on the discretization in range, an LU decomposition followed by a substitution technique is used to solve the equation at each range step. The numbers of operations required for the LU decomposition and for the substitution are respectively ℓ²M and ℓNM. For the azimuthal part, the operation cost is then

$$ N_{op}^{\theta} = O\left( N_r \left( \ell^2 M + \ell M N \right) \right) \approx O\left( N_r\, \ell\, M\, N \right) \tag{7} $$

since N is large compared to ℓ. Finally, the total computational complexity for fully 3-D problems is the sum of the two computational complexities:

$$ N_{op}^{3D} = N_{op}^{N \times 2D} + N_{op}^{\theta}. \tag{8} $$
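As a rough illustration of Eqs. (6)-(8), the operation counts can be evaluated for a given grid and compared after refining the grid. The sketch below is only indicative: the function `op_counts` is a hypothetical helper, the constants hidden in the O(·) notation are ignored, and the grid values (Nr = 2500, np = 1, M = 3240, N = 500, ℓ = 4) are placeholders chosen for the example rather than a documented 3DWAPE configuration.

```python
def op_counts(n_r, n_p, m, n, ell):
    """Return (N x 2-D count, azimuthal count, total 3-D count).

    Follows Eqs. (6), (7) and (8) with all O(.) constants set to one.
    """
    n2d = n_r * n_p * m * n                      # Eq. (6)
    theta = n_r * (ell**2 * m + ell * m * n)     # Eq. (7), before the approximation
    return n2d, theta, n2d + theta               # Eq. (8)


# Placeholder grid (assumed values, for illustration only).
base = op_counts(n_r=2500, n_p=1, m=3240, n=500, ell=4)

# Doubling the number of points in range, azimuth and depth, as a doubled
# frequency would require if all steps scale with the wavelength.
refined = op_counts(n_r=5000, n_p=1, m=6480, n=1000, ell=4)

growth = refined[2] / base[2]
print(f"total operation count grows by a factor {growth:.2f}")
```

With these placeholder values the growth factor is slightly below 8, because the ℓ²M term of Eq. (7) only grows by a factor 4; the dominant terms all grow by a factor 8.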

The formulas obtained show that the complexity depends linearly on the number of Padé terms np and on the numbers of discrete ranges Nr, depths N and azimuths M.

Assuming now a broadband source, solving a pulse propagation problem with the Fourier synthesis approach requires decomposing the source pulse using a Fourier transform, then selecting a frequency spacing and solving the 3-D propagation problem for each discrete frequency within a frequency band of interest, and lastly performing inverse Fourier transforms of the frequency-domain solutions to obtain the time signal at any given receiver. The pulse response at a specific receiver is obtained via a Fourier transform of the frequency-domain solution using

$$ P(r,\theta,z;t) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \widehat{S}(\omega)\, H_0^{(1)}(k_0 r)\, \psi(r,\theta,z;\omega)\, e^{-i\omega t}\, d\omega, \tag{9} $$

[Figure 1 panels: (a) N×2-D part, marching from r to r + Δr at successive azimuths θ1, θ2, θ3, ..., θM; (b) Azimuthal part, marching from r to r + Δr at successive depths z1, z2, z3, ..., zN.]

Fig. 1. Resolution schemes for (a) Eq. (4) and (b) Eq. (5).

where Ŝ(ω) is the source spectrum and P̂(r, θ, z; −ω) = P̂*(r, θ, z; ω), with * denoting the complex conjugate, so that the time-domain acoustic pressure P = P(r, θ, z; t) is real-valued. Solving the 3-D propagation problem for each discrete frequency is performed using the parabolic approach presented above.

In summary, the computational complexity analysis gives an indication of the CPU time required for a calculation and shows that the 3DWAPE model can be naturally parallelised. The relevant parallelization strategy is straightforward and consists of two stages: first, a broadband computation is handled by distributing the calculations at each frequency over different processors. Second, the calculations at one single frequency are accelerated by distributing all the required matrix inversions over different processors. In the following section, this parallelization strategy is described in detail.

3. Multiprocessor Implementation

3.1. Parallelization strategy

The parallelization method is based on two principal algorithms: a frequency decomposition (FD) and a spatial decomposition (SD). We describe here each parallelization algorithm, used either separately or simultaneously. Some examples are given to illustrate the method. In the following, some classical parameters are used to characterize the computational performance of a parallel calculation. Suppose P processors are used. The speedup is defined as the time to complete an algorithm with only one processor divided by the time to complete the same algorithm with P processors. The efficiency is defined as the ratio of the speedup to P. Ideally, in a parallel calculation, the CPU times are inversely proportional to the number of processors used. However, communications between processors and data storage can deteriorate the efficiency of the parallel calculations.

3.1.1. First parallelization algorithm: frequency decomposition (FD)

The first parallelization algorithm is specifically dedicated to handling broadband signal propagation efficiently. Computing the time signal at a given receiver requires the following steps: first, the source pulse is decomposed using a Fourier transform. Then the 3-D propagation problem is solved independently at each frequency. Several numerical parameter values (e.g. Δr, Δθ, Δz) depend on the acoustic wavelength. Consequently, higher frequencies consume more CPU time. Thus, in order to optimize the first parallelization algorithm, a cyclic distribution of all the discrete frequencies is used to balance the processor workload. For example, suppose that the source spectrum is sampled using 256 discrete frequencies denoted fi, 1 ≤ i ≤ 256, and that P = 64 processors are available. The set of frequencies is divided into 64 frequency groups: F1 = {f1, f65, f129, f193}, F2 = {f2, f66, f130, f194}, . . . , F64 = {f64, f128, f192, f256}. All the frequencies within the same frequency group are then handled by the same processor. Hence, each processor handles both lower and higher frequencies within the frequency bandwidth. At the end of all the calculations, the frequency-domain solutions at the desired receiver position are collected by a single processor in order to perform an inverse Fourier transform. Note that the communications between processors occur only at the beginning and at the end of the whole process. Communication time is negligible and good performance, with an efficiency close to 100 percent, is thus expected.

3.1.2. Second parallelization algorithm: spatial decomposition (SD)

The second parallelization algorithm is based on a spatial decomposition of the 3-D PE calculations. It is thus dedicated to accelerating the calculations at one single frequency. Suppose the acoustic field is known at a given range r. As explained in Section 2, the computation of the solution at the next discrete range r + Δr is achieved in two successive steps corresponding to the N×2-D part and the azimuthal coupling part. The first step requires inverting M algebraic linear systems (see Fig. 1(a)). The parallelization strategy consists in distributing these inversions over different processors. Once this first step is accomplished, the results of each single processor are redistributed to the other processors in preparation for the azimuthal coupling part. The same parallelization strategy is then used to invert the N linear systems of the second step (see Fig. 1(b)). The results need to be redistributed between processors before starting the computation at the next discrete range. For this second parallelization algorithm, the efficiency may be limited by non-negligible communication times.

Processor ID   N×2-D                      Azimuthal coupling
P0             Θ1 = {θ1, ..., θ120}       Z1 = {z1, ..., z200}
P1             Θ2 = {θ121, ..., θ240}     Z2 = {z201, ..., z400}
P2             Θ3 = {θ241, ..., θ360}     Z3 = {z401, ..., z600}

Table 1. Illustration of the spatial segmentation of the calculation domain occurring in the second parallelization algorithm at a single frequency (example with M = 360, N = 600 and P = 3).

For example, suppose that a single-frequency calculation requires 360 azimuthal points (M = 360) and 600 depth points (N = 600), and that 3 processors are used (P = 3).


Then, for this second parallelization algorithm, the spatial domain is first decomposed into 3 groups of azimuth denoted Θ1 , Θ2 , Θ3 . All the azimuths which belong to the same azimuthal group are then handled by the same processor. Once this N×2-D step is achieved, the azimuthal coupling is then performed by applying the same procedure in the depth direction. The spatial domain is decomposed into 3 groups of depth denoted Z1 , Z2 , Z3 . All the depths within the same depth group are then handled by the same processor. This example is summarized in Table 1.
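The segmentation of this example can be reproduced with a simple contiguous block partition. The sketch below is an illustration only: `block_partition` is a hypothetical helper, and the rule used when the number of points is not a multiple of P (the first groups receive one extra point) is an assumption, since the example above only covers the evenly divisible case.

```python
def block_partition(n_items, n_procs):
    """Split the indices 1..n_items into n_procs contiguous blocks.

    Block sizes differ by at most one; the first (n_items % n_procs) blocks
    get the extra index. This remainder rule is an assumption of the sketch.
    """
    base, extra = divmod(n_items, n_procs)
    blocks = []
    start = 1
    for p in range(n_procs):
        size = base + (1 if p < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


# N x 2-D step: each processor inverts the systems of one azimuthal group.
theta_groups = block_partition(360, 3)
# Azimuthal coupling step: each processor inverts the systems of one depth group.
depth_groups = block_partition(600, 3)

for p, (tg, zg) in enumerate(zip(theta_groups, depth_groups)):
    print(f"P{p}: theta {tg[0]}..{tg[-1]}, z {zg[0]}..{zg[-1]}")
```

Between the two steps every processor needs, for its depth group, the values computed by all the other processors at their azimuths. In an MPI code this kind of redistribution is typically an all-to-all exchange; which MPI calls 3DWAPE actually uses is not stated here.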

3.1.3. Combination of both parallelization algorithms

To perform broadband spectrum signal computations, both parallelization algorithms can be combined. In this case, the aim is to allocate more than one processor to each single-frequency calculation. Consequently, the number of frequency groups, denoted G, must be less than the number of processors P. Note that G = P means that only the first parallelization algorithm is used. The processors are also partitioned into G groups: P1, P2, . . . , PG. A frequency group Fi is associated with a processor group Pi. For each successive frequency of Fi, all the processors of Pi are used simultaneously to carry out the second parallelization algorithm. To clarify the method, an example is given in Table 2. In this example, P = 10 processors are used to handle all the calculations. The acoustic field is calculated for an impulsive source discretized in 256 frequencies. For the first parallelization algorithm, we choose to split this set of frequencies into G = 3 frequency groups.

Freq. groups                  Proc. groups
F1 = {f1, f4, ..., f256}      P1 = {P0, P1, P2, P3}
F2 = {f2, f5, ..., f254}      P2 = {P4, P5, P6}
F3 = {f3, f6, ..., f255}      P3 = {P7, P8, P9}

Table 2. Illustration of the frequency distribution occurring in the first parallelization algorithm (example with 256 frequencies, P = 10 processors and G = 3 frequency groups).
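The two-level distribution can be sketched as follows. This is a minimal illustration, not the 3DWAPE implementation: the function names are hypothetical, the cyclic rule for the frequencies follows the description of Sec. 3.1.1, and the way the P processors are split into G groups (contiguous ranks, the first groups receiving one extra processor) is an assumption that happens to reproduce the processor groups of the example with P = 10 and G = 3.

```python
def cyclic_frequency_groups(n_freq, n_groups):
    """Group g (1-based) receives the frequency indices g, g + G, g + 2G, ..."""
    return [list(range(g, n_freq + 1, n_groups)) for g in range(1, n_groups + 1)]


def processor_groups(n_procs, n_groups):
    """Split processor ranks 0..P-1 into G contiguous groups.

    Group sizes differ by at most one (assumed rule, see lead-in).
    """
    base, extra = divmod(n_procs, n_groups)
    groups, start = [], 0
    for g in range(n_groups):
        size = base + (1 if g < extra else 0)
        groups.append(list(range(start, start + size)))
        start += size
    return groups


# First algorithm alone: 256 frequencies on 64 processors (G = P).
fd_only = cyclic_frequency_groups(256, 64)

# Combined algorithms: 256 frequencies, P = 10 processors, G = 3 groups.
freq_groups = cyclic_frequency_groups(256, 3)
proc_groups = processor_groups(10, 3)

for f, p in zip(freq_groups, proc_groups):
    print(f"f{f[0]}, f{f[1]}, ..., f{f[-1]} ({len(f)} frequencies) -> ranks {p}")
```

In an MPI program the processor groups would usually be realised as sub-communicators, so that the spatial decomposition of one frequency only involves the ranks of its own group; this is a common design choice rather than something documented in the text.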

3.2. Computational performances

The parallelised version of the 3DWAPE code has been tested and validated on several three-dimensional benchmarks [7]. In this section, some results of the high performance computing are presented for the ASA 3-D wedge shaped waveguide, whose configuration is described in detail in Refs. 11, 5, 12, 4. The computational cost of this original benchmark is particularly well suited to studying the performance of both parallelization algorithms. Indeed, in order to show the evolution of the computational performance when the number of processors increases, a compromise is necessary. On the one hand, when using only one processor to perform all the calculations, the CPU time of the benchmark should not be too high (say, less than two days) for convenience. On the other hand, the CPU time has to be sufficiently high when performing the calculations with the highest


number of processors available (here 64 processors) in order to limit the communication time relative to the calculation time. Indeed, for a high number of processors, a low efficiency is expected when the computational cost of a benchmark is particularly low. The source pulse is centered at 25 Hz with a 40 Hz bandwidth, which is decomposed into 281 discrete frequencies. The maximum computation depth and range are respectively 1000 m and 25 km. In this case, a calculation at 25 Hz requires M = 3240 azimuthal points and N = 500 depth points [4]. The parallel machine used is an HP SC45 cluster with 214 nodes, each of which contains 4 processors running at 1.25 GHz and 4 GB of RAM.

Table 3 presents CPU times corresponding to a calculation at 25 Hz and a range of 25 km for N×2-D and 3-D calculations. It is important to note that, even if only a small number of azimuthal points is necessary for the N×2-D calculations, the same number (M = 3240) as in the 3-D calculations is used here in order to estimate the computational cost related only to the azimuthal coupling. Table 4 shows the results for the source pulse propagated to a range of 16 km. Note that the results presented here are not averaged. Hence, the CPU times can change slightly from one computation to another, leading to efficiency values greater than 1. Good performance is obtained for both parallelization algorithms. As expected, the first parallelization algorithm (Table 4) provides a better efficiency than the second one (Table 3) since fewer communications between processors are required: with 64 processors, an efficiency of 76 % is provided by the first parallelization algorithm whereas the efficiency of the second parallelization algorithm is around 50 to 60 %.

Table 3. 3-D ASA wedge benchmark results at a single frequency (f = 25 Hz). CPU times are given in minutes (') and seconds (").

Number of    N×2-D        N×2-D     N×2-D        3-D          3-D       3-D
processors   CPU time     speedup   efficiency   CPU time     speedup   efficiency
1            20' 45.73"   1         1            52' 1.54"    1         1
2            9' 59.61"    2.08      1.04         22' 4.31"    2.36      1.18
4            4' 49.9"     4.30      1.07         9' 56.57"    5.23      1.31
8            2' 23.6"     8.67      1.08         6' 41.6"     7.77      0.97
16           1' 20.67"    15.44     0.97         4' 4.76"     12.75     0.80
32           47.28"       26.35     0.82         2' 22.55"    21.90     0.68
64           34.09"       36.54     0.57         1' 40.13"    31.17     0.49
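The speedup and efficiency columns follow directly from the definitions of Sec. 3.1 applied to the measured CPU times. As a short sketch (the helper names are ours, and only the 3-D CPU times of Table 3 are used as input):

```python
def to_seconds(minutes, seconds):
    """Convert a CPU time given in minutes and seconds to seconds."""
    return 60.0 * minutes + seconds


# 3-D CPU times of Table 3, indexed by the number of processors.
cpu_3d = {
    1: to_seconds(52, 1.54),
    2: to_seconds(22, 4.31),
    4: to_seconds(9, 56.57),
    8: to_seconds(6, 41.6),
    16: to_seconds(4, 4.76),
    32: to_seconds(2, 22.55),
    64: to_seconds(1, 40.13),
}


def speedup_and_efficiency(times):
    """Return {P: (speedup, efficiency)} with speedup = T(1)/T(P), efficiency = speedup/P."""
    t1 = times[1]
    return {p: (t1 / t, t1 / t / p) for p, t in times.items()}


results = speedup_and_efficiency(cpu_3d)
for p, (s, e) in sorted(results.items()):
    print(f"P = {p:2d}: speedup {s:5.2f}, efficiency {e:4.2f}")
```

Efficiencies above 1 for small P (for instance with 4 processors) reflect the run-to-run variability of single, non-averaged timings noted in the text.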

Table 3 shows that a 3-D calculation at a single frequency of 25 Hz with only one processor takes about one hour of computation. For the same calculation at 50 Hz, the incremental steps in depth, in range and in azimuth must be halved, i.e. the numbers of mesh points N, Nr and M must be doubled. According to Eq. (8), the complexity is then multiplied by a factor of 8 when doubling the frequency. For example, we expect a CPU time of around 8 hours for a CW calculation at 50 Hz and around 64 hours at 100 Hz. These computational costs are not acceptable and rule out broadband calculations. However, Table 3 shows that the calculation at 25 Hz using 64 processors takes less than 2 minutes. Then, supposing a good efficiency for higher frequencies, the same


Table 4. 3-D ASA wedge benchmark results for the pulse propagation. CPU times are given in days, hours (H.), minutes (') and seconds (").

Number of    4-D                       4-D       4-D
processors   CPU time                  speedup   efficiency
1            1 day 10H. 59' 52.9"      1         1
2            17H. 18' 59.03"           2.02      1.01
4            9H. 33' 32.1"             3.66      0.92
8            4H. 42' 6.89"             7.44      0.93
16           2H. 26' 29.89"            14.33     0.90
32           1H. 13' 44.57"            28.48     0.89
64           43' 3.64"                 48.77     0.76

calculation at 100 Hz would take about 2 hours, which is now reasonable and would allow broadband computations with an acceptable CPU time.

4. Investigation of 3-D acoustical effects

4.1. 3-D wedge benchmark at higher frequencies

The parallelised 3-D PE algorithm now allows us to investigate the azimuthal coupling effects at higher frequencies and at longer propagation ranges with a reasonable computational time. In this section, we present some results obtained for the ASA 3-D wedge benchmark with a CW source for different frequency values: 25 Hz, 50 Hz, 75 Hz, and 100 Hz. The maximum computation range is 80 km. A Padé 1 approximation (np = 1) is used in depth here, since no significant difference has been observed previously [6] with the solution using a Padé 2 approximation (np = 2). All the parameter values that define the computational grid (i.e. the incremental steps in range Δr and in depth Δz, but also the arclength segment ΔS at the maximum computation range) must remain a fixed fraction of the acoustic wavelength and thus need to be reduced when increasing the frequency. N×2-D and 3-D computations were carried out using Δr = 10 m and Δz = 1 m at 25 Hz, in accordance with Ref. 4, which describes in detail the calculations of the 3-D wedge benchmark at this frequency. These values correspond to Δr = λ/6 and Δz = λ/60, where λ denotes the acoustic wavelength. Using this criterion, the values of Δr and Δz are then adjusted for the other frequencies. The use of an eighth-order FD scheme in azimuth makes it possible to reach an accurate solution with a less restrictive criterion for ΔS with respect to the acoustic wavelength. The adjustment of the number of azimuthal points M, which determines the value of the azimuthal increment Δθ, is particularly important to reach a solution that accurately describes all the 3-D effects. A convergence study is required to determine the number of azimuthal points that is appropriate for each frequency: here, convergence is assessed by observing that the solution at θ = 90° is stabilized when doubling M, the number of azimuthal points. CPU times for all the calculations when increasing M are reported in Table 5. For each frequency, the highest value of M in Table 5 corresponds to the value for which convergence is reached. Evidently, the azimuthal mesh is oversampled at short range. Using, as in Ref. 17,


Table 5. CPU time for the 3-D ASA wedge benchmark at different CW source frequencies using 64 processors. The number of azimuthal points is doubled until convergence is reached. CPU times are given in hours (H.), minutes (') and seconds (").

M        25 Hz         50 Hz         75 Hz            100 Hz
360      2' 6.36"      3' 47.43"     7' 35.37"        10' 0.87"
720      2' 21.45"     4' 13.57"     11' 49.82"       15' 21.48"
1440     3' 16.42"     8' 37.61"     18' 10.72"       26' 21.01"
2880     5' 27.56"     15' 59.73"    37' 13.8"        48' 15.32"
5760     10' 25.83"    32' 33.96"    1H. 10' 8.1"     1H. 40' 52.75"
11520    20' 35.17"    57' 16.54"    2H. 27' 8.93"    3H. 16' 31.11"
23040    -             -             4H. 38' 51"      6H. 23' 51.86"
46080    -             -             -                12H. 45' 27.56"

an azimuthal increment that depends on range would be preferable since it would certainly reduce the CPU times. Besides, recall that due to the 1/r² term in the azimuthal equation, Eq. (5), the problem has a singularity at r = 0. Numerical simulations showed that using too many points in azimuth can lead to numerical problems (arithmetic overflow). Adapting the value of the azimuthal increment with range would also avoid this overflow problem around the source. This is currently not implemented. Here, the computations were performed using double precision arithmetic to avoid any numerical overflow close to the source.

Vertical slices of the transmission loss in the cross-slope direction are shown in Fig. 2 for the N×2-D and the 3-D solutions. For a better comparison between the 2-D and 3-D cases, TL-vs-range curves (in dB re 1 m) corresponding to the cross-slope direction and to a receiver depth of 30 m are displayed in Fig. 3. The 2-D and 3-D algorithms have been initialized at r = 0 using a Greene's source. For each frequency, the 2-D field exhibits at all ranges the interference pattern of all the propagating modes initially present at the source. For the 3-D solutions, the interference pattern changes during propagation. These changes are due to the horizontal refraction of each propagating mode. 3-D effects appear at shorter ranges for higher modes than for lower modes since their grazing angle is higher. This has been reported in detail in the literature [4,1,2]. Indeed, at 25 Hz, across-slope, one can clearly identify the cut-off ranges of mode 3, mode 2 and mode 1 at 11 km, 16 km, and 40 km, respectively. These cut-off ranges are also present at higher frequencies and can be seen on each subplot. For convenience, only the cut-off ranges of modes 1, 2 and 3 are marked on each subplot. Note that, looking at Fig. 2, the cut-off range of mode 1 is reached before 80 km at 25 Hz and 50 Hz but not at 75 Hz and 100 Hz.
In order to observe more precisely the horizontal refraction effects, the 3-D PE algorithm can be initialized by each individual propagating mode. We performed all the calculations for each propagating mode at each of the four frequency values but only one example is given here for an excitation of mode 3. Figure 4 displays the modal ray diagrams of mode 3 in the horizontal plane for different frequencies. These modal ray paths were calculated

[Figure 2 panels: Nx2-D and 3-D at 25 Hz, 50 Hz, 75 Hz and 100 Hz; horizontal axis RANGE (km); vertical axis DEPTH (m).]

Fig. 2. Transmission loss (vertical slices for a receiver at a fixed azimuth of 90°) corresponding to 2-D (left plots) and 3-D (right plots) calculations at 25 Hz, 50 Hz, 75 Hz, and 100 Hz. The cut-off ranges for modes 1, 2, 3 are marked by the dashed lines on each 3-D subplot.

as in Refs. 13, 4. In Fig. 5, the vertical slices of the transmission loss in the cross-slope direction obtained with 3-D PE calculations are represented. We observe a good agreement between the two sets of subplots. Indeed, across-slope, one can observe a succession of three distinct zones, denoted I, II, and III, corresponding to one single modal-ray arrival (zone I), multiple modal-ray arrivals (zone II), followed by a shadow zone (zone III) for which there is no arrival. The transition between zones II and III exhibits a higher intensity zone (caustic). Zone II starts roughly at the same range independently of the source frequency. This behaviour is consistent with the modal ray paths in the horizontal plane (Fig. 4): a higher source frequency leads to a further excursion of the acoustic field in the up-slope direction. The width of the multiple modal-ray arrival zone increases with frequency (Note

[Figure 3 panels: 2-D and 3-D curves at 25 Hz (3-D with M = 5760), 50 Hz (M = 11520), 75 Hz (M = 23040) and 100 Hz (M = 46080); horizontal axis RANGE (km), 0 to 80; vertical axis TRANSMISSION LOSS (dB re 1 m), −110 to −50.]

Fig. 3. Transmission loss for a receiver at a depth of 30 m and at a fixed azimuth of 90°, corresponding to N×2-D (dashed lines) and 3-D (solid lines) calculations at 25 Hz, 50 Hz, 75 Hz, and 100 Hz.

[Figure 4 panels: four top-view panels; horizontal axis ACROSS-SLOPE RANGE (km), 0 to 80; vertical axis UP-SLOPE (km), −4 to 4.]

Fig. 4. Modal ray diagrams (top view) obtained for mode 3 at 25 Hz, 50 Hz, 75 Hz, and 100 Hz.
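The frequency dependence of the up-slope excursion seen in Fig. 4 can be illustrated with a short calculation. The sketch below is not the modal-ray code used for the figure. It assumes an ideal wedge (pressure-release surface, rigid bottom, isovelocity water), adiabatic modes, and illustrative values for the sound speed, the water depth at the source and the slope; these values and the function names are assumptions made here. Under these assumptions the horizontal wavenumber of mode m at local depth h is k_m(h) = sqrt(k^2 - ((m - 1/2) pi / h)^2), and Snell's law in the horizontal plane, k_m(h) cos(phi) = constant, gives the depth at which a ray launched at an angle phi0 from the across-slope direction turns back.

```python
import math

C0 = 1500.0                    # assumed sound speed in water (m/s)
H_SOURCE = 200.0               # assumed water depth at the source (m)
SLOPE = math.radians(2.86)     # assumed wedge slope


def turning_depth(freq_hz, mode, launch_deg, h_source=H_SOURCE, c0=C0):
    """Water depth (m) at which an adiabatic modal ray turns back down-slope.

    Ideal-wedge assumption: vertical wavenumber (mode - 1/2) * pi / h.
    launch_deg is measured from the across-slope direction, towards the apex.
    """
    k = 2.0 * math.pi * freq_hz / c0
    g = (mode - 0.5) * math.pi
    if k <= g / h_source:
        raise ValueError("mode is cut off at the source depth")
    phi0 = math.radians(launch_deg)
    # Horizontal Snell invariant: k_m(h) * cos(phi) = k_m(h_source) * cos(phi0).
    # At the turning point cos(phi) = 1, which leads to the expression below
    # for the squared vertical wavenumber (g / h_t)^2.
    gamma_t_sq = (k * math.sin(phi0)) ** 2 + (g / h_source * math.cos(phi0)) ** 2
    return g / math.sqrt(gamma_t_sq)


def upslope_excursion(freq_hz, mode, launch_deg):
    """Up-slope distance (m) between the source and the turning point."""
    return (H_SOURCE - turning_depth(freq_hz, mode, launch_deg)) / math.tan(SLOPE)


excursions = {f: upslope_excursion(f, mode=3, launch_deg=10.0)
              for f in (25, 50, 75, 100)}
for f, y in excursions.items():
    print(f"{f:>3} Hz: mode 3 turns back about {y / 1000:.2f} km up-slope")
```

For a fixed launch angle, the turning depth decreases as the frequency increases, so the ray travels further up-slope before turning back, which is the trend described in the text.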

that it is effectively zero at 25 Hz), and so the onset of the shadow zone is shifted in range accordingly. These 3-D effects have already been demonstrated by, e.g., Harrison (Refs. 14-16): in Ref. 14, analytical expressions describing ray paths and shadow zone boundaries were derived for several bottom geometries (e.g. wedge, ridge, seamount) using a ray approach. Note that for mode 3, the shadow zone starts before the maximum computation range (80 km) at each of the four frequencies considered and can thus be clearly observed on each subplot of Fig. 4. Finally, it is worth noting that the 3-D PE calculations predict the presence of mode 2 across-slope (evident in the shadow zone) although the initial field only included mode 3 (Fig. 5). This is probably due to coupling phenomena occurring during the downslope propagation of mode 3. Of course, mode 2 cannot appear in the modal-ray solutions, which are based on adiabatic mode theory.

4.2. Computations in a Realistic Environment

Realistic numerical simulations are possible using classical geophysical models for the ocean bathymetry and the sound speed profiles. The feasibility of the procedure is illustrated with an example in the Mediterranean Sea close to the east coast of Corsica (Fig. 6). The point source is located at 30 m depth, +42.5° latitude and +9.7° longitude. The Smith and Sandwell data set (Ref. 8), which provides an average sampling of 2' (i.e. approximately 1.4 km in longitude and 1.8 km in latitude in this region), is used in the following calculations. The GDEM-V data set (Ref. 9), with a 30' resolution, is used to include the 16 sound speed profiles in the region of interest. Some preliminary results concerning this example have already been published for a

Fig. 5. Transmission loss (vertical slices for a receiver at a fixed azimuth of 90°) for an excitation of mode 3, corresponding to 3-D calculations at 25 Hz, 50 Hz, 75 Hz, and 100 Hz. The three distinct zones are delimited by vertical dashed lines on each subplot. [Four panels: 3-D at 25 Hz, 50 Hz, 75 Hz, 100 Hz, with zones I, II and III labelled. Axes: range (km), depth (m).]

Table 6. CPU time for the realistic example in the Mediterranean Sea at different CW source frequencies using 64 processors (h: hours, ': minutes, ": seconds). The number of azimuthal points M is doubled until convergence is reached.

M        15 Hz        45 Hz         60 Hz
360      36.91"       2' 26.25"     29' 31.31"
720      42.53"       3' 0.65"      42' 4.41"
1440     49.88"       4' 28.81"     1 h 2' 38.65"
2880     1' 2.5"      7' 20.44"     1 h 22' 57.02"
5760     1' 40.98"    13' 10.5"     2 h 20' 28.31"
11520    -            25' 23.85"    4 h 18' 42.7"
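As a reading aid for Table 6, the timings can be converted to seconds and compared from one entry to the next. The sketch below only re-uses the values printed in the table; the helper names are illustrative. Since M is doubled between consecutive entries, a ratio close to 2 would suggest that the cost is dominated by work proportional to M, while a smaller ratio points to a cost that does not depend on M.

```python
# CPU times from Table 6 as (hours, minutes, seconds), listed in the order in
# which M is doubled. Only successive ratios are used, so the M values
# themselves are not needed here.
TABLE_6 = {
    15: [(0, 0, 36.91), (0, 0, 42.53), (0, 0, 49.88), (0, 1, 2.5),
         (0, 1, 40.98)],
    45: [(0, 2, 26.25), (0, 3, 0.65), (0, 4, 28.81), (0, 7, 20.44),
         (0, 13, 10.5), (0, 25, 23.85)],
    60: [(0, 29, 31.31), (0, 42, 4.41), (1, 2, 38.65), (1, 22, 57.02),
         (2, 20, 28.31), (4, 18, 42.7)],
}


def to_seconds(hms):
    """Convert an (hours, minutes, seconds) tuple to seconds."""
    hours, minutes, seconds = hms
    return 3600.0 * hours + 60.0 * minutes + seconds


def doubling_ratios(times_s):
    """Ratio of each CPU time to the previous one (M doubles between entries)."""
    return [later / earlier for earlier, later in zip(times_s, times_s[1:])]


ratios = {}
for freq_hz, entries in TABLE_6.items():
    times_s = [to_seconds(entry) for entry in entries]
    ratios[freq_hz] = doubling_ratios(times_s)
    print(f"{freq_hz} Hz:", ", ".join(f"{r:.2f}" for r in ratios[freq_hz]))
```

With the tabulated values, every ratio lies between 1 and 2, and the ratio grows with M at 15 Hz and 45 Hz.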

15 Hz CW source (Ref. 10). Calculations for a CW source at 15 Hz, 45 Hz, and 60 Hz have been performed here. The algorithm is initialized at r = 0 with a Greene's source and the maximum computation range is 50 km. Here, a Padé 2 approximation (np = 2) in depth is used. A fourth-order FD scheme in azimuth is used, since the eighth-order FD scheme used for the benchmark (Sec. 4.1) is only implemented for an environment that is symmetric about the θ = 0 plane and therefore cannot handle a varying bottom topography. The incremental steps in

Fig. 6. Maps of the region of interest in the Mediterranean Sea (east coast of Corsica). The point source is represented by the white central dot on the left subplot, and by the black dot on the vertical dashed line of the right subplot. [Two bathymetry panels, depth in m, horizontal axes in km; the azimuths θ = 90° and θ = 240° and Corsica are labelled.]

Fig. 7. Transmission loss for a receiver at 30 m depth and at a fixed azimuth of 90°, corresponding to N×2-D (dashed line) and 3-D (solid line) calculations at 15 Hz. [Axes: range (km), transmission loss (dB re 1 m).]

range and in depth are ∆r = λ/5 and ∆z = λ/20 respectively, where λ is the acoustic wavelength. As previously explained in Sec. 4.1, the number of azimuthal points M is determined by a convergence study: M is doubled until the solution observed at θ = 90° has stabilized. The CPU times for all the

Fig. 8. Transmission loss (horizontal slices for a receiver at 30 m depth and vertical slices for a receiver at a fixed azimuth of 90°) corresponding to N×2-D (left plots) and 3-D (right plots) calculations at 15 Hz. [Panels: N×2-D, 15 Hz and 3-D, 15 Hz. Axes: range (km), depth (m).]
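The range and depth steps quoted in this section, ∆r = λ/5 and ∆z = λ/20, can be turned into numbers for the three source frequencies. The sketch below is only an illustration: the reference sound speed of 1500 m/s is an assumption (it is not specified in this section), and the function name is hypothetical. The 50 km maximum range is the value stated in the text.

```python
import math

C0 = 1500.0          # assumed reference sound speed (m/s)
MAX_RANGE_M = 50e3   # maximum computation range stated in the text (50 km)


def grid_steps(freq_hz, c0=C0):
    """Return (wavelength, range step, depth step) in metres."""
    wavelength = c0 / freq_hz
    return wavelength, wavelength / 5.0, wavelength / 20.0


for freq_hz in (15.0, 45.0, 60.0):
    lam, dr, dz = grid_steps(freq_hz)
    n_range = math.ceil(MAX_RANGE_M / dr)
    print(f"{freq_hz:4.0f} Hz: lambda = {lam:6.1f} m, dr = {dr:5.2f} m, "
          f"dz = {dz:5.2f} m, range steps = {n_range}")
```

Because both steps scale with the wavelength, the number of range steps and the number of depth points each grow linearly with frequency, in addition to any growth of M.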

calculations when increasing M are reported in Table 6. For each frequency, the highest value of M in Table 6 is the value for which convergence is reached. Figure 8 shows the transmission losses at 90° azimuth (see left subplot of Fig. 6) corresponding to N×2-D and 3-D calculations at 15 Hz. For the same azimuth, Fig. 7 displays the TL-vs-range curve (in dB re 1 m) corresponding to a receiver depth of 30 m. Both figures confirm the significant presence of horizontal refraction effects: the modal structure of the field is strongly modified along the east coast of Corsica. The N×2-D solution clearly shows three propagating modes at all ranges along the θ = 90° azimuthal direction, whereas the 3-D solution exhibits a horizontal deviation of the acoustic energy, which leads to shadowing effects for mode 3 and mode 2 at ranges of 25 km and 45 km respectively along θ = 90°. These effects are qualitatively the same as those analyzed and quantified in detail in the ASA 3-D wedge benchmark studies (Ref. 5), although the environmental parameters considered here are different. We also illustrate the feasibility of the procedure at a higher frequency of 60 Hz by showing transmission losses corresponding to N×2-D and 3-D calculations (Fig. 10) at 240° azimuth (see left subplot of Fig. 6). Fig. 9 displays, for the same azimuth, the TL-vs-range curve (in dB re 1 m) corresponding to a receiver depth

Fig. 9. Transmission loss for a receiver at 30 m depth and at a fixed azimuth of 240° (right plot), corresponding to N×2-D (dashed line) and 3-D (solid line) calculations at 60 Hz. [Axes: range (km), transmission loss (dB re 1 m).]
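The convergence study used in this section to choose M (doubling the number of azimuthal points until the solution stabilizes) can be sketched as a simple loop. This is an illustration only: the paper judges convergence by observing the solution at a given azimuth, whereas the sketch uses a numerical tolerance, which is an assumption made here. The `solve` callable and the toy solver are hypothetical stand-ins for a 3-D PE run.

```python
import math


def converge_by_doubling(solve, m_start, tol_db, m_max):
    """Double M until the solution changes by less than tol_db everywhere.

    solve(m) is a hypothetical callable returning the transmission loss (dB)
    along the azimuth of interest, sampled on a fixed range grid.
    Returns (M, solution) for the first M whose solution agrees with the
    solution obtained at M / 2 to within tol_db.
    """
    m = m_start
    previous = solve(m)
    while 2 * m <= m_max:
        m *= 2
        current = solve(m)
        change = max(abs(a - b) for a, b in zip(current, previous))
        if change < tol_db:
            return m, current
        previous = current
    raise RuntimeError(f"no convergence up to M = {m_max}")


def toy_solver(m):
    """Stand-in for a 3-D PE run: a smooth curve plus an error decaying as 1/M."""
    return [-60.0 - 0.5 * r + 2000.0 * math.sin(r) / m for r in range(1, 51)]


m_converged, _ = converge_by_doubling(toy_solver, m_start=360, tol_db=0.1,
                                      m_max=46080)
print("toy problem converged at M =", m_converged)
```

Each pass through the loop costs one full run at the new M, so the total cost of the study is dominated by the last one or two values of M tried.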

of 30 m. At 60 Hz, significant 3-D effects are also clearly present: Fig. 10 shows that mode 3 disappears around 25 km range due to horizontal refraction. This shadowing of mode 3 during propagation can be verified by observing the mode cut-off in the penetrable bottom between 30 km and 40 km range.

Fig. 10. Transmission loss (vertical slices for a receiver at a fixed azimuth of 240°) corresponding to N×2-D (left plots) and 3-D (right plots) calculations at 60 Hz. [Panels: N×2-D, 60 Hz and 3-D, 60 Hz. Axes: range (km), depth (m).]

5. Conclusion

Sound propagation modeling in 3-D and/or 4-D has often been limited by computation time in the past. To overcome this difficulty, an existing 3-D PE code has been implemented in a multiprocessor environment. The parallelization strategy chosen is a two-level procedure: a frequency decomposition and a spatial decomposition of the 3-D PE calculations. The parallelised algorithms have been validated and their computational performance has been analysed for the 3-D ASA wedge benchmark. With the first parallelization algorithm, a broadband signal calculation is efficiently accelerated by distributing the calculations for each frequency independently on different processors. As expected, good computational performance is obtained in this case since only a few communications between processors are needed. The second parallelization algorithm accelerates the calculations at a single frequency by distributing all the required matrix inversions on different processors. This second algorithm also gives good results, although communications are no longer negligible. The parallelised 3-D PE code allowed a preliminary investigation of 3-D effects at higher frequency and longer propagation range for the wedge benchmark. When increasing frequency, we observed that for each propagating mode, the modal deviation due to the 3-D wedge is in agreement with modal ray path calculations. Furthermore, we highlighted an interesting result concerning modal coupling phenomena occurring during downslope propagation, illustrated in this paper by the presence of mode 2 in the shadow zone of mode 3. This mode coupling effect has been pointed out, but it needs to be analysed in detail in future work. In this paper, we focused only on the 3-D acoustical effects at a single frequency. Future work will therefore also address broadband signal propagation, together with a study of modal dispersion as in Ref. 4. Reasonable CPU times are expected. Parallel computations overcome CPU time limitations and make it possible to analyse 3-D acoustical effects for different propagation scenarios with higher signal frequency and/or propagation distance. It is also now possible to use the parallelised version of the code in more realistic configurations including geophysical data: an example has been presented in this paper. Modal deviation and shadowing effects due to bathymetric slopes have been clearly observed in our example in the Mediterranean Sea for two distinct azimuths and


two different frequencies. The analysis of 3-D effects in other realistic oceanic environments, with different bathymetric variations and/or significant sound speed gradients, can now be addressed.
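The first level of the parallelization summarized above, in which each frequency of a broadband signal is computed independently, can be pictured with a small scheduling sketch. This is not the authors' implementation: the round-robin rule, the function name and the example frequency list are assumptions chosen for illustration. Since the grid steps scale with the wavelength, the cost of a run grows with frequency, so interleaving low and high frequencies is one simple way to keep the load per processor comparable.

```python
def assign_frequencies(freqs_hz, n_procs):
    """Round-robin assignment of independent frequency runs to processors."""
    if n_procs < 1:
        raise ValueError("n_procs must be at least 1")
    buckets = [[] for _ in range(n_procs)]
    for index, freq in enumerate(freqs_hz):
        buckets[index % n_procs].append(freq)
    return buckets


# Hypothetical broadband case: 64 frequency bins shared among 8 processors.
freqs = [10.0 + 0.5 * i for i in range(64)]
plan = assign_frequencies(freqs, n_procs=8)
for rank, bucket in enumerate(plan):
    print(f"processor {rank}: {len(bucket)} frequencies, "
          f"first = {bucket[0]} Hz, last = {bucket[-1]} Hz")
```

Because the runs share no data, each processor only needs to return its results at the end, which matches the observation that few communications are required at this level.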

Acknowledgments

The authors would like to express their thanks to Olivier Bertrand for his technical contribution to the parallelization of the code.

References

1. S. Glegg, G. Deane, I. House, Comparison between theory and model scale measurements of the three-dimensional sound propagation in a shear supporting penetrable wedge, J. Acoust. Soc. Am., 94(4), 2334-2342 (1993).
2. A. Tolstoy, 3-D Propagation Issues and Models, J. Comp. Acoust., 4(3), 243-271 (1996).
3. C. Chen, J., T. Lin, D. Lee, Acoustic three-dimensional effects around Taiwan strait: computational results, J. Comp. Acoust., 7(1), 15-26 (1999).
4. F. Sturm, Numerical study of broadband sound pulse propagation in three-dimensional oceanic waveguides, J. Acoust. Soc. Am., 117(3), 1058-1079 (2005).
5. F. Sturm, Examination of Signal Dispersion in a 3-D Wedge-Shaped Waveguide using 3DWAPE, Acta Acustica united with Acustica, 88, 714-717 (2002).
6. F. Sturm, J.A. Fawcett, On the use of higher-order azimuthal schemes in 3-D PE modeling, J. Acoust. Soc. Am., 113, 3134-3145 (2003).
7. K. Castor, F. Sturm, P. F. Piserchia, Acoustical Propagation Modeling Using the Three-Dimensional Parabolic Equation Based Code 3DWAPE within a Multiprocessing Environment, J. Acoust. Soc. Am., 116(4), 2549-2550 (2004).
8. W. H. Smith, D. T. Sandwell, Global Seafloor Topography from Satellite Altimetry and Ship Depth Soundings, Science, 277, 1956-1962 (1997).
9. W. Teague, M. Carron, P. Hogan, A comparison between the generalized digital environmental model and Levitus climatologies, J. Geophys. Res., 7167-7183 (1990).
10. K. Castor, F. Sturm, P. F. Piserchia, Analysis of 3-D acoustical effects in a realistic oceanic environment, Proc. Int. Conf. Underwater Acoustic Measurements: Technologies & Results, Heraklion, Crete, Greece, 28 June-1 July 2005.
11. F. Jensen, C. Ferla, Numerical solutions of range-dependent benchmark problems in ocean acoustics, J. Acoust. Soc. Am., 87(4), 1499-1510 (1990).
12. E. Westwood, Broadband modeling of the three-dimensional penetrable wedge, J. Acoust. Soc. Am., 92(4), 2212-2222 (1992).
13. J. Fawcett, Modeling three-dimensional propagation in an oceanic wedge using parabolic equation methods, J. Acoust. Soc. Am., 93(5), 2627-2632 (1993).
14. C. H. Harrison, Acoustic shadow zones in the horizontal plane, J. Acoust. Soc. Am., 65(1), 56-61 (1979).
15. C. H. Harrison, Wave solutions in three-dimensional ocean environments, J. Acoust. Soc. Am., 93(4), 1826-1840 (1993).
16. C. H. Harrison, Three-dimensional ray paths in basins, troughs, and near seamounts by use of ray invariants, J. Acoust. Soc. Am., 62(6), 1382-1388 (1977).
17. F. Sturm, N. A. Kampanis, Accurate treatment of a general sloping interface finite-element 3-D narrow-angle PE model, In Press.