Hardware solutions for radar processing: FPGA / GPU / CPU trade-offs
J.F. Degurse – Y. Mourot – H. Cantalloube – L. Savy
SONDRA Workshop in Singapore, March 21st, 2012
Hardware components for radar processing
FPGA
GPU
CPU
Characteristics of candidate embedded products

              AMD Phenom II   AMD Radeon   NVIDIA GeForce GT240   Xilinx Virtex-6
              1090T CPU       E6760 GPU    GPU (GRA 111)          SX475T FPGA
SP GFLOPS     153.6           576          250                    550
GFLOPS / W    1.23            16.5         7.1                    13.7
GFLOPS / $    0.54            N/A          2.7                    0.14
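The ratios in the table can be turned back into an approximate power draw and unit price per product, which makes the comparison easier to sanity-check. The sketch below only divides the figures quoted on the slide; the function name `implied_watts_and_dollars` and the dictionary layout are illustrative assumptions, and the results are rounded estimates rather than vendor data.

```python
# Figures copied from the slide: (SP GFLOPS, GFLOPS per W, GFLOPS per $).
# None marks the value the slide lists as N/A.
PRODUCTS = {
    "AMD Phenom II 1090T CPU": (153.6, 1.23, 0.54),
    "AMD Radeon E6760 GPU": (576.0, 16.5, None),
    "NVIDIA GeForce GT240 GPU (GRA 111)": (250.0, 7.1, 2.7),
    "Xilinx Virtex-6 SX475T FPGA": (550.0, 13.7, 0.14),
}


def implied_watts_and_dollars(gflops, gflops_per_watt, gflops_per_dollar):
    """Return (watts, dollars) implied by a peak rate and its two ratios."""
    watts = gflops / gflops_per_watt
    dollars = None if gflops_per_dollar is None else gflops / gflops_per_dollar
    return watts, dollars


for name, figures in PRODUCTS.items():
    watts, dollars = implied_watts_and_dollars(*figures)
    price = "N/A" if dollars is None else f"{dollars:.0f} $"
    print(f"{name}: about {watts:.0f} W, about {price}")
```

Read this way, the table says the CPU draws roughly three times the power of either GPU or the FPGA, while the FPGA is by far the most expensive part per unit.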
Why use FPGAs in radar processing?
Nowadays, ADCs can sample signals at rates up to a few gigasamples per second:
• Signal sampling on the carrier frequency or at the first intermediate frequency in the radar receiver
Two configurations are often encountered on modern versatile radars (different modes at different times):
1. Wideband + a few channels (1 to 4)
2. Narrowband + high dynamic range + numerous channels (10 to 100, or even 1000 for the full digital radar)
Need to reduce the data rate:
• Real-time digital beamforming
• Real-time filtering and decimation
Digital Receiver
[Figure: synoptic of a digital receiver (one channel); labels: Analog Signal, ADC, FPGA, 16 bits, 16 bits]
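The filtering and decimation performed in the FPGA can be illustrated with a minimal digital down-converter (DDC): mix the real ADC samples to baseband, low-pass filter, and keep one output out of every `decim`. This is only a sketch of the arithmetic in Python, not the FPGA design from the slides; the function name `ddc`, its parameters, and the moving-average filter used in the example are assumptions.

```python
import math


def ddc(samples, fs, f_carrier, decim, taps):
    """Mix real samples to baseband, FIR low-pass filter, and decimate.

    Returns complex I/Q samples at a rate of fs / decim.
    """
    # 1. Complex mixing: multiply by exp(-j*2*pi*f_carrier*n/fs).
    mixed = []
    for n, s in enumerate(samples):
        phase = 2.0 * math.pi * f_carrier * n / fs
        mixed.append(s * complex(math.cos(phase), -math.sin(phase)))

    # 2. FIR filtering, computed only at the output instants that survive
    #    decimation (this is what saves multipliers in hardware).
    out = []
    for n in range(len(taps) - 1, len(mixed), decim):
        out.append(sum(taps[k] * mixed[n - k] for k in range(len(taps))))
    return out


# Example: a unit-amplitude tone exactly on the carrier, sampled at 4x the carrier.
FS, FC, DECIM = 8.0, 2.0, 4
tone = [math.cos(2.0 * math.pi * FC * n / FS) for n in range(32)]
iq = ddc(tone, FS, FC, DECIM, taps=[0.25] * 4)
```

In this example the 32 real input samples become 8 complex samples, each close to I = 0.5, Q = 0: the length-4 moving average removes the image produced by the mixer, and the output rate is divided by `DECIM`.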
Benefits of Digital Receiver
Benefits:
• Cost-effective compared to an analog solution
• I and Q are well balanced, as they are digitally filtered
• Highly reconfigurable
BUT requires highly skilled people to program FPGAs
Why GPU for radar processing instead of CPU?
[Figure: GPU computing power in GFlops (double precision); labels: T20, Tesla C2075, Kepler C3070]
Potential benefits:
• Best GFlops/W and GFlops/$ ratios
• 500 GFlops for 200 W on Tesla C2075
• Cost (~2 K$)
• No ITAR issue (mass market)
• 100% COTS and open source
• Scientific libraries available (cuBLAS for linear algebra)
• Existing programming environment (CUDA)
Open questions:
• How to formalize radar processing for an efficient GPU implementation?
• What effective computing power for which radar processing?
• High performance on massively parallel algorithms
• High-level parallelization tasks
e.g. GPU: NVIDIA Tesla C2075, 500 GFlops (double precision), 2.5 GFlops per Watt
Application case 1 at Onera: space surveillance ground radar
The current supercomputer (100+ PowerPC G4 processors) is being replaced by a single GPU card.
Mercury SC: price = 1 M$ (2003), TDP > 8 kW
Tesla C2075: price = 2.5 K$ (2011), TDP = 215 W
Application case 1 at Onera: space surveillance ground radar
Performance: GPU/CPU speedup = x7.2 (* all 8 cores active)
Application case 1 at Onera: space surveillance ground radar
Digital beamforming + detection, 1200 beams
Performance: GPU/CPU speedup = x16.5 (* all 8 cores active)
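Digital beamforming maps well onto a GPU because every beam is an independent weighted sum of the same channel samples. The sketch below forms a set of beams from one snapshot of a uniform linear array; the array geometry, the half-wavelength spacing, and the names `steering_vector` and `beamform` are assumptions made for illustration, since the slides do not describe the antenna.

```python
import cmath
import math


def steering_vector(n_channels, angle_rad, spacing=0.5):
    """Phase ramp of a plane wave on a uniform linear array.

    spacing is the element spacing in wavelengths (assumed 0.5 here).
    """
    return [
        cmath.exp(2j * math.pi * spacing * k * math.sin(angle_rad))
        for k in range(n_channels)
    ]


def beamform(snapshot, angles_rad, spacing=0.5):
    """Return one complex output per beam: (1/N) * a(angle)^H * snapshot."""
    n = len(snapshot)
    beams = []
    for angle in angles_rad:  # each beam is independent, hence parallelizable
        a = steering_vector(n, angle, spacing)
        beams.append(sum(ak.conjugate() * xk for ak, xk in zip(a, snapshot)) / n)
    return beams


# Example: an 8-channel snapshot of a target at 20 degrees, 13 beams from -60 to 60.
ANGLES = [math.radians(d) for d in range(-60, 61, 10)]
target = steering_vector(8, math.radians(20))
powers = [abs(b) ** 2 for b in beamform(target, ANGLES)]
strongest = powers.index(max(powers))  # index of the beam pointing at 20 degrees
```

With 1200 beams and many range gates, the same computation becomes a large matrix product between a steering matrix and the channel data, which is the kind of operation a library such as cuBLAS is designed to accelerate.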
Application case 2 at Onera: STAP on Airborne Radar
Rugged GPGPU product available for the defense and aerospace market: MAGIC 1
GE-IP's rugged GRA111, NVIDIA GeForce GT240 GPU
CPU only = 16 GFlops peak, 60 W, 0.27 GFlops/W
CPU+GPU = 250 GFlops, 100 W, 2.5 GFlops/W
Courtesy of GE Intelligent Platforms
Application case 2 at Onera: STAP on Airborne Radar
Reference processing

$$ w^H = \left( w_q^H \Gamma_{th} w_q \right) \cdot \frac{w_q^H \Gamma_{th} \Gamma^{-1}}{w_q^H \Gamma_{th} \Gamma^{-1} \Gamma_{th} w_q} $$

$$ w_q^H = [\, \underbrace{1 \; 1 \dots 1}_{N \text{ times}} \; \underbrace{0 \; 0 \dots 0}_{(N-1)M \text{ times}} \,] $$

[Figure: STAP data cube (N spatial channels, Mp pulses, K range gates) feeding STAP transversal filters (N channels, M taps, delays τ, weights w*_1, w*_2, …, w*_K, w*_{K+1}, …, w*_{K(M-1)+1}, …, w*_{KM}), with a matrix inversion step, followed by an FFT on each STAP channel]
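The weight formula can be evaluated without forming Γ⁻¹ explicitly, by solving one linear system. The sketch below assumes that Γ and Γ_th are Hermitian (the slides do not define them; a plausible reading is total and thermal-noise covariance matrices), in which case w_q^H Γ_th Γ⁻¹ is the conjugate transpose of v = Γ⁻¹ Γ_th w_q. The function names and the small test matrices are illustrative assumptions.

```python
def solve(a, b):
    """Solve a x = b (complex) by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [[complex(v) for v in row] + [complex(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0j] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def matvec(a, v):
    """Matrix-vector product a v."""
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]


def hdot(x, y):
    """Hermitian inner product x^H y."""
    return sum(complex(xi).conjugate() * yi for xi, yi in zip(x, y))


def stap_weights(gamma, gamma_th, wq):
    """Return w such that w^H matches the slide formula (Hermitian inputs assumed)."""
    u = matvec(gamma_th, wq)          # Gamma_th w_q
    v = solve(gamma, u)               # Gamma^-1 Gamma_th w_q
    scale = hdot(wq, u) / hdot(u, v)  # (w_q^H Gamma_th w_q) / denominator
    return [scale.conjugate() * vi for vi in v]


# Example with N = 2 channels and M = 2 taps: unit thermal noise plus one interferer.
WQ = [1, 1, 0, 0]
G_TH = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
S = [1, 1j, -1, -1j]  # hypothetical interference signature
G = [[G_TH[i][j] + 10 * S[i] * complex(S[j]).conjugate() for j in range(4)]
     for i in range(4)]
w = stap_weights(G, G_TH, WQ)
```

Two properties follow directly from the formula and are convenient checks: when Γ equals Γ_th the weights reduce to w_q, and in general w^H Γ_th w_q = w_q^H Γ_th w_q.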
Application case 2 at Onera: STAP on Airborne Radar
Matrix inversion
Double-precision inversion of 1000 matrices of size 60x60
Performance: GPU/CPU speedup = x6.0
* All 4 cores active
** DP matrix inversion
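The benchmark workload is a batch of many small, independent inversions, which is why it parallelizes well: each matrix can be handled by its own group of threads. A CPU-side reference for such a batch might look like the sketch below. It uses Gauss-Jordan elimination on real matrices for brevity; the slides do not say which algorithm or data type was used, and the names `invert`, `random_batch`, and `max_residual` are assumptions.

```python
import random


def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    # Work on the augmented matrix [A | I].
    m = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0.0:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def random_batch(count, n, seed=0):
    """Generate `count` well-conditioned n x n test matrices."""
    rng = random.Random(seed)
    batch = []
    for _ in range(count):
        a = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
        for i in range(n):
            a[i][i] += n  # diagonal dominance keeps the matrix invertible
        batch.append(a)
    return batch


def max_residual(a, a_inv):
    """Largest absolute entry of A * A_inv - I."""
    n = len(a)
    return max(
        abs(sum(a[i][k] * a_inv[k][j] for k in range(n)) - (1.0 if i == j else 0.0))
        for i in range(n) for j in range(n)
    )
```

Calling `[invert(a) for a in random_batch(1000, 60)]` reproduces the size of the slide's workload, although pure Python is far too slow to serve as a timing baseline; the loop over the batch is the part a GPU would execute in parallel.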
Application case 3 at Onera: SAR Processing
Processing time for one SAR image
Stripmap mode, Tacq = 2 min
Image size (range x azimuth) = 10 000 x 80 000 pixels
[Figure: processing-time chart; the GPU configuration is labelled (+1*Intel X5650)]
To reach the Tesla C2050's performance, 11 CPUs are needed!
Application case 3 at Onera: SAR Processing
Comparison GPU/CPU at the same performance
[Figure: two bar charts, price (€) and electricity consumption (W), comparing Nvidia C2050 (+1*Intel X5650) with 11*X5650]
Reaching the same performance with CPUs costs and consumes 5 times more!
Hardware solutions for radar processing
Summary

FPGAs are the only solution to face high data rates (digital receiver)
+ Broadband, data rate reduction (DDC), performance
- Highly complex programming, hard to modify, cost

CPUs do sequential work efficiently (and control the GPU)
+ Easy programming, portability, easy algorithm evolution
+ Fast on iterative / not easily parallelizable operations (CFAR, …)
- Performance, electricity consumption, cost

GPUs perform very well on huge parallel tasks
+ Performance on parallel processing, easy algorithm evolution
+ Efficiency, small size
- Data transfers, iterative operations