A DYNAMICALLY RECONFIGURABLE BLUETOOTH BASE BAND UNIT

John Esquiagola, Guilherme Ozari, Marcio Teruya, Marius Strum, Wang Chau
Microelectronics Laboratory – Polytechnic School of São Paulo University
Av. Prof. Luciano Gualberto, Trav. 3, Nro 158. Cidade Universitária, São Paulo – Brazil
{jedward, galmeida, teruya, strum, wang}@lme.usp.br

ABSTRACT

In this paper, we present two different implementations of a dynamically reconfigurable Bluetooth Baseband Unit (BB_Unit) using Xilinx Virtex-II FPGAs. The design flow started from a non-RTL SystemC model that was functionally verified against an untimed SystemC golden model developed at Synopsys. This model was progressively refined (using Synopsys tools) until a synthesizable RTL model was obtained. The Xilinx modular design methodology was used to derive the partial and total bitstreams. Two partitions of the BB_Unit were tested: header-payload (P1) and RX-TX (P2). The best results were obtained for the P2 partition on an XC2V250 component. Such a partition requires three sequential time intervals to process a Bluetooth packet: t_rec + t_ini + t_proc. The reconfigurable area occupied 4 columns and 24 bus-macros, requiring a reconfiguration time of t_rec = 480 µs, smaller than the 625 µs time slot specified to either transmit or receive a packet.

1. INTRODUCTION

The concept of reconfigurable computing [1] was established many years ago. However, it has only recently become practical, thanks to the availability of hardware offering the capabilities needed for "real-world" applications. Dynamic reconfiguration of FPGAs has recently become viable with the introduction of devices that allow high-speed partial reconfiguration, e.g. the Xilinx Virtex series [2]. Many of the systems designated as reconfigurable architectures can only be statically reconfigured. Static reconfiguration means completely configuring the device before system execution. A dynamically reconfigurable device allows a portion of a user design to remain in place while another portion is being updated. Currently, there is a growing trend in the industry to provide dynamically reconfigurable devices with varying degrees of configuration flexibility.

The approach of applying reconfigurable logic to data processing has been demonstrated in areas such as video transmission, image recognition and various pattern matching operations [3, 4]. Another area of interest is wireless systems, where tremendous computational capabilities are needed to allow for high data rates in the future. In the Bluetooth area, there are several implementations utilizing processors and specific hardware modules [5, 6]. These implementations utilize embedded processors such as the Xilinx Microblaze or the Altera Nios, but do not support dynamic reconfiguration. Specification at a high abstraction level is possible in environments such as SystemC. SystemC is an emerging standard modeling platform based on C++ that supports design abstraction at the RTL, behavioral and system levels [7, 8].

Our goal is to compare two different implementations (partitions) of a dynamically reconfigurable baseband module of a Bluetooth controller using a commercial Xilinx Virtex-II FPGA. The design started from a high-level SystemC model that was refined (and functionally verified) down to the RTL level and finally synthesized using standard CAD tools. Section 2 describes the Bluetooth standard. Section 3 details the overall system architecture. Section 4 shows how the system was partitioned to obtain two dynamic reconfiguration implementations, and also describes the design methodology. Section 5 presents the results obtained for each partition. Section 6 presents our conclusions and future work.

2. BLUETOOTH STANDARD

Bluetooth is a standard for short-distance wireless communications developed by the Bluetooth Special Interest Group (SIG) [9]. The technology was developed to replace cables connecting portable or desktop devices and to build low-cost wireless networks for mobile and portable devices. The Bluetooth stack in Fig. 1 goes from the high-level application layer down to the low-level radio frequency layer. The baseband layer has been implemented in this work. The Bluetooth standard operates at 2.4 GHz in the ISM (Industrial, Scientific and Medical) band with GFSK (Gaussian Frequency Shift Keying) modulation.

Bluetooth devices that share the same channel form a network called a piconet, with a single unit acting as master and the other units acting as slaves. Up to seven slaves can be active in the piconet. The channel is represented by a pseudo-random hopping sequence over 79 or 23 RF frequencies. The hopping sequence is unique for each piconet and is determined by the Bluetooth device address of the master. The nominal hop rate is 1600 hops/s. The standard defines two types of link between master and slaves: the Synchronous Connection-Oriented (SCO) link and the Asynchronous Connection-Less (ACL) link.

Fig. 1. Bluetooth Stack

This work was supported by a CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico) grant.

0-7803-9362-7/05/$20.00 ©2005 IEEE

3. OVERALL SYSTEM ARCHITECTURE

The Bluetooth Baseband Unit (BB_Unit) implements all the baseband functionality required to establish ACL links. Its behavior was described in SystemC 2.1.0 at a non-RTL level. The core scheme is presented in Fig. 2. Each module communicates directly with all modules of the same level and with the module that contains it. The hierarchy also represents functional affinity. The innermost level of the hierarchy corresponds to the bit processing modules (CRC, FEC, Cipher, FHEC and Whitening). The next level addresses the stream-level processing (StreamProc) and the temporal synchronization (Corr and BTClock). The streams acquire their timing characteristics in the next level (PacketProc). The top-level modules deal with the interfacing aspects that allow the connection of the BB_Unit with the processor that executes the Bluetooth upper-layer tasks (ACCESSCTR, PACKETBUFFER, SRAM) and with the radio link (RADIOCTR). In the present work the BB_Unit was designed to communicate with a LEON [10] processor through an AMBA APB [11] interface.

Fig. 2. Bluetooth Baseband Unit

3.1. BB_Unit Modules

The TX/RX bitstream processing varies depending on the type of packet. The RX processing is always different from the TX processing. Fig. 3 shows the header bitstream processing tasks according to the Bluetooth specification [9]. An error check code (HEC) is first added to the header. The resulting stream is then scrambled (Whitening) and encoded (FEC 1/3). Fig. 4 shows the processing tasks of the payload. It is similar to the header processing. An extra encryption step may be added after the CRC generation. FEC 2/3 coding is used instead of FEC 1/3. Whitening is the only mandatory process for every payload; all other processes are optional and depend on the type of packet and the enabled mode. The inverse processes (for both the header and the payload) are carried out in the receiver.

Fig. 3. Header bitstream processing

Fig. 4. Payload bitstream processing

The BB_Unit has been decomposed into 8 modules, each performing one or more tasks: CRC (generation + checking), FEC 2/3 (encoding + decoding), FHEC (FEC 1/3 encoding + decoding, HEC generation + checking), Whitening (whitening + de-whitening), BTClock, Corr, PacketBuffer, StreamProc, PacketProc. The bit-level operations (error detection and correction, DC bias reduction and data encryption) are required to deal with the imperfections of the RF channel. The CRC module detects the presence of errors in the payload bits and the FEC module corrects these errors. The FHEC does the same for the packet header bits. The Whitening module reduces the DC bias in the packet bitstream. At the stream level, the bit-level modules are used by the StreamProc module to compose and decompose packet streams. The Corr module identifies the packets directed to the Bluetooth unit from the RF channel. It also constructs the access code used to communicate with the other units. The BTClock module regulates the timing in the baseband. PacketProc stands at the top and coordinates the operation of all lower modules. The PacketBuffer holds incoming and outgoing packets. The remaining blocks (CIPHER, ACCESSCTL, SRAM) were not implemented in the present version.
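The whitening and FEC 1/3 steps of the header chain, together with their inverses in the receiver, can be illustrated with a minimal Python sketch. This is not the authors' SystemC code: all function names are hypothetical, HEC generation is omitted, and the LFSR seed and bit ordering are simplifying assumptions rather than the exact register layout of the Bluetooth specification. FEC 1/3 is modelled as a three-fold repetition code decoded by majority vote, and whitening as an XOR with a 7-bit LFSR sequence (feedback polynomial x^7 + x^4 + 1 assumed).

```python
def lfsr_sequence(seed, length):
    """Return `length` bits from a 7-bit LFSR (assumed polynomial x^7 + x^4 + 1)."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero, otherwise the LFSR stays at zero")
    bits = []
    for _ in range(length):
        out = (state >> 6) & 1                # bit shifted out of the register
        feedback = out ^ ((state >> 3) & 1)   # assumed tap positions
        state = ((state << 1) | feedback) & 0x7F
        bits.append(out)
    return bits


def whiten(bits, seed):
    """XOR the stream with the LFSR sequence; applying it twice restores the input."""
    return [b ^ k for b, k in zip(bits, lfsr_sequence(seed, len(bits)))]


def fec13_encode(bits):
    """Rate-1/3 repetition code: emit every bit three times."""
    return [b for b in bits for _ in range(3)]


def fec13_decode(coded):
    """Majority vote over each group of three received bits."""
    if len(coded) % 3 != 0:
        raise ValueError("coded length must be a multiple of 3")
    return [1 if sum(coded[i:i + 3]) >= 2 else 0 for i in range(0, len(coded), 3)]


def transmit_header(header_bits, seed):
    """TX order from the text: whitening first, then FEC 1/3 (HEC omitted here)."""
    return fec13_encode(whiten(header_bits, seed))


def receive_header(coded_bits, seed):
    """RX applies the inverse steps in reverse order."""
    return whiten(fec13_decode(coded_bits), seed)


# Example with an arbitrary 10-bit header and an arbitrary seed.
header = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
sent = transmit_header(header, seed=0x5A)
corrupted = list(sent)
corrupted[4] ^= 1                             # flip one bit on the channel
recovered = receive_header(corrupted, seed=0x5A)
```

Because the whitening sequence depends only on the seed, the same routine serves for whitening and de-whitening, which is consistent with a single Whitening module covering both directions.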

4. DYNAMIC SYSTEM IMPLEMENTATION

A suitable Virtex-II FPGA was chosen for the static case to serve as a reference design. In order to reduce the size of the hardware, the "Virtual Hardware" technique was used. This technique exploits the partial and total dynamic reconfiguration capability of current FPGAs, allowing a large circuit to be partitioned into small sub-circuits. Virtex-II FPGA devices from Xilinx were used because of their partial reconfiguration capability. Based on the system’s architecture (set of modules) and on the task sequence, we performed two partitions: temporal (task schedule) and physical (module assignment to the fixed and reconfigurable areas). In our study, we tested two different temporal partitions for the dynamic reconfiguration of the system: header-payload and RX-TX.

4.1. Header - Payload Partition

Partition P1: the baseband bitstream processing functionality was split into header and payload processing. Figs. 3 and 4 show that the header and payload bitstreams require different modules and that their execution flow is sequential (header first and then payload). The header requires the FHEC and Whitening modules, while the payload requires the CRC, Whitening and FEC modules. The SystemC reference StreamProc code was modified in order to initialize the bitstream processing modules at different times: t1 for FHEC and t2 for CRC and FEC. A new state machine performs the reconfiguration process. More interfaces than in the static case are needed, since communication between the fixed and reconfigurable parts must be ensured. The Whitening module is instantiated inside the StreamProc because it is used for both the header and the payload bitstream processing. The physical partition split the FPGA into a fixed and a reconfigurable area. Fig. 5 shows the partial architecture for time t2. Bus-macros are used to ensure data communication between the fixed and reconfigurable modules. As each bus-macro is only 4 bits wide, we instantiated as many of them as required to transmit all the required signals. For each RX/TX process, two reconfigurations of the system are required in order to process one packet (header and payload) within one Bluetooth time slot.

Fig. 5. First dynamic architecture

4.2. RX - TX Partition

Partition P2: the baseband bitstream processing functionality was split into packet reception (RX) and transmission (TX). This partition affected the StreamProc functionality. In the static solution, this module performs the following tasks: control of the multiplexer connecting the PacketBuffer to the other modules, module initialization, reading the type of the packet to be transmitted, and transmission and reception of the header and the payload. The static StreamProc was implemented as a set of five state machines (see footnote 1). Two new StreamProc modules were created by adding or adapting the original state machines. The PacketProc module controls when to load StreamProc TX or RX (times t1 or t2). The physical partition split the FPGA into three areas: two fixed and one reconfigurable. Fig. 6 shows the partial architecture for time t1. Only one reconfiguration is required within one Bluetooth time slot.

Fig. 6. Second dynamic architecture

Footnote 1: Each FSM controls: initialization, header TX, payload TX, header RX and payload RX.
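Since Section 4.1 states that a bus-macro is only 4 bits wide, the number of bus-macros on the boundary between the fixed and reconfigurable areas follows from the total width of the signals that cross it. The short Python sketch below shows this count. The signal names and widths are hypothetical and do not come from the actual BB_Unit interface, and the sketch assumes signals can be packed densely into bus-macros; direction or placement constraints in a real design may raise the count.

```python
import math

BUS_MACRO_WIDTH_BITS = 4  # width stated in Section 4.1


def bus_macros_needed(signal_widths):
    """Return the number of 4-bit bus-macros needed for the given signals.

    `signal_widths` maps a signal name to its width in bits.
    Dense packing across bus-macros is an assumption of this sketch.
    """
    total_bits = sum(signal_widths.values())
    return math.ceil(total_bits / BUS_MACRO_WIDTH_BITS)


def bus_macro_capacity(count):
    """Maximum number of signal bits that `count` bus-macros can carry."""
    return count * BUS_MACRO_WIDTH_BITS


# Hypothetical boundary interface, for illustration only.
example_interface = {"data_in": 8, "data_out": 8, "enable": 1, "reset": 1, "done": 1}
needed = bus_macros_needed(example_interface)  # 19 bits -> 5 bus-macros
```

Read the other way round, the 9 and 24 bus-macros reported in Section 5 for P1 and P2 can carry at most 36 and 96 signal bits respectively.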

4.3. Design Flow

Figure 7 shows the top-down design flow of the Bluetooth baseband design. It is based on Synopsys and Xilinx CAD tools. We first developed a non-RTL SystemC model for the baseband static solution. We adopted a Bluetooth Golden Model developed at Synopsys as the reference model. A set of testbenches was developed in order to perform hierarchical functional verification of the design [12]. The non-RTL SystemC model was refined to create an RTL SystemC model. The functional verification was repeated to validate this model. This model was further refined to create the models for each partition (see Sections 4.1 and 4.2). These models were again verified using adapted testbenches. Finally, the RTL SystemC models were automatically translated into synthesizable VHDL (Verilog) (see footnote 2) using the Synopsys Design Compiler. The verification of these models was done using SystemC-VHDL (or SystemC-Verilog) cosimulation. Then, the Xilinx modular design methodology [13, 14] was applied to implement each solution. This methodology comprises the following phases:

• Initial Budgeting: a top-level design, instancing all modules, is created. The user constraint file (*.ucf) is also created. This file is generated manually. Its correct construction is essential because all steps of the modular design are based on it.
• Module Implementation: each module, reconfigurable or not, is placed and routed separately. The bitstreams for each module are then created.
• Final Assembly: all the modules are merged and referenced as in the top-level design. The previous placements and routings are preserved. All bitstream configurations are created in this phase.

Fig. 7. Design flow

5. RESULTS

Several Virtex-II FPGAs were tested in order to find the best reconfiguration time for each partition (P1 and P2) of the application. According to the Virtex-II Platform User Guide, the maximum configuration bitstream programming rate is 50 MHz without handshaking and 66 MHz with handshaking. Using the SelectMAP interface, one byte is written per clock cycle. The reconfigurable part of P1 required 2 columns and 9 bus-macros, while P2 required 4 columns and 24 bus-macros. Both situations correspond to the minimal number of columns suggested by the Xilinx modular design methodology. As the adopted clock frequency was 50 MHz (20 ns period) and the bitstream size depends on the selected component, the reconfiguration time can be obtained from:

t_rec (µs) = bitstream_size (Kbytes) × 20

The synthesis results (equivalent gates, maximum clock frequency, bitstream size and reconfiguration time) for partitions P1 and P2 are shown in Table 1 for different components of the Virtex-II family.

Table 1. Synthesis results for the header/payload (P1) and the RX/TX (P2) partitions.

Device     Partition   Equiv. gates   Max. clock freq.   Bitstream size   t_rec
XC2V1000   P1          2,897          126.17 MHz         24 KB            480 µs
XC2V1000   P2          6,026          73.46 MHz          38 KB            760 µs
XC2V500    P1          2,897          126.17 MHz         21 KB            420 µs
XC2V500    P2          6,026          73.46 MHz          31 KB            620 µs
XC2V250    P1          2,897          126.17 MHz         16 KB            320 µs
XC2V250    P2          6,026          73.46 MHz          24 KB            480 µs

In the Bluetooth system, a complete packet transmission occurs during one time slot of 625 µs. The first partition requires two reconfigurations in order to process a Bluetooth packet. Even with the smallest device, the XC2V250, the total reconfiguration time is 320 × 2 = 640 µs, which exceeds the Bluetooth time slot. The second partition needs only one reconfiguration, and its reconfiguration time is 480 µs (best case). Furthermore, the reconfigurable area is well utilized, as shown in Fig. 8. The number of equivalent gates is 6,026, consuming almost all the CLBs available in 4 columns.

Footnote 2: Due to problems when using the Synopsys DC_Shell tool, some modules were translated into VHDL while others into Verilog.
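The timing argument of this section can be reproduced with a few lines of Python. The sketch below takes the bitstream sizes from Table 1, the 50 MHz byte-wide configuration clock and the 1600 hops/s hop rate from the text, and computes the time left in one slot after reconfiguration. The function and variable names are hypothetical, and 1 Kbyte is taken as 1000 bytes, which is what the factor of 20 in the reconfiguration-time formula implies.

```python
CONFIG_CLOCK_HZ = 50_000_000           # configuration clock, one byte per cycle
HOP_RATE_PER_S = 1600                  # nominal Bluetooth hop rate
SLOT_US = 1_000_000 // HOP_RATE_PER_S  # one time slot: 625 µs

# Partial bitstream sizes in Kbytes, copied from Table 1.
BITSTREAM_KB = {
    "XC2V1000": {"P1": 24, "P2": 38},
    "XC2V500": {"P1": 21, "P2": 31},
    "XC2V250": {"P1": 16, "P2": 24},
}

# Reconfigurations needed per time slot (Sections 4.1 and 4.2).
RECONFIGS_PER_SLOT = {"P1": 2, "P2": 1}


def t_rec_us(size_kb):
    """Reconfiguration time in µs for a bitstream of `size_kb` Kbytes.

    Assumes 1 Kbyte = 1000 bytes, so each Kbyte costs 20 µs at 50 MHz.
    """
    return size_kb * 1000 * 1_000_000 // CONFIG_CLOCK_HZ


def slack_us(device, partition):
    """Time left in one slot for initialization and processing.

    A negative value means reconfiguration alone exceeds the slot.
    """
    total = RECONFIGS_PER_SLOT[partition] * t_rec_us(BITSTREAM_KB[device][partition])
    return SLOT_US - total


for device, sizes in BITSTREAM_KB.items():
    for partition in sizes:
        print(device, partition, slack_us(device, partition), "µs")
```

For the XC2V250 this gives -15 µs for P1 (640 µs of reconfiguration in a 625 µs slot) and 145 µs for P2, in line with the figures quoted in the text.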

Fig. 8. Layout of the RX/TX partition for the XC2V250

6. CONCLUSIONS

Dynamic reconfiguration offers new challenges for complex designs. This work presented a model of a run-time reconfigurable Bluetooth Baseband Unit (BB_Unit) and a design flow for applying this technique. We tested two different dynamically reconfigurable implementations of the system. The current state of the implementation and the major aspects of the design process were presented and discussed.

Two different partitions, P1 (header-payload) and P2 (RX-TX), were tested. The overall execution of the BB_Unit functionality for partition P2 (best result) requires three sequential time intervals: t_rec + t_ini + t_proc. The system is first (re)configured, next the modules are initialized with their respective parameters, and finally the process (RX or TX) is carried out. According to Table 1, there are 625 - 480 = 145 µs available for t_ini + t_proc (XC2V250 component). During t_rec (TX→RX or RX→TX), the PacketProc, PacketBuffer and Corr modules are executing their functions.

We are now improving (and debugging) our RTL SystemC model by adding isolation switches in order to simulate the dynamic reconfiguration using the DCS tool [15]. We are also looking for better partitions for the BB_Unit.

7. REFERENCES

[1] A. DeHon and J. Wawrzynek, “Reconfigurable computing: what, why, and implications for design automation,” in Proc. 26th Conference on Design Automation, 1999, pp. 610-615.

[2] Xilinx Inc., “Virtex-II Platform User Guide,” UG002 (v1.5).

[3] J. Hadley and B. Hutchings, “Design methodologies for partially reconfigurable systems,” in Proc. IEEE Symposium on FPGA's for Custom Computing Machines (FCCM'95), 1995, pp. 78–84.

[4] D. Ross, O. Bellacot, and M. Turner, “An FPGA-based hardware accelerator for image processing,” in Proc. International Workshop on Field Programmable Logic and Applications on more FPGAs, 1994, pp. 299-306.

[5] Mathew D'Souza, “Embedded bluetooth stack implementation with NIOS softcore processor,” School of Information Technology and Electrical Engineering, The University of Queensland, October 2001.

[6] Jens Eliasson, “Design and evaluation of a Bluetooth enabled system based on a Microblaze soft processor,” Master thesis, Lulea University of Technology, Sweden, 2003.

[7] J. Bhasker, “A SystemC Primer,” Star Galaxy Publishing, 2002.

[8] Open SystemC Initiative (OSCI), SystemC Documentation, http://www.systemc.org

[9] Bluetooth System Specification, Volume 1, Core ver. 1.1, Feb. 2001.

[10] LEON Specification, http://www.gaisler.com, 2004.

[11] AMBA Specification, ARM, 1999.

[12] E. Romero, M. Strum, and Wang Jiang Chau, “Comparing Two Testbench Methods for Hierarchical Functional Verification of a Bluetooth Baseband Adaptor,” International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), 2005.

[13] Xilinx Inc., “XAPP290 – Two flows for partial reconfiguration: Module based or difference based,” http://www.xilinx.com.

[14] Xilinx Inc., “Development System Reference Guide. Chapter 4 – Modular Design,” http://toolbox.xilinx.com/docsan/xilin4/pdf/docs/dev/dev.pdf

[15] P. Lysaght and J. Stockwood, “A Simulation tool for dynamically reconfigurable FPGAs,” IEEE Transactions on VLSI Systems, September 1996.