A HEURISTIC APPROACH TO SCHEDULE PERIODIC REAL-TIME TASKS ON RECONFIGURABLE HARDWARE

Klaus Danne ∗ and Marco Platzner

Department of Computer Science, University of Paderborn

∗ Supported by the DFG Research Training Group 776.

0-7803-9362-7/05/$20.00 ©2005 IEEE

ABSTRACT

This paper deals with scheduling periodic real-time tasks on reconfigurable hardware devices, such as FPGAs. Reconfigurable hardware devices are increasingly used in embedded systems. To utilize these devices also for systems with real-time constraints, predictable task scheduling is required. We formalize the periodic task scheduling problem and propose two preemptive scheduling algorithms. The first is an adaptation of the well-known Earliest Deadline First (EDF) technique to the FPGA execution model. Although the algorithm shows good scheduling performance, it lacks an efficient schedulability test and requires a high number of FPGA configurations. The second algorithm uses the concept of servers that reserve area and execution time for other tasks. Tasks are successively merged into servers, which are then scheduled sequentially. While this method is inferior to the EDF-based technique regarding schedulability, it comes with a fast schedulability test and greatly reduces the number of required FPGA configurations.

1. INTRODUCTION

Real-time systems are embedded computing systems that must react within precise time constraints to events from their environment. Example application domains, as reported in [1], include control of power plants, railway switching systems, automotive applications, flight control systems, robotics, telecommunication systems and many more. For most of these systems it is already common practice, or at least conceivable for the near future, to include reconfigurable hardware devices to implement computations. Reconfigurable hardware devices, the most prominent one being the field-programmable gate array (FPGA), are general-purpose devices that can be programmed after fabrication. SRAM-based FPGA variants can be re-programmed arbitrarily often, opening up the way to FPGA-based multitasking.

While real-time scheduling has been intensively studied for microprocessor-based systems [1, 2, 3], the investigated task scheduling and placement strategies for reconfigurable hardware devices have mostly focused on non-real-time application models [4, 5, 6, 7]. Most authors assume a 2-dimensional area model and partial reconfigurability, and treat tasks as relocatable rectangles which can be placed anywhere on the FPGA device. Placement and scheduling strategies in off-line and on-line application scenarios are considered, mostly optimizing cost functions such as the total makespan or the average response time. To the best of our knowledge, [8] is the only related work considering FPGA real-time scheduling. There, problems of non-preemptively scheduling aperiodic tasks to the 1- and 2-dimensional area models are treated.

The practical realization of multitasking on current FPGA technology raises several issues. First, partial reconfiguration is often limited in practice by device architectures and insufficient tool support. Some FPGA families are not partially reconfigurable at all. Second, the issue of communication between tasks is rarely considered in the models used. Finally, most related projects require tasks to be relocatable, which might be difficult to achieve for modern FPGA architectures that are not fully homogeneous.

Our work differs in that we use full FPGA reconfiguration and focus on preemptive periodic real-time scheduling. The full reconfiguration model can be used on all SRAM-based FPGAs and can be realized using standard design implementation tools. Task preemption requires a runtime system to be able to save the state of a task and, later on, resume it. Concepts and implementations of preemptive execution environments on FPGAs can be found in [7, 9].

Fig. 1. Target architecture of an embedded FPGA computer (blocks: FPGA, configuration controller, configuration files CF 1 ... CF n, configuration bus, data bus, I/O, DAC).

The typical embedded reconfigurable target architecture is shown in Fig. 1, and comprises an FPGA, a controller, memory, and various I/O devices. Besides the embedded software and data sections, the memory stores the configurations (i.e., the programming bitstreams) for the logic resource. For such an architecture, we are interested in devising scheduling algorithms for periodic real-time tasks respecting the following objectives:

• high scheduling performance: We want to be able to generate feasible schedules for a wide range of task sets.

• efficient schedulability test: We want to quickly decide whether all tasks will meet their deadlines in a given schedule.

• small number of required FPGA configurations: We want to minimize the number of FPGA configurations which, in turn, minimizes the required amount of embedded memory.

In this paper, we present the formal modeling of the scheduling problem and two scheduling algorithms: EDF-NF and MSDL. EDF-NF is a straightforward adaptation of the EDF algorithm to our specific system model. While showing remarkable scheduling performance, EDF-NF lacks an efficient schedulability test and requires an impractically large number of FPGA configurations. MSDL comes with a test of acceptable efficiency and keeps the number of required configurations small, at the price of a decreased scheduling performance. The basic principles of these two algorithms have been published previously [10]. This paper extends our initial ideas and includes as novel contributions i) the detailed analysis of the MSDL algorithm and ii) simulation experiments delivering a quantitative evaluation of the required number of configurations for both scheduling techniques.

2. THE SCHEDULING PROBLEM

2.1. Task and Resource Models

We consider a set of periodic tasks $\Gamma$. Each task $T_i \in \Gamma$ refers to some computation which has to be performed periodically. The instances $T_{i,j}$ of task $T_i$ are released with period $P_i$. That is, the release time of instance $T_{i,j+1}$ is given by $r_{i,j+1} = r_{i,j} + P_i$, where $r_{i,j}$ is the release time of instance $T_{i,j}$. $C_i$ denotes the worst-case computation time of task $T_i$, which is the same for all of its instances. The finishing time of task instance $T_{i,j}$ is denoted by $f_{i,j}$. In our model, we assume real-time tasks with deadlines equal to periods. Hence, the deadline of a task instance $T_{i,j}$ is given by the release time of the next instance, $r_{i,j+1}$. Finally, the amount of reconfigurable logic resources a task requires is given by $A_i$. We normalize all resource requirements to the available resource offered by the FPGA. Assuming that no single task requires more resources than available, we get $A_i \in [0, 1]$.

The considered reconfigurable hardware device offers a certain amount of computational resources, e.g., the configurable logic blocks of an FPGA, which is also referred to as the area of the device. We normalize this area to 1. The device can execute any set $R \subseteq \Gamma$ of tasks simultaneously, as long as the amount of resources required by the task set does not exceed the available area, i.e., $\sum_{T_i \in R} A_i \le 1$.

A running instance of a task $T_i$ can be preempted by another task $T_j$ before its completion and, later on, be resumed. More generally, any set of running tasks $R$ can be preempted to execute a new set of tasks $\tilde{R}$. Technically, the runtime system has to interrupt the execution of $R$ and save the contexts of all tasks $T_i \in R$. Then, the FPGA is fully reconfigured with a new configuration including all tasks $T_j \in \tilde{R}$. When $R$ is scheduled for execution again, the previously saved contexts of $T_i \in R$ are restored and $R$ is restarted. The time for the preemption and restore processes is neglected in our scheduling analysis. For current FPGA devices, these times are in the range of a few to a few tens of milliseconds. We currently assume task execution times at least one order of magnitude higher than that, and intend to model preemption overheads in future work.

Table 1. Example task set $\Gamma^*$

  T_i    P_i   C_i   A_i   U_i^T   U_i^S
  T1     4     2     1/2   1/2     1/4
  T2     6     5     1/4   5/6     5/24
  T3     12    3     3/4   1/4     3/16
  Γ*                       1.58    0.65

Fig. 2. Preemptive schedule for three periodic FPGA tasks (upper part: release times, deadlines, and running task sets over time 0 to 12; lower part: FPGA area shared by T1, T2, and T3 over time).

As an example, Fig. 2 displays a possible schedule for the task set shown in Table 1. The upper part of Fig. 2 indicates the release times and deadlines for the tasks, as well as the running tasks. The lower part of Fig. 2 illustrates the tasks' areas and the sharing of the FPGA area over time. Overall, four different FPGA configurations are needed for this schedule. The schedule shown can easily be proven feasible, because every task instance meets its deadline for the entire hyper-period of the task set (which amounts to 12 time units). The hyper-period is the least common multiple of all task periods in the task set. A feasible schedule defined over the hyper-period can be repeated an infinite number of times without any missed deadline.

Formally, a schedule for the task set $\Gamma$ assigns a set of running tasks $R_k \subseteq \Gamma$ to every point in time $k$, such that $\sum_{T_i \in R_k} A_i \le 1$. No task instance may start execution before its release time. We call the schedule feasible if each task instance finishes its execution no later than its deadline, i.e., $\forall i, j: f_{i,j} \le r_{i,j+1}$.
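The feasibility definition above can be turned into a small checker. The following Python sketch is only an illustration under stated assumptions: time is discretized into unit slots, the names `TASKS`, `hyper_period`, `is_feasible`, and `SCHEDULE` are hypothetical, and the schedule shown is hand-constructed for the task set of Table 1 (it is not the schedule of Fig. 2). It requires Python 3.9 or later for `math.lcm`.

```python
from fractions import Fraction
from math import lcm

# Task set of Table 1: name -> (period P_i, computation time C_i, area A_i)
TASKS = {
    "T1": (4, 2, Fraction(1, 2)),
    "T2": (6, 5, Fraction(1, 4)),
    "T3": (12, 3, Fraction(3, 4)),
}


def hyper_period(tasks):
    """Least common multiple of all task periods."""
    return lcm(*(period for period, _, _ in tasks.values()))


def is_feasible(tasks, schedule):
    """Check a schedule given as a list of running sets R_k, one per unit slot.

    Assumption: all tasks are released at time 0 and the list covers exactly
    one hyper-period. A task listed in a slot after it has already received
    C_i units simply holds a reserved but idle area.
    """
    if len(schedule) != hyper_period(tasks):
        return False
    # Area constraint: the running set must fit on the device in every slot.
    for running in schedule:
        if sum(tasks[name][2] for name in running) > 1:
            return False
    # Deadline constraint: every instance needs C_i slots inside its period.
    for name, (period, comp, _) in tasks.items():
        for release in range(0, len(schedule), period):
            window = schedule[release:release + period]
            if sum(name in running for running in window) < comp:
                return False
    return True


# Hand-constructed schedule that alternates between two configurations.
A, B = {"T1", "T2"}, {"T2", "T3"}
SCHEDULE = [A, A, B, B, B, A, A, B, A, A, B, B]

print(hyper_period(TASKS), is_feasible(TASKS, SCHEDULE))
```

Because the check only covers one hyper-period, it relies on the argument made in the text that a feasible schedule over the hyper-period can be repeated indefinitely.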

2.2. Utilization Metrics

We define two utilization metrics to measure the computational load generated by a task set $\Gamma$. These metrics are central to the scheduling algorithm proposed in Section 4. Similar to the processor utilization factor defined in single processor real-time scheduling, we define the time-utilization factor of a task set $\Gamma$ to be

$$U^T(\Gamma) = \sum_{T_i \in \Gamma} \frac{C_i}{P_i}.$$

For the special case that all tasks are executed sequentially, $U^T$ is the fraction of time the FPGA spends executing tasks, whereas $1 - U^T$ is the idle time. While such a sequential schedule can mean an enormous waste of resources, it has two advantages. First, it allows us to rely on efficient schedulability tests known from single processor scheduling. Second, the number of required FPGA configurations is bounded by the number of tasks.

Improved scheduling techniques will try to better utilize the FPGA resources and execute several tasks in parallel. To describe the computational load for such a situation, we define as a more expressive metric the system-utilization factor of a task set $\Gamma$ as

$$U^S(\Gamma) = \sum_{T_i \in \Gamma} \frac{C_i}{P_i} A_i.$$

$U^S$ represents the fraction of the area-time product occupied by a task set. Visually, $U^S$ corresponds to the gray areas in the schedule of Fig. 2. The white areas in the schedule of Fig. 2 correspond to the unused computational resource. Obviously, we cannot find a feasible schedule for a task set with $U^S > 1$. Whether a feasible schedule exists for a task set with $U^S \le 1$ depends on the specific relations among the task properties, in particular the area requirements $A_i$.

$U^T$ and $U^S$ are also defined for single tasks, written $U_i^T$ and $U_i^S$, since a single task is a (minimal) task set. Table 1 shows the time and system utilization factors for the example tasks as well as for the complete task set. For example, task T2 has $U_2^T = 5/6$ and $U_2^S = (5/6)(1/4) = 5/24$. As we cannot expect to fully utilize the FPGA area, the resulting system utilization will generally stay below 1. In this paper, we use $U^S$ to experimentally rate the quality of a scheduling algorithm. We do not attempt to derive bounds for $U^S$ that could be used to decide schedulability for a given algorithm.

3. EDF-NF SCHEDULING

We adopt the simple EDF strategy, which has been successfully used in single and multiprocessor environments, for our execution model and propose the scheduling algorithm EDF - Next Fit (EDF-NF). We use EDF-NF as an off-line scheduling procedure that precomputes a number of FPGA configurations which are dispatched at runtime. Similar to the original EDF algorithm, EDF-NF keeps a list of all released but not yet finished tasks in a ready queue. The ready queue is sorted by increasing absolute task deadlines. To determine the set $R$ of running tasks, EDF-NF scans through the ready queue. A task $T_i$ is added to $R$ as long as the sum of the areas of all running tasks remains less than or equal to one. Whenever the next task cannot be added, EDF-NF proceeds in the ready queue and tries to add tasks with later absolute deadlines. At this point, EDF-NF diverges from the pure EDF rule. The motivation for adding tasks in next-fit manner is to improve the device utilization. If no more tasks can be added, the running set is closed and compiled to an FPGA configuration. Whenever a new task instance is released or running instances of tasks terminate, the FPGA configuration may change.

To prove schedulability and generate the required FPGA configurations, EDF-NF simulates task executions and terminations for the complete hyper-period. Unfortunately, to our knowledge there is no efficient schedulability test. Further, the number of FPGA configurations can grow fairly large, which is a major disadvantage of this algorithm.

4. SERVER-BASED SCHEDULING

In this section, we present a scheduling technique called Merge Server Distribute Load (MSDL). To construct a schedule, MSDL uses the concept of server tasks, or briefly servers. A server is a periodic task that reserves execution time and FPGA area for other tasks. We define a server as $S_i = (R_i, P_i, C_i, A_i)$, where $R_i = \{T_a, T_b, \dots\} \subseteq \Gamma$ is a set of tasks for which execution time and area are reserved. $P_i$, $C_i$, and $A_i$ denote the period, the computation time, and the area of the server, respectively. The area of a server is set to the sum of the areas of the tasks represented by the server, $A_i = \sum_{T_k \in R_i} A_k$. Consequently, whenever the server $S_i$ is running, all tasks it represents are running.

The rationale of the MSDL algorithm is to construct a set of servers $\Omega$ from the original task set $\Gamma$, such that any feasible schedule for $\Omega$ implies a feasible schedule for $\Gamma$. More specifically, MSDL constructs a set of servers $\Omega$ by properly merging tasks together for parallel execution. The resulting servers are then scheduled for sequential execution on the FPGA with single processor EDF. Feasibility for the resulting set of servers is thus efficiently checked by the utilization test $U^T(\Omega) \le 1$.

4.1. The Merge-server Distribute Load (MSDL) Algorithm

Algorithm 1 shows the pseudo code for the MSDL technique. First, each of the initial tasks is turned into a server (line 3). Then the main loop is entered in which, iteratively, a server pair is identified and merged if possible. The selection of the two servers $S_x$ and $S_y$ that should be merged is done by the function selectValidPairToMerge() (line 7). For the implementation of this function, several heuristics are conceivable. In our current version we employ a greedy strategy that selects the pair of servers giving the greatest reduction in time utilization $U^T(\Omega_{old}) - U^T(\Omega_{new})$ per increase of system utilization $U^S(\Omega_{new}) - U^S(\Omega_{old})$.¹ Any valid pair of servers $S_x$ and $S_y$ must have disjoint sets of represented tasks ($R_x \cap R_y = \emptyset$) and must jointly fit onto the FPGA ($A_x + A_y \le 1$). If no valid server pair can be found, the algorithm exits and returns $\Omega$ as the final set of servers (line 9).

Alg. 1 Merge Server - Distribute Load

   1: procedure MSDL(Γ)
   2:   Ω ← ∅
   3:   for all T_i ∈ Γ do                          ▷ init
   4:     S_i ← ({T_i}, P_i, C_i, A_i)
   5:     Ω ← Ω ∪ {S_i}
   6:   loop
   7:     S_x, S_y ← selectValidPairToMerge(Ω)
   8:     if no pair found then
   9:       return Ω                                ▷ exit
  10:     S_z ← (R_x ∪ R_y, P_y, C_y, A_x + A_y)    ▷ P_y ≤ P_x
  11:     C_x ← C_x − takeOverTime(S_x, S_z)
  12:     Ω ← Ω ∪ {S_z}                             ▷ add server
  13:     Ω ← Ω \ {S_y}
  14:     if C_x ≤ 0 then
  15:       Ω ← Ω \ {S_x}

Otherwise, the servers $S_x$ and $S_y$ are merged. Without loss of generality, we can assume that $S_y$ is the server with the shorter period. Then, a new server $S_z$ is created representing all tasks of the two original servers (line 10). The period and the computation time for $S_z$ are set to equal those of $S_y$. Therefore, $S_z$ is a full replacement of $S_y$, and $S_y$ can be removed from $\Omega$. The computation time of $S_x$ is reduced, since the new server $S_z$ reserves area and computation time for the tasks of $S_x$ as well. The actual reduction of computation time depends on how often the new server $S_z$ executes within the period of $S_x$. A pessimistic approximation for the reduction is given by:

$$\mathrm{takeOverTime}(S_x, S_z) = C_z \left( \left\lfloor \frac{P_x}{P_z} \right\rfloor - 1 \right) \qquad (1)$$

In Section 4.2, we provide a more involved analysis to compute the exact computation time reduction. As an example, we apply the MSDL algorithm to the example task set $\Gamma^*$ from Section 2. Table 2 shows the set of servers $\Omega^*_k$ generated in each iteration $k$ of the MSDL algorithm. Initially, the servers $\Omega^*_0 = \{S_1, S_2, S_3\}$ are created. In the first iteration, $S_1$ and $S_2$ are selected and merged into $S_4$. $S_2$ receives the new computation time $C_2 \leftarrow C_2 - 2 = 3$. This reduction of 2 corresponds to the refined analysis of Section 4.2; the pessimistic Equation 1 alone would yield $C_4 (\lfloor 6/4 \rfloor - 1) = 0$. The server with the shorter period, $S_1$, is removed. In the second iteration, the residual $S_2$ and $S_3$ are merged into $S_5$. Not only the server with the shorter period is removed, but also $S_3$, since its computation time is reduced to zero. $\Omega^*_2$ is the final server set, since $R_4$ and $R_5$ are not disjoint and $A_4 + A_5 > 1$. As shown in Table 2, the time utilization factor is $U^T(\Omega^*_2) = 1$. Consequently, $\Omega^*_2$ can be feasibly scheduled by EDF. The resulting schedule is shown in Fig. 3. The figure also indicates the original tasks of $\Gamma^*$ executed inside the servers. Compared to the schedule given in Fig. 2, which needs four configurations, MSDL requires only two FPGA programming files. Table 2 also lists the system utilization factor $U_i^S$, which increases over the iterations, since larger servers leave more idle areas and times inside their reservations. In essence, MSDL trades system utilization for time utilization to allow for an efficient schedulability test and to reduce the number of FPGA configurations.

Table 2. Servers generated for the example task set $\Gamma^*$ by the MSDL (Merge Server Distribute Load) algorithm. Rows marked (removed) are servers deleted in that iteration; they are not included in the sums.

          S_i   R_i        P_i   C_i   A_i   U_i^T   U_i^S
  Ω*_0    S1    {T1}       4     2     1/2   1/2     1/4
          S2    {T2}       6     5     1/4   5/6     5/24
          S3    {T3}       12    3     3/4   1/4     3/16
          sum                                1.58    0.65

  Ω*_1    S1    {T1}       4     0     1/2   1/2     1/4     (removed)
          S2    {T2}       6     3     1/4   1/2     1/8
          S3    {T3}       12    3     3/4   1/4     3/16
          S4    {T1, T2}   4     2     3/4   1/2     3/8
          sum                                1.25    0.69

  Ω*_2    S2    {T2}       6     0     1/4   1/2     1/8     (removed)
          S3    {T3}       12    0     3/4   1/4     3/16    (removed)
          S4    {T1, T2}   4     2     3/4   1/2     3/8
          S5    {T2, T3}   6     3     1     1/2     1/2
          sum                                1       0.88

Fig. 3. Schedule of server task set generated by MSDL (servers S4 and S5 over time 0 to 12, with the original tasks T1, T2, and T3 executed inside the servers and the occupied FPGA area).

¹ $\Omega_{old}$ denotes the set of servers before, whereas $\Omega_{new}$ denotes the server set after merging the selected pair of servers.
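The MSDL procedure and its greedy pair selection can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the names `Server`, `u_time`, `u_sys`, `take_over_time`, `merge`, and `msdl` are hypothetical, `take_over_time` uses the refined reduction of Section 4.2 (Equation 2) because that is the variant that reproduces the numbers of the example, and treating a merge with no increase in system utilization as infinitely attractive is an assumption made here to avoid a division by zero.

```python
from dataclasses import dataclass
from fractions import Fraction
from itertools import combinations


@dataclass(frozen=True)
class Server:
    tasks: frozenset   # R_i, names of the represented tasks
    period: int        # P_i
    comp: Fraction     # C_i
    area: Fraction     # A_i


def u_time(servers):
    """Time utilization U^T of a server set."""
    return sum(s.comp / s.period for s in servers)


def u_sys(servers):
    """System utilization U^S of a server set."""
    return sum(s.comp * s.area / s.period for s in servers)


def take_over_time(sx, sz):
    """Reduction of C_x following Equation 2 (minimum of case A and case B)."""
    q = sx.period // sz.period  # floor(P_x / P_z)
    case_a = sz.comp * (q - 1) + max(2 * sz.comp - ((q + 1) * sz.period - sx.period), 0)
    case_b = sz.comp * q + max(2 * sz.comp - ((q + 2) * sz.period - sx.period), 0)
    return min(case_a, case_b)


def merge(servers, a, b):
    """Return the server set after merging a and b (lines 10 to 15 of Alg. 1)."""
    sx, sy = (a, b) if b.period <= a.period else (b, a)  # sy has the shorter period
    sz = Server(sx.tasks | sy.tasks, sy.period, sy.comp, sx.area + sy.area)
    result = [s for s in servers if s is not sx and s is not sy]
    cx = sx.comp - take_over_time(sx, sz)
    if cx > 0:  # keep the residual server only if it still needs time
        result.append(Server(sx.tasks, sx.period, cx, sx.area))
    return result + [sz]


def msdl(tasks):
    """tasks: iterable of (name, P_i, C_i, A_i); returns the final server set."""
    servers = [Server(frozenset([n]), p, Fraction(c), a) for n, p, c, a in tasks]
    while True:
        best = None
        for a, b in combinations(servers, 2):
            if a.tasks & b.tasks or a.area + b.area > 1:
                continue  # not a valid pair
            merged = merge(servers, a, b)
            gain = u_time(servers) - u_time(merged)
            cost = u_sys(merged) - u_sys(servers)
            ratio = gain / cost if cost > 0 else float("inf")
            if best is None or ratio > best[0]:
                best = (ratio, merged)
        if best is None:
            return servers
        servers = best[1]


TABLE1 = [("T1", 4, 2, Fraction(1, 2)),
          ("T2", 6, 5, Fraction(1, 4)),
          ("T3", 12, 3, Fraction(3, 4))]

final = msdl(TABLE1)
for s in final:
    print(sorted(s.tasks), s.period, s.comp, s.area)
print("U^T =", u_time(final), "U^S =", u_sys(final))
```

For the task set of Table 1, the sketch ends with the two servers for {T1, T2} and {T2, T3} and a time utilization of exactly 1, matching the last block of Table 2. Exact fractions are used so that the utilization test $U^T(\Omega) \le 1$ is not affected by floating-point rounding.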

4.2. Computation time reduction

In Equation 1, we made the pessimistic assumption that a server $S_z$ with $P_z \le P_x$ executes only $m = \lfloor P_x / P_z \rfloor - 1$ times between the release time and deadline of server $S_x$. Therefore, the computation time of $S_x$ was reduced by $m C_z$. A further reduction of $C_x$ is possible if we take into account that the server instances of $S_z$ which are not fully contained between the release time and deadline of server $S_x$ can still be useful. The precise amount of this reduction depends on the actual phase between $S_x$ and $S_z$. Fig. 4 illustrates the two cases that have to be distinguished.
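The pessimistic count $m$ can be checked numerically. The sketch below is an illustration under assumptions made here: $S_x$ is released at time 0 with deadline $P_x$, the instances of $S_z$ are shifted by a phase sampled on a finite grid, and the function names `full_instances`, `pessimistic_m`, and `worst_case` are hypothetical. It counts the instances of $S_z$ that lie completely inside the period of $S_x$ and compares the minimum over all sampled phases with $\lfloor P_x / P_z \rfloor - 1$.

```python
from fractions import Fraction


def full_instances(px, pz, phase):
    """Count S_z instances [phase + k*pz, phase + (k+1)*pz] inside [0, px]."""
    count, start = 0, phase
    while start + pz <= px:
        count += 1
        start += pz
    return count


def pessimistic_m(px, pz):
    """Number of executions assumed by Equation 1."""
    return px // pz - 1


def worst_case(px, pz, steps=16):
    """Minimum count over phases 0, pz/steps, ..., pz*(steps-1)/steps."""
    phases = [Fraction(k * pz, steps) for k in range(steps)]
    return min(full_instances(px, pz, phase) for phase in phases)


for px, pz in [(9, 4), (12, 6), (7, 4), (6, 4)]:
    print(px, pz, worst_case(px, pz), pessimistic_m(px, pz))
```

For the sampled phases the minimum never drops below $m$, which is why reducing $C_x$ by $m C_z$ is safe; the remaining instances that only partly overlap the period of $S_x$ are the subject of the case analysis that follows.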

Fig. 4. Case analysis of computation time reduction (timelines of $S_x$ and $S_z$ over time 0 to 16 for case A and case B, with the early offset e and the late offset l marked).

Case A shows an example with $P_x = 9$ and $P_z = 4$. Pessimistically, the server $S_z$ is guaranteed to execute $m = \lfloor 9/4 \rfloor - 1 = 1$ times between the release time and deadline of $S_x$. As Fig. 4-A illustrates, there is one instance of server $S_z$ being $e$ time units too early to be included in the considered period of $S_x$, and one instance of server $S_z$ being $l$ time units too late. In a worst-case schedule, the server $S_z$ executes at the beginning of its early instance and at the end of its late instance, resulting in some wasted computation time (denoted by black boxes) for $S_x$. However, some amount of the computation time of $S_z$ may still be useful to execute $S_x$, as denoted by the gray box of the first instance of $S_z$. Let $\delta = e + l$ denote the time of the early and late server instances which is outside the considered period of $S_x$. $\delta$ can be computed by $\delta = (m + 2) P_z - P_x$. Let $C^{el}$ denote the computation time of the early and late server instances which is guaranteed to be within the considered period of $S_x$. Then, $C^{el}$ can be computed by $C^{el} = \max(2 C_z - \delta, 0)$. Therefore, in case A the time $C^{el}$ is exactly the amount by which $C_x$ can be reduced in addition to Equation 1.

Case B of Fig. 4 illustrates the second case, where the server $S_z$ is executed $m + 1$ times within the considered period of $S_x$. In this case, $\delta$ changes to $\tilde{\delta} = (m + 3) P_z - P_x$ and $C^{el}$ changes to $\tilde{C}^{el} = \max(2 C_z - \tilde{\delta}, 0)$, respectively. It follows that in case B the time by which $C_x$ can be reduced in addition to Equation 1 is given by $C_z + \tilde{C}^{el}$.

Since we have to consider the worst case (out of case A and case B), the precise reduction of the computation time is determined by:

$$\mathrm{takeOverTime}(S_x, S_z) = \min\Big( C_z \big( \lfloor P_x / P_z \rfloor - 1 \big) + \max\big( 2 C_z - \underbrace{\big( (\lfloor P_x / P_z \rfloor + 1) P_z - P_x \big)}_{\delta},\ 0 \big),\ \ C_z \lfloor P_x / P_z \rfloor + \max\big( 2 C_z - \underbrace{\big( (\lfloor P_x / P_z \rfloor + 2) P_z - P_x \big)}_{\tilde{\delta}},\ 0 \big) \Big) \qquad (2)$$

5. EXPERIMENTAL RESULTS

To evaluate the scheduling performance of the EDF-NF and MSDL algorithms, we have created synthetic task sets but adopted the task area requirements from realistic FPGA designs reported in the literature. To generate random task sets with varying values for the system utilization factor $U^S(\Gamma)$ we have proceeded as follows: We have chosen task areas uniformly distributed from 20%, which is approximately the size of a Discrete Wavelet Transform design on a Xilinx Virtex-II XC2V3000 FPGA [11], up to 40%, which is about the size of an MPEG-2 Video Decoder on the same FPGA [12]. The task computation times and periods were chosen such that the time utilization factors $U^T(T_i)$ are uniformly distributed in [0.2, 0.4]. To create a benchmark task set $\Gamma$, tasks have been created one by one according to the parameters above, and added to $\Gamma$ until a given limit on the task set's system utilization has been exceeded. These parameters result in task sets of approximately 10 tasks on average.

The simulation result on a series of 1400 tests is shown in Fig. 5, labeled "n=small". The figure displays the percentage of feasibly scheduled task sets for MSDL and EDF-NF over the system utilization factor. As expected, EDF-NF clearly outperforms MSDL in scheduling performance. EDF-NF is able to schedule about 50% of the task sets with a system utilization factor around 85% and accepts almost all task sets with $U^S$ less than 75%. In contrast, MSDL is able to schedule only few task sets with a $U^S$ exceeding 70%, and achieves an acceptance rate of 50% for task sets with a $U^S$ around 55%.

Fig. 5. Percentage of task sets feasibly scheduled by MSDL and EDF-NF, depending on the system utilization factor (x-axis: $U^S(\Gamma)$ from 0 to 1; y-axis: percentage of feasibly scheduled task sets; curves: MSDL n=small, MSDL n=medium, EDF-NF n=small).

On the other hand, Fig. 6 demonstrates the key advantage of MSDL by displaying the average number of FPGA configurations. For an MSDL schedule, the number of configurations equals the number of servers, which is bounded by the number of tasks $n$. For an EDF-NF schedule, the number of configurations equals the number of different sets of running tasks $R$. We further took into account that for EDF-NF FPGA configurations can be redundant, i.e., a task set $\tilde{R}$ is a subset of another task set $R$ and thus only one configuration is needed. The resulting curve is labeled "without subsets". Fig. 6 shows that the number of FPGA configurations grows exponentially when using EDF-NF and reaches an average of about 100 for task sets of size 10. This would correspond to approximately 125 MB of memory storage using the XC2V3000 FPGA, compared to only 12.5 MB in the MSDL case (about 1.25 MB per configuration).

Fig. 6. Number of required FPGA configurations for MSDL and EDF-NF, depending on the number of tasks (x-axis: number of tasks $|\Gamma|$; y-axis: number of required FPGA programming files; curves: EDF-NF with subsets, EDF-NF without subsets, MSDL).

In order to generate task sets with more tasks, we have run a second test series, labeled "n=medium" in Fig. 5. Here, smaller tasks have been used by distributing the areas in [0.1, 0.2] (e.g., a 256 point complex FFT uses 10% of the XC2V3000 area [11]). The time utilization factors $U_i^T$ have been uniformly distributed in [0.1, 0.2]. These settings result in task sets of approximately 40 tasks on average. MSDL performs slightly worse than on smaller task sets. For EDF-NF, however, we could not obtain results as the EDF-NF schedulability test did not terminate in reasonable time.

6. CONCLUSION AND FUTURE WORK

We have discussed the problem of scheduling periodic real-time tasks on FPGA computers and have presented two scheduling algorithms, EDF-NF and MSDL. EDF-NF performs much better than MSDL in the sense that it can generate feasible schedules for task sets with higher system utilization. The experiments, however, emphasized the two key benefits of MSDL. First, MSDL comes with an efficient schedulability test. For larger real-time task sets that need a schedulability guarantee, EDF-NF is not an option.² Second, the number of required FPGA configurations is bounded by the number of tasks, which makes this approach also feasible for larger task sets.

Future work will concentrate on the development and evaluation of different heuristics for selecting servers to be merged in the MSDL algorithm. One goal is to further reduce the number of required FPGA configurations and, thus, lower the memory requirements. Moreover, we will incorporate the modeling of reconfiguration and read-back in our schedules to increase the accuracy of the results.

² The hyper-period of a task set grows extremely fast. For example, for task sets with periods bounded by 100, the worst-case hyper-period exceeds $4 \times 10^{9}$ for 5 tasks, and $3 \times 10^{18}$ for 10 tasks.

7. REFERENCES

[1] G. C. Buttazzo, Hard Real-Time Computing Systems: Predictable Scheduling Algorithms and Applications. Kluwer Academic Publishers, 2000.

[2] B. Andersson and J. Jonsson, "Fixed-priority preemptive multiprocessor scheduling: to partition or not to partition," in RTCSA, 2000, pp. 337–346.

[3] J. Goossens, S. Baruah, and S. Funk, "Real-time scheduling on multiprocessor," in Proceedings of the 10th International Conference on Real-Time System, 2002.

[4] K. Bazargan, R. Kastner, and M. Sarrafzadeh, "Fast template placement for reconfigurable computing systems," IEEE Design and Test of Computers, pp. 68–83, Mar. 2000.

[5] J. Teich, S. Fekete, and J. Schepers, "Optimization of dynamic hardware reconfigurations," The J. of Supercomputing, vol. 19, no. 1, pp. 57–75, May 2000.

[6] H. Walder and M. Platzner, "Reconfigurable Hardware Operating Systems: From Design Concepts to Realizations," in 3rd Intern. Conference on Engineering of Reconfigurable Systems and Architectures (ERSA). CSREA Press, 2003.

[7] H. Simmler, L. Levinson, and R. Manner, "Multitasking on FPGA coprocessors," in FPL, 2000, pp. 121–130.

[8] C. Steiger, H. Walder, and M. Platzner, "Operating Systems for Reconfigurable Embedded Platforms: Online Scheduling of Real-time Tasks," IEEE Transactions on Computers, vol. 53, no. 11, pp. 1392–1407, November 2004.

[9] K. Danne, "Memory management to support multitasking on FPGA based systems," in Proceedings of the International Conference on Reconfigurable Computing and FPGAs (ReConFig). Mexican Society of Computer Science, 2004.

[10] K. Danne and M. Platzner, "Periodic real time scheduling for FPGA computers," in The Third IEEE International Workshop on Intelligent Solutions in Embedded Systems (WISES), Hamburg University of Technology, 2005.

[11] XILINX CORE Generator, www.xilinx.com.

[12] Amphion Semiconductor Ltd., www.amphion.com.