This thesis investigates hardware acceleration for embedded real-time operating systems (RTOS).
An RTOS supports multi-tasking, but it introduces overhead and unpredictability that can cause deadline misses. Hardware acceleration can help the RTOS meet hard real-time application deadlines.
This research implements a computing system on an FPGA with the Leon3 soft-core processor and on-chip peripherals attached to the AMBA AHB/APB bus. With the support of FPGA logic, it is possible to customize the RTOS with acceleration circuits to suit different applications. For example, users can configure the hardware acceleration circuits for a chosen number of tasks and a chosen scheduling algorithm. The FPGA solution improves resource efficiency and performance while reducing design cost.
The research proposes a configurable hardware-accelerated RTOS platform, HplusRTOS, which targets small-scale embedded applications with stringent timing constraints and frequent task switching. eCos is selected as the software RTOS platform. Its kernel is redesigned to interface with the acceleration modules.
The HplusRTOS platform includes three hardware acceleration modules: Hardware Scheduler (Hscheduler), Task Control Manager (TCM) and Fast Context Switch (FCS) Task Dispatcher. These three modules accelerate three phases of task switching, thus improving the multi-tasking performance of the RTOS. The Hscheduler implements the kernel function of task scheduling. TCM processes external interrupts and time services and transforms them into task controls to feed Hscheduler. FCS Task Dispatcher performs task switching according to requests from Hscheduler.
A new task queue architecture is proposed that better supports practical task controls and scales well in resource usage. With this architecture, this research implements two priority-based schedulers (Bitmap and MLFQ) and two deadline-based schedulers (EDF and LST). The hardware scheduler has a configurable architecture to support different scheduling algorithms. The TCM module accelerates the processing of task controls from different sources, which are mainly interrupts and time services in the current HplusRTOS platform. The TCM module also incorporates the function of a normal interrupt controller and implements general time services that do not incur task controls.
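As an illustration of the Bitmap scheduling idea named above, the following is a minimal software sketch, not the Hscheduler hardware design. It assumes one task per priority level and that priority 0 is the highest; the class and method names are hypothetical.

```python
class BitmapScheduler:
    """Ready set kept as one integer: bit p is set when priority p is ready.

    Assumption for this sketch: priority 0 is the highest priority.
    """

    def __init__(self, num_priorities=32):
        self.num_priorities = num_priorities
        self.ready = 0  # all bits clear: no task is ready

    def _check(self, prio):
        if not 0 <= prio < self.num_priorities:
            raise ValueError("priority out of range")

    def set_ready(self, prio):
        """Mark the task at priority `prio` as ready."""
        self._check(prio)
        self.ready |= 1 << prio

    def clear_ready(self, prio):
        """Mark the task at priority `prio` as not ready."""
        self._check(prio)
        self.ready &= ~(1 << prio)

    def highest_ready(self):
        """Return the highest ready priority, or None if nothing is ready."""
        if self.ready == 0:
            return None
        lowest_set_bit = self.ready & -self.ready  # isolate the lowest set bit
        return lowest_set_bit.bit_length() - 1     # convert it to a bit index
```

Selecting the next task reduces to finding the lowest set bit, which is why this scheme maps naturally onto a priority encoder in hardware.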
The FCS Task Dispatcher implements two customized register file models that support fast context switching. The Vector Register File (VRF) model has the advantage of speed and simplicity but consumes many BRAM resources in the FPGA. The Symmetric Register File (SRF) model is resource efficient but has a larger operation overhead. The VRF model has been shown not to degrade the processor operating frequency when prototyped on an FPGA-based soft-core system, and so it is chosen as the register file architecture in the HplusRTOS platform.
To better co-operate with the software part of the HplusRTOS kernel and reduce development time, the software controlled task dispatch model is adopted. An FCS co-processor is designed to control the task dispatch process and is used as the control interface between the processor and FCS register file.
As a result, the proposed hardware modules accelerate the operation of kernel functions. For the hardware scheduler, most task controls are processed within three clock cycles. Only an aperiodic task release in a deadline-based scheduler can incur a longer delay of 3+D cycles, where D represents the task queue depth. For the TCM module, interrupts and time services are translated into the specified task controls within two clock cycles. The FCS Task Dispatcher reduces the context switch of the task register set to 2 cycles in the VRF model and 136 cycles in the SRF model.
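The scheduler latency bounds stated above can be written as a small worked calculation. This is a sketch of the stated figures only; the function name and its parameters are hypothetical, and the queue depth used in the usage note is an arbitrary example value.

```python
# Deadline-based algorithms named in this work.
DEADLINE_BASED = {"EDF", "LST"}


def worst_case_scheduler_cycles(algorithm, aperiodic_release, queue_depth):
    """Upper bound, in clock cycles, for one task control in the scheduler.

    Per the figures above: 3 cycles in general, and 3 + D cycles for an
    aperiodic task release in a deadline-based scheduler, where D is the
    task queue depth.
    """
    if queue_depth < 0:
        raise ValueError("queue depth must be non-negative")
    if algorithm in DEADLINE_BASED and aperiodic_release:
        return 3 + queue_depth
    return 3
```

For example, with an assumed queue depth of 16, an aperiodic release under EDF is bounded by 3 + 16 = 19 cycles, while the same release under the Bitmap scheduler stays at 3 cycles.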
The task switch process is also accelerated. For the VRF model based FCS Task Dispatcher, the overhead of the normal task dispatch process is reduced to 70 clock cycles, and the dispatch process triggered by the hardware scheduler requires 118 clock cycles.
Experiments are conducted to compare performance between the HplusRTOS platform and the purely software eCos platform. A software-based EDF scheduler is developed within the eCos kernel for testing.
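For readers unfamiliar with EDF, the selection rule of a software EDF scheduler can be sketched as follows. This is a generic illustration built on a binary heap, not the eCos kernel implementation developed in this work; the class name, method names, and the tie-breaking rule (release order) are assumptions.

```python
import heapq


class EdfQueue:
    """Ready queue ordered by absolute deadline (earliest deadline first)."""

    def __init__(self):
        self._heap = []
        self._seq = 0  # release counter, used only to break deadline ties

    def release(self, task_id, absolute_deadline):
        """Add a released task with its absolute deadline."""
        heapq.heappush(self._heap, (absolute_deadline, self._seq, task_id))
        self._seq += 1

    def pick_next(self):
        """Remove and return the task with the earliest deadline, or None."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

In this sketch each release and each selection costs O(log n) in the number of ready tasks, which is the kind of software overhead a hardware scheduler aims to remove.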
Two performance metrics are proposed to measure system level real-time performance. The trigger dispatch latency can be used to measure the reaction speed of an RTOS. Meanwhile, the system capacity reflects the total processor utilization available under different application settings.
These experiments demonstrate that the HplusRTOS platform enhances the system's real-time performance. The HplusRTOS platform reduces trigger dispatch latency by more than a factor of 10 and increases system capacity by up to 25%. Interrupt latency is reduced to 27.4% of the software eCos latency, and the worst-case context switch overhead is reduced to 13.7%.
Nevertheless, hardware acceleration comes at a resource cost. Compared to the software eCos platform, the HplusRTOS platform doubles the register and BRAM resource cost. Its logic resource increase is much smaller, about 40%. The FCS Task Dispatcher consumes about half of the additional LUTs and most of the additional memory resources. Despite the large memory usage, the HplusRTOS platform leads to a much smaller application program, about 58% of the size of the one on the software eCos platform.