William Stallings' book "Operating Systems" is a good reference to learn about scheduling. In chapter 9.2 he summarizes the aspects that play a role when writing a scheduler for an operating system (a toy calculation of several of these metrics follows the list):
- Turnaround Time: Time between the submission of a process and its completion
- Response Time / Determinism: Time from the submission of a request until the response is received
- Deadline: A process's start or completion deadline should be met
- Predictability: A given job should run in about the same amount of time regardless of the system load
- Throughput: The scheduler should maximize the number of processes completed per unit of time
- Processor utilization: The percentage of time the processor is busy
- Fairness: Processes should be treated the same so that no process suffers starvation
- Enforcing priorities: The scheduler should favour higher-priority processes
- Balancing resources: The scheduler should keep the resources of the system busy
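To make these criteria concrete, the following toy C program computes turnaround time, response time, throughput and processor utilization from the definitions above, for an invented, non-preemptive first-come-first-served schedule of three jobs; the arrival and service times are made up for illustration only:

```c
/* Toy illustration of the scheduling criteria above: a non-preemptive
 * first-come-first-served schedule of three invented jobs. */
#include <stdio.h>

struct job {
    const char *name;
    double arrival;  /* time the job is submitted          */
    double service;  /* CPU time the job needs to complete */
};

int main(void)
{
    struct job jobs[] = {
        { "A", 0.0, 3.0 },
        { "B", 1.0, 5.0 },
        { "C", 2.0, 2.0 },
    };
    const int n = sizeof jobs / sizeof jobs[0];

    double clock = 0.0, busy = 0.0, total_turnaround = 0.0;

    for (int i = 0; i < n; i++) {
        if (clock < jobs[i].arrival)   /* processor idles until the job arrives */
            clock = jobs[i].arrival;

        double start      = clock;                    /* job first gets the CPU */
        double completion = start + jobs[i].service;  /* job runs to completion */

        double turnaround = completion - jobs[i].arrival;  /* submission -> completion     */
        double response   = start - jobs[i].arrival;       /* submission -> first response */

        printf("%s: turnaround %.1f, response %.1f\n",
               jobs[i].name, turnaround, response);

        total_turnaround += turnaround;
        busy  += jobs[i].service;
        clock  = completion;
    }

    printf("mean turnaround:       %.1f time units\n", total_turnaround / n);
    printf("throughput:            %.2f jobs per time unit\n", n / clock);
    printf("processor utilization: %.0f %%\n", 100.0 * busy / clock);
    return 0;
}
```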
If the OS is a Real-Time Operating System (RTOS) for embedded systems, then determinism and response time become essential. But processor utilization should also be optimized: an inefficient scheduler can force you to use a more powerful (and more expensive) processor in order to meet the required run-time behaviour. The complexity of the scheduler rises with a multi-core processor, as applications can be executed concurrently on the available cores.
A safety certification of a real-time application running on a multi-core system is the ultimate challenge for a scheduler. Applications running on different cores of a multi-core processor do not execute independently of each other. Even if there is no explicit data or control flow between these applications, a coupling exists at the processor platform level, since they implicitly share resources. A platform property that may cause interference between independent applications is called a hardware interference channel. These channels can be categorized as:
- CPU-core interference: We assume that cores execute independently of each other as long as they do not share hardware; Inter-Processor Interrupts should be handled by the operating system
- Cache Sharing: Usually second-level caches are shared amongst the cores (the micro-benchmark sketch after this list illustrates the effect)
- Cache Coherency: If applications run on several cores, the consistency of local caches connected to the shared resources has to be ensured
- Memory bus: Usually the bandwidth of the memory bus is shared between the cores
- Shared I/O: Concurrent access to a shared I/O device may cause a performance loss if the bus reaches its bandwidth limit or if the bus can only handle one request at a time
- Shared Interrupts: A hardware interrupt is typically routed to one core. If multiple devices are attached to one interrupt line and that core does not serve all of the devices, the core that receives the interrupt must pass it on to the other core(s), forcing them to check the interrupt status of their devices
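How cache sharing and memory-bus contention can show up in practice may be sketched with a small Linux/pthreads micro-benchmark. The core numbers, buffer sizes and iteration counts below are assumptions, and the magnitude of the effect depends entirely on the processor's cache topology; the idea is simply that a measuring thread pinned to one core becomes slower as soon as a thread on another core streams through a large buffer and evicts data from the shared cache:

```c
/* Rough sketch of shared-cache / memory-bus interference on Linux: a thread
 * pinned to core 0 measures how long it takes to walk a small working set,
 * while a "polluter" thread pinned to core 1 streams through a large buffer
 * and evicts the victim's data from the shared cache. */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#define VICTIM_SET   (512 * 1024)        /* assumed to fit into the shared cache */
#define POLLUTER_SET (64 * 1024 * 1024)  /* assumed to be much larger than it    */

static volatile long sink;               /* defeats dead-code elimination        */

static void pin_to_core(int core)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}

static void *polluter(void *arg)
{
    (void)arg;
    pin_to_core(1);                      /* assumption: core 1 exists */
    char *buf = calloc(1, POLLUTER_SET);
    for (;;)                             /* stream through memory forever */
        for (size_t i = 0; i < POLLUTER_SET; i += 64)
            sink += buf[i];
    return NULL;
}

/* one pass over the victim's working set, in microseconds */
static double timed_pass(char *buf)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (size_t i = 0; i < VICTIM_SET; i += 64)
        sink += buf[i];
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (t1.tv_sec - t0.tv_sec) * 1e6 + (t1.tv_nsec - t0.tv_nsec) / 1e3;
}

int main(void)
{
    pin_to_core(0);
    char *buf = calloc(1, VICTIM_SET);

    printf("alone:         %.1f us per pass\n", timed_pass(buf));

    pthread_t tid;
    pthread_create(&tid, NULL, polluter, NULL);
    sleep(1);                            /* let the polluter warm up */

    printf("with polluter: %.1f us per pass\n", timed_pass(buf));
    return 0;                            /* exiting main terminates the polluter */
}
```

Compiled with `gcc -O2 -pthread`, the second number is typically noticeably larger than the first on real hardware, which is exactly the kind of interference a timing analysis has to account for.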
At the software level, interference may be caused by the load-balancing model (e.g. SMP, AMP, BMP), which has to provide the means to execute an application on different cores or to switch it from one core to another at run-time.
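The difference between a task that is load-balanced freely across cores (SMP) and one that is bound to a fixed core (the BMP/AMP-style approach often preferred for deterministic timing) can be illustrated with standard Linux APIs. PikeOS and other RTOSes provide their own configuration mechanisms for core assignment, so the sketch below is only meant to show the idea:

```c
/* Minimal sketch of the difference between letting the OS migrate a task
 * freely (SMP-style load balancing) and binding it to a fixed core.
 * Linux APIs are used here purely as an illustration. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
    /* SMP: by default the scheduler may run this process on any core and
     * migrate it at run-time, which introduces cache-refill jitter.       */
    printf("currently running on core %d\n", sched_getcpu());

    /* BMP/AMP-like: restrict the process to core 0 so its timing no longer
     * depends on migration decisions made by the load balancer.           */
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);
    if (sched_setaffinity(0, sizeof(set), &set) != 0) {
        perror("sched_setaffinity");
        return 1;
    }

    printf("now pinned to core %d\n", sched_getcpu());
    return 0;
}
```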
The above-mentioned software and hardware interference on a multi-core platform is a hurdle for the deterministic behaviour of an embedded safety system. Most safety standards (IEC 61508, EN 50128, ISO 26262, …) require a timing analysis and the determination of the Worst Case Execution Time (WCET) for the safety system. This can be quite difficult on multi-core processors due to the HW and SW interference described above.
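A measurement-based timing check is often the first step of such an analysis. The sketch below records the worst observed execution time of a hypothetical placeholder task over many runs; on a multi-core platform this only yields a lower bound on the true WCET, because the interference channels described above may not be triggered during the measurement runs, which is why the standards expect a systematic timing analysis on top of such measurements:

```c
/* Measurement-based timing sketch: run the task many times and record the
 * worst observed execution time. `task_under_test` is a placeholder for the
 * real job; the result is only a lower bound on the true WCET. */
#include <stdio.h>
#include <time.h>

static void task_under_test(void)
{
    volatile unsigned x = 0;             /* placeholder workload */
    for (unsigned i = 0; i < 100000; i++)
        x += i;
}

static long elapsed_ns(struct timespec a, struct timespec b)
{
    return (b.tv_sec - a.tv_sec) * 1000000000L + (b.tv_nsec - a.tv_nsec);
}

int main(void)
{
    long worst = 0;

    for (int run = 0; run < 10000; run++) {
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        task_under_test();
        clock_gettime(CLOCK_MONOTONIC, &t1);

        long ns = elapsed_ns(t0, t1);
        if (ns > worst)
            worst = ns;
    }

    printf("worst observed execution time: %ld ns\n", worst);
    return 0;
}
```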
An adaptive time-partitioning scheduler is able to handle these HW/SW interference challenges on a multi-core platform and to provide optimized CPU usage with a guaranteed response time (WCET). Using its patented Time Partition Scheduler, SYSGO was awarded the world's first EN 50128 SIL 4 certificate for the PikeOS real-time hypervisor on a multi-core platform.
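The core idea of time partitioning can be illustrated with a toy cyclic schedule: each partition is guaranteed its own window within a repeating major frame, so one application cannot steal processor time from another. The simulation below uses invented partition names and window lengths, is not the actual PikeOS scheduler, and omits the adaptive reallocation of unused time:

```c
/* Highly simplified illustration of time partitioning: the major frame is
 * divided into fixed windows, and each partition may only run inside its own
 * window. This is a toy simulation with made-up numbers, not a real RTOS
 * scheduler. */
#include <stdio.h>

struct window {
    const char *partition;  /* which partition owns this slice          */
    int duration_ms;        /* guaranteed budget inside the major frame */
};

/* static schedule, repeated cyclically */
static const struct window major_frame[] = {
    { "SafetyApp",   20 },
    { "CommsStack",  30 },
    { "Diagnostics", 10 },
};

int main(void)
{
    const int n = sizeof major_frame / sizeof major_frame[0];
    int now_ms = 0;

    /* simulate three major frames */
    for (int frame = 0; frame < 3; frame++) {
        for (int i = 0; i < n; i++) {
            printf("t=%3d ms: partition %-11s runs for %d ms\n",
                   now_ms, major_frame[i].partition,
                   major_frame[i].duration_ms);
            now_ms += major_frame[i].duration_ms;
        }
    }
    return 0;
}
```

Because the windows are fixed at configuration time, the worst-case response time of each partition follows directly from the schedule, which is what makes the approach attractive for certification.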