Safety-critical systems can suffer from hardware obsolescence and scalability issues. Moreover, they require high availability as well as easy hardware reuse and reconfiguration. Cloud computing can help address these issues and meet these requirements. However, the current cloud paradigm lacks strong isolation and guarantees on shared resources (e.g., CPU, cache, memory controller, and network), which limits the use of clouds for safety-critical applications.
We propose to monitor, control, and coordinate the cloud nodes and their shared resources at both the node level and the global level by adding a resource orchestration and coordination layer to the cloud, inspired by the framework developed in, among others, the DREAMS and ACTORS projects. This layer helps ensure that safety-critical applications meet their end-to-end deadlines, supports fault tolerance, and improves the Quality-of-Service achieved by non-critical applications.