Building a reliable operating system

Building a reliable operating system(2008)

引用 23|浏览4
暂无评分
摘要
Despite many decades of research, the management of errors in a live operating system remains a challenging problem. This thesis presents CuriOS, an operating system that incorporates several new error management techniques that significantly improve reliability. Errors detected by both hardware and software are signaled using language exception handling mechanisms. Unhandled exceptions do not crash the operating system and are dispatched to recovery routines. The architecture of CuriOS is influenced by microkernel design principles. Individual operating system services are assigned separate protection domains. This componentization provided by traditional microkernel designs helps confine errors. However, an error that occurs in a microkernel operating system service can potentially result in state corruption and service failure. A simple restart of the failed service is not always the best solution for reliability. Blindly restarting a service which maintains client-related state such as session information results in the loss of this state and affects all clients that were using the service. CuriOS adopts a novel design that uses lightweight distribution, isolation and persistence of client-related state information maintained by operating system services. This helps mitigate the problem of state loss during a restart. This design also achieves inter-client isolation by curtailing error propagation within services. Fault injection experiments show that it is possible to recover from 87% or more manifested errors in operating system services such as the file system, timer, scheduler and network while maintaining low performance overheads.
更多
查看译文
关键词
service failure,live operating system,individual operating system service,file system,reliable operating system,microkernel operating system service,client-related state information,operating system,system service,failed service,client-related state
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要