Advanced Formal Methods for Reliable Critical Systems Software

AFOSR F49620-95-1-0508

Progress Report

1 August 96 - 31 July 97

Insup Lee
Department of Computer and Information Science
University of Pennsylvania
Philadelphia, PA 19104-6389
Tel: (215) 898-3532, Fax: (215) 898-0587
Email: lee@cis.upenn.edu

Rance Cleaveland
Department of Computer Science, N.C. State University

Philip M. Lewis
Department of Computer Science, SUNY at Stony Brook

Scott A. Smolka
Department of Computer Science, SUNY at Stony Brook

Oleg Sokolsky
Computer Command and Control Company, Philadelphia, PA

OBJECTIVES

STATUS OF EFFORT

As a major step toward our goal of attaining a specification formalism for critical systems software, we have extended the process algebra ACSR with a notion of probability based on resource failure. The resulting formalism seems well suited to modeling and analyzing the reliability of critical systems software. We have made significant progress toward the implementation of a process-algebra-based toolkit for the specification and verification of critical systems software. We have developed a graphical property specification language and an efficient new technique for model checking in the modal mu-calculus based on partial-order reduction. Experimental results show that our technique can be highly effective in combating state explosion. We used the Concurrency Factory specification and verification environment to verify some key properties of the Rether real-time ethernet protocol.

ACCOMPLISHMENTS/NEW FINDINGS

A Probabilistic Extension of ACSR

As a major step toward our goal of attaining a specification formalism for critical systems software, we have extended the process algebra ACSR with a notion of probability based on resource failure. The resulting formalism seems well suited to modeling and analyzing the reliability of critical systems software.

ACSR is a process algebra for real-time systems based upon a notion of resource: accessing a resource takes time, contention for resources must be avoided, and priority levels can be used to regulate access to resources. ACSR is a versatile specification formalism and has been used in a number of case studies including real-time process control systems and communication protocols.

Our real-time extension of ACSR involves associated with each resource in the system a probability of failure. System evolution now proceeds in a probabilistic fashion, depending upon which resources have failed at any given point in time. We are using this formalism to model several fault-tolerant systems and are currently developing accompanying analysis techniques based on the idea of probabilistic bisimulation.

Towards a PACS Toolkit

We have made significant progress toward the implementation of a process-algebra-based toolkit for the specification and verification of critical systems software. Our approach to implementing the PACS toolkit is based on integrating various aspects of the Paragon, VERSA, and Concurrency Factory verification environments, which have been previously implemented by the PIs.

In particular, we are using the graphical user interface of Paragon, which also provides a graphical simulation facility for real-time systems specified in graphical ACSR, the state-space exploration and minimization algorithms of VERSA, and the Concurrency Factory's local model checker for the modal mu-calculus.

Besides our system integration work, we are also extending the Paragon GUI to handle probability information as present in Probabilistic ACSR. Similar changes are being incorporated into VERSA and the Concurrency Factory.

Efficient Model Checking via Partial-Order Reduction

We have developed a partial-order reduction technique for local model checking of hierarchical networks of labeled transition systems in the weak modal mu-calculus. Partial-order reduction speeds up the model-checking process by avoiding redundant computation paths in the system state space due to the interleaving of independent events. Ours is the first partial-order reduction technique to be proposed for the modal mu-calculus, arguably the most expressive temporal logic in use today.

We have implemented our technique in the Concurrency Factory specification and verification environment; experimental results show that partial-order reduction can be highly effective in combating state explosion in modal mu-calculus model checking.

Verification of the Rether Real-Time Ethernet Protocol

Rether is a software-based real-time ethernet protocol developed at SUNY Stony Brook. The purpose of this protocol is to provide guaranteed bandwidth and deterministic, periodic network access to multimedia applications over commodity ethernet hardware. It has been implemented in the FreeBSD 2.1.0 operating system, and is now being used to support the Stony Brook Video Server (SBVS), a low-cost, ethernet LAN-based server providing real-time delivery of videos to end users from the server's disk subsystem.

Using local model checking, as provided by the Concurrency Factory specification and verification environment, we showed (for a particular network configuration) that Rether indeed makes good on its bandwidth guarantees to real-time nodes without exposing non-real-time nodes to the possibility of starvation. In the course of specifying and verifying Rether, we identified an alternative design of the protocol that warranted further study due to potential efficiency gains. Again using model checking, we showed that this alternative design also possesses the properties of interest.

Modeling and Analyzing Real-Time Systems

We developed several process-algebra-based techniques for modeling and analyzing real-time systems. In one piece of work we developed a formalism for distributed real-time, in which different system components might be timed by different clocks. We showed how the traditional semantic theory of a timed version of CCS could be adapted to account for distributed real-time, and we illustrated the utility of our theory by showing how a multi-clock hardware design could be rendered and its timing properties checked.

In another piece of work we showed a tight correspondence between a traditional real-time process algebra and a process algebra in which actions can have different priorities. The gist of the approach lies in assigning lower priorities to actions that require longer delays before being enabled; the net effect is that actions that can happen ``sooner'' pre-empt those that can happen ``later''. The chief theoretical result of the work is that this re-interpretation of time does not effect the timing or temporal properties of the system. The practical impact of this work is that priority-oriented models are generally much smaller than those involving real-time. To demonstrate this, we analyzed the timing properties of the SCSI-2 protocol that governs interactions between computers and peripheral devices. Using the Concurrency Workbench, we exhibited a 10-fold reduction (60,000 to 6,000) in the number of states required to model the system.

A Graphical Property Specification Language

We developed a language for specification of high-level properties of safety critical systems. The language is powerful enough to express safety and liveness properies of real-time systems. The semantics for the language is based on real-time modal mu-calculus. However, the complexity of specification that is inherent in mu-calculus formulas is now hidden from the end user. In the design of the language, we took the following two-level approach. At the top level, expressions of the language are visually organized as acyclic graphs. The nodes and edges of such graph are labeled with behavioral notions like "eventually", "always", "send a certain event", "within a time period", "avoiding a certain event". This set of notions can be easily adjusted to include concepts specific to the user's problem domain. On the lower level, the primitives of the first level are represented as fragments of mu-calculus formulas. Therefore, to add a new user-level primitive, a formal methods expert needs to capture the user's intuition as a mu-calculus fragment. This mode of interaction between the end user, who does not necessarily have a formal methods background, and a formal methods expert, who may not have a detailed knowledge of the specific application domain, seems to be natural and will, in our opinion, lead to more intutive specifications.

To be able to check properties expressed in this language, we developed an efficient local model checker for real-time modal mu-calculus. The model checker is implemented as a distributed program using the Sun threads package on Solaris. Unlike most other distributed systems, the model checker is not trying to achieve additional speed-up through the use of threads. Instead, the implementation assigns different threads to explore separate executions. In doing so, it employs the scheduling algorithm of the underlying operating system in order to ensure fair progress along each execution path. This design of the model checker summarizes our prior experiences in the implementation of model checking algorithms for the Concurrency Factory tool and aims to conquer the limitations of these algorithms uncovered previously.

PERSONNEL SUPPORTED

Faculty

Research Scientist

Postdoctoral Fellows

Graduate Students

PUBLICATIONS

Book

Journals

Conferences

INTERACTIONS/TRANSITIONS

Participation/Presentations At Meetings, Conferences, Seminars, Etc.

Consultative And Advisory Functions To Other Laboratories And Agencies

Transitions

The formal methods developed through the current research are utilized by Computer Command and Control Company (CCCC). The company uses the advances in graphical specification languages and real-time verification techniques, in an effort to build an industry-strength specification and analysis environment for real-time systems. Specifically, the environment is built around GCSR specification language developed by the Real-Time Systems group (RTG) in the University of Pennsylvania. Intuitive primitives of the language allow efficient specifications, while operational semantics of GCSR enable interactive simulation of the specifications. Analysis capabilities are based on efficient state-space exploration techniques developed by the RTG. The person in CCCC responsible for the application of formal methods is Dr. Oleg Sokolsky (sokolsky@cccc.com) and is currently conducting a joint work with Allied Signal for the analysis of the Redundancy Management System (RMS) of the single-stage-to-orbit space craft.

NEW DISCOVERIES, INVENTIONS, OR PATENT DISCLOSURES

None.

HONORS/AWARDS

None.