# TESS: Automated Support for the Evolution of Persistent Types

Barbara Staudt Lerner, University of Massachusetts, Amherst

Persistent data often has a long lifetime. During its lifetime, the types that are used to structure the data may undergo evolution to support new requirements or provide more efficient services. This evolution often makes the persistent data inaccessible unless it also evolves with the types. Existing systems that support type and data evolution focus on changes isolated to individual types, thereby limiting what can be easily accomplished during maintenance. We extend this work by presenting a model of compound type changes that can also describe changes simultaneously involving multiple types and their effects on data. We then describe Tess, a system to automate the evolution of types and their associated data when the types undergo compound type changes.

# Precise Specification and Automatic Application of Design Patterns

Amnon H. Eden, Tel Aviv University
Joseph Gil, Technion
Amiram Yehudai, Tel Aviv University

Despite vast interest in design patterns, the specification and application of patterns is generally assumed to rely on manual implementation. We describe a precise method of specifying how a design pattern is applied: by phrasing it as an algorithm in a meta-programming language. We present a prototype of a tool that supports the specification of design patterns and their realization in a given program. Our prototype allows automatic application of design patterns without obstructing the source code text from the programmer, who may edit it at will. We conclude by demonstrating such a specification using meta-programming techniques and a sample outcome of its application.

# Formal Specification of Human-Computer Interaction by Graph Grammars under Consideration of Information Resources

Bettina Eva Sucrow, University of Essen, Germany

A successful design of an interactive system requires a clear understanding of human-machine interaction. For the specification of such a system a precise consideration of the user's context during each step of the development process is therefore necessary. Moreover, a formal specification method for expressing interaction is highly desirable in order to achieve a precise and continuous specification process between the requirements and design stages.

In this paper several of the environmental cues influencing the user during the interaction with a system are considered. These environmental cues are modelled using the concept of information resources. Interaction is described using these resources and formally specified by the notation of graph grammars in order to be able to reason about a system, to assess a system with respect to important properties and, finally, to achieve a continuous specification process between the requirements and design stages. This approach will be demonstrated by specifying a safety-critical system concerning the interaction between the pilot and the flight management system on the flight deck of an aircraft.

# Automatic High-Quality Reengineering of Database Programs by Temporal Abstraction

Yossi Cohen, Tel Aviv University
Yishai A. Feldman

The relational database model is currently the target of choice for the conversion of legacy software that uses older models (such as indexed-sequential, hierarchical, or network models). The relational model makes up for its lower efficiency by greater expressive power, and by optimization of queries, using indexes and other means. However, sophisticated analysis is required in order to take advantage of these features, since converting each database access operation separately does not use the greater expressive power of the target database and does not enable it to perform useful optimizations.

By analyzing the behavior of the host program around the database access operations, it is possible to discover patterns such as filtering, joins, and aggregative operations. It is then possible to remove those operations from the host program and re-implement them in the target database query language. This paper describes an automatic system, called MIDAS, that performs high-quality reengineering of legacy database programs in this way. The results of MIDAS were found to be superior to those of the naive one-to-one translation in terms of readability, size, speed, and network data traffic.
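To make the kind of transformation concrete, the following is a hypothetical sketch (not MIDAS itself, and with invented record and table names) of the difference between a naive one-to-one translation, which fetches every record and filters in the host program, and a reengineered version in which the discovered filtering pattern has been pushed into a single query:

```python
# Invented in-memory stand-in for a legacy indexed-sequential file.
EMPLOYEES = [
    {"name": "Ada", "dept": "ENG", "salary": 120},
    {"name": "Bob", "dept": "HR", "salary": 80},
    {"name": "Cyd", "dept": "ENG", "salary": 95},
]

def naive_translation():
    """One-to-one translation: fetch every record, filter in the host program."""
    result = []
    for record in EMPLOYEES:          # each iteration = one database access
        if record["dept"] == "ENG":   # filtering logic stays in the host
            result.append(record["name"])
    return result

def reengineered(query=lambda r: r["dept"] == "ENG"):
    """After temporal abstraction: the filter pattern is recognised and pushed
    into a single declarative query, letting the target database optimise it."""
    return [r["name"] for r in EMPLOYEES if query(r)]

assert naive_translation() == reengineered() == ["Ada", "Cyd"]
```

The same idea extends to joins and aggregations: once the access pattern around the loop is recognised, the whole loop can be replaced by one query in the target language.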

# On the Verification of VDM Specification and Refinement with PVS

Savi Maharaj, University of Stirling
Juan Bicarregui, Imperial College

Although the formal method VDM has been in existence since the 1970s, there are still no satisfactory tools to support verification in VDM. This paper deals with one possible means of approaching this problem by using the PVS theorem-prover. It describes a translation of a VDM-SL specification into the PVS specification language using, essentially, the very transparent translation methods described in \cite{Age96}. PVS was used to typecheck the specification and to prove some non-trivial validation conditions. Next, a more abstract specification of the same system was also expressed in PVS, and the original specification was shown to be a refinement of this one. The drawbacks of the translation are that it must be done manually (though automation may be possible), and that the "shallow embedding" technique which is used does not accurately capture the proof rules of VDM-SL. The benefits come from the facts that the portion of VDM-SL which can be represented is substantial and that it is a great advantage to be able to use the powerful PVS proof-checker.

# Applying Concept Formation Methods to Object Identification in Procedural Code

Houari Sahraoui, CRIM
Walcelio Melo, Oracle Brazil
Hakim Lounis, CRIM
Francois Dumont, CRIM

Legacy software systems present a high level of entropy combined with imprecise documentation. This makes their maintenance more difficult, more time consuming, and costlier. In order to address these issues, many organizations have been migrating their legacy systems to new technologies. In this paper, we describe a computer-supported approach aimed at supporting the migration of procedural software systems to the object-oriented (OO) technology, which supposedly fosters reusability, expandability, flexibility, encapsulation, information hiding, modularity, and maintainability. Our approach relies heavily on the automatic formation of concepts based on information extracted directly from code to identify objects. The approach tends, thus, to minimize the need for domain application experts. We also propose rules for the identification of OO methods from routines. A well-known and self-contained example is used to illustrate the approach. We have applied the approach to medium and large procedural software systems, and the results show that the approach is able to find objects and to identify their methods from procedures and functions.

# Using KIV to Specify and Verify Architectures of Knowledge-Based Systems

Dieter Fensel, University Karlsruhe
Arno Schonegge

We discuss the use of the Karlsruhe Interactive Verifier (KIV) for the verification of conceptual and formal specifications of knowledge-based systems. KIV was originally developed for the verification of procedural programs, but it is well suited to verifying knowledge-based systems. Its specification language is based on algebraic specification means for the functional specification of components and dynamic logic for the algorithmic specification. It provides a semi-automatic theorem prover integrated into a complex tool environment supporting aspects like the automatic generation of proof obligations, generation of counterexamples, proof management, and proof reuse. Only through this support does verification of complex specifications become possible. We provide some examples of how to specify and verify tasks, problem-solving methods, and their relationships with KIV.

# Processing Natural Language Requirements

The importance of requirements, which in practice often means natural language requirements, for a successful software project cannot be overestimated. Although requirements analysis has traditionally been reserved for the experience of professionals, there is no reason not to apply various automatic techniques to the same end.

In this paper we present Circe, a Web-based environment for aiding in natural language requirements gathering, elicitation, selection, and validation, together with the tools it integrates. These tools have been used in several experiments in both academic and industrial environments.

Among other features, Circe can extract abstractions from natural language texts, build various models of the system described by the requirements, check the validity of such models, and produce functional metric reports. The environment can be easily extended to enhance its natural language recognition power, or to add new models and views on them.

# A Structured Approach for Synthesizing Planners from Specifications

Biplav Srivastava, Arizona State University
Subbarao Kambhampati
Amol Mali

Plan synthesis approaches in AI fall into two categories: domain-independent and domain-dependent. The domain-independent approaches are applicable across a variety of domains, but may not be very efficient in any one given domain. The domain-dependent approaches can be very efficient for the domain for which they are designed, but would need to be written separately for each domain of interest. The tediousness and the error-proneness of manual coding have hitherto inhibited work on domain-dependent planners. In this paper, we describe a novel way of automating the development of domain-dependent planners using knowledge-based software synthesis tools. Specifically, we describe an architecture called CLAY in which the Kestrel Interactive Development System (KIDS) is used in conjunction with a declarative theory of domain-independent planning, and the declarative control knowledge specific to a given domain, to semi-automatically derive customized planning code. We discuss what it means to write a declarative theory of planning and control knowledge for KIDS, and illustrate it by generating a range of domain-specific planners using state space and plan space refinements. We demonstrate that the synthesized planners can have superior performance compared to classical refinement planners using the same control knowledge.

# Enhancing the Component Reusability in Data-Intensive Business Programs through Interface Separation

Hee Beng Kuan Tan, Nanyang Technological University

Visual development environments have provided good support in the reuse of graphical user interface, report and query generation, and simpler database retrieval and updating. However, many commonly used components for computation and database processing and updating are still required to be repeatedly designed and developed. The main problem is that current methods do not support the separation of a component's interface from the component itself. A component's interface is not an intrinsic property of the component; incorporating it into the component adversely affects the component's reusability. A program representation is proposed in this paper to address the problem. The representation enhances the reusability of a component by separating the component interface from the component.

# Mechanising Requirements Engineering: Reuse and the Application of Domain Analysis Technology

W. Lam, University of Hertfordshire
S. Jones

The paper describes efforts that have been made to mechanise the requirements engineering process in an industrial avionics domain. Our approach is based on an analysis of both the application domain and the task domain. The paper describes the processes we have used for domain analysis, and the tool we have developed to support mechanisation. We give an initial evaluation of the approach and close with a summary of lessons learnt.

# Mapping Software Architectures to Efficient Implementations via Partial Evaluation

Renaud Marlet, IRISA/INRIA
Scott Thibault
Charles Consel

Flexibility is recognized as a key feature in structuring software, and many architectures have been designed to that effect. However, they often come with performance and code size overhead, resulting in a flexibility vs. efficiency dilemma. The source of inefficiency in software architectures can be identified in the data and control integration of components, because flexibility is present not only at the design level but also in the implementation.

We propose the use of program specialization in software engineering as a systematic way to improve performance and, in some cases, to reduce program size. In particular, we advocate the use of partial evaluation, which is an automatic technique to produce efficient, specialized instances of generic programs. We study several representative, flexible mechanisms found in software architectures: selective broadcast, pattern matching, interpreters, layers, and generic libraries. We show how partial evaluation can systematically be applied in order to optimize those mechanisms.
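As a minimal illustration of the idea, the following sketch specializes a generic power function with respect to a statically known exponent: the residual program retains only multiplications, the loop and the exponent having been evaluated away. The specializer here is a toy built for this example, not one of the partial evaluators the paper refers to:

```python
def power(x, n):
    """Generic program: computes x**n by repeated multiplication."""
    result = 1
    for _ in range(n):
        result *= x
    return result

def specialize_power(n):
    """Toy partial evaluator: with n known statically, unfold the loop and
    emit a residual program containing only multiplications."""
    body = " * ".join(["x"] * n) if n > 0 else "1"
    code = f"lambda x: {body}"
    return eval(code), code

cube, residual = specialize_power(3)
assert residual == "lambda x: x * x * x"   # the loop over n is gone
assert cube(5) == power(5, 3) == 125       # same behaviour, specialized form
```

The flexible mechanisms studied in the paper (interpreters, layers, generic libraries) are optimized in the same spirit: the generic dispatch is evaluated away with respect to the parts of the input that are known statically.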

# Automated Configuration of Distributed Applications from Reusable Software Architectures

H. Gomaa, George Mason University
G. A. Farrukh, Science Applications International Corporation

In this paper a reuse-oriented perspective is taken to designing and implementing configurable distributed applications. An application domain is defined as a family of systems that have some features in common and others that differentiate them. During domain engineering, reusable specifications, architectures and component types are developed, which capture the similarities and variations of the family of systems that compose the application domain. Target systems are generated by tailoring the reusable specification and architecture given the requirements of the target system, and configuring a target system based on the tailored architecture. The paper describes an automated approach for configuring distributed applications from a reusable architecture and library of predefined component types.

# A contribution to program comprehension by program analysis: Application to numerical programs

Yamine Ait-Ameur, ENSMA-LISI

Functional correctness of software has been widely studied by formal methods. This paper deals with non-functional aspects of software. It presents an approach towards the integration of methods in order to handle properties of numerical programs. We develop a program analysis technique, by extending the standard domain, which allows us to formalise, evaluate and check non-functional properties of programs. These property evaluations are used in two main areas. The first is program design, where non-functional properties help to choose data representations and to perform program transformations. The second is reverse engineering, and particularly software reuse and maintenance, where non-functional property evaluations help to find errors during software reuse. As an example, a functional language with numerical types only is considered in this paper. The non-functional property considered is the accuracy of the numerical computations.
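One classic instance of such an extended numeric domain is interval arithmetic, where each value carries guaranteed bounds so that accuracy becomes a checkable property of the computation. The sketch below illustrates only this general idea and is not the abstract domain used in the paper:

```python
from dataclasses import dataclass

@dataclass
class Interval:
    """Extended domain: a value represented by guaranteed lower/upper bounds."""
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))

    def width(self):
        return self.hi - self.lo   # an upper bound on the numerical uncertainty

# 0.1 is not exactly representable in binary; track it as a tight interval.
x = Interval(0.09999, 0.10001)
y = (x + x) * x
assert y.lo <= 0.02 <= y.hi   # the true value is contained in the bounds
assert y.width() < 1e-3       # and the accumulated uncertainty stays small
```

Evaluating a program over such a domain, instead of over plain floats, is one way a static analysis can report on the accuracy property the paper considers.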

# Interactive Component-Based Software Development with Espresso

Most component models in use today are language-independent, but also platform-dependent and not designed specifically to support a tool-based visual development paradigm. Espresso is a new component model that was designed with the goal of supporting software development through tool-based visual component composition. Being implemented in Java, Espresso components can run on any Java-enabled platform.

# Automatic Synthesis of Recursive Programs: The Proof-Planning Paradigm

Alessandro Armando, University of Genova
Alan Smaill, University of Edinburgh
Ian Green, University of Edinburgh

We describe a proof plan that characterises a family of proofs corresponding to the synthesis of recursive functional programs. This plan provides a significant degree of automation in the construction of recursive programs from specifications, together with correctness proofs. This plan makes use of meta-variables to allow successive refinement of the identity of unknowns, and so allows the program and the proof to be developed hand in hand. We illustrate the plan with parts of a substantial example -- the synthesis of a unification algorithm.

# Formally Specifying Engineering Design Rationale

Keith Williamson, Boeing
Michael Healy
Mike Uschold
Rob Jasper

This paper briefly describes our initial experiences in applied research of formal approaches to the generation and maintenance of software systems supporting structural engineering tasks. We describe the business context giving rise to this activity, and give an example of the type of engineering problem we have focused on. We briefly describe our approach to software generation and maintenance, and point out the challenges that we appear to face in transferring this technology into actual practice.

# Towards a Design Assistant for Distributed Embedded Systems

Dorothy Setliff, University of Pittsburgh
Jay Strosnider, Carnegie-Mellon University

This paper presents an overview of a distributed embedded system design assistant called the Systems Engineers Workbench (SEW 2.0). SEW 2.0 uses a suite of design frameworks, with supporting representations, to structure the inherent complexity. A set of design assist operatives further improve the ability of the designer to explore the domain space while evaluating performance/resource tradeoffs.

# Correct-Schema-Guided Synthesis of Steadfast Programs

Pierre Flener, Bilkent University
Kung-Kiu Lau, University of Manchester
Mario Ornaghi, University of Milan

It can be argued that for (semi-)automated software development, program schemas are indispensable, since they capture not only structured program design principles, but also domain knowledge, both of which are of crucial importance for hierarchical program synthesis. Most researchers represent schemas purely syntactically (as higher-order expressions). This means that the knowledge captured by a schema is not formalised. We take a semantic approach and show that a schema can be formalised as an open (first-order) logical theory that contains an open logic program. By using a special kind of correctness for open programs, called steadfastness, we can define and reason about the correctness of schemas. We also show how to use correct schemas to synthesise steadfast programs.

# NORA/HAMMR: Making Deduction-Based Software Component Retrieval Practical

Johann Schumann, Technical University of Munich
Bernd Fischer, Technical University of Braunschweig

Deduction-based software component retrieval uses pre- and postconditions as indexes and search keys and an automated theorem prover (ATP) to check whether a component matches. This idea is very simple but the vast number of arising proof tasks makes a practical implementation very hard. We thus pass the components through a chain of filters of increasing deductive power. In this chain, rejection filters based on signature matching and model checking techniques are used to rule out non-matches as early as possible and to prevent the subsequent ATP from "drowning." Hence, intermediate results of reasonable precision are available at (almost) any time of the retrieval process. The final ATP step then works as a confirmation filter to lift the precision of the answer set. We implemented a chain which runs fully automatically and uses MACE for model checking and the automated prover SETHEO as confirmation filter. We evaluated the system over a medium-sized collection of components. The results encourage our approach.
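The filter-chain architecture can be sketched as follows. The component library, the contract, and the finite model check below are invented stand-ins; in particular, the confirmation step here is a cheap finite check rather than a call to an actual prover such as SETHEO:

```python
import inspect

def signature_filter(component, arity):
    """Cheapest rejection filter: discard components whose signature cannot match."""
    try:
        return len(inspect.signature(component).parameters) == arity
    except ValueError:
        return False

def model_check_filter(component, contract, samples):
    """Medium-strength filter: search a small finite model for a counterexample."""
    try:
        return all(contract(x, component(x)) for x in samples)
    except Exception:
        return False  # a crash on the finite model counts as a counterexample

def retrieve(library, arity, contract, samples):
    """Chain of filters of increasing deductive power; cheap rejection filters
    run first so the expensive confirmation step sees few candidates."""
    candidates = [c for c in library if signature_filter(c, arity)]
    return [c for c in candidates if model_check_filter(c, contract, samples)]

# Query: one-argument components whose result is non-negative and >= the input.
library = [abs, len, lambda x: x * x, lambda x, y: x + y]
matches = retrieve(library, arity=1,
                   contract=lambda x, r: r >= 0 and r >= x,
                   samples=range(-5, 6))
assert abs in matches and len not in matches
```

The intermediate answer sets are available at any point in the chain, which is what gives the anytime behaviour described above.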

# Modelling the Application Domains of Software Engineering Technologies

Andreas Birk, Fraunhofer Institute for Experimental Software Engineering

The effectiveness of software engineering technologies depends very much on the situation in which they are applied. In order to further improve software development practices we need to investigate and document the application domains of technologies. A first step toward a deeper understanding is to develop an appropriate modelling formalism that allows us to explicitly describe our current knowledge and to utilize the knowledge within software engineering research and practice. This paper suggests a modelling formalism for describing application domains of software engineering technologies. It relies on established techniques from software domain analysis and software engineering measurement. A lifecycle process for acquisition and use of technology domain models is presented. The models can be used for supporting systematic reuse of technologies during planning of software projects and improvement programmes.

# Feedback Handling in Dynamic Task Nets

Carl-Arndt Krapp, RWTH Aachen
Bernhard Westfechtel

While a software process is being executed, many errors and problems occur which require previously executed process steps to be reconsidered. In order to handle feedbacks in a process-centered software engineering environment, several requirements need to be addressed (adaptability, human intervention, impact analysis, change propagation, restoration of the work context, and traceability). Feedback management in DYNAMITE meets these requirements. DYNAMITE is based on dynamic task nets and specifically supports feedbacks through feedback relations, task versions, and customized semantics of data flows. A methodology for feedback handling is also presented. A programmed graph rewriting system serves to specify analysis, planning, and enactment --- which may be interleaved seamlessly --- in a uniform formal framework.

# Specification and Validation of the CO4 Distributed Knowledge System Using LOTOS

Charles Pecheur, INRIA Rhone-Alpes

This paper relates the specification and validation of a consensual decision protocol used in CO4, a computer environment dedicated to the building of a distributed knowledge base. This protocol has been specified in the ISO formal description technique LOTOS. The CADP tools from the EUCALYPTUS LOTOS toolset have been used to validate different safety and liveness properties. The validation work has confirmed an announced violation of knowledge consistency and has put forth a case of inconsistent hierarchy, four cases of unattended message reception and some further local corrections in the definition of the protocol.

# Augmenting Abstract Syntax Trees for Program Understanding

Christopher A. Welty, Vassar College

Program understanding efforts by individual maintainers are dominated by a process known as discovery, which is characterized by low-level searches through the source code and documentation to obtain information important to the maintenance task. Discovery is complicated by the delocalization of information in the source code, and can consume 40 to 60% of a maintainer's time. This paper presents an ontology for representing code-level knowledge based on abstract syntax trees that was developed in the context of studying maintenance problems in a small software company. The ontology enables the utilization of automated reasoning to counter delocalization, and thus to speed up discovery.

# Genetic Algorithms for Dynamic Test Data Generation

Christoph C. Michael, RST Corporation
Gary E. McGraw
Michael A. Schatz
Curtis C. Walton

In software testing, it is often desirable to find test inputs that exercise specific program features. To find these inputs by hand is extremely time-consuming, especially when the software is complex. Therefore, numerous attempts have been made to automate the process. Random test data generation consists of generating test inputs at random, in the hope that they will exercise the desired software features. Often, the desired inputs must satisfy complex constraints, and this makes a random approach seem unlikely to succeed. In contrast, combinatorial optimization techniques, such as those using genetic algorithms, are meant to solve difficult problems involving the simultaneous satisfaction of many constraints. However, such optimization-based test data generation has only been applied to very simple programs in the past. Since those programs may not present great difficulties to random test data generation, it is difficult to compare the efficacy of different approaches when they are used as benchmarks. In this paper, we discuss experiments with a test generation problem that is harder than the ones discussed in earlier literature --- we use a larger program and more complex test adequacy criteria. We find a widening gap between a technique based on genetic algorithms and those based on random test generation.

# An Automated Object-Oriented Testing for C++ Inheritance Hierarchy

Chun-Chia Wang, Tamkang University

This paper proposes a concept named unit repeated inheritance (URI) in Z notation to realize object-oriented testing of an inheritance hierarchy. Based on this unit, an inheritance level technique (ILT) method as a guide to test object-oriented software errors in the inheritance hierarchy is described. In addition, two testing criteria, intralevel first and interlevel first, are formed based on the proposed mechanism. Moreover, in order to make the test process automatic, we use LEX and YACC to automatically generate a lexical analyzer and a parser to process the declarations in C++ source code. We also construct a windowing tool, used in conjunction with a conventional C++ programming environment, to assist programmers in analyzing and testing their C++ programs.

# Exploiting Domain-Specific Knowledge To Refine Simulation Specifications

David Pautler, University of Hawaii at Manoa
Steve Woods
Alex Quilici

This paper discusses our approach to the problem of refining high-level simulation specifications. Our domain is simulated combat training for tank platoon members. Our input is a high-level specification for a training scenario and our output is an executable specification for the behavior of a network-based combat simulator. Our approach combines a detailed model of the tank-training domain with non-linear planning and constraint satisfaction techniques. Our initial implementation is successful in large part because of our use of domain-knowledge to limit the branching factor of the planner and the constraint satisfaction engine.

# Strategies of Structural Synthesis of Programs

Mihhail Matskin, Norwegian University of Science and Technology
Enn Tyugu, Royal Institute of Technology, Sweden

We explain strategies of the structural synthesis of programs (SSP), a deductive program synthesis method which is suited for compositional programming in the large and is in practical use in a number of programming environments. SSP is based on a decidable logical calculus, but the complexity of the proof search is still PSPACE; this requires paying special attention to the efficiency of search. Besides the general case of SSP, we present synthesis with independent subtasks, synthesis of iterations on regular data structures in terms of SSP, and a number of heuristics used for speeding up the search. Application of SSP in the NUT system is demonstrated on an example.

# A Formal Automated Approach for Reverse Engineering Programs with Pointers

Gerald C. Gannod, Michigan State University
Betty H. C. Cheng

As the demands placed on software continue to grow, there is an increasing recognition that software can be error prone. Moreover, the rising costs for software development impose the need to use a given piece of software for a longer period of time, for multiple purposes, and for increasingly larger customer bases. As a result, there is a need for more sophisticated and systematic approaches to maintain software. Given a program S and a precondition Q, the strongest postcondition, denoted sp(S,Q), is defined as the strongest condition that holds after the execution of S, given that S terminates. By defining the formal semantics of each of the constructs of a programming language, a formal specification of the behavior of a program written using the given programming language can be constructed. In this paper we define the formal semantics of pointers, addressing the need for a more realistic model of programming languages that incorporate the use of pointers. In addition, we present a tool for supporting the construction of formal specifications of programs that include the use of pointers.
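For the pointer-free core, the strongest-postcondition rule for assignment can be sketched symbolically as below. This shows only the classical rule; the paper's contribution, the formal semantics of pointers, is not modelled here, and the purely textual substitution is a deliberate simplification:

```python
import itertools

_fresh = itertools.count()

def substitute(pred, var, term):
    """Naive textual substitution (adequate for this sketch's variable names)."""
    return pred.replace(var, term)

def sp_assign(var, expr, precondition):
    """Strongest postcondition of `var := expr` under precondition Q:
    sp(var := e, Q) = exists v. (var == e[var/v]) and Q[var/v],
    where v is a fresh variable recording the prior value of var."""
    v = f"v{next(_fresh)}"
    return (f"exists {v}. ({var} == {substitute(expr, var, v)}) "
            f"and ({substitute(precondition, var, v)})")

def sp_seq(assignments, precondition):
    """sp(S1; S2, Q) = sp(S2, sp(S1, Q)): thread the condition forward."""
    for var, expr in assignments:
        precondition = sp_assign(var, expr, precondition)
    return precondition

# Forward-propagate `x == 2` through `x := x + 1`: the result implies x == 3.
post = sp_assign("x", "x + 1", "x == 2")
assert post == "exists v0. (x == v0 + 1) and (v0 == 2)"
```

Running such a rule over every construct of a language, as defined by its formal semantics, is what yields the program's derived specification.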

# Facilitating an Automated Approach to Architecture-based Software Reuse

Yonghao Chen, Michigan State University
Betty H. C. Cheng

Software reuse is recognized as having significant potential for improving software development productivity and quality. Over the past several years, a number of reuse techniques have been developed to address various reuse issues, such as component classification, retrieval, and integration. However, it is not adequate to have reuse techniques that address these issues only separately; a seamless integration of these reuse techniques is critical to achieving effective reuse. In this paper, we present a software architecture-based approach to reuse that is systematic and integrates the key steps in the reuse process. We show how this approach can facilitate effective reuse. A prototype system that supports this approach is also introduced.

# Moving Proofs-As-Programs Into Practice

James L. Caldwell, NASA Ames Research Center

Proofs in the Nuprl system, an implementation of a variant of Martin-Löf constructive type theory, yield "correct-by-construction" programs. In this paper a new methodology is presented for extracting efficient and readable programs from inductive proofs. The resulting extracted programs are in a form suitable for use in hierarchical verifications in that they are amenable to clean partial evaluation via extensions to the Nuprl rewrite system. The method is based on two elements: specifications written with careful use of the Nuprl set-type to restrict the extracts to strictly computational content; and on proofs that use induction tactics that generate extracts using familiar fixed-point combinators of the untyped lambda calculus. The method differs from existing Nuprl practice by making application of the proofs-as-programs interpretation applicable to all levels of hierarchical verifications. In this paper the methodology is described and its application is illustrated by a number of examples.

# A Metric-based Approach to Detect Abstract Data Types and State Encapsulations

Jean-Francois Girard, Fraunhofer Institute for Experimental Software Engineering
Rainer Koschke, University of Stuttgart
Georg Schied

This article presents an approach to identify abstract data types (ADT) and abstract state encapsulations (ASE, also called abstract objects) in source code. This approach groups together functions, types, and variables into ADT and ASE candidates according to the proportion of features they share. The set of features considered includes the context of these elements, the relationships to their environment, and informal information.

A prototype tool has been implemented to support this approach. It has been applied to three C systems (each between 30 and 38 KLOC). The ADTs and ASEs identified by the approach are compared to those identified by software engineers who did not know the proposed approach. In a case study, this approach has been shown to identify, in most cases, more ADTs and ASEs than five published techniques applied to the same systems. This is important when trying to identify as many ADTs and ASEs as possible.
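The grouping criterion, the proportion of shared features, can be sketched as follows. The feature sets and the greedy single-link grouping below are invented for illustration and are simpler than the metrics the article actually evaluates:

```python
def similarity(features_a, features_b):
    """Proportion of shared features (Jaccard-style): the grouping criterion."""
    a, b = set(features_a), set(features_b)
    return len(a & b) / len(a | b)

# Hypothetical features extracted from C code: types referenced, globals
# accessed, and informal information such as identifier prefixes.
elements = {
    "stack_push":  {"type:Stack", "global:top", "prefix:stack"},
    "stack_pop":   {"type:Stack", "global:top", "prefix:stack"},
    "log_message": {"type:FILE", "global:logfile", "prefix:log"},
}

def group_candidates(elements, threshold=0.5):
    """Greedy single-link grouping of elements into ADT/ASE candidates."""
    groups = []
    for name, feats in elements.items():
        for group in groups:
            if any(similarity(feats, elements[m]) >= threshold for m in group):
                group.append(name)
                break
        else:
            groups.append([name])
    return groups

assert group_candidates(elements) == [["stack_push", "stack_pop"], ["log_message"]]
```

Functions sharing a type and a global variable end up in one candidate ADT or ASE; unrelated functions form their own groups.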

# From Formal Specifications to Natural Language: A Case Study

J. P. Tremblay
P. G. Sorenson, University of Alberta
P. S. Findeisen

Formal descriptions, while difficult for most human readers to understand, are convenient for specifying large software systems, where completeness and consistency are important issues. Informal specifications can offer advantages in readability, but ambiguities and contradictions are unavoidable side-effects. Since a specification often acts as a formal contract between the software developer and the customer, it is essential that both sides be able to fully understand the specification document. Systems have been proposed which help the software client better understand the specification by automatically paraphrasing it in natural language. Metaview is a customizable metasystem that facilitates the construction of software specification environments that can be used to support software specification tasks. This paper describes a strategy for producing natural language descriptions of formal software requirements specifications developed using Metaview specification environments. A case study demonstrates, using the Object Modeling Technique specification methodology as an example, how the system generates a textual description from a database of software requirements.

# Declarative Specification of Software Architectures

John Penix, University of Cincinnati
Perry Alexander
Klaus Havelund, Recom Technologies, NASA Ames

Scaling formal methods to large, complex systems requires methods of modeling systems at high levels of abstraction. In this paper, we describe such a method for specifying system requirements at the software architecture level. An architecture represents a way of breaking down a system into a set of interconnected components. We use *architecture theories* to specify the behavior of a system in terms of the behavior of its components via a collection of axioms. The axioms describe the effects and limits of component variation and the assumptions a component can make about the environment provided by the architecture. As a result of the method, the verification of the basic architecture can be separated from the verification of the individual component instantiations. We present an example of using architecture theories to model the task coordination architecture of a multi-threaded plan execution system.

# Data Flow Analysis within the ITOC Information System Design Recovery Tool

John V. Harrison, University of Queensland
Anthony Berglas

Most contemporary fourth-generation languages (4GLs) are tightly coupled with the database server and other subsystems provided by the vendor. As a result, organisations that wish to change database vendors are typically forced to rewrite their applications using the new vendor's 4GL. The anticipated cost of this redevelopment can deter an organisation from changing vendors, hence denying it the benefits that would otherwise result, e.g., the exploitation of more sophisticated database server technology. If tools existed that could reduce the rewriting effort, the large upfront cost of migrating the organisation's applications would also be reduced, which could make the transition economically feasible. The ITOC project is part of a large collaborative research initiative between the Centre for Software Maintenance at the University of Queensland and Oracle Corporation. The objective of the project is to design and implement a tool that automatically recovers both the application structure and the static schema definition of 4GL information system applications. These recovered system components are transformed into constructs that populate Oracle's Designer 2000 CASE repository. An essential component of the ITOC process is to determine the relationships between different columns in the database and between references to those columns and fields that appear within the user interface. This in turn requires analysis of the data flow between variables in the 4GL programs. While data flow analysis has been applied in many applications, for example, code optimization and program slicing, this paper presents the results of using data flow analysis in the construction of a novel design recovery tool for 4GL-based information systems. The effectiveness of the tool on legacy commercial applications is described.
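
The data flow analysis at the core of such a tool can be illustrated with a textbook reaching-definitions computation over a control-flow graph. The three-block program below is invented for illustration; ITOC's actual analysis targets 4GL code, not this toy representation.

```python
# Minimal reaching-definitions sketch using worklist-style iteration.
# Each block records the definitions it generates, the variables it
# (re)defines, and its successors. The program itself is invented.
blocks = {
    "entry": {"gen": {("x", "entry")}, "kill_vars": {"x"}, "succ": ["loop"]},
    "loop":  {"gen": {("x", "loop")},  "kill_vars": {"x"}, "succ": ["loop", "exit"]},
    "exit":  {"gen": set(),            "kill_vars": set(), "succ": []},
}

def reaching_definitions(blocks):
    """Iterate IN[b] = union of OUT[p] over predecessors until a fixpoint."""
    out = {b: set() for b in blocks}
    preds = {b: [] for b in blocks}
    for b, info in blocks.items():
        for s in info["succ"]:
            preds[s].append(b)
    changed = True
    while changed:
        changed = False
        for b, info in blocks.items():
            in_b = set().union(*(out[p] for p in preds[b])) if preds[b] else set()
            new_out = info["gen"] | {d for d in in_b
                                     if d[0] not in info["kill_vars"]}
            if new_out != out[b]:
                out[b] = new_out
                changed = True
    return out

result = reaching_definitions(blocks)
print(result["exit"])  # only the loop's redefinition of x reaches the exit
```

Relating database columns to user-interface fields, as ITOC does, amounts to tracing such definition-use chains through the 4GL program's variables.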

# Web-based Formal Methods Tools

Joseph Goguen, University of California, San Diego
Kai Lin
Akira Mori
Grigore Rosu
Akiyoshi Sato

This paper describes some tools to support formal methods for software development over the web, and conversely, some formal methods to support the development of such tools. Our research focuses on producing a web-based formal methods library, with capabilities for cooperative work, remote browsing, and remote execution of proofs and prototypes. Our tools build websites for standard browsers, using HTML, Java, and JavaScript. These tools include editors for specifications and proofs, a server for remote proof execution, a protocol that maintains truth for distributed cooperative proving, and a website editor generator. Our methods to aid website design exploit algebraic specification technology, and follow a new approach called algebraic semiotics that combines semiotics with algebraic specification. Several examples are given.

# Retrieving Software Components That Minimize Adaptation Effort

Lamia Labed Jilani, IRSIT
Jules Desharnais, Universite Laval
Marc Frappier, Universite de Sherbrooke
Ali Mili, West Virginia University
Rym Mili, University of Texas, Dallas

Given a software library whose components are represented by formal specifications, we distinguish between two types of retrieval procedures: exact retrieval, whereby, given a query K, we identify all (and only) the library components that are correct with respect to K; and approximate retrieval, which is invoked when exact retrieval fails, and which (ideally) identifies the library components that minimize the required adaptation effort (once such a component is retrieved, the effort of adapting it to satisfy query K is minimal over the set of all components of the library). To this end, we define four measures of functional distance between specifications and devise algorithms that minimize these measures over a structured set of components; we then discuss to what extent these measures can be used as predictors of adaptation effort.
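
The paper's four distance measures are not reproduced here; a toy set-based distance over relational specifications (sets of input/output pairs) illustrates the general idea of ranking library components by functional distance to a query. The library, query, and distance are all invented for this sketch.

```python
# Toy illustration: specifications as relations (sets of (input, output)
# pairs) and a naive functional distance as the size of their symmetric
# difference. The paper's actual four measures are not reproduced here.

def distance(spec_a, spec_b):
    """Number of input/output pairs on which the two specs disagree."""
    return len(spec_a ^ spec_b)

def approximate_retrieval(library, query):
    """Return the component name(s) minimizing the distance to the query."""
    best = min(distance(spec, query) for spec in library.values())
    return sorted(name for name, spec in library.items()
                  if distance(spec, query) == best)

library = {
    "abs": {(-1, 1), (0, 0), (1, 1)},
    "neg": {(-1, 1), (0, 0), (1, -1)},
    "id":  {(-1, -1), (0, 0), (1, 1)},
}
query = {(-1, 1), (0, 0), (1, 1)}            # wants absolute value
print(approximate_retrieval(library, query))  # exact match: ['abs']
```

When no component is at distance zero, the same minimization yields the approximate-retrieval candidates whose adaptation effort the measures are meant to predict.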

# Modular Flow Analysis of Concurrent Software

Matthew Dwyer, Kansas State University

Application of software validation techniques as soon as designs and implementations are produced can detect and prompt correction of defects that might otherwise be significantly more expensive to fix if found in later phases of development. This is particularly true for errors related to the interaction between system components. In this paper, we describe a static analysis approach that allows validation of sequential and concurrent system components as soon as they become available. This work builds on an existing approach, FLAVERS, that uses program flow analysis to verify explicitly stated correctness properties of concurrent software. We illustrate our modular analysis approach and some of its benefits by describing part of a case study with a realistic concurrent multi-component system.

# Research directions for automated software verification: Using trusted hardware

Premkumar Devanbu, AT&T Laboratories
Stuart Stubblebine

Service providers hosting software on servers at the request of content providers need assurance that the hosted software has no undesirable properties. This problem applies to browsers which host applets, networked software which can host software agents, etc. Currently, the hosting computer verifies the hosted software's properties through testing and/or verification processes. This increases cost, causes delay, and leads to difficulties in version control. By furnishing content providers with a physically secure computing device with an embedded tool and an embedded certified private key, such properties can be verified and/or enforced by the secure computing device at the content provider's site; the secure device can verify such properties, statically whenever possible, and by inserting checks into the executable binary when necessary. The resulting binary is attested by a trusted signature and can be hosted with confidence. This position paper outlines our scientific and engineering goals in this project; implementation work is currently under way.
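
The verify-then-attest flow can be sketched at a very high level. Everything below is a stand-in: the key, the instrumentation, and the HMAC-based "signature" are invented placeholders for the device's embedded tool and certified private key (a real device would use asymmetric signatures).

```python
# Sketch of the flow: the secure device checks a property, instruments the
# binary with a runtime check when static verification fails, then attests
# the result with its embedded key. All names and values are hypothetical.
import hashlib
import hmac

DEVICE_KEY = b"embedded-certified-key"   # hypothetical embedded secret

def verify_or_instrument(binary, statically_safe):
    """Pass the binary through unchanged if the property was shown
    statically; otherwise prepend a (stand-in) runtime check."""
    return binary if statically_safe else b"CHECK;" + binary

def attest(binary):
    """Produce the device's attestation over the final binary."""
    return hmac.new(DEVICE_KEY, binary, hashlib.sha256).hexdigest()

hosted = verify_or_instrument(b"APPLET", statically_safe=False)
signature = attest(hosted)

# The host re-checks the attestation before running the binary.
assert hmac.new(DEVICE_KEY, hosted, hashlib.sha256).hexdigest() == signature
```

The point of the architecture is that the host trusts the signature instead of repeating the verification itself, removing the cost and delay the abstract describes.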

# Extracting Objects from Legacy Imperative Code

Ricky E. Sward, U.S. Air Force Institute of Technology
Thomas C. Hartrum

This paper presents a methodology for extracting objects from legacy imperative code. The Parameter-Based Object Identification (PBOI) methodology is based on the thesis that object attributes manifest themselves as data items passed from subprogram to subprogram in the imperative paradigm. The PBOI methodology uses this thesis to filter the attributes of candidate objects as imperative subprograms are converted to the object-oriented paradigm. A taxonomy of imperative subprograms based on the number of calls a subprogram makes and the number of data items it produces is also presented in the paper. This taxonomy classifies any imperative subprogram into one of six categories. By using the taxonomy, the task of re-engineering is reduced to building transformations for each of the categories. Program slicing is used to convert subprograms in two of the categories into new subprograms that are classified in two other categories. This focuses the task of re-engineering even more. Several examples of transforming imperative subprograms are provided in the paper. A prototype implementation of the PBOI methodology has been completed, and results of converting 3000 lines of FORTRAN code are presented in the paper. A comparison of the PBOI methodology to related work is also provided.
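
The taxonomy keys on two dimensions, calls made and data items produced. The cut-points below are invented to show how two call levels crossed with three output levels yield six categories; they are not the PBOI paper's actual category definitions.

```python
# Hypothetical illustration of a six-way subprogram taxonomy keyed on
# (number of calls made, number of data items produced). The boundaries
# are invented for illustration, not taken from the PBOI methodology.

def classify(num_calls, num_outputs):
    call_class = "leaf" if num_calls == 0 else "caller"
    if num_outputs == 0:
        out_class = "no-output"
    elif num_outputs == 1:
        out_class = "single-output"
    else:
        out_class = "multi-output"
    return f"{call_class}/{out_class}"   # 2 x 3 = 6 categories

print(classify(0, 1))   # 'leaf/single-output'
```

Once every subprogram falls into one of a fixed set of categories, re-engineering reduces to writing one transformation per category, which is the leverage the abstract describes.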

# Towards Semantic-Based Object-Oriented CASE Tools

Robert France, Florida Atlantic University
Matthew P. Evett
Emanuel Grant

Most popular graphical object-oriented methods (OOMs) provide good mechanisms for handling complexity, and constructs that facilitate the creation of abstract, highly-structured models of systems. Despite their strengths, OOMs often do not produce models that are amenable to rigorous semantic analysis. This is a direct result of their loosely defined semantic bases. Most major commercial CASE tools based on these methods provide excellent support for syntactic analysis of models, but they also do not provide a firm semantic base for the models produced. The Methods Integration Research Group (MIRG) at Florida Atlantic University has developed techniques for generating formal specifications from two major OO modeling techniques, Fusion and UML. The formal specifications capture the semantics of the model in a precise and analyzable manner. Our early work on formalizing Fusion models was implemented as an extension of a commercial OO CASE tool. The extensions we implemented facilitate the generation of formal Z specifications from Fusion OO analysis models, and support the analysis of the models via formal analysis of the generated Z specifications. In this paper we give an overview of the improved semantic rules we have developed and discuss how we plan to modify the tool to incorporate the new rules and other mechanisms we are currently developing.

# Reactive System Validation Using Automated Reasoning Over a Fragment Library

Robert J. Hall, AT&T Laboratories

Reactive systems such as telephone switches, traffic controllers, and software agents are difficult to specify correctly with respect to user expectations, because they must typically perform heterogeneous behaviors which interact robustly with complex, dynamic environments. In previous work, I have introduced and explored the idea of sound scenario generalization as a technique for helping to validate specifications of such systems by automatically discovering a set of general situation/action rules guaranteed to soundly describe the specification's behavior. Acquiring this collection, a fragment library, reduces the validation problem from an infinite set of concrete behaviors to a usually finite set of general behaviors. However, an open problem has been the difficulty sometimes faced by the user in validating the general behaviors with respect to the system's true requirements. In this paper, I describe several techniques that address this problem based on applying automated reasoning tools to an evolving fragment library. These techniques help bridge the gap between recognizable statements of what the user requires and the automatically generalized descriptions of the system's behaviors.

# A Static Analysis for Program Understanding and Debugging

Ronan Gaugne, IRISA/INRIA

The incorrect use of pointers is one of the most common sources of bugs in imperative languages. In this context, any kind of static code checking capable of verifying user-defined assumptions and detecting potential bugs at compile time is welcome. This paper presents a static analysis technique for a subset of C. The tool supports user-defined assertions inserted in the body of the program. Assertions are of two kinds:

- static assertions, automatically verified by the analyser;
- hypothetical assertions, treated as assumptions by the analyser.

The technique deals with dynamically allocated data structures and is accurate enough to handle circular structures.
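
The division of labour between the two assertion kinds can be sketched abstractly: hypothetical assertions extend the analyser's assumptions, while static assertions must follow from what the analyser has established. The fact representation and the example program below are invented; the actual tool analyses C pointers, not string-tagged facts.

```python
# Sketch of a two-kind assertion regime. "assume" entries (hypothetical
# assertions) are trusted; "assert" entries (static assertions) must be
# derivable from the facts collected so far. Representation is invented.

def analyse(program):
    facts = set()
    failures = []
    for kind, payload in program:
        if kind == "assign":          # a fact the analyser infers itself
            facts.add(payload)
        elif kind == "assume":        # hypothetical assertion: trusted
            facts.add(payload)
        elif kind == "assert":        # static assertion: must be provable
            if payload not in facts:
                failures.append(payload)
    return failures

program = [
    ("assume", "p != NULL"),          # hypothetical: taken as given
    ("assign", "q = p"),
    ("assign", "q != NULL"),          # fact propagated by the analyser
    ("assert", "q != NULL"),          # static: verified against the facts
    ("assert", "r != NULL"),          # static: cannot be established
]
print(analyse(program))               # ['r != NULL']
```

A real analyser would propagate points-to facts through heap structures, including the circular ones the abstract mentions, rather than matching strings.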

# Application of Formal Methods to the Development of a Software Maintenance Tool

Sandrine Blazy, CEDRIC
Philippe Facon

Partial evaluation is an optimization technique traditionally used in compilation. We have adapted this technique to the understanding of scientific application programs during their maintenance and we have implemented a tool. This tool analyzes Fortran 90 application programs and performs an interprocedural pointer analysis. This paper presents how we have specified this analysis with different formalisms (inference rules with global definitions, and set and relational operators). Then we present the tool implementing these specifications. It has been implemented in a generic programming environment and a graphical interface has been developed to visualize the information computed during the partial evaluation (values of variables, already analyzed procedures, scope of variables, removed statements, ...).

# Tools Supporting the Creation and Evolution of Software Development Knowledge

Software development is a knowledge-intensive activity involving the integration of diverse knowledge sources that undergo constant change. The volatility of knowledge in software development requires knowledge bases that support a continuous knowledge acquisition process, with tools that can make use of partial knowledge. To address these problems, we have developed a series of prototypes designed to collect and disseminate domain knowledge for software development. These tools use case-based technology to create an organizational memory of software development practices that can serve as the basis for making decisions on development projects. The cases are also used to support an organizational learning process in which a knowledge base is used to create and disseminate knowledge about common development practices. These techniques are being explored and verified in a large in-house development organization that is concerned with creating an infrastructure for software development.

# Modeling Software Processes by Process and Object Ontologies

Takahira Yamaguchi, Shizuoka University

To model software processes using ontology engineering techniques, we present a methodology for manually constructing two ontologies: an object ontology based on the constituent elements of objects, and a process ontology based on relationships between inputs and outputs, such as subsumption relationships. Using the constructed ontologies, the environment produces software process plans that satisfy user queries, combining user interaction with constraint satisfaction in the generate-and-test paradigm. Experimental results show that the environment generates good software process plans for queries about the intermediate stage of development, between basic design and detailed design.
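
The generate-and-test paradigm itself can be sketched in a few lines: enumerate candidate plans, keep those satisfying the query's constraints. The steps and the ordering constraint below are invented stand-ins for what the paper's ontologies would actually supply.

```python
# Minimal generate-and-test sketch: enumerate candidate process plans and
# keep those satisfying the constraints. Steps and constraints are invented;
# in the paper, the process and object ontologies drive real generation.
from itertools import permutations

steps = ["basic_design", "detailed_design", "coding"]

def satisfies(plan):
    # Hypothetical constraint: basic design precedes detailed design,
    # which precedes coding.
    return (plan.index("basic_design") < plan.index("detailed_design")
            < plan.index("coding"))

plans = [p for p in permutations(steps) if satisfies(p)]
print(plans)   # [('basic_design', 'detailed_design', 'coding')]
```

In the environment described, user interaction would further prune the generated candidates before a plan is returned for the query.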

# Notes on Refinement, Interpolation and Uniformity

Theodosis Dimitrakos, Imperial College
T. S. E. Maibaum

In the 25 years of its history, basic research in the Abstract Data Type community has been challenged to provide answers to some critical questions on the fundamentals of refinement between formal specifications. Why do correct refinements not compose correctly in all cases? What results are dependent on the specification formalism and what are "universal"? Which meta-logical properties of a specification formalism facilitate the refinement process? An example of a meta-logical property which amounts to a "law" of a general theory of specification and refinement is given by the role of interpolation in parameter instantiation and the composability of refinements. In addition, the presence of a uniform version of Craig-Robinson interpolation seems to facilitate the verification and construction of refinements and to underlie the potential automation of the refinement process in an appropriate entailment system. On the other hand, many logics that have been used in refinement lack the desirable interpolation properties. This motivates a thorough investigation of methods to expand a specification formalism orthogonally, so that an adequately strong version of the critical interpolation properties is obtained and an easy-to-derive, uniform presentation of the crucial interpolants is supported.