NISTIR 6137

The EFFective Manager Tool

Dolores Wallace
Mark Zimmerman

U.S. DEPARTMENT OF COMMERCE
Technology Administration
National Institute of Standards and Technology
Software Diagnostics and Conformance Testing Division
Information Technology Laboratory Gaithersburg, MD 20899-0001
April 1998

Abstract

The collection and analysis of software error, fault, and failure data from many high integrity systems may yield reference data for matching development and assurance methods to characteristics of a specific system. Profiles derived from the data may help researchers to identify areas where new methods of error prevention and detection are most needed. The National Institute of Standards and Technology has initiated a program on error, fault, and failure data to address these topics. An initial data collection and analysis tool has been developed for this project.

Keywords

Data; error; fault; failure; high integrity software; reference data; software quality; taxonomy; world wide web (WWW).

DISCLAIMER: Certain trade names and company products are mentioned in the text. In no case does such mention imply recommendation or endorsement by the National Institute of Standards and Technology, nor does it imply that the products are necessarily the best available for the purpose.


TABLE OF CONTENTS

1. INTRODUCTION
2. THE EFF PROJECT
3. CONCEPTS OF EFF DATA COLLECTION AND ANALYSIS
4. FAULT AND FAILURE DATA AS A PROJECT MANAGEMENT TOOL
5. THE EFF PROJECT TOOLSET
6. THE EFFTool
6.1 THE EFFTool COLLECTION COMPONENT
6.2 THE EFFTool ANALYSIS COMPONENT
7. SUMMARY
8. ACKNOWLEDGMENT
9. REFERENCES
APPENDIX A. USING THE EFFTool
A.1 Collection Component Menus
A.2 Analysis Component Menus and Displays

TABLES
Table 3.1 Questions for Software Error Analysis
Table 3.2 Process for Data Collection and Analysis
Table 3.3 Diversity Among Taxonomies
Table 6.1 The EFFTool Project Information
Table 6.2 The EFFTool Discovery Data
Table 6.3 The EFFTool Resolution Data

FIGURES
Figure A.1 Project (entry) menu
Figure A.2 The display as indicated above
Figure A.3 Fault/failure menu
Figure A.4 A one record display from the View f/f/data
Figure A.5 Part of Fault/Failure Form
Figure A.6 Analysis Menu
Figure A.7 Query results for Example 1
Figure A.8 Query results for Example 2
Figure A.9 Query results for Example 3
Figure A.10 Query results for Example 4
Figure A.11 Query results for Example 5
Figure A.12 Query results for Example 6

1. INTRODUCTION

The development and assurance of software for high integrity systems requires methods to prevent or detect software faults(1) during development and potential system faults and failures before they result in operational failure. It is difficult to predict how well development and assurance methods succeed in prevention and detection. Because introducing new technologies is costly, companies are reluctant to change unless they have confidence that the new methods will benefit them. Failures in high integrity systems are rare (and usually costly), and a single system usually does not accumulate enough data to permit meaningful statistical evaluations. Without sufficient data from many projects in various domains, researchers have difficulty identifying the types of problems for which new methods are needed. The results of a Call for White Papers issued by NIST revealed a strong need for an objective organization to address these problems [NIST95].

The mission of the Information Technology Laboratory (ITL) at NIST is to stimulate U.S. economic growth and industrial competitiveness through technical leadership and collaborative research in critical infrastructure technology (e.g., tests and test methods) to promote better development and use of information technology. ITL will provide tests and test methods to facilitate a usable, scalable, interoperable, and secure information technology infrastructure. One of the primary goals of ITL is to assure that U.S. industry, academia, and government have access to accurate and reliable test methods, data, and reference material.

Software researchers need project data on errors, faults and failures from many projects to identify characteristics and to develop benchmarks and profiles(2) for selecting methods and software tools. Providing such data is very closely related to ITL's mission. Consequently, ITL has initiated a project for error, fault, and failure data collection and analysis. While the project name is Reference Data: Software Error, Fault, Failure Data Collection & Analysis Repository Project, it is usually referred to as the EFF project. The EFF project recognizes the data needs for the development of high integrity systems and supports the mission of ITL.

The EFF project is collecting and analyzing data from the development and maintenance of software products or during the operation of a delivered computer system. The information technology industry may use the resulting reference data to develop software methods and tools and to build better end-user products. NIST encourages companies to consider the benefits of a public data base. NIST will accept new or existing data to augment the repository. All identifying information on data accepted by NIST will be removed before being included in the repository.

Several World Wide Web (WWW) tools are being developed by NIST to assist anyone collecting data for their internal analysis and to provide public access to data. The first tool developed for the EFF project is the EFFective Manager Tool (EFFTool), a WWW software tool for fault and failure data discovered during the development or maintenance of software. The EFFTool is a public domain tool that includes a fault management component, which provides a direct benefit to any company that uses the tool. The tool enables a company to track the status of faults and failures and includes a simple analysis tool for tabulating the status of several fault and failure attributes.

Other tools are in the design stages. One is a data collection tool similar to the EFFTool but with data fields consistent with its purpose: collection of failure data from systems in operation. Both data collection tools will be used by industry on servers at their sites and the data may be provided to NIST whenever the contributor chooses. The tool to provide public access to sanitized data will be a WWW accessible data base system with access not only to data collected by NIST but to other repositories and to public domain analytic tools. Data for public access will have company identifiers removed. Existing statistical and graphical tools are being explored for their feasibility for analyzing and displaying data.
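The removal of company identifiers described above can be pictured with a minimal sketch. It assumes each record is a flat dictionary; the field names (company, project_name, discoverer, resolver) are hypothetical and are not taken from the EFFTool. A real sanitization step would also have to examine free-text descriptions, which this sketch does not attempt.

```python
# Hypothetical field names; the actual EFFTool data fields may differ.
IDENTIFYING_FIELDS = frozenset({"company", "project_name", "discoverer", "resolver"})


def sanitize(record, identifying_fields=IDENTIFYING_FIELDS):
    """Return a copy of the record without the identifying fields.

    The input record is left unchanged, so the contributor keeps
    the complete data for internal use.
    """
    return {key: value for key, value in record.items()
            if key not in identifying_fields}


raw = {
    "company": "Example Corp",
    "project_name": "Project X",
    "fault_type": "logic",
    "severity": "high",
}
clean = sanitize(raw)
print(clean)  # {'fault_type': 'logic', 'severity': 'high'}
```

Returning a new dictionary rather than deleting keys in place reflects the arrangement in which the contributor retains full data on its own server and only the sanitized copy is released.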

While the primary purpose of this report is to describe the EFFTool, it also provides a complete project description (Section 2) and a discussion of research on error, fault, and failure data collection and analysis (Section 3). Section 4 provides an overview of using fault and failure data as a project management tool. Section 5 describes the EFF project toolset, and Section 6 provides a general description of the public domain EFFTool. APPENDIX A provides operational details for the data component of the EFFTool and APPENDIX B contains operational details of the analysis component and examples of displaying EFFTool data graphically.

2. THE EFF PROJECT

Successful prediction, risk assessment, and planning are crucial elements for saving millions of dollars per year in the software industry. However, project managers do not always collect the data on errors, faults, and failures that would help them to perform these tasks. On a larger scale, there is a fundamental lack of actual project data on errors, faults, and failures in a public repository. Without such data, industry and government agencies lack benchmark information against which to measure software program quality and to determine the software methods most appropriate for their software development environment.

The purpose of the EFF project is to provide reference data from software development and maintenance projects; fault and failure profiles and benchmarks derived from that data; analytic methods and tools; and metrics for measuring effectiveness of software development methods. The EFF project will help industry and researchers(3) assess software system quality by collecting, analyzing, and providing error, fault, and failure data and by providing data collection and statistical methods and tools for the analysis of software systems.

Project data are needed to determine trends on broader concepts such as:

Data from many individual projects are needed to develop these and other benchmarks and to provide researchers with sufficient samples to develop new analytic methods and to identify where new methods are needed. Projects and their sponsoring companies need similar data to understand where specific error types are likely to occur and the frequency with which they occur. From various analysis methods, developers may locate troublesome parts of their programs and may adjust their development methods, adapt their testing processes, and maintain records for controlling their product quality.

Possible benefits to individual companies include:

Possible benefits to industry from the EFF project include:

By making data from various domains available to researchers, benefits to research may include:

The EFF project involves the following tasks:

Generate a standard data collection structure derived from IEEE and industry nomenclature and formats. Provide for anonymity of data and removal of any proprietary information.

Seek industry, government, academic collaborators/contributors to populate the data repository.

Make sanitized data publicly available through WWW-based facilities at NIST. Validate and sanitize data from contributors. Identify and procure commercial database management system. Refine the data collection and classification methodology as needed.

Develop methods and tools for qualitative and statistical analysis. Identify or develop methods or tools for viewing data, for analyzing data, for measuring impact of methods on software quality, and for assessing relationships of project factors to software quality.

Conduct analyses of collected data. Develop frequency profiles. Conduct analyses to provide understanding of impact of various development and diagnostics methods on failures. Report results/findings.
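As an illustration of the frequency profiles mentioned in the last task, the following minimal sketch tabulates how often each value of one attribute occurs across a set of records. The attribute name (fault_type) and the category values are hypothetical and are not drawn from the EFF taxonomy.

```python
from collections import Counter


def frequency_profile(records, attribute):
    """Map each value of `attribute` to (count, fraction of counted records).

    Records that lack the attribute are skipped rather than counted
    under a default category.
    """
    counts = Counter(r[attribute] for r in records if attribute in r)
    total = sum(counts.values())
    return {value: (n, n / total) for value, n in counts.most_common()}


# Hypothetical sample data for illustration only.
records = [
    {"fault_type": "logic"},
    {"fault_type": "logic"},
    {"fault_type": "interface"},
    {"fault_type": "data"},
    {"severity": "high"},  # no fault_type recorded; ignored by the profile
]

profile = frequency_profile(records, "fault_type")
for value, (count, fraction) in profile.items():
    print(f"{value:10s} {count:3d} {fraction:6.1%}")
```

Profiles of this kind from many projects could be compared only after the contributed categories have been mapped to a common nomenclature, which is the normalization problem the project itself identifies.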

The plan for the EFF project is aggressive; a primary risk is that the group of willing data contributors will be very small. NIST's traditional role in defining standards and measures for industry, which includes objectivity and the ability to protect proprietary information, may help to overcome industry reluctance to provide data. The data collection tools with tracking and management capabilities may be an incentive for contributors. From the research perspective, another risk lies in normalizing data from diverse environments; this is part of the research problem of this project.

Several EFF tasks are progressing simultaneously. NIST has formed a collaborative relationship with SoHaR, Inc. under a Cooperative Research and Development Agreement (CRADA), under which SoHaR, Inc. will be an active participant in the project. Another collaborative activity was a meeting in September 1996 of researchers and industry representatives to discuss problems likely to be encountered and the results of NIST research in identifying a draft data structure, or model. Indeed, the management component of the EFFTool was a consequence of this meeting. These industry representatives and researchers will continue to provide guidance.

As of the date of this report, another CRADA is under development, and several companies are negotiating the mechanics of providing data. The EFF project is continuously seeking contributors of data.

3. CONCEPTS OF EFF DATA COLLECTION AND ANALYSIS

Research on software faults, and hence on data collection and analysis, is almost as old as software itself. One early paper asks several questions for which the EFFTool is seeking data [ENDR]. These questions, pertinent today, are shown in Table 3.1 and indicate that considerable information about an error is needed to learn from it. Such information includes data about discovery of the problem (e.g., version, date, name of discoverer), description of the problem, resolution (date, name of resolver, version where change made), area or artifact of actual error cause, and the description of change.

Relevant Questions

. Where was the error made?
. When was the error made?
. Who (generic) made the error?
. What was done wrong?
. Why was the particular error made?
. What could have been done to prevent this error?
. If an error could not be prevented, what detection method could detect it?

Table 3.1 Questions for Software Error Analysis
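The discovery and resolution information needed to answer these questions can be sketched as a simple record structure. This is an illustration only: the class and field names are assumptions made for this sketch and do not reproduce the EFFTool data fields.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Discovery:
    """Data recorded when a problem is found (hypothetical field names)."""
    version: str
    found_on: date
    discoverer: str
    description: str


@dataclass
class Resolution:
    """Data recorded when a problem is fixed (hypothetical field names)."""
    resolved_on: date
    resolver: str
    fixed_in_version: str
    cause_artifact: str      # area or artifact where the error was actually made
    change_description: str


@dataclass
class FaultRecord:
    discovery: Discovery
    resolution: Optional[Resolution] = None  # None while the fault is open

    def is_open(self) -> bool:
        return self.resolution is None

    def days_to_resolve(self) -> Optional[int]:
        """Days between discovery and resolution, or None if still open."""
        if self.resolution is None:
            return None
        return (self.resolution.resolved_on - self.discovery.found_on).days


record = FaultRecord(
    Discovery("1.2", date(1997, 3, 3), "tester A", "Crash on empty input"))
print(record.is_open())          # True
record.resolution = Resolution(
    date(1997, 3, 10), "developer B", "1.3", "design", "Added input check")
print(record.days_to_resolve())  # 7
```

Keeping discovery and resolution as separate parts lets a record exist as soon as a problem is reported, which is what status tracking of open faults requires.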

Basili [BASI] proposed a highly organized approach for data collection and analysis, shown in Table 3.2. With respect to the EFF project, Steps 1 and 2 are relatively easy. For Step 3, some data categories have been easy to establish, but classifying the symptoms that revealed errors, and ultimately the cause of each error, is difficult, and terms may change in the second version of the taxonomy. While a small group has reviewed the data concepts for the EFF project (Step 4), broader usage may require changes to them. Validation of the data (Step 5) will be extremely difficult because accurate (and therefore possibly time-consuming) reporting of data is needed, the contributor must understand the data fields of the NIST tool, and NIST must carefully adapt the contributor's data fields to the NIST nomenclature. Implementation of Step 6 relies on contributed data and availability of the data to researchers.

Several potential contributors plan to provide data from existing collections. Such data will vary in content and must be translated into the data categories evolving in this project.

Special care will be exercised when analyzing data collected by a mechanism different from the NIST tool. In some cases, a link may be provided to an existing data base.

BASILI'S GUIDELINES

1. Establish the goals of the data collection
2. Develop a list of questions of interest
3. Establish data categories
4. Design and test the data types
5. Collect and validate data
6. Analyze data

Table 3.2 Process for Data Collection and Analysis

At the September 1996 meeting, attendees discussed taxonomies for faults and failures and some models for descriptive data about a project and its errors, faults, and failures (Table 3.3). The diversity among the taxonomies is great. The research behind them differed in domain, problem size, language, and other variables. For example, Endres' interest was in an operating system, while Beizer collected data from many projects of varying types and languages. Worse, much of the data that led to these specific taxonomies was collected before the existence of some languages (e.g., Ada, C++, Java) and before the use of software tools aiding development.

Do classifications apply to all languages equally? To all types of software? Has the advent of design tools, analysis tools, and other parameters changed the nature of errors and hence the faults manifested in the artifacts? And has the arrival of more complex systems added failures that could not have been imagined before networks? These issues need to be addressed when developing new taxonomies. Unfortunately, data are needed to develop the taxonomies, yet collecting data with a predefined taxonomy poses a problem of its own.

Key lessons learned from other researchers who have influenced the EFF project include the following: