The Automatic Generation of Software Test Data Using Genetic Algorithms

  • Harmen-Hinrich Sthamer

    Student thesis: Doctoral Thesis

    Abstract

    Genetic Algorithms (GAs) have been used successfully to automate the generation of test data for software developed in ADA83. The test data were derived from the program's structure with the aim of traversing every branch in the software. The investigation uses fitness functions based on the Hamming distance between the expressions in the branch predicate and on the reciprocal of the difference between numerical expressions in the predicate. The input variables are represented in Gray code and as an image of the machine memory. The power of GAs lies in their ability to handle input data which may be of complex structure, and predicates which may be complicated and unknown functions of the input variables. Thus, the problem of test data generation is treated entirely as an optimisation problem.
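
As a rough illustration of the ideas named above, the following Python sketch shows reflected binary Gray coding together with two candidate fitness functions for a branch predicate of the form `lhs = rhs`. The abstract does not give the exact formulas, so the function names, the 16-bit word width, the `1 / (1 + d)` scaling and the small constant `delta` are all assumptions made for this example, not the thesis's definitions.

```python
def to_gray(n: int) -> int:
    """Reflected binary Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def from_gray(g: int) -> int:
    """Inverse of to_gray: recover the plain binary value."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ (a, b >= 0)."""
    return bin(a ^ b).count("1")


def hamming_fitness(lhs: int, rhs: int, bits: int = 16) -> float:
    """Hypothetical Hamming-based fitness for the predicate lhs == rhs.

    Both sides are masked to `bits` bits (an assumed word width), so
    negative values are treated as two's complement patterns. The
    result is 1.0 when the predicate holds and shrinks as more bits
    differ.
    """
    mask = (1 << bits) - 1
    d = hamming_distance(lhs & mask, rhs & mask)
    return 1.0 / (1 + d)


def reciprocal_fitness(lhs: float, rhs: float, delta: float = 1e-3) -> float:
    """Hypothetical reciprocal fitness for the predicate lhs == rhs.

    `delta` is an assumed small constant that avoids division by zero
    when the two expressions are equal.
    """
    return 1.0 / (abs(lhs - rhs) + delta)
```

One reason Gray code suits a GA is that consecutive integers differ in exactly one bit, so a single-bit mutation can move an input to a neighbouring value; for example, `hamming_distance(to_gray(7), to_gray(8))` is 1, whereas the plain binary patterns of 7 and 8 differ in four bits.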

    Random testing is used as a baseline for assessing the effectiveness of test data generation using GAs, which requires up to two orders of magnitude fewer tests than random testing and achieves 100% branch coverage. The advantage of GAs is that, through the search and optimisation process, test sets are improved such that they are at or close to the input subdomain boundaries. The GAs give the greatest improvement over random testing when these subdomains are small. Mutation analysis is used to establish the quality of the generated test data and the strengths and weaknesses of the test data generation strategy.

    Various software procedures with different input data structures (integers, characters, arrays and records) and program structures with 'if' conditions and loops are tested, namely a quadratic equation solver, a triangle classifier program comprising a system of three procedures, linear and binary search procedures, a remainder procedure and a commercially available generic sorting procedure.

    Experiments show that GAs generally required less CPU time than random testing to reach a global solution. The advantage is greatest when the density of global optima (solutions) is small compared to the entire input search domain.
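
To illustrate why a low solution density penalises random testing, consider a simplified model that is an assumption of this note rather than a result from the thesis: inputs are drawn independently and uniformly, and a fraction $p$ of the input domain satisfies the target branch. The number of random tests $N$ needed for the first hit is then geometrically distributed:

$$
P(N = k) = (1 - p)^{k-1}\,p, \qquad \mathbb{E}[N] = \frac{1}{p}, \qquad P(N \le n) = 1 - (1 - p)^{n}.
$$

Under this model the expected cost of random testing grows as $1/p$, so a guided search has the most room to improve on it when $p$ is small.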
    Date of Award: 1995
    Original language: English
