[Title Page] [TOC] [Prev] [Next] [End]

Appendix B. Extended Example


This appendix illustrates the structured testing technique of Chapter 5 for a module of significant complexity. The example is a C translation of the FORTRAN blackjack game program used in [NBS99] for the same purpose. No attempt has been made to improve the structural quality of the original version, and the error has been preserved. Section B.2 describes an experimental framework for comparison of test coverage criteria. Although structured testing does not guarantee detection of the error in the blackjack program, experimental results show that it is much more effective than branch coverage.

B.1 Testing the blackjack program

Since there are many variants on the blackjack rules, the following specification describes the rules being used for this example. The mechanics of the application of structured testing can be understood without reference to the details of the specification, but the specification is required to detect incorrect program behavior. The details of the "hand" module follow, as does the description of the testing process.

B.1.1 The specification

The program, as the dealer, deals two cards to itself and two cards to the player. The player's two cards are shown face up, while only one of the dealer's cards is shown. Both the dealer and the player may draw additional cards, which is known as being "hit." The player's goal is to reach 21 or less, but be closer to 21 than the dealer's hand-in which case the player wins. Ties go to the dealer. If the player's or the dealer's hand totals greater than 21, the hand is "busted" (a loss). The King, Queen, and Jack all count as 10 points. All other cards, except the Ace, count according to their face values. The Ace counts as 11 unless this causes the hand to be over 21, in which case it counts as 1.

If both the dealer and the player get Blackjack, which is a two-card hand totaling 21, it is a "push" (neither wins). Blackjack beats all other hands-if the player has Blackjack, he wins "automatically" before the dealer has a chance to take a hit. The player can also win automatically by having five cards without being busted. The player may take any number of hits, as long as the hand is not busted. The dealer must take a hit while the hand is less than or equal to 16, and must "stand" at 17.

B.1.2 The module

Figure B-1 shows the code for module "hand" from the blackjack program, which will be used to illustrate structured testing.

6 A0 int hand()
7 {/* return win */
8 int d = 0, pace, dace;
9 int fHit; /* 1 for hit, zero for not hit */
10 A1 A2 A3 A4 A5 p = 0; d = 0; pace = 0; dace = 0; win = 0;
11 /* win will be 0 if dealer wins, 1 if player wins, 2 if a push */
12 A6 A7 A8 A9 hit(&p, &pace); hit(&d, &dace);
13 A10 A11 A12 A13 hit(&p, &pace); hit(&d, &dace);
14 A14 count = 0;
15 A15 A16 printf("DEALER SHOWS --- %d\n", cards[i-1]);
16 A17 dshow = cards[i-1];
17 A18 A19 printf("PLAYER = %d, NO OF ACES - %d\n", p, pace);
18 A20 if (p == 21) {
19 A21 A22 printf("PLAYER HAS BLACKJACK\n");
20 A23 win = 1;
21 } else {
22 A24 count = 2;
23 L11:
24 A25 A26 check_for_hit(&fHit, p, count);
25 A27 if (fHit == 1) {
26 A28 A29 hit(&p,&pace);
27 A30 count += 1;
28 A31 A32 printf("PLAYER = %d, NO OF ACES - %d\n", p, pace);
29 A33 if (p > 21) {
30 A34 A35 printf("PLAYER BUSTS - DEALER WINS\n");
31 A36 goto L13;
32 }
33 A37 A38 goto L11;
34 }
35 A39 A40 }
36 /* Handle blackjack situations, case when dealer has blackjack */
37 A41 if (d == 21) {
38 A42 A43 printf("DEALER HAS BJ\n");
39 A44 if (win == 1) {
40 A45 A46 printf("------ PUSH\n");
41 A47 win = 2;
42 A48 goto L13;
43 } else {
44 A49 A50 printf("DEALER AUTOMATICALLY WINS\n");
45 A51 goto L13;
46 A52 }
47 } else {
48 /* case where dealer doesn't have blackjack:
49 * check for player blackjack or five card hand
50 */
51 A53 A54 if (p == 21 || count >= 5) {
52 A55 A56 printf("PLAYER AUTOMATICALLY WINS\n");
53 A57 win = 1;
54 A58 goto L13;
55 }
56 A59 A60 }
57 A61 A62 printf("DEALER HAS %d\n", d);
58 L12:
59 A63 if (d <= 16) {
60 A64 A65 hit(&d,&dace);
61 A66 if (d > 21) {
62 A67 A68 printf("DEALER BUSTS - PLAYER WINS\n");
63 A69 win = 1;
64 A70 goto L13;
65 }
66 A71 A72 goto L12;
67 }
68 A73 A74 A75 printf(" PLAYER = %d DEALER = %d\n", p, d);
69 A76 if (p > d) {
70 A77 A78 printf("PLAYER WINS\n");
71 A79 win = 1;
72 } else
73 A80 A81 A82 printf("DEALER WINS\n");
74 L13:
75 A83 return win;
76 A84 }
Figure B-1. Code for example blackjack module "hand."

Figure B-2 shows the graph for module "hand," which has cyclomatic complexity 11 and essential complexity 9. The high essential complexity makes the control structure difficult to follow. In such cases, the baseline method can be performed using the source listing as a guide, with paths represented by ordered lists of decision outcomes. Decisions are "flipped" exactly as in the graphical version of the method.

Figure B-2. Graph for module "hand."

The original blackjack testing example [NBS99] demonstrated the manual application of the baseline method in full detail, which will not be duplicated here. Instead, we follow the modern tool-assisted testing process, in which basis path coverage is built on a foundation of functional testing. We show that unrealizable test paths do not necessarily imply that rc < v(G) or that testing should cease, and we also demonstrate control dependency analysis as a test case derivation technique.

B.1.3 Tests through the module

Replacing the "hit" routine with a version that allows user input of the next card is sufficient to allow individual paths to be exercised. We now follow a typical structured testing process, beginning with functional tests while using an automated tool to measure ac. Input data derivation is simplified by first determining the meaning of all program variables used in decision expressions. The variable "p" holds the total value of the player's hand, and "d" holds the dealer's hand value. Those two variables are updated by the "hit" routine. The "fHit" and "win" variables are commented with their usage. When "fHit'' is 1, the player or dealer has elected to hit, and a value of 0 means stand. A value of 0 for "win" means the dealer wins, 1 means the player wins, and 2 means there was a push. This is the value returned by the module. Finally, the variable "count" stores the total number of cards in the player's hand.

The first functional test is a "push," which can be exercised by having the replacement "hit" module first set p to 11 and pace to 1, then set d to 11 and dace to 1, then set p to 21 and leave pace at 1, then set d to 21 and leave dace at 1. We could also have exercised the same path by having "hit" set the contents of its first argument unconditionally to 21. However, it is best to make stub and driver modules conform to the specified behavior of the real modules that they replace, and in fact behave as realistically as possible. Since the input data derivation is straightforward, we omit it for the rest of the paths. Execution of this test does in fact show a push, the correct behavior. When discussing control flow paths, we use the notation "18( 20): p==21 ==> TRUE" to indicate that the test on line 18 at node 20 with decision predicate "p==21" takes on the value "TRUE." The path executed by the first functional test is:

18( 20): p==21 ==> TRUE
37( 41): d==21 ==> TRUE
39( 44): win==1 ==> TRUE
The second functional test is blackjack for the player, in which the original deal leaves the player with 21 and leaves the dealer with an inferior hand. Execution shows correct behavior. The associated path is:

18( 20): p==21 ==> TRUE
37( 41): d==21 ==> FALSE
51( 53): p==21 ==> TRUE
The third functional test is blackjack for the dealer, in which the original deal leaves the player with less then 21, the player does not get hit, and the dealer has 21. Execution shows correct behavior. The associated path is:

18( 20): p==21 ==> FALSE
25( 27): fHit==1 ==> FALSE
37( 41): d==21 ==> TRUE
39( 44): win==1 ==> FALSE
The fourth functional test is a five-card win, in which the player accumulates five cards without going over 21 and the dealer was not dealt blackjack. Execution shows correct behavior. The associated path is:

18( 20): p==21 ==> FALSE
25( 27): fHit==1 ==> TRUE
29( 33): p>21 ==> FALSE
25( 27): fHit==1 ==> TRUE
29( 33): p>21 ==> FALSE
25( 27): fHit==1 ==> TRUE
29( 33): p>21 ==> FALSE
25( 27): fHit==1 ==> FALSE
37( 41): d==21 ==> FALSE
51( 53): p==21 ==> FALSE
51( 54): count>=5 ==> TRUE
The fifth functional test is a win for the player in which neither the player nor the dealer takes a hit, and neither has blackjack. Execution shows correct behavior. The associated path is:

18( 20): p==21 ==> FALSE
25( 27): fHit==1 ==> FALSE
37( 41): d==21 ==> FALSE
51( 53): p==21 ==> FALSE
51( 54): count>=5 ==> FALSE
59( 63): d<=16 ==> FALSE
69( 76): p>d ==> TRUE
The sixth functional test is a win for the dealer in which neither the player nor the dealer takes a hit, and neither has blackjack. Execution shows correct behavior. The associated path is:

18( 20): p==21 ==> FALSE
25( 27): fHit==1 ==> FALSE
37( 41): d==21 ==> FALSE
51( 53): p==21 ==> FALSE
51( 54): count>=5 ==> FALSE
59( 63): d<=16 ==> FALSE
69( 76): p>d ==> FALSE
The seventh functional test is a win for the dealer in which the player is busted. Execution shows correct behavior. The associated path is:

18( 20): p==21 ==> FALSE
25( 27): fHit==1 ==> TRUE
29( 33): p>21 ==> TRUE
The eighth functional test is a win for the player in which the dealer is busted. Execution shows correct behavior. The associated path is:

18( 20): p==21 ==> FALSE
25( 27): fHit==1 ==> FALSE
37( 41): d==21 ==> FALSE
51( 53): p==21 ==> FALSE
51( 54): count>=5 ==> FALSE
59( 63): d<=16 ==> TRUE
61( 66): d>21 ==> TRUE
The ninth functional test is a win for the dealer in which the player is hit to reach 21 and the dealer has blackjack. Execution shows correct behavior. The associated path is:

18( 20): p==21 ==> FALSE
25( 27): fHit==1 ==> TRUE
29( 33): p>21 ==> FALSE
25( 27): fHit==1 ==> FALSE
37( 41): d==21 ==> TRUE
39( 44): win==1 ==> FALSE
This concludes our initial set of functional tests. No error has been detected yet, and each test increased ac by 1, so ac is now 9. The coverage analysis tool reports that one branch has not yet been executed, "61( 66): d>21 ==> FALSE." This corresponds to the dealer taking a hit without being busted. Since branches are usually easier to test than paths, and testing a previously unexecuted branch must increase ac, we construct a new test for this branch immediately. The test makes the dealer win the hand by hitting once and then standing. Execution shows correct behavior. The associated path is:

18( 20): p==21 ==> FALSE
25( 27): fHit==1 ==> FALSE
37( 41): d==21 ==> FALSE
51( 53): p==21 ==> FALSE
51( 54): count>=5 ==> FALSE
59( 63): d<=16 ==> TRUE
61( 66): d>21 ==> FALSE
59( 63): d<=16 ==> FALSE
69( 76): p>d ==> FALSE
All branches have now been covered. No error has been detected yet, and ac is now 10. The coverage analysis tool reports that executing the following path would be sufficient to complete basis path coverage:

18( 20): p==21 ==> FALSE
25( 27): fHit==1 ==> FALSE
37( 41): d==21 ==> FALSE
51( 53): p==21 ==> TRUE
Unfortunately, this particular path is unrealizable, since the value of p is not altered between the mutually exclusive "30( 13): p==21 ==> FALSE" and "62( 44): p==21 ==> TRUE" decision outcomes. If the basis completion path had been realizable, we could have just derived the corresponding data, executed the test, and completed the testing process. For modules as poorly structured as "hand," however, unrealizable paths are fairly common, so it is important to be prepared for them. As discussed in section 9, the existence of unrealizable paths does not necessarily mean that rc < v(G). To help determine whether a realizable alternative path can complete basis path testing, we use the dependency analysis technique. The dependency analysis tool indicates that for every tested path, the total number of times branches "39( 44): win==1 ==> TRUE" and "51( 53): p==21 ==> TRUE" were executed is equal to the number of times branch "18( 20): p==21 ==> TRUE" was executed. Note that for any single path, none of those three decision outcomes can be executed more than once and at most one of the first two can be executed. We may therefore rephrase the dependency equation in terms of the module's functionality: The player has blackjack if and only if either the result is a push or the player automatically wins with 21. Any test that does not satisfy this dependency relationship will be independent of the current set of tests and increase ac, in this case completing testing by making ac equal to v(G). On the other hand, if we can prove that the dependency must hold for every feasible test, then we have shown that rc < v(G). Fortunately, we can simplify the dependency even further, showing that rc = v(G) and giving a concise characterization of all feasible basis completion paths. Note that if the player has blackjack, the result must be either a push or an automatic win with 21 for the player. Also, there is no way to reach a push result without the player having blackjack. Thus, a test breaks the dependency relationship if and only if the player gets an automatic win with 21 without having blackjack. This is clearly possible for the module as written, for example by having the player reach exactly 21 after taking one hit. Execution of this test shows incorrect behavior-the module declares an automatic win for the player without giving the dealer a chance to take hits and win by also reaching 21. The specification states that the player can only win "automatically" with either Blackjack or a five-card hand. As expected, this test completes the basis set. The associated path is:

18( 20): p==21 ==> FALSE
25( 27): fHit==1 ==> TRUE
29( 33): p>21 ==> FALSE
25( 27): fHit==1 ==> FALSE
37( 41): d==21 ==> FALSE
51( 53): p==21 ==> TRUE
Note that in addition to providing a test that detected the error, dependency analysis clarified the nature of the error and gave a straightforward repair technique: replace the automatic win test "51( 53): p==21" with "51( 53): win==1," since at that point win==1 is precisely equivalent to the player having blackjack. If the error is repaired in this manner, there will be no way to break the dependency with a feasible path, and rc will therefore be reduced to 10.

Note also that structured testing is not guaranteed to detect the error, as can be seen from dependency analysis. If the player reaches exactly 21 after taking at least three hits, the error will not result in incorrect module output even though a dependency-breaking "incorrect" control flow path is executed. The reason is that there is no difference in specified behavior between an automatic win due to blackjack and an automatic win due to a five-card hand. Hence, the five-card hand masks the error. This is the only situation in which structured testing can fail to detect the error in the blackjack program, and as shown in section B-2 it is an extremely rare situation during actual play of the game.

B.2 Experimental comparison of structured testing and branch coverage

The "hand" module from section B-1 is one of several modules used in [WATSON4] to compare the error detection effectiveness of structured testing with branch coverage. The rest of this appendix discusses that work. Although neither testing technique is guaranteed to detect the error, structured testing performed dramatically better than branch coverage in the experiment, which used random test data generation to simulate actual play.

B.2.1 Experimental design

The overall experimental design was to package the test module with a driver program that accepts a random number seed as input, generates random test data from a realistic distribution, executes the test module with that data, analyzes the module's behavior, and returns an exit code indicating whether an error was detected. A higher-level driver invokes the module test driver with a sequence of seed values, uses an automated tool to measure the test coverage level according to each criterion being compared, and tracks both the error detection and the coverage results.

For both structured testing and branch coverage, 100 experimental trials were performed, in each of which random test cases were generated iteratively until the current testing criterion was satisfied. Four kinds of data were collected during each trial for the two testing criteria:

B.2.2 Comparative frequency of error detection

Figure B-3 shows the data collected for the "hand" module during the experiment. Data about number of tests is averaged over the 100 trials. Data about error detection is added for the 100 trials, giving a convenient interpretation as a percentage of trials in which errors were detected by the criteria.

Structured Testing

Branch Coverage

Tests

45

36

Testsinc

11

8

Detect

97

85

Detectinc

96

53

Figure B-3. Experimental data.

B.2.3 Interpretation of experimental data

Structured testing decisively outperformed branch coverage at detecting the error while requiring only 25% more total tests and 38% more tests that increase coverage. The most notable feature of the data is the robustness of structured testing with respect to test set minimization. Only 1% of the error detection effectiveness of structured testing was lost when considering only tests that increased coverage, whereas the number of tests required was reduced by 75%. For branch coverage, 38% of error detection effectiveness was lost, and the number of tests required was reduced by 78%. This indicates that the effectiveness of structured testing is due to the criterion itself rather than just the number of tests required to satisfy it. For the complete set of test programs used in [WATSON4], structured testing preserved 94% of its error detection effectiveness when restricted to the 24% of test cases that increased coverage.



[Title Page] [TOC] [Prev] [Next] [End]