Game Testing All in One (Game Development Series)
How good is "good" game software? Certainly the number of defects in the code has something to do with goodness. The team's ability to find defects in its product is another factor to consider. A "sigma level" establishes the defectiveness of game code relative to its size, while "phase containment" provides an indicator of how successful the team is at finding defects at their source, leaving fewer to escape to your customers.
Six Sigma Software
A "sigma level" is one way to establish a goal for the outgoing quality of your game. For software this measure is based on defects per million lines of code, excluding comments (also referred to as "non-commented source lines" or "NCSL"). The "lines of code" measure is often normalized to Assembly-equivalent lines of code (AELOC) in order to balance the different levels of abstraction across the variety of languages in use, such as C, C++, Java, Visual Basic, and so on. The level of abstraction of each language is reflected in its multiplier. For example, each line of C code is typically regarded as the equivalent of three to four AELOC, whereas each line of Perl code is treated as about 15 AELOC. It's best to measure this factor based on your specific development environment and use that factor for any estimates or projections you need to make in the future. If you are using different languages for different parts of your game, multiply the lines of code for each portion by the corresponding language factor and add the results together.
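As a minimal sketch of that normalization, the snippet below multiplies the NCSL count for each language by its multiplier and sums the results. The multiplier values, the line counts, and the names `AELOC_FACTOR` and `total_aeloc` are assumptions chosen for illustration; the C and Perl factors only echo the rough figures quoted above, and your own measured factors should replace them.

```python
# Hypothetical multipliers: AELOC per non-commented source line (NCSL).
# 3.5 for C sits inside the "three to four" range quoted in the text.
AELOC_FACTOR = {"C": 3.5, "Perl": 15.0}


def total_aeloc(ncsl_by_language, factors=AELOC_FACTOR):
    """Scale each language's NCSL by its multiplier and add the results."""
    return sum(ncsl * factors[lang] for lang, ncsl in ncsl_by_language.items())


# Hypothetical game: 200,000 NCSL of C plus 10,000 NCSL of Perl tooling.
game_ncsl = {"C": 200_000, "Perl": 10_000}
print(total_aeloc(game_ncsl))  # 200000 * 3.5 + 10000 * 15.0 = 850000.0
```

A language missing from the factor table raises a `KeyError`, which is deliberate here: it is safer to stop than to silently count unscaled lines.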
Figure 6.1: Sigma table excerpt for various sizes of delivered software.
Don't fool yourself by measuring your sigma on the sole basis of the open defects you know about in the product. This might reward poor testing that failed to find many defects still remaining in the game, and it wouldn't reflect the experience your customers will have. The defects being counted must include the game defects you know about that have not been fixed, whatever defects your customers have already found, and your projection of defects that remain in the software but haven't been discovered yet. It's best to wait anywhere from 6 to 18 months after shipping to calculate your sigma. If you still have a good result after that, continue to operate your projects in a similar manner by repeating what went "right" but also fix what went "wrong." If you have poor results, take a good hard look at what changes you can make to avoid a repeat performance. You can start by going through the list of non-conformances that QA found during the project.
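The counting rule above can be sketched as follows. All defect counts and the code size are hypothetical. The conversion from defects per million to a sigma value is an assumption as well: it uses the conventional Six Sigma normal-distribution mapping with a 1.5 sigma shift (under which 3.4 defects per million corresponds to 6 sigma) and may not match the table in Figure 6.1 exactly.

```python
from statistics import NormalDist


def defects_per_million(open_known, customer_found, projected_latent, aeloc):
    """Count all three defect sources, then scale to one million lines."""
    total = open_known + customer_found + projected_latent
    return total * 1_000_000 / aeloc


def sigma_level(dpm, shift=1.5):
    """Assumed mapping: normal quantile of the defect-free fraction plus a shift.

    Requires 0 < dpm < 1,000,000; inv_cdf is undefined at the endpoints.
    """
    return NormalDist().inv_cdf(1 - dpm / 1_000_000) + shift


# Hypothetical project: 12 open defects, 30 customer-found, 43 projected latent,
# in 850,000 assembly-equivalent lines of code.
dpm = defects_per_million(open_known=12, customer_found=30,
                          projected_latent=43, aeloc=850_000)
print(round(dpm, 1))  # 100.0
print(round(sigma_level(dpm), 2))
```

Counting only the 12 open defects in this example would give about 14 defects per million and a flattering sigma, which is exactly the trap the paragraph above warns against.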
Phase Containment
Phase containment is the ability to detect faults in the project phase in which they were introduced. Phase Containment Effectiveness (PCE) is a measure of how well that is being done.
Faults that are found in the phase in which they are introduced are known as in-phase faults or "errors." Faults that don't get caught in the same phase in which they are introduced are said to escape and become "defects." The principle is that if any subsequent work is derived from the faulty item, then a defect has occurred. Think of the 18" Stonehenge descending from the ceiling in the movie Spinal Tap. That could have been avoided (though it would not have been as funny) if someone had noticed that the size was given in inches instead of feet on the drawing given to the artist.
Errors are typically found by reviews, walkthroughs, or inspections. Defects are most noticeably found by testing and unhappy customers, but they can also be found in reviews of downstream work products. For example, a code inspection issue might actually be the result of incorrect design or requirements. Because other work has already been done based on the fault, this is a defect.
PCE is typically tracked and reported by showing the faults found in each development phase. The faults are organized into rows by the phase in which they were introduced, with a column for each phase in which they might be found. A coding fault can't be detected in the requirements phase because the code does not exist at that point. Calculate the PCE for a phase by dividing the number of in-phase faults by the total number of faults introduced in that phase, counting every phase in which they were eventually found. From the data in Figure 6.2, the coding phase PCE is calculated by dividing the number of coding faults found in the coding phase, 93, by the sum of all faults introduced by coding, which is 93 + 6 + 24 = 123. The result is 93/123 = 0.76. Figure 6.3 shows a graph summarizing the code PCEs for each phase.
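The coding-phase arithmetic can be checked with a short sketch. The function name `pce` and its argument layout are assumptions for illustration; the numbers 93, 6, and 24 are the coding-phase figures quoted in the text, with 6 and 24 being the coding faults that escaped to later phases.

```python
def pce(in_phase_faults, escaped_faults):
    """Phase Containment Effectiveness for one phase.

    in_phase_faults: faults found in the phase that introduced them.
    escaped_faults: counts of that phase's faults found in each later phase.
    """
    total = in_phase_faults + sum(escaped_faults)
    if total == 0:
        raise ValueError("no faults recorded for this phase")
    return in_phase_faults / total


coding_pce = pce(93, [6, 24])
print(round(coding_pce, 2))  # 0.76
```

The same function applies unchanged to the requirements and design rows once their counts are read from the containment table.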
Figure 6.2: Game code phase containment data.
Figure 6.3: Game code phase containment graph.
Alternatively, test results could be broken out into separate categories, as shown in Figure 6.4. These extra categories do not affect the PCE numbers or graphs, but this could be more convenient for data collection if different systems or categories are used for different release types. This data also helps the team understand whether there will be additional testing activities that could further reduce the PCE numbers as more defects are found. In Figure 6.4, no Beta testing results are available to add to the table. So, the PCE numbers for requirements, design, and coding only represent the maximum possible value. New defects found in Beta testing will be sourced to these phases and reduce the corresponding PCEs.
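To see why a PCE computed before Beta testing is only an upper bound, consider this small sketch. It reuses the coding-phase counts quoted earlier (93 in-phase faults, 6 + 24 escapes) and adds a purely hypothetical 10 coding defects found in Beta; the function name `pce` is also an assumption.

```python
def pce(in_phase, escaped):
    """In-phase faults divided by all faults introduced in the phase."""
    return in_phase / (in_phase + escaped)


before_beta = pce(93, 6 + 24)
after_beta = pce(93, 6 + 24 + 10)  # 10 Beta-found coding defects is hypothetical

print(f"{before_beta:.2f} {after_beta:.2f}")  # 0.76 0.70
```

Every newly sourced defect grows the denominator while the in-phase count stays fixed, so the PCE can only fall as later test results arrive.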
Figure 6.4: Game code phase containment data with expanded test categories.
If this practice is useful for understanding how well the team is capturing defects in the game code, it should also be applied to the work produced by the testers. Figure 6.5 shows example PCE data for testing deliverables and Figure 6.6 shows the corresponding graph.
Figure 6.5: Game test phase containment data.
Figure 6.6: Game test phase containment graph.
As the test PCE data shows, some faults in the tests don't get noticed until the test is executed on the game code. The problem might have been recognized as a test defect by the tester running the test, or it may have started out as a code defect before analysis and retesting uncovered the fact that the test was wrong, not the code. You can imagine how much more time-consuming that is compared with finding the defect before releasing the test.
Remember, this is not a measure of how well the executed tests perform. This is a measure of how well faults were captured in the test designs, scripts, and/or code. Any mistakes made in one of these activities will need to be repaired when they are eventually discovered. Test mistakes that don't get discovered could impact the quality of the game itself. A missing test, or a test that checks for the wrong result and passes, can send game bugs on their merry way to the paying public.
As with the sigma value, look for ways to improve your PCE. If you had 100% containment in all of your phases, you would only have to run each test once and they would all pass. Your customers wouldn't find any problems and you'd never have to issue a patch. Think of the time and money that would save! Since the PCE is a function of the faults produced and the faults found, you can attack a low PCE at both ends. Programmers can improve their ability to prevent the introduction of faults. Testers and QA can improve their ability to detect faults.
In both cases, some basic strategies to address low PCE areas are:
- Improve knowledge of the subject matter and provide relevant training.
- Have successful team members provide mentoring to less-successful members.
- Document methods used by successful individuals and deploy them throughout the team.
- Increase compliance with existing methods and standards.
- Add standards which, by design, help prevent faults.
- Add checking tools that run during the creation process, such as color-coded and syntax-aware editors.
- Add checking tools that run after the creation process, such as stronger compilers and memory leak checkers.