Pairwise Testing: A Test Case Design Method

LibraryOfKevin 2011-10-19

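In software testing, it is very common to test combinations of the test object's input parameters. Sometimes, however, the number of combined test cases becomes enormous. Faced with this situation, testers typically fall back on one of the following strategies:

(1) Try to test every combination of inputs and delay the project, possibly at the cost of losing the product's market.

(2) Pick test cases that are easy to design and run, regardless of whether they reveal anything about product quality.

(3) List all the combinations and randomly select a subset to test.

(4) Use a dedicated testing technique to select a subset that finds most of the defects.

Strategy (1) is ruled out by limited time and resources; strategies (2) and (3) can carry considerable risk. For the last strategy, pairwise testing is a good choice. Pairwise testing here is a test design technique; it does not mean pair testing during execution (two people testing the same object at the same time and place while actively exchanging ideas).

With pairwise testing, we do not test every combination of input values, only every pairwise combination of them. The technique can significantly reduce the number of test cases while still maintaining high test quality. The following examples show how much pairwise testing reduces the number of test cases: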

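- Suppose a software system has four input parameters, each with 3 possible values. The full number of combinations is 3⁴ = 81. With pairwise testing, only 9 test cases are needed to cover every pairwise combination of parameter values.

- Suppose a software system has 13 input parameters, each with 3 possible values. The full number of combinations is 3¹³ = 1,594,323. With pairwise testing, only 15 test cases are needed to cover every pairwise combination of parameter values.

- Suppose a software system has 20 input parameters, each with 10 possible values. The full number of combinations is 10²⁰. With pairwise testing, only 180 test cases are needed to cover every pairwise combination of parameter values.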
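The full-combination counts in these three examples are simply the number of values raised to the number of parameters. A minimal Python sketch of that arithmetic follows; the helper names `full_combinations` and `pairwise_lower_bound` are made up for illustration. The lower bound (the product of the two largest domain sizes) is only a floor on the size of a pairwise suite, not the size a real tool would reach.

```python
from math import prod

def full_combinations(domain_sizes):
    """Number of exhaustive test cases: the product of all domain sizes."""
    return prod(domain_sizes)

def pairwise_lower_bound(domain_sizes):
    """Any all-pairs suite needs at least (largest * second largest) rows,
    because those two parameters alone have that many value pairs."""
    a, b = sorted(domain_sizes, reverse=True)[:2]
    return a * b

examples = {
    "4 parameters x 3 values": [3] * 4,
    "13 parameters x 3 values": [3] * 13,
    "20 parameters x 10 values": [10] * 20,
}
for label, sizes in examples.items():
    print(label, full_combinations(sizes), pairwise_lower_bound(sizes))
```

For the first example the floor of 9 is actually reached; for the other two, the suite sizes quoted above (15 and 180) sit above the floor of 9 and 100.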

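Pairwise testing finds all single-mode faults and double-mode faults. It is not necessarily suited, however, to finding multi-mode faults in the test object. In practice, most failures are single-mode or double-mode, and multi-mode faults make up only a small share. A well-chosen pairwise test set can therefore greatly reduce the number of test cases and the testing effort while still achieving good coverage and preserving test quality.

- Single-mode fault: the failure is caused by a single parameter; testing every parameter independently is enough to find it.

- Double-mode fault: the failure is caused by two parameters together; every pairwise combination of parameter values must be tested to be sure of finding such defects.

- Multi-mode fault: the failure is caused by three or more parameters together; pairwise testing may happen to find such defects, but it cannot guarantee that testing is sufficient.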
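The three fault modes can be made concrete with a small simulation. The sketch below assumes a hypothetical system with three two-valued parameters and a four-row suite that covers every pair of values; a "fault" is modelled as a set of parameter values that must all be present for the failure to occur. The names `detected` and `faults_of_mode` are illustrative only.

```python
from itertools import combinations, product

# Hypothetical system with three two-valued parameters.
domains = {"X": [1, 2], "Y": ["Q", "R"], "Z": [5, 6]}

# A four-row suite that covers every pair of values at least once.
suite = [
    {"X": 1, "Y": "Q", "Z": 5},
    {"X": 1, "Y": "R", "Z": 6},
    {"X": 2, "Y": "Q", "Z": 6},
    {"X": 2, "Y": "R", "Z": 5},
]

def detected(fault, tests):
    """A fault (dict of parameter -> triggering value) is detected when at
    least one test sets all of its parameters to the triggering values."""
    return any(all(t[p] == v for p, v in fault.items()) for t in tests)

def faults_of_mode(n):
    """Every possible fault triggered by exactly n parameters."""
    for params in combinations(domains, n):
        for values in product(*(domains[p] for p in params)):
            yield dict(zip(params, values))

for mode in (1, 2, 3):
    all_faults = list(faults_of_mode(mode))
    found = sum(detected(f, suite) for f in all_faults)
    print(f"mode {mode}: {found} of {len(all_faults)} possible faults detected")
```

Under these assumptions the suite catches all 6 single-mode and all 12 double-mode faults, but only 4 of the 8 possible triple-mode faults, which is the "may find, cannot guarantee" behaviour described in the last bullet.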

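The following data illustrate the effectiveness of pairwise testing:

- In AT&T's testing of its LAN-based mail system, 1,000 test cases produced with pairwise testing found 20% more defects than the original 1,500 test cases, while the testing effort was cut by 50%.

- In a 15-year tracking study of medical device testing, the National Institute of Standards and Technology found that 98% of software defects could be found with pairwise testing.

- An analysis of defects in the Mozilla web browser showed that 76% of the defects could be found with pairwise testing.

Pairwise test sets can be derived with several techniques, including orthogonal arrays, the Allpairs method provided by James Bach, and the classification tree method.

Pairwise testing generally starts by selecting the system's input variables, choosing values for each variable from its input domain, and then listing combinations that cover all value pairs. Methods for building a pairwise test set include orthogonal arrays and algorithmic approaches such as greedy algorithms. A free, open-source tool for generating pairwise test cases can be downloaded from [http://www./tools/pairs.zip].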
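To give a feel for the greedy approach mentioned above, here is a minimal sketch. It is not the algorithm used by the Allpairs tool or by any particular product; it is one simple greedy variant, with illustrative function names, that repeatedly picks the row covering the most still-uncovered pairs. Because it scans the full Cartesian product, it only suits small inputs.

```python
from itertools import combinations, product

def all_pairs_to_cover(domains):
    """Every (column i, value, column j, value) combination that must appear."""
    pairs = set()
    for i, j in combinations(range(len(domains)), 2):
        for a, b in product(domains[i], domains[j]):
            pairs.add((i, a, j, b))
    return pairs

def pairs_in(row):
    """The column-tagged pairs that a single test row covers."""
    return {(i, row[i], j, row[j]) for i, j in combinations(range(len(row)), 2)}

def greedy_pairwise(domains):
    """Repeatedly pick the candidate row that covers the most uncovered pairs."""
    uncovered = all_pairs_to_cover(domains)
    candidates = list(product(*domains))
    suite = []
    while uncovered:
        best = max(candidates, key=lambda row: len(pairs_in(row) & uncovered))
        suite.append(best)
        uncovered -= pairs_in(best)
    return suite

domains = [["a", "b", "c"]] * 4   # four parameters, three values each
suite = greedy_pairwise(domains)
print(len(suite), "rows instead of", 3 ** 4)
```

A greedy suite is guaranteed to cover every pair, but it is not guaranteed to be minimal; for this input the known optimum is 9 rows, and a greedy run may return a few more.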

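The following example shows in detail how to design test cases with pairwise testing.

Consider a system S with three input variables X, Y, and Z, whose domains are D(X) = {1, 2}, D(Y) = {Q, R}, and D(Z) = {5, 6}.

Step 1: List all possible test cases. There are 2 × 2 × 2 = 8 of them; see Table 1.

Table 1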

Test ID	Input X	Input Y	Input Z
TC1	1	Q	5
TC2	1	Q	6
TC3	1	R	5
TC4	1	R	6
TC5	2	Q	5
TC6	2	Q	6
TC7	2	R	5
TC8	2	R	6
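Table 1 is the Cartesian product of the three domains. As a small sketch (the variable names are illustrative), Python's `itertools.product` produces the same rows in the same order, because it varies the last parameter fastest:

```python
from itertools import product

domains = {"X": [1, 2], "Y": ["Q", "R"], "Z": [5, 6]}

# Number the combinations TC1..TC8 in the order product() yields them.
table1 = [
    (f"TC{n}", *values)
    for n, values in enumerate(product(*domains.values()), start=1)
]
for row in table1:
    print(*row, sep="\t")
```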

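Step 2: Remove redundant rows, as follows. Starting from the last row of the table, if every pairwise combination of values in the row can also be found in other rows still in the table, the row can be deleted from the test set.

For example, TC8 contains the pairs (2-R, 2-6, R-6). 2-R appears in TC7, 2-6 in TC6, and R-6 in TC4, so the row is deleted. TC7 contains the pairs (2-R, 2-5, R-5); with TC8 gone, 2-R no longer appears in any other row, so TC7 is kept. Proceeding in this way yields the pairwise test set in Table 2. The number of test cases has clearly been halved.

Table 2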

Test ID	Input X	Input Y	Input Z
TC1	1	Q	5
TC4	1	R	6
TC6	2	Q	6
TC7	2	R	5
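The bottom-up elimination in Step 2 can be sketched in a few lines of Python. This is one reading of the procedure described above, with illustrative names: a row is dropped when each of its pairs still appears in some other row that has not yet been dropped.

```python
from itertools import combinations, product

domains = [[1, 2], ["Q", "R"], [5, 6]]   # D(X), D(Y), D(Z)
tests = {f"TC{n}": row for n, row in enumerate(product(*domains), start=1)}

def pairs_in(row):
    """Column-tagged value pairs contained in one test case."""
    return {(i, row[i], j, row[j]) for i, j in combinations(range(len(row)), 2)}

def reduce_bottom_up(tests):
    """Walk from the last row upward; drop a row when every one of its
    pairs still appears in some other row that has not been dropped."""
    kept = dict(tests)
    for name in reversed(list(tests)):
        others = set()
        for other_name, other_row in kept.items():
            if other_name != name:
                others |= pairs_in(other_row)
        if pairs_in(kept[name]) <= others:
            del kept[name]
    return kept

table2 = reduce_bottom_up(tests)
print(list(table2))   # ['TC1', 'TC4', 'TC6', 'TC7']
```

The result depends on the order in which rows are visited, and on larger inputs this kind of elimination does not necessarily reach the smallest possible suite.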

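About Orthogonal Arrays

Orthogonal arrays from mathematics have had a strong influence on pairwise testing. They are widely used in many disciplines, including materials research, process manufacturing, metallurgy, polling, and other fields that require testing and statistical sampling.

An orthogonal array has the following characteristics. First, it is a rectangular array, or a table of values in rows and columns, like a database table or a spreadsheet. Each column of the spreadsheet represents a variable or parameter. Table 3 shows the column headers of a compatibility test matrix.

Table 3: Column Headers for a Test Matrix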
Combination Number	Display Resolution	Operating System	Printer

The value of each variable is chosen from a set known as an alphabet. This alphabet doesn't have to be composed of letters; it's more abstract than that. Consider the alphabet to be "available choices" or "possible values". A specific value, represented by a symbol within an alphabet, is formally called a level. That said, we often use letters to represent those levels; we can use numbers, words, or any other symbol. As an example, think of levels in terms of a variable that has Low, Medium, and High settings. Represent those settings in our table using the letters A, B, and C. This gives us a three-letter, or three-level alphabet.

At an intersection of each row and column, we have a cell. Each cell contains a variable set to a certain level. Thus in our table, each row represents a possible combination of variables and values, as in Table 4 above.

Imagine a table that contains combinations of display settings, operating systems, and printers. One column represents display resolution (800 x 600 or Low; 1024 x 768, or Medium; and 1600 x 1200, High); a second column represents operating systems (Windows 98, Windows 2000, and Windows XP); and a third column represents printer types (PostScript, LaserJet, and BubbleJet). We want to make sure to test each operating system at each display resolution; each printer with each operating system; and each display resolution with each printer. We'd need a table with 3 x 3 x 3 = 27 rows to represent all of the possible combinations; here are the first five rows in such a table:

Table 7: Compatibility Test Matrix (first five rows of 27)
Combination Number	Display Resolution	Operating System	Printer
1	Low	Win98	PostScript
2	Low	Win98	LaserJet
3	Low	Win98	BubbleJet
4	Low	Win2000	PostScript
5	Low	Win2000	LaserJet

Writing all those options out takes time; let's use shorthand. All we need is a legend or mapping to associate each variable with a letter of the alphabet: Low=A, Medium=B, High=C; Windows 98=A, Windows 2000=B, and Windows XP=C; PostScript=A, LaserJet=B, and BubbleJet=C. Again, these are only the first five rows in the table.

Table 8: Compatibility Test Matrix (first five rows of 27)
Combination Number	Display Resolution	Operating System	Printer
1	A	A	A
2	A	A	B
3	A	A	C
4	A	B	A
5	A	B	B
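The legend is a per-column lookup, so the same letter decodes to different things in different columns. A short sketch of that mapping, using the legend given above (the `decode` helper is an illustrative name):

```python
# One lookup table per column; the same letter means something different in each.
legend = {
    "Display Resolution": {"A": "Low", "B": "Medium", "C": "High"},
    "Operating System": {"A": "Windows 98", "B": "Windows 2000", "C": "Windows XP"},
    "Printer": {"A": "PostScript", "B": "LaserJet", "C": "BubbleJet"},
}

abstract_rows = ["AAA", "AAB", "AAC", "ABA", "ABB"]   # first five rows of Table 8

def decode(row, legend):
    """Translate one row of letters into concrete values, column by column."""
    return {
        column: mapping[letter]
        for (column, mapping), letter in zip(legend.items(), row)
    }

for number, row in enumerate(abstract_rows, start=1):
    print(number, decode(row, legend))
```

Swapping in a different `legend` reuses the same abstract rows for a different set of variables.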

Note that the meaning of the letter depends on the column it's in. In fact, the table can mean whatever we like; we can construct very general OAs using letters, instead of specific values or levels. When it comes time to test something, we can associate new meanings with each column and with each letter. Again, the first five lines:

Table 9: Compatibility Test Matrix (first five rows of 27)
Combination Number	Variable 1	Variable 2	Variable 3
1	A	A	A
2	A	A	B
3	A	A	C
4	A	B	A
5	A	B	B

While we can replace the letters with specific values, a particular benefit of this approach is that we can broaden the power of combination testing by replacing the letters with specific classes of values. One risk associated with free-form text fields is that they might not be checked properly for length; the program could accept more data than expected, copy that data to a location (or "buffer") in memory, and thereby overwrite memory beyond the expected end of the buffer. Another risk is that the field might accept an empty or null value even though some data is required. So instead of associating specific text strings with the levels in column 3, we could instead map A to "no data", B to "data from 1 to 20 characters", and C to "data from 21 to 65,336 characters". By doing this, we can test for risks broader than those associated with specific enumerated values.

For example, suppose that we are testing an insurance application. Assume that Variable 1 represents a tri-state checkbox (unchecked = A = no children, checked = B = dependent children, and greyed out = C = adult children). Next assume that Variable 2 represents a set of radio buttons (labeled single, married, or divorced). Finally assume that Variable 3 represents a free-form text field of up to 20 characters for the spouse's first name. Suppose further the text field is properly range checked to make sure that it's 20 characters or fewer. However, suppose that a bug exists wherein the application produces a garbled record when the spouse's name is empty, but only when the "married" radio button is selected. By using classes of data (null, valid, excessive) in Column 3, pairwise testing can help us find the bug.

Back to orthogonal arrays. There are two kinds of orthogonal arrays: those which use the same-sized alphabet over all of the columns, and those which use a larger alphabet in at least one column. The latter type is called a "mixed-alphabet" orthogonal array, or simply a "mixed orthogonal array". If we decide to add another display resolution, we need to add another letter (D) to track it; we would then have a mixed orthogonal array.

A regular (that is, non-mixed) orthogonal array has two other properties: strength and index.

Formally, an orthogonal array of strength S and index I over an alphabet A is a rectangular array with elements from A having the property that, given any S columns of the array, and given any S elements of A (equal or not), there are exactly I rows of the array where those elements appear in those columns.

Only a mathematician could appreciate that definition. Let's figure out what it means in practical terms.

If we're using an OA to test for faults, strength essentially refers to the mode of the fault for which we're checking. While checking for a double-mode fault, we consider pairs of columns and pairs of letters from the alphabet; we would need a table with a strength of 2. We select a strength by choosing a certain number of columns from the table and the same number of values from the alphabet; we'll call that number S. The array is orthogonal if in our S columns, each combination of S symbols appears the same number of times; or to put it another way, when S = 2, each pair of symbols must appear the same number of times. Thus we are performing pairwise or all-pairs testing when we create test conditions using orthogonal arrays of strength 2.
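The formal definition can be turned into a mechanical check. The sketch below, with the illustrative name `oa_index`, takes any S columns, counts how often each S-tuple over the alphabet occurs, and reports the common count I if there is one. It assumes a single shared alphabet inferred from the cells, so it does not handle mixed-alphabet arrays.

```python
from collections import Counter
from itertools import combinations, product

def oa_index(rows, strength):
    """Return the index I if `rows` is an orthogonal array of the given
    strength over a single shared alphabet, otherwise None."""
    alphabet = sorted({cell for row in rows for cell in row})
    indexes = set()
    for cols in combinations(range(len(rows[0])), strength):
        counts = Counter(tuple(row[c] for c in cols) for row in rows)
        # Every possible S-tuple over the alphabet must be counted, zeros included.
        for combo in product(alphabet, repeat=strength):
            indexes.add(counts[combo])
    if len(indexes) == 1 and 0 not in indexes:
        return indexes.pop()
    return None

# A small two-level array with three columns and four rows.
small = [("A", "A", "A"), ("A", "B", "B"), ("B", "A", "B"), ("B", "B", "A")]
print(oa_index(small, 2))   # 1
print(oa_index(small, 3))   # None
```

The four-row array has strength 2 and index 1 (every pair appears exactly once in every pair of columns), but it is not an orthogonal array of strength 3, since only four of the eight possible triples occur.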

Were we checking for a triple-mode fault (a fault that depends on three variables set to certain values), we would need to look at three columns at a time and at combinations of three letters from the alphabet, that is, a table with a strength of 3. In general, strength determines how many variables we want to test in combination with each other.

An index is a more complicated concept. The orthogonal array has an index I if there are exactly I rows in which the S values from the alphabet appear. This effectively means that OAs that have an index must either have the same alphabet, or that all the columns must contain alphabets of equal size, which amounts to the same thing. A mixed-alphabet orthogonal array therefore cannot have an index.

The index is important when you want to make sure not only that each combination is tested, but that each combination is tested the same number of times. That's important in manufacturing and product safety tests, because it allows us to account for wear or friction between components. When we test combinations of components using orthogonal arrays, we find not only which combination of components will break, but also which combination will break first.

This leads us to the major difference between orthogonal array testing and all-pairs testing. If we are searching for simple conflicts between variables, we would presume that we'll expose the defect on the first test of a conflicting value pair. If we stick to that presumption, it is not very useful to test the same combination of paired values more than once, so when testers use pairwise testing, they use orthogonal arrays of strength 2 to produce pairwise test conditions, and, to save time, they'll tend to stick to an index of 1, to avoid duplication of effort. In cases where the alphabets for each column are of mixed size, we'll take the hit and test some pairs once and some pairs more than once.

Moreover, unless we specifically plan for it, it's not very likely that the variables in a piece of software or the parameters in configuration testing will have the same number of levels per variable. If we adjust all of our columns such that they have the same-sized alphabets, our array grows. This gives us an orthogonal array that is balanced, such that each combination appears the same number of times, but at a cost: the array is much larger than necessary for all-pairs. In all-pairs testing, the combinations are pairs of variables in states such that each variable in each of its states is paired at least once with some other variable in each of its states. In strictly orthogonal arrays, if one pair is tested three times, all pairs have to be tested three times, whether the extra tests reveal more information or not.

A strictly orthogonal array will have a certain number of rows for which the value of one or more of the variables is irrelevant. This isn't necessary as long as we've tested each pair against the other once; thus in practical terms, a test suite based on an orthogonal array has some wasted tests. Consequently, in pairwise testing, our arrays will tend not to have an index, and will be "nearly orthogonal".

So let's construct an orthogonal array. We have three columns, representing three variables. We'll choose an alphabet of Red, Green, and Blue--that's a three-level alphabet. Then we'll arrange things into a table for all of the possible combinations:

Table 10: All Combinations for Three Variables of Three Levels Each
	A	B	C
1	Red	Red	Red
2	Red	Red	Green
3	Red	Red	Blue
4	Red	Green	Red
5	Red	Green	Green
6	Red	Green	Blue
7	Red	Blue	Red
8	Red	Blue	Green
9	Red	Blue	Blue
10	Blue	Red	Red
11	Blue	Red	Green
12	Blue	Red	Blue
13	Blue	Green	Red
14	Blue	Green	Green
15	Blue	Green	Blue
16	Blue	Blue	Red
17	Blue	Blue	Green
18	Blue	Blue	Blue
19	Green	Red	Red
20	Green	Red	Green
21	Green	Red	Blue
22	Green	Green	Red
23	Green	Green	Green
24	Green	Green	Blue
25	Green	Blue	Red
26	Green	Blue	Green
27	Green	Blue	Blue

Now, for each pair of columns, AB, AC, and BC, each pair of colours appears exactly three times. Our table is an orthogonal array of level 3, strength 2, and index 3. In order to save testing effort, let's reduce the appearance of each pair to once.

Table 11: All-Pairs Array, Three Variables of Three Levels Each
	A	B	C
2	Red	Red	Green
4	Red	Green	Red
9	Red	Blue	Blue
12	Blue	Red	Blue
14	Blue	Green	Green
16	Blue	Blue	Red
19	Green	Red	Red
24	Green	Green	Blue
26	Green	Blue	Green

How did I happen to choose these nine specific combinations? That was the easy part: I cheated. I tried trial and error, and couldn't get it down to fewer than 12 entries without spending more time than I felt it was worth. The AllPairs tool mentioned above got me down to 10 combinations. However, on the Web, there are precalculated orthogonal array tables for certain numbers of variables and alphabets; one is here; that's the one that I used to produce the chart above.
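The claims about Table 10 (every pair three times) and Table 11 (every pair once) are easy to verify by counting. A small sketch, with illustrative helper names, that tallies each value pair for each pair of columns:

```python
from collections import Counter
from itertools import combinations, product

colours = ["Red", "Green", "Blue"]

table10 = list(product(colours, repeat=3))   # all 27 combinations
table11 = [                                  # the nine rows kept in Table 11
    ("Red", "Red", "Green"), ("Red", "Green", "Red"), ("Red", "Blue", "Blue"),
    ("Blue", "Red", "Blue"), ("Blue", "Green", "Green"), ("Blue", "Blue", "Red"),
    ("Green", "Red", "Red"), ("Green", "Green", "Blue"), ("Green", "Blue", "Green"),
]

def pair_counts(rows):
    """For each pair of columns, count how often each value pair occurs."""
    return {
        (i, j): Counter((row[i], row[j]) for row in rows)
        for i, j in combinations(range(3), 2)
    }

def summary(rows):
    """Distinct pairs seen per column pair, plus the set of occurrence counts."""
    counts = pair_counts(rows)
    distinct = {cols: len(c) for cols, c in counts.items()}
    occurrences = {n for c in counts.values() for n in c.values()}
    return distinct, occurrences

print(summary(table10))   # 9 distinct pairs per column pair, each seen 3 times
print(summary(table11))   # 9 distinct pairs per column pair, each seen once
```

Both tables cover all nine value pairs for AB, AC, and BC; the only difference is the index, 3 for the full table and 1 for the nine-row array.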

Mixed-alphabet orthogonal arrays permit alphabets of different sizes in the columns. The requirement is that the number of rows where given elements occur in a given number of prescribed columns is constant over all choices of the elements, but is allowed to vary for different sets of columns. Such an orthogonal array does not have an index, and is known as a "nearly orthogonal array." The airline example above is of this "mixed alphabet" type. The consequence is that some combinations will be tested more often than others. Note that in Table 5, there are two instances of Canada as the destination and Window as the seating preference, but each test is significant with respect to the class column. In cases where a single column has a larger alphabet than all the others, we'll test all pairs, but some pairs may be duplicated or may have variables that can be set to "don't care" values. If there's an opportunity to broaden the testing in a "don't care" value, for example by trying a variety of representatives of a class that might otherwise be considered equivalent, we can broaden the coverage of the tests somewhat.

Once again, the goal of using orthogonal arrays in testing is to get the biggest bang for the testing buck, performing the smallest number of tests that are likely to expose defects. To keep the numbers low, the desired combinations are pairs of variables in each of their possible states, rather than combinations of three or more variables. This makes pairwise testing a kind of subset of orthogonal array testing. An orthogonal array isn't restricted to pairs (strength 2); it can be triples—strength 3, or combinations of three variables; or n-tuples—strength n, or combinations of n variables. However, orthogonal arrays with a strength greater than 2 are large, complex to build, and generally return too many elements to make them practicable. In addition, in the case of mixed alphabets, strictly orthogonal arrays produce more pairwise tests than we need to do a good job, so all-pairs testing uses nearly orthogonal arrays, which represents reasonable coverage and reduced risk at an impressive savings of time and effort.

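Reference: 1. [Pairwise Testing] http://www./pairwiseTesting.html. The Table 5 given in that article differs somewhat from what I obtained by applying the pairwise testing method; the table below is the test set I derived: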

Test	Destination	Class	Seat Preference
1	Canada	Coach	Aisle
3 (defect!)	USA	Coach	Aisle
6	USA	Business Class	Aisle
8	Mexico	First Class	Aisle
9	USA	First Class	Aisle
11	Mexico	Coach	Window
12 (defect!)	USA	Coach	Window
13	Canada	Business Class	Window
16	Canada	First Class	Window
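A quick way to sanity-check a hand-derived set like this one is to list any value pairs it leaves uncovered. The sketch below does that for the nine rows above. It assumes that each column's domain is exactly the set of values appearing in that column, which is an inference from the table rather than something stated here; `uncovered_pairs` is an illustrative name.

```python
from itertools import combinations, product

columns = ["Destination", "Class", "Seat Preference"]
suite = [
    ("Canada", "Coach", "Aisle"),
    ("USA", "Coach", "Aisle"),
    ("USA", "Business Class", "Aisle"),
    ("Mexico", "First Class", "Aisle"),
    ("USA", "First Class", "Aisle"),
    ("Mexico", "Coach", "Window"),
    ("USA", "Coach", "Window"),
    ("Canada", "Business Class", "Window"),
    ("Canada", "First Class", "Window"),
]

# Assumption: each domain is exactly the set of values that appears in its column.
domains = [sorted({row[i] for row in suite}) for i in range(len(columns))]

def uncovered_pairs(suite, domains):
    """Value pairs that no row of the suite exercises together."""
    missing = []
    for i, j in combinations(range(len(domains)), 2):
        seen = {(row[i], row[j]) for row in suite}
        for pair in product(domains[i], domains[j]):
            if pair not in seen:
                missing.append((columns[i], columns[j], pair))
    return missing

print(uncovered_pairs(suite, domains))
```

Under that assumption the check reports a single uncovered pair, Mexico with Business Class, so one further row (Mexico, Business Class, and either seat preference) would be needed for the set to cover every pair.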