Proving versus testing programs
One of the advantages of the purely-functional
paradigm over the imperative or object-oriented ones is that
functional programs are easier to reason about.
For example, consider the following distributivity
law between reverse
and ++
:
reverse (xs++ys) = reverse ys ++ reverse xs
We can prove that this property holds for all finite lists using only high-school algebra and induction. In fact, such techniques have been taught in first-year university courses on functional programming since the 1980s.
But properties can also be used for testing programs when you don't want to invest the effort to do a formal proof. QuickCheck is an excelent Haskell library that allows expressing properties about programs and testing them with randomly-generated values.
A property for QuickCheck is simply a truth-valued
function whose arguments are implicitly universally quantified.
The library exports function called quickCheck
that tests a property with 100 randomly-generated values (by default).
QuickCheck uses types to generate test data, starting with
short lists and small integers and generating progressively larger
values.
If the property fails for some value, a counter-example is shown;
otherwise the test succeeds.
Run the code bellow to test our distributivity law for reverse
.
import Test.QuickCheck
prop_revapp :: [Int] -> [Int] -> Bool
prop_revapp xs ys = reverse (xs++ys) == reverse xs ++ reverse ys
main = quickCheck prop_revapp
The tests failed! Can you spot what is wrong?
The expression in the right-hand side of prop_revapp
is incorrect: I've intentionally
switched xs with ys. Fix it and try running the tests again.
The failure of the test case above shows another cool QuickCheck feature: when a counter-example is found, QuickCheck attempts to shorten it before presenting using a shrinking heuristic. This is great because randomly-generated data contains a lot of uninformative "noise"; a short counter-example is much more useful for debugging our programs.
In the test above QuickCheck actually found the minimal
counter-example: the wrong equation still holds trivally
if either of the lists is empty
or if they contain identical values (e.g. if both lists are [0]
)
but fails for [0]
and [1]
.
A case-study: string splitting
Consider the following list processing problem I gave my first-year functional programming students:
Define a recursive function
split :: Char -> String -> [String]
that breaks a string into substrings delimited by a given character. Some usage examples:split '@' "[email protected]" = ["pbv","dcc.fc.up.pt"] split '/' "/usr/include" = ["", "usr", "include"]
Sugestion: use the
takeWhile
anddropWhile
functions from the standard Prelude.
The sugestion is to use some standard higher-order functions from the Prelude to break the string recursively. Here's an initial attempt at a solution:
split c [] = []
split c xs = xs' : if null xs'' then [] else split c (tail xs'')
where xs' = takeWhile (/=c) xs
xs''= dropWhile (/=c) xs
However, the above definition has one subtle mistake. Let us employ testing to spot and correct the bug.
Manual testing
Let us start by testing the defintion with a few sample cases:
split c [] = []
split c xs = xs' : if null xs'' then [] else split c (tail xs'')
where xs' = takeWhile (/=c) xs
xs''= dropWhile (/=c) xs
-- show
examples = [('@',"[email protected]"), ('/',"/usr/include")]
test (c,xs) = unwords ["split", show c, show xs, "=", show ys]
where ys = split c xs
main = mapM_ (putStrLn.test) examples
-- /show
Our two test cases gave the right answers.
At this point you might be able to guess some more interesting cases
to check for "edge" conditions (you may edit the examples
list above).
But another issue is figuring out what the right answers should
be for such cases. This is not too difficult
for a simple function, but can be much harder for more
complex problems --- even for experienced programmers.
Let us instead ask ourselves one question:
can we think of general properties that split
should satisfy?
Thinking about properties
We can devise a property for split
by consider the inverse
function of split
which I shall call unsplit
.
Here are some examples of what we want unsplit
to do:
unsplit '@' ["pbv", "dcc.fc.up.pt"] = "[email protected]"
unsplit '/' ["", "usr", "include"] = "/usr/include"
Generalizing from such examples we can define unsplit
as
the composition of
concat
and intersperse
(from `Data.List`).unsplit :: Char -> [String] -> String
unsplit c = concat . intersperse [c]
We can now express one interesting property:
if we first split and then unsplit with the same delimiter,
we should get back the original string (i.e. unsplit
is the left
inverse of split
).
This is expressed as the following QuickCheck property:
prop_split_inv c xs = unsplit c (split c xs) == xs
You should convince yourself that the property above should hold for all characters c and strings xs regarless of how many times c occurs in xs (including zero occurences).
A first attempt
We can now attempt to test prop_join_split
using QuickCheck.
import Test.QuickCheck
import Data.List (intersperse)
split :: Char -> String -> [String]
split c [] = []
split c xs = xs' : if null xs'' then [] else split c (tail xs'')
where xs' = takeWhile (/=c) xs
xs''= dropWhile (/=c) xs
unsplit :: Char -> [String] -> String
unsplit c = concat . intersperse [c]
-- show
prop_split_inv c xs = unsplit c (split c xs) == xs
main = quickCheck prop_split_inv
-- /show
Most likely you have observed that 100 random tests passed (or you may have hit a counter-example if you were lucky). In any case, we should be aware of Dijkstra's remark that testing shows only the presence of bugs, not their absence!
For random testing it is important to ensure that the
test data is reasonably distributed, otherwise we
simply get a false sense of security.
QuickCheck allows us to collect information on data distribution easily:
for example, let us collect the lengths of the split
results.
import Test.QuickCheck
import Data.List (intersperse)
split :: Char -> String -> [String]
split c [] = []
split c xs = xs' : if null xs'' then [] else split c (tail xs'')
where xs' = takeWhile (/=c) xs
xs''= dropWhile (/=c) xs
unsplit :: Char -> [String] -> String
unsplit c = concat . intersperse [c]
-- show
prop_split_inv c xs
= let ys = split c xs in
collect (length ys) $ unsplit c ys == xs
main = quickCheck prop_split_inv
-- /show
The report shows that around 90% of the strings are split into lists of length 0 or 1. This is because when both the delimiter c and string xs are choosen independently it is unlikely that c occurs in xs, and if so, more than once. The property might still fail for other cases, which means we haven't tested our program thoroughly.
More thorough testing
To conduct a more thorough testing we should modify our property slightly: select delimiters that do occur in the string. We can do so with an explicit quantifier for choosing the delimiter.
import Test.QuickCheck
import Data.List (intersperse)
split :: Char -> String -> [String]
split c [] = []
split c xs = xs' : if null xs'' then [] else split c (tail xs'')
where xs' = takeWhile (/=c) xs
xs''= dropWhile (/=c) xs
unsplit :: Char -> [String] -> String
unsplit c = concat . intersperse [c]
-- show
prop_split_inv xs
= forAll (elements xs) $ \c ->
unsplit c (split c xs) == xs
main = quickCheck prop_split_inv
-- /show
This time we found a short counter-example very quickly where the string contains a single delimiter character.
split 'a' "a" = [""]
unsplit 'a' [""] = ""
This is a counter-example to our property because "a" /= ""
. But what should we
modify to fix the problem? In general, we would have three alternatives:
- modify the definition of
unsplit
; - modify the definition of
split
; - revise the property
prop_split_inv
to exclude such cases.
In our case we can reject #1 and #3 because unsplit
and prop_split_inv
are short, self-evident definitions; hence we
will fix the definition of split
.
Fixing split
It it easy to see that our property will hold in the
counter-example case found if split
gave the following result:
split 'a' "a" = ["",""]
This implies that the equation defining split
for the empty list
is wrong; it should instead be:
split c [] = [""]
Better still: we can just delete this equation altogether because the general case will give this result anyway after a single recursion step. With this correction our solution passes all tests.
import Test.QuickCheck
import Data.List (intersperse)
unsplit :: Char -> [String] -> String
unsplit c = concat . intersperse [c]
-- show
split :: Char -> String -> [String]
split c xs = xs' : if null xs'' then [] else split c (tail xs'')
where xs' = takeWhile (/=c) xs
xs''= dropWhile (/=c) xs
prop_split_inv xs
= forAll (elements xs) $ \c ->
unsplit c (split c xs) == xs
main = quickCheck prop_split_inv
-- /show
Note that, considered in isolation,
both definitions split c []=[]
and split c []=[""]
seem acceptable (our property holds for the empty string
in both cases); the first definition may appear simpler.
It is only when we consider the recursion as a whole that we
see that defining split c []=[]
is less orthogonal
because it breaks our property for strings
terminated by the delimiter character.
Conclusion
QuickCheck is a great way to test Haskell programs, not just because
it allows conducting many tests cheaply, but mostly because
it encorages thinking about general properties that our code
should satisfy rather than just specific instances.
Thinking early about properties leads
to a cleaner, more orthogonal design (i.e., with
fewer arbitrarily-specified corner cases).
There is a lot more about QuickCheck that I have not covered in this tutorial namely:
- writing custom generators and shrinking heuristic for user-defined data types;
- controling the size of generated data;
- controling the number of tests performed;
- testing monadic code (e.g. in the IO or ST monad).
For more information check the original QuickCheck paper
or the one on monadic testing; you should also lookup
the QuickCheck library documentation
.