Functions and Classes in Python

Advanced testing

Overview:

  • Teaching: 10 min
  • Exercises: 10 min

Questions

  • How do I test something new?
  • What if the code I am writing has random behaviour?

Objectives

  • Understand how to compare with known analytic/literature results.
  • Know that the more you can test, the more confidence you can have in your code, even if it is random.

Testing Random

Up till now we have been testing functions where the output is entirely predictable. In these cases, a handful of tests is usually enough to provide confidence that the software is working as expected. In the real world, however, you might be developing a complex piece of software to implement an entirely new algorithm or model. In certain cases it might not even be clear what the expected outcome is meant to be. Things can be particularly challenging when the software involves a stochastic element.

Let us consider a class to simulate the behaviour of a dice. One is provided in the dice package. Let's import it and see how it works.

In [1]:
from dice import Dice
help(Dice)
Help on class Dice in module dice.dice:

class Dice(builtins.object)
 |  Dice(n=6, seed=None)
 |  
 |  A simple class for an n-sided fair dice.
 |  
 |  Methods defined here:
 |  
 |  __init__(self, n=6, seed=None)
 |      Construct a n-sided dice.
 |      
 |      n -- The number of sides on the dice.
 |  
 |  lastRoll(self)
 |      Return the value of the last dice roll.
 |  
 |  roll(self)
 |      Roll the dice and return its value.
 |  
 |  sides(self)
 |      Return number of sides of the dice.
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)

How could we test that the dice is fair?

Well, first of all we could check that the value of a dice roll is in range.

# dice/test/test_dice.py

# Imports at the top of the file, shared by all of the tests below.
import pytest
from dice import Dice

def test_valid_roll():
    """ Test that a dice roll is valid. """

    # Initialise a standard, six-sided dice.
    dice = Dice()

    # Roll the dice.
    roll = dice.roll()

    # Check that the value is valid.
    assert roll > 0 and roll < 7
In [2]:
!pytest dice/test/test_dice.py::test_valid_roll
============================= test session starts ==============================
platform linux -- Python 3.6.3, pytest-3.2.1, py-1.4.34, pluggy-0.4.0
rootdir: /home/rjg20/training/arc-training/intermediate/github-testing-ci/nbplain, inifile:
collected 1 item                                                                

dice/test/test_dice.py .

=========================== 1 passed in 0.01 seconds ===========================

Great, that worked. Although, it could just be a fluke...

In practice, we need to check that the assertions hold repeatedly.

# dice/test/test_dice.py
def test_always_valid_roll():
    """ Test that a dice roll is "always" valid. """

    # Initialise a standard, six-sided dice.
    dice = Dice()

    # Roll the dice lots of times.
    for i in range(0, 10000):
        roll = dice.roll()

        # Check that the value is valid.
        assert roll > 0 and roll < 7
In [3]:
!pytest dice/test/test_dice.py::test_always_valid_roll
============================= test session starts ==============================
platform linux -- Python 3.6.3, pytest-3.2.1, py-1.4.34, pluggy-0.4.0
rootdir: /home/rjg20/training/arc-training/intermediate/github-testing-ci/nbplain, inifile:
collected 1 item                                                                

dice/test/test_dice.py .

=========================== 1 passed in 0.07 seconds ===========================

Okay, that's better. Or is it...

xkcd: random

Not again!

Perhaps we should test the average value. We know that this should equal the sum of the faces of the dice, divided by the number of sides, i.e. 3.5 for a six-sided dice.
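
As a small worked step, assuming a fair n-sided dice where each face k = 1, ..., n comes up with probability 1/n, the expected value of a single roll is:

$$\mathrm{E}[X] = \sum_{k=1}^{n} k \cdot \frac{1}{n} = \frac{1}{n} \cdot \frac{n(n+1)}{2} = \frac{n+1}{2}$$

For n = 6 this gives 7/2 = 3.5, which is the value the test should compare against.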

# dice/test/test_dice.py
def test_average():
    """ Test that the average dice roll is correct. """

    # Initialise a standard, six-sided dice.
    dice = Dice()

    # Work out the expected average roll.
    exp = sum(range(1, 7)) / 6

    # Set the number of rolls.
    rolls = 100000

    # Calculate the sum of the dice rolls.
    total = 0
    for i in range(0, rolls):
        total += dice.roll()

    # Check that the average matches the expected value.
    average = total / rolls
    assert average == pytest.approx(exp, rel=1e-2)
In [4]:
!pytest dice/test/test_dice.py::test_average
============================= test session starts ==============================
platform linux -- Python 3.6.3, pytest-3.2.1, py-1.4.34, pluggy-0.4.0
rootdir: /home/rjg20/training/arc-training/intermediate/github-testing-ci/nbplain, inifile:
collected 1 item                                                                

dice/test/test_dice.py .

=========================== 1 passed in 0.19 seconds ===========================

Good... Hang on, hold your horses!

In [5]:
(1 + 3 + 4 + 6) / 4
Out[5]:
3.5
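
The arithmetic above can be turned into a concrete counterexample. The sketch below is only an illustration: LoadedDice is a hypothetical stand-in (it is not part of the dice package) that never rolls a 2 or a 5, built on Python's standard random module. It would pass both the range check and the average check used so far.

```python
import random


class LoadedDice:
    """Hypothetical broken dice: it never rolls a 2 or a 5."""

    def __init__(self, seed=None):
        # A private generator, so that seeding does not affect other code.
        self._rng = random.Random(seed)

    def roll(self):
        # Only four of the six faces can ever come up.
        return self._rng.choice([1, 3, 4, 6])


dice = LoadedDice(seed=42)
rolls = [dice.roll() for _ in range(100000)]

# Every roll is in range, so the range test would pass.
in_range = all(1 <= r <= 6 for r in rolls)

# The average is still close to 3.5, so the average test would pass too.
average = sum(rolls) / len(rolls)

# Yet two faces never appear at all.
missing = sorted(set(range(1, 7)) - set(rolls))

print(in_range, round(average, 2), missing)
```

Neither of the earlier checks looks at how often each individual face appears, which is why this dice slips through.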

Dang! We need to test that the distribution of outcomes is correct, i.e. that each of the six possible outcomes is equally likely.

# dice/test/test_dice.py
def test_fair():
    """ Test that a dice is fair. """

    # Initialise a standard, six-sided dice.
    dice = Dice()

    # Set the number of rolls.
    rolls = 1000000

    # Create a dictionary to hold the tally for each outcome.
    tally = {}
    for i in range(1, 7):
        tally[i] = 0

    # Roll the dice 'rolls' times.
    for i in range(0, rolls):
        tally[dice.roll()] += 1

    # Assert that the probability is correct.
    for i in range(1, 7):
        assert tally[i] / rolls == pytest.approx(1 / 6, rel=1e-2)
In [6]:
!pytest dice/test/test_dice.py::test_fair
============================= test session starts ==============================
platform linux -- Python 3.6.3, pytest-3.2.1, py-1.4.34, pluggy-0.4.0
rootdir: /home/rjg20/training/arc-training/intermediate/github-testing-ci/nbplain, inifile:
collected 1 item                                                                

dice/test/test_dice.py .

=========================== 1 passed in 2.09 seconds ===========================

Phew, thank goodness! Testing is hard.
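
Why one million rolls and a relative tolerance of 1e-2? One way to reason about it, sketched here under the assumption that rolls are independent and each face has probability p = 1/6, is to look at the standard error of an observed frequency, sqrt(p(1 - p) / rolls), relative to p. The function name relative_standard_error is ours, chosen for illustration.

```python
import math


def relative_standard_error(p, rolls):
    """Standard error of an observed frequency, as a fraction of p."""
    return math.sqrt(p * (1 - p) / rolls) / p


p = 1 / 6
tolerance = 1e-2  # the relative tolerance used in test_fair

for rolls in (1_000, 100_000, 1_000_000):
    rse = relative_standard_error(p, rolls)
    # How many standard errors fit inside the tolerance?
    print(f"{rolls:>9} rolls: relative standard error = {rse:.4f}, "
          f"tolerance = {tolerance / rse:.1f} standard errors")
```

With one million rolls the tolerance is roughly 4.5 standard errors wide, so a fair dice should only very rarely fail the test. With one thousand rolls the tolerance is a small fraction of a standard error, and a perfectly fair dice would fail most of the time. The number of rolls and the tolerance therefore need to be chosen together.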

Exercises

1

The file dice/test/test_dice.py contains an empty function, test_double_roll, for checking that the distribution for the sum of two six-sided dice rolls is correct. Fill in the body of this function and run pytest to verify that your test passes.

Hints:

For any two n-sided dice, the probability of the sum of two rolls being a value of x is given by:

$$p(x) = \frac{n - |x - (n+1)|}{n^2},\quad\mathrm{for}\ x=2\ \mathrm{to}\ 2n$$

We've provided a helper function called prob_double_roll(x, n) that will calculate this probability for you, i.e.

prob = prob_double_roll(4, 6)

will return the probability of rolling a sum of 4 with two six-sided dice.
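
If you would like to convince yourself that the formula is right before relying on it, the sketch below compares it against a brute-force count of all n × n equally likely outcomes. Both function names are hypothetical and written only for this check; the first is a re-implementation of the formula above and not necessarily how the provided prob_double_roll helper is written.

```python
from itertools import product


def double_roll_probability(x, n):
    """Closed-form probability that two n-sided dice sum to x."""
    if x < 2 or x > 2 * n:
        return 0.0
    return (n - abs(x - (n + 1))) / n**2


def enumerated_probability(x, n):
    """The same probability, found by counting all n * n outcomes."""
    faces = range(1, n + 1)
    hits = sum(1 for a, b in product(faces, repeat=2) if a + b == x)
    return hits / n**2


n = 6
for x in range(2, 2 * n + 1):
    # The formula and the brute-force count should agree for every sum.
    assert abs(double_roll_probability(x, n) - enumerated_probability(x, n)) < 1e-12

# The probabilities over all possible sums should add up to one.
total = sum(double_roll_probability(x, n) for x in range(2, 2 * n + 1))
print(total)
```

For example, a sum of 4 with two six-sided dice can be made in three ways (1+3, 2+2, 3+1), which matches the formula's value of 3/36.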

2

Parametrize your test so that it works for any pair of n-sided dice. Test it using pairs of five- and seven-sided dice.

Key Points

  • Just because you are implementing a new algorithm doesn't mean it can't, or doesn't need to, be tested.
  • If you think carefully about the problem, you can test and verify the random behaviour of functions.