Theory driven development using Microsoft Pex Explorer

EDIT: NOTE that some of the contents of this post have been recently been edited thanks to great feedback from Peli de Halleux (http://blog.dotnetwiki.org/). He pointed me at a great post on Test Driven Development with Parameterized Unit Tests that he'd wrote here: http://blog.dotnetwiki.org/TestDrivenDevelopmentWithParameterizedUnitTests.aspx

I've been playing with Microsoft Pex to try out some theory-driven development. I've found it a great tool – actually much better than I was hoping, it's found more bugs, and test cases, than using just unit testing or TDD (Test-Driven Development). It's test-driven development on steroids!

If you've been reading this blog, you'll remember I blogged a while ago about TDD using theories, and what theories are – with the following posts:

http://taumuon-jabuka.blogspot.com/2008/03/comparing-theories-to-more-traditional.html

http://taumuon-jabuka.blogspot.com/2008/03/theories.html

I mentioned that theories could have data-points generated by some sore of explorer, and Jeff Brown (http://blog.bits-in-motion.com/) one of the comments replied that maybe I could use Microsof Pex to do so, and I've found it to be perfect! A lot more powerful than I hoped.

This blog post will try to detail my experiences, but by far the best way to get to grips with this is to download Pex and try writing some theories!

In the example below, I'll create a queue with the following operations:

Enqueue()

Dequeue()

Capacity

Count

It will have an initial capacity, and it will double in size once its capacity is exceeded (similar to the interface on .NET lists, for instance).

First, I'll skip over a few tests written via TDD (a red-bar green-bar refactor cycle), before I get onto writing some theories.

First, in MSTest, we write a test.

        [TestMethod]
        public void Test_NewlyConstructedQueue_CountIsZero()
        {
            Queue<int> queue = new Queue<int>();
            Assert.AreEqual<int>(0, queue.Count);
        }

Then we get the tests to compile, by implementing our queue with a Count property, such that we get a red bar, by getting the count property to return -1.

We then fix the test to get a green bar, but letting the Count return zero:

    public class Queue<T>
    {
        public int Count
        {
            get { return 0; }
        }
    }

We then introduce a test that default constructor creates stack with default stack of capacity 8.

        [TestMethod]
        public void Test_ConstructDefaultConstructor_InitialCapacityIsEight()
        {
            Queue<int> queue = new Queue<int>();
            Assert.AreEqual<int>(8, queue.Capacity);
        }

The implementation is pretty simple:

        private const int initialCapacity = 8;
        public int Capacity
        {
            get { return initialCapacity; }
        }

Now, want to add a Dequeue method, and test that calling it on an empty stack throws an InvalidOperationException.

        [TestMethod]
        [ExpectedException(typeof(InvalidOperationException))]
        public void Test_NewlyConstructedQueue_DequeueThrowsException()
        {
            Queue<int> queue = new Queue<int>();
            int i = queue.Dequeue();
        }

        public T Dequeue()
        {
            throw new InvalidOperationException("Should not call Dequeue on an empty queue");
        }

We're going to come back to look at this test shortly.

We add a test for Enqueue, to check its count is 1.

        [TestMethod]
        public void Test_NewlyConstructedQueue_Enqueue_CountIsOne()
        {
            Queue<int> queue = new Queue<int>();
            queue.Enqueue(1);
            Assert.AreEqual(1, queue.Count);
        }

Add an Enqueue method that does nothing to get this to compile and redbar. We actually have to add some implementation now, to get this to pass.

    public class Queue<T>
    {
        private const int initialCapacity = 8;
        private T[] items = null;

        public int Count
        {
            get { return items == null ? 0 : 1; }
        }

        public int Capacity
        {
            get { return initialCapacity; }
        }

        public T Dequeue()
        {
            throw new InvalidOperationException("Should not call Dequeue on an empty queue");
        }

        public void Enqueue(T item)
        {
            items = new T[initialCapacity];
            items[0] = item;
        }
    }

There are lots of things wrong with this implementation – there are some obvious bugs, we're recreating the array on each Enqueue, not to mention the array being a fixed size, with no capability to grow., and the weird Count implementation. This obviously needs implementing and refactoring, but it passes all our tests, and that's TDD – we don't go adding code until we drive it via adding more tests. So, lets add another couple of tests to drive the implementation:

        [TestMethod]
        public void Test_NewlyConstructedQueue_Enqueue_DequeueReturnsItem()
        {
            Queue<int> queue = new Queue<int>();
            queue.Enqueue(1);
            int item = queue.Dequeue();
            Assert.IsTrue(1 == item);
        }

And can get the tests to pass with the following:

        public T Dequeue()
        {
            if (items == null)
            {
                throw new InvalidOperationException("Should not call Dequeue on an empty queue");
            }
            return items[0];
        }

NOTE: We're really taking baby steps, but to ensure that we're not writing any untested implementation.

Now let's add a test for adding and removing more than one item:

        [TestMethod]
        public void Test_NewlyConstructedQueue_EnqueueTwoItems_DequeueReturnsFirstItem()
        {
            Queue<int> queue = new Queue<int>();
            queue.Enqueue(1);
            queue.Enqueue(2);
            int item = queue.Dequeue();
            Assert.IsTrue(1 == item);
        }

To get this to pass we can more property implement our queue.

    public class Queue<T>
    {
        private int tailIndex = 0; // insertion
        private int headIndex = 0; // removal
        private const int initialCapacity = 8;
        private T[] items = new T[initialCapacity];

        public int Count
        {
            get { return tailIndex - headIndex; }
        }

        public int Capacity
        {
            get { return initialCapacity; }
        }

        public T Dequeue()
        {
            if (Count == 0)
            {
                throw new InvalidOperationException("Should not call Dequeue on an empty queue");
            }
            return items[headIndex--];
        }

        public void Enqueue(T item)
        {
            items[tailIndex++] = item;
        }
    }

To get this to work we had to add more and more example tests – we were saying quite general statements, but expressing this in more and more specific example cases, adding functionality as we went. This process is called triangulation (as I mentioned in an earlier blog post).

There are still a number of limitations here: if we exceed 8 items in the queue, we'll get an IndexOutOfRange exception.

Can quite easily write a test for adding more values for the auto-growth.

Also, a more insiduous bug is that our values can migrate along the array – we're not moving items back towards the back (or front of the queue), so could run out of space even with less than 8 values – a much harder bug to find. This is a subtle bug, and to find it using a unit test would be quite tricky, if you didn't see it up front. USING PEX TO AUTOMATICALLY GENERATE THE THEORY VALIDATION IS GREAT FOR THESE TYPES OF BUGS!

Let's see if we could have written our test in such a way that this problem may have been detected.

        [PexMethod]
        public void Theory_CanAddRemoveAnyNumberOfItemsToNewlyConstructedQueue_IsEmptyAfterwards(int num)
        {
            Queue<int> queue = new Queue<int>();
            for (int i = 0; i < num; i++)
            {
                queue.Enqueue(i);
            }
            for (int i = 0; i < num; i++)
            {
                queue.Dequeue();
            }
            Assert.AreEqual<int>(0, queue.Count);
        }

Use the menu option “Run Pex Explorations”.

4 explorations (unit tests) are generated for this – passing in values of num 0, 1073741824, and 2 and 1.

0 passes.

1073741824 throws an IndexOutOfRangeException

2 throws an IndexOutOfRangeException

1 throws an AssertFailedException

This straight away shows that the code is more bugged than I thought. This would probably have been spotted on the introduction of further tests (for instance, test adding and removing three items from the list), but already highlights the power of theories.

Note: This could have been written as a unit test, but would have had to iterate over all possible integers, instead of just the 4 values that Pex provided.

Passing in a value of 1 – we had a similar test to this in enqueing and dequeing a single item, but we tested that the item dequeued was the same as that enqueued, we didn't test the Count property.

This bug is due to the fact that the headIndex should have been incremented, instead of decremented on dequeuing.

Fixing that and running our tests (including the previously generated ones – without reexploring) results in 9 of the 10 tests passing. The failing case is we're passing in a large in value.

This is obviously due to the fact that we've got a fixed size array, and it doesn't auto-grow.

Let's add an assumption to get our test to pass, before we go on to add the auto-grow functionality.

PexAssume.IsTrue(num <= queue.Capacity, "num is less than capacity");

In theories, assumptions allows us to state the input data for which the theory is valid.

We need to re-run the explorations so we're not calling our test with data points that fail the assumption.

Before we go to get our queue to expand its capacity, let's change our theory to see if it can detect the problem of running out of space after adding and removing items even before it goes out of capacity.

I originally wrote my theory in the format below, but see the second format and following comments, following feedback from Peli de Halleux.

        [PexMethod]
        public void Theory_CanAddRemoveAnyNumberOfItemsAnyEmptyQueue_IsEmptyAfterwards(
            [PexAssumeNotNull]Queue<int> queue, int num)
        {
            PexAssume.IsTrue(num <= queue.Capacity, "num is less than capacity");
            PexAssume.IsTrue(queue.Count == 0);
            for (int i = 0; i <= num; i++)
            {
                queue.Enqueue(i);
            }
            for (int i = 0; i <= num; i++)
            {
                queue.Dequeue();
            }
            Assert.AreEqual<int>(0, queue.Count);
        }

This was rewritten to:

        [PexMethod]
        [PexGenericArguments(typeof(int))]
        public void Theory_CanAddRemoveAnyNumberOfItemsAnyEmptyQueue_IsEmptyAfterwards<T>(
            [PexAssumeUnderTest]Queue<T> queue, [PexAssumeNotNull]T[] values)
        {
            PexAssume.IsTrue(values.Length <= queue.Capacity, "num is less than capacity");
            PexAssume.IsTrue(queue.Count == 0);
            for (int i = 0; i < values.Length; i++)
            {
                queue.Enqueue(values[i]);
            }
            for (int i = 0; i < values.Length; i++)
            {
                queue.Dequeue();
            }
            Assert.AreEqual<int>(0, queue.Count);
        }

The PexAssumeNotNullAttribute states that the theory is not valid for null values of that parameter, but the PexAssumeUnderTest, as well as stating that the parameter cannot be null, gives Pex more hints for tuning its search strategies.

This is a much more general theory – we're now saying given any empty queue, it will be empty after enqueueing and dequeueing.

We can instruct Pex how to construct a queue, that has had items added and removed:

        [PexFactoryMethod(typeof(Queue<int>))]
        public static Queue<int> Create(int numberInitialAdds,
            int numberInitialRemoves)
        {
            Queue<int> queue = new Queue<int>();
            if (numberInitialAdds <= 0 || numberInitialRemoves < 0)
            {
                return queue;
            }
            for (int i = 0; i < numberInitialAdds; ++i)
            {
                queue.Enqueue(i);
            }
            int numberToRemove = numberInitialAdds > numberInitialRemoves ?
            numberInitialRemoves : numberInitialAdds;
            for (int i = 0; i < numberToRemove; ++i)
            {
                queue.Dequeue();
            }
            return queue;
        }

Now, running the theory gives an IndexOutOfRangeException after adding and removing ight items to the queue in the factory – and then subsequently adding and removing one items. This should work – as we're not exceeding the capacity.

This is powerful – if we were unaware of the existence of this bug it's difficult to imagine how we would have come across it in TDD.

Let's fix the failing theory before we remove our assumptions.

        public T Dequeue()
        {
            if (Count == 0)
            {
                throw new InvalidOperationException("Should not call Dequeue on an empty queue");
            }
            T item = items[headIndex++];
            if (headIndex == tailIndex)
            {
                headIndex = tailIndex = 0;
            }
            return item;
        }

Now, the tests run. Let's go on to address the auto-grow issue.

Now, remove the assumption that values.Length <= queue.Capacity and re-explore this theory.

Get an index out of range exception when attempting an array of 15 items to the queue.

We can fix this by implementing auto-grow in the stack:

        public void Enqueue(T item)
        {
            if (tailIndex == items.Length - 1)
            {
                items = new T[items.Length * 2];
            }
            items[tailIndex++] = item;
        }

We can now re-run our tests and they all pass.

Now, deleting all Generated Unit Tests in the class, and re-running Pex Explorations, gives Path bounds exceeded messages – the time is being spent in the factory adding and removing items to the queue, and exploring the paths of auto-growing the stack.

Instead of getting our queue into the required states by manipulating it via its public interface, we can more-directly construct it into any state.

Remove the factory, and add an overloaded constructor.

        public Queue(T[] items, int headIndex, int tailIndex)
        {
            if ((uint)tailIndex >= (uint)(items.Length))
                throw new ArgumentException("(uint)tailIndex >= (uint)(items.Length)");

            this.items = items;
            this.headIndex = headIndex;
            this.tailIndex = tailIndex;
        }

And instruct Pex to use this constructor with an assembly attribute:

[assembly: PexExplorableFromConstructor(typeof(Queue<int>),
    typeof(int[]), typeof(int), typeof(int))]

Now, Pex is able to explore our theory, and finds no failing cases.

NOTE: In my original version of the theory, my theory took an int, the number of values to add to the queue, instead of the array of values, and it that case Pex would pass in a number of integers to add that would cause OutOfMemoryExceptions to occur. Instead, passing in the array of values means that Pex passes in more meaningful data.

The implementation of the queue is progressing; we're autogrowing the stack, but we're not copying the values across. We're also not updating the capacity. We need to check that we can enqueue or dequeue items,and we'd obviously write more theories to do this.

This does show that the thought process behind writing theories is similar to TDD – the tests/theories are only as good as you write, and it's still easy to miss obvious implentation (you still need to think about what you're writing), but I feel that writing theories, with an explorer, allows many more bugs to be caught during the process. Hopefully this has given a good taste of theory driven development.

Labels: Pex, TDD, Theories, Unit testing