20110322

Using deterministic generators with Quickcheck

The 0.6 release of Quickcheck supports deterministic generators. The goal is to make the generation of values reproducible. This is useful when you are working on a bug reported by your favourite continuous integration server, or when you would like to run a piece of code in the debugger repeatedly with the same results.

A non-goal of this support is to remove the random nature of Quickcheck. Values are still random to allow good coverage, but reproducibility is available when needed. This way you have the best of both worlds.

Internally, Quickcheck uses the linear congruential random number generator (RNG) implemented in Java's Random class. The property of this RNG that matters for reproducible values is stated in its javadoc:

If two instances of Random are created with the same seed, and the same sequence of method calls is made for each, they will generate and return identical sequences of numbers.
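
To see this property in action with plain java.util.Random (a small, self-contained example, not part of the Quickcheck API):

import java.util.Random;

public class SameSeed {
  public static void main(String[] args) {
    Random a = new Random(42);
    Random b = new Random(42);
    // Same seed and the same sequence of calls: identical values.
    for (int i = 0; i < 5; i++)
      System.out.println(a.nextInt() == b.nextInt()); // prints "true" five times
  }
}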

You can configure the seed used by Quickcheck with the RandomConfiguration class. It's important to set the seed for every individual test method, otherwise the RNG's return values depend on the execution order of the test methods. If you run different tests, add a new test, or execute the tests in a different order, other values will be generated.

During normal execution the seed is generated randomly; this is the result of the RandomConfiguration.initSeed method call. This way Quickcheck still produces random values. Use the setSeed method to set the seed for a test method.
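
A minimal sketch of fixing the seed before a test runs, assuming the static RandomConfiguration.setSeed(long) method mentioned above (the value 42 is arbitrary):

@Before public void fixSeed(){
  // Assumption: setSeed is the static seeding method described above; any constant works.
  RandomConfiguration.setSeed(42L);
}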

Instead of using the RandomConfiguration directly, you should use the SeedInfo JUnit method rule, which runs with every test method. Additionally, it adds the seed information needed to reproduce the problem to the AssertionError thrown.

SeedInfo can be used like every other JUnit method rule: it's added as a member of the test class. The following example generates values in a way that the assertion always fails.

@Rule public SeedInfo seed = new SeedInfo();

@Test public void run(){
  Generator<Integer> unique = uniqueValues(integers());
  assertEquals(unique.next(), unique.next());
}

An example error message is:
java.lang.AssertionError: expected:<243172514> but was:<-917691317> (Seed was 3084746326687106280L.)
You can also use the SeedInfo instance to set the seed for a test method, to reproduce the problem reported in the AssertionError.

@Rule public SeedInfo seed = new SeedInfo();

@Test public void restore(){
  seed.restore(3084746326687106280L);
  Generator<Integer> unique = uniqueValues(integers());
  assertEquals(unique.next(), unique.next());
}

Instead of setting the seed for individual tests, you can also set the initial seed once for the random generator used by the JVM. If you run the test example from above (without the SeedInfo method rule member) with the configuration -Dnet.java.quickcheck.seed=42:

@Test public void run(){
   Generator<Integer> unique = uniqueValues(integers());
   assertEquals(unique.next(), unique.next());
}

You should get the result:
java.lang.AssertionError: expected:<977378563> but was:<786938819>
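
How you pass the system property depends on how the tests are started. From the command line it could look like this (the test class name RunTest and the classpath placeholder are illustrative, not part of the original example):

java -Dnet.java.quickcheck.seed=42 -cp <test classpath> org.junit.runner.JUnitCore RunTest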

The configuration of seed values replaces the serialization and deserialization support of earlier Quickcheck versions. Setting the seed is a much simpler way to reproduce values over multiple JVM executions.

20110321

Revert - Sometimes going back is the way forward

Revert is the reverse gear of your version control software. It removes all local changes and brings the local workspace back to the clean state of a committed revision. It is an important tool in the version control toolbox. Once in a while there is no way forward, so you have to go backward to make progress.

This may sound unintuitive. We are trying to make a change, not reverse it. Why should the tool that destroys all this hard work be the best option in some circumstances? Firstly, you do not lose everything. Even if you revert everything you gain some knowledge: at the very least, that this exact way does not work. This is a good data point. Secondly, and more obviously, revert lets you start with a fresh state. More often than not we are able to reach a working state again, and removing everything is the fastest way to get there.

I see mainly two scenarios for the revert command: planned mode and accidental mode.

Planned mode revert

Starting with a working state of your committed source code, you can do some exploratory work. Find out what you were looking for and revert.

Now you can start the work in an informed way from a working state. The artifacts of the exploration are removed. After reverting you know that the state you are starting from works. To verify that a workspace state works you need tools to catch problems: decent test coverage and other quality assurance measures.

A corollary is that, because you are planning to revert anyway, you can change your workspace in any way you need for the exploration.

Accidental mode revert

The first scenario was a bit too idyllic: you started your work with an exploratory mindset, found the precious information and cleaned up after yourself. Everything is planned, clean and controlled. This scenario is valid; you can do the exploratory work voluntarily. More often, though, it is the case that you have dug yourself in and you need to find a way out.

Is this a hole or the basement?

The first issue is to know when you're in a hole and there is little chance to get out.

Say you commit roughly every hour, but now you have not committed for four hours. Your change set becomes bigger and bigger. You see no way to get your tests running again. Different tests are broken after multiple attempts to fix everything. You're in a hole.

You made a change and it resulted in absolutely unexpected problems. Your tests are broken. You do not know why. There are red lights all over the place. You're in a hole.

You made small, controlled, incremental changes for some time without committing. You did not bother to commit because everything was so simple. Now that the changes have become bigger you would like to commit, but you can't because you can't get the whole system running again. You are in a hole.

What the three examples have in common is that you're not in control of the process: the world you created determines your next steps. This happens to everyone. It's normal. It happens all the time. Otherwise our work would be predictable day in and day out - how boring. (I would go so far as to say that in other circumstances it's a good sign that you can follow the inherent conclusions of your system. It's productive to be guided by the conclusions of your system, because it is consistent.)

If there is such a thing as experience in hole digging, it's the ability to see the problem coming and to stop early. If it has happened often enough to you, you'll know the signs. You'll know that knee-deep holes are deep enough to stop, and that it's not necessary to disappear completely.

Ways out

Now, after you have found out that you have a problem, all energy should be put into it. Don't try to be too smart. Solve this one problem. You have two options to get out of the hole: fix the current state or revert.

Fixing the current state can work: you find enough information to fix the problem, and you lose some time but nothing of your work. Once the current state works it's a good idea to commit right away. This creates a save point; if there are more problems lurking down the road you can always come back to this state. The risk is that you might not find the cause. Finding a way out now is hard. Your change set adds to the complexity of the underlying problem; your changes obfuscate it and make it harder to analyze. Everything you do will increase the change set's complexity further.

When fixing the current state is too hard, you have to revert your work to keep up the pace. Now you face the fact that you have already sunk so much time into it and the next step is to roll everything back to the state you started from. This does not feel pleasant. The upside is that even though you reverted the code, not everything is lost. You still have more knowledge about the problem, and this knowledge can be used on the second and hopefully last attack. Make notes if you need them to remember the information you gathered.

The first attempt was in the wrong direction and/or too big. It is a good idea to make smaller steps with interim commits to create save points you can revert to. This creates a safety net if you bump into problems again. You can revert repeatedly to chop off smaller portions of the problem until it is solved. You decrease the size of the changes until you can understand the problem. Once in a while strange things happen and a single-line change has crazy effects. After removing such road blocks you can make bigger steps again.

There is of course a middle way: reverting only partially. Without creating and applying patches you have only one direction to go (revert), and you'll swiftly end up reverting everything (because your change history is lost). I'll come back to an approach that uses diff and patch to do partial reverts in a controlled way later.

Bringing the costs of reverts down

The problem with reverts is that they are expensive. Work you've already done is removed from the source tree. Not something we are especially proud of.

The problem is only as big as the change set that is flushed down the toilet. You should commit as often as your infrastructure allows: the execution time of tests and the integration costs are the main factors here. (You can move some of the cost into the continuous integration environment you're using.) As always this is a trade-off between the work lost and the overhead created by frequent commits. Committing every hour is probably a good idea. Just do whatever fits your needs best.

The other factor is the right attitude to the revert operation. If you have already spent a lot of time on a problem and could not find a fix, it's likely you won't find it in this direction and a fresh approach is needed. You can actually save a lot of effort by aborting this failed attempt. This will also bring down the total cost of an inevitable later revert.

Conclusion

Failed attempts are not the problem. We have to learn from our failures; they are too frequent and too valuable to lose. Failing is okay. Samuel Beckett put it nicely:

Ever tried. Ever failed. No matter. Try again. Fail again. Fail better.