Wednesday, October 03, 2007

Agile failure modes - again

Back in April and May I discussed failure modes in Agile software development. Well, I've found another. That is to say, I've found another way in which an Agile development practice can go wrong. I've seen this before but I didn't recognise it as a failure mode; I saw it as a little local difficulty. Now I've seen it a second time I understand it a bit better. The trouble with this failure is that at first it looks a lot like what should be happening.

One of the core ideas behind Lean and Agile is improving the quality of code. Improving code quality enables several other benefits. Improved quality means fewer bugs, so future development is less disrupted because there is less rework. Improved quality means less time spent in test at the end of a development. Thus short iterative cycles become possible. And improved quality means you can deliver (or just demo) any time. So quality improvement is important.

One way in which we seek to improve quality is through automated unit testing, usually undertaken as Test Driven Development or Test First Development. There is a whole host of reasons why this can be difficult, but that is another blog entry. With automated unit tests, fine-grained functionality can be tested very rapidly, which provides for rapid regression testing and increased confidence in new code.

This is where the failure comes in.

We actually want the tests to fail if we change something, because that prevents errors slipping into the system. So we add a test for new code, write the new code, and by the time we check in the code it is finished and the tests pass. But if later something changes which breaks the code, then we want the test to fail in order to catch the error that has been introduced. In this case we want the test to be fragile so that it catches anything which is now wrong.
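That red-green rhythm can be sketched in a few lines. This is a minimal, hypothetical example - the `discount` function and its figures are invented for illustration - but it shows the shape: the test is written first, fails until the code exists, and passes once it does.

```python
import unittest

# Hypothetical code under test: this function was written only
# after the test below already existed (and was failing).
def discount(price, percent):
    """Return the price reduced by the given percentage."""
    return price * (100 - percent) / 100

class DiscountTest(unittest.TestCase):
    def test_ten_percent_off(self):
        # Written first: red until discount() was implemented,
        # green afterwards, and it stays green as a regression check.
        self.assertEqual(discount(200, 10), 180)

if __name__ == "__main__":
    unittest.main()
```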

The problem occurs when the test is fragile in the wrong way. In the case I've just observed, the test was fragile when data in the database changed. Previously, when I saw this, the tests were fragile around COM. In both cases the test was not sufficiently isolated from its environment.

True dyed-in-the-wool Agile/TDD practitioners will immediately jump in to say: you shouldn't do it that way. And they are right. These fragility points should be addressed - whether by stubbing code out, using mock objects, or by tearing down and restoring some element, such as a database.
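As a sketch of the tearing-down-and-restoring option (again with an invented `customers` table): each test builds a private, in-memory database with known contents in `setUp` and discards it in `tearDown`, so the test no longer depends on shared, changing data.

```python
import sqlite3
import unittest

def count_active_customers(conn):
    """Code under test: count customers flagged as active."""
    return conn.execute(
        "SELECT COUNT(*) FROM customers WHERE active = 1"
    ).fetchone()[0]

class IsolatedCustomerTest(unittest.TestCase):
    def setUp(self):
        # Build a private, in-memory database with known contents,
        # so the test is isolated from any shared environment.
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE customers (name TEXT, active INTEGER)")
        self.conn.executemany(
            "INSERT INTO customers VALUES (?, ?)",
            [("Ann", 1), ("Bob", 1), ("Cal", 0)],
        )

    def tearDown(self):
        self.conn.close()  # restore: nothing persists between tests

    def test_active_customer_count(self):
        # Now the test only fails if the code is wrong, never because
        # someone else's data changed under it.
        self.assertEqual(count_active_customers(self.conn), 2)
```

The same isolation could equally be achieved with stubs or mock objects in place of the database connection; the point is that the test controls everything it depends on.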

As a consequence, instead of the tests helping you validate code and move faster, they become a hindrance. The need to update and change the tests regularly adds to the workload and complexity. Instead of the tests showing everything is good, they add to the maintenance burden - the exact opposite of what was desired.

The problem is that those trying to take the Agile/TDD approach have good intentions but they don't have the experience. And how do you get the experience without doing it?

Well, there are three answers here: first, the organization should have invested in training for these people. Second, the organization could have hired someone with experience to help them, perhaps as a part-time coach or perhaps as a full-time co-worker. And third, the management could have helped the teams learn faster and better, and encouraged them to reflect more on what they were doing.

Trouble is, Agile development is often introduced bottom-up, so people do just jump in. And even where management introduces it, such training and coaching can be expensive - that is, if you know who to call and where to buy it in the first place. (If you don't know who to call then let me know, I know people who do this stuff.)

So although it is well intentioned, the 'just jump in' / 'just do it' approach can get you started, but it can also start building up bigger problems further down the line. If you know what to look for you can spot the problems and - at least try to - take corrective action. If you don't, then it is quite likely you'll give the whole Agile experiment a bad name.