Friday, March 30, 2012

Agile: Where's the evidence?

A few weeks ago I was presenting at the BCS SIGIST conference - another outing for my popular Objective Agility presentation. Someone in the audience asked: “Where is the evidence that Agile works?”

My response was in two parts. First, although it sounds like a reasonable question, I’ve come to believe that this is a question asked by those who don’t believe in Agile, those who want to stall things. It is rarely a question aimed at a rational decision.

Second I said: let’s turn the question around. Where is the evidence for Waterfall? As far as I know there is none, although there are plenty of cases of failed projects.

Now this was a pretty flippant answer; I was in show mode. Someone later asked: “Do you really recommend we ridicule people who ask for evidence?”, to which the answer is certainly no.

So let me try and answer the question more seriously.

Let’s start with “Agile.” How do we define Agile? Scrum? XP? Where does Agile start and Lean begin, or vice-versa? Or, as I believe, is Agile a form of Lean?

We have a definition problem. Agile is poorly defined. I e-mailed a friend of mine who is undertaking a PhD in Architecture and Agile and asked him: Can you please point me at a literature review?

(Notice, if I was a serious researcher I would closet myself in the library for days or weeks and do the literature review myself. I’m not, and I need to earn money, so I didn’t.)

His answer: “that's an interesting question because a lot of research simply assumes that agile (as a whole) is better, without any evidence for it.”

Michael also pointed out there is a context issue here. Embedded systems? Web development? Finance? Health care?

And how do you define better? Better architecture? Better requirements? Better code quality?

And Better compared to what? Chaos? Good waterfall? Bad waterfall? CMMI level 1? 2? 3? 4? 5?

You see, context.

Despite this, one study claimed Scrum resulted in productivity improvements of as much as 600% - Benefield, “Rolling Out Agile in a Large Enterprise”. I’ve even heard Jeff Sutherland verbally claim Scrum can deliver 1000% improvement. To be honest I don’t believe these figures. If they are possible then I think it says something about the chaotic state the organisations started in. In these cases my guess is following any process or practice would be an improvement. Standing on one leg every morning would probably have generated a 50% improvement alone.

There is a trap for Agile here. Much traditional work has defined better as: on schedule/time, on budget/cost, with the desired features/functionality. But Agile doesn’t accept that as better; Agile negotiates over features/functionality and aims for business value. A report from Cranfield University a few years ago suggested that much traditional IT work failed to truly capture business value because people focused on time, budget and features.

Then there is a question of bugs. Traditional development has been very accepting of bugs; Agile isn’t.

Maybe asking for evidence about Agile is aiming for too much. Maybe we should look at the practices instead. Here there is some evidence.

Over the years there have been various studies on pair programming which have been contradictory. Since most of these studies have been conducted on students you might well question the reliability.

Test Driven Development is clearer. As I blogged two years ago there is a study from Microsoft Research and North Carolina State University which is pretty conclusive on this: TDD leads to vastly fewer bugs.

Keith Braithwaite has also done some great work looking at Cyclomatic Complexity of code and there seems to be a correlation between test coverage and better (i.e. lower) cyclomatic complexity. Keith is very careful to point out that correlation does not imply cause, although one might hypothesise that test driven development leads to “better” design.

TDD and source code are relatively easy to measure. I’m not really sure how you would determine whether Planning Meetings or User Stories worked. Maybe you might get somewhere with retrospectives. Measure how long a team spends in retrospectives, see if the following iteration delivers more or less as a result.
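
If you did want to try the retrospective measurement, a minimal sketch in Python might look like the following. Everything specific in it is an assumption of mine: the numbers are invented for illustration, the function name `pearson` is hypothetical, and "story points delivered" is only one possible stand-in for what an iteration delivers. As with the test coverage work, a correlation here would not show cause.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical data, invented for illustration only:
    # minutes spent in each retrospective ...
    retro_minutes = [30, 45, 60, 90, 40, 75]
    # ... and story points delivered in the iteration that followed it.
    points_next_iteration = [18, 21, 24, 27, 20, 25]
    r = pearson(retro_minutes, points_next_iteration)
    print(f"correlation between retrospective time and next delivery: {r:.2f}")
```

A value near +1 or -1 would suggest the two move together; a value near 0 would suggest no linear relationship. With only a handful of iterations, and everything else changing at the same time, I would treat any result as a prompt for discussion rather than evidence.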

For their book Organizational Patterns of Agile Software Development Coplien and Harrison spent over 10 years assessing teams. This led to a set of patterns which describe much of Agile software development. This is qualitative, or grounded, research rather than quantitative but is just as valuable.

At this point I probably should seclude myself in the British Library for a week to research each practice. Maybe I should, or maybe you want to call me a charlatan. Next year, maybe.

If we switch from hard core research to anecdotal evidence and case studies things become easier. As many readers know I’ve been working with teams in Cornwall for over 18 months. On my last visit we held a workshop with the leaders of the companies and software teams. Without naming names some comments stood out here:
  • “The main benefit [of Agile] was time to market... I don’t know how we would have done it without Agile”
  • “Agile has changed the way we run the company”
  • “It is hard to imagine a world without Agile”
That last company is now finding their source code base is shrinking. As they have added more and more automated tests the design has changed and they don’t see the need for vast swaths of code. Is that success? Each line of code is now far more expensive and they have less. (Heaven only knows what it does to function point analysis.)

If you want something a little more grounded there was a recent Forrester report (sorry, I can’t afford the full report) which said: “Agile enables early detection of issues and mid-course corrections because it delivers artefacts much sooner and more often. Finally, Agile improves IT alignment with business goals and customer satisfaction.”

Before this there was a Gartner report which said: “It's a fact that agile efforts differ substantially from waterfall projects. It's also a fact that agile methods provide substantial benefits for appropriate business processes.” (Agile Development: Fact or Fiction, 2006).

But I’m back to that nebulous thing “Agile.”

All in all I can’t present you with any clear-cut evidence that “Agile works”. I think there is enough evidence to believe that Agile might be a good thing and deserves a look.

What I haven’t done is look for, let alone present, any evidence that Waterfall works.

Frankly I don’t believe Waterfall ever worked. Full stop. Winston Royce who defined “the waterfall” didn’t so why should I? (You have to read to the end of the paper.)

OK, sometimes, to save myself from a boring conversation or to cut to the chase I’m prepared to concede that: Waterfall worked in 1972 on developments using Cobol on OS/360 with an IMS database. But this isn’t 1972, you aren’t working on OS/360 and very few of you are using Cobol with IMS.

(I have seen inside a z/OS Cobol IMS team more recently and something not unlike waterfall kind of worked, but the company concerned found it expensive, slow and troublesome and wanted them to be “Agile.”)

Even if I could present you with some research that showed Agile, or Waterfall, did work, it is unlikely that the context for that research would match your context.

(Thus, if you do want to stall the Agile people in your company, first ask “Where is the evidence Agile works?”. When they produce it ask “How does this relate to our company? This is not our technology/market/country/etc. etc.”)

I think it might come down to one question: Is software development a defined process activity or an empirical process activity?

If you believe you can follow a defined process and write down a process to follow and the requirements of the thing you wish to build then waterfall is probably for you. On the other hand, if you believe the requirements, process, technology and a lot else defy definition then Agile is for you.

Finally, the evidence must be in the results. Your organisation, your challenges, your context, are unique to you. So my suggestion is: make your own evidence.

Set up two similar teams. Tell one to work Waterfall and one to work Agile and let them get on with it. Come back every six months and see how they are doing.

Anyone want to give it a try?

And if anyone knows of any research please please please let me know!

(Many thanks to Michael Waterman for a very thoughtful e-mail which filled in some of the missing pieces for this blog.)

6 comments:

  1. Jeff Sutherland presented some productivity measures based on function points in http://jeffsutherland.com/SutherlandDistributedScrumHICSS2007_v6_7_Jun_2006.pdf

  2. Great post, Allan! It's disturbing that people seem to mainly use data to support ideas they already hold, not to reason by.

    Indeed "where is the evidence for Waterfall?" The best answer I've seen is in "Agile and Iterative Development" by Craig Larman. On pp. 102 - 105 he describes how Winston Royce was actually recommending iterative development in his 1970 paper that got commonly mis-read as advocating a single-pass waterfall process!

    But that's not even the most incredible part of the story. Craig found and spoke with the original author of DOD STD 2167. That's the one that institutionalized waterfall for the US military. I quote: "He expressed regret for the creation of the rigid single-pass waterfall standard…in hindsight, he said he would have made a strong iterative & incremental development recommendation, rather than what was in 2167."

    I can vouch for some data concerning Agile processes. With my early Agile team in 1999 - 2001, I wanted us to try this set of practices but I didn't want to have only anecdotes after all was said and done; I wanted to have real metrics.

    At the start of our project, I persuaded the members of my team to cooperate in understanding where our time was going and what results we were getting. We did 3 years of sustained Agile development work for a challenging embedded system. We measured about 3 times more output than comparable teams, and we averaged about 17 bugs per year. There were never more than 2 open defects at any given time. This quality level was far beyond anything I had seen before in software development!

    I wrote up an experience report on it (given at the Agile conference some years ago) which is available here:
    http://www.leanagilepartners.com/publications.html
    The title is "Embedded Agile Project by the Numbers With Newbies"

    It took a long time but I eventually found credible industry data to compare our numbers with. I found it hard to believe we were that much more productive, but I knew how all our data was created and there was no funny business going on. We simply wanted to understand the reality, not to promote (or bash) any idea or practice.

    Best of all, others can use the benchmark numbers to see how their teams compare. Some of the most useful data I found was from Capers Jones. He looked at the comparison I did, and agreed that the analysis technique is valid.

    You're right that it's important for people to go collect their own data and reflect on what it means. Thanks for bringing this up.

    - Nancy Van Schooenderwoert, @vanschoo

  3. "First although it sounds like a reasonable question I’ve come to believe that this is a question that is asked by those who don’t believe in Agile, those who want to stall thing. It is rarely a question aimed at a rational decision."

    Allan, that may have been true 10 years ago when we were basically struggling to get permission to try Agile.

    It certainly is no longer true today. Agile has entered mainstream discourse and providing evidence is one of the responsibilities that come with claiming the territory of a knowledge discipline.

    We shouldn't restrict the call for evidence specifically to Agile claims though. That's not how knowledge works - we don't cherry pick the claims we like. Rather, we are all engaged in the work of figuring out what software development is about; what reliable knowledge we can have about it.

    Many claims that are considered "ground truths" of software engineering, pre-Agile, turn out to actually be pretty shaky. (See my book-in-progress "The Leprechauns of Software Engineering"). IMO software engineering needs a reboot, and I'm hoping that this reboot might come from the Agile community. But this reboot is only possible if it values hard evidence over hype and marketing schemes.

    For instance, the early quote (1996) about "600% productivity gains" from Scrum: I've actually tried to get at the underlying studies to see what the evidence behind that was. I've contacted Capers Jones (who the quote was attributed to), he doesn't remember writing that and specifically disclaims having seen that kind of productivity gain from Agile. I've contacted Ken Schwaber (on whose site the quote was posted), he referred me to Jeff Sutherland and added that he tries to avoid productivity claims. I've reached out to Jeff Sutherland and he's only been able to point me to studies way posterior to the 1996 quote. This quote should be viewed as extremely dubious; just because someone claims to have measured something doesn't mean the evidence exists.

    My stance is "if the original research isn't fully auditable, the raw data and methodology out in the open, it doesn't exist". There are any number of ways that you could fool yourself into thinking that you had witnessed a 600% productivity gain. Science demands replication.

    Unfortunately, our field has had extremely low evidential standards so far.

  4. Hi Allan,

    When answering the question "Where is the evidence for waterfall?" have you looked around you?

    I'm saying this because nearly every product around us was built using Waterfall, and every construction project still uses the Waterfall methodology.

    I suggest you read the comments on this post: http://www.pmhut.com/agile-project-management-a-solution-to-the-changing-project-landscape where one of those who commented claims that Agile projects never get finished.

  5. Actually PM Hut I have looked around me, in fact I've worked on a good few IT efforts which would consider themselves "waterfall" (if they ever had to name it) and actually: I don't think they ever worked.

    I think you are mistaken. I think if you looked inside the projects you refer to you would find very very very few of them defined what they were about to build, "designed it", implemented it, and tested it without any feedback between those activities.

    Therefore, I do not believe the Waterfall ever worked. QED.

  6. Hi Laurent -

    You are right that evidence should be verifiable and fully auditable, but what about situations like the one I mentioned? I was an employee wanting to know how well my team compared with whatever was known about similar teams. I couldn't open up our code to be examined by others. My managers would not have been interested in inviting a software metrics company in to evaluate our code. Would they have allowed it if it was for free? I am confident the answer would be no, just due to suspicion that doing so could somehow expose proprietary information. Safer to say no than yes.

    My fallback was to use a 3rd party static analysis tool to profile our code for complexity, average length of methods, etc. and then include some of that data in the experience report. (It was too voluminous to include all of it.) If you accept that productivity has to do with amount of code per person hour then the next problem is how do you prove what hours were worked?

    Again, we get to the problem where I didn't have the authority to make our labor records public. But there are more problems with auditability. For instance, the connection between software statistics and good design is not direct, at least not without an ability to examine the actual source code.

    My earliest validation was from talking informally with other early practitioners, and with Ward Cunningham at the initial Agile conferences. I found several teams that claimed they were seeing only 1 or 2 bugs per month. Conversation showed me they were not some weird corner case, so I took that as useful info. Also they weren't selling me anything;-)

    In the end I figured that the odds of a whole team making up a bunch of phony data and claims were longer than the odds that the practices we described were having a positive effect. So I wrote up what I had.

    I agree that science demands replication, and that our field has extremely low evidential standards.

    I also agree that we need to find a way to have authoritative data supporting claims made. Without that, Agile is too vulnerable to those with other agendas.

    But this is a big goal and would need a real effort by a group of us who are serious about finding some practical way for us to surface the evidence and subject it to peer review. Is there a group already working on this? I checked the Agile Alliance site for the list of current programs. There isn't one for this idea. Want to consider starting one?

    - Nancy Van Schooenderwoert, @vanschoo
