Tuesday, October 18, 2016

Software diseconomies of scale - any research?

To say last October’s post “Software has diseconomies of scale - not economies of scale” has been my most popular post ever is something of an understatement! It has been read tens of thousands of times after being picked up by some very popular publications and newsletters.

When I wrote the piece I was writing largely on intuition. That software development experiences diseconomies of scale started as a hunch. Gradually I could see more evidence, but when I published last October I didn’t have anything that would stand up to academic scrutiny - not that I was aiming for a peer reviewed publication.

Since then I’ve become aware of several pieces of research which do show the same phenomenon and do stand up to academic standards, plus I’ve become aware of several people who have made the claim earlier than myself.

So, for those who want to dig into this subject some more let me record the evidence.

“From Aristotle to Ringelmann” (or better an open download of a preprint version): in this paper Ingo Scholtes, Pavlin Mavrodiev and Frank Schweitzer at ETH in Zurich examined open source code bases and team sizes. In their abstract they conclude:

“Our findings confirm the negative relation between team size and productivity previously suggested by empirical software engineering research, thus providing quantitative evidence for the presence of a strong Ringelmann effect.”

The term “Ringelmann effect” is new to me but describes what is seen in software teams - and elsewhere:

“The Ringelmann effect is the tendency for individual members of a group to become increasingly less productive as the size of their group increases” Wikipedia

Another academic paper is “Economies and diseconomies of scale in software development” by Comstock, Jiang and Davies. Unfortunately, despite the title, this paper focused more on constructing a model for effort and value in software development than on drawing conclusions about economies and diseconomies. So yes, it is about software economics, but it says little about economies and diseconomies; it is closer to software accounting. The authors offer no conclusion of their own in that respect.

That said, there is some evidence here. The authors note several earlier studies which had mixed results: some showed diseconomies of scale but some also showed economies. They also point out that at least two of the more established forecasting models (from traditional backgrounds), the Putnam and CJD models, assumed diseconomies of scale (I’d never heard of either of these models before!).
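To make concrete what it means for a forecasting model to "assume diseconomies of scale", here is a minimal sketch. It is not the Putnam or CJD model, and it is not taken from the paper. It simply assumes effort grows as a power of size, effort = a * size ** b, where the function names and the constants a and b are invented for illustration. With b above 1 the effort per unit of size rises as the project grows (diseconomies); with b below 1 it falls (economies).

```python
def effort(size, a=1.0, b=1.2):
    """Hypothetical power-law effort model: effort = a * size ** b.

    a and b are illustrative assumptions, not calibrated values.
    b > 1 models diseconomies of scale, b < 1 models economies.
    """
    return a * size ** b


def effort_per_unit(size, a=1.0, b=1.2):
    """Effort needed per unit of size for a project of the given size."""
    return effort(size, a, b) / size


if __name__ == "__main__":
    # Compare unit effort for small, medium and large projects.
    for size in (10, 100, 1000):
        print(size, round(effort_per_unit(size), 2))
```

Running it shows unit effort climbing with size under the assumed exponent. Passing b=0.9 instead makes it fall, which is the "economies" case some of the earlier studies reported.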

It is a shame that some of this knowledge has existed in some academic circles for several years; it seems to be another example of how academic papers hidden behind paywalls prevent the spread of useful knowledge. (My initial attempts to get this paper met with paywalls, but for the link above I tracked down a downloadable version.)

There is an important observation built into this paper, almost by accident: the optimal team size is not fixed, it will depend on the size of the undertaking and the duration of the effort. I am often asked “What size should a team be?” and several Scrum advocates have stated “the team should be 7 people plus or minus 2.” Clearly the right size for any undertaking will depend on multiple factors.

Away from the academic world there is some more support for diseconomies from other observations.

In a blog post written about two and a half years before mine, “Diseconomies of Scale in Software Development,” Jesus Gil Hernandez made exactly the same point. Jesus also makes the point that if we want to forecast and plan large software initiatives we need to take such diseconomies into account.

“Diseconomies of Scale and Lines of Code” is a blog post from further back, 10 years ago, from Jeff Atwood’s Coding Horror blog. Jeff also suggests diseconomies of scale but then his post goes on to focus on lines of code and why it’s a poor metric (and it is). Jeff also points out that Steve McConnell discussed diseconomies of scale in his 2006 book, Software Estimation.

What is becoming clear is that the possibility of diseconomies of scale in software development has been known about for a long time and my argument has some validity. What we need now is more research to get to the bottom of this…

Unfortunately I don’t have the time or resources to do such research, but I’m happy to collaborate. Any academic or PhD student out there fancy picking this up as a research question?