Jonathan on October 17th, 2006

As open source has become more mainstream, people have become more comfortable with the idea that open source is a viable and, in many cases, preferable choice of software. Companies such as Google agree that open source is a good platform for their mission-critical servers. Gartner is predicting a rosy future for open source databases in enterprise-level environments. Occasionally, however, FUD is still generated about the supposed inherent lack of quality of open source code.

In a recent article titled Insecurity in Open Source, Ben Chelf, the CTO of Coverity, blogs about how Coverity compared the best open source code with the best closed-source code, and (gasp!) closed source wins! The implication of his article is that it is the open source nature of the code that makes it less secure. This is a fallacious argument. By relying on cum hoc ergo propter hoc, the assumption that correlation implies causation, he takes an easy route to a good news article. However, the conclusion isn’t valid. He doesn’t make clear why open source software would not be as good simply because it isn’t proprietary. He also doesn’t make clear why he thinks that open source developers don’t understand “how the best proprietary software gets built.” After all, many of those open source developers are the same people who develop proprietary software.

The results from his work are interesting and useful insofar as he has published the open source data. But I question the conclusions that are drawn from that data. What is sorely missing from the article is any discussion of what other factors are correlated with the best (proprietary) software. What kind of software was being built? What were the requirements of that software, and what are the penalties if it fails? How much time was allocated for the project? How many people were part of the development and testing teams? And finally, how much money was spent on each of those projects?

Does it make sense to compare the quality of the open source Apache web server with that of, say, a proprietary heart-lung machine (HLM)? Are the requirements the same? Was the Apache web server designed and intended to be run in a scenario involving human life? Or did the Apache web server start its life as a mechanism for improving the way that humans find documents on the Internet? If there are any wizard coders who are using the HLM or have relatives who are using the HLM, perhaps there would be a strong incentive for them to reduce software defects beyond what could be done by the proprietary company itself.

Let’s try a thought experiment. What if the HLM software were implemented in the same way, with the same time, effort, and money given to it, but the company decided to open source the software and allow it to be reviewed by others prior to any use in a production environment? Would this result in worse software due simply to its license and to the fact that it has public peer review?

It would be interesting to perform direct comparisons between open source and proprietary software that has a similar purpose, a similar intent, similar requirements, and similar cost.

Chelf ends with a challenge to the open source community to “take a closer look at how the best proprietary software gets built and learn from that.” I’ll end with my own challenge to the makers of the “best proprietary software”: Could you become even better if you were to open source your code?
