The lack of reliable data
One problem with the field of information security is that there’s little reliable data about security vulnerabilities. Without reliable data on exactly what’s a risk and what’s not, it’s very difficult to make good decisions about what’s worth doing and what’s not. It’s even worse because information security deals with vulnerabilities in very rapidly changing technology. So even if you had accurate data on the vulnerabilities that exist today, the changing technology landscape would probably make that data useless within a year, and quite possibly much sooner than that.
People have overcome similar problems in the past, and seeing how this was done might provide some insight into what could be done to get a better and more useful understanding of security vulnerabilities. The International Classification of Diseases (ICD) is probably a good model for this.
All members of the World Health Organization (WHO) report the causes of death in their countries to the WHO in a standard format that’s defined by the ICD. The most recent version of this scheme is ICD-10. If you’re curious to see how people are dying throughout the world, you can find the combined data from all WHO members here, where it’s broken down into the categories that ICD-10 defines. If you die from an intestinal nematode infection, your death will show up in category W032. If you die from an iodine deficiency, it’s category W055. If you die in a traffic accident, it’s code W150. If you die from a snake bite, it’s X20.
Having all WHO members report the causes of death in a standard format makes it easy to track the progress of many health and safety programs. If you want to learn how well the medical community is doing in eliminating bubonic plague, you can get the data that tracks this from the WHO. If you want to see how well safety features of automobiles are reducing the number of fatalities, you can learn this from the WHO data also.
If we’d like a system much like ICD-10, but applied to information security, we would need a way to report and classify security vulnerabilities. We’re not that far from that today. NIST maintains the National Vulnerability Database (NVD), which tracks known vulnerabilities, but there’s not yet a good system for classifying the vulnerabilities beyond a high-level description like “buffer overflow” or “SQL injection.” Maybe that’s good enough, but I doubt it. A finer-grained system might be needed. With the current NVD it’s still hard to find exactly how many products have a particular class of vulnerability. That could just be a question of the user interface, however.
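To make the idea concrete, here is a minimal sketch of what counting products per vulnerability class could look like if every report carried a standard category code, the way every death report carries an ICD-10 code. Everything in it is an assumption made for illustration: the VulnReport record, the class codes like MEM-01, the product names, and the sample data are all made up, and none of it reflects how the NVD actually stores or classifies its entries.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class VulnReport:
    """One reported vulnerability, tagged with a standardized class code."""
    report_id: str   # hypothetical identifier, not a real NVD entry
    product: str     # name of the affected product
    class_code: str  # hypothetical ICD-style category code


def products_per_class(reports):
    """Map each class code to the number of distinct products affected."""
    products = defaultdict(set)
    for report in reports:
        # A set ensures a product with several reports in one class counts once.
        products[report.class_code].add(report.product)
    return {code: len(names) for code, names in products.items()}


# Made-up sample data, for illustration only.
SAMPLE_REPORTS = [
    VulnReport("R-0001", "ExampleMail", "MEM-01"),
    VulnReport("R-0002", "ExampleMail", "MEM-01"),
    VulnReport("R-0003", "ExampleShop", "INJ-02"),
    VulnReport("R-0004", "ExampleWiki", "INJ-02"),
    VulnReport("R-0005", "ExampleWiki", "MEM-01"),
]

if __name__ == "__main__":
    for code, count in sorted(products_per_class(SAMPLE_REPORTS).items()):
        print(code, count)
```

The point of the sketch is only that once reports share a category code, a question like “how many products have this class of vulnerability?” becomes a simple tally. The hard part is agreeing on the categories, not the counting.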
It would also be useful to know how many users are affected by each of the known security vulnerabilities. Vendors of commercial software probably know how many users their products have, but that information might be harder to get for open-source software. On the other hand, it might not be too difficult to get open-source projects to report how many times their code has been downloaded, from which you can probably get a reasonable estimate of how many users have it installed. Having this data reported to a single organization might make it much easier to understand security vulnerabilities. NIST is probably the best organization to collect this data. Or is there a better alternative?