Coping with Complexity
Complexity has become the most significant challenge to meeting time to market and reliability demands for software. Traditional debugging and testing methodologies simply fall short when dealing with today’s sophisticated code bases. Increased complexity reduces software quality, reliability, safety, and security. Automated tools are needed to cope with this complexity explosion, and static source code analysis represents one of the most effective strategies.
Static analyzers attempt to find code sequences that may result in buffer overflows, resource leaks, or many other security and reliability problems. Source code analyzers are effective at locating a significant class of defects that are not detected by compilers during standard builds and often go undetected during run-time testing or typical field operation.
DoubleCheck Source Code Analyzer
Unlike other source code analyzers that run as separate tools, DoubleCheck™ is an Integrated Static Analyzer (ISA). DoubleCheck is built into the Green Hills™ C/C++ compiler, taking advantage of accurate and efficient analysis algorithms that have been tuned and field proven over the past 25 years. DoubleCheck can be used as a single integrated tool to perform compilation and defect analysis in the same pass.
A typical compiler will issue warnings and errors for some basic potential code problems, such as violations of the language standard or use of implementation-defined constructs. In contrast, DoubleCheck performs a full program analysis, finding bugs caused by complex interactions between pieces of code that may not even be in the same source file.
Unlike other tools, DoubleCheck automatically uses the exact same code configuration as used during the build process. This allows developers to be certain that the code executed is the same code that was checked.
DoubleCheck determines potential execution paths through code, including paths into and across subroutine calls, and how the values of program objects (such as standalone variables or fields within aggregates) could change across these paths.
DoubleCheck looks for many types of flaws, including:
- Potential NULL pointer dereferences
- Access beyond an allocated area (e.g. array or dynamically allocated buffer); otherwise known as a buffer overflow
- Potential writes to read-only memory
- Reads of potentially uninitialized objects
- Resource leaks (e.g. memory leaks and file descriptor leaks)
- Use of memory that has already been deallocated
- Out of scope memory usage (e.g. returning the address of an automatic variable from a subroutine)
- Failure to set a return value from a subroutine
- Buffer and array underflows
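As a rough illustration of the flaw classes listed above, the following C sketch contains three of them next to one corrected routine. All function names are hypothetical and chosen only for this example. The comments describe the defects themselves, not the exact diagnostics any particular analyzer emits.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns NULL when the key is unknown. */
const char *lookup(const char *key)
{
    return strcmp(key, "known") == 0 ? "value" : NULL;
}

/* Defect: potential NULL pointer dereference. The NULL originates in
   lookup(), so finding it requires following the path across the call. */
size_t value_length_unchecked(const char *key)
{
    const char *v = lookup(key);
    return strlen(v);              /* v is NULL when the key is unknown */
}

/* Corrected version: the result is checked before use. */
size_t value_length(const char *key)
{
    const char *v = lookup(key);
    return v != NULL ? strlen(v) : 0;
}

/* Defect: access beyond the allocated area (off-by-one loop bound). */
void clear_overflow(int *buf, size_t n)
{
    for (size_t i = 0; i <= n; i++)    /* should be i < n */
        buf[i] = 0;
}

/* Defect: resource leak on an early-return path. */
char *copy_nonempty_leaky(const char *s)
{
    char *copy = malloc(strlen(s) + 1);
    if (copy == NULL)
        return NULL;
    if (s[0] == '\0')
        return NULL;               /* copy is never freed on this path */
    strcpy(copy, s);
    return copy;
}
```

Each defect lies on a path that ordinary testing can easily miss: the unknown key, the final loop iteration, and the empty-string input.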
The analyzer understands the behavior of many standard runtime library functions. For example, it knows that subroutines like free should be passed only pointers to memory allocated by subroutines like malloc. The analyzer uses this information to detect errors in code that calls these functions or uses the results they return.
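A minimal sketch of the malloc/free pairing rule described above, using hypothetical function names. The first routine passes free a pointer that malloc did not return, and the second releases the original pointer.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Defect: free() receives an interior pointer, not the pointer that
   malloc() returned. The behavior is undefined. */
void skip_prefix_bad(void)
{
    char *buf = malloc(16);
    if (buf == NULL)
        return;
    strcpy(buf, "hdr:payload");
    char *payload = buf + 4;       /* points into the middle of buf */
    free(payload);                 /* invalid: not a malloc() result */
}

/* Corrected version: the interior pointer is only read, and the
   original allocation is what gets freed. */
size_t payload_length(void)
{
    char *buf = malloc(16);
    if (buf == NULL)
        return 0;
    strcpy(buf, "hdr:payload");
    size_t n = strlen(buf + 4);    /* length of "payload" */
    free(buf);
    return n;
}
```

An analyzer that models the library contract can report the first routine without running it, because the argument to free is provably offset from the allocation.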
Customizing the Bug Search
DoubleCheck can be taught about properties of user-defined subroutines. For example, if a custom memory allocation system is used, DoubleCheck can be taught to look for misuses of this system, finding more bugs and reducing false positives. DoubleCheck is highly accurate, and much better at limiting false positives than traditional UNIX analyzers like lint. In addition to flaws that lead directly to program faults, DoubleCheck can detect questionable constructs that should be fixed to improve code clarity. A good example of this is a write to a variable that is never subsequently read.
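The "write that is never subsequently read" case can be sketched as follows. The step functions are hypothetical stand-ins. In this example the dead store is more than a clarity issue, because it silently discards an error code.

```c
/* Hypothetical steps: each returns 0 on success and -1 on failure. */
static int step_a(int x) { return x < 0 ? -1 : 0; }
static int step_b(int x) { return x > 100 ? -1 : 0; }

/* Questionable construct: the value written by the first assignment
   is overwritten before it is ever read, so a failure in step_a()
   is lost. */
int run_steps_bad(int x)
{
    int status = step_a(x);        /* written here, never read */
    status = step_b(x);
    return status;
}

/* Corrected version: every written value is read before it is replaced. */
int run_steps(int x)
{
    int status = step_a(x);
    if (status != 0)
        return status;
    return step_b(x);
}
```

With an input of -5, the first routine reports success while the corrected one reports the failure from step_a.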
Green Hills Coding Standard
Many software development organizations employ an internal coding standard which governs programming practices to help ensure quality, maintainability, and reliability. DoubleCheck helps automate the enforcement of coding standards.
For example, DoubleCheck measures and, optionally, limits software component complexity using standardized metrics such as McCabe. These metrics help make code easier to understand, maintain, and test.
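To make the McCabe metric concrete, here is a small hypothetical function annotated with its decision points. Cyclomatic complexity is the number of decision points plus one. This sketch assumes the common convention of counting each short-circuit operator as its own decision; tools differ on that detail, and the source does not say which convention DoubleCheck applies.

```c
#include <stddef.h>

/* Counts the elements of v[0..n-1] that fall within [lo, hi].
   Decision points: the for loop, the if, and the && operator.
   Cyclomatic complexity under the stated convention: 3 + 1 = 4. */
int count_in_range(const int *v, size_t n, int lo, int hi)
{
    int count = 0;
    for (size_t i = 0; i < n; i++) {         /* decision 1 */
        if (v[i] >= lo && v[i] <= hi) {      /* decisions 2 and 3 */
            count++;
        }
    }
    return count;
}
```

A complexity limit enforced at build time would reject functions whose count exceeds a chosen threshold, prompting the developer to split them into smaller pieces.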
DoubleCheck also has a Green Hills Mode, incorporating 25 years of Green Hills experience in helping customers develop high quality software. Green Hills Mode adds a number of sensible quality controls to DoubleCheck’s bug finding mission, including a number of MISRA compliance checks, enforcement of optional but important language standards, and more.
Once again, since DoubleCheck is already traversing the code tree to find bugs, metric computations and enforcement of other coding rules do not incur significant overhead. Because DoubleCheck can be configured to generate a build error pointing out the offending code, the developer is unable to accidentally submit software that violates the coding rule. Using DoubleCheck as an automated software quality control saves the time and frustration typically associated with peer reviews.
Output of the Analyzer
DoubleCheck is capable of emitting errors as part of the build process as well as generating an intuitive set of web pages, powered by an integrated web server. The user can browse high level summaries of the different flaws found by the analyzer (Figure 1) and then click on hyperlinks to investigate specific problems. Within a specific problem display, the error is displayed inline with the surrounding code, making it easy to understand (Figure 2). Function names and other objects are hyperlinked for convenient browsing of the source code. Since the web pages are running under a web server, the results can easily be shared and browsed by any member of the development team.
Analysis Time
Analysis time is a gating factor in the adoption of source code analyzers. Unlike other analyzers that are used sporadically as a testing tool, DoubleCheck is fast enough to be used by all developers, all the time. DoubleCheck executes 5 times faster than other commercial analyzers. This advantage increases to a factor of 20 or more when DoubleCheck’s distributed build engine is used to automatically parallelize the analysis across available workstation resources on the developers’ network.
Furthermore, DoubleCheck uses sophisticated subroutine-level dependency checking. With other analyzers, a simple change to a single source file will result in a lengthy reanalysis. With DoubleCheck, analysis time is limited to portions of the code base affected by the edit, once again ensuring that DoubleCheck can be used throughout the development cycle.
Return on Investment of 30:1
DoubleCheck reduces development cost by enabling engineers to detect and resolve problems more efficiently and earlier in the development cycle. By reducing development time, products reach market faster and stay in market longer, translating into higher sales and profits. By increasing product quality, DoubleCheck reduces post-sales cost (or “user” cost) associated with product failures, recalls, and in-field maintenance. Furthermore, increased quality improves market positioning and reputation, enabling organizations to command higher prices which filter directly to the bottom line.
Many studies have attempted to estimate the cost to produce and deliver software to market. It is estimated that it cost $1000 to develop each line of code on the space shuttle. Developing software to the stringent DO-178B Level A standard (for critical aircraft systems) has been estimated at hundreds of dollars per line. On the lower end, Red Hat Linux has been estimated to cost $33 per line of code. Other estimates generally place the cost of good quality commercial software in the range of $30 to $40 per line of code.
Yet other studies have estimated how this development time is spent. Most concur that more than half of software development time is spent debugging: identifying and correcting software defects. If we use an estimate of $30 per line of code in total cost, this means that organizations conservatively spend $15 to debug each line of code.
Another commonly held belief is that the cost of identifying and correcting defects grows dramatically as the development cycle progresses. Some studies have shown that the time to fix a bug grows from an average of 2-3 hours during the coding phase to 16-18 hours when a defect must be tracked down during post-integration quality assurance testing. Author Steve McConnell is often quoted for his estimates that defects cost 10 to 100 times more to fix when they escape detection during the coding phase.
DoubleCheck decreases defect resolution time
Now let’s consider the decrease in defect resolution time enabled by DoubleCheck. Some studies have shown that static analysis can reduce the number of defects found relative to manual reviews by more than 40%. In addition to new code, DoubleCheck has been run on mature, production code, including the Apache web server, Linux kernel, OpenSSL, and sendmail. DoubleCheck has found many defects, including serious security vulnerabilities, in these code bases. When a defect is identified using static analysis, the most expensive part of defect resolution, tracking down the bug, is reduced to a negligible amount: the tool automatically locates defects and shows the offending code sequence leading to the failure. Using a conservative estimate of 10% for the decrease in bug fixing time enabled by DoubleCheck, the $15 cost to debug a line of code is reduced by $1.50.
A savings of $1.50 per line of code represents a return on investment of approximately 30 to 1. This ROI calculation completely ignores the aforementioned “user” cost reduction, time to market benefits, pricing benefits, etc. Studies have shown that the post-production cost of software defects can be as high as, or even several times higher than, the total development cost. This is certainly common in the aerospace, medical, and automotive industries.
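The arithmetic behind these figures can be laid out step by step. The first two lines use only numbers stated in the text. The last line is an inference: the text does not give the tool's cost per line, so the value shown is simply what a 30 to 1 return on a $1.50 saving would imply.

```latex
\begin{align*}
\text{debugging cost per line} &= 0.5 \times \$30 = \$15 \\
\text{savings per line} &= 0.10 \times \$15 = \$1.50 \\
\text{implied tool cost per line} &= \frac{\$1.50}{30} = \$0.05
\end{align*}
```

Because the 50% debugging share and the 10% reduction are both described as conservative, larger values for either would raise the savings per line proportionally.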
As software grows in complexity, integrated static analyzers represent a powerful and cost effective tool to help manage and control that complexity.