This lecture provides an introduction to debugging, a crucial activity in every developer's life. After an elementary discussion of some useful debugging concepts, the lecture goes on with a detailed review of general debugging techniques, independent of any specific software. The final part of the lecture is dedicated to analysing problems related to the use of C++, the main programming language commonly employed in particle physics nowadays.
1 General concepts about debugging
After many days of brainstorming, designing and coding, the programmer finally has a wonderful piece of code. He compiles it and runs it. Everything seems pretty straightforward, but unfortunately it doesn't work! And now? Now the great fun starts! Time to dig into the wonderful world of debugging. Despite being the realm of ingenuity and uncertainty, a debugging process can be divided into four main steps:
1. localising a bug,
2. classifying a bug,
3. understanding a bug,
4. repairing a bug.
1.1 Localizing a bug
A typical attitude of inexperienced programmers towards bugs is to consider their localization an easy task: they notice that their code does not do what they expected, and they are led astray by their confidence in knowing what the code should do. This confidence is completely deceptive, because spotting a bug can be very difficult. As explained in the introduction, all bugs stem from the premise that something thought to be right was in fact wrong.
Noticing a bug implies testing. Testing should be performed with discipline and, when possible, automatically, for example after each build of the code. In case of a test failure, the programmer must be able to see easily what went wrong, so tests must be prepared carefully. This lecture will not cover the basics of testing.
1.2 Classifying a bug
Despite appearances, bugs often share a common background. This allows a rather coarse, but sometimes useful, classification. The list is arranged in order of increasing difficulty (which fortunately means in order of decreasing frequency).
Syntactical Errors should be easily caught by your compiler. I say "should" because compilers, besides being very complicated, can be buggy themselves. In any case, it is vital to remember that quite often the problem is not at the exact position indicated by the compiler error message.
Build Errors derive from linking object files which were not rebuilt after a change in some source files. These problems can easily be avoided by using tools to drive software building.
Basic Semantic Errors comprise using uninitialized variables, dead code (code that will never be executed) and problems with variable types. A compiler can bring them to your attention, although it usually has to be asked explicitly through flags (cf. 2.1).
Semantic Errors include using wrong variables or operators (e.g., & instead of && in C++). No tool can catch these problems, because they are syntactically correct statements, although logically wrong. A test case or a debugger is necessary to spot them.
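As an illustration of the & versus && mix-up, here is a minimal sketch. The function names and the "tracks and hits" scenario are hypothetical and not taken from the lecture. Both versions compile without complaint, and they even agree on some inputs, so only a well-chosen test case reveals the difference.

```cpp
#include <cassert>

// Hypothetical helper: an event is usable only if both counts are nonzero.

// Buggy version: '&' is the bitwise AND of the two integers.
// For n_tracks = 1 and n_hits = 2 the result is 1 & 2 == 0, i.e. false.
bool is_usable_buggy(int n_tracks, int n_hits) {
    return n_tracks & n_hits;
}

// Intended version: '&&' is the logical AND of the two conditions.
bool is_usable(int n_tracks, int n_hits) {
    return n_tracks && n_hits;
}

// A minimal test case: the two versions agree on (1, 1) but disagree
// on (1, 2), which is why the bug can stay hidden for a long time.
void test_is_usable() {
    assert(is_usable(1, 1) == is_usable_buggy(1, 1));
    assert(is_usable(1, 2));
    assert(!is_usable_buggy(1, 2));
}
```

Calling test_is_usable() after each build is one simple way to keep this particular mistake from creeping back in.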
A playful classification borrowed from physics distinguishes between Bohrbugs and Heisenbugs. Bohrbugs are deterministic: a particular input will always manifest them with the same result. Heisenbugs are random: they are difficult to reproduce reliably, since they seem to depend on environmental factors (e.g. a particular memory allocation, the way the operating system schedules processes, the phase of the moon and so on). In C++ a Heisenbug is very often the result of an error with pointers.
1.3 Understanding a bug
A bug should be fully understood before attempting to fix it. Trying to fix a bug before understanding it completely could end up causing even more damage to the code, since the problem could change form and manifest itself somewhere else, maybe randomly. Again, a typical example is memory corruption: if there is any suspicion that memory was corrupted during the execution of some algorithm, all the data involved in the algorithm must be checked before trying to change them.
The following check-list helps to ensure a correct approach to the investigation:
- do not confuse observing symptoms with finding the real source of the problem;
- check if similar mistakes (especially wrong assumptions) were made elsewhere in the code;
- verify that just a programming error, and not a more fundamental problem (e.g. an incorrect algorithm), was found.
1.4 Repairing a bug
The final step in the debugging process is bug fixing. Repairing a bug is more than modifying code. Any fix must be documented in the code and tested properly. More importantly, learning from mistakes is an effective attitude: it is good practice to keep a small file with detailed explanations of the way each bug was discovered and corrected. A check-list can be a useful aid.
Several points are worth recording:
- How the bug was noticed, to help in writing a test case;
- How it was tracked down, to give you better insight into the approach to choose in similar circumstances;
- What type of bug was encountered;
- Whether this bug was encountered often, in order to set up a strategy to prevent it from recurring;
- Whether the initial assumptions were unjustified; this is often the main reason why tracking a bug is so time-consuming.
2 General debugging techniques
As said before, debugging is often the realm of ingenuity and uncertainty. Yet a number of tricks can be adopted in the daily programming activity to ease the hunt for problems.
2.1 Exploiting compiler features
A good compiler can do some static analysis on the code. Static code analysis is the analysis of software that is performed without actually executing programs built from that software. Static analysis can help in detecting a number of basic semantic problems, e.g. type mismatch or dead code.
Having a look at the user manual of the compiler employed, where all the features should be documented, is highly recommended. For gcc, the standard compiler on GNU/Linux systems, there are a number of options that affect what static analysis can be performed. They are usually divided into two classes: warning options and optimization flags.
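To make this concrete, here is a small sketch of code that is valid C++ and behaves correctly, yet contains two slips of the kind that warning options are meant to report. The function name is hypothetical. Assuming a reasonably recent g++, compiling it with something like g++ -Wall -Wextra should typically flag the unused variable and the comparison between a signed and an unsigned integer; the exact wording and the set of diagnostics depend on the compiler version.

```cpp
#include <vector>

// Hypothetical example: counts how many entries of `values` equal `wanted`.
// The result is correct, but two deliberate slips are left in for the
// compiler to point out.
int count_matches(const std::vector<int>& values, int wanted) {
    int unused_total = 0;  // slip 1: declared and initialised, never used

    int count = 0;
    // slip 2: `i` is a signed int, values.size() is an unsigned type
    for (int i = 0; i < values.size(); ++i) {
        if (values[i] == wanted) {
            ++count;
        }
    }
    return count;
}
```

Neither slip is a bug here, but the same patterns often hide real mistakes, which is why it pays to build with warnings enabled from the start rather than only when something goes wrong.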
2.2 Reading the right documentation
This seems quite an obvious tip, but too often inexperienced programmers read the wrong papers when looking for hints about the task they have to accomplish. The relevant documentation for the task, the tools, the libraries and the algorithms employed must be at one's fingertips, so that the relevant information can be found easily.
As far as documentation is concerned, the most important distinction is between tutorials and references. A tutorial is a pedagogical paper, usually with plenty of examples. It doesn't assume any previous knowledge of the topic and its first aim is to convey ideas about the subject. Reference manuals, on the contrary, are comprehensive and exhaustive descriptions, which allow finding the answers to questions through indexes and cross-references.
In the world of programming, all these types of documents are usually in electronic format. The reference documentation must be up to date, accurate and appropriate to the problems and tools used: consulting the wrong reference manual could, for example, lead to trying to use a feature that is not supported by the current version of the tool.
2.3 The abused cout debugging technique
The cout technique takes its name from the C++ statement for printing on the standard output stream (usually the terminal screen). It consists of adding print statements to the code to track the control flow and data values during execution. It is the favourite technique of novices, yet it is surprising how many experienced programmers still refuse to abandon this time-wasting, ad hoc method.
Despite its popularity, this technique has serious disadvantages. First of all, it is very ad hoc: the inserted code is temporary and is removed as soon as the bug is fixed, so a new bug means a new round of insertions. In debugging as well as in coding, the professional should aim for reusable solutions whenever possible; print statements are not reusable, and are therefore discouraged. As we will see shortly, there are more effective ways to track the control flow through messages. In addition, print statements clutter the normal output of the program, making it hard to read. They also slow the program down considerably, because access to the output devices becomes a bottleneck. Finally, they often do not help at all: for performance reasons output is usually buffered and, if the program crashes, the buffer is destroyed and the important information is lost, which can lead to starting the debugging process in the wrong place.
In some (very few) circumstances cout debugging can be appropriate, although it can always be replaced by other techniques. For these cases, here are some tips. To begin with, output must be written to the standard error, because this channel is unbuffered and the last messages before a crash are less likely to be lost. Then, print statements should not be used directly: a macro should be defined around them (as illustrated in listing 2) so that debugging code can be switched on and off easily. Finally, debugging levels should be used to manage the amount of debugging information.
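In C++ stream terms, the first and last tips can be sketched as follows. This is only an illustration, separate from the pmesg listings: the names debug_threshold, debug_message and flushes_after_each_write are hypothetical. The sketch relies on the fact that std::cerr starts with the unitbuf flag set, so every write is flushed immediately, whereas std::cout does not.

```cpp
#include <iostream>
#include <string>

// Hypothetical global verbosity threshold: messages whose level is
// above this value are silently dropped.
int debug_threshold = 1;

// Returns true if the stream flushes after every output operation,
// i.e. if its std::ios_base::unitbuf flag is set.
bool flushes_after_each_write(const std::ostream& os) {
    return (os.flags() & std::ios_base::unitbuf) == std::ios_base::unitbuf;
}

// Writes `text` to `out` (standard error by default) only if `level`
// does not exceed the threshold. Returns true if the message was written.
bool debug_message(int level, const std::string& text,
                   std::ostream& out = std::cerr) {
    if (level > debug_threshold) {
        return false;
    }
    out << "[debug " << level << "] " << text << '\n';
    return true;
}
```

With the threshold at 1, a call such as debug_message(1, "entering fit loop") is printed, while debug_message(3, "per-hit details") is suppressed; raising debug_threshold turns the more detailed messages back on without touching the calling code.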
Listing 2: An example of cout technique - Declaration
#ifndef DEBUG_H
#define DEBUG_H
#include <stdarg.h>
#if defined(NDEBUG) && defined(__GNUC__)
/* gcc's cpp has extensions; it allows for macros with a variable number of arguments. We use this extension here to preprocess pmesg away. */
#define pmesg(level, format, args...) ((void)0)
#else
void pmesg(int level, char *format, ...);
/* print a message, if it is considered significant enough */
#endif
#endif /* DEBUG_H */
Listing 3: An example of cout technique - Implementation
#include "debug.h"
#include
extern int msglevel; /* the higher,t