5.1 Input Features

Our CUEZILLA tool measures the quality of bug reports on the basis of their contents. From the survey, we know which features developers consider most desirable in bug reports. Endowed with this knowledge, CUEZILLA first detects the features listed below. For each feature, a score is awarded to the bug report, which is either binary (e.g., attachment present or not) or continuous (e.g., readability).
Itemizations. In order to recognize itemization in bug reports, we checked whether several subsequent lines started with an itemization character (such as -, *, or +). To recognize enumerations, we searched for lines starting with numbers or single characters that were enclosed by parentheses or brackets or followed by a single punctuation character.
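The heuristic above can be sketched with a few regular expressions. The following Python fragment is our own illustration; the exact patterns and the threshold of two consecutive lines are assumptions, not CUEZILLA's actual implementation.

    import re

    # Lines that start with an itemization character such as -, * or +.
    ITEM_LINE = re.compile(r"^\s*[-*+]\s+\S")
    # Lines that start with a number or single character enclosed by parentheses
    # or brackets, or followed by a single punctuation character ("1.", "a)", "(2)").
    TOKEN = r"(?:\d+|[A-Za-z])"
    ENUM_LINE = re.compile(rf"^\s*(?:\({TOKEN}\)|\[{TOKEN}\]|{TOKEN}[.):])\s+\S")

    def has_itemization(text, min_lines=2):
        # Require several subsequent lines that look like list items.
        run = 0
        for line in text.splitlines():
            if ITEM_LINE.match(line) or ENUM_LINE.match(line):
                run += 1
                if run >= min_lines:
                    return True
            else:
                run = 0
        return False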
Keyword completeness. We reused the data set provided by Andy Ko et al. [20] to define a quality score of bug reports based on their content. In a first step, we removed stop words, reduced the words to their stems, and selected words occurring in at least 1% of bug reports. Next, we categorized the words into the following groups:
– action items (e.g., open, select, click)
– expected and observed behavior (e.g., error, missing)
– steps to reproduce (e.g., steps, repro)
– build-related (e.g., build)
– user interface elements (e.g., toolbar, menu, dialog)

In order to assess the completeness of a bug report, we computed for each group a score based on the keywords present in the bug report. The maximum score of 1 for a group is reached when a keyword is found. In order to obtain the final score (which is between 0 and 1), we averaged the scores of the individual groups.

In addition to the description of the bug report, we analyze the attachments that were submitted by the reporter within 15 minutes after the creation of the bug report. In the initial description and attachments, we recognize the following features:

Code samples. We identify C++ and JAVA code examples using techniques from island parsing [24]. Currently, our tools can recognize declarations (for classes, methods, functions, and variables), comments, conditional statements (such as if and switch), and loops (such as for and while).

Stack traces. We currently can recognize JAVA stack traces, GDB stack traces, and MOZILLA talkback data. Stack traces are easy to recognize with regular expressions: they consist of a start line (that sometimes also contains the top of the stack) and trace lines.

Patches. In order to identify patches in bug reports and attachments, we again used regular expressions. Patches consist of several start lines (which file to patch) and blocks (which changes to make) [23].

Screenshots. We identify the type of an attachment using the file tool in UNIX. If an attachment is an image, we recognize it as a screenshot. If the file is recognized as text, we process it and search for code samples, stack traces, and patches (see above).

For more details about the extraction of structural elements from bug reports, we refer to our previous work [7], in which we showed that we can identify the above features with close to perfect precision.

After cleaning the description of a bug report of source code, stack traces, and patches, we compute its readability.

Readability. To compute readability, we use the style tool, which “analyses the surface characteristics of the writing style of a document” [10]. It is important not to confuse readability with grammatical correctness. The readability of a text is measured by the number of syllables per word and the length of sentences. Readability measures are used by Amazon.com to inform customers about the difficulty of books and by the US Navy to ensure the readability of technical documents [19]. In general, the higher the readability score, the more complex the text is to read. Several readability measures return values that correspond to school grades; these grades tell how many years of education a reader should have to read the text without difficulty. For our experiments, we used the following seven readability measures: Kincaid, Automated Readability Index (ARI), Coleman-Liau, Flesch, Fog, Lix, and SMOG Grade.
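To make the keyword-completeness score concrete, the following Python sketch computes it for the five groups listed above, treating a group as satisfied as soon as one of its keywords occurs. The keyword sets shown are illustrative placeholders only; the actual lists were derived from the data set of Ko et al. [20] after stop-word removal and stemming, which the sketch omits for brevity.

    ILLUSTRATIVE_GROUPS = {
        "action items":       {"open", "select", "click"},
        "behavior":           {"error", "missing"},
        "steps to reproduce": {"step", "repro"},
        "build-related":      {"build"},
        "user interface":     {"toolbar", "menu", "dialog"},
    }

    def keyword_completeness(report_text, groups=ILLUSTRATIVE_GROUPS):
        # A group scores 1 as soon as one of its keywords is found; the final
        # score is the average over all groups and lies between 0 and 1.
        words = set(report_text.lower().split())
        scores = [1.0 if keywords & words else 0.0 for keywords in groups.values()]
        return sum(scores) / len(scores)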
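As an illustration of the regular-expression approach, the sketch below recognizes JAVA stack traces, one of the three trace formats mentioned above; GDB traces and MOZILLA talkback data would need analogous patterns. The expressions are our own approximation, not the ones used in CUEZILLA.

    import re

    # Start line, e.g. "java.lang.NullPointerException: message"
    START_LINE = re.compile(r"^\s*(?:Exception in thread .+|[\w.$]+(?:Exception|Error))(?::.*)?$")
    # Trace line, e.g. "at org.example.Foo.bar(Foo.java:42)"
    TRACE_LINE = re.compile(r"^\s*at\s+[\w.$<>]+\([^)]*\)\s*$")

    def contains_java_stack_trace(text, min_trace_lines=2):
        lines = text.splitlines()
        for i, line in enumerate(lines):
            if START_LINE.match(line):
                following = lines[i + 1 : i + 1 + min_trace_lines]
                if len(following) == min_trace_lines and all(TRACE_LINE.match(l) for l in following):
                    return True
        return False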
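Patch recognition follows the same pattern. Assuming the unified diff format, a patch has start lines naming the file to patch and hunk headers introducing the blocks of changes [23]; the expressions below are again an illustrative approximation.

    import re

    PATCH_START = re.compile(r"^(?:--- |\+\+\+ |Index: |diff )")          # which file to patch
    HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+\d+(?:,\d+)? @@")      # a block of changes

    def contains_patch(text):
        lines = text.splitlines()
        return (any(PATCH_START.match(l) for l in lines)
                and any(HUNK_HEADER.match(l) for l in lines))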
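For illustration, the sketch below computes one of the seven measures, the Kincaid (Flesch-Kincaid) grade level, from sentence length and syllables per word. CUEZILLA itself obtains all seven scores from the external style tool [10] rather than reimplementing the formulas, and the vowel-group syllable counter here is only a rough approximation.

    import re

    def count_syllables(word):
        # Rough approximation: count groups of consecutive vowels, at least 1.
        return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

    def kincaid_grade(text):
        sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
        words = re.findall(r"[A-Za-z']+", text)
        if not sentences or not words:
            return 0.0
        syllables = sum(count_syllables(w) for w in words)
        # Flesch-Kincaid grade level: approximate years of schooling needed.
        return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59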
5.5 Recommendations by CUEZILLA

The core motivation behind CUEZILLA is to help reporters file bug reports of better quality. To this end, its ability to detect the presence of information features can be exploited to give reporters tips on what information to add. This is achieved simply by recommending additions from the set of absent features, starting with the feature that would improve the quality score by the largest margin. These recommendations are intended to serve as cues or reminders to reporters that adding certain types of information is likely to improve the quality of the bug report. The left panel of Figure 1 illustrates the concept. The text in the panel is determined by investigating the current contents of the report and then deciding which addition would be most valuable, for instance, adding a code sample to the report. As and when new information is added to the report, the recommendations are updated accordingly.
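A minimal sketch of this recommendation step is shown below. The mapping from features to their estimated contribution to the quality score is hypothetical; the contributions actually used by CUEZILLA are not reproduced here.

    # Hypothetical contribution of each feature to the quality score.
    ASSUMED_CONTRIBUTIONS = {
        "stack trace": 0.30,
        "steps to reproduce": 0.25,
        "code sample": 0.20,
        "screenshot": 0.15,
        "itemization": 0.10,
    }

    def recommend_additions(present_features, contributions=ASSUMED_CONTRIBUTIONS):
        # Recommend absent features, largest estimated quality gain first.
        missing = {f: c for f, c in contributions.items() if f not in present_features}
        return sorted(missing, key=missing.get, reverse=True)

    # Example: recommend_additions({"itemization", "screenshot"})
    # -> ["stack trace", "steps to reproduce", "code sample"]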