Our analysis of prior experience in reviewing a file showed a strong effect on the density of useful comments (Figure 6). In all five projects, reviewers who had reviewed a file before were almost twice as useful (65%-71%) as first-time reviewers (32%-37%). The usefulness density also increased with the number of prior reviews up to around five reviews, after which it plateaued between 70% and 80%.
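To make the grouping concrete, the following is a minimal sketch of how a usefulness density per number of prior reviews could be computed. It assumes that the density of a group is the share of its comments classified as useful, and that each comment is available as a pair of the reviewer's prior review count for the file and a usefulness label. The function name, the `cap` parameter, and the toy records are hypothetical and serve only as an illustration; they are not the study's tooling or data.

```python
from collections import defaultdict


def usefulness_density_by_prior_reviews(comments, cap=5):
    """Return {prior_review_bucket: share of useful comments}.

    comments: iterable of (prior_reviews, is_useful) pairs, one per comment.
    cap: counts at or above this value share one bucket (assumed here to be 5,
         mirroring the point where the reported trend flattens).
    """
    totals = defaultdict(int)
    useful = defaultdict(int)
    for prior_reviews, is_useful in comments:
        bucket = min(prior_reviews, cap)
        totals[bucket] += 1
        if is_useful:
            useful[bucket] += 1
    return {b: useful[b] / totals[b] for b in sorted(totals)}


# Made-up toy records, NOT data from the study.
toy_comments = [
    (0, True), (0, False), (0, False),            # first-time reviewers
    (1, True), (1, True), (1, False),             # one prior review
    (5, True), (7, True), (7, True), (9, False),  # five or more prior reviews
]

densities = usefulness_density_by_prior_reviews(toy_comments)
for bucket, density in densities.items():
    print(f"prior reviews >= {bucket}" if bucket == 5 else f"prior reviews = {bucket}",
          f"density = {density:.2f}")
```

Capping the bucket keeps sparsely populated high counts from producing noisy ratios; a different cap, or no cap, would be an equally reasonable choice.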
Based on these results, we conclude that developers who have either changed or reviewed an artifact before give more useful comments. One possible explanation is that such reviewers know more about the design constraints and the implementation, and are therefore able to provide more relevant comments. In contrast, a first-time reviewer may not know the design and context; they may ask questions to understand the implementation, or identify false issues based on incorrect assumptions. Unsurprisingly, first-time reviewers of an artifact provide less valuable feedback.
We assume that review experience has a stronger effect on comment usefulness than change experience because many teams have new developers review code before they are allowed to change it. Therefore, a developer who makes a first change to a file has most likely already reviewed it.
We calculated a reviewer’s experience based on his or her tenure at Microsoft. In four of the five projects (all but Exchange), reviewers who had spent more time in the organization had a higher density of useful comments. The effect is especially visible for new hires, who had the lowest density of useful comments during their first three months. The usefulness density increases the most during the first three quarters and stays relatively stable after the first year. The first year at Microsoft is often considered “ramp up” time for new hires. During that time, employees become more familiar with the code review process, project design, and coding practices at Microsoft. After the ramp-up period, they can be as useful as reviewers as their senior Microsoft peers. In detail, after the first year the density of useful comments increased from 60% to 66% for Azure, from 62% to 67% for Bing, from 60% to 70% for Visual Studio, and from 60% to 68% for Office. For Exchange, we could not see a steady trend, and usefulness ratios varied between 60% and 65%.
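The tenure analysis can be sketched in the same spirit. The snippet below assumes that tenure is measured as the number of days between a reviewer's hire date and the date of the comment, that a quarter is approximated as 91 days, and that everything after the first four quarters is pooled into one bucket. These choices, the function names, and the toy records are assumptions made for illustration and are not taken from the study.

```python
from collections import defaultdict
from datetime import date


def tenure_quarter(hire_date, comment_date, days_per_quarter=91):
    """Return the 1-based quarter of tenure in which a comment was written."""
    days = (comment_date - hire_date).days
    if days < 0:
        raise ValueError("comment_date precedes hire_date")
    return days // days_per_quarter + 1


def density_by_tenure_quarter(records, max_quarter=5):
    """Return {quarter: share of useful comments}.

    records: iterable of (hire_date, comment_date, is_useful) triples.
    max_quarter: quarters at or beyond this value are pooled, so the default
                 of 5 stands for "after the first year" (an assumption).
    """
    totals = defaultdict(int)
    useful = defaultdict(int)
    for hire_date, comment_date, is_useful in records:
        quarter = min(tenure_quarter(hire_date, comment_date), max_quarter)
        totals[quarter] += 1
        if is_useful:
            useful[quarter] += 1
    return {q: useful[q] / totals[q] for q in sorted(totals)}


# Made-up toy records, NOT data from the study.
hired = date(2020, 1, 1)
toy_records = [
    (hired, date(2020, 2, 1), False),  # first quarter
    (hired, date(2020, 3, 1), True),   # first quarter
    (hired, date(2020, 8, 1), True),   # third quarter
    (hired, date(2021, 3, 1), True),   # after the first year
    (hired, date(2022, 3, 1), False),  # after the first year
]

by_quarter = density_by_tenure_quarter(toy_records)
print(by_quarter)
```

Pooling all later quarters reflects the observation that the density stays relatively stable after the first year; finer buckets would be needed to inspect any longer-term drift.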