Mining evolutionary couplings from developer interactions and commits
This thesis presents an approach to mining evolutionary couplings from a combination of interaction (e.g., Mylyn) and commit (e.g., CVS) histories. The evolutionary couplings are expressed at the file and method levels of granularity. Although mining evolutionary couplings has been investigated previously, couplings mined from interaction histories and from commit histories have not been empirically compared or combined. An empirical study was conducted on 3272 interactions and 5093 commits from Mylyn, an open source task management tool for the Eclipse Integrated Development Environment. Both the interaction and commit histories were divided into training and testing sets. The training sets were used to train six prediction models: two individual and four combined. The testing sets were used to evaluate these models on two tasks, commit prediction and interaction prediction. Precision and recall were used to measure the effectiveness of the models. The results show that the combined models offer statistically significant increases in recall over the individual models for commit prediction. At the file level, the combined models achieved a maximum recall improvement of 13% for commit prediction and 3% for interaction prediction. These recall improvements came with a maximum precision drop of 2% for commit prediction and 1% for interaction prediction. The model trained from commit history predicted interactions with higher precision than the model trained from interaction history. At the file level of granularity, the model trained from interaction history predicted commits with higher recall than the model trained from commit history. The combined models were quite effective in commit prediction; however, no single combined model outperformed the individual models in terms of both recall and precision.
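The evaluation described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the co-occurrence mining rule, the union-based combination, the single-seed prediction, and all function names and file names are assumptions made for illustration, and the toy data is invented rather than taken from the Mylyn study.

```python
from collections import defaultdict
from itertools import combinations


def mine_couplings(transactions):
    """Map each entity to the entities it co-occurred with in any transaction.

    A transaction is a set of file names touched together, either in one
    commit or in one interaction trace (assumed representation).
    """
    couplings = defaultdict(set)
    for entities in transactions:
        for a, b in combinations(sorted(set(entities)), 2):
            couplings[a].add(b)
            couplings[b].add(a)
    return couplings


def combine_union(first, second):
    """One hypothetical way to combine two models: take the union of couplings."""
    combined = defaultdict(set)
    for model in (first, second):
        for entity, coupled in model.items():
            combined[entity] |= coupled
    return combined


def predict(couplings, seed):
    """Predict the entities coupled to a seed entity."""
    return set(couplings.get(seed, set())) - {seed}


def precision_recall(predicted, actual):
    """Precision and recall of a predicted set against the actual set."""
    hits = len(predicted & actual)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(actual) if actual else 0.0
    return precision, recall


# Invented toy histories (training sets).
interaction_training = [{"A.java", "B.java", "C.java"}, {"A.java", "D.java"}]
commit_training = [{"A.java", "B.java"}, {"B.java", "E.java"}]

# Invented test commit: given the seed, predict the remaining files.
test_commit = {"A.java", "B.java", "D.java"}
seed = "A.java"
actual = test_commit - {seed}

commit_model = mine_couplings(commit_training)
interaction_model = mine_couplings(interaction_training)
combined_model = combine_union(commit_model, interaction_model)

commit_result = precision_recall(predict(commit_model, seed), actual)
combined_result = precision_recall(predict(combined_model, seed), actual)

print("commit-only (precision, recall):", commit_result)
print("combined    (precision, recall):", combined_result)
```

On this toy input the combined model recovers more of the actual commit (higher recall) while also predicting a file that was not changed (lower precision). This mirrors the kind of trade-off the abstract reports, but only qualitatively; the numbers here carry no empirical weight.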