Predicting Design Impactful Changes in Modern Code Review: A Large-Scale Empirical Study

Bibliographic Details
Published in: 2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR), pp. 471-482
Main Authors: Uchoa, Anderson; Barbosa, Caio; Coutinho, Daniel; Oizumi, Willian; Assuncao, Wesley K. G.; Vergilio, Silvia Regina; Pereira, Juliana Alves; Oliveira, Anderson; Garcia, Alessandro
Format: Conference Proceeding
Language: English
Published: IEEE 01-05-2021
Description
Summary: Companies have adopted modern code review as a key technique for continuously monitoring and improving the quality of software changes. One of the main motivations is the early detection of design impactful changes, so that design-degrading changes do not prevail after each code review. Even though design degradation symptoms often lead to the rejection of changes, the practices of modern code review alone are not sufficient to avoid or mitigate design decay. Software design degrades whenever a change introduces one or more symptoms of poor structural decisions, usually represented by smells. Design degradation may be related to both technical and social aspects of collaborative code reviews. Unfortunately, no study has investigated whether code review stakeholders, e.g., reviewers, could benefit from approaches to distinguish and predict design impactful changes using technical and/or social aspects. By analyzing 57,498 reviewed code changes from seven open-source systems, we report an investigation on the prediction of design impactful changes in modern code review. We evaluated six ML algorithms for predicting design impactful changes. We also extracted and assessed 41 different features based on both social and technical aspects. Our results show that Random Forest and Gradient Boosting are the best algorithms. We also observed that the use of technical features results in more precise predictions. However, the use of social features alone, which are available even before the code review starts (e.g., for team managers or change assigners), also leads to highly accurate predictions. Therefore, social and/or technical prediction models can be used to support further design inspection of suspicious changes early in a code review process. Finally, we provide an enriched dataset that allows researchers to investigate the context behind design impactful changes during the code review process.
ISSN: 2574-3864
DOI: 10.1109/MSR52588.2021.00059