Using a Distributed Representation of Words in Localizing Relevant Files for Bug Reports

Once a bug in software is reported, developers have to determine which source files are related to the bug. This process is referred to as bug localization, and an automatic way of bug localization is important to improve developers' productivity. This paper proposes an approach called DrewBL t...

Full description

Saved in:
Bibliographic Details
Published in:2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) pp. 183 - 190
Main Authors: Uneno, Yukiya, Mizuno, Osamu, Eun-Hye Choi
Format: Conference Proceeding
Language:English
Published: IEEE 01-08-2016
Subjects:
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Once a bug in software is reported, developers have to determine which source files are related to the bug. This process is referred to as bug localization, and an automatic way of bug localization is important to improve developers' productivity. This paper proposes an approach called DrewBL to efficiently localize faulty files for a given bug report using a natural language processing tool, word2vec. In DrewBL, we first build a vector space model named semantic-VSM which represents a distributed representation of words in the bug report and source code files and next compute the relevance between them by feeding the constructed model to word2vec. We also present an approach called CombBL to further improve the accuracy of bug localization which employs not only the proposed DrewBL but also existing bug localization techniques, such as BugLocator based on textual similarity and Bugspots based on bug-fixing history, in a combinational manner. This paper gives our early experimental results to show the effectiveness and efficiency of the proposed approaches using two open source projects.
DOI:10.1109/QRS.2016.30