Deep Learning-Based Correct Answer Prediction for Developer Forums

Bibliographic Details
Published in: IEEE Access, Vol. 9, pp. 128166-128177
Main Authors: Iftikhar, Hafiz Umar, Rehman, Aqeel Ur, Kalugina, Olga A, Umer, Qasim, Khan, Haris Ali
Format: Standard
Language: English
Published: IEEE 2021
Description
Summary: Developer forums are essential for software engineers, who solve their problems with the assistance of experts on such forums. However, the solutions (answers) to a problem are sometimes unsatisfactory, or it is challenging to select the potential answer among them. Information seekers usually browse all the answers within the question thread to find the potential answer. The manual selection of correct answers is a tedious and time-consuming task. In this paper, we propose an automatic classification approach to predict the correct answers for developer forums. We first extract the metadata and the Q/A combination for each thread of the developer community (Stack Overflow). Then, natural language processing techniques are applied to preprocess the Q/A combinations of the given dataset. After that, a keyword ranking algorithm is leveraged to extract keywords and their ranking scores for each Q/A combination, from which a keyword-based feature vector is constructed. Subsequently, word embedding is leveraged to convert each preprocessed Q/A combination into a text-based feature vector. Finally, we pass the metadata, keyword-based features, and text-based features to an ensemble deep learning model for training to predict correct answers. The results of 10-fold cross-validation indicate that the proposed approach is accurate and surpasses the state of the art. On average, it improves accuracy, precision, recall, and F-measure by up to 1.72%, 24.96%, 6.57%, and 16.62%, respectively.
DOI: 10.1109/ACCESS.2021.3108416
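
The keyword-based feature step described in the summary can be illustrated with a minimal Python sketch. This record does not name the keyword ranking algorithm, so TF-IDF is swapped in here purely as a stand-in. The stopword list, the function names (preprocess, rank_keywords, keyword_feature_vector), the vocabulary construction, and the toy Q/A text are all hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not the paper's list).
STOPWORDS = {"a", "an", "the", "is", "to", "in", "of", "how", "do",
             "i", "you", "can", "and", "it", "with", "use"}


def preprocess(text):
    """Lowercase, split into alphanumeric tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def rank_keywords(docs):
    """Return one {keyword: score} dict per tokenized Q/A combination.

    Scores are smoothed TF-IDF values, used here only as a stand-in
    for the unnamed keyword ranking algorithm.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))

    ranked = []
    for tokens in docs:
        term_freq = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero on empty input
        ranked.append({
            word: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[word])) + 1)
            for word, count in term_freq.items()
        })
    return ranked


def keyword_feature_vector(scores, vocabulary):
    """Map keyword scores onto a fixed vocabulary order (0.0 if absent)."""
    return [scores.get(word, 0.0) for word in vocabulary]


# Toy Q/A combinations: the same question paired with two different answers.
qa_pairs = [
    "How do I reverse a list in Python? "
    "Use the reversed function or list slicing.",
    "How do I reverse a list in Python? "
    "You can install a package with pip.",
]

docs = [preprocess(text) for text in qa_pairs]
ranked = rank_keywords(docs)
vocabulary = sorted({word for tokens in docs for word in tokens})
vectors = [keyword_feature_vector(scores, vocabulary) for scores in ranked]
```

Under these assumptions, each Q/A combination yields one fixed-length vector. Going by the summary, such a vector would then be passed to the ensemble model together with the metadata and the embedding-based text features; how the three are combined is not stated in this record.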