Data-Driven Answer Selection in Community QA Systems
Finding similar questions from historical archives has been applied to question answering, with well theoretical underpinnings and great practical success. Nevertheless, each question in the returned candidate pool often associates with multiple answers, and hence users have to painstakingly browse...
Saved in:
Published in: | IEEE transactions on knowledge and data engineering Vol. 29; no. 6; pp. 1186 - 1198 |
---|---|
Main Authors: | , , , , , |
Format: | Journal Article |
Language: | English |
Published: |
New York
IEEE
01-06-2017
The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Subjects: | |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Finding similar questions from historical archives has been applied to question answering, with well theoretical underpinnings and great practical success. Nevertheless, each question in the returned candidate pool often associates with multiple answers, and hence users have to painstakingly browse a lot before finding the correct one. To alleviate such problem, we present a novel scheme to rank answer candidates via pairwise comparisons. In particular, it consists of one offline learning component and one online search component. In the offline learning component, we first automatically establish the positive, negative, and neutral training samples in terms of preference pairs guided by our data-driven observations. We then present a novel model to jointly incorporate these three types of training samples. The closed-form solution of this model is derived. In the online search component, we first collect a pool of answer candidates for the given question via finding its similar questions. We then sort the answer candidates by leveraging the offline trained model to judge the preference orders. Extensive experiments on the real-world vertical and general community-based question answering datasets have comparatively demonstrated its robustness and promising performance. Also, we have released the codes and data to facilitate other researchers. |
---|---|
ISSN: | 1041-4347 1558-2191 |
DOI: | 10.1109/TKDE.2017.2669982 |