Social media and NLP tasks: Challenges in crowdsourcing linguistic information
Published in: | 2016 11th International Workshop on Semantic and Social Media Adaptation and Personalization (SMAP) pp. 53 - 58 |
---|---|
Main Authors: | , , , |
Format: | Conference Proceeding |
Language: | English |
Published: | IEEE, 01-10-2016 |
Subjects: | |
Summary: | In the framework of the TraMOOC (Translation for Massive Open Online Courses) research and innovation project, data collection tasks for parallel translation are implemented using a crowdsourcing platform. The educational genre (video lecture subtitles, forum discussions, course assignments), the type of text in the source data (segmentation, misspellings, syntax errors, specialized terminology, scientific formulas, limited knowledge of context), and the multilingual approach of the activities involved (the focus is on a total of 12 European and BRIC languages) provide a challenging setting for the success of the project. Experimental trials reveal significant findings for the purposes of Language Technology research as well as limitations in crowdsourcing linguistic data collections for multilingual tasks. |
---|---|
ISBN: | 9781509052455 1509052453 |
DOI: | 10.1109/SMAP.2016.7753384 |