Search Results - "Masoud, Maraim"
1
Data Governance in the Age of Large-Scale Data-Driven Language Technology
Published 02-11-2022 “…Proceedings of 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT '22) The recent emergence and adoption of Machine Learning technology,…”
Journal Article
2
Masader: Metadata Sourcing for Arabic Text and Speech Data Resources
Published 13-10-2021 “…The NLP pipeline has evolved dramatically in the last few years. The first step in the pipeline is to find suitable annotated datasets to evaluate the tasks we…”
Journal Article
3
Aspects of Terminological and Named Entity Knowledge within Rule-Based Machine Translation Models for Under-Resourced Neural Machine Translation Scenarios
Published 28-09-2020 “…Rule-based machine translation is a machine translation paradigm where linguistic knowledge is encoded by an expert in the form of rules that translate text…”
Journal Article
4
Documenting Geographically and Contextually Diverse Data Sources: The BigScience Catalogue of Language Data and Resources
Published 24-01-2022 “…In recent years, large-scale data collection efforts have prioritized the amount of data collected in order to improve the modeling capabilities of large…”
Journal Article
5
Masader Plus: A New Interface for Exploring +500 Arabic NLP Datasets
Published 01-08-2022 “…Masader (Alyafeai et al., 2021) created a metadata structure to be used for cataloguing Arabic NLP datasets. However, developing an easy way to explore such a…”
Journal Article
6
The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset
Published 07-03-2023 “…As language models grow ever larger, the need for large-scale high-quality text datasets has never been more pressing, especially in multilingual settings. The…”
Journal Article
7
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
Published 09-11-2022 “…Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these…”
Journal Article