Search Results - "Leong, Colin"
1
Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets
Published in Transactions of the Association for Computational Linguistics (31-01-2022)“…With the success of large-scale pre-training and multilingual modeling in Natural Language Processing (NLP), recent years have seen a proliferation of large,…”
Journal Article
2
Culture of low passage colorectal cancer cells and demonstration of variation in selected tumour marker expression
Published in Cytotechnology (Dordrecht) (01-05-2014)“…There is increasing evidence that a tumour comprises of heterogeneous population of cells. Thus, studying homogenous cell lines in vitro may yield results that…”
Journal Article
3
Enhancing Multi-Domain Automatic Short Answer Grading through an Explainable Neuro-Symbolic Pipeline
Published 04-03-2024“…Grading short answer questions automatically with interpretable reasoning behind the grading decision is a challenging goal for current transformer approaches…”
Journal Article
4
JWSign: A Highly Multilingual Corpus of Bible Translations for more Diversity in Sign Language Processing
Published 16-11-2023“…Advancements in sign language processing have been hindered by a lack of sufficient data, impeding progress in recognition, translation, and production tasks…”
Journal Article
5
Bloom Library: Multimodal Datasets in 300+ Languages for a Variety of Downstream Tasks
Published 26-10-2022“…EMNLP 2022 We present Bloom Library, a linguistically diverse set of multimodal and multilingual datasets for language modeling, image captioning, visual…”
Journal Article
6
The eBible Corpus: Data and Model Benchmarks for Bible Translation for Low-Resource Languages
Published 19-04-2023“…Efficiently and accurately translating a corpus into a low-resource language remains a challenge, regardless of the strategies employed, whether manual,…”
Journal Article
7
Adapting to the Low-Resource Double-Bind: Investigating Low-Compute Methods on Low-Resource African Languages
Published 29-03-2023“…Many natural language processing (NLP) tasks make use of massively pre-trained language models, which are computationally expensive. However, access to high…”
Journal Article
8
BibleTTS: a large, high-fidelity, multilingual, and uniquely African speech corpus
Published 07-07-2022“…BibleTTS is a large, high-quality, open speech dataset for ten languages spoken in Sub-Saharan Africa. The corpus contains up to 86 hours of aligned, studio…”
Journal Article
9
Documenting Geographically and Contextually Diverse Data Sources: The BigScience Catalogue of Language Data and Resources
Published 24-01-2022“…In recent years, large-scale data collection efforts have prioritized the amount of data collected in order to improve the modeling capabilities of large…”
Journal Article
10
A Few Thousand Translations Go a Long Way! Leveraging Pre-trained Models for African News Translation
Published 04-05-2022“…Recent advances in the pre-training of language models leverage large-scale datasets to create multilingual models. However, low-resource languages are mostly…”
Journal Article
11
Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets
Published 21-02-2022“…Transactions of the Association for Computational Linguistics (2022) 10: 50-72 With the success of large-scale pre-training and multilingual modeling in…”
Journal Article
12
Power-Control Design of Resonant Converters
Published 01-01-1999“…Novel design techniques are presented for load-resonant and quasi-resonant converters for use in, for example, arc-welding and fan-load power supplies. Both…”
Dissertation