Search Results - "Fei, Zhaoye"
1
Balanced Data Sampling for Language Model Training with Clustering
Published 22-02-2024 “…Data plays a fundamental role in the training of Large Language Models (LLMs). While attention has been paid to the collection and composition of datasets,…”
Journal Article
2
Query of CC: Unearthing Large Scale Domain-Specific Knowledge from Public Corpora
Published 25-01-2024 “…Large language models have demonstrated remarkable potential in various tasks, however, there remains a significant scarcity of open-source models and data for…”
Journal Article
3
Turn Waste into Worth: Rectifying Top-$k$ Router of MoE
Published 17-02-2024 “…Sparse Mixture of Experts (MoE) models are popular for training large language models due to their computational efficiency. However, the commonly used top-$k$…”
Journal Article
4
Pre-training for Information Retrieval: Are Hyperlinks Fully Explored?
Published 14-09-2022 “…Recent years have witnessed great progress on applying pre-trained language models, e.g., BERT, to information retrieval (IR) tasks. Hyperlinks, which are…”
Journal Article
5
Coarse-to-Fine: Hierarchical Multi-task Learning for Natural Language Understanding
Published 18-08-2022 “…Generalized text representations are the foundation of many natural language understanding tasks. To fully utilize the different corpus, it is inevitable that…”
Journal Article
6
WanJuan-CC: A Safe and High-Quality Open-sourced English Webtext Dataset
Published 29-02-2024 “…This paper presents WanJuan-CC, a safe and high-quality open-sourced English webtext dataset derived from Common Crawl data. The study addresses the challenges…”
Journal Article
7
InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning
Published 09-02-2024 “…The math abilities of large language models can represent their abstract reasoning ability. In this paper, we introduce and open-source our math reasoning LLMs…”
Journal Article
8
Towards More Effective and Economic Sparsely-Activated Model
Published 14-10-2021 “…The sparsely-activated models have achieved great success in natural language processing through large-scale parameters and relatively low computational cost,…”
Journal Article
9
InternLM2 Technical Report
Published 25-03-2024 “…The evolution of Large Language Models (LLMs) like ChatGPT and GPT-4 has sparked discussions on the advent of Artificial General Intelligence (AGI). However,…”
Journal Article