Multilingual Hallucination Gaps in Large Language Models
Main Authors:
Format: Journal Article
Language: English
Published: 23-10-2024
Subjects:
Online Access: Get full text
Summary: Large language models (LLMs) are increasingly used as alternatives to traditional search engines given their capacity to generate text that resembles human language. However, this shift is concerning, as LLMs often generate hallucinations: misleading or false information that appears highly credible. In this study, we explore the phenomenon of hallucinations across multiple languages in freeform text generation, focusing on what we call multilingual hallucination gaps. These gaps reflect differences in the frequency of hallucinated answers depending on the prompt and language used. To quantify such hallucinations, we used the FactScore metric and extended its framework to a multilingual setting. We conducted experiments using LLMs from the LLaMA, Qwen, and Aya families, generating biographies in 19 languages and comparing the results to Wikipedia pages. Our results reveal variations in hallucination rates, especially between high- and low-resource languages, raising important questions about LLM multilingual performance and the challenges in evaluating hallucinations in multilingual freeform text generation.
DOI: 10.48550/arxiv.2410.18270
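
The summary describes quantifying hallucinations with a FactScore-style metric: the fraction of atomic facts in a generated biography that a reference text (e.g., the subject's Wikipedia page) supports. The sketch below illustrates only that scoring step, assuming atomic facts have already been extracted and a support-checking function is supplied; the function names and the toy substring check are illustrative assumptions, not the paper's implementation.

```python
from typing import Callable, List


def factscore(
    atomic_facts: List[str],
    reference_text: str,
    is_supported: Callable[[str, str], bool],
) -> float:
    """Return the fraction of atomic facts supported by the reference text."""
    if not atomic_facts:
        return 0.0
    supported = sum(1 for fact in atomic_facts if is_supported(fact, reference_text))
    return supported / len(atomic_facts)


# Toy usage with a naive word-overlap support check. A real pipeline would
# extract atomic facts with an LLM and verify each against the subject's
# Wikipedia page using an NLI model or LLM judge, per language.
facts = [
    "Marie Curie was born in Warsaw.",
    "Marie Curie won three Nobel Prizes.",  # unsupported claim
]
wiki = "Marie Curie, born in Warsaw, was awarded the Nobel Prize twice."


def naive_check(fact: str, reference: str) -> bool:
    # Treat a fact as supported if every word of the fact appears in the reference.
    words = [w.strip(".,").lower() for w in fact.split()]
    return all(w in reference.lower() for w in words)


print(factscore(facts, wiki, naive_check))  # 0.5: one of the two facts is supported
```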