NoticIA: A Clickbait Article Summarization Dataset in Spanish
Main Authors: | , |
---|---|
Format: | Journal Article |
Language: | English |
Published: | 11-04-2024 |
Summary: | We present NoticIA, a dataset consisting of 850 Spanish news articles
featuring prominent clickbait headlines, each paired with high-quality,
single-sentence generative summarizations written by humans. This task demands
advanced text understanding and summarization abilities, challenging the
models' capacity to infer and connect diverse pieces of information to meet the
user's informational needs generated by the clickbait headline. We evaluate the
Spanish text comprehension capabilities of a wide range of state-of-the-art
large language models. Additionally, we use the dataset to train
ClickbaitFighter, a task-specific model that achieves near-human performance in
this task. |
---|---|
DOI: | 10.48550/arxiv.2404.07611 |