Minimax and Neyman-Pearson Meta-Learning for Outlier Languages
Main Authors: | |
---|---|
Format: | Journal Article |
Language: | English |
Published: | 02-06-2021 |
Summary: | Model-agnostic meta-learning (MAML) has recently been put forth as a strategy to learn resource-poor languages in a sample-efficient fashion. Nevertheless, the properties of these languages are often not well represented by those available during training. Hence, we argue that the i.i.d. assumption ingrained in MAML makes it ill-suited for cross-lingual NLP. In fact, under a decision-theoretic framework, MAML can be interpreted as minimising the expected risk across training languages (with a uniform prior), which is known as the Bayes criterion. To increase its robustness to outlier languages, we create two variants of MAML based on alternative criteria: Minimax MAML reduces the maximum risk across languages, while Neyman-Pearson MAML constrains the risk in each language to a maximum threshold. Both criteria constitute fully differentiable two-player games. In light of this, we propose a new adaptive optimiser solving for a local approximation to their Nash equilibrium. We evaluate both model variants on two popular NLP tasks, part-of-speech tagging and question answering. We report gains in their average and minimum performance across low-resource languages in zero- and few-shot settings, compared to joint multi-source transfer and vanilla MAML. (The decision criteria and the minimax game are sketched informally after this record.) |
---|---|
DOI: | 10.48550/arxiv.2106.01051 |
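
The three decision criteria mentioned in the summary can be stated informally as follows. The notation (the set of training languages $\mathcal{L}$, the post-adaptation risk $\mathcal{R}_\ell(\theta)$ of meta-parameters $\theta$ in language $\ell$, and the thresholds $\varepsilon_\ell$) is ours, not the paper's; it is a plausible reading of the abstract rather than the authors' exact formulation.

$$
\theta^\star_{\text{Bayes}} \;=\; \arg\min_{\theta} \; \frac{1}{|\mathcal{L}|} \sum_{\ell \in \mathcal{L}} \mathcal{R}_\ell(\theta)
\qquad \text{(vanilla MAML: expected risk under a uniform prior over languages)}
$$

$$
\theta^\star_{\text{Minimax}} \;=\; \arg\min_{\theta} \; \max_{\ell \in \mathcal{L}} \mathcal{R}_\ell(\theta)
\;=\; \arg\min_{\theta} \; \max_{w \in \Delta^{|\mathcal{L}|-1}} \sum_{\ell \in \mathcal{L}} w_\ell \, \mathcal{R}_\ell(\theta)
$$

$$
\theta^\star_{\text{NP}} \;=\; \arg\min_{\theta} \; \frac{1}{|\mathcal{L}|} \sum_{\ell \in \mathcal{L}} \mathcal{R}_\ell(\theta)
\quad \text{subject to} \quad \mathcal{R}_\ell(\theta) \le \varepsilon_\ell \;\; \forall \ell \in \mathcal{L}
$$

The second identity for the minimax criterion (maximisation over the probability simplex $\Delta^{|\mathcal{L}|-1}$) is what makes it readable as a differentiable two-player game between the model parameters and the language weights; the Neyman-Pearson case can be read the same way once the per-language constraints are absorbed into Lagrange multipliers.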
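Below is a minimal, self-contained sketch of how the minimax criterion behaves as a two-player game, assuming generic exponentiated-gradient dynamics for the adversary. This is not the adaptive optimiser proposed in the paper: the toy quadratic risks, the variable names (`centres`, `eta_theta`, `eta_w`), and the step sizes are illustrative stand-ins for post-adaptation MAML risks and their gradients.

```python
# Minimax risk minimisation as a two-player game:
# the model player descends on a weighted sum of per-language risks, while an
# adversary ascends over the probability simplex via exponentiated gradients,
# shifting mass toward the worst-performing (outlier) languages.
import numpy as np

rng = np.random.default_rng(0)

# Toy "per-language" risks: R_l(theta) = ||theta - c_l||^2, one centre per language.
centres = rng.normal(size=(5, 3))          # 5 languages, 3 model parameters

def risks(theta):
    return np.sum((theta - centres) ** 2, axis=1)   # shape: (num_languages,)

def risk_grads(theta):
    return 2.0 * (theta - centres)                  # shape: (num_languages, dim)

theta = np.zeros(3)                 # model player's parameters
w = np.full(5, 1.0 / 5)             # adversary's language weights (uniform = Bayes)
eta_theta, eta_w = 0.05, 0.5        # step sizes for the two players

for step in range(500):
    r = risks(theta)
    # Adversary: exponentiated-gradient ascent keeps w on the simplex.
    w = w * np.exp(eta_w * r)
    w /= w.sum()
    # Model: gradient descent on the adversarially weighted risk.
    theta -= eta_theta * (w @ risk_grads(theta))

print("max risk:", risks(theta).max(), "mean risk:", risks(theta).mean())
```

Compared with uniform (Bayes) weighting, the adversarial weights drift toward whichever toy "language" currently incurs the highest risk, which is the mechanism that protects outlier languages under the minimax criterion.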