Vera Verto: Multimodal Hijacking Attack
Format: Journal Article
Language: English
Published: 31-07-2024
Online Access: Get full text
Summary: The increasing cost of training machine learning (ML) models has led to the inclusion of new parties in the training pipeline, such as users who contribute training data and companies that provide computing resources. The involvement of these new parties in the ML training process has introduced new attack surfaces for an adversary to exploit. A recent attack in this domain is the model hijacking attack, whereby an adversary hijacks a victim model to implement their own, possibly malicious, hijacking tasks. However, the scope of the model hijacking attack has so far been limited to homogeneous-modality tasks. In this paper, we extend the model hijacking attack to a more general multimodal setting, where the hijacking and original tasks are performed on data of different modalities. Specifically, we focus on the setting where an adversary embeds a natural language processing (NLP) hijacking task into an image classification model. To mount the attack, we propose a novel encoder-decoder-based framework, the Blender, which relies on advanced image and language models. Experimental results show that our multimodal hijacking attack achieves strong performance in different settings. For instance, it achieves attack success rates of 94%, 94%, and 95% when using the Sogou News dataset to hijack STL10, CIFAR-10, and MNIST classifiers, respectively.
DOI: 10.48550/arxiv.2408.00129
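The record does not detail the Blender's actual design, so the following is only a minimal conceptual sketch of the idea the summary describes: an encoder maps text from the hijacking task into image-shaped inputs for the hijacked classifier, and a decoder reinterprets the classifier's outputs as NLP labels. All class names, shapes, and vocabulary sizes below are assumptions for illustration, not the paper's method.

```python
# Hypothetical sketch of an encoder-decoder wrapper around a hijacked image
# classifier. Shapes, names, and the simple embedding/projection scheme are
# assumptions; the paper's Blender relies on advanced image and language models.
import math
import torch
import torch.nn as nn

class TextToImageEncoder(nn.Module):
    """Maps tokenized text into an image-shaped tensor the victim model accepts."""
    def __init__(self, vocab_size=30000, embed_dim=128, image_shape=(3, 32, 32)):
        super().__init__()
        self.image_shape = image_shape
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.proj = nn.Linear(embed_dim, math.prod(image_shape))

    def forward(self, token_ids):                    # (batch, seq_len), int64
        pooled = self.embed(token_ids).mean(dim=1)   # (batch, embed_dim)
        return self.proj(pooled).view(-1, *self.image_shape)

class LabelDecoder(nn.Module):
    """Maps the victim classifier's logits back to hijacking-task (text) labels."""
    def __init__(self, num_image_classes=10, num_text_classes=5):
        super().__init__()
        self.map = nn.Linear(num_image_classes, num_text_classes)

    def forward(self, image_logits):                 # (batch, num_image_classes)
        return self.map(image_logits)

# Usage sketch: the hijacked image classifier (here a stand-in linear model) is
# queried with encoded text; the decoder reads its output as NLP predictions.
encoder, decoder = TextToImageEncoder(), LabelDecoder()
victim = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
tokens = torch.randint(0, 30000, (4, 20))            # batch of 4 token sequences
text_logits = decoder(victim(encoder(tokens)))       # (4, num_text_classes)
```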