Search Results - "Zhmoginov, Andrey"
1
Antimatter interferometry for gravity measurements
Published in Physical review letters (28-03-2014)“…We describe a light-pulse atom interferometer that is suitable for any species of atom and even for electrons and protons as well as their antiparticles, in…”
Journal Article -
2
MobileNetV2: Inverted Residuals and Linear Bottlenecks
Published in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (01-06-2018)“…In this paper we describe a new mobile architecture, MobileNetV2, that improves the state of the art performance of mobile models on multiple tasks and…”
Conference Proceeding -
3
Fine-tuning Image Transformers using Learnable Memory
Published in 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (01-06-2022)“…In this paper we propose augmenting Vision Transformer models with learnable memory tokens. Our approach allows the model to adapt to new tasks, using few…”
Conference Proceeding -
4
Decentralized Learning with Multi-Headed Distillation
Published in 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (01-06-2023)“…Decentralized learning with private data is a central problem in machine learning. We propose a novel distillation-based decentralized learning technique that…”
Conference Proceeding -
5
Continual HyperTransformer: A Meta-Learner for Continual Few-Shot Learning
Published 11-01-2023“…We focus on the problem of learning without forgetting from multiple tasks arriving sequentially, where each task is defined using a few-shot episode of novel…”
Journal Article -
6
HyperTransformer: Model Generation for Supervised and Semi-Supervised Few-Shot Learning
Published 11-01-2022“…In this work we propose a HyperTransformer, a Transformer-based model for supervised and semi-supervised few-shot learning that generates weights of a…”
Journal Article -
7
Compositional Models: Multi-Task Learning and Knowledge Transfer with Modular Networks
Published 22-07-2021“…Conditional computation and modular networks have been recently proposed for multitask learning and other problems as a way to decompose problem solving into…”
Journal Article -
8
Learning and Unlearning of Fabricated Knowledge in Language Models
Published 29-10-2024 “…ICML 2024 Workshop on Mechanistic Interpretability. What happens when a new piece of knowledge is introduced into the training data and how long does it last…”
Journal Article -
9
Non-Discriminative Data or Weak Model? On the Relative Importance of Data and Model Resolution
Published in 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW) (01-10-2019)“…We explore the question of how the resolution of the input image ("input resolution") affects the performance of a neural network when compared to the…”
Conference Proceeding -
10
MELODI: Exploring Memory Compression for Long Contexts
Published 04-10-2024“…We present MELODI, a novel memory architecture designed to efficiently process long documents using short context windows. The key principle behind MELODI is…”
Journal Article -
11
Training trajectories, mini-batch losses and the curious role of the learning rate
Published 05-01-2023“…Stochastic gradient descent plays a fundamental role in nearly all applications of deep learning. However its ability to converge to a global minimum remains…”
Journal Article -
12
Narrowing the Focus: Learned Optimizers for Pretrained Models
Published 17-08-2024“…In modern deep learning, the models are learned by applying gradient updates using an optimizer, which transforms the updates based on various statistics…”
Journal Article -
13
Decentralized Learning with Multi-Headed Distillation
Published 28-11-2022“…Decentralized learning with private data is a central problem in machine learning. We propose a novel distillation-based decentralized learning technique that…”
Journal Article -
14
Fine-tuning Image Transformers using Learnable Memory
Published 29-03-2022“…In this paper we propose augmenting Vision Transformer models with learnable memory tokens. Our approach allows the model to adapt to new tasks, using few…”
Journal Article -
15
Inverting face embeddings with convolutional neural networks
Published 13-06-2016“…Deep neural networks have dramatically advanced the state of the art for many areas of machine learning. Recently they have been shown to have a remarkable…”
Journal Article -
16
Transformers learn in-context by gradient descent
Published 15-12-2022“…At present, the mechanisms of in-context learning in Transformers are not well understood and remain mostly an intuition. In this paper, we suggest that…”
Journal Article -
17
Information-Bottleneck Approach to Salient Region Discovery
Published 22-07-2019“…We propose a new method for learning image attention masks in a semi-supervised setting based on the Information Bottleneck principle. Provided with a set of…”
Journal Article -
18
Resonant Wave-Particle Manipulation Techniques
Published 01-01-2012“…Charged particle dynamics can be altered considerably even by weak electromagnetic waves if some of the particles are in resonance. Depending on the wave…”
Dissertation -
19
Large-Scale Generative Data-Free Distillation
Published 10-12-2020“…Knowledge distillation is one of the most popular and effective techniques for knowledge transfer, model compression and semi-supervised learning. Most…”
Journal Article -
20
Non-discriminative data or weak model? On the relative importance of data and model resolution
Published 07-09-2019“…We explore the question of how the resolution of the input image ("input resolution") affects the performance of a neural network when compared to the…”
Journal Article