Search Results - "Zhmoginov, Andrey"

  1.

    Antimatter interferometry for gravity measurements by Hamilton, Paul, Zhmoginov, Andrey, Robicheaux, Francis, Fajans, Joel, Wurtele, Jonathan S, Müller, Holger

    Published in Physical Review Letters (28-03-2014)
    “…We describe a light-pulse atom interferometer that is suitable for any species of atom and even for electrons and protons as well as their antiparticles, in…”
    Journal Article
  2.

    MobileNetV2: Inverted Residuals and Linear Bottlenecks by Sandler, Mark, Howard, Andrew, Zhu, Menglong, Zhmoginov, Andrey, Chen, Liang-Chieh

    “…In this paper we describe a new mobile architecture, MobileNetV2, that improves the state of the art performance of mobile models on multiple tasks and…”
    Conference Proceeding
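
    The title above names the core building block: an inverted residual with a linear bottleneck. The following is a minimal PyTorch sketch of such a block; the expansion factor, channel sizes, and ReLU6 activations follow the commonly published MobileNetV2 design, but treat the exact configuration as illustrative rather than the paper's reference implementation.

    ```python
    # Minimal sketch of an inverted residual block with a linear bottleneck.
    # Channel sizes and the expansion factor below are example values.
    import torch
    import torch.nn as nn

    class InvertedResidual(nn.Module):
        def __init__(self, in_ch, out_ch, stride=1, expansion=6):
            super().__init__()
            hidden = in_ch * expansion
            self.use_residual = stride == 1 and in_ch == out_ch
            self.block = nn.Sequential(
                # 1x1 pointwise expansion to a wider representation
                nn.Conv2d(in_ch, hidden, 1, bias=False),
                nn.BatchNorm2d(hidden),
                nn.ReLU6(inplace=True),
                # 3x3 depthwise convolution in the expanded space
                nn.Conv2d(hidden, hidden, 3, stride=stride, padding=1,
                          groups=hidden, bias=False),
                nn.BatchNorm2d(hidden),
                nn.ReLU6(inplace=True),
                # 1x1 linear projection back down (no activation: the "linear bottleneck")
                nn.Conv2d(hidden, out_ch, 1, bias=False),
                nn.BatchNorm2d(out_ch),
            )

        def forward(self, x):
            out = self.block(x)
            return x + out if self.use_residual else out

    block = InvertedResidual(32, 32)
    print(block(torch.randn(1, 32, 56, 56)).shape)  # torch.Size([1, 32, 56, 56])
    ```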
  3.

    Fine-tuning Image Transformers using Learnable Memory by Sandler, Mark, Zhmoginov, Andrey, Vladymyrov, Max, Jackson, Andrew

    “…In this paper we propose augmenting Vision Transformer models with learnable memory tokens. Our approach allows the model to adapt to new tasks, using few…”
    Conference Proceeding
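
    The snippet above describes adapting a frozen Vision Transformer to new tasks by adding learnable memory tokens to its input. The sketch below illustrates the general idea of concatenating a few trainable tokens to the token sequence while the backbone stays frozen; the token count, injection point, and the generic encoder standing in for a ViT are assumptions, not the paper's exact scheme.

    ```python
    # Illustrative sketch: prepend trainable "memory" tokens to a frozen
    # transformer encoder's input sequence; only the new tokens are trained.
    import torch
    import torch.nn as nn

    class MemoryAugmentedEncoder(nn.Module):
        def __init__(self, encoder, embed_dim, num_memory=4):
            super().__init__()
            self.encoder = encoder                  # frozen backbone
            for p in self.encoder.parameters():
                p.requires_grad = False
            # Per-task trainable memory tokens.
            self.memory = nn.Parameter(0.02 * torch.randn(1, num_memory, embed_dim))

        def forward(self, tokens):                  # tokens: (B, N, D) patch embeddings
            mem = self.memory.expand(tokens.size(0), -1, -1)
            return self.encoder(torch.cat([mem, tokens], dim=1))

    # A generic transformer encoder stands in here for a pretrained ViT backbone.
    layer = nn.TransformerEncoderLayer(d_model=192, nhead=3, batch_first=True)
    model = MemoryAugmentedEncoder(nn.TransformerEncoder(layer, num_layers=2), 192)
    print(model(torch.randn(8, 196, 192)).shape)    # torch.Size([8, 200, 192])
    ```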
  4.

    Decentralized Learning with Multi-Headed Distillation by Zhmoginov, Andrey, Sandler, Mark, Miller, Nolan, Kristiansen, Gus, Vladymyrov, Max

    “…Decentralized learning with private data is a central problem in machine learning. We propose a novel distillation-based decentralized learning technique that…”
    Conference Proceeding
  5.

    Continual HyperTransformer: A Meta-Learner for Continual Few-Shot Learning by Vladymyrov, Max, Zhmoginov, Andrey, Sandler, Mark

    Published 11-01-2023
    “…We focus on the problem of learning without forgetting from multiple tasks arriving sequentially, where each task is defined using a few-shot episode of novel…”
    Journal Article
  6.

    HyperTransformer: Model Generation for Supervised and Semi-Supervised Few-Shot Learning by Zhmoginov, Andrey, Sandler, Mark, Vladymyrov, Max

    Published 11-01-2022
    “…In this work we propose a HyperTransformer, a Transformer-based model for supervised and semi-supervised few-shot learning that generates weights of a…”
    Journal Article
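
    The snippet above describes a Transformer that generates the weights of a target network directly from a few-shot episode. The toy sketch below conveys only the general hypernetwork idea, encoding the support-set features with a Transformer and mapping a pooled summary to the parameters of a small linear classifier; the target architecture, pooling, and shapes are illustrative assumptions, not the paper's construction.

    ```python
    # Toy hypernetwork-style sketch: a Transformer encodes support features and
    # emits the weights of a per-episode linear classifier. Shapes are illustrative.
    import torch
    import torch.nn as nn

    class ToyWeightGenerator(nn.Module):
        def __init__(self, feat_dim=64, n_classes=5):
            super().__init__()
            self.feat_dim, self.n_classes = feat_dim, n_classes
            layer = nn.TransformerEncoderLayer(d_model=feat_dim, nhead=4, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=2)
            # Emit an (n_classes x feat_dim) weight matrix plus biases.
            self.to_params = nn.Linear(feat_dim, n_classes * feat_dim + n_classes)

        def forward(self, support_feats):           # (B, shots * n_classes, feat_dim)
            pooled = self.encoder(support_feats).mean(dim=1)
            params = self.to_params(pooled)
            w = params[:, : self.n_classes * self.feat_dim]
            b = params[:, self.n_classes * self.feat_dim :]
            return w.view(-1, self.n_classes, self.feat_dim), b

    gen = ToyWeightGenerator()
    w, b = gen(torch.randn(2, 25, 64))              # features of a 5-way 5-shot episode
    query = torch.randn(2, 10, 64)                  # query-set features
    logits = torch.einsum("bqd,bcd->bqc", query, w) + b.unsqueeze(1)
    print(logits.shape)                             # torch.Size([2, 10, 5])
    ```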
  7.

    Compositional Models: Multi-Task Learning and Knowledge Transfer with Modular Networks by Zhmoginov, Andrey, Bashkirova, Dina, Sandler, Mark

    Published 22-07-2021
    “…Conditional computation and modular networks have been recently proposed for multitask learning and other problems as a way to decompose problem solving into…”
    Journal Article
  8.

    Learning and Unlearning of Fabricated Knowledge in Language Models by Sun, Chen, Miller, Nolan Andrew, Zhmoginov, Andrey, Vladymyrov, Max, Sandler, Mark

    Published 29-10-2024
    ICML 2024 Workshop on Mechanistic Interpretability
    “…What happens when a new piece of knowledge is introduced into the training data and how long does it last…”
    Journal Article
  9.

    Non-Discriminative Data or Weak Model? On the Relative Importance of Data and Model Resolution by Sandler, Mark, Baccash, Jonathan, Zhmoginov, Andrey, Howard, Andrew

    “…We explore the question of how the resolution of the input image ("input resolution") affects the performance of a neural network when compared to the…”
    Conference Proceeding
  10.

    MELODI: Exploring Memory Compression for Long Contexts by Chen, Yinpeng, Hutchins, DeLesley, Jansen, Aren, Zhmoginov, Andrey, Racz, David, Andersen, Jesper

    Published 04-10-2024
    “…We present MELODI, a novel memory architecture designed to efficiently process long documents using short context windows. The key principle behind MELODI is…”
    Journal Article
  11.

    Training trajectories, mini-batch losses and the curious role of the learning rate by Sandler, Mark, Zhmoginov, Andrey, Vladymyrov, Max, Miller, Nolan

    Published 05-01-2023
    “…Stochastic gradient descent plays a fundamental role in nearly all applications of deep learning. However its ability to converge to a global minimum remains…”
    Journal Article
  12.

    Narrowing the Focus: Learned Optimizers for Pretrained Models by Kristiansen, Gus, Sandler, Mark, Zhmoginov, Andrey, Miller, Nolan, Goyal, Anirudh, Lee, Jihwan, Vladymyrov, Max

    Published 17-08-2024
    “…In modern deep learning, the models are learned by applying gradient updates using an optimizer, which transforms the updates based on various statistics…”
    Journal Article
  13.

    Decentralized Learning with Multi-Headed Distillation by Zhmoginov, Andrey, Sandler, Mark, Miller, Nolan, Kristiansen, Gus, Vladymyrov, Max

    Published 28-11-2022
    “…Decentralized learning with private data is a central problem in machine learning. We propose a novel distillation-based decentralized learning technique that…”
    Journal Article
  14.

    Fine-tuning Image Transformers using Learnable Memory by Sandler, Mark, Zhmoginov, Andrey, Vladymyrov, Max, Jackson, Andrew

    Published 29-03-2022
    “…In this paper we propose augmenting Vision Transformer models with learnable memory tokens. Our approach allows the model to adapt to new tasks, using few…”
    Journal Article
  15.

    Inverting face embeddings with convolutional neural networks by Zhmoginov, Andrey, Sandler, Mark

    Published 13-06-2016
    “…Deep neural networks have dramatically advanced the state of the art for many areas of machine learning. Recently they have been shown to have a remarkable…”
    Journal Article
  16.

    Transformers learn in-context by gradient descent by von Oswald, Johannes, Niklasson, Eyvind, Randazzo, Ettore, Sacramento, João, Mordvintsev, Alexander, Zhmoginov, Andrey, Vladymyrov, Max

    Published 15-12-2022
    “…At present, the mechanisms of in-context learning in Transformers are not well understood and remain mostly an intuition. In this paper, we suggest that…”
    Journal Article
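
    The snippet above points to a correspondence between in-context learning and gradient descent. For least-squares regression, one gradient step on the in-context examples changes the query prediction by a term with the shape of linear attention over those examples (per-example errors weighted by dot products with the query). The short check below verifies that identity numerically; it illustrates the underlying algebra only, not the paper's transformer construction.

    ```python
    # Numerical check: the prediction after one gradient step on
    # L(W) = 0.5 * sum_i ||W x_i - y_i||^2  equals the original prediction
    # plus an attention-like correction  -eta * sum_i (W x_i - y_i) * <x_i, x_q>.
    import numpy as np

    rng = np.random.default_rng(0)
    d_in, d_out, n, eta = 8, 3, 16, 0.05

    X = rng.normal(size=(n, d_in))          # in-context inputs
    Y = rng.normal(size=(n, d_out))         # in-context targets
    W = rng.normal(size=(d_out, d_in))      # initial linear model
    x_q = rng.normal(size=d_in)             # query input

    # One gradient-descent step, then predict on the query.
    grad = (X @ W.T - Y).T @ X              # sum_i (W x_i - y_i) x_i^T
    pred_gd = (W - eta * grad) @ x_q

    # Equivalent linear-attention form of the same prediction.
    errors = X @ W.T - Y                    # (n, d_out), row i = (W x_i - y_i)^T
    pred_attn = W @ x_q - eta * errors.T @ (X @ x_q)

    print(np.allclose(pred_gd, pred_attn))  # True
    ```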
  17.

    Information-Bottleneck Approach to Salient Region Discovery by Zhmoginov, Andrey, Fischer, Ian, Sandler, Mark

    Published 22-07-2019
    “…We propose a new method for learning image attention masks in a semi-supervised setting based on the Information Bottleneck principle. Provided with a set of…”
    Journal Article
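
    The snippet above appeals to the Information Bottleneck principle. For reference, the standard IB objective trades off how informative a learned representation Z is about the target Y against how much it retains about the input X; the paper's specific variational objective for attention masks may differ.

    ```latex
    % Generic Information Bottleneck Lagrangian (not necessarily the paper's exact objective):
    % keep information about the label Y while compressing away detail about the input X.
    \max_{p(z \mid x)} \; I(Z; Y) - \beta \, I(Z; X), \qquad \beta > 0
    ```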
  18.

    Resonant Wave-Particle Manipulation Techniques by Zhmoginov, Andrey I

    Published 01-01-2012
    “…Charged particle dynamics can be altered considerably even by weak electromagnetic waves if some of the particles are in resonance. Depending on the wave…”
    Dissertation
  19.

    Large-Scale Generative Data-Free Distillation by Luo, Liangchen, Sandler, Mark, Lin, Zi, Zhmoginov, Andrey, Howard, Andrew

    Published 10-12-2020
    “…Knowledge distillation is one of the most popular and effective techniques for knowledge transfer, model compression and semi-supervised learning. Most…”
    Journal Article
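
    The snippet above opens with a general statement about knowledge distillation. Below is a minimal sketch of the standard distillation objective, a temperature-softened KL term between teacher and student predictions mixed with the usual cross-entropy; this is the generic recipe, not the data-free generative pipeline this paper proposes (nor the multi-headed decentralized variant listed earlier).

    ```python
    # Generic knowledge-distillation loss sketch (not this paper's data-free method):
    # KL between temperature-softened teacher/student distributions + cross-entropy.
    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
        soft = F.kl_div(
            F.log_softmax(student_logits / T, dim=-1),
            F.softmax(teacher_logits / T, dim=-1),
            reduction="batchmean",
        ) * (T * T)                                 # rescale to keep gradients comparable
        hard = F.cross_entropy(student_logits, labels)
        return alpha * soft + (1 - alpha) * hard

    student = torch.randn(32, 10, requires_grad=True)
    teacher = torch.randn(32, 10)
    labels = torch.randint(0, 10, (32,))
    print(distillation_loss(student, teacher, labels).item())
    ```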
  20.

    Non-discriminative data or weak model? On the relative importance of data and model resolution by Sandler, Mark, Baccash, Jonathan, Zhmoginov, Andrey, Howard, Andrew

    Published 07-09-2019
    “…We explore the question of how the resolution of the input image ("input resolution") affects the performance of a neural network when compared to the…”
    Journal Article