Search Results - "Ryabinin, Max"

  1. Hypernymy Understanding Evaluation of Text-to-Image Models via WordNet Hierarchy by Baryshnikov, Anton; Ryabinin, Max
     Published 13-10-2023
     “…Text-to-image synthesis has recently attracted widespread attention due to rapidly improving quality and numerous practical applications. However, the language…”
     Journal Article

  2. It's All in the Heads: Using Attention Heads as a Baseline for Cross-Lingual Transfer in Commonsense Reasoning by Tikhonov, Alexey; Ryabinin, Max
     Published 22-06-2021
     “…Commonsense reasoning is one of the key problems in natural language processing, but the relative scarcity of labeled data holds back the progress for…”
     Journal Article

  3. Mind Your Format: Towards Consistent Evaluation of In-Context Learning Improvements by Voronov, Anton; Wolf, Lena; Ryabinin, Max
     Published 12-01-2024
     “…Large language models demonstrate a remarkable capability for learning to solve new tasks from a few examples. The prompt template, or the way the input…”
     Journal Article

  4. Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts by Ryabinin, Max; Gusev, Anton
     Published 10-02-2020
     Advances in Neural Information Processing Systems 33 (2020) 3659-3672
     “…Many recent breakthroughs in deep learning were achieved by training increasingly larger…”
     Journal Article

  5. Multilingual Pretraining Using a Large Corpus Machine-Translated from a Single Source Language by Wang, Jiayi; Lu, Yao; Weber, Maurice; Ryabinin, Max; Chen, Yihong; Tang, Raphael; Stenetorp, Pontus
     Published 31-10-2024
     “…English, as a very high-resource language, enables the pretraining of high-quality large language models (LLMs). The same cannot be said for most other…”
     Journal Article

  6. Is This Loss Informative? Faster Text-to-Image Customization by Tracking Objective Dynamics by Voronov, Anton; Khoroshikh, Mikhail; Babenko, Artem; Ryabinin, Max
     Published 09-02-2023
     “…Text-to-image generation models represent the next step of evolution in image synthesis, offering a natural way to achieve flexible yet fine-grained control…”
     Journal Article

  7. SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient by Ryabinin, Max; Dettmers, Tim; Diskin, Michael; Borzunov, Alexander
     Published 27-01-2023
     “…Many deep learning applications benefit from using large models with billions of parameters. Training these models is notoriously expensive due to the need for…”
     Journal Article

  8. Scaling Ensemble Distribution Distillation to Many Classes with Proxy Targets by Ryabinin, Max; Malinin, Andrey; Gales, Mark
     Published 14-05-2021
     “…Ensembles of machine learning models yield improved system performance as well as robust and interpretable uncertainty estimates; however, their inference…”
     Journal Article

  9. SpecExec: Massively Parallel Speculative Decoding for Interactive LLM Inference on Consumer Devices by Svirschevski, Ruslan; May, Avner; Chen, Zhuoming; Chen, Beidi; Jia, Zhihao; Ryabinin, Max
     Published 04-06-2024
     “…As large language models gain widespread adoption, running them efficiently becomes crucial. Recent works on LLM inference use speculative decoding to achieve…”
     Journal Article

  10. Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding by Chen, Zhuoming; May, Avner; Svirschevski, Ruslan; Huang, Yuhsun; Ryabinin, Max; Jia, Zhihao; Chen, Beidi
      Published 19-02-2024
      “…As the usage of large language models (LLMs) grows, performing efficient inference with these models becomes increasingly important. While speculative decoding…”
      Journal Article

  11. Distributed Inference and Fine-tuning of Large Language Models Over The Internet by Borzunov, Alexander; Ryabinin, Max; Chumachenko, Artem; Baranchuk, Dmitry; Dettmers, Tim; Belkada, Younes; Samygin, Pavel; Raffel, Colin
      Published 13-12-2023
      “…Large language models (LLMs) are useful in many NLP tasks and become more capable with size, with the best open-source models having over 50 billion…”
      Journal Article

  12. Distributed Methods with Compressed Communication for Solving Variational Inequalities, with Theoretical Guarantees by Beznosikov, Aleksandr; Richtárik, Peter; Diskin, Michael; Ryabinin, Max; Gasnikov, Alexander
      Published 07-10-2021
      https://proceedings.neurips.cc/paper_files/paper/2022/hash/5ac1428c23b5da5e66d029646ea3206d-Abstract-Conference.html
      “…Variational inequalities in general and…”
      Journal Article

  13. Secure Distributed Training at Scale by Gorbunov, Eduard; Borzunov, Alexander; Diskin, Michael; Ryabinin, Max
      Published 21-06-2021
      “…Many areas of deep learning benefit from using increasingly larger neural networks trained on public data, as is the case for pre-trained models for NLP and…”
      Journal Article

  14. Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices by Ryabinin, Max; Gorbunov, Eduard; Plokhotnyuk, Vsevolod; Pekhimenko, Gennady
      Published 04-03-2021
      “…Training deep neural networks on large datasets can often be accelerated by using multiple compute nodes. This approach, known as distributed training, can…”
      Journal Article

  15. RuCoLA: Russian Corpus of Linguistic Acceptability by Mikhailov, Vladislav; Shamardina, Tatiana; Ryabinin, Max; Pestova, Alena; Smurov, Ivan; Artemova, Ekaterina
      Published 23-10-2022
      “…Linguistic acceptability (LA) attracts the attention of the research community due to its many uses, such as testing the grammatical knowledge of language…”
      Journal Article

  16. Petals: Collaborative Inference and Fine-tuning of Large Models by Borzunov, Alexander; Baranchuk, Dmitry; Dettmers, Tim; Ryabinin, Max; Belkada, Younes; Chumachenko, Artem; Samygin, Pavel; Raffel, Colin
      Published 02-09-2022
      “…Many NLP tasks benefit from using large language models (LLMs) that often have more than 100 billion parameters. With the release of BLOOM-176B and OPT-175B,…”
      Journal Article

  17. Training Transformers Together by Borzunov, Alexander; Ryabinin, Max; Dettmers, Tim; Lhoest, Quentin; Saulnier, Lucile; Diskin, Michael; Jernite, Yacine; Wolf, Thomas
      Published 07-07-2022
      “…The infrastructure necessary for training state-of-the-art models is becoming overly expensive, which makes training such models affordable only to large…”
      Journal Article

  18. The Hallucinations Leaderboard -- An Open Effort to Measure Hallucinations in Large Language Models by Hong, Giwon; Gema, Aryo Pradipta; Saxena, Rohit; Du, Xiaotang; Nie, Ping; Zhao, Yu; Perez-Beltrachini, Laura; Ryabinin, Max; He, Xuanli; Fourrier, Clémentine; Minervini, Pasquale
      Published 08-04-2024
      “…Large Language Models (LLMs) have transformed the Natural Language Processing (NLP) landscape with their remarkable ability to understand and generate…”
      Journal Article

  19. Embedding Words in Non-Vector Space with Unsupervised Graph Learning by Ryabinin, Max; Popov, Sergei; Prokhorenkova, Liudmila; Voita, Elena
      Published 06-10-2020
      “…It has become a de-facto standard to represent words as elements of a vector space (word2vec, GloVe). While this approach is convenient, it is unnatural for…”
      Journal Article

  20. FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU by Sheng, Ying; Zheng, Lianmin; Yuan, Binhang; Li, Zhuohan; Ryabinin, Max; Fu, Daniel Y; Xie, Zhiqiang; Chen, Beidi; Barrett, Clark; Gonzalez, Joseph E; Liang, Percy; Ré, Christopher; Stoica, Ion; Zhang, Ce
      Published 13-03-2023
      “…The high computational and memory requirements of large language model (LLM) inference make it feasible only with multiple high-end accelerators. Motivated by…”
      Journal Article