Search Results - "Keskar, Nitish Shirish"

  1.

    Deep learning-enabled breast cancer hormonal receptor status determination from base-level H&E stains by Naik, Nikhil, Madani, Ali, Esteva, Andre, Keskar, Nitish Shirish, Press, Michael F., Ruderman, Daniel, Agus, David B., Socher, Richard

    Published in Nature communications (16-11-2020)
    “…For newly diagnosed breast cancer, estrogen receptor status (ERS) is a key molecular marker used for prognosis and treatment decisions. During clinical…”
    Journal Article
  2.

    Balancing Communication and Computation in Distributed Optimization by Berahas, Albert S., Bollapragada, Raghu, Keskar, Nitish Shirish, Wei, Ermin

    Published in IEEE transactions on automatic control (01-08-2019)
    “…Methods for distributed optimization have received significant attention in recent years owing to their wide applicability in various domains including machine…”
    Journal Article
  3.

    Limits of Detecting Text Generated by Large-Scale Language Models by Varshney, Lav R., Keskar, Nitish Shirish, Socher, Richard

    “…Some consider large-scale language models that can generate long and coherent pieces of text as dangerous, since they may be used in misinformation campaigns…”
    Conference Proceeding
  4.

    A nonmonotone learning rate strategy for SGD training of deep neural networks by Keskar, Nitish Shirish, Saon, George

    “…The algorithm of choice for cross-entropy training of deep neural network (DNN) acoustic models is mini-batch stochastic gradient descent (SGD). One of the…”
    Conference Proceeding
  5.

    Deep learning-enabled breast cancer hormonal receptor status determination from base-level H&E stains by Nikhil Naik, Ali Madani, Andre Esteva, Nitish Shirish Keskar, Michael F. Press, Daniel Ruderman, David B. Agus, Richard Socher

    Published in Nature communications (01-11-2020)
    “…Determination of estrogen receptor status (ERS) in breast cancer tissue requires immunohistochemistry, which is sensitive to the vagaries of sample processing…”
    Journal Article
  6.

    Second-Order Methods for Stochastic and Nonsmooth Optimization by Keskar, Nitish Shirish

    Published 2017
    “…The goal of this thesis is to design practical algorithms for nonlinear optimization in the case when the objective function is stochastic or nonsmooth. The…”
    Dissertation
  7.

    Improving Generalization Performance by Switching from Adam to SGD by Keskar, Nitish Shirish, Socher, Richard

    Published 20-12-2017
    “…Despite superior training outcomes, adaptive optimization methods such as Adam, Adagrad or RMSprop have been found to generalize poorly compared to Stochastic…”
    Journal Article
  8.

    Generating Negative Samples for Sequential Recommendation by Chen, Yongjun, Li, Jia, Liu, Zhiwei, Keskar, Nitish Shirish, Wang, Huan, McAuley, Julian, Xiong, Caiming

    Published 07-08-2022
    “…To make Sequential Recommendation (SR) successful, recent works focus on designing effective sequential encoders, fusing side information, and mining extra…”
    Journal Article
  9.

    Modeling Multi-hop Question Answering as Single Sequence Prediction by Yavuz, Semih, Hashimoto, Kazuma, Zhou, Yingbo, Keskar, Nitish Shirish, Xiong, Caiming

    Published 18-05-2022
    “…Fusion-in-decoder (Fid) (Izacard and Grave, 2020) is a generative question answering (QA) model that leverages passage retrieval with a pre-trained transformer…”
    Journal Article
  10.

    A Limited-Memory Quasi-Newton Algorithm for Bound-Constrained Nonsmooth Optimization by Keskar, Nitish Shirish, Waechter, Andreas

    Published 21-12-2016
    “…We consider the problem of minimizing a continuous function that may be nonsmooth and nonconvex, subject to bound constraints. We propose an algorithm that…”
    Journal Article
  11.

    Limits of Detecting Text Generated by Large-Scale Language Models by Varshney, Lav R, Keskar, Nitish Shirish, Socher, Richard

    Published 09-02-2020
    “…Some consider large-scale language models that can generate long and coherent pieces of text as dangerous, since they may be used in misinformation campaigns…”
    Journal Article
  12.

    An Analysis of Neural Language Modeling at Multiple Scales by Merity, Stephen, Keskar, Nitish Shirish, Socher, Richard

    Published 22-03-2018
    “…Many of the leading approaches in language modeling introduce novel, complex and specialized architectures. We take existing state-of-the-art word level…”
    Journal Article
  13.

    Pretrained AI Models: Performativity, Mobility, and Change by Varshney, Lav R, Keskar, Nitish Shirish, Socher, Richard

    Published 07-09-2019
    “…The paradigm of pretrained deep learning models has recently emerged in artificial intelligence practice, allowing deployment in numerous societal settings…”
    Journal Article
  14.

    Weighted Transformer Network for Machine Translation by Ahmed, Karim, Keskar, Nitish Shirish, Socher, Richard

    Published 06-11-2017
    “…State-of-the-art results on neural machine translation often use attentional sequence-to-sequence models with some form of convolution or recursion. Vaswani et…”
    Journal Article
  15.

    Unifying Question Answering, Text Classification, and Regression via Span Extraction by Keskar, Nitish Shirish, McCann, Bryan, Xiong, Caiming, Socher, Richard

    Published 19-04-2019
    “…Even as pre-trained language encoders such as BERT are shared across many tasks, the output layers of question answering, text classification, and regression…”
    Journal Article
  16.

    Regularizing and Optimizing LSTM Language Models by Merity, Stephen, Keskar, Nitish Shirish, Socher, Richard

    Published 07-08-2017
    “…Recurrent neural networks (RNNs), such as long short-term memory networks (LSTMs), serve as a fundamental building block for many sequence learning tasks,…”
    Journal Article
  17.

    Unsupervised Paraphrasing with Pretrained Language Models by Niu, Tong, Yavuz, Semih, Zhou, Yingbo, Keskar, Nitish Shirish, Wang, Huan, Xiong, Caiming

    Published 24-10-2020
    “…Paraphrase generation has benefited extensively from recent progress in the designing of training objectives and model architectures. However, previous…”
    Journal Article
  18.

    Coarse-grain Fine-grain Coattention Network for Multi-evidence Question Answering by Zhong, Victor, Xiong, Caiming, Keskar, Nitish Shirish, Socher, Richard

    Published 02-01-2019
    “…End-to-end neural models have made significant progress in question answering, however recent studies show that these models implicitly assume that the answer…”
    Journal Article
  19.

    Mirostat: A Neural Text Decoding Algorithm that Directly Controls Perplexity by Basu, Sourya, Ramachandran, Govardana Sachitanandam, Keskar, Nitish Shirish, Varshney, Lav R

    Published 29-07-2020
    “…Neural text decoding is important for generating high-quality texts using language models. To generate high-quality text, popular decoding algorithms like…”
    Journal Article
  20.

    A Closer Look at Deep Learning Heuristics: Learning rate restarts, Warmup and Distillation by Gotmare, Akhilesh, Keskar, Nitish Shirish, Xiong, Caiming, Socher, Richard

    Published 29-10-2018
    “…The convergence rate and final performance of common deep learning models have significantly benefited from heuristics such as learning rate schedules,…”
    Journal Article