Search Results - "Glasgow, Margalit"

  • Showing 1 - 20 results of 20
  1.
  2.

    Approximate Gradient Coding with Optimal Decoding by Glasgow, Margalit, Wootters, Mary

    “…Gradient codes use data replication to mitigate the effect of straggling machines in distributed machine learning. Approximate gradient codes consider codes…”
    Get full text
    Journal Article
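A minimal sketch of the data-replication idea in the snippet above, assuming numpy: an illustrative fractional-repetition assignment with naive averaging over surviving replicas, not necessarily the paper's construction or its optimal decoder.

```python
import numpy as np

# Illustrative gradient coding via replication: each data block is held by
# r workers; the server averages whatever blocks survive the stragglers.
rng = np.random.default_rng(0)
n_workers, n_blocks, r, dim = 6, 6, 2, 4

# Block i is held by workers i, i+1, ..., i+r-1 (mod n_workers).
assignment = [{(i + j) % n_workers for j in range(r)} for i in range(n_blocks)]

# Hypothetical per-block gradients (in practice, computed from the data).
block_grads = rng.normal(size=(n_blocks, dim))
full_grad = block_grads.mean(axis=0)

stragglers = {2, 5}                     # workers that never respond
alive = set(range(n_workers)) - stragglers

# Naive approximate decoding: keep each block with a surviving replica.
recovered = [block_grads[i] for i in range(n_blocks) if assignment[i] & alive]
approx_grad = np.mean(recovered, axis=0)
print("decoding error:", np.linalg.norm(approx_grad - full_grad))
```

With r-fold replication a block is lost only if all r of its holders straggle, so the decoded gradient degrades gracefully as workers drop out.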
  3.

    On the Rank, Kernel, and Core of Sparse Random Graphs by DeMichele, Patrick, Glasgow, Margalit, Moreira, Alexander

    Published in Random Structures & Algorithms (01-12-2024)
    “…We study the rank of the adjacency matrix $A$ of a random Erdős–Rényi graph $G \sim \mathbb{G}(n,p)$. It is well known that when $p = (\log(n) - \omega(1))/n$,…”
    Get full text
    Journal Article
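A quick numerical companion to the snippet above, assuming only numpy; isolated vertices are one elementary source of rank deficiency, though the paper's analysis is much finer.

```python
import numpy as np

# Sample G ~ G(n, p) below the connectivity threshold and compare rank(A)
# with n minus the number of isolated vertices.
rng = np.random.default_rng(1)
n = 200
p = np.log(n) / (2 * n)                 # (log n - omega(1)) / n regime

U = rng.random((n, n)) < p
A = np.triu(U, 1)
A = (A | A.T).astype(float)             # symmetric 0/1 adjacency, zero diagonal

isolated = int((A.sum(axis=1) == 0).sum())
print("rank(A):", np.linalg.matrix_rank(A), "| n - isolated:", n - isolated)
```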
  4.

    Feature Learning in Neural Networks and Other Stochastic Explorations by Glasgow, Margalit

    Published 01-01-2024
    “…Recent years have empirically demonstrated the unprecedented success of deep learning. Yet our theoretical understanding of why gradient descent succeeds in…”
    Get full text
    Dissertation
  5.

    SGD Finds then Tunes Features in Two-Layer Neural Networks with near-Optimal Sample Complexity: A Case Study in the XOR problem by Glasgow, Margalit

    Published 26-09-2023
    “…In this work, we consider the optimization process of minibatch stochastic gradient descent (SGD) on a 2-layer neural network with data separated by a…”
    Get full text
    Journal Article
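An illustrative toy version of the setup named in the title, with choices of mine that may differ from the paper's model: squared loss, a fixed random second layer, only the first layer trained, and labels given by the product of two sign coordinates (the XOR structure); hyperparameters may need tuning.

```python
import numpy as np

# Minibatch SGD on a 2-layer ReLU network; label = product of the signs of
# the first two coordinates (XOR of two sign bits).
rng = np.random.default_rng(0)
d, width, lr, steps, batch = 10, 64, 0.1, 3000, 32

W = rng.normal(size=(width, d)) / np.sqrt(d)               # trained first layer
a = rng.choice([-1.0, 1.0], size=width) / np.sqrt(width)   # fixed second layer

for _ in range(steps):
    X = rng.choice([-1.0, 1.0], size=(batch, d))
    y = X[:, 0] * X[:, 1]
    H = np.maximum(X @ W.T, 0.0)                   # ReLU features
    err = H @ a - y                                # squared-loss residual
    grad_W = ((err[:, None] * a) * (H > 0)).T @ X / batch
    W -= lr * grad_W

Xt = rng.choice([-1.0, 1.0], size=(2000, d))
acc = np.mean(np.sign(np.maximum(Xt @ W.T, 0.0) @ a) == Xt[:, 0] * Xt[:, 1])
print("test accuracy:", acc)
```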
  6.
  7.

    Convergence of Distributed Adaptive Optimization with Local Updates by Cheng, Ziheng, Glasgow, Margalit

    Published 19-09-2024
    “…We study distributed adaptive algorithms with local updates (intermittent communication). Despite the great empirical success of adaptive methods in…”
    Get full text
    Journal Article
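A generic sketch of the intermittent-communication pattern the snippet describes, here with local Adam steps on a toy quadratic; the algorithmic details are illustrative, not the paper's.

```python
import numpy as np

# K workers each run H Adam steps on their own (heterogeneous) objective,
# then the server averages the iterates and broadcasts the result.
rng = np.random.default_rng(0)
dim, K, H, rounds = 5, 4, 10, 50
lr, b1, b2, eps = 0.05, 0.9, 0.999, 1e-8
targets = rng.normal(size=(K, dim))          # each worker's local optimum
x = np.zeros(dim)

for _ in range(rounds):
    local = []
    for k in range(K):
        xk, m, v = x.copy(), np.zeros(dim), np.zeros(dim)
        for t in range(1, H + 1):
            g = xk - targets[k] + 0.1 * rng.normal(size=dim)  # noisy gradient
            m = b1 * m + (1 - b1) * g
            v = b2 * v + (1 - b2) * g ** 2
            xk -= lr * (m / (1 - b1 ** t)) / (np.sqrt(v / (1 - b2 ** t)) + eps)
        local.append(xk)
    x = np.mean(local, axis=0)               # the intermittent communication

print("distance to mean optimum:", np.linalg.norm(x - targets.mean(axis=0)))
```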
  8.

    Invertibility of the 3-core of Erdős Rényi Graphs with Growing Degree by Glasgow, Margalit

    Published 02-06-2021
    “…Let $A \in \mathbb{R}^{n \times n}$ be the adjacency matrix of an Erdős Rényi graph $G(n, d/n)$ for $d = \omega(1)$ and $d \leq 3\log(n)$. We show that…”
    Get full text
    Journal Article
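An illustrative check of the property in the title, assuming numpy: peel a sampled $G(n, d/n)$ down to its 3-core by repeatedly deleting vertices of degree below 3, then test whether the core's adjacency matrix has full rank.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 300, 8.0
U = rng.random((n, n)) < d / n
A = np.triu(U, 1)
A = A | A.T                                   # symmetric 0/1 adjacency

keep = np.ones(n, dtype=bool)
while True:
    deg = A[np.ix_(keep, keep)].sum(axis=1)   # degrees within the kept set
    low = deg < 3
    if not low.any():
        break
    keep[np.where(keep)[0][low]] = False      # peel low-degree vertices

core = A[np.ix_(keep, keep)].astype(float)
k = int(keep.sum())
rank = np.linalg.matrix_rank(core) if k else 0
print("3-core size:", k, "| rank:", rank, "| invertible:", rank == k)
```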
  9.

    Tight Bounds for $\gamma$-Regret via the Decision-Estimation Coefficient by Glasgow, Margalit, Rakhlin, Alexander

    Published 06-03-2023
    “…In this work, we give a statistical characterization of the $\gamma$-regret for arbitrary structured bandit problems, the regret which arises when comparing…”
    Get full text
    Journal Article
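For reference, one common way the $\gamma$-regret is formalized (the paper's exact conventions may differ): the learner is compared against a $\gamma$-fraction of the optimal value, the natural benchmark when only $\gamma$-approximate optimization is tractable.

```latex
% Hedged formulation: f is the reward function, \pi^\star its maximizer,
% and \pi_1, \dots, \pi_T the actions chosen by the learner.
\mathrm{Reg}_\gamma(T) \;=\; \sum_{t=1}^{T} \bigl( \gamma\, f(\pi^\star) - f(\pi_t) \bigr),
\qquad \pi^\star \in \arg\max_{\pi} f(\pi).
```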
  10.

    Approximate Gradient Coding with Optimal Decoding by Glasgow, Margalit, Wootters, Mary

    “…In distributed optimization problems, a technique called gradient coding, which involves replicating data points, has been used to mitigate the effect of…”
    Get full text
    Conference Proceeding
  11.

    Approximate Gradient Coding with Optimal Decoding by Glasgow, Margalit, Wootters, Mary

    Published 06-08-2021
    “…In distributed optimization problems, a technique called gradient coding, which involves replicating data points, has been used to mitigate the effect of…”
    Get full text
    Journal Article
  12.

    Asynchronous Distributed Optimization with Stochastic Delays by Glasgow, Margalit, Wootters, Mary

    Published 22-09-2020
    “…We study asynchronous finite sum minimization in a distributed-data setting with a central parameter server. While asynchrony is well understood in parallel…”
    Get full text
    Journal Article
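A toy simulation of the stale-gradient effect alluded to above, under a generic stochastic-delay model rather than the paper's precise setting.

```python
import numpy as np

# The server applies gradients computed at an iterate from tau steps ago,
# where the staleness tau is random.
rng = np.random.default_rng(3)
dim, steps, lr, max_delay = 5, 500, 0.1, 4
target = rng.normal(size=dim)

history = [np.zeros(dim)]
for t in range(steps):
    tau = int(rng.integers(0, min(max_delay, t) + 1))  # stochastic delay
    stale_x = history[-1 - tau]                        # iterate the worker read
    g = stale_x - target + 0.1 * rng.normal(size=dim)
    history.append(history[-1] - lr * g)

print("final error:", np.linalg.norm(history[-1] - target))
```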
  13.

    A central limit theorem for the matching number of a sparse random graph by Glasgow, Margalit, Kwan, Matthew, Sah, Ashwin, Sawhney, Mehtaab

    Published 08-02-2024
    “…In 1981, Karp and Sipser proved a law of large numbers for the matching number of a sparse Erdős–Rényi random graph, in an influential paper pioneering…”
    Get full text
    Journal Article
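The Karp-Sipser paper cited in the snippet is built around a leaf-removal heuristic; below is a compact sketch of its greedy variant (near-maximum on sparse random graphs), as a companion to the entry above.

```python
import random

def karp_sipser_matching(adj):
    """Greedy Karp-Sipser: match a degree-1 vertex to its unique neighbor
    whenever one exists, otherwise match a random edge. adj is a dict
    vertex -> set of neighbors, modified in place."""
    matched = 0
    while any(adj.values()):
        leaves = [v for v, nb in adj.items() if len(nb) == 1]
        if leaves:
            u = leaves[0]
            v = next(iter(adj[u]))
        else:
            u = random.choice([w for w, nb in adj.items() if nb])
            v = random.choice(sorted(adj[u]))
        matched += 1
        for w in (u, v):                 # delete both matched endpoints
            for x in adj[w]:
                adj[x].discard(w)
            adj[w] = set()
    return matched

random.seed(0)
n, c = 500, 2.0                          # sparse G(n, c/n)
adj = {i: set() for i in range(n)}
for i in range(n):
    for j in range(i + 1, n):
        if random.random() < c / n:
            adj[i].add(j); adj[j].add(i)

print("greedy Karp-Sipser matching size:", karp_sipser_matching(adj))
```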
  14.

    Sharp Bounds for Federated Averaging (Local SGD) and Continuous Perspective by Glasgow, Margalit, Yuan, Honglin, Ma, Tengyu

    Published 05-11-2021
    “…Federated Averaging (FedAvg), also known as Local SGD, is one of the most popular algorithms in Federated Learning (FL). Despite its simplicity and popularity,…”
    Get full text
    Journal Article
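For context, a minimal FedAvg / Local SGD loop on a toy quadratic objective; the paper analyzes this algorithm rather than proposing it, and the objective here is my own illustration.

```python
import numpy as np

# Each of K clients takes H local SGD steps from the shared iterate;
# the server then averages the K results (one FedAvg round).
rng = np.random.default_rng(4)
dim, K, H, rounds, lr = 5, 8, 5, 100, 0.1
targets = rng.normal(size=(K, dim))      # heterogeneous client optima
x = np.zeros(dim)

for _ in range(rounds):
    updates = []
    for k in range(K):
        xk = x.copy()
        for _ in range(H):
            g = xk - targets[k] + 0.1 * rng.normal(size=dim)
            xk -= lr * g
        updates.append(xk)
    x = np.mean(updates, axis=0)         # server averaging

print("error vs. average optimum:", np.linalg.norm(x - targets.mean(axis=0)))
```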
  15.

    The Exact Rank of Sparse Random Graphs by Glasgow, Margalit, Kwan, Matthew, Sah, Ashwin, Sawhney, Mehtaab

    Published 09-03-2023
    “…Two landmark results in combinatorial random matrix theory, due to Komlós and Costello-Tao-Vu, show that discrete random matrices and symmetric discrete…”
    Get full text
    Journal Article
  16.

    On the Rank, Kernel, and Core of Sparse Random Graphs by DeMichele, Patrick, Glasgow, Margalit, Moreira, Alexander

    Published 25-05-2021
    “…We study the rank of the adjacency matrix $A$ of a random Erdős–Rényi graph $G\sim \mathbb{G}(n,p)$. It is well known that when $p = (\log(n) - \omega(1))/n$,…”
    Get full text
    Journal Article
  17.

    Feature Dropout: Revisiting the Role of Augmentations in Contrastive Learning by Tamkin, Alex, Glasgow, Margalit, He, Xiluo, Goodman, Noah

    Published 16-12-2022
    “…What role do augmentations play in contrastive learning? Recent work suggests that good augmentations are label-preserving with respect to a specific…”
    Get full text
    Journal Article
  18.

    Max-Margin Works while Large Margin Fails: Generalization without Uniform Convergence by Glasgow, Margalit, Wei, Colin, Wootters, Mary, Ma, Tengyu

    Published 15-06-2022
    “…A major challenge in modern machine learning is theoretically understanding the generalization properties of overparameterized models. Many existing tools rely…”
    Get full text
    Journal Article
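For reference, the max-margin predictor the title refers to, in one common normalization (my assumption; the paper may normalize differently): among unit-norm parameters, maximize the smallest signed margin on the training set.

```latex
% Training set \{(x_i, y_i)\}_{i=1}^n with y_i \in \{\pm 1\}; f(x; \theta)
% is the model's real-valued output.
\theta_{\mathrm{mm}} \;\in\; \arg\max_{\|\theta\|_2 \le 1} \;\min_{1 \le i \le n}\; y_i\, f(x_i; \theta)
```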
  19.

    The Limits and Potentials of Local SGD for Distributed Heterogeneous Learning with Intermittent Communication by Patel, Kumar Kshitij, Glasgow, Margalit, Zindari, Ali, Wang, Lingxiao, Stich, Sebastian U, Cheng, Ziheng, Joshi, Nirmit, Srebro, Nathan

    Published 19-05-2024
    “…Local SGD is a popular optimization method in distributed learning, often outperforming other algorithms in practice, including mini-batch SGD. Despite this…”
    Get full text
    Journal Article
  20.

    Beyond NTK with Vanilla Gradient Descent: A Mean-Field Analysis of Neural Networks with Polynomial Width, Samples, and Time by Mahankali, Arvind, HaoChen, Jeff Z, Dong, Kefan, Glasgow, Margalit, Ma, Tengyu

    Published 28-06-2023
    “…Despite recent theoretical progress on the non-convex optimization of two-layer neural networks, it is still an open question whether gradient descent on…”
    Get full text
    Journal Article