Search Results - "Simsek, Berfin"

  • Showing 1 - 14 of 14 results
  1. Online Bounded Component Analysis: A Simple Recurrent Neural Network with Local Update Rule for Unsupervised Separation of Dependent and Independent Sources by Simsek, Berfin, Erdogan, Alper T.

    “…A low complexity recurrent neural network structure is proposed for unsupervised separation of both independent and dependent sources from their linear…”
    Conference Proceeding
  2. Learning Gaussian Multi-Index Models with Gradient Flow: Time Complexity and Directional Convergence by Simsek, Berfin, Bendjeddou, Amire, Hsu, Daniel

    Published 13-11-2024
    “…This work focuses on the gradient flow dynamics of a neural network model that uses correlation loss to approximate a multi-index function on high-dimensional…”
    Journal Article
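
    For orientation, a multi-index model in the usual sense (the paper's exact setup, loss, and assumptions may differ from this generic form) is a target $f^*(x) = g(\langle u_1, x \rangle, \dots, \langle u_k, x \rangle)$ with $x \sim \mathcal N(0, I_d)$ and $k \ll d$, i.e. a high-dimensional function that depends on its input only through a few relevant directions $u_1, \dots, u_k$.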
  3. Learning Associative Memories with Gradient Descent by Cabannes, Vivien, Simsek, Berfin, Bietti, Alberto

    Published 28-02-2024
    “…This work focuses on the training dynamics of one associative memory module storing outer products of token embeddings. We reduce this problem to the study of…”
    Journal Article
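
    As context for the snippet above: an associative memory module of this kind is commonly a weight matrix accumulating outer products of paired token embeddings, $W = \sum_i e_{y_i} u_{x_i}^\top$, so that $W u_{x_j} \approx e_{y_j}$ when the input embeddings are roughly orthonormal; whether the paper uses exactly this parameterization is not visible in the excerpt.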
  4. Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escaping, and Network Embedding by Wu, Zhengqing, Simsek, Berfin, Ged, Francois

    Published 08-02-2024
    “…In this paper, we investigate the loss landscape of one-hidden-layer neural networks with ReLU-like activation functions trained with the empirical squared…”
    Journal Article
  5. Understanding out-of-distribution accuracies through quantifying difficulty of test samples by Simsek, Berfin, Hall, Melissa, Sagun, Levent

    Published 28-03-2022
    “…Existing works show that although modern neural networks achieve remarkable generalization performance on the in-distribution (ID) dataset, the accuracy drops…”
    Journal Article
  6. Expand-and-Cluster: Parameter Recovery of Neural Networks by Martinelli, Flavio, Simsek, Berfin, Gerstner, Wulfram, Brea, Johanni

    Published 25-04-2023
    “…Can we identify the weights of a neural network by probing its input-output mapping? At first glance, this problem seems to have many solutions because of…”
    Journal Article
  7. Weight-space symmetry in deep networks gives rise to permutation saddles, connected by equal-loss valleys across the loss landscape by Brea, Johanni, Simsek, Berfin, Illing, Bernd, Gerstner, Wulfram

    Published 05-07-2019
    “…The permutation symmetry of neurons in each layer of a deep neural network gives rise not only to multiple equivalent global minima of the loss function, but…”
    Journal Article
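
    A minimal NumPy sketch of the permutation symmetry the abstract refers to (illustrative toy code, not from the paper, shown on a one-hidden-layer network for brevity): relabeling hidden neurons leaves the computed function unchanged, which is what produces the equivalent minima and permutation saddles.

      import numpy as np

      rng = np.random.default_rng(0)
      d, h = 5, 4  # input dimension, hidden width (arbitrary toy sizes)
      W1, b1 = rng.normal(size=(h, d)), rng.normal(size=h)
      W2 = rng.normal(size=(1, h))
      x = rng.normal(size=d)

      def net(W1, b1, W2, x):
          # one-hidden-layer network with tanh activation
          return W2 @ np.tanh(W1 @ x + b1)

      # Permute hidden units: rows of W1, entries of b1, matching columns of W2.
      perm = rng.permutation(h)
      assert np.allclose(net(W1, b1, W2, x),
                         net(W1[perm], b1[perm], W2[:, perm], x))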
  8. Should Under-parameterized Student Networks Copy or Average Teacher Weights? by Şimşek, Berfin, Bendjeddou, Amire, Gerstner, Wulfram, Brea, Johanni

    Published 02-11-2023
    “…Any continuous function $f^*$ can be approximated arbitrarily well by a neural network with sufficiently many neurons $k$. We consider the case when $f^*$…”
    Journal Article
  9. Statistical physics, Bayesian inference and neural information processing by Grant, Erin, Nestler, Sandra, Şimşek, Berfin, Solla, Sara

    Published 29-09-2023
    “…Lecture notes from the course given by Professor Sara A. Solla at the Les Houches summer school on "Statistical physics of Machine Learning". The notes discuss…”
    Journal Article
  10. MLPGradientFlow: going with the flow of multilayer perceptrons (and finding minima fast and accurately) by Brea, Johanni, Martinelli, Flavio, Şimşek, Berfin, Gerstner, Wulfram

    Published 25-01-2023
    “…MLPGradientFlow is a software package to solve numerically the gradient flow differential equation $\dot \theta = -\nabla \mathcal L(\theta; \mathcal D)$,…”
    Journal Article
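
    To make the equation in the snippet concrete, here is a minimal NumPy sketch (not the MLPGradientFlow package itself) that follows $\dot \theta = -\nabla \mathcal L(\theta; \mathcal D)$ with explicit Euler steps on a toy least-squares loss:

      import numpy as np

      rng = np.random.default_rng(0)
      A = rng.normal(size=(10, 3))   # toy data matrix
      y = rng.normal(size=10)        # toy targets

      def grad_L(theta):
          # gradient of L(theta) = 0.5 * ||A @ theta - y||^2
          return A.T @ (A @ theta - y)

      theta = rng.normal(size=3)
      dt = 1e-2                      # Euler step size
      for _ in range(5000):
          theta = theta - dt * grad_L(theta)

      # theta now approximates the least-squares minimizer:
      print(np.allclose(theta, np.linalg.lstsq(A, y, rcond=None)[0], atol=1e-4))

    A fixed-step Euler loop like this is the naive baseline; the package's "fast and accurately" claim is relative to such loops, and the specifics of its solvers are not shown in the excerpt.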
  11. Saddle-to-Saddle Dynamics in Deep Linear Networks: Small Initialization Training, Symmetry, and Sparsity by Jacot, Arthur, Ged, François, Şimşek, Berfin, Hongler, Clément, Gabriel, Franck

    Published 30-06-2021
    “…The dynamics of Deep Linear Networks (DLNs) is dramatically affected by the variance $\sigma^2$ of the parameters at initialization $\theta_0$. For DLNs of…”
    Journal Article
  12. Kernel Alignment Risk Estimator: Risk Prediction from Training Data by Jacot, Arthur, Şimşek, Berfin, Spadaro, Francesco, Hongler, Clément, Gabriel, Franck

    Published 17-06-2020
    “…We study the risk (i.e. generalization error) of Kernel Ridge Regression (KRR) for a kernel $K$ with ridge $\lambda>0$ and i.i.d. observations. For this, we…”
    Journal Article
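
    For reference, the KRR predictor in the snippet's setup (kernel $K$, ridge $\lambda > 0$, $n$ i.i.d. observations) can be sketched as follows; the Gaussian kernel, bandwidth, and $\lambda n$ scaling of the ridge are illustrative choices, not necessarily the paper's conventions:

      import numpy as np

      rng = np.random.default_rng(0)
      n = 50
      X = rng.uniform(-1, 1, size=(n, 1))
      y = np.sin(3 * X[:, 0]) + 0.1 * rng.normal(size=n)  # noisy observations

      def kernel(A, B, bw=0.5):
          # Gaussian (RBF) kernel matrix between the rows of A and B
          sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
          return np.exp(-sq / (2 * bw ** 2))

      lam = 1e-2
      # KRR dual coefficients: (K + lambda * n * I)^{-1} y
      alpha = np.linalg.solve(kernel(X, X) + lam * n * np.eye(n), y)

      def predict(X_test):
          return kernel(X_test, X) @ alpha

      print(predict(np.array([[0.5]])))  # close to sin(1.5) ≈ 0.997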
  13. Implicit Regularization of Random Feature Models by Jacot, Arthur, Şimşek, Berfin, Spadaro, Francesco, Hongler, Clément, Gabriel, Franck

    Published 19-02-2020
    “…Proceedings of the International Conference on Machine Learning, 2020, pp. 7397-7406. Random Feature (RF) models are used as efficient parametric approximations…”
    Journal Article
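
    In the standard sense the snippet appears to use, a random feature model replaces a kernel by a finite random expansion $f(x) = \sum_{p=1}^{P} a_p \, \phi(w_p^\top x)$, with the features $w_p$ drawn at random and frozen and only the coefficients $a_p$ trained; the title's "implicit regularization" plausibly refers to the effective regularization such a random finite model induces relative to the limiting kernel method, though the excerpt does not confirm this.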
  14. Geometry of the Loss Landscape in Overparameterized Neural Networks: Symmetries and Invariances by Şimşek, Berfin, Ged, François, Jacot, Arthur, Spadaro, Francesco, Hongler, Clément, Gerstner, Wulfram, Brea, Johanni

    Published 25-05-2021
    “…We study how permutation symmetries in overparameterized multi-layer neural networks generate 'symmetry-induced' critical points. Assuming a network with $L$…”
    Journal Article