Search Results - "Biderman, Stella"
-
1
MP‐NeRF: A massively parallel method for accelerating protein structure reconstruction from internal coordinates
Published in Journal of computational chemistry (05-01-2022)
“…The conversion of proteins between internal and cartesian coordinates is a limiting step in many pipelines, such as molecular dynamics simulations and machine…”
Journal Article -
2
Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets
Published in Transactions of the Association for Computational Linguistics (31-01-2022)
“…With the success of large-scale pre-training and multilingual modeling in Natural Language Processing (NLP), recent years have seen a proliferation of large,…”
Journal Article -
3
OpenFold: retraining AlphaFold2 yields new insights into its learning mechanisms and capacity for generalization
Published in Nature methods (01-08-2024)
“…AlphaFold2 revolutionized structural biology with the ability to predict protein structures with exceptionally high accuracy. Its implementation, however,…”
Journal Article -
4
Magic: the Gathering is as Hard as Arithmetic
Published 11-03-2020
“…Magic: the Gathering is a popular and famously complicated card game about magical combat. Recently, several authors including Chatterjee and Ibsen-Jensen…”
Journal Article -
5
Fooling MOSS Detection with Pretrained Language Models
Published 18-01-2022
“…As artificial intelligence (AI) technologies become increasingly powerful and prominent in society, their misuse is a growing concern. In educational settings,…”
Journal Article -
6
LLM Circuit Analyses Are Consistent Across Training and Scale
Published 15-07-2024
“…Most currently deployed large language models (LLMs) undergo continuous training or additional finetuning. By contrast, most research into LLMs' internal…”
Journal Article -
7
Neural Networks on Groups
Published 12-06-2019
“…Although neural networks traditionally are typically used to approximate functions defined over $\mathbb{R}^n$, the successes of graph neural networks,…”
Journal Article -
8
Pitfalls in Machine Learning Research: Reexamining the Development Cycle
Published 04-11-2020
“…NeurIPS 2020 Machine learning has the potential to fuel further advances in data science, but it is greatly hindered by an ad hoc design process, poor data…”
Journal Article -
9
Grokking Group Multiplication with Cosets
Published 11-12-2023
“…The complex and unpredictable nature of deep neural networks prevents their safe use in many high-stakes applications. There have been many techniques…”
Journal Article -
10
Datasheet for the Pile
Published 13-01-2022
“…This datasheet describes the Pile, a 825 GiB dataset of human-authored text compiled by EleutherAI for use in large-scale language modeling. The Pile is…”
Journal Article -
11
A Walsh Hadamard Derived Linear Vector Symbolic Architecture
Published 29-10-2024
“…Vector Symbolic Architectures (VSAs) are one approach to developing Neuro-symbolic AI, where two vectors in $\mathbb{R}^d$ are `bound' together to produce a…”
Journal Article -
12
EleutherAI: Going Beyond "Open Science" to "Science in the Open"
Published 12-10-2022
“…Over the past two years, EleutherAI has established itself as a radically novel initiative aimed at both promoting open-source research and conducting research…”
Journal Article -
13
Holographic Global Convolutional Networks for Long-Range Prediction Tasks in Malware Detection
Published 23-03-2024
“…Malware detection is an interesting and valuable domain to work in because it has significant real-world impact and unique machine-learning challenges. We…”
Journal Article -
14
Suppressing Pink Elephants with Direct Principle Feedback
Published 12-02-2024
“…Existing methods for controlling language models, such as RLHF and Constitutional AI, involve determining which LLM behaviors are desirable and training them…”
Journal Article -
15
Transformer-Based Models Are Not Yet Perfect At Learning to Emulate Structural Recursion
Published 23-01-2024
“…This paper investigates the ability of transformer-based models to learn structural recursion from examples. Recursion is a universal concept in both natural…”
Journal Article -
16
Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive?
Published 06-06-2024
“…Predictable behavior from scaling advanced AI systems is an extremely desirable property. Although a well-established literature exists on how pretraining…”
Journal Article -
17
Stay on topic with Classifier-Free Guidance
Published 30-06-2023
“…Classifier-Free Guidance (CFG) has recently emerged in text-to-image generation as a lightweight technique to encourage prompt-adherence in generations. In…”
Journal Article -
18
LEACE: Perfect linear concept erasure in closed form
Published 06-06-2023
“…Concept erasure aims to remove specified features from a representation. It can improve fairness (e.g. preventing a classifier from using gender or race) and…”
Journal Article -
19
Recasting Self-Attention with Holographic Reduced Representations
Published 30-05-2023
“…In recent years, self-attention has become the dominant paradigm for sequence modeling in a variety of domains. However, in domains with very long sequence…”
Journal Article -
20
Can Transformers Learn to Solve Problems Recursively?
Published 24-05-2023
“…Neural networks have in recent years shown promise for helping software engineers write programs and even formally verify them. While semantic information…”
Journal Article