Search Results - "Luccioni, Sasha Alexandra"
-
1
Position: Key Claims in LLM Research Have a Long Tail of Footnotes
Published 14-08-2023“…Much of the recent discourse within the ML community has been centered around Large Language Models (LLMs), their functionality and potential -- yet not only…”
Journal Article -
2
Hype, Sustainability, and the Price of the Bigger-is-Better Paradigm in AI
Published 21-09-2024“…With the growing attention and investment in recent AI approaches such as large language models, the narrative that the larger the AI system the more valuable,…”
Journal Article -
3
Bugs in the Data: How ImageNet Misrepresents Biodiversity
Published 24-08-2022“…ImageNet-1k is a dataset often used for benchmarking machine learning (ML) models and evaluating tasks such as image recognition and object detection. Wild…”
Journal Article -
4
Power Hungry Processing: Watts Driving the Cost of AI Deployment?
Published 28-11-2023“…ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT '24), June 3-6, 2024, Rio de Janeiro, Brazil. Recent years have seen a surge in the…”
Journal Article -
5
Metaethical Perspectives on 'Benchmarking' AI Ethics
Published 11-04-2022“…Benchmarks are seen as the cornerstone for measuring technical progress in Artificial Intelligence (AI) research and have been developed for a variety of tasks…”
Journal Article -
6
Counting Carbon: A Survey of Factors Influencing the Emissions of Machine Learning
Published 16-02-2023“…Machine learning (ML) requires using energy to carry out computations during the model training process. The generation of this energy comes with an…”
Journal Article -
7
Stable Bias: Analyzing Societal Representations in Diffusion Models
Published 20-03-2023“…As machine learning-enabled Text-to-Image (TTI) systems are becoming increasingly prevalent and seeing growing adoption as commercial services, characterizing…”
Journal Article -
8
What's in the Box? A Preliminary Analysis of Undesirable Content in the Common Crawl Corpus
Published 06-05-2021“…Whereas much of the success of the current generation of neural language models has been driven by increasingly large training corpora, relatively little…”
Journal Article -
9
Estimating the Carbon Footprint of BLOOM, a 176B Parameter Language Model
Published 03-11-2022“…Progress in machine learning (ML) comes with a cost to the environment, given that training ML models requires significant computational resources, energy and…”
Journal Article -
10
CIVICS: Building a Dataset for Examining Culturally-Informed Values in Large Language Models
Published 22-05-2024“…This paper introduces the "CIVICS: Culturally-Informed & Values-Inclusive Corpus for Societal impacts" dataset, designed to evaluate the social and cultural…”
Journal Article -
11
Into the LAIONs Den: Investigating Hate in Multimodal Datasets
Published 06-11-2023“…'Scale the model, scale the data, scale the compute' is the reigning sentiment in the world of generative AI today. While the impact of model scaling has been…”
Journal Article -
12
A Framework for Deprecating Datasets: Standardizing Documentation, Identification, and Communication
Published 09-05-2022“…Datasets are central to training machine learning (ML) models. The ML community has recently made significant improvements to data stewardship and…”
Journal Article -
13
The ROOTS Search Tool: Data Transparency for LLMs
Published 27-02-2023“…ROOTS is a 1.6TB multilingual text corpus developed for the training of BLOOM, currently the largest language model explicitly accompanied by commensurate data…”
Journal Article -
14
Ensuring the Inclusive Use of Natural Language Processing in the Global Response to COVID-19
Published 11-08-2021“…Natural language processing (NLP) plays a significant role in tools for the COVID-19 pandemic response, from detecting misinformation on social media to…”
Journal Article -
15
Measuring Data
Published 09-12-2022“…We identify the task of measuring data to quantitatively characterize the composition of machine learning data and datasets. Similar to an object's height,…”
Journal Article -
16
Open Problems in Technical AI Governance
Published 20-07-2024“…AI progress is creating a growing range of risks and opportunities, but it is often unclear how they should be navigated. In many cases, the barriers and…”
Journal Article -
17
Measuring the Carbon Intensity of AI in Cloud Instances
Published 10-06-2022“…By providing unprecedented access to computational resources, cloud computing has enabled rapid growth in technologies such as machine learning, the…”
Journal Article -
18
Evaluate & Evaluation on the Hub: Better Best Practices for Data and Model Measurements
Published 30-09-2022“…Evaluation is a key part of machine learning (ML), yet there is a lack of support and tooling to enable its informed and systematic practice. We introduce…”
Journal Article -
19
ClimateGAN: Raising Climate Change Awareness by Generating Images of Floods
Published 06-10-2021“…ICLR 2022. Climate change is a major threat to humanity, and the actions required to prevent its catastrophic consequences include changes in both policy-making…”
Journal Article -
20
Data Governance in the Age of Large-Scale Data-Driven Language Technology
Published 02-11-2022“…Proceedings of 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT '22). The recent emergence and adoption of Machine Learning technology,…”
Journal Article