Search Results - "Parameswaran, Aditya G"
1
Leveraging Analysis History for Improved In Situ Visualization Recommendation
Published in Computer Graphics Forum (01-06-2022) “…Existing visualization recommendation systems commonly rely on a single snapshot of a dataset to suggest visualizations to users. However, exploratory data…”
Journal Article
2
Efficient and Compact Spreadsheet Formula Graphs
Published in 2023 IEEE 39th International Conference on Data Engineering (ICDE) (01-04-2023) “…Spreadsheets are one of the most popular data analysis tools, wherein users can express computation as formulae alongside data. The ensuing dependencies are…”
Conference Proceeding
3
DocETL: Agentic Query Rewriting and Evaluation for Complex Document Processing
Published 15-10-2024 “…Analyzing unstructured data, such as complex documents, has been a persistent challenge in data processing. Large Language Models (LLMs) have shown promise in…”
Journal Article
4
"We Have No Idea How Models will Behave in Production until Production": How Engineers Operationalize Machine Learning
Published 25-03-2024 “…Proc. ACM Hum.-Comput. Interact. 8, CSCW1, Article 206 (April 2024) Organizations rely on machine learning engineers (MLEs) to deploy models and maintain ML…”
Journal Article
5
Rethinking Streaming Machine Learning Evaluation
Published 23-05-2022 “…While most work on evaluating machine learning (ML) models focuses on computing accuracy on batches of data, tracking accuracy alone in a streaming setting…”
Journal Article
6
Moving Fast With Broken Data
Published 10-03-2023 “…Machine learning (ML) models in production pipelines are frequently retrained on the latest partitions of large, continually-growing datasets. Due to…”
Journal Article
7
Transactional Panorama: A Conceptual Framework for User Perception in Analytical Visual Interfaces
Published 10-02-2023 “…Many tools empower analysts and data scientists to consume analysis results in a visual interface, such as a dashboard. When the underlying data changes, these…”
Journal Article
8
Operationalizing Machine Learning: An Interview Study
Published 16-09-2022 “…Organizations rely on machine learning engineers (MLEs) to operationalize ML, i.e., deploy and maintain ML pipelines in production. The process of…”
Journal Article
9
Towards Accurate and Efficient Document Analytics with Large Language Models
Published 07-05-2024 “…Unstructured data formats account for over 80% of the data currently stored, and extracting value from such formats remains a considerable challenge. In…”
Journal Article
10
Who Validates the Validators? Aligning LLM-Assisted Evaluation of LLM Outputs with Human Preferences
Published 18-04-2024 “…Due to the cumbersome nature of human evaluation and limitations of code-based evaluation, Large Language Models (LLMs) are increasingly being used to assist…”
Journal Article
11
Flow with FlorDB: Incremental Context Maintenance for the Machine Learning Lifecycle
Published 05-08-2024 “…In this paper we present techniques to incrementally harvest and query arbitrary metadata from machine learning pipelines, without disrupting agile practices…”
Journal Article
12
Human-powered Data Management
Published 01-01-2013 “…Fully automated algorithms are inadequate for a number of data analysis tasks, especially those involving images, video, or text. Thus, there is often a need…”
Dissertation
13
Revisiting Prompt Engineering via Declarative Crowdsourcing
Published 07-08-2023 “…Large language models (LLMs) are incredibly powerful at comprehending and generating data in the form of text, but are brittle and error-prone. There has been…”
Journal Article
14
Efficient and Compact Spreadsheet Formula Graphs
Published 10-02-2023 “…Spreadsheets are one of the most popular data analysis tools, wherein users can express computation as formulae alongside data. The ensuing dependencies are…”
Journal Article
15
SPADE: Synthesizing Data Quality Assertions for Large Language Model Pipelines
Published 05-01-2024 “…Large language models (LLMs) are being increasingly deployed as part of pipelines that repeatedly process or generate data of some sort. However, a common…”
Journal Article
16
Enhancing the Interactivity of Dataframe Queries by Leveraging Think Time
Published 02-03-2021 “…We propose opportunistic evaluation, a framework for accelerating interactions with dataframes. Interactive latency is critical for iterative,…”
Journal Article
17
Lux: Always-on Visualization Recommendations for Exploratory Dataframe Workflows
Published 30-04-2021 “…Exploratory data science largely happens in computational notebooks with dataframe APIs, such as pandas, that support flexible means to transform, clean, and…”
Journal Article
18
DataHub: Collaborative Data Science & Dataset Version Management at Scale
Published 02-09-2014 “…Relational databases have limited support for data collaboration, where teams collaboratively curate and analyze large datasets. Inspired by software version…”
Journal Article