Search Results - "Huang, Zezhou"
-
1
Spatial and hedonic analysis of housing prices in Shanghai
Published in Habitat international (01-09-2017) “…• 12,732 valid house property information was collected with web crawler in Shanghai. • Strong spatial auto-correlation exists in housing price with Moran’s I of…”
Journal Article -
2
Disambiguate Entity Matching using Large Language Models through Relation Discovery
Published 25-03-2024 “…Entity matching is a critical challenge in data integration and cleaning, central to tasks like fuzzy joins and deduplication. Traditional approaches have…”
Journal Article -
3
Relationalizing Tables with Large Language Models: The Promise and Challenges
Published in 2024 IEEE 40th International Conference on Data Engineering Workshops (ICDEW) (13-05-2024) “…Tables in the wild are usually not relationalized, making querying them difficult. To relationalize tables, recent works designed seven transformation…”
Conference Proceeding -
4
Cocoon: Semantic Table Profiling Using Large Language Models
Published 18-04-2024 “…Data profilers play a crucial role in the preprocessing phase of data analysis by identifying quality issues such as missing, extreme, or erroneous values…”
Journal Article -
5
Lightweight Materialization for Fast Dashboards Over Joins
Published 23-08-2023 “…SIGMOD 2024. Dashboards are vital in modern business intelligence tools, providing non-technical users with an interface to access comprehensive business data…”
Journal Article -
6
Calibration: A Simple Trick for Wide-table Delta Analytics
Published 07-10-2022 “…Data analytics over normalized databases typically requires computing and materializing expensive joins (wide-tables). Factorized query execution models…”
Journal Article -
7
Data Cleaning Using Large Language Models
Published 20-10-2024 “…Data cleaning is a crucial yet challenging task in data analysis, often requiring significant manual effort. To automate data cleaning, previous systems have…”
Journal Article -
8
Reptile: Aggregation-level Explanations for Hierarchical Data
Published 11-03-2021 “…Recent query explanation systems help users understand anomalies in aggregation results by proposing predicates that describe input records that, if deleted,…”
Journal Article -
9
Data Ambiguity Strikes Back: How Documentation Improves GPT's Text-to-SQL
Published 28-10-2023 “…Text-to-SQL allows experts to use databases without in-depth knowledge of them. However, real-world tasks have both query and data ambiguities. Most works on…”
Journal Article -
10
The Fast and the Private: Task-based Dataset Search
Published 10-08-2023 “…Modern dataset search platforms employ ML task-based utility metrics instead of relying on metadata-based keywords to comb through extensive dataset…”
Journal Article -
11
JoinBoost: Grow Trees Over Normalized Data Using Only SQL
Published 01-07-2023 “…VLDB 2023. Although dominant for tabular data, ML libraries that train tree models over normalized databases (e.g., LightGBM, XGBoost) require the data to be…”
Journal Article -
12
Aggregation Consistency Errors in Semantic Layers and How to Avoid Them
Published 01-07-2023 “…Proceedings of the Workshop on Human-In-the-Loop Data Analytics 2023. Analysts often struggle with analyzing data from multiple tables in a database due to…”
Journal Article -
13
Kitana: Efficient Data Augmentation Search for AutoML
Published 17-05-2023 “…AutoML services provide a way for non-expert users to benefit from high-quality ML models without worrying about model design and deployment, in exchange for a…”
Journal Article -
14
Saibot: A Differentially Private Data Search Platform
Published 01-07-2023 “…VLDB 2023. Recent data search platforms use ML task-based utility measures rather than metadata-based keywords, to search large dataset corpora. Requesters…”
Journal Article