Autonomous self-evolving research on biomedical data: the DREAM paradigm

Bibliographic Details
Main Authors: Deng, Luojia, Wu, Yijie, Ren, Yongyong, Lu, Hui
Format: Journal Article
Language: English
Published: 18-07-2024
Description
Summary: In contemporary biomedical research, the efficiency of data-driven approaches is hindered by large data volumes, the complexity of tool selection, and limited human resources, necessitating the development of fully autonomous research systems to meet complex analytical needs. Such a system should be able to autonomously generate research questions, write analytical code, configure the computational environment, judge and interpret the results, and iteratively generate in-depth questions or solutions, all without human intervention. Here we developed DREAM, the first biomedical Data-dRiven self-Evolving Autonomous systeM, which can independently conduct scientific research without human involvement. Utilizing a clinical dataset and two omics datasets, DREAM demonstrated its ability to raise and deepen scientific questions, with difficulty scores for clinical data questions surpassing top published articles by 5.7% and outperforming GPT-4 and bioinformatics graduate students by 58.6% and 56.0%, respectively. Overall, DREAM has a success rate of 80% in autonomous clinical data mining. Humans can also participate in individual steps of DREAM to achieve more personalized goals. After evolution, 10% of the questions exceeded the average scores of top published article questions on originality and complexity. In autonomous environment configuration for eight bioinformatics workflows, DREAM achieved an 88% success rate, whereas GPT-4 failed to configure any workflows. On the clinical dataset, DREAM was over 10,000 times more efficient than the average scientist with a single computer core and was capable of revealing new discoveries. As a self-evolving autonomous research system, DREAM provides an efficient and reliable solution for future biomedical research. This paradigm may also have a revolutionary impact on other data-driven scientific research fields.
DOI: 10.48550/arxiv.2407.13637