A bold experiment: Accelerating scientific progress with AI

CSE researchers have developed the first-ever AI framework targeting experimentation, making the research process less time- and cost-intensive.
Stylized illustration of researchers in a futuristic lab with a robot on a lit-up platform in the center of the room and large computer screens throughout the room.
Image generated by DALL-E

Experimentation forms the foundation of scientific discovery. From Galileo’s tests of gravity to mapping the human genome, experimentation is where researchers put their ideas to the test and iterate on their innovations to unlock new advances.

However, as any researcher knows, progressing from hypothesis to conclusion is rarely a straightforward process. While experimentation may be the most crucial element of the scientific process, it is also the most resource-intensive. Designing, implementing, and evaluating experiments requires significant amounts of time and effort to ensure that the results are reliable, reproducible, and scientifically meaningful.

Now, researchers in Computer Science and Engineering at the University of Michigan have introduced Curie, the first AI-driven framework that specifically targets the process of scientific experimentation. This research was led by CSE PhD candidates Patrick Kon and Jiachen Liu, Prof. Ang Chen, and Prof. Mosharaf Chowdhury, along with undergraduate Qiuyu Ding, PhD students Yiming Qiu and Zhenning Yang, and research fellow Yibo Huang, together with their collaborators from Cisco, Jayanth Srinivasa and Myungjin Lee.

Curie is designed to tackle the myriad complexities and inefficiencies that plague scientific workflows. By embedding rigor, efficiency, and transparency into the experimentation process, Curie streamlines the journey from hypothesis to verifiable knowledge, allowing researchers to focus more on what matters most: creativity and discovery.

A flowchart demonstrating the structure and workflow of Curie.
Curie overview

“Researchers often spend weeks on literature reviews and setting up experiments just to validate a hypothesis,” said Liu. “Curie aims to automate this process so that researchers can concentrate on generating innovative ideas.”

In practical terms, Curie takes an experimental question and relevant context, such as domain-specific knowledge or starter code, and employs AI agents—called architect and technician agents—to design, implement, and analyze experiments. “The architect generates high-level plans and reflects on findings, while the technician agents focus on executing controlled experiments,” Liu explained.
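For readers who think in code, the division of labor Liu describes can be pictured with a toy Python sketch. This is not Curie's actual code or API: every name here (`Plan`, `architect_plan`, `technician_run`, `architect_reflect`, `run_study`) is hypothetical, and the "experiment" is a stand-in function rather than a real workload. It only illustrates the plan, execute, reflect loop under those assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Plan:
    """One controlled experiment: a single setting of one variable (hypothetical)."""
    variable: str
    value: float


def architect_plan(question: str, values: List[float]) -> List[Plan]:
    # Hypothetical architect step: turn a question into one plan per setting.
    return [Plan(variable=question, value=v) for v in values]


def technician_run(plan: Plan, experiment: Callable[[float], float]) -> float:
    # Hypothetical technician step: execute one plan and report its measurement.
    return experiment(plan.value)


def architect_reflect(results: Dict[float, float]) -> float:
    # Hypothetical reflection step: pick the setting with the highest measurement.
    return max(results, key=results.get)


def run_study(
    question: str,
    values: List[float],
    experiment: Callable[[float], float],
) -> Tuple[float, Dict[float, float]]:
    # Plan every setting, run each one, then reflect on the collected results.
    plans = architect_plan(question, values)
    results = {p.value: technician_run(p, experiment) for p in plans}
    return architect_reflect(results), results


if __name__ == "__main__":
    # Stand-in experiment whose score peaks at 0.5 (purely illustrative).
    best, results = run_study(
        "sampling temperature",
        [0.0, 0.5, 1.0],
        lambda t: 1.0 - (t - 0.5) ** 2,
    )
    print(best, results)
```

In the real system the planning and reflection steps are carried out by AI agents rather than fixed functions; the sketch keeps them as plain functions so the control flow stays visible.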

Curie’s approach is groundbreaking in its ability to replicate and extend scientific findings with rigor. The system automates the experimental process using three key components: an intra-agent rigor module, which ensures reliability; an inter-agent rigor module, which maintains methodical control; and an experiment knowledge module, which enhances interpretability. These elements work in unison to enable researchers to reproduce, expand, and even challenge existing research findings.

Four-part panel showing how Curie can 1) take an original finding, 2) replicate the results, 3) generate new findings, and 4) challenge existing methodology.
Curie can help researchers validate, expand, and critique existing research on the benefits of repeated sampling in LLM reasoning (Brown et al., 2024). The first panel (Original Finding) presents a result from the original paper. The second panel (Reproduce) has Curie confirming this finding through rigorous experimentation. The third panel (Extend) has Curie exploring the impact of sampling temperature on repeated sampling. The final panel (Challenge) shows Curie identifying a limitation in the original methodology, suggesting an avenue for future research.

Curie’s performance on a novel experimental benchmark designed by the research team underscores its efficacy in designing and executing experiments. Evaluating Curie across multiple domains related to computer science, the researchers found that it achieved a 3.4x improvement in correctly answering experimental questions compared to other leading AI agents. This success points to Curie’s potential to streamline experimental workflows and, in the process, make the scientific process less resource-intensive.

By seamlessly integrating AI tools into the scientific process, Curie reduces the burden of manual experimentation and allows researchers to allocate their efforts and resources more strategically. As scientific research demands increasing speed and precision, Curie represents a step forward in reimagining how experiments are performed and validated.

With their open-source code now available on GitHub, the Curie researchers hope to refine their framework even further with input from the research community. The team aims to eventually expand Curie to other fields, such as drug discovery and chemistry, by collaborating with experts in those disciplines.

“Our goal is to automate the experimentation process so researchers can focus more on creative work,” said Liu. “By making experimental workflows more efficient, Curie not only saves time but could also accelerate scientific progress across fields.”

Curie will soon be available for testing (no cost and no setup required) for a limited time. Those interested are invited to join the Slack Community Channel or waitlist. The research team welcomes outside feedback, as it will be invaluable in shaping their continued refinement of Curie.