Brown receives $3.1 million grant for data exploration software research

INCLUDING STATISTICAL controls in data exploration software could help reduce false discoveries, and that's what Brown computer scientists aim to build with a new $3.1 million grant. /COURTESY BROWN UNIVERSITY

PROVIDENCE – Brown University has received a $3.1 million grant from the Defense Advanced Research Projects Agency, the university announced Wednesday.

The grant money will be used to work on new data exploration software that includes statistical safeguards against false discoveries.

“The goal is to build a user-friendly system that can easily explore data and produce useful visualizations, but also continuously controls for the statistical validity of the results,” Eli Upfal, professor of computer science at Brown and the project’s principal investigator, said in a statement.

The press release says such a project would bring automated investigative rigor to academic data visualizations and make it easier to find true patterns in a data set. Doing so could ease the work scientists and statisticians do to verify their research, and could open data-driven research to less math-savvy industries.


The DARPA grant brings together a team of Brown professors, postdoctoral scholars and students to tackle different aspects of the project. Tim Kraska, an assistant professor, and Carsten Binnig, adjunct associate professor, are machine learning and database experts who will work mainly on the data management side of the project. Computer graphics pioneer Andy van Dam will work on the user interface and visualizations. Upfal, an expert in computational theory, will work mainly on the statistical side of the project.

According to a Brown press release, modern data exploration tools make it easy to poke and prod a data set in myriad ways with a few mouse clicks. That can create an issue known to statisticians as the multiple comparisons problem, in which researchers conflate coincidence with pattern, and it’s one of the things Upfal and his colleagues hope to address.

“To some extent it’s our fault here in computer science that we have made analysis of data so easy,” Upfal said. “If I give you a huge database and let you simply push a button to ask question after question, you’ll eventually reach something that’s there purely by chance.”
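
A little arithmetic shows how quickly chance findings pile up. The Python sketch below is an illustration only, not the Brown team's method: it assumes each question is an independent statistical test run at the conventional 5 percent significance level, and the function names are made up for this example. It computes the chance of at least one false discovery after a given number of questions, along with the stricter per-test threshold that a standard fix, the Bonferroni correction, would require.

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across num_tests independent
    tests, each run at significance level alpha (assumed values)."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(num_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance level that keeps the overall chance of any
    false positive at or below alpha (Bonferroni correction)."""
    return alpha / num_tests


if __name__ == "__main__":
    # Show how the risk grows as more questions are asked of the same data.
    for m in (1, 10, 100):
        rate = familywise_error_rate(m)
        threshold = bonferroni_threshold(m)
        print(f"{m:>3} tests: chance of a false discovery = {rate:.3f}, "
              f"Bonferroni per-test threshold = {threshold:.4f}")
```

Under these assumptions, 10 questions already carry roughly a 40 percent chance of turning up at least one pattern that is pure coincidence, and 100 questions make a false discovery nearly certain.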

The project will be part of Brown’s recently launched Data Science Initiative, which is broadly aimed at developing these kinds of novel approaches to dealing with data.

Chris Bergenheim is a PBN staff writer.