The Data Science Incubator is a program that brings together data science experts and domain experts from across campus. The goal is to advance science by solving problems related to data scale, data visualization, machine learning, and similar challenges. The incubator works on the principle of direct collaboration: each project has a project lead who sits with the incubator staff two days per week.
Anyone is welcome to apply. We simply require that each project has a project lead who is willing to physically co-locate with the incubator staff two days per week.
Our initial plan is to schedule incubator projects on quarter boundaries. See the incubator page for the current schedule.
Not necessarily. We frequently find that data preparation tasks consume an inordinate share of project resources. We are interested in exploring solutions to these problems that allow researchers to focus on "doing science" instead of wrestling with data formats.
It is helpful to have the data "in hand" or easily accessible before starting a project.
Yes. We have extensive expertise with moving code and data to cloud infrastructures such as Amazon Web Services, Microsoft Azure, and the Google Cloud Platform. A reasonable incubator project would be to move an existing workflow to the cloud in a cost-effective manner.
Yes. Our preference is to publish our code and data to the web, as this promotes transparency and reproducibility. That said, we understand that not all data sets are suitable for publication. We are flexible.