Data Science Incubator FAQ

What is the Data Science Incubator?

The Data Science Incubator is a program that brings together data science experts and domain experts from across the campus. The goal is to advance science by solving problems related to data scale, data visualization, machine learning, and similar challenges. The incubator works on the principle of direct collaboration: each project has a project lead who sits with the incubator staff two days per week.

Who is this for? Can I apply if I am a {professor, grad student, post-doc, undergrad, etc.}?

Anyone is welcome to apply. We simply require that each project has a project lead who is willing to physically co-locate with the incubator staff two days per week.

How long does an incubator project last?

Our initial plan is to schedule incubator projects on quarter boundaries. See the incubator page for the current schedule.

Does my data need to be "clean" before submitting a proposal?

Not necessarily. We frequently find that data preparation tasks consume an inordinate share of project resources. We are interested in exploring solutions to these problems that allow researchers to focus on "doing science" instead of wrestling with data formats.

It is helpful to have the data "in hand" or easily accessible before starting a project.

I've heard that I need to move my work to the "cloud"... Can you help with this?

Yes. We have extensive expertise with moving code and data to cloud infrastructures such as Amazon Web Services, Microsoft Azure, and the Google Cloud Platform. A reasonable incubator project would be to move an existing workflow to the cloud in a cost-effective manner.

Is there a way to get informal help before submitting an incubator proposal?

We plan to hold office hours starting in spring 2014. Details will be announced.

I work with sensitive data. Can I still work with the incubator staff?

Yes. Our preference is to publish our code and data to the web, as this promotes transparency and reproducibility. That said, we understand that not all data sets are suitable for publication. We are flexible.