Reproducible Software Stack


You may have heard this echoing at conferences: “Data Science is a team sport”. So how do you go about collaborating when each member of the team is working on their own PC? What ran on my machine might not run on yours.

Replicating what someone else has done in his or her digital environment is the starting point for collaboration.


repo2docker takes the URL of a Git repository (you should already have one as part of your workflow) and creates a suitable Docker image.

It gives data scientists the benefits of containerization technology without needing to learn Docker itself.

This enables you to replicate data science environments and share them, allowing your team to verify the results of analyses.
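
Assuming the tool described here is repo2docker, a build can be sketched from Python by calling its command-line entry point, jupyter-repo2docker. This is a minimal sketch, not a definitive recipe: the helper names and the repository URL are hypothetical, and it assumes repo2docker and Docker are already installed.

```python
import shutil
import subprocess


def repo2docker_command(repo_url, ref=None, build_only=False):
    """Return the argument list for a jupyter-repo2docker call (hypothetical helper)."""
    cmd = ["jupyter-repo2docker"]
    if ref is not None:
        # --ref pins the build to a branch, tag, or commit of the repository.
        cmd += ["--ref", ref]
    if build_only:
        # --no-run builds the image without starting a container afterwards.
        cmd.append("--no-run")
    cmd.append(repo_url)
    return cmd


def build_environment(repo_url, ref=None):
    """Build the Docker image for a repository; assumes the CLI is on PATH."""
    if shutil.which("jupyter-repo2docker") is None:
        raise RuntimeError("jupyter-repo2docker was not found on PATH")
    subprocess.run(repo2docker_command(repo_url, ref, build_only=True), check=True)


if __name__ == "__main__":
    # Placeholder URL for illustration only; replace it with your team's repository.
    print(repo2docker_command("https://github.com/example-user/example-repo", ref="main"))
```

Pinning a ref is what makes the environment repeatable: everyone on the team builds from the same commit rather than from whatever the default branch holds that day.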



checkpoint, an R package from Microsoft, is designed to make it easy to write reproducible R code by allowing you to go backward (or forward) in time to retrieve the exact versions of the packages you need.


MyBinder is JupyterHub combined with repo2docker. It is a free site that runs IPython/Jupyter notebooks from GitHub, letting you host interactive notebooks that you can share.

Resource limits: a maximum of 2 GB RAM, a 10-minute inactivity timeout, and a 12-hour session limit.
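
Sharing a notebook through MyBinder comes down to handing out a launch link. The sketch below builds one, assuming the public mybinder.org service and its /v2/gh/ URL pattern for GitHub repositories. The function name, user, and repository are hypothetical placeholders.

```python
from urllib.parse import quote


def binder_launch_url(user, repo, ref="HEAD"):
    """Build a mybinder.org launch link for a GitHub repository (hypothetical helper)."""
    # Each part is percent-encoded so that a ref such as "feature/x" stays a single path segment.
    parts = [quote(part, safe="") for part in (user, repo, ref)]
    return "https://mybinder.org/v2/gh/" + "/".join(parts)


if __name__ == "__main__":
    # Placeholder names for illustration only.
    print(binder_launch_url("example-user", "example-repo", "main"))
```

Anyone who opens the resulting link gets a temporary session built from that repository, subject to the resource limits listed above.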