Increasingly important with:
- Data science in everything
- Journal expectations for availability of data and code
- Cloud computing for PHI
Unique to R
Very easy for R users to accidentally break something by updating R or a package.
R/Medicine
August 28, 2020
Goal: maximize reproducibility without sacrificing the RStudio experience we love!
Nothing is perfect, so use multiple layers of control to improve reproducibility.
Components to control:
If all of these are controlled, your analysis will always be reproducible.
Proactive > Reactive!
For two years I've used:
packrat
For complete control over the OS, we can use Docker.
Fantastic set of Rocker images, particularly the version-stable ones: https://hub.docker.com/r/rocker/r-ver
Very easy to set up:
$ docker pull rocker/rstudio:3.4.1
$ docker run -d rocker/rstudio:3.4.1
The versioned Rocker images change the default CRAN repo to a dated Microsoft snapshot (MRAN), so install.packages() installs package versions as they were on that date.
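One way to confirm which repository a versioned image points at is to print R's `repos` option from inside the container. The sketch below is an illustration under assumptions, not part of the talk: `show_repos` is a hypothetical helper name, and `DRY_RUN` is a made-up switch that prints the command instead of running it.

```shell
# Hypothetical helper (not from the talk): print the default package
# repository configured inside a versioned Rocker image.
show_repos() {
  image="${1:-rocker/r-ver:3.4.1}"   # the image tag pins the R version
  # With DRY_RUN set, echo the command instead of starting a container.
  ${DRY_RUN:+echo} docker run --rm "$image" Rscript -e 'getOption("repos")'
}
```

Calling `show_repos` with no argument checks `rocker/r-ver:3.4.1`; pass another tag to compare images. If the image behaves as described above, the printed URL should be a dated snapshot rather than a live CRAN mirror.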
To simplify everything, I recommend a packrat project within the mounted source directory:
$ docker run -d -e USERID=$UID -e PASSWORD=fake -v $(pwd):/work \
    -p 7009:8787 rocker/rstudio:3.4.1
> install.packages("packrat")
> packrat::init("/work")
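The launch step above can be wrapped in a small shell function so that the R version and port are stated in one place. This is a sketch under assumptions, not the author's script: `launch_rstudio`, `R_VERSION`, `HOST_PORT`, `RSTUDIO_PASSWORD`, and `DRY_RUN` are hypothetical names, and `id -u` stands in for `$UID`, which not every shell defines.

```shell
# Hypothetical wrapper (not from the talk) around the docker run command.
R_VERSION="${R_VERSION:-3.4.1}"   # image tag pins the R version
HOST_PORT="${HOST_PORT:-7009}"    # RStudio Server listens on 8787 in the container

launch_rstudio() {
  # With DRY_RUN set, echo the command instead of starting a container.
  ${DRY_RUN:+echo} docker run -d \
    -e USERID="$(id -u)" \
    -e PASSWORD="${RSTUDIO_PASSWORD:-fake}" \
    -v "$(pwd)":/work \
    -p "${HOST_PORT}:8787" \
    "rocker/rstudio:${R_VERSION}"
}
```

Run `launch_rstudio` from the project directory, then open localhost:7009 in a browser. Because /work is a mount of the host directory, the packrat library created by packrat::init("/work") stays with the project after the container is removed.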
Code here:
github.com/vincentmajor/reproducible_RStudio_projects