Practical Data Science for Stats

PeerJ Preprints has recently published a collection of articles that focus on the practical side of statistical analysis: Practical Data Science for Stats. While the articles are not peer-reviewed, they have been selected and edited by Jennifer Bryan and Hadley Wickham, both well-respected members of the R community. And while the articles provide great advice for any data scientist, the content does heavily feature the use of R, so it's particularly useful to R users.


There are 16 articles in the collection (with possibly more to come?). Here are just a few examples that caught my eye:

  • Bryan: "Excuse me, do you have a moment to talk about version control?". Practical advice on using Git, GitHub and Markdown for data science projects, with a focus on R.
  • Marwick, Boettiger and Mullen: Packaging data analytical work reproducibly using R (and friends). On using R packages as a vehicle for sharing research in a reproducible manner.
  • Taylor and Letham: Forecasting at Scale. On using the Prophet package for production-scale forecasting of time series.
  • Ross, Wickham, and Robinson: Declutter your R workflow with tidy tools. On using the tidyverse to make data analysis in R as smooth as possible.
  • Eddelbuettel and Balamuta: Extending R with C++: A Brief Introduction to Rcpp. An introduction to the Rcpp package for R.

There's lots more to explore in the collection as well, including case studies on using R at the likes of Airbnb and the New York Mets. Check out the entire collection at the link below.

PeerJ Collections: Practical Data Science for Stats (via Jenny Bryan)