Data Publishing with washr

Welcome

Publishing data can be challenging, especially when adhering to standards of reproducibility. Although technically available, an Excel workbook with multiple tabs, tucked away in an online archive, is far from practical. In other words, it lacks the key principles of being findable, accessible, interoperable, and reproducible — commonly known as FAIR.

This guide aims to provide a detailed walkthrough of how data can be published according to the FAIR principles. It builds on washr, an R package developed for swift data publication. The package emerged from the need to streamline certain steps when publishing the numerous datasets collected during GHE’s openwashdata academy.

The guide follows a chronological structure, starting from an empty repository and resulting in the publication of data as a website. 1  Creating a repository introduces version control, guiding readers through setting up both local and remote repositories to track all development steps. 2  Preparing your dataset focuses on transforming the dataset into a tidy format. 3  Documenting Your Dataset covers the documentation of the tidy dataset. Finally, 4  Publishing Your Dataset explains how users can create a website to showcase their dataset.

This book is a work in progress. As we publish more data, we learn more and these insights flow back into this book. If something is unclear or incomplete, please open an issue here and we’ll try to address it.