class: center, middle, inverse, title-slide

.title[
# Open Science, Reproducible Research, and R
]
.subtitle[
## An Overview
]
.author[
### Ernest Guevarra
]
.date[
### 2024-01-22
]

---
class: inverse, center, middle

# Go to the people. Live with them. Learn from them. Love them.

# Start with what they know. Build with what they have.

# But with the best leaders, when the work is done, the task accomplished, the people will say "We have done it ourselves".

---

# Outline

1. What is Open Science?

2. What is Reproducible Research?

3. Why Reproducible Research?

4. How to do Reproducible Research?

5. Why R?

---
background-color: #FFFFFF

# What is Open Science?

.pull-left[
.center[![](images/openscience.png)]

.footnote[based on UNESCO concept 2021]
]

.pull-right[
* movement to make scientific research and dissemination accessible to anyone

* aims for greater transparency in research and removes barriers for sharing outputs, resources, methods or tools at any stage of the research process
]

???

Open Science is a movement to make scientific research and dissemination accessible to anyone. As such, it aims for greater transparency in research and removes barriers for sharing outputs, resources, methods or tools at any stage of the research process.

---
background-color: #FFFFFF

# What is Reproducible Research?

.pull-left[
.center[![](images/scientific_method.png)]

.footnote[from the [Open Science Training Handbook](https://open-science-training-handbook.github.io/Open-Science-Training-Handbook_EN//02OpenScienceBasics/04ReproducibleResearchAndDataAnalysis.html)]
]

.pull-right[
* The concept of reproducibility is directly linked to the scientific method

    1. Formulating a hypothesis

    2. Designing the study

    3. Running the study and collecting the data

    4. Analyzing the data

    5. Reporting the study

* Each of these steps should be clearly reported by providing clear and open documentation, and thus making the study transparent and reproducible.
]

???

Reproducible research is directly linked to the scientific method. Each step in the scientific method should be clearly reported through clear and open documentation. This makes the research transparent and reproducible. From this perspective, we see reproducibility as an inherent quality of good research.

---
background-color: #FFFFFF

# What is Reproducible Research?

.pull-left[
* Research papers with accompanying software tools that allow the reader to directly reproduce the results and employ the methods that are presented in the research paper ([Gentleman and Lang, 2004](https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.684.9629&rep=rep1&type=pdf))

* Research data and code are made available so that others are able to reach the same results as are claimed in scientific outputs ([Open Science Training Handbook](https://open-science-training-handbook.github.io/Open-Science-Training-Handbook_EN//02OpenScienceBasics/04ReproducibleResearchAndDataAnalysis.html))

* The standard of reproducibility calls for the data and the computer code used to analyze the data be made available to others ([Peng, 2012](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3383002/pdf/nihms382015.pdf))
]

.pull-right[
.center[![](images/reproducible_research1.jpg)]

.footnote[from https://www.displayr.com/what-is-reproducible-research]
]

???

Reproducible research is underpinned by the open availability of the data and the code used in the study/research which, in theory, will allow others to reproduce the various steps and methods implemented in the study and its results.

Whilst the term `code` often refers to computer code and is commonly associated with the computational sciences, the concept of `code` can also be expanded to include detailed rules, protocols, process documentation, and other scientific processes used in other sciences, e.g., social sciences, economics, and anthropology.

---
background-color: #FFFFFF
background-image: url(images/reproducible_research3.jpg)
background-size: 55%

# Differentiating Reproducible Research

.footnote[This image was created by [Scriberia](https://www.scriberia.com/) for [The Turing Way](https://the-turing-way.netlify.app/welcome) community and is used under a CC-BY licence.]

???

Reproducibility is often conflated with the similar concepts of `replicability`, `robustness`, and `generalisability`. Here, we disambiguate these concepts to delineate what each specifically pertains to.

In `reproducible` research, given the same data and the same analysis, the same results are produced.

In `replicable` research, given different data and the same analysis, the same results are produced.

In `robust` research, given the same data and different analysis, the same results are produced.

In `generalisable` research, given different data and different analysis, the same results are produced.

---
background-color: #FFFFFF
background-image: url(images/nature_repro_research.jpg)
background-size: 65%

# Why Reproducible Research?

.footnote[Baker, M. 1,500 scientists lift the lid on reproducibility. Nature 533, 452–454 (2016). https://doi.org/10.1038/533452a]

???

Most researchers believe that the majority of research that has not been published is not reproducible. More than 50% of researchers report that they have been unable to reproduce other researchers' results. More interestingly, more than 40% of researchers report that they have been unable to reproduce their own results.

---
background-color: #FFFFFF
background-image: url(images/why_repro_research1.gif)
background-size: 90%

# Factors in irreproducible research

---

# Factors in irreproducible research

* Not enough documentation on how the experiment was conducted and how the data were generated

* Data used to generate the original results are unavailable

* Software used to generate the original results is unavailable

* Difficult to recreate the software environment (libraries, versions) used to generate the original results

* Difficult to rerun the computational steps

---
background-color: #FFFFFF

# How to do Reproducible Research?

.pull-left[
### The reproducibility spectrum

.center[![](images/reproducible_research2.png)]
]

.pull-right[
### Steps in reproducible research

* Record the project’s provenance

* Data and metadata curation

* Establish a testing/analysis workflow

* Test, document, and publish your code

* Share
]

---

# Why R for Reproducible Research?

.pull-left[
* freely available

* huge user and developer community

* has a robust set of user- and community-developed packages that support reproducible research
]

.pull-right[
.center[![](https://upload.wikimedia.org/wikipedia/commons/thumb/1/1b/R_logo.svg/724px-R_logo.svg.png)]
]

---
class: inverse, center, middle

# Questions?

---
class: inverse, center, middle

## Go to the people. Live with them. Learn from them. Love them.

## Start with what they know. Build with what they have.

## But with the best leaders, when the work is done, the task accomplished, the people will say "We have done it ourselves".

### Lao Tzu

---
class: inverse, center, middle

# Thank you!

Slides can be viewed at https://oxford-ihtm.io/open-reproducible-science/session6.html

PDF version of slides can be downloaded at https://oxford-ihtm.io/open-reproducible-science/pdf/session6-open-reproducible-science.pdf

R scripts for slides available [here](https://github.com/OxfordIHTM/open-reproducible-science/blob/main/session6.Rmd)