Everyone here is smart; distinguish yourself by being kind.
Learn more about this initiative here, and read a recent article on what kindness in science is and why it matters (Boulter et al., 2023).
In-person in Science Bldg, Rm 149.
Tuesday & Thursday from 9:00 to 10:15 AM.
The tentative timetable for this course is available here. The instructor may adjust the timetable to accommodate students' needs and will notify enrolled students of any changes.
Our priority is your safety!
There are currently no specific health guidelines, but for more information on this topic please visit the BSU public health response webpage.
The scientific community widely acknowledges that we are in the midst of a reproducibility crisis (see e.g. Baker, 2016). This course starts by reviewing the evidence for this crisis and its causes, and then highlights factors that boost reproducibility in science, especially in the fields of Ecology, Evolution & Behavior. The course also provides a platform to develop, practice and strengthen science communication skills (see e.g. Soltis et al., 2023 for some innovative ways to share your research).
The overarching aim of this course is therefore to provide students with the theoretical knowledge and bioinformatic tools necessary to improve transparency, reproducibility and efficiency in scientific research. Throughout the course, students will be taught how to use open source software for their research, such as R, RStudio and R Markdown (incl. knitr). To master these skills and improve their communication and teaching skills, students will i) design bioinformatic tutorials and teach them to their peers and ii) design and implement individual projects that develop a reproducible workflow specific to their research interests.
Overall, this course provides students with key knowledge to gather, store, share, prepare and analyze data as well as communicate results to the scientific community and various stakeholders (see Figure 4.1).
The course is subdivided into three parts:
Part 1 provides students with key theoretical knowledge on reproducible science allowing them to successfully design and implement a reproducible approach tailored to Ecology, Evolution & Behavior. This part will also cover material associated with open science and data management and how these practices intertwine with scientific publications.
Part 2 provides students with opportunities to further learn and apply the coding and bioinformatic tools required to implement a reproducible workflow (Figure 4.1). Here, students (with the support of the instructor) will develop a tutorial on a specific bioinformatic subject and teach it over two full classes. Tutorials will be written in R Markdown and distributed to the class one week in advance. Depending on class size, this assignment will be conducted in groups or individually.
Finally, Part 3 is designed to provide students with an opportunity to develop individual reproducible workflows tailored to their research project (or to data presented in a publication if they don’t have a thesis subject yet). This assignment will be done by applying knowledge gained during the previous parts and working in collaboration with the instructor (in some cases, we might seek support from thesis advisers).
PART 1: The Big Picture
PART 2: Bioinformatics for Reproducible Science
PART 3: Apply a Reproducible Approach to your Data
The reading material for this course is a mixture of publications and chapters, mostly from two textbooks (Gandrud, 2015; Wickham and Grolemund, 2017). We will also study the “Guides to” published by the British Ecological Society. Please find below the references used in each chapter. This list is not exhaustive and additional literature will be provided in class.
Chapter | Reference(s) |
---|---|
Chap. 1 | Chapter 3 of Gandrud (2015) |
Chap. 2 | Baker (2016); Freedman et al. (2015); Munafo et al. (2017); Peng and Hicks (2021); Sarewitz (2016) |
Chap. 3 | Bone et al. (2015); Markowetz (2015); Smith et al. (2016) |
Chap. 4 | Carroll et al. (2021); Creative Commons; Wagner et al. (2022); Williams et al. (2023) |
Chap. 5 | British Ecological Society (2014a) & Chapter 4 of Gandrud (2015); British Ecological Society (2014d) & Chapter 2 of Gandrud (2015); Trisovic et al. (2022) |
Chap. 6 | British Ecological Society (2014b); British Ecological Society (2014c) |
Chap. 7 | Chapter 6 of Gandrud (2015) |
Chap. 8 | Chapter 7 of Gandrud (2015) & Chapters 9-10 of Wickham and Grolemund (2017) |
Chap. 9 | Chapter 8 of Gandrud (2015) |
Chap. 10 | Chapter 9 of Gandrud (2015) |
Chap. 11 | Chapter 10 of Gandrud (2015), Chapters 1 and 22 of Wickham and Grolemund (2017) & Guangchuang et al. (2017) |
Chap. 12 | Chapter 5 of Gandrud (2015) |
To further illustrate how a reproducible approach can be implemented in your research, the instructor provides here examples of publications and software produced by students who have attended this course (alphabetically sorted):
Research is often presented in the form of slideshows, articles or books. These presentation documents announce a project’s findings, but they are not the research; they are the advertisement part of the research project!
The research is the full software environment, code, and data that produced the results (Donoho, 2010).
When we separate the research from its advertisement, we are making it difficult for others to verify the findings by reproducing them.
This course will give you the tools to dynamically combine your research with the presentation of your findings. The first tool will be a workflow for reproducible research weaving the principles of reproducibility throughout your entire research project, from data gathering to the statistical analysis, and the presentation of results. To reach this goal, you will learn how to use a number of computer tools that make this workflow possible.
The main bioinformatic tools covered in this course are:
As shown above, R and RStudio are at the core of this course and will have to be installed on your computers. This can be easily done by downloading the software from the following websites:
The download webpages for these programs have comprehensive information on how to install them, so please refer to those pages for more information.
If you are planning to create LaTeX documents, you will need to install a TeX distribution. Please refer to this website for more details: https://www.latex-project.org/get/
If you want to create R Markdown documents, you can install the rmarkdown package in R separately (see below for more details).
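To make the idea of combining analysis and presentation concrete, here is a minimal sketch of a dynamic document. It assumes the knitr package is installed (it is among the packages used in this course); the report text and the object names `fence`, `rmd_lines` and `report_md` are illustrative only and are not part of the course material.

```r
library(knitr)

# Build the chunk delimiter programmatically so that this example can be
# pasted into an R Markdown document without closing the enclosing chunk.
fence <- strrep("`", 3)

# A tiny R Markdown source: prose with an inline R expression, plus one chunk.
rmd_lines <- c(
  "# A tiny dynamic report",
  "",
  "The mean of 1 to 10 is `r mean(1:10)`.",
  "",
  paste0(fence, "{r}"),
  "summary(1:10)",
  fence
)

# knit() runs the embedded R code and returns the resulting Markdown text.
report_md <- knit(text = rmd_lines, quiet = TRUE)
cat(report_md)
```

In the knitted output, the inline expression is replaced by its computed value, so the reported number cannot drift away from the code that produced it.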
We will be using a number of R packages especially designed to support reproducible research. Many of those packages are not included in the default R installation and will need to be installed separately. To install key packages used in class, copy the following code and paste it into your R console:
install.packages(c("brew", "countrycode", "devtools", "dplyr", "ggplot2", "googleVis",
"knitr", "rmarkdown", "tidyr", "xtable"))
Once you enter this code, you may be asked to select a CRAN “mirror” to download the packages from. Simply select the mirror closest to you.
Finally, it is highly likely that we will have to install additional packages. In this case, you can install them with the same R function, install.packages(), or through RStudio as follows: select “Tools” -> “Install Packages …” and then type the name of the package in the window (make sure to tick the “Install dependencies” box).
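When packages are added over time, it can help to install only those that are not yet present. The following is a minimal sketch of that idea; `missing_packages()` is a hypothetical helper written for this example (it is not part of R or of any package), and the package vector simply repeats the names listed in this section.

```r
# Packages named in the installation instructions of this section.
course_pkgs <- c("brew", "countrycode", "devtools", "dplyr", "ggplot2",
                 "googleVis", "knitr", "rmarkdown", "tidyr", "xtable")

# Hypothetical helper: return the subset of `pkgs` that cannot be found
# in the current library paths. system.file() returns "" for a package
# that is not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, function(p) nzchar(system.file(package = p)),
                      logical(1))
  pkgs[!installed]
}

to_install <- missing_packages(course_pkgs)

# Only download when running interactively and something is missing.
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

Checking first avoids re-downloading packages that are already installed, which keeps repeated runs of a setup script fast.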
RStudio provides a suite of cheat sheets that can be accessed by going to the “Help” menu and selecting “Cheatsheets”.
Five cheat sheets are especially relevant to chapters taught in this course:
These documents together with the material presented in publications & textbooks will provide the basis to design your bioinformatic tutorials.
Please find below two documents providing a comprehensive introduction to R:
There will not be any classical exams in this course. Instead, we will focus on developing theoretical and bioinformatic skills and applying those to your research. In this context, each student will be asked to produce a bioinformatic tutorial and teach it to their peers (see PART 2). Each student will also be tasked with producing a report (tailored to their thesis project or a publication) and presenting their results and conclusions in class.
Students will be graded based on the following four tasks:
These four tasks sum to a total of 550 points, and Table 12.1 shows the grading scale applied in this course.
Percentage | Grade |
---|---|
100-98 | A+ |
97.9-93 | A |
92.9-90 | A- |
89.9-88 | B+ |
87.9-83 | B |
82.9-80 | B- |
79.9-78 | C+ |
77.9-73 | C |
72.9-70 | C- |
69.9-68 | D+ |
67.9-60 | D |
59.9-0 | F |
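As a worked illustration of the grading scale, the sketch below converts points earned out of 550 into a percentage and then into a letter grade. It is not the instructor's grading procedure: `points_to_grade()` is a hypothetical helper, and it assumes that each grade applies from its lower bound (inclusive) up to the next grade's lower bound, which is one way to handle percentages that fall between rows such as 97.9 and 98.

```r
# Lower bounds and letter grades copied from the grading-scale table.
lower_bounds  <- c(0, 60, 68, 70, 73, 78, 80, 83, 88, 90, 93, 98)
letter_grades <- c("F", "D", "D+", "C-", "C", "C+",
                   "B-", "B", "B+", "A-", "A", "A+")

# Hypothetical helper: convert points earned (out of `total`) to a letter.
points_to_grade <- function(points, total = 550) {
  stopifnot(all(points >= 0), all(points <= total))
  pct <- 100 * points / total
  # findInterval() returns, for each percentage, the index of the last
  # lower bound that is less than or equal to it.
  letter_grades[findInterval(pct, lower_bounds)]
}

points_to_grade(c(550, 500, 440, 300))
```

For example, 500 points correspond to about 90.9%, which falls in the 90 to 92.9 row of the table.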
During week 1, each student will be assigned a chapter of PART 2 to study and will produce a bioinformatic tutorial on it. Based on enrollment, students might work individually or in pairs (please see below for more information).
Tutorials will have to be written in the knitr/rmarkdown language as implemented in RStudio. Tutorials should focus on a suite of exercises that build key bioinformatic skills specific to each chapter (see PART 2). Students are welcome to use material presented in Gandrud (2015) and Wickham and Grolemund (2017) to develop their tutorials, but they can also use other sources as long as they are properly cited in their documents. See the Publications & Textbooks and RStudio Cheat Sheets sections for more details.
Students should design their tutorials to be completed within 2 laboratory sessions (see below). Tutorials should be submitted to the instructor 1 week in advance for review and upload to the shared Google Drive.
While designing your tutorials, please think about the following points:
Based on the information provided above, your tutorial should include:
The instructor is asking students to sign up for designing and teaching a bioinformatic tutorial by accessing the following Google sheet (the number of students per chapter is provided, please respect this guideline):
Students are expected to prepare a 10-minute presentation providing general guidelines to complete the tutorial. Presentations will be uploaded onto the shared Google Drive and made accessible to all students. Students are expected to support their peers in completing tutorials by answering questions. The instructor will also answer questions, but students lead the teaching of the bioinformatic laboratories.
Students will be graded according to their ability to teach their tutorials and answer questions. The instructor might also use students’ feedback to grade this assignment.
Students will work alongside the instructor to develop a reproducible workflow specific to their thesis project. In cases where students do not yet have a clear idea of their thesis project, they will work with the instructor to identify a publication that can serve as the basis for their individual project.
Reports will be written using the knitr/rmarkdown markup language as implemented in RStudio. The instructor expects students to provide a list of references supporting their reports. References will have to be cited in the text: it is not enough to throw a bunch of references at the end of the report. This exercise aims at supporting methodological decisions taken in the report and increasing transparency.
Each student will have to present their report during the final week. The presentation should follow the same structure as the report and not exceed 15 minutes. There will be 5 minutes at the end of the presentation allocated for questions.
The instructor expects students to deliver their assignments on time and set enough time aside to work on their projects (see above for more details). However, if you have any issues preventing completion of your work on time, please contact the instructor as soon as possible to find common solutions.
The instructor expects students to attend classes (please join on time and for the full duration of the class) and actively engage by asking questions and giving feedback on teaching material and course content. This course was designed to help students implement a reproducible approach to their research projects. If you feel that additional content should be covered in class, please contact the instructor. The instructor will do his very best to obtain information or seek support from colleagues to cover the requested material. If you have any issues attending class, please contact the instructor by email (svenbuerki@boisestate.edu) as soon as possible and see below for more details.
The instructor will be prepared for class, arrive on time and not leave early. He will also be respectful of you and your opinions. Overall, the instructor wants to foster a kind and respectful class environment where all students can express themselves and share their opinions. This means that meaningful and constructive dialogue is encouraged in this class, which requires a degree of mutual respect, willingness to listen, and tolerance of opposing points of view. Respect for individual differences and alternative viewpoints will be maintained at all times in this class. One’s words and use of language should be temperate and within acceptable bounds of civility and decency. Finally, the instructor will reply to emails and grade assignments as soon as possible (and provide constructive feedback) to help students master the material presented in class.
The instructor has developed this course to provide a welcoming environment and effective, equitable learning experience for all students. If you encounter barriers in this course, please bring them to my attention so that I may work to address them.
Students in this class represent a rich variety of backgrounds and perspectives. The Biological Sciences department is committed to providing an atmosphere for learning that respects diversity and creates inclusive environments in our courses. While working together to build this community, we ask all members to:
Please let the instructor know of your preferred or adopted name and gender pronoun(s), and he will make those changes to his own records and address you that way in all cases.
To change to a preferred name so that it displays on all BSU sites, including Canvas and the course roster, contact the Registrar’s Office at (208) 426-4249. Note that only a legal name change can alter your name on BSU official and legal documents (e.g., your transcript).
The instructor recognizes that navigating your education and life can often be more difficult if you have disabilities. I want you to achieve at your highest capacity in this class. If you have a disability, the instructor needs to know if you encounter inequitable opportunities in this course related to:
- Accessing and understanding course materials.
- Engaging with course materials and other students in the course.
- Demonstrating your skills and knowledge on assignments and exams.
If you have a documented disability, you may be eligible for accommodations in all of your courses. To learn more, make an appointment with the university’s Educational Access Center.
The instructor recognizes the unique challenges that can arise for students who are also parents or guardians of children. If you have any specific needs related to this topic, please contact the instructor as soon as possible.
To create a welcoming, engaging, and effective learning environment, the instructor expects all of us to exhibit behavior that reflects Boise State’s Statement of Shared Values. The Shared Values emphasize academic excellence, caring, citizenship, fairness, respect, responsibility, and trustworthiness. In keeping with these values, the instructor expects students in this course to uphold the standards outlined in the Boise State University Student Code of Conduct.
If you are struggling for any reason (e.g., COVID, relationship, family, or life’s stresses) and believe this may impact your performance in the course, the instructor encourages you to contact the Dean of Students at (208) 426-1527 or email deanofstudents@boisestate.edu for support. Additionally, if you are comfortable doing so, please reach out to me and I will provide any resources or accommodations that I can. If you notice a significant change in your mood or sleep, or experience feelings of hopelessness or a lack of self-worth, consider connecting immediately with Counseling Services (1529 Belmont Street, Norco Building) at (208) 426-1459 or email healthservices@boisestate.edu.
The university has many resources designed to support you as a learner and human being. Among these are:
Citations of all R packages used to generate this report.
[1] J. Allaire, Y. Xie, C. Dervieux, et al. rmarkdown: Dynamic Documents for R. R package version 2.21. 2023. https://CRAN.R-project.org/package=rmarkdown.
[2] C. Boettiger. knitcitations: Citations for Knitr Markdown Files. R package version 1.0.12. 2021. https://github.com/cboettig/knitcitations.
[3] M. C. Koohafkan. kfigr: Integrated Code Chunk Anchoring and Referencing for R Markdown Documents. R package version 1.2.1. 2021. https://github.com/mkoohafkan/kfigr.
[4] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria, 2022. https://www.R-project.org/.
[5] H. Wickham, J. Bryan, and M. Barrett. usethis: Automate Package and Project Setup. R package version 2.1.6. 2022. https://CRAN.R-project.org/package=usethis.
[6] H. Wickham, R. François, L. Henry, et al. dplyr: A Grammar of Data Manipulation. R package version 1.0.9. 2022. https://CRAN.R-project.org/package=dplyr.
[7] H. Wickham, J. Hester, W. Chang, et al. devtools: Tools to Make Developing R Packages Easier. R package version 2.4.4. 2022. https://CRAN.R-project.org/package=devtools.
[8] Y. Xie. bookdown: Authoring Books and Technical Documents with R Markdown. ISBN 978-1138700109. Boca Raton, Florida: Chapman and Hall/CRC, 2016. https://bookdown.org/yihui/bookdown.
[9] Y. Xie. bookdown: Authoring Books and Technical Documents with R Markdown. R package version 0.33. 2023. https://CRAN.R-project.org/package=bookdown.
[10] Y. Xie. Dynamic Documents with R and knitr. 2nd ed. ISBN 978-1498716963. Boca Raton, Florida: Chapman and Hall/CRC, 2015. https://yihui.org/knitr/.
[11] Y. Xie. formatR: Format R Code Automatically. R package version 1.12. 2022. https://github.com/yihui/formatR.
[12] Y. Xie. “knitr: A Comprehensive Tool for Reproducible Research in R”. In: Implementing Reproducible Computational Research. Ed. by V. Stodden, F. Leisch and R. D. Peng. ISBN 978-1466561595. Chapman and Hall/CRC, 2014.
[13] Y. Xie. knitr: A General-Purpose Package for Dynamic Report Generation in R. R package version 1.42. 2023. https://yihui.org/knitr/.
[14] Y. Xie and J. Allaire. tufte: Tufte’s Styles for R Markdown Documents. R package version 0.12. 2022. https://github.com/rstudio/tufte.
[15] Y. Xie, J. Allaire, and G. Grolemund. R Markdown: The Definitive Guide. Boca Raton, Florida: Chapman and Hall/CRC, 2018. ISBN: 9781138359338. https://bookdown.org/yihui/rmarkdown.
[16] Y. Xie, C. Dervieux, and E. Riederer. R Markdown Cookbook. Boca Raton, Florida: Chapman and Hall/CRC, 2020. ISBN: 9780367563837. https://bookdown.org/yihui/rmarkdown-cookbook.
[17] H. Zhu. kableExtra: Construct Complex Table with kable and Pipe Syntax. R package version 1.3.4. 2021. https://CRAN.R-project.org/package=kableExtra.
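Package citations like those listed above can be retrieved from within R, which makes it easier to cite software in your own reports. The sketch below uses `citation()` and `toBibtex()` from the utils package; `cite_packages()` is a hypothetical helper written for this example, and the two package names passed to it are placeholders that ship with every R installation.

```r
# Citation for R itself (compare with entry [4] in the list above).
r_cite <- citation()
print(r_cite, style = "text")

# Hypothetical helper: collect BibTeX entries for the packages in `pkgs`,
# skipping any package that is not installed.
cite_packages <- function(pkgs) {
  is_installed <- vapply(pkgs, function(p) nzchar(system.file(package = p)),
                         logical(1))
  installed <- pkgs[is_installed]
  lapply(stats::setNames(installed, installed),
         function(p) toBibtex(citation(p)))
}

# Placeholder package names; replace with the packages used in your report.
bib <- cite_packages(c("stats", "utils"))
print(bib[["stats"]])
```

The BibTeX entries returned this way can be written to a `.bib` file and cited in the text of an R Markdown report.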