Chapter 2 Course information

This is the coursebook for the Colorado State University course ERHS 732, Advanced Epidemiological Analysis. This course provides the opportunity to implement theoretical expertise through designing and conducting advanced epidemiologic research analyses and to gain in-depth experience analyzing datasets from the environmental epidemiology literature. It complements the student’s training in advanced epidemiological methods, leveraging regression approaches and statistical programming. Although basic theoretical frameworks behind analysis and statistical modeling approaches will be introduced, this course will not go into depth on statistical and epidemiologic theory, and students are expected to be familiar with general epidemiologic concepts such as confounding and selection bias. During the course, students will analyze two datasets from the environmental epidemiology literature: (1) time series data with daily measures of weather, air pollution, and cardiorespiratory outcomes in London, England, and (2) a dataset with measures from the Framingham Heart Study. Additional datasets and studies will be discussed and explored as a supplement.

This class will use a variety of instructional formats, including short lectures, readings, topic-specific examples from the substantive literature, group and student-directed discussions, and directed group work on in-course coding exercises that put lecture and discussion content into practice. It is expected that before coming to class, students will read the required papers for the week, as well as any associated code included in the papers’ supplemental materials. Students should come to class prepared to do statistical programming (i.e., bring a laptop with statistical software and download any datasets needed for the week). Participation is assessed through in-course coding exercises on each week’s topic. If a student misses a class, they will be expected to complete the in-course exercise outside of class to receive credit for participation in that exercise. Students will be required to complete midterm and final projects, each of which will be presented in class and submitted as a written report describing the project.

2.1 Course learning objectives

The learning objectives for this course complement the core epidemiology and statistics courses required by the program and give students the opportunity to apply the theoretical skills and knowledge gained in those courses in a more applied setting.

Upon successful completion of this course students will be able to:

  1. List several possible statistical approaches to answering an epidemiological research question. (Knowledge)
  2. Choose among analytical approaches learned in previous courses to identify one that is reasonable for an epidemiological research question. (Application)
  3. Design a plan for cleaning and analyzing data to answer an epidemiological research question, drawing on techniques learned in previous and concurrent courses. (Synthesis)
  4. Justify the methods and code used to answer an epidemiological research question. (Evaluation)
  5. Explain the advantages and limitations of a chosen methodological approach for evaluating epidemiological data. (Evaluation)
  6. Apply advanced epidemiological methods to analyze example data, using a regression modeling framework. (Application)
  7. Apply statistical programming techniques learned in previous courses to prepare epidemiological data for statistical analysis and to conduct the analysis. (Application)
  8. Interpret the output from statistical analyses of data for an epidemiological research question. (Evaluation)
  9. Defend conclusions from their analysis. (Comprehension)
  10. Write a report describing the methods, results, and conclusions from an epidemiological analysis. (Application)
  11. Construct a reproducible document with embedded code to clean and analyze data to answer an epidemiological research question. (Application)

2.2 Meeting time and place

The class will meet on Mondays, 2:00–3:40 PM on the Colorado State University campus in MRB 312.

2.3 Class Structure and Expectations

  • Homework/preparation: It is expected that before coming to class, students will read the required papers for the week, as well as the online book sections assigned for the week. Reading assignments will be announced the week before each class session. Students should come to class prepared to do statistical programming (i.e., bring in a laptop with statistical software, download any datasets needed for the week).
  • In-class schedule:
    • Topic overview: Each class will start with a brief overview of the week’s topic. This will focus on the material covered in that week’s assigned reading in the online book and papers.
    • Discussion of analysis and coding points: Students and faculty will be divided into small groups to discuss the assigned reading and think more deeply about the content. This is a time to bring up questions and relate the chapter concepts to other datasets and/or analysis methods you are familiar with.
    • Group work: In small groups, students will work on designing an epidemiological analysis for the week’s topic and developing code to implement that analysis. This will follow the prompts given in the assigned reading from the online book for the week.
    • Wrap-up: We will reconvene as one group at the end to discuss topics that came up in small group work and to outline expectations for students before the next meeting.

2.4 Course grading

| Assessment Components | Percentage of Grade |
|---|---|
| Midterm written report | 30 |
| Midterm presentation | 15 |
| Final written report | 30 |
| Final presentation | 15 |
| Participation in in-course exercises | 10 |

  • Midterm report (written and presentation): Students will work in groups to prepare an oral presentation and accompanying written report presenting an epidemiologic analysis using a time series dataset similar to the London dataset used for the first half of the course. The group may pick a research question based on the topics covered in the first half of the course. The presentation should be 15 minutes long and should be structured like a conference presentation (Introduction, Methods, Results, and Discussion). The written report should be approximately six pages (single spaced) and should cover the same topics. It should include two to four well-designed figures and/or tables. The written report should be created following reproducible research principles and using a bibliography referencing system (e.g., BibTeX if the student uses RMarkdown to write the report). The report should be written to the standard expected for a peer-reviewed publication in terms of clarity, grammar, spelling, and referencing. The midterm reports will be due (and presented) in the eighth week of class (the seventh class session, since there will be no class on Labor Day).
  • Final report (written and presentation): Each student will prepare an oral presentation and accompanying written report presenting an epidemiologic analysis using either a dataset similar to the Framingham dataset used for the second half of the course or their own dataset from their research. The student may pick a research question based on the topics covered in the second half of the course. The presentation should be 10 minutes long and should be structured like a conference presentation (Introduction, Methods, Results, and Discussion). The written report should be approximately six pages (single spaced) and should cover the same topics. It should include at least two well-designed figures and/or tables. The written report should be created following reproducible research principles and using a bibliography referencing system (e.g., BibTeX if the student uses RMarkdown to write the report). The report should be written to the standard expected for a peer-reviewed publication in terms of clarity, grammar, spelling, and referencing. The final presentations will be given during the assigned finals period for our course.
  • Participation: Attendance is an essential part of participating in the class. We understand that things come up; however, you are expected to attend every class and to come prepared. You are also expected to participate actively in discussions and group work during the class period.

2.5 Course Schedule

| Class | Date | Study type | Topic | Book sections |
|---|---|---|---|---|
| 1 | August 21 | Time series | Time series / case-crossover study designs | 3.1–3.4 |
| 2 | August 28 | Time series | Time series / case-crossover study designs | 3.5 |
| 3 | September 11 | Time series | Generalized linear models | 4.1–4.2 |
| 4 | September 18 | Time series | Generalized linear models | 4.3 |
| 5 | September 25 | Time series | Natural experiments | 5.1–5.3 |
| 6 | October 2 | Time series | Risk assessment | 6.1–6.3 |
| 7 | October 9 | Time series | Group midterm presentations | None |
| 8 | October 16 | Cohort | Longitudinal cohort study designs | 7.1–7.3 |
| 9 | October 23 | Cohort | Longitudinal cohort study designs | 7.4 |
| 10 | October 30 | Cohort | Inverse probability weighting, propensity scores | 8.1–8.3 |
| 11 | November 6 | Cohort | Mixed models | 9.1–9.5 |
| 12 | November 13 | Cohort | Instrumental variables | 10.1–10.7 |
| 13 | November 27 | Cohort | Final presentation discussion and preparation | None |
| 14 | December 4 | Cohort | Final presentations | None |
| 15 | December 11 | Cohort | Final presentations | None |

2.6 Textbooks and Course Materials

Readings for this course will focus on peer-reviewed literature that will be posted for the students in the class, as well as assigned reading from this online book.

Additional general references that will be useful to students throughout the semester include:

  • Hadley Wickham and Garrett Grolemund, R for Data Science, O’Reilly, 2017. (Available for free online at https://r4ds.had.co.nz/ and in print through most large booksellers.)
  • Miguel A. Hernán and James M. Robins, Causal Inference: What If, Boca Raton: Chapman & Hall/CRC, 2020. (Available for free online at https://cdn1.sph.harvard.edu/wp-content/uploads/sites/1268/2021/01/ciwhatif_hernanrobins_31jan21.pdf with a print version anticipated in 2021.)
  • Roger D. Peng and Francesca Dominici, Statistical Methods for Environmental Epidemiology with R, Springer, 2008. (Available online through the CSU library or in print through Springer.)

2.7 Prerequisites and Preparation

This course assumes experience in epidemiology and some experience with statistics and statistical programming (e.g., in R or SAS). Students should have taken (1) Epidemiologic Methods (ERHS 532/PBHL 570), (2) Advanced Epidemiology (ERHS 640), (3) R Programming for Research (ERHS 535) or SAS and Epidemiologic Data Management (ERHS 534), and (4) Design and Data Analysis for Researchers I (STAT 511), or equivalent courses or experience, prior to taking this course. While previous SAS experience is acceptable for the course, example code will be in R, so you will find it helpful to review the basics of R prior to class.

If you would like to prepare for this course, the best way is to review R programming, epidemiology, and regression modeling. Each chapter of the book lists the required and supplemental reading for the week. If you need a review of any of these topics, we recommend starting with the papers and book chapters listed as general reviews in the chapters’ supplemental reading sections.

2.8 Academic Honesty

Lack of knowledge of the academic honesty policy is not a reasonable explanation for a violation. For more information, see Colorado State University’s policies on Academic Integrity / Misconduct.