Course Summary

This course provides an overview of statistical methods for analyzing data in which the outcome variable is categorical or discrete. It emphasizes both the theoretical underpinnings of the methods and an applied understanding of their computation and interpretation, since both are necessary to succeed with real data analysis. We will cover inference for binomial and multinomial variables with contingency tables, generalized linear models, logistic regression for binary responses, logit models for multiple response categories, log-linear models, some statistical machine learning approaches, and inference for matched pairs and correlated/clustered data. Examples will be taken from public health and biomedical research. Students will be evaluated on homework assignments, a midterm exam, and a final project.


Course Details

Course number: BIOSTATS 743

Instructor: Nicholas Reich

Office hours: Tuesday 2:30-3:30pm, Arnold 425

TA: Zhengfan Wang

Prerequisites: BIOSTATS 540 (Intro Biostatistics), STAT 515 (Intro Statistics I), STAT 516 (Intro Statistics II), and BIOSTATS 650/STAT 525 (both are courses on linear regression), or equivalent coursework. Prior programming experience is required. If you have not taken some of the listed prerequisites but still think the course is the right choice for you, you may petition the instructor directly for permission to enroll.

Lectures: T/Th 1-2:15pm, LGRT 141

Required books: Categorical Data Analysis, 3rd Edition, by Alan Agresti

Recommended books: Data Analysis Using Regression and Multilevel/Hierarchical Models, by Gelman and Hill

