## Statistical Modeling and Data Visualization

#### Course summary

This course introduces fundamental statistical concepts and tools for summarizing, analyzing, and visualizing data, and builds your knowledge of the principles of biostatistical inference. We will focus on linear regression and generalized linear models, using examples and exercises drawn from medical and public health research.

#### Course details

**Course number**: PUBHLTH 690NR

**Instructor**: Nicholas Reich

**Office hours**: Wed 9:30–10:30 or by appointment

**Prerequisites**:

A first course in statistics or biostatistics.

Familiarity with the R statistical programming language.

Working knowledge of basic matrix methods and calculus (optional, but recommended).

**Lectures**: Tu/Th, 11:15am–12:30pm, LGRC 204

**Required books** (all freely available online)

Faraway JJ. 2002. *Practical Regression and Anova using R*.

James G, Witten D, Hastie T, and Tibshirani R. 2014. *An Introduction to Statistical Learning*.

Diez D, Barr C, and Çetinkaya-Rundel M. 2012. *OpenIntro Statistics, 2nd Edition*.

**Recommended books**

Hefferon J. 2014. *Linear Algebra* (free textbook).

Weisberg S. 2005. *Applied Linear Regression, 3rd Edition*.

Kutner M, Nachtsheim C, Neter J, and Li W. 2004. *Applied Linear Regression Models, 4th Edition*.

Hosmer DW and Lemeshow S. 2000. *Applied Logistic Regression, 2nd Edition*.

**Topics covered**

Simple and multiple linear regression

Least squares estimation, interpretation, and inference for linear regression models

Goodness of fit, model diagnostics

Model selection

Inference using bootstrapping

Smoothing splines

Logistic regression (introduction)

Longitudinal data analysis (introduction)

Poisson regression (introduction, time permitting)