January 27, 2016
You have three central choices for making graphics in R: base graphics, ggplot2, and lattice.
library(NHANES)
data(NHANES)
str(NHANES)
## Classes 'tbl_df', 'tbl' and 'data.frame': 10000 obs. of 76 variables:
## $ ID : int 51624 51624 51624 51625 51630 51638 51646 51647 51647 51647 ...
## $ SurveyYr : Factor w/ 2 levels "2009_10","2011_12": 1 1 1 1 1 1 1 1 1 1 ...
## $ Gender : Factor w/ 2 levels "female","male": 2 2 2 2 1 2 2 1 1 1 ...
## $ Age : int 34 34 34 4 49 9 8 45 45 45 ...
## $ AgeDecade : Factor w/ 8 levels " 0-9"," 10-19",..: 4 4 4 1 5 1 1 5 5 5 ...
## $ AgeMonths : int 409 409 409 49 596 115 101 541 541 541 ...
## $ Race1 : Factor w/ 5 levels "Black","Hispanic",..: 4 4 4 5 4 4 4 4 4 4 ...
## $ Race3 : Factor w/ 6 levels "Asian","Black",..: NA NA NA NA NA NA NA NA NA NA ...
## $ Education : Factor w/ 5 levels "8th Grade","9 - 11th Grade",..: 3 3 3 NA 4 NA NA 5 5 5 ...
## $ MaritalStatus : Factor w/ 6 levels "Divorced","LivePartner",..: 3 3 3 NA 2 NA NA 3 3 3 ...
## $ HHIncome : Factor w/ 12 levels " 0-4999"," 5000-9999",..: 6 6 6 5 7 11 9 11 11 11 ...
## $ HHIncomeMid : int 30000 30000 30000 22500 40000 87500 60000 87500 87500 87500 ...
## $ Poverty : num 1.36 1.36 1.36 1.07 1.91 1.84 2.33 5 5 5 ...
## $ HomeRooms : int 6 6 6 9 5 6 7 6 6 6 ...
## $ HomeOwn : Factor w/ 3 levels "Own","Rent","Other": 1 1 1 1 2 2 1 1 1 1 ...
## $ Work : Factor w/ 3 levels "Looking","NotWorking",..: 2 2 2 NA 2 NA NA 3 3 3 ...
## $ Weight : num 87.4 87.4 87.4 17 86.7 29.8 35.2 75.7 75.7 75.7 ...
## $ Length : num NA NA NA NA NA NA NA NA NA NA ...
## $ HeadCirc : num NA NA NA NA NA NA NA NA NA NA ...
## $ Height : num 165 165 165 105 168 ...
## $ BMI : num 32.2 32.2 32.2 15.3 30.6 ...
## $ BMICatUnder20yrs: Factor w/ 4 levels "UnderWeight",..: NA NA NA NA NA NA NA NA NA NA ...
## $ BMI_WHO : Factor w/ 4 levels "12.0_18.5","18.5_to_24.9",..: 4 4 4 1 4 1 2 3 3 3 ...
## $ Pulse : int 70 70 70 NA 86 82 72 62 62 62 ...
## $ BPSysAve : int 113 113 113 NA 112 86 107 118 118 118 ...
## $ BPDiaAve : int 85 85 85 NA 75 47 37 64 64 64 ...
## $ BPSys1 : int 114 114 114 NA 118 84 114 106 106 106 ...
## $ BPDia1 : int 88 88 88 NA 82 50 46 62 62 62 ...
## $ BPSys2 : int 114 114 114 NA 108 84 108 118 118 118 ...
## $ BPDia2 : int 88 88 88 NA 74 50 36 68 68 68 ...
## $ BPSys3 : int 112 112 112 NA 116 88 106 118 118 118 ...
## $ BPDia3 : int 82 82 82 NA 76 44 38 60 60 60 ...
## $ Testosterone : num NA NA NA NA NA NA NA NA NA NA ...
## $ DirectChol : num 1.29 1.29 1.29 NA 1.16 1.34 1.55 2.12 2.12 2.12 ...
## $ TotChol : num 3.49 3.49 3.49 NA 6.7 4.86 4.09 5.82 5.82 5.82 ...
## $ UrineVol1 : int 352 352 352 NA 77 123 238 106 106 106 ...
## $ UrineFlow1 : num NA NA NA NA 0.094 ...
## $ UrineVol2 : int NA NA NA NA NA NA NA NA NA NA ...
## $ UrineFlow2 : num NA NA NA NA NA NA NA NA NA NA ...
## $ Diabetes : Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
## $ DiabetesAge : int NA NA NA NA NA NA NA NA NA NA ...
## $ HealthGen : Factor w/ 5 levels "Excellent","Vgood",..: 3 3 3 NA 3 NA NA 2 2 2 ...
## $ DaysPhysHlthBad : int 0 0 0 NA 0 NA NA 0 0 0 ...
## $ DaysMentHlthBad : int 15 15 15 NA 10 NA NA 3 3 3 ...
## $ LittleInterest : Factor w/ 3 levels "None","Several",..: 3 3 3 NA 2 NA NA 1 1 1 ...
## $ Depressed : Factor w/ 3 levels "None","Several",..: 2 2 2 NA 2 NA NA 1 1 1 ...
## $ nPregnancies : int NA NA NA NA 2 NA NA 1 1 1 ...
## $ nBabies : int NA NA NA NA 2 NA NA NA NA NA ...
## $ Age1stBaby : int NA NA NA NA 27 NA NA NA NA NA ...
## $ SleepHrsNight : int 4 4 4 NA 8 NA NA 8 8 8 ...
## $ SleepTrouble : Factor w/ 2 levels "No","Yes": 2 2 2 NA 2 NA NA 1 1 1 ...
## $ PhysActive : Factor w/ 2 levels "No","Yes": 1 1 1 NA 1 NA NA 2 2 2 ...
## $ PhysActiveDays : int NA NA NA NA NA NA NA 5 5 5 ...
## $ TVHrsDay : Factor w/ 7 levels "0_hrs","0_to_1_hr",..: NA NA NA NA NA NA NA NA NA NA ...
## $ CompHrsDay : Factor w/ 7 levels "0_hrs","0_to_1_hr",..: NA NA NA NA NA NA NA NA NA NA ...
## $ TVHrsDayChild : int NA NA NA 4 NA 5 1 NA NA NA ...
## $ CompHrsDayChild : int NA NA NA 1 NA 0 6 NA NA NA ...
## $ Alcohol12PlusYr : Factor w/ 2 levels "No","Yes": 2 2 2 NA 2 NA NA 2 2 2 ...
## $ AlcoholDay : int NA NA NA NA 2 NA NA 3 3 3 ...
## $ AlcoholYear : int 0 0 0 NA 20 NA NA 52 52 52 ...
## $ SmokeNow : Factor w/ 2 levels "No","Yes": 1 1 1 NA 2 NA NA NA NA NA ...
## $ Smoke100 : Factor w/ 2 levels "No","Yes": 2 2 2 NA 2 NA NA 1 1 1 ...
## $ Smoke100n : Factor w/ 2 levels "Non-Smoker","Smoker": 2 2 2 NA 2 NA NA 1 1 1 ...
## $ SmokeAge : int 18 18 18 NA 38 NA NA NA NA NA ...
## $ Marijuana : Factor w/ 2 levels "No","Yes": 2 2 2 NA 2 NA NA 2 2 2 ...
## $ AgeFirstMarij : int 17 17 17 NA 18 NA NA 13 13 13 ...
## $ RegularMarij : Factor w/ 2 levels "No","Yes": 1 1 1 NA 1 NA NA 1 1 1 ...
## $ AgeRegMarij : int NA NA NA NA NA NA NA NA NA NA ...
## $ HardDrugs : Factor w/ 2 levels "No","Yes": 2 2 2 NA 2 NA NA 1 1 1 ...
## $ SexEver : Factor w/ 2 levels "No","Yes": 2 2 2 NA 2 NA NA 2 2 2 ...
## $ SexAge : int 16 16 16 NA 12 NA NA 13 13 13 ...
## $ SexNumPartnLife : int 8 8 8 NA 10 NA NA 20 20 20 ...
## $ SexNumPartYear : int 1 1 1 NA 1 NA NA 0 0 0 ...
## $ SameSex : Factor w/ 2 levels "No","Yes": 1 1 1 NA 2 NA NA 2 2 2 ...
## $ SexOrientation : Factor w/ 3 levels "Bisexual","Heterosexual",..: 2 2 2 NA 2 NA NA 1 1 1 ...
## $ PregnantNow : Factor w/ 3 levels "Yes","No","Unknown": NA NA NA NA NA NA NA NA NA NA ...
plot(NHANES$Weight, NHANES$Height)
ggplot2 graphic
library(ggplot2)
qplot(Weight, Height, data=NHANES)
lattice graphic
library(lattice)
xyplot(Height ~ Weight, data=NHANES)
mosaic package can translate between them
library(mosaic)
mplot(NHANES)
mplot() opens a graphical user interface for building graphics and outputs the corresponding ggplot2 or lattice code.
Try it out! Practice saving a graphic from RStudio.
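Besides the Export menu in the RStudio Plots pane, a graphic can also be saved from code. A minimal sketch using ggsave() from ggplot2; the object name p, the file name, and the dimensions are illustrative choices, not part of the original slides:

```r
library(ggplot2)
library(NHANES)
data(NHANES)

# Build the scatterplot and keep it in an object (p is an illustrative name)
p <- ggplot(NHANES, aes(x = Weight, y = Height)) +
  geom_point()

# Hypothetical output location; replace with a path in your project
out_file <- file.path(tempdir(), "weight_height.png")

# width and height are in inches by default
ggsave(out_file, plot = p, width = 6, height = 4, dpi = 150)
```

Saving from code makes the figure reproducible, which helps when you need to post both the graphic and the code that made it.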
ggplot2
ggplot2 syntax
## defaults to scatterplot
qplot(Weight, Height, data=NHANES)
ggplot2 syntax
## defaults to scatterplot
## same as default
qplot(Weight, Height, data=NHANES, geom="point")
ggplot2 syntax
qplot(Weight, Height, data=NHANES, geom="smooth")
ggplot2 syntax
qplot(Weight, Height, data=NHANES, geom=c("smooth", "point"))
ggplot2 syntax
qplot(Weight, Height, data=NHANES, geom=c("smooth", "point"), alpha=I(.01))
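Recent ggplot2 releases (3.4.0 and later) mark qplot() as deprecated, so it may help to see the last call written with ggplot() and explicit layers. This is a sketch of an approximately equivalent plot, not the slides' own code. qplot() applies alpha to every layer, so it is set on both geoms here:

```r
library(ggplot2)
library(NHANES)
data(NHANES)

# Layered version of:
# qplot(Weight, Height, data=NHANES, geom=c("smooth", "point"), alpha=I(.01))
p <- ggplot(NHANES, aes(x = Weight, y = Height)) +
  geom_smooth(alpha = 0.01) +  # smoothed trend; alpha fades the confidence band
  geom_point(alpha = 0.01)     # nearly transparent points reduce overplotting

print(p)
```

Each geom= entry in the qplot() call becomes one geom_*() layer added with +.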
ggplot2 building blocks
The grammar …
aes
From http://ggplot2.org/resources/2007-vanderbilt.pdf:
For more info check out the documentation: http://docs.ggplot2.org/current/
Aesthetics define a mapping between data and the display.
Each geom has a different set of aesthetics.
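To make the idea of a mapping concrete, the sketch below contrasts mapping an aesthetic to a variable (inside aes()) with setting it to a constant (outside aes()). The object names and the color "steelblue" are illustrative choices:

```r
library(ggplot2)
library(NHANES)
data(NHANES)

# Mapping: color depends on the Gender variable, so it goes inside aes()
p_mapped <- ggplot(NHANES, aes(x = Weight, y = Height, color = Gender)) +
  geom_point(alpha = 0.1)

# Setting: one fixed color for every point goes outside aes()
p_fixed <- ggplot(NHANES, aes(x = Weight, y = Height)) +
  geom_point(color = "steelblue", alpha = 0.1)
```

Only the mapped version produces a legend, because only there does the display encode data.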
What aesthetics might we need for geom_point?
geom_point understands the following aesthetics
What aesthetics might we need for geom_line?
geom_line understands the following aesthetics
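One way to check which aesthetics a geom understands is to ask its underlying ggproto object. This is a sketch that assumes a ggplot2 version of 2.0.0 or later, where GeomPoint and GeomLine are exported. The exact list varies between versions, so no output is shown:

```r
library(ggplot2)

# Aesthetics that must be supplied for the geom to draw anything
GeomPoint$required_aes
GeomLine$required_aes

# All aesthetics the geom understands (required, defaulted, and group)
point_aes <- GeomPoint$aesthetics()
line_aes  <- GeomLine$aesthetics()

# Aesthetics that lines understand but points do not
setdiff(line_aes, point_aes)
```

The help pages ?geom_point and ?geom_line list the same information in their Aesthetics sections.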
qplot(Weight, Height, data=NHANES, geom="smooth", se=FALSE, color=Race1, facets=.~Gender)
ggplot(NHANES) +
  geom_smooth(aes(x=Weight, y=Height, color=Race1), se=FALSE) +
  geom_point(aes(x=Weight, y=Height, color=Race1), alpha=I(.01)) +
  facet_grid(.~Gender)
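Because both layers use the same aesthetics, the mapping can be declared once in ggplot() and inherited by every layer. A sketch of that variant, intended to draw the same plot as the ggplot() call above:

```r
library(ggplot2)
library(NHANES)
data(NHANES)

# Shared aesthetics live in ggplot(); each geom inherits them
p <- ggplot(NHANES, aes(x = Weight, y = Height, color = Race1)) +
  geom_smooth(se = FALSE) +   # one smooth per Race1 level, no confidence band
  geom_point(alpha = 0.01) +  # constant transparency, set outside aes()
  facet_grid(. ~ Gender)      # one panel per Gender, side by side

print(p)
```

Per-layer aes() calls are still useful when layers need different mappings, for example coloring the points but drawing a single overall smooth.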
library(mosaic)
mplot(NHANES)
Work on creating a data visualization with a teammate using the NHANES dataset.
By 3:30pm, you should have posted a final version of your data graphic on Piazza, along with the code you used to create it. A few of them will be chosen and discussed by the whole class.