---
title: "Duncan's Occupational Prestige Regression"
author: "John Fox"
date: "`r Sys.Date() # generates today's date`"
output: html_document
---

```{r echo=FALSE}
# include this code chunk as-is to set options
knitr::opts_chunk$set(comment=NA, prompt=TRUE, fig.height=8, fig.width=8)
# use fig.height=6.5, fig.width=6.5 for Word output
```

Duncan's Occupational Prestige Regression
-----------------------------------------

* Reading and examining the data:

```{r}
# assumes the file is in the data subdirectory of the current directory
Duncan <- read.table("data/Duncan.txt", header=TRUE)
summary(Duncan)
library(car)
scatterplotMatrix(~ prestige + income + education, id=list(n=3), data=Duncan)
```

* Duncan's regression:

```{r}
duncan.mod <- lm(prestige ~ income + education, data=Duncan)
summary(duncan.mod)
```

* Some diagnostics for Duncan's regression:

```{r fig.width=6, fig.height=6}
qqPlot(duncan.mod, id=list(n=2), reps=1000)
influencePlot(duncan.mod, id=list(n=2))
```

```{r fig.height=4}
avPlots(duncan.mod, id=list(n=3, method="mahal"))
crPlots(duncan.mod, smooth=list(span=0.9))
```

The residual QQ plot is unremarkable, as are the component+residual plots, but the influence plot, and especially the added-variable plots, suggest that minister and railroad conductor might be an influential pair of cases, decreasing the magnitude of the income coefficient and increasing the magnitude of the education coefficient.

* Refitting the model with ministers and conductors removed:

```{r}
duncan.mod.2 <- update(duncan.mod, subset=-whichNames(c("minister", "conductor"), Duncan))
summary(duncan.mod.2)
compareCoefs(duncan.mod, duncan.mod.2)
```

The income coefficient is now more than twice the education coefficient.
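
* A quick check of the income-to-education ratio, as a minimal sketch. It assumes that the `Duncan` data frame shipped with the **carData** package matches `data/Duncan.txt`, which is an assumption made only to keep the chunk self-contained. The names `dat`, `suspect`, `fit.all`, `fit.sub`, and `ratio` are introduced here for illustration.

```{r}
# Assumption: carData::Duncan matches the data read from data/Duncan.txt
dat <- carData::Duncan

# Flag the two suspect cases by row name
suspect <- rownames(dat) %in% c("minister", "conductor")

# Fit the same regression with all cases and with the two cases removed
fit.all <- lm(prestige ~ income + education, data=dat)
fit.sub <- lm(prestige ~ income + education, data=dat[!suspect, ])

# Ratio of the income slope to the education slope in a fitted model
ratio <- function(m) unname(coef(m)["income"] / coef(m)["education"])
c(all.cases=ratio(fit.all), two.removed=ratio(fit.sub))
```

A ratio above 2 in the second entry corresponds to the statement that the income coefficient is more than twice the education coefficient once the two cases are dropped.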