Selected Annotated Bibliography of Sources on Nonparametric Regression

John Fox


In the 1990s, nonparametric regression analysis moved from the journal literature in statistics into textbooks and monographs. I particularly recommend the following sources:

R. A. Becker, J. M. Chambers, and A. R. Wilks. The New S Language: A Programming Environment for Data Analysis and Graphics. Pacific Grove, CA: Wadsworth, 1988.

J. M. Chambers and T. J. Hastie, eds. Statistical Models in S. Pacific Grove, CA: Wadsworth, 1992. The statisticians at Bell Labs have virtually defined the field of modern statistical graphics. The statistical programming language S, a product of Bell Labs, is rapidly becoming the standard for new statistical applications such as nonparametric regression. The latter volume includes excellent introductions to three aspects of nonparametric regression, which are of value independently of any interest in S: a chapter on additive regression models (generalized additive models) by Hastie; another on local polynomial regression (lowess or loess) models by Cleveland, Grosse, and Shyu; and a third on regression and classification trees by Clark and Pregibon.
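
For readers who would like to see what local polynomial regression looks like in practice, here is a minimal sketch in S/R using simulated data (the data, span, and degree are arbitrary choices of mine, not an example from the book):

    set.seed(123)                                    # simulated data for illustration
    x <- runif(100, 0, 10)
    y <- sin(x) + rnorm(100, sd = 0.3)
    fit <- loess(y ~ x, span = 0.75, degree = 2)     # local quadratic fit, 75% span
    plot(x, y)
    x0 <- seq(min(x), max(x), length = 200)
    lines(x0, predict(fit, data.frame(x = x0)))      # add the fitted curve to the plot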

A. W. Bowman and A. Azzalini. Applied Smoothing Techniques for Data Analysis: The Kernel Approach with S-Plus Illustrations. Oxford: Oxford University Press, 1997. An accessible treatment of nonparametric regression and related methods, with a useful library of S programs and worked-out examples.

W. S. Cleveland. Visualizing Data. Summit, NJ: Hobart Press, 1993. Similar in coverage to his earlier book, The Elements of Graphing Data, though not quite as extensive, this companion text focuses on a series of examples that illustrate the use of graphical methods in analyzing data, including some material on nonparametric regression and on the closely associated topic of 'coplots' (conditioning plots).
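
To illustrate the idea of a conditioning plot (again with simulated data of my own, not an example from the book), the coplot function in S/R graphs y against x within overlapping slices of a third variable z, here with a lowess smooth added to each panel. The function names shown are those in R; S-PLUS is similar.

    set.seed(1)                                      # simulated data
    z <- runif(200)
    x <- runif(200)
    y <- x * z + rnorm(200, sd = 0.1)
    coplot(y ~ x | z, panel = panel.smooth)          # scatterplots of y vs. x given z, with lowess smooths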

J. Fan and I. Gijbels. Local Polynomial Modelling and Its Applications. London: Chapman and Hall, 1996. A technical presentation of the theoretical underpinnings of local polynomial regression estimators (such as lowess/loess). Includes an extensive set of references to the journal literature.

P. J. Green and B. W. Silverman. Nonparametric Regression and Generalized Linear Models: A Roughness Penalty Approach. London: Chapman and Hall, 1994. Describes smoothing splines, the major alternative to local polynomial regression. A relatively difficult but very high-quality text.
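
As a quick sketch of what a smoothing spline looks like (again mine, not taken from the book), the smooth.spline function in S and R fits a cubic smoothing spline, choosing the smoothing parameter by a form of cross-validation:

    set.seed(123)                                    # simulated data
    x <- runif(100, 0, 10)
    y <- sin(x) + rnorm(100, sd = 0.3)
    fit <- smooth.spline(x, y)                       # smoothing parameter chosen by cross-validation
    plot(x, y)
    lines(fit)                                       # add the fitted spline to the plot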

T. J. Hastie and R. J. Tibshirani. Generalized Additive Models. London: Chapman and Hall, 1990. This is -- for the most part -- a very readable book. Generalized additive models include additive regression models, but extend additive nonparametric regression, through other 'link' functions, to analogues of logistic, probit, and Poisson regression. The book provides a fine general introduction to nonparametric regression.
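
To make the idea concrete, here is a small sketch (with simulated data of my own) of a generalized additive model with a logistic link. The gam function shown is Hastie and Tibshirani's S implementation, built into S-PLUS; in R, similar functions are supplied by add-on packages such as mgcv.

    # in R, attach an add-on package first, e.g. library(mgcv)
    set.seed(42)                                     # simulated binary-response data
    x1 <- runif(200)
    x2 <- runif(200)
    eta <- sin(2 * pi * x1) + x2 - 0.5               # true additive predictor
    y <- rbinom(200, 1, 1/(1 + exp(-eta)))           # Bernoulli responses on the logit scale
    fit <- gam(y ~ s(x1) + s(x2), family = binomial) # smooth terms for x1 and x2, logit link
    summary(fit)
    plot(fit)                                        # partial-effect plots for the smooth terms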

C. Loader. Local Regression and Likelihood. New York: Springer, 1999. This is a wide-ranging and reasonably accessible treatment of local polynomial estimation for a variety of statistical problems, including density estimation, regression models, generalized regression models, and survival models. Loader's book is associated with an excellent library of S functions.

B. W. Silverman. Density Estimation for Statistics and Data Analysis. London: Chapman and Hall, 1986. Kernel density estimation -- smoothing the distribution of a variable or variables -- is a relatively narrow topic in graphical data analysis, but it is valuable in its own right and provides a basis for methods of nonparametric regression. Silverman's short book is a paragon of clarity.
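
As a minimal illustration (mine, not Silverman's), the density function in S and R computes a kernel density estimate; the default settings use a rule-of-thumb bandwidth in the spirit of those Silverman discusses, though argument names differ somewhat between S-PLUS and R.

    x <- rnorm(500)                                  # simulated sample
    plot(density(x))                                 # kernel density estimate, default bandwidth
    rug(x)                                           # mark the observations along the horizontal axis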

J. S. Simonoff. Smoothing Methods in Statistics. New York: Springer, 1996. This book covers a variety of applications of smoothing, including -- but not limited to -- nonparametric density estimation and nonparametric regression. Simonoff develops a number of illustrative applications and provides good references to the journal literature and to computer programs. Some of the theoretical material is relatively difficult, but of the several texts devoted to general ideas in smoothing with which I am familiar, this one and Bowman and Azzalini's are the most accessible.

W. N. Venables and B. D. Ripley. Modern Applied Statistics with S-PLUS, Third Edition. New York: Springer, 1999. S-PLUS is the commercial version of the Bell Labs statistical programming language S. To my knowledge, S has more extensive capabilities for nonparametric regression than any other broadly available statistical programming environment or package. Venables and Ripley show you how to use S-PLUS for a wide variety of data analysis tasks, including various methods of nonparametric regression. A free implementation of S, called R, has most of the nonparametric-regression capabilities of S-PLUS (with the notable exception of generalized additive models); Venables and Ripley provide "on-line" complements to their book describing the differences between S-PLUS and R.
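
For reference, several of the smoothers mentioned above -- lowess, loess, smooth.spline, and density among them -- are available under these names in both S-PLUS and R. The following fragment (simulated data once more) compares a lowess fit with a kernel-regression fit from ksmooth:

    set.seed(7)                                      # simulated data
    x <- runif(150, 0, 10)
    y <- cos(x) + rnorm(150, sd = 0.25)
    plot(x, y)
    lines(lowess(x, y, f = 2/3))                     # lowess with its default span
    lines(ksmooth(x, y, kernel = "normal", bandwidth = 2), lty = 2)   # kernel-regression smoother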



Last modified: 29 May 2000 by John Fox <jfox@mcmaster.ca>.