Datasets Available Online

There is a vast amount of data available online.  Follow these links to national institutes, universities and US government departments for data that I have found useful.

Data from National Institutes

The National Institute of Standards and Technology provides a library of statistical reference datasets for checking the computational accuracy of statistical software that conducts descriptive, multiple regression, ANOVA and nonlinear regression analyses.
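
As a rough illustration of how a reference dataset can be used to check software accuracy, the sketch below fits a straight line and compares the estimates with known "certified" values. The data are hypothetical and lie exactly on y = 1 + 2x, so the correct answers are known by construction; they are not taken from the NIST library. The helper names (log_relative_error, fit_line) are my own, and the log relative error is assumed here as the accuracy measure (roughly, the number of correct significant digits).

```python
import math


def log_relative_error(computed, certified):
    """Approximate number of correct significant digits in `computed`."""
    if computed == certified:
        return float("inf")  # exact agreement
    if certified == 0:
        # Fall back to the absolute error when the certified value is zero.
        return -math.log10(abs(computed))
    return -math.log10(abs(computed - certified) / abs(certified))


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x, using centered sums."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data (an assumption for illustration, not a NIST dataset).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0 + 2.0 * x for x in xs]
certified = {"intercept": 1.0, "slope": 2.0}

intercept, slope = fit_line(xs, ys)
lre = {
    "intercept": log_relative_error(intercept, certified["intercept"]),
    "slope": log_relative_error(slope, certified["slope"]),
}

for name, digits in lre.items():
    print(f"{name}: about {digits} correct significant digits")
```

With a real reference dataset, the xs, ys and certified values would be replaced by the published data and its certified results, and a low score would flag a numerical problem in the software being tested.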


University-Related Data Resources

There are innumerable university, departmental and faculty sites, in the US and internationally, that provide data, including:

Manchester Metropolitan University provides examples of behavioral, biological, medical and weather data suitable for principal components analysis, cluster analysis, multiple regression analysis, discriminant analysis, etc., as ASCII, Excel and SPSS system files.

German Rodriguez of Princeton University provides about 20 well-documented datasets (largely frequency data) on issues like births, deaths, salaries of professors, time-to-doctorate, contraceptive use, ship damage, etc., for his WWS509 course on generalized linear models.

UCLA Academic Technology Services provides "textbook" examples for over 30 applied statistics books, including many of the standards (ours too!).  Each example contains the datasets and the program code (HLM, MLwiN, S+, SAS, SPSS, Stata) for generating the book's exhibits.


Government Sources of Data

Many government departments, in the USA and elsewhere, provide access to enormous amounts of important data, both aggregated and disaggregated:

US Census Bureau ...

The National Center for Education Statistics provides data from the major educational surveys in the USA (and overseas), including "standards" like ECLS, HS&B, NLS72, NELS88/2000, SASS, etc.  All datasets are free.  They are distributed by mail on CD-ROM, and some by online download.


