Harvard Graduate School of Education

Courses in Quantitative Methods at HGSE and Harvard

There are a great many courses in quantitative methods available at the Harvard Graduate School of Education and Harvard University, at different levels of technicality and practicality.

They are listed below by academic department, each with a course number, a brief description of the content, and, where available, the name of the professor(s) responsible for the course.

Within each department-specific table, there is a link that will return you to the head of this page.

Harvard Graduate School of Education

(Return to top of page)

Introductory-Level Courses

Answering Questions with Quantitative Data
Introduction to the concepts, principles, and vocabulary of quantitative methods. Prerequisite: none (restricted to entering doctoral students in APSP).
Introduction to Educational Research
Introduction to the rationale and procedures of educational research. Prerequisite: none, for master's and first-year doctoral students.
Principles of Educational Assessment

Introduction to the fundamental principles of educational measurement designed for students who will need to evaluate test-based information in their later work. The course has three main goals. First, it will provide a context for understanding assessment results students will encounter outside the course. Second, the course will provide a basic understanding of essential concepts in measurement, such as reliability, validity, and bias. Third, the course will apply these principles to a variety of current issues in education policy.

Empirical Methods: Introduction to Statistics for Research
Introduces basic principles of elementary statistics, for students who want to continue course work in statistical methods. Prerequisite: none, for master's and first-year doctoral students.

Intermediate-Level Courses

Intermediate Statistics

Students will learn how to use correlation analysis, regression analysis, analysis of variance and covariance to address educational, psychological, and social questions. Using real data as a catalyst, we will discuss how to (1) formulate research questions; (2) select appropriate statistical techniques; (3) conduct necessary calculations; (4) examine assumptions; (5) interpret results; (6) identify rival explanations; and (7) summarize findings in a convincing argument. Computer-based statistical analyses are an integral part of the course. Prerequisite: A-010Y or H-012.


Advanced-Level Courses

Applied Data Analysis
Extends data-analytic skills beyond basic regression analysis and ANOVA. Topics include: extensive use of transformations, influence statistics, building taxonomies of regression models, general linear hypothesis testing, intro to multilevel modeling, nonlinear regression analysis, binomial and multinomial logistic regression analysis, ordinal logit analysis, principal components analysis, cluster analysis, exploratory factor analysis, intro to discrete-time survival analysis, etc. Applied course offering conceptual explanations of statistical techniques along with practice in real data. Prerequisite: S030.
Methods of Educational Measurement

Survey course on educational measurement for students with prior statistical training who will become critical consumers of test-based information or who will apply the methods in their own research. Topics include: traditional psychometric methods (classical test theory), generalizability theory, and item response theory (IRT). Course addresses current policy issues in education and requires the application of psychometric methods to real data. Prerequisite: S052.

Applied Longitudinal Data Analysis

Course covers the practical application of two analytic strategies for analyzing longitudinal data: discrete-time survival analysis and individual growth modeling. Class lectures will be devoted to introducing basic concepts underlying the models, describing computer programs for conducting analyses, and interpreting the results. Prerequisite: S052.

Answering Complex Questions with Multivariate Methods
Methods of covariance structure analysis, including path analysis, structural equation modeling, confirmatory factor analysis, and latent growth modeling, using LISREL. Prerequisite: S052.
Quantitative Methods for Improving Causal Inference in Educational Research
Seminar in techniques of research design and data analysis for strengthening causal inferences in quantitative research, including: randomized experiments, instrumental variables estimation, regression discontinuity designs, correction for selection bias, etc. Prerequisite: S052.
Murnane & Willett


Faculty of Arts & Sciences: Department of Economics

(Return to top of page)
ECON 1126
Quantitative Methods in Economics
Statistical decision theory and related experimental evidence; game theory and related experimental evidence; maximum likelihood; logit, normal, probit, and ordered probit regression models; panel data models with random effects; omitted variable bias and random assignment; incidental parameters and conditional likelihood; demand and supply.

ECON 2110

Introductory Probability and Statistics for Economists
Introduction to probability and statistics. Emphasis on general methods applicable to both econometrics and economic theory. Topics include probability spaces, random variables, limit laws, estimation, hypothesis testing, and Bayesian methods.
ECON 2120
Introduction to Applied Econometrics
Introduction to applied econometrics, including linear regression, instrumental variables, panel data techniques, generalized method of moments, and maximum likelihood. Includes discussion of papers in applied econometrics and computer exercises.
ECON 2130
Applied Econometrics
Advanced methods in applied econometrics, including nonlinear regression, discrete and limited dependent variables, models of selection, and stationary and non-stationary time series. Includes detailed discussion of empirical applications.
ECON 2140
Econometric Methods
Statistical decision theory with applications to portfolio choice, panel data topics, selection bias, demand and supply, qualitative choice, and quantile regression.
ECON 2141
Analysis of Cross Section and Panel Data
Topics include censoring, sample selection, attrition, stratified sampling, estimation of average treatment effects, and duration analysis.
ECON 2142
Time Series Analysis
Survey of modern time series econometrics. Topics include univariate models, vector auto-regressions, linear and nonlinear filtering, frequency domain methods, unit roots, structural breaks, empirical process theory asymptotics, forecasting, and applications to macroeconomics and finance.


Faculty of Arts & Sciences: Department of Government

(Return to top of page)
GOV 1000
Quantitative Methods for Political Science I
Introduces key ideas that underlie statistical and quantitative reasoning, including probability spaces, random variables, distributions, descriptive and summary statistics, sampling, hypothesis testing, and estimation.
GOV 1001
Introduction to Quantitative Methods in Political Science
Designed for undergraduates who wish to use quantitative research methods in their own work. Topics include research design, causal inference, descriptive and summary statistics, probability, sampling, and statistical inference including estimation and tests of hypotheses. Course emphasizes multiple regression, with applications that focus on substantive research questions such as "How do citizens evaluate elected officials?" or "Is it really the economy, stupid?"
GOV 2001
Advanced Quantitative Research Methodology
Introduces theories of inference underlying most statistical methods and how new approaches are developed. Examples include discrete choice, event counts, durations, missing data, ecological inference, time-series cross sectional analysis, compositional data, causal inference, and others.
GOV 2002
Topics in Quantitative Methods
Focuses on the robust estimation of generalized linear models but also covers some time series cross-section methods.
GOV 2003
Hierarchical Bayesian Modeling
Provides a solid understanding of Bayesian inference and Markov chain Monte Carlo methods. Topics covered include: Bayesian treatment of the linear model, Markov chain Monte Carlo methods, assessing model adequacy, and hierarchical models.


Faculty of Arts & Sciences: Department of Psychology

(Return to top of page)
PSY 1901
Methods of Behavioral Research
Theoretical and practical introduction to planning, conducting, reporting, and evaluating research in the social and behavioral sciences. Topics include experimental design, reliability and validity, experimental artifacts, and analysis of published research.
PSY 1951
Intermediate Quantitative Methods
Emphasizes analysis of variance designs and contrasts for applied behavioral research. Additional topics include reliability, validity, correlation, effect size, and meta-analysis.
PSY 1952
Multivariate Analysis in Psychology
Emphasizes multiple regression analysis and factor analysis. Additional topics include multivariate analysis of variance, analysis of covariance, discriminant analysis, and logistic regression.
PSY 2100
Research Methodology
Covers all major steps in conducting an empirical research project, with emphasis on studies that involve human participants. Topics include finding and formulating research problems; research design strategies; developing and validating concepts; designing and assessing empirical measures and manipulations; issues in data collection, analysis, and interpretation; and writing and publishing research reports.
PSY 3800
Psychometric Theory
Basic psychometric theory and methods essential for reliable and valid measurement. Reliability, validity, and generalizability reviewed. Detailed survey of techniques used to create and evaluate a scale.

Faculty of Arts & Sciences: Department of Sociology

(Return to top of page)
SOC 203a
Methods of Quantitative Sociological Research I
Matrix approach to regression analysis with an emphasis on the assumptions behind OLS. Instrumental variables, generalized least squares, probit and logit models, survival analysis, hierarchical linear models, and systems of equations are studied.
SOC 203b
Methods of Quantitative Sociological Research II
Treats longitudinal design and methods for the statistical analysis of longitudinal data with an emphasis on the analysis of change in discrete variables, or event history analysis. Includes an introduction to time series analysis. Both statistical theory and practical applications covered.


Faculty of Arts & Sciences: Department of Statistics

(Return to top of page)
STAT 100
Introduction to Quantitative Methods
Introduces key ideas underlying statistical and quantitative reasoning, including fundamentals of probability. Topics may include elements of sample surveys, experimental design and observational studies, descriptive and summary statistics for both measured and counted variables, and statistical inference including estimation and tests of hypotheses as applied to one- and two-sample problems, regression with one or more predictors, correlation, and analysis of variance. Emphasizes simple and multiple regression and applications in non-experimental fields including, but not limited to, economics.
Harrington & Taback
STAT 101
Introduction to Quantitative Methods
Same topics as STAT100. Emphasizes the analysis of variance, applied in experimental fields such as psychology and other behavioral sciences.
STAT 104
Introduction to Quantitative Methods
Same topics as STAT100 and STAT101 combined, at a slightly higher level. Applications will be drawn from fields such as economics, behavioral and health sciences, policy analysis, and law.
STAT 110
Introduction to Probability
Comprehensive introduction to probability. Basics: sample space, conditional probability, Bayes Theorem. Univariate distributions: mass functions and density, expectation and variance, binomial, Poisson, normal, and gamma distributions. Multivariate distributions: joint and conditional distribution, independence, transformation, multivariate normal and related distributions. Limit laws: probability inequalities, law of large numbers, central limit theorem. Markov chains: transition probability, stationary distribution and convergence.
STAT 111
Introduction to Theoretical Statistics
Basic concepts of statistical inference from frequentist and Bayesian perspectives. Topics include maximum likelihood methods, confidence and Bayesian interval estimation, hypothesis testing, least squares methods, and analysis of variance.
STAT 139
Statistical Sleuthing Through Linear Models
Formerly "Regression Analysis", now a serious introduction to statistical inference when linear models and related methods are used. Topics include the pros and cons of t-tools and their alternatives, multiple-group comparisons, linear regressions, model checking and refinement. The emphasis is on statistical thinking and tools for real-life problems, including current events whenever relevant.
STAT 140
Design of Experiments
Statistical designs for the estimation of the effects of treatments in randomized experiments. Topics include brief review of some basic structural inference procedures, analysis of variance, randomized block and Latin square designs, balanced incomplete block designs, factorial designs, nested factorial designs, confounding in blocks, and fractional replications.
STAT 149
Generalized Linear Models
An introduction to methods for analyzing categorical data. Emphasis will be on understanding models and applying them to datasets. Topics include visualizing categorical data, analysis of contingency tables, odds ratios, log-linear models, generalized linear models, logistic regression, Poisson regression and model diagnostics. Examples drawn from many fields, including biology, medicine and the social sciences.
STAT 160
Survey Methods
Methods for design and analysis of sample surveys. Covers the toolkit of sample design features and their use in optimal sample design strategies, estimation methods (including sampling weights), and variance estimation methods (including resampling methods). Brief overview of nonstatistical aspects of survey methodology such as questionnaire design and validation. Additional topics include variance estimation for complex surveys and estimators, nonresponse, missing data, hierarchical models for survey data, and small-area estimation.


John F. Kennedy School of Government

(Return to top of page)
API 201
Quantitative Analysis and Empirical Methods
Introduces students to concepts and techniques for the quantitative analysis of policy issues. Combines material typically found in an introductory course on probability and statistics with selected topics in decision analysis, and illustrates the ways in which theory can be applied to policy questions. A secondary goal of the course is to familiarize students with the use of spreadsheet programs for analyzing quantitative data. Topics include: descriptive statistics, basic probability, conditional probability, Bayes' Theorem, expected utility theory, risk aversion, decision-making under uncertainty, insurance markets (including moral hazard and adverse selection), probability distributions, statistical inference, hypothesis testing.
Avery, DeLeire, Piehl & Jacob
API 202
Empirical Methods II
Continuation of API201. Equips students with an understanding of the most common tools of empirical analysis in policy applications using hands-on analysis of data sets. The first part of the course covers regression analysis, including multiple regression, dummy variables, and binary dependent variables. The second part of the course covers program evaluation, including selection effects; the advantages and disadvantages of experimental, quasi-experimental, and observational data; and instrumental variable techniques. The final part of the course is an integrative empirical exercise.
Wise, Cooper, DeLeire & Abadie
API 208
Program Evaluation: Estimating Program Effectiveness with Empirical Analysis
This methodological course develops skills in quantitative program evaluation. Students will study a variety of evaluation designs (from random assignment to quasi-experimental evaluation methods) and analyze data from actual evaluations. The course evaluates the strengths and weaknesses of alternative evaluation methods.
API 209
Advanced Quantitative Methods I: Constrained Optimization and Mathematical Statistics
Introduction to tools of quantitative reasoning and analytic approaches used to address policy problems. The course introduces modeling, optimization theory, probability theory, statistical estimation, hypothesis testing, and experimental design. Students learn the theoretical foundations, basic derivations, and complete illustrative applications.
API 210
Advanced Quantitative Methods II: Econometric Methods
Continuation of Advanced Quantitative Methods I, this course focuses on developing the theoretical basis and practical application of the most common tools of empirical analysis. Foundations of analysis will be coupled with hands-on examples and assignments involving analysis of data sets. The first part of the course covers the linear model in detail. The second part treats extensions to the linear model, as well as model specification and testing.
API 212
Advanced Empirical Analysis for Public Choice
Applies probability models and statistical techniques to questions of public concern. Topics include: analysis of individual "discrete" choices, like college attendance, employment status, high school dropout. Social experimentation and the analysis of experimental data versus observations collected by more traditional surveys are considered. Empirical studies are used to demonstrate methods of analysis.
API 213
Research Methods: Primary Data Collection
Course familiarizes students with different primary data collection and analysis strategies and equips them to develop and conduct surveys. Course covers strategies for collecting and analyzing survey data, including briefly addressing qualitative data collection techniques, such as focus groups and in-depth interviewing. Topics covered include study design, survey development, sample design, and data collection protocols. Also briefly covers analytic techniques specific to such data, such as psychometric analytic techniques.


Harvard School of Public Health

(Return to top of page)

BIO 111

Introduction to Programming in SAS
Provides an overview of the use of SAS to prepare data for statistical analysis. The focus is on database management and programming problems.
Fenton & Pagano
BIO 113
Introduction to Data Management and Programming in SAS
Provides intensive instruction in the use of SAS to prepare data for statistical analysis. The focus is on database management and programming problems.
Allred & Pagano
BIO 200
Principles of Biostatistics
Lectures and laboratory exercises acquaint the student with the basic concepts of biostatistics and their applications and interpretation. The computer is used throughout the course. Topics include descriptive statistics, graphics, diagnostic tests, probability distributions, inference, tests of significance, association, linear and logistic regression, life tables, and survival analysis.
BIO 201
Introduction to Statistical Methods
Covers basic statistical techniques important for analyzing data arising from epidemiology, environmental health, biomedical and other public health-related research. Major topics include descriptive statistics, elements of probability, introduction to estimation and hypothesis testing, nonparametric methods, techniques for categorical data, regression analysis, analysis of variance, and elements of study design. Designed as an alternate to BIO200, for students desiring more emphasis on theoretical developments. Background in algebra and calculus strongly recommended.
BIO 202
Principles of Biostatistics I
First part of introductory biostatistics on the basic concepts and methods of biostatistics, their applications, and their interpretation. The material covered includes: data presentation, numerical summary measures, rates and standardization, and life tables. Probability is introduced to quantify uncertainty, especially as it pertains to diagnostic and screening methods. Also covered are sampling distributions, confidence intervals, and hypothesis testing. The computer is used throughout, with the software package STATA.
BIO 203
Principles of Biostatistics II
Second part of introductory biostatistics; it continues to explore inference in greater depth. Lectures and laboratory exercises emphasize applied data analysis, building upon the fundamentals in BIO202. Topics covered include the comparison of two means, analysis of variance, non-parametric methods, inference on proportions, contingency tables, multiple 2 X 2 tables, correlation, simple regression, multiple regression and logistic regression, analysis of survival data, and sampling theory. The computer is used throughout the course, with STATA.
Park & Lagakos
BIO 205
Statistical Methods for Health and Social Policy
Introduces students to probability and statistics, illustrating their application in the areas of health policy and management and the behavioral sciences. Understanding of basic statistical concepts will be emphasized through problem solving and examples. Topics include: descriptive statistics, diagnostic testing, probability distributions, sampling methods, hypothesis testing, confidence intervals, sample size determination, parametric and non-parametric methods, categorical data and simple linear and logistic regression.
Lagakos & Lindsey
BIO 206
Introductory Statistics for Medical Research
Introduces basic biostatistical techniques with an emphasis on applications to clinical research. Topics include probability and statistics, hypothesis testing, confidence intervals, non-parametrics, and power calculations.
BIO 207
Statistics for Medical Research II
Presents additional biostatistical techniques that commonly appear in the analysis of clinical databases and trials. Topics include contingency table analyses, log-rank tests, paired and matched analyses, analysis of variance and multiple comparisons procedures.
Reed & Orav
BIO 208
Statistics for Medical Research, Advanced
Presents additional biostatistical techniques that commonly appear in the analysis of clinical databases and trials. This course will move at a faster pace than the alternative BIO207 while covering all of the same topics (contingency tables, log-rank tests, paired and matched analyses, analysis of variance and multiple comparisons procedures). In addition, linear and logistic regression will be introduced.
BIO 209
Statistics for Medical Research, Translational
Presents additional biostatistical techniques that are most relevant to researchers involved with designed experiments. Topics include contingency tables, paired analyses, simple analysis of variance, multiple comparisons procedures, two-way analysis of variance, and simple repeated measures analysis of variance.
BIO 210
Analysis of Rates and Proportions
Emphasizes concepts and methods for analysis of data which are categorical, rate-of-occurrence (e.g., incidence rate), and time-to-event (survival duration). Stresses applications in epidemiology, clinical trials, and other public health research. Topics include measures of association, 2x2 tables, stratification, matched pairs, logistic regression, model building, analysis of rates, and survival data analysis using proportional hazards models.

BIO 211
Regression and Analysis of Variance in Experimental Research
Covers analysis of variance and regression, including details of data-analytic techniques and implications for study design. Also included are probability models and computing. Students learn to formulate a scientific question in terms of a statistical model, leading to objective and quantitative answers.
BIO 212
Survey Research Methods In Community Health
Covers research design, sample selection, questionnaire construction, interviewing techniques, the reduction and interpretation of data, and related facets of population survey investigations. Focuses primarily on the application of survey methods to problems of health program planning and evaluation. Treatment of methodology is sufficiently broad to be suitable for students who are concerned with epidemiological, nutritional, or other types of survey research.
Mangione & Lagakos

BIO 213
Applied Regression for Clinical Research
This course will introduce students involved with clinical research to the practical application of multiple regression analysis. Linear regression, logistic regression, and proportional hazards survival models will be covered, as well as general concepts in model selection, goodness-of-fit, and testing procedures. Each lecture will be accompanied by data analysis using SAS. The course will introduce, but will not develop, the underlying likelihood theory.
BIO 214
Principles of Clinical Trials
Designed for individuals interested in the scientific, policy, and management aspects of clinical trials. Topics include types of clinical research, study design, treatment allocation, randomization and stratification, quality control, sample size requirements, patient consent, and interpretation of results. Students design a clinical investigation in their own field of interest, write a proposal for it, and critique recently published medical literature.
Ware & Antman
Stanley & Gelber
BIO 222
Basics of Statistical Inference
This course will provide an introduction to the probability theory and mathematical statistics that underlie commonly used techniques in public health research. Topics to be covered include probability distributions (normal, binomial, Poisson), means, variances and expected values, finite sampling distributions, parameter estimation (method of moments, maximum likelihood), confidence intervals, hypothesis testing (likelihood ratio, Wald and score tests). All theoretical material will be motivated with problems from epidemiology, biostatistics, environmental health and other public health areas. This course is aimed towards second year doctoral students in fields other than Biostatistics. Background in algebra and calculus required.

BIO 223
Applied Survival Analysis and Discrete Data Analysis
This course will cover topics in both discrete data analysis and applied survival analysis. The course will begin with a review of sampling plans and contingency tables for discrete data. Further topics in discrete data analysis will include logistic regression, exact inference, and conditional logistic regression. This short survey of discrete data topics will provide a natural transition to analysis of survival data. Survival topics include: hazard, survivor, and cumulative hazard functions, Kaplan-Meier and actuarial estimation of the survival distribution, comparison of survival using log rank and other tests, regression models including the Cox proportional hazards model and accelerated failure time model, adjustment for time-varying covariates, and use of parametric distributions (exponential, Weibull) in survival analysis. Class material will include presentation of statistical methods for estimation and testing, along with current software (SAS, Stata, S-PLUS) for implementing analyses of discrete data and survival data.
BIO 224
Survival Methods in Clinical Research
This course will cover the common approaches to the display and analysis of survival data, including Kaplan-Meier curves, log rank tests, and Cox proportional hazards regression. Computing, using SAS, will be an integral component of the course.

Davis & Jiang

BIO 226
Applied Longitudinal Analysis
This course covers modern methods for the analysis of repeated measures, correlated outcomes and longitudinal data, including the unbalanced and incomplete data sets characteristic of biomedical research. Topics include an introduction to the analysis of correlated data, repeated measures ANOVA, random effects and growth curve models, and generalized linear models for correlated data, including generalized estimating equations (GEE).
BIO 230
Probability Theory and Applications I
A first course in probability. Topics include axiomatic foundations, frequency and personal concepts of probability, combinatorics, discrete and continuous sample spaces, independence and conditional probability, random variables, expectation operator, moments, generating functions and characteristic functions, standard distributions, transformations, sampling distributions related to the normal distribution, convergence concepts, weak and strong laws of large numbers, the central limit theorem, and elements of stochastic processes. Background in multivariable calculus required.

BIO 231
Statistical Inference I
A fundamental course in statistical inference. Discusses general principles of data reduction: exponential families, sufficiency, ancillarity, and completeness. Describes general methods of point and interval parameter estimation and the small and large sample properties of estimators: method of moments, maximum likelihood, unbiased estimation, Rao-Blackwell and Lehmann-Scheffé theorems, information inequality, asymptotic relative efficiency of estimators. Describes general methods of hypothesis testing and optimality properties of tests: Neyman-Pearson theory, likelihood ratio tests, score and Wald tests, uniformly and locally most powerful tests, asymptotic relative efficiency of tests.
BIO 232
Methods I
Introductory methods course aimed at first year Biostatistics students. The course deals mainly with the linear regression family of models including simple linear regression, analysis of variance and one- and two-sample t-tests. Robust alternatives such as the Wilcoxon signed rank test will be treated. Exploratory data analysis, model formulation and fitting, diagnosis, interpretation and graphical display of results will be emphasized. Some of the underlying theory will also be given. Several public health problems and data sets will be used to illustrate the methods. The use of software packages S-PLUS, Stata and SAS will be described. Background in calculus, linear algebra and introductory statistics is required.
BIO 233
Methods II
This course focuses on the analysis of categorical data and count data, and provides an introduction to methods for analysis of survival data. Topics include a review of sampling plans, analysis of contingency tables, large sample and exact methods for constructing confidence intervals and hypothesis tests, measures of association, logistic regression, and log-linear analysis. Survival topics will include estimation of survival distributions, comparison of groups, and regression models such as the Cox proportional hazards model and the accelerated failure time models.
BIO 234
Research Synthesis & Meta-Analysis in Public Health
Concerned with the use of existing data to inform clinical decision making and health care policy, the course focuses on research synthesis (meta-analysis). The principles of meta-analytic statistical methods are reviewed and the application of these to data sets is explored. Application of methods includes considerations for clinical trials and observational studies. The use of meta-analysis to explore data and identify sources of variation among studies is emphasized, as is the use of meta-analysis to identify future research questions.
BIO 235
Regression and Analysis of Variance
Advanced course in data analysis for linear models - regression and analysis of variance. Estimation methods (maximum likelihood and least squares) and issues of inference (confidence intervals, hypothesis testing, analysis of residuals) are presented from a theoretical and data analysis perspective. Background in matrix algebra and linear regression required.
BIO 240
Sample Surveys
Methods for design and analysis of sample surveys. A brief introduction to questionnaire design and evaluation will be followed by a discussion of sample design techniques. The course then covers estimation methods, including calculation and use of sampling weights, and variance estimation methods.
Zaslavsky Lagakos
BIO 243
Nonparametric Methods
Presents the theory and application of nonparametric methods. Topics include permutation tests, permutation limit theorems, 2-sample rank tests and their asymptotic efficiency, k-sample rank tests, 1-sample tests of location, paired comparisons, rank tests for symmetry and independence, and analogues of linear modeling based on ranks.
BIO 244
Analysis of Failure Time Data
Discusses the theoretical basis of concepts and methodologies associated with survival data and censoring, nonparametric tests, and competing risk models. Much of the theory is developed using counting processes and martingale methods.

BIO 245
Analysis of Multivariate and Longitudinal Data
Presents classical and modern approaches to the analysis of multivariate observations, repeated measures, and longitudinal data. Topics include the multivariate normal distribution, Hotelling's T², MANOVA, the multivariate linear model, random effects and growth curve models, generalized estimating equations, statistical analysis of multivariate categorical outcomes, and estimation with missing data. Discusses computational issues for both traditional and new methodologies.
BIO 247
Design of Scientific Investigations
Discusses those aspects of statistical theory and practice relevant to the design of scientific investigations in the health sciences. Topics include sample size considerations, basic principles of experimental design (randomization, replication, and balance), block designs, factorial experiments, response surface modeling, clinical trials, adaptive designs, cohort studies, early detection trials, and double sampling techniques.

BIO 248
Advanced Statistical Computing
Course in computing algorithms useful in statistical research and advanced statistical applications. Topics include computer arithmetic, matrix algebra, numerical optimization methods with application to maximum likelihood estimation and GEEs, spline smoothing and penalized likelihood, numerical integration, random number generation and simulation methods, Gibbs sampling, bootstrap methods, missing data problems and EM, imputation, data augmentation algorithms, and Fourier transforms. Students should be proficient with C or Fortran programming.
BIO 249
Bayesian Methods in Biostatistics
This course examines basic aspects of the Bayesian paradigm including Bayes' theorem, decision theory, general principles (likelihood, exchangeability, de Finetti's theorem), prior distributions (conjugate, non-conjugate, reference), single-parameter models (binomial, Poisson, normal), multi-parameter models (normal, multinomial, linear regression, general linear model, hierarchical regression), inference (exact, normal approximations, non-normal approximations, non-normal iterative approximations), computation (Monte Carlo, convergence diagnostics), model diagnostics (Bayes factors, predictive ordinates), design, and empirical Bayes methods.
BIO 250
Probability Theory and Applications II
Sequel to BIO 230, covering a variety of advanced topics in probability theory. Topics include a brief overview of measure-theoretic integration, convergence of sequences of random variables and stochastic processes, limit theorems, projections, and conditional expectation.
BIO 251
Statistical Inference II
Sequel to BIO 231. Presents advanced topics in statistical inference, including limit theorems, multivariate delta method, properties of maximum likelihood estimators, saddlepoint approximations, asymptotic relative efficiency, robust and rank-based procedures, resampling methods, and nonparametric curve estimation.
BIO 262
Statistical Problems in Drug Development
This course will introduce the student to the "real life" applications of statistical methodology required for pharmaceutical drug development. Weekly seminars will cover statistical techniques used in the various phases of drug development, including assessment of pharmacologic activity; preclinical animal models and toxicology studies; clinical trials (Phase I dose ranging through Phase III comparative efficacy trials); and post-surveillance, pharmacoepidemiologic and pharmacoeconomic studies. Statistical techniques and examples include applications of optimum screening designs, use of non-parametric estimators, problems of multiplicity, tests for monotonicity, parametric and nonparametric regression, ordered categorical data analysis, survival methods, issues of power and sample size, bioequivalence studies, longitudinal data analysis, univariate and multivariate general linear models, multiple endpoint problems and quality-of-life measurement models. Exposure to linear models and non-parametric statistics recommended.

BIO 263
Computational Methods for Categorical Data Analysis
This course deals with exact nonparametric methods of inference. These methods use fast numerical algorithms to permute the observed data in all possible ways, and thereby derive exact distributions for the test statistics of interest without making any distributional or large-sample assumptions. Exact nonparametric methods are particularly important for small, sparse or unbalanced data where the usual asymptotic theory breaks down. This course will cover exact inference for one, two and K-sample problems, ordered and unordered RxC contingency tables, 2x2 and 2xC contingency tables with or without stratification, and logistic regression. A unified view, encompassing both continuous and categorical data, will be presented based on the permutation principle. Modern algorithmic advances that make exact permutational inference computationally feasible will be treated in depth. The methods will be illustrated by several biomedical data sets. This course will use StatXact and LogXact statistical packages.
BIO 268
Seminar on Statistical Methods in Human Genetics
This course provides a self-contained introduction to statistical genetics. Emphasis is placed on modern methods for gene mapping. Topics include Hardy-Weinberg disequilibrium, estimation of allele frequencies, linkage analysis, association analysis and haplotypes. Genetic concepts will be introduced as necessary.

BIO 270
Statistical Science Outreach
Seminar for broadening the background of students in probability and statistics. Students will give short presentations from expository articles and papers. This course is suitable for students in any year of the Biostatistics program.
Gelman DeGruttola
BIO 271
Statistical Computing Environments
Acquaints students with modern computing environments needed for careers in biostatistics. Course contains lectures and computer labs, with guest lecturers. Topics include: programming environments in statistics, algorithmic and symbolic mathematics, source language programming and its tools, editors, typesetters, Internet tools, UNIX and other tools that have potential for research in and practice of statistics.

BIO 274
Applied Stochastic Processes and Models in Public Health
Course develops stochastic processes for modeling important problems in public health. Among the topics to be covered are: Poisson processes, birth and death processes, Markov chains and processes, semi-Markov processes. Applications will be made to models of prevalence and incidence of disease, therapeutic clinical trials, clinical trials for prevention of disease, length biased sampling, models for early detection of disease, cell kinetics and family history problems.
BIO 276
Sequential Analysis
This course covers the basic theory underlying the design and interim monitoring of group sequential clinical trials and illustrates the theory with examples taken from real clinical trials. Topics include: 1) Distribution theory for sequentially computed test statistics derived from normal, binomial and time-to-event data; 2) General distribution theory based on the theorem of Scharfstein, Tsiatis and Robins (JASA, 1997); 3) The recursive integration algorithm for computing boundary crossing probabilities for processes of independent increments; 4) Stopping boundaries -- Haybittle-Peto p-value boundaries, Wang-Tsiatis power boundaries, efficacy boundaries derived from alpha spending functions, futility boundaries derived from beta spending functions; 5) Design of superiority trials with normal, binomial and time-to-event endpoints; 6) Design of non-inferiority trials with normal, binomial and time-to-event endpoints; 7) Design of maximum information trials; 8) Interim monitoring -- recomputing stopping boundaries from error spending functions, optimal placing of last look, conditional power, repeated confidence intervals, p-values, point estimates and confidence intervals adjusted for multiple looks; 9) Adaptive designs.
Mehta Betensky
BIO 279
Smoothing in Biostatistical Modeling
Smoothing is a means by which non-linear structure can be incorporated into a statistical model without the need for parametric modeling. This course will describe some of the main smoothing techniques and illustrate their use in biostatistical modeling. Computational and some theoretical issues will also be discussed. The package S-PLUS will be used.
BIO 284
Spatial Statistics for Health Research
This course will introduce students to a broad range of topics in spatial statistics, including but not limited to types of spatial data, kriging, parametric and non-parametric methods, and tests for spatial randomness. The course will draw on many real examples. Students will become proficient in the use of S-PLUS SpatialStats and ArcView.

BIO 288
Semiparametric Methods for Analysis of Missing and Censored Data
Discusses estimation techniques for low-dimensional parameters of semiparametric models (i.e., models with infinite-dimensional nuisance parameters) for complex longitudinal data subject to informative censoring or missingness. The course will start with a discussion of the fundamental notions and results of semiparametric theory: pathwise derivatives, tangent space, semiparametric variance and information bounds, and influence functions. It will then provide a general estimating function methodology for locally semiparametric efficient estimation and doubly robust estimation under data that are coarsened at random. This general methodology will then be applied to derive locally efficient doubly robust estimators of 1) regression parameters in multivariate generalized linear models subject to missing at random data, 2) the survival function of an endpoint subject to dependent right censoring, 3) the quality of life adjusted survival time subject to dependent right censoring, 4) the survival function of multivariate failure time data subject to univariate (dependent) censoring, 5) Cox regression parameters based on dependent right censored data, and 6) smooth parameters of the distribution of a time to an endpoint outcome based on current status data and interval censored data.
Rotnitzky Robins


Contact me: John_Willett@Harvard.Edu

Page last updated: May 31, 2005

President & Fellows of Harvard College