The Longitudinal Analysis for Diverse Populations Project
The “Longitudinal Analysis for Diverse Populations” (LADP) project (R01CA096885)
is generously funded by a grant from the National Institutes of Health. The
project is based at the
Community-based interventions (e.g., to reduce obesity and increase physical
activity) can play an important role in reducing the risk of, and the overall
mortality and morbidity from, diseases such as coronary heart disease and
cancer. They are especially important for African Americans, who are
disproportionately at risk for a wide range of negative health conditions,
including cancer of the breast, colon, esophagus, prostate, pancreas, and
stomach; mortality from cardiovascular
disease; hypertension; and elevated serum cholesterol. This project will
develop more efficient and cost-effective methods for analysis of longitudinal
studies using quasi-least squares (QLS), with special emphasis on studies in
diverse populations. Our aims are:
(1) To develop more efficient and informative methods for analysis in
longitudinal studies and community-based interventions, by applying QLS for a
wide range of correlation models not currently implemented for generalized
estimating equations (GEE) and by constructing confidence intervals and tests
of hypotheses for the parameters of the new structures, for data with one or
more levels of within-cluster association (e.g., both within families and
within subjects over time).
(2) To develop methods for planning more powerful studies and for taking
advantage of interim re-computation of power, by (i) assessing loss in
efficiency in estimation for different study designs and correlation models and
(ii) providing explicit formulas for sample size and power calculations for
several correlation structures, including the structures implemented in Aim 1.
This aim will consider both the regression and the correlation parameters.
(3) To apply our methods in analyses of several studies in female, pediatric,
and African-American populations at the University of Pennsylvania, to further
refine and tailor their development to the characteristics of data for diverse
populations and to answer new questions that our methods make possible.
(4) To compare and contrast our approaches with alternative methods, including
methods based on random effects models and recent extensions of GEE, via
simulations to assess small-sample efficiency and bias and via data analyses to
compare results of the different approaches.
(5) To implement the methods for analysis (Aim 1) and planning (Aim 2) in Stata
programs for use by other statisticians, and to disseminate the programs and
their documentation widely on a web site developed for this project.
1. Justine Shults, Carissa A. Mazurick, and J. Richard Landis. Analysis of repeated bouts of measurements in the framework of generalized estimating equations. Statistics in Medicine, 25(23): 4114-4128, 2006.
2. Justine Shults, Wenguang Sun, Xin Tu, and Jay Amsterdam. (2006) On the Violation of Bounds for the Correlation in Generalized Estimating Equation Analyses of Binary Data from Longitudinal Trials. UPenn Biostatistics Working Papers. Working Paper 8. [Available from Bepress]
3. Wenguang Sun, Justine Shults, and Mary Leonard. (2008) A Note on the Use of Unbiased Estimating Equations to Estimate Correlation in Analysis of Longitudinal Trials. Biometrical Journal, in press.
An earlier version of this paper is also available: (2006) Use of Unbiased Estimating Equations to Estimate Correlation in Generalized Estimating Equation Analysis of Longitudinal Trials. UPenn Biostatistics Working Papers. Working Paper 4. [Available from Bepress]
4. Sarah J. Ratcliffe and Justine Shults. GEEQBOX: A Matlab toolbox for generalized estimating equations and quasi-least squares. The Journal of Statistical Software, 25(14): 1-14, 2008.
5. Justine Shults, Sarah J. Ratcliffe, and Mary Leonard. Improved generalized estimating equation analysis via xtqls for implementation of quasi-least squares in Stata. The Stata Journal, 7(2): 147-166, 2007. [Available from Bepress: UPenn Biostatistics Working Papers. Working Paper 13.]
6. Xin Tu, Jiameng Zhang, Jeanne Kowalski, Justine Shults, Changyong Feng, Wenguang Sun, and Wan Tang. (2006) Power analysis for longitudinal study designs with missing data. Statistics in Medicine (in press).
[Note: Justine Shults was supported by the LADP project. The other authors were supported by the following grants and contracts: R01-DA012249, AI-51186, and NIH contract N01-AI-50029.]
7. Justine Shults and Sarah J. Ratcliffe. Analysis of multi-level correlated data in the framework of generalized estimating equations via xtmultcorr procedures in Stata and qls functions in Matlab. Statistics and its Interface (in press).
An earlier version of this paper is also available: (2007) UPenn Biostatistics Working Papers. Working Paper 15. [Available from Bepress]
8. Jichun Xie and Justine Shults. Implementation of quasi-least squares with the R package qlspack. Journal of Statistical Software (in press).
9. Hanjoo Kim, Joseph Hilbe, and Justine Shults. (2008) On the designation of the patterned associations for longitudinal Bernoulli data: weight matrix versus true correlation structure? (under review at Biometrika). UPenn Biostatistics Working Papers. Working Paper 26. [Available from Bepress]
10. Hanjoo Kim and Justine Shults. (2008) %QLS SAS Macro: A SAS macro for Analysis of Longitudinal Data Using Quasi-Least Squares. UPenn Biostatistics Working Papers. Working Paper 27. [Available from Bepress]
Software for this project
is described and made available in the following publications. By using this
software, you agree to cite the appropriate publication in any manuscripts. To
install the software, simply unzip the files into the working directory for the
analysis. Changing to that directory within Stata or Matlab will make the procedures / functions available.
GEEQBOX: a Matlab toolbox for implementation of quasi-least squares as described in Publication #4 (Ratcliffe & Shults, 2008). [Download software and User’s Guide in a .zip file]
xtqls: a Stata procedure for implementation of quasi-least squares as described in Publication #5 (Shults, Ratcliffe & Leonard, 2007). [Download software in a .zip file]
xtmultcorr: Stata procedures for the analysis of multi-level correlated data as described in Publication #7 (Shults & Ratcliffe, 2007). [Download software in a .zip file]
qlspack: an R package for the analysis of correlated data via quasi-least squares as described in Publication #8 (Xie & Shults, 2008). [Available on CRAN web-site]
%QLS: a SAS macro for analysis of longitudinal data via quasi-least squares as described in Publication #10 (Kim & Shults, 2008). [Download software in a .zip file]
Please write to
September 7, 2004 “Analysis of Repeated Bouts of Measurements
in the Framework of Generalized Estimating Equations”,
Cincinnati Children’s
November 19, 2004 “Analysis of Repeated Bouts of Measurements
in the Framework of Generalized Estimating Equations”,
Department of Mathematics and Statistics,
January 22, 2005 “Analysis of Dichotomous Outcomes Using
Quasi-Least Squares”,
Department of Mathematics and Statistics,
March 9, 2006 “A
Generalized Estimating Equation Analysis of a Clinical Trial to Compare Venlafaxine with Lithium in the Treatment of Major
Depressive Episode”,
Cincinnati Children’s
September 29, 2006 “Improved Generalized Estimating Equation
Analysis via Application of Quasi-Least Squares”,
Department of Biostatistics,
November 10, 2006 “Improved Generalized Estimating Equation
Analysis via Application of Quasi-Least Squares”,
Department of Statistics, George
March, 2007 “On the Impact and Likelihood of a Violation of
Bounds for the Correlation in GEE Analyses of Binary Data for Longitudinal Data
and What We Can Do to Address This Problem.”
ENAR,
August, 2008 “Improved Analysis of Weight-Loss Interventions for
African-American Women”,
Joint Statistical Meetings (JSM),
March, 2005 “Adjusted
Quasi-Least Squares for Valid Analysis of Correlated Binary Data”
ENAR,
March, 2005 “Comparison
of Wang-Carey Estimation Versus Quasi-Least Squares”
ENAR,
August, 2005 “Adjusted Quasi-Least Squares for Analysis of
Correlated Binary Data”
JSM,
August, 2005 “Comparison of Wang-Carey Estimation Versus Quasi-Least
Squares”
JSM,
March, 2007 “Implementation of a New
Correlation Structure in Framework of GEE with R Software.”
ENAR,
August, 2007 “Implementation of an Extended
Familial Correlation Structure with Quasi-Least Squares.”
JSM,
August, 2008 “Demonstration of User-Written
Software in Stata, R, and Matlab
for Analysis of a Familial Correlation Structure in a Quasi-Least Squares
Analysis of Weight Loss in the SHARE Study” [Click
here for the talk and to obtain information regarding requests for the
software, which is provided on the last slide of the talk]
JSM, Denver,
CO; Xiaoying Wu, Jichun Xie, Sarah Ratcliffe, Shiriki Kumanyika, and Justine Shults.
March, 2009 Title of Session: “From Theory to Practice: Examples and Discussion of Software Development for Wide Dissemination of Statistical Methodology.”
Organizer & Chair: Justine Shults, PhD.
Speakers: John S Preisser, PhD; Peter XK Song, PhD; Joseph M Hilbe, PhD.
Discussant: Anthony Rossini, PhD.
Conference: ENAR 2009,
Investigators on this project
include the following faculty members in the Department of Biostatistics and
Epidemiology, Center for Clinical Epidemiology and Biostatistics (CCEB), University of Pennsylvania
School of Medicine,
Investigator                      Role on Project
Justine Shults, PhD               Principal Investigator
Scarlett Bellamy, ScD             Co-Investigator
Shiriki Kumanyika, PhD, MPH       Co-Investigator
Sarah J. Ratcliffe, PhD           Co-Investigator
Thomas Ten Have, PhD, MPH         Co-Investigator
Jinbo Chen, PhD                   Co-Investigator (joined during year 3)
Graduate students in the Department of Biostatistics within the CCEB
(Biostatistics graduate program) who have worked on this project include:
Wenguang Sun, MS completed his Master’s Project in
Biostatistics on problems stemming from this project. Publication #3 describes
his research; see also the list of contributed talks for two presentations of
this research, at ENAR 2005 and JSM 2005. (Update:
Wenguang Sun, PhD is currently an Assistant Professor
on the tenure track in the Department of Statistics at
Jichun Xie,
BS studied at
Hanjoo Kim, BS studied at
Student interns who have worked on this project include:
George Wang (Summer 2008), a Statistics major at
This web page is maintained by Justine Shults, PhD & Sarah Ratcliffe, PhD.
Please e-mail jshults@cceb.med.upenn.edu or sratclif@cceb.med.upenn.edu with
any questions or comments that you might have. We welcome your feedback and
interest in our research.
Last updated: 22 August 2008