Applied Longitudinal Data Analysis for Epidemiology
A Practical Guide
In this book the most important techniques available for longitudinal data analysis
are discussed. This discussion includes simple techniques such as the paired t-test
and summary statistics, and also more sophisticated techniques such as generalized
estimating equations and random coefficient analysis. A distinction is made between
longitudinal analysis with continuous, dichotomous, and categorical outcome variables.
The emphasis of the discussion lies in the interpretation of the different techniques
and the comparison of the results of different techniques. Furthermore, special
chapters deal with the analysis of two measurements, experimental studies and the
problem of missing data in longitudinal studies. Finally, an extensive overview of
(and a comparison between) different software packages is provided. This practical
guide is suitable for non-statisticians and researchers working with longitudinal data
from epidemiological and clinical studies.
Dr Jos W. R. Twisk is a senior researcher and lecturer in the Department of Clinical
Epidemiology and Biostatistics and the Institute for Research in Extramural
Medicine, Vrije Universiteit, Medical Centre, Amsterdam.
© Cambridge University Press www.cambridge.org
Cambridge University Press
0521819768 - Applied Longitudinal Data Analysis for Epidemiology: A Practical Guide
Jos W. R. Twisk
Frontmatter
More information
Applied Longitudinal Data
Analysis for Epidemiology
A Practical Guide
Jos W. R. Twisk
Vrije Universiteit Medical Centre, Amsterdam
Published by the Press Syndicate of the University of Cambridge
The Pitt Building, Trumpington Street, Cambridge, United Kingdom
Cambridge University Press
The Edinburgh Building, Cambridge CB2 2RU, UK
40 West 20th Street, New York, NY 10011-4211, USA
477 Williamstown Road, Port Melbourne, VIC 3207, Australia
Ruiz de Alarcón 13, 28014 Madrid, Spain
Dock House, The Waterfront, Cape Town 8001, South Africa
http://www.cambridge.org
© Jos W. R. Twisk 2003
This book is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without
the written permission of Cambridge University Press.
First published 2003
Printed in the United Kingdom at the University Press, Cambridge
Typefaces Minion 11/14.5 pt and Formata   System LaTeX 2ε   [tb]
A catalogue record for this book is available from the British Library
Library of Congress Cataloguing in Publication data
Twisk, Jos W R, 1962–
Applied longitudinal data analysis for epidemiology: a practical guide / Jos W R Twisk.
p. cm.
Includes bibliographical references and index.
ISBN 0-521-81976-8 (hbk). ISBN 0-521-52580-2 (pbk.)
1. Epidemiology – Research – Statistical methods. 2. Epidemiology – Longitudinal studies.
3. Epidemiology – Statistical methods. I. Title.
RA652.2.M3 T95 2002
614.4'07'27 dc21 2002023437
ISBN 0 521 81976 8 hardback
ISBN 0 521 52580 2 paperback
The world is turning, I hope it don’t turn away
Neil Young
To Marjon and Mike
Contents
Preface page xv
Acknowledgements xvi
1 Introduction 1
1.1 Introduction 1
1.2 General approach 2
1.3 Prior knowledge 2
1.4 Example 3
1.5 Software 5
1.6 Data structure 5
1.7 Statistical notation 5
2 Study design 7
2.1 Introduction 7
2.2 Observational longitudinal studies 9
2.2.1 Period and cohort effects 9
2.2.2 Other confounding effects 13
2.2.3 Example 14
2.3 Experimental (longitudinal) studies 15
3 Continuous outcome variables 18
3.1 Two measurements 18
3.1.1 Example 20
3.2 Non-parametric equivalent of the paired t-test 21
3.2.1 Example 22
3.3 More than two measurements 23
3.3.1 The ‘univariate’ approach: a numerical example 26
3.3.2 The shape of the relationship between an outcome variable and time 29
3.3.3 A numerical example 30
3.3.4 Example 32
3.4 The ‘univariate’ or the ‘multivariate’ approach? 37
3.5 Comparing groups 38
3.5.1 The ‘univariate’ approach: a numerical example 39
3.5.2 Example 41
3.6 Comments 45
3.7 Post-hoc procedures 46
3.7.1 Example 47
3.8 Different contrasts 48
3.8.1 Example 49
3.9 Non-parametric equivalent of MANOVA for repeated measurements 52
3.9.1 Example 53
4 Continuous outcome variables: relationships with other variables 55
4.1 Introduction 55
4.2 ‘Traditional’ methods 55
4.3 Example 57
4.4 Longitudinal methods 60
4.5 Generalized estimating equations 62
4.5.1 Introduction 62
4.5.2 Working correlation structures 62
4.5.3 Interpretation of the regression coefficients derived from GEE analysis 66
4.5.4 Example 68
4.5.4.1 Introduction 68
4.5.4.2 Results of a GEE analysis 69
4.5.4.3 Different correlation structures 72
4.5.4.4 Unequally spaced time intervals 75
4.6 Random coefficient analysis 77
4.6.1 Introduction 77
4.6.2 Random coefficient analysis in longitudinal studies 77
4.6.3 Example 80
4.6.3.1 Results of a random coefficient analysis 80
4.6.3.2 Unequally spaced time intervals 88
4.6.4 Comments 88
4.7 Comparison between GEE analysis and random coefficient analysis 91
4.7.1 Extensions of random coefficient analysis 92
4.7.2 Equal variances over time 92
4.7.2.1 A numerical example 93
4.7.3 The correction for covariance 93
4.7.4 Comments 95
4.8 The modelling of time 95
4.8.1 Example 98
5 Other possibilities for modelling longitudinal data 102
5.1 Introduction 102
5.2 Alternative models 102
5.2.1 Time-lag model 102
5.2.2 Modelling of changes 105
5.2.3 Autoregressive model 107
5.2.4 Overview 108
5.2.5 Example 108
5.2.5.1 Introduction 108
5.2.5.2 Data structure for alternative models 109
5.2.5.3 GEE analysis 109
5.2.5.4 Random coefficient analysis 112
5.3 Comments 114
5.4 Another example 118
6 Dichotomous outcome variables 120
6.1 Simple methods 120
6.1.1 Two measurements 120
6.1.2 More than two measurements 122
6.1.3 Comparing groups 122
6.1.4 Example 123
6.1.4.1 Introduction 123
6.1.4.2 Development over time 123
6.1.4.3 Comparing groups 126
6.2 Relationships with other variables 128
6.2.1 ‘Traditional’ methods 128
6.2.2 Example 128
6.2.3 Sophisticated methods 129
6.2.4 Example 131
6.2.4.1 Generalized estimating equations 131
6.2.4.2 Random coefficient analysis 137
6.2.5 Comparison between GEE analysis and random coefficient analysis 140
6.2.6 Alternative models 143
6.2.7 Comments 144
7 Categorical and ‘count’ outcome variables 145
7.1 Categorical outcome variables 145
7.1.1 Two measurements 145
7.1.2 More than two measurements 146
7.1.3 Comparing groups 147
7.1.4 Example 147
7.1.5 Relationships with other variables 151
7.1.5.1 ‘Traditional’ methods 151
7.1.5.2 Example 151
7.1.5.3 Sophisticated methods 152
7.1.5.4 Example 153
7.2 ‘Count’ outcome variables 156
7.2.1 Example 157
7.2.1.1 Introduction 157
7.2.1.2 GEE analysis 158
7.2.1.3 Random coefficient analysis 163
7.2.2 Comparison between GEE analysis and random coefficient analysis 165
8 Longitudinal studies with two measurements: the definition and analysis of change 167
8.1 Introduction 167
8.2 Continuous outcome variables 167
8.2.1 A numerical example 171
8.2.2 Example 173
8.3 Dichotomous and categorical outcome variables 175
8.3.1 Example 175
8.4 Comments 177
8.5 Sophisticated analyses 178
8.6 Conclusions 178
9 Analysis of experimental studies 179
9.1 Introduction 179
9.2 Example with a continuous outcome variable 181
9.2.1 Introduction 181
9.2.2 Simple analysis 182
9.2.3 Summary statistics 184
9.2.4 MANOVA for repeated measurements 185
9.2.4.1 MANOVA for repeated measurements corrected for the baseline value 186
9.2.5 Sophisticated analysis 188
9.3 Example with a dichotomous outcome variable 195
9.3.1 Introduction 195
9.3.2 Simple analysis 195
9.3.3 Sophisticated analysis 196
9.4 Comments 200
10 Missing data in longitudinal studies 202
10.1 Introduction 202
10.2 Ignorable or informative missing data? 204
10.3 Example 205
10.3.1 Generating datasets with missing data 205
10.3.2 Analysis of determinants for missing data 206
10.4 Analysis performed on datasets with missing data 207
10.4.1 Example 208
10.5 Comments 212
10.6 Imputation methods 213
10.6.1 Continuous outcome variables 213
10.6.1.1 Cross-sectional imputation methods 213
10.6.1.2 Longitudinal imputation methods 213
10.6.1.3 Multiple imputation method 214
10.6.2 Dichotomous and categorical outcome variables 216
10.6.3 Example 216
10.6.3.1 Continuous outcome variables 216
10.6.3.2 Dichotomous outcome variables 219
10.6.4 Comments 221
10.7 Alternative approaches 223
10.8 Conclusions 223
11 Tracking 225
11.1 Introduction 225
11.2 Continuous outcome variables 225
11.3 Dichotomous and categorical outcome variables 230
11.4 Example 234
11.4.1 Two measurements 235
11.4.2 More than two measurements 237
11.5 Comments 238
11.5.1 Interpretation of tracking coefficients 238
11.5.2 Risk factors for chronic diseases 239
11.5.3 Grouping of continuous outcome variables 239
11.6 Conclusions 240
12 Software for longitudinal data analysis 241
12.1 Introduction 241
12.2 GEE analysis with continuous outcome variables 241
12.2.1 STATA 241
12.2.2 SAS 243
12.2.3 S-PLUS 244
12.2.4 Overview 246
12.3 GEE analysis with dichotomous outcome variables 247
12.3.1 STATA 247
12.3.2 SAS 248
12.3.3 S-PLUS 249
12.3.4 Overview 250
12.4 Random coefficient analysis with continuous outcome variables 250
12.4.1 STATA 250
12.4.2 SAS 251
12.4.3 S-PLUS 255
12.4.4 SPSS 257
12.4.5 MLwiN 259
12.4.6 Overview 262
12.5 Random coefficient analysis with dichotomous outcome variables 263
12.5.1 Introduction 263
12.5.2 STATA 264
12.5.3 SAS 265
12.5.4 MLwiN 269
12.5.5 Overview 270
12.6 Categorical and ‘count’ outcome variables 271
12.7 Alternative approach using covariance structures 272
12.7.1 Example 274
13 Sample size calculations 280
13.1 Introduction 280
13.2 Example 283
References 286
Index 295
Preface
The two most important advantages of this book are (1) the fact that it has
been written by an epidemiologist, and (2) the word ‘applied’, which implies
that the emphasis of this book lies more on the application of statistical techniques
for longitudinal data analysis and not so much on the mathematical
background. In most other books on the topic of longitudinal data analysis,
the mathematical background is the major issue, which may not be surprising
since (nearly) all the books on this topic have been written by statisticians.
Although statisticians fully understand the difficult mathematical material
underlying longitudinal data analysis, they often have difficulty in explaining
this complex material in a way that is understandable for the researchers who
have to use the technique or interpret the results. In fact, an epidemiologist is
not primarily interested in the basic (difficult) mathematical background of
the statistical methods, but in finding the answer to a specific research question;
the epidemiologist wants to know how to apply a statistical technique
and how to interpret the results. Owing to their different basic interests and
different levels of thinking, communication problems between statisticians
and epidemiologists are quite common. This, in addition to the growing
interest in longitudinal studies, initiated the writing of this book: a book on
longitudinal data analysis, which is especially suitable for the ‘non-statistical’
researcher (e.g. epidemiologist). The aim of this book is to provide a practical
guide on how to handle epidemiological data from longitudinal studies.
The purpose of this book is to build a bridge over the communication gap
that exists between statisticians and epidemiologists when addressing the
complicated topic of longitudinal data analysis.
Jos Twisk
Amsterdam, January 2002
Acknowledgements
I am very grateful to all my colleagues and students who came to me with
(mostly) practical questions on longitudinal data analysis. This book is based
on all those questions. Furthermore, I would like to thank Dick Bezemer,
Maarten Boers, Bernard Uitdehaag and Wieke de Vente who critically read
preliminary drafts of some chapters and provided very helpful comments,
and Faith Maddever who corrected the English language.