Age and Ageing Advance Access originally published online on June 13, 2006
Age and Ageing 2006 35(5):497-502; doi:10.1093/ageing/afl055
The Parkinson's Disease Questionnaire (PDQ-39): evidence for a method of imputing missing data
1 University of Oxford, Department of Public Health, Old Road Campus, Headington, Oxford OX3 7LF, UK
2 Harris Manchester College, Oxford OX1 3TD, UK
3 Nuffield College, Oxford OX1 1NF, UK
Address correspondence to: C. Jenkinson, Health Services Research Unit, Department of Public Health, Old Road Campus, Headington, Oxford OX3 7LF, UK. Tel: (+44) 1865 226857. Fax: (+44) 1865 226711. Email: crispin.jenkinson{at}dphpc.ox.ac.uk
Abstract
Background: the Parkinson's Disease Questionnaire (PDQ-39) is the most widely used Parkinson's-specific measure of health status. It is increasingly used in treatment trials, sometimes as a primary end-point, where any missing data can potentially cause difficulties in analyses.
Objectives: the purpose of this article is to evaluate the Expectation Maximisation (EM) algorithm for the imputation of missing dimension scores on the 39-item PDQ-39.
Methods: a postal survey of patients diagnosed with Parkinson's disease (PD). A total of 1,372 patients were surveyed and 839 (61.15%) questionnaires were returned completed or partially completed. Of these, complete PDQ data were available in 715 (85.22%) cases. Data were deleted from this complete dataset, and from a sub-set of 200 respondents drawn from it, and then imputed using the EM algorithm; results were then compared with the dataset before data deletion.
Results: results gained from imputation of data closely mirrored those of the complete dataset in each case. Descriptive statistics, mean scores and spread of scores were almost identical between original and imputed datasets. Furthermore, original and imputed datasets were highly correlated [intra-class correlation coefficient (ICC) = 0.93 or greater], and mean differences were small (within ±1.00).
Conclusions: the results suggest that the use of EM for the PDQ-39 provides data that closely mirror the original data when these have been deliberately removed. Consequently, EM is likely to be appropriate for trials using the PDQ that contain missing data points.
Keywords: Parkinson's Disease Questionnaire, health status measurement, missing data, data imputation, elderly, PDQ-39
Introduction
The Parkinson's Disease Questionnaire (PDQ-39) [1, 2] is the most widely used disease-specific measure of health status in Parkinson's disease (PD) and has been recommended as the most comprehensively validated of competing PD-specific outcome measures [3, 4].
The instrument was developed in the United Kingdom but has been translated into over 30 languages [5–10] and has been used in both single-country [11] and cross-cultural trials [12]. Typically, the instrument has relatively low levels of missing data, but inevitably with questionnaire-based outcome instruments, some data will be missing. Rigorous analyses of the data quality of patient-completed questionnaires are becoming more common [10, 13–15], but an area that can cause concern in surveys using health status measures, and especially in clinical trials, is that of missing data [16, 17]. In trials, missing data can potentially cause both loss of power and bias [18]. The purpose of this article is to evaluate a simple missing data algorithm for the imputation of missing dimension scores on the 39-item PDQ-39.
One of the most widely used techniques for data imputation within the social sciences is the Expectation Maximisation (EM) computational algorithm for multiple imputations [19]. The EM approach is an iterative procedure with two discrete steps. First, the expectation step computes the expected value of the complete-data log likelihood based upon the complete data cases and the algorithm's best guess as to what the sufficient statistics are for the missing data, given the existing data points. The maximisation step then substitutes the expected values (typically means and covariances) for the missing data obtained from the expectation step and maximises the likelihood function as if no data were missing to obtain new parameter estimates. The new parameter estimates are substituted back into the expectation step, and a new maximisation step is performed. The procedure iterates through these two steps until convergence is obtained. Convergence occurs when the change of the parameter estimates from iteration to iteration becomes negligible [20]. EM has a number of advantages over rival imputation methods based on multiple regression, not least that it is an established methodology that makes fewer demands of data in terms of statistical assumptions and appears, in general, to be more accurate [18]. Furthermore, EM is implemented in a wide variety of computer packages including popular software such as SPSS.
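The two steps can be made concrete with a deliberately reduced sketch. The Python below is not the SPSS routine used in this study. It assumes only two jointly normal variables, with `x` fully observed and `y` partly missing, whereas the PDQ-39 analysis involves eight dimensions and would need the matrix form of the same updates. The function name `em_impute_bivariate`, its parameters and the simulated data are all illustrative.

```python
import random
from statistics import fmean


def em_impute_bivariate(x, y, tol=1e-8, max_iter=500):
    """Fill in missing y values (None) by EM, assuming (x, y) is bivariate normal.

    x must be fully observed. Returns a copy of y with each None replaced by
    its conditional expectation under the converged parameter estimates.
    """
    missing = [v is None for v in y]
    observed_y = [v for v in y if v is not None]

    # x is complete, so its ML mean and variance are fixed throughout.
    mu_x = fmean(x)
    s_xx = fmean([(xi - mu_x) ** 2 for xi in x])

    # Starting values: moments of the observed y, zero covariance.
    mu_y = fmean(observed_y)
    s_yy = fmean([(v - mu_y) ** 2 for v in observed_y])
    s_xy = 0.0

    for _ in range(max_iter):
        # E-step: expected y and y^2 for every case under current estimates.
        slope = s_xy / s_xx
        residual_var = s_yy - slope * s_xy
        exp_y = [mu_y + slope * (xi - mu_x) if m else yi
                 for xi, yi, m in zip(x, y, missing)]
        exp_y2 = [e * e + (residual_var if m else 0.0)
                  for e, m in zip(exp_y, missing)]

        # M-step: re-estimate the parameters as if no data were missing.
        new_mu_y = fmean(exp_y)
        new_s_yy = fmean(exp_y2) - new_mu_y ** 2
        new_s_xy = fmean([xi * e for xi, e in zip(x, exp_y)]) - mu_x * new_mu_y

        change = max(abs(new_mu_y - mu_y), abs(new_s_yy - s_yy),
                     abs(new_s_xy - s_xy))
        mu_y, s_yy, s_xy = new_mu_y, new_s_yy, new_s_xy
        if change < tol:  # estimates have stopped moving: converged
            break

    slope = s_xy / s_xx
    return [mu_y + slope * (xi - mu_x) if m else yi
            for xi, yi, m in zip(x, y, missing)]


# Simulated example (not PDQ-39 data): delete 10% of y at random, then impute.
rng = random.Random(0)
x = [rng.gauss(50, 15) for _ in range(500)]
y_full = [0.8 * xi + rng.gauss(0, 5) for xi in x]
y_holes = list(y_full)
for i in rng.sample(range(500), 50):
    y_holes[i] = None
y_imputed = em_impute_bivariate(x, y_holes)
print(round(fmean(y_imputed) - fmean(y_full), 3))
```

The printed value is the difference in means between the imputed and the original simulated scores, the same kind of comparison this article reports for each PDQ-39 dimension.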
The purpose of this article is to explore the use of EM for the imputation of missing dimension scores on data gained from the PDQ-39.
Methods
Analysis plan
Data from a survey of patients registered with the Parkinson's Disease Society were analysed to determine whether EM could accurately estimate mean scores and distributions on the eight dimensions of the PDQ-39. This was achieved in the six-stage process described below:
- Any case from the dataset that contained any missing PDQ-39 dimension scores was deleted. This led to a dataset that had a complete set of scores for all respondents on all the eight dimensions of the PDQ.
- Dimension scores were deleted from this newly created sub-set of the original dataset, reflecting the manner in which data had been missing in the original complete dataset (i.e. the pattern of missing data was identical to the original data, although the cases from which deletions were made were selected randomly).
- The data that had been deliberately removed were then imputed using EM, assuming a multivariate normal distribution, and mean scores, standard deviations, medians and 25th and 75th percentiles for each dimension were calculated; these results were compared with the original scores before the deletion of data. Both parametric (t-test) and non-parametric (Wilcoxon) tests were used to determine whether data were statistically significantly different. This procedure was then repeated, but randomly selecting another set of cases for data deletion to check the results of the first analysis.
- Ten per cent of the responses to each PDQ-39 dimension were randomly deleted and then data imputed as before and results compared with the original dataset. Once again this procedure was repeated with another random selection of data, deleted and then imputed.
- It has been suggested that quality-of-life data are most likely to be missing amongst the most severely ill respondents [21, 22]. Consequently, another analysis was undertaken restricting the 10% of deletions and imputations to those cases in the bottom third of scores on the PDQ-39 Index [23], which is a summary score calculated on the basis of all eight dimensions. Once again this procedure was repeated.
- The purpose of this sixth step was to ascertain whether EM is an appropriate algorithm for use in smaller datasets. Consequently, a sub-set of 200 cases of the respondents who had complete data in the original dataset was selected. Ten per cent of data for each dimension were deleted and then imputed using EM as before. This step was then repeated.
The original dataset is referred to here as ORIGINAL.
The dataset derived from ORIGINAL, which contains only cases with complete data for all the eight dimensions of the PDQ-39 is called COMPLETE.
The dataset created from COMPLETE, where PDQ-39 values have been deliberately removed, is called VALUES_REMOVED.
The dataset created from COMPLETE, where deletions are restricted to the most severely ill respondents is called SEVERELY_ILL.
The dataset which contains a small sub-set of COMPLETE is called SMALL.
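The deletion stages in the analysis plan can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' procedure. It assumes that "10% of responses" means 10% of all cases, drawn independently for each dimension, that the "bottom third" on the Index denotes the cases in worst health (highest scores), and that the Index is the mean of the eight dimension scores. The dimension keys, function names and simulated scores are hypothetical.

```python
import random

DIMENSIONS = ["mobility", "adl", "emotional", "stigma",
              "social", "cognition", "communication", "pain"]


def delete_scores(cases, fraction=0.10, eligible=None, seed=None):
    """Blank out `fraction` of all cases on each dimension, independently.

    cases: list of dicts mapping dimension name -> score (complete data).
    eligible: optional case indices to which deletions are restricted.
    Returns a new list; the input is left untouched.
    """
    rng = random.Random(seed)
    pool = list(range(len(cases))) if eligible is None else list(eligible)
    n_delete = round(fraction * len(cases))
    result = [dict(case) for case in cases]
    for dim in DIMENSIONS:
        for i in rng.sample(pool, n_delete):
            result[i][dim] = None
    return result


def worst_third(cases):
    """Indices of the third of cases with the highest (worst) mean score."""
    index_score = [sum(c[d] for d in DIMENSIONS) / len(DIMENSIONS)
                   for c in cases]
    ranked = sorted(range(len(cases)), key=lambda i: index_score[i],
                    reverse=True)
    return ranked[: len(cases) // 3]


def count_complete(cases):
    """Number of cases with a score on every dimension."""
    return sum(all(c[d] is not None for d in DIMENSIONS) for c in cases)


# Simulated stand-in for COMPLETE (715 cases, scores on the 0-100 scale).
rng = random.Random(0)
complete = [{d: rng.uniform(0, 100) for d in DIMENSIONS} for _ in range(715)]

random_deletions = delete_scores(complete, seed=1)
severely_ill = delete_scores(complete, eligible=worst_third(complete), seed=2)
small = delete_scores(rng.sample(complete, 200), seed=3)

print(count_complete(random_deletions), count_complete(severely_ill),
      count_complete(small))
```

Because each dimension is sampled separately, a case can lose scores on several dimensions at once, which is why far fewer than 90% of cases remain complete after deletion.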
Survey design
The data analysed here were gained from a postal survey that was conducted of randomly selected members of 13 local branches of the Parkinson's Disease Society. Full details of the survey can be found elsewhere [24].
The Questionnaire
The PDQ-39 comprises 39 questions measuring the eight dimensions of health: mobility, activities of daily living, emotional well-being, stigma, social support, cognition, communication and bodily pain. Dimension scores are coded on a scale of 0 (perfect health as assessed by the measure) to 100 (worst health as assessed by the measure). The development of the instrument and studies confirming the reliability and validity of the instrument are documented elsewhere [1, 23, 25].
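For readers unfamiliar with how a 0 to 100 dimension score arises, a minimal Python sketch follows. The article itself does not give the scoring rule. The sketch assumes the commonly described convention in which each item is answered on a 0 to 4 scale and a dimension score is the item sum expressed as a percentage of the maximum possible, with the summary Index taken as the mean of the eight dimension scores. The item counts per dimension, the treatment of a dimension as missing when any of its items is unanswered, and all names are assumptions that should be checked against the user manual [2].

```python
# Assumed number of items in each dimension (should total 39).
ITEMS_PER_DIMENSION = {
    "mobility": 10,
    "activities of daily living": 6,
    "emotional well-being": 6,
    "stigma": 4,
    "social support": 3,
    "cognition": 4,
    "communication": 3,
    "bodily pain": 3,
}
MAX_ITEM_SCORE = 4  # assumed response scale: 0 (best) to 4 (worst)


def dimension_score(item_responses):
    """Score one dimension on a 0 (best) to 100 (worst) scale.

    Returns None when any item is unanswered; this is the kind of missing
    dimension score that the imputation procedure is meant to fill in.
    """
    if any(r is None for r in item_responses):
        return None
    return 100.0 * sum(item_responses) / (MAX_ITEM_SCORE * len(item_responses))


def summary_index(dimension_scores):
    """Mean of the dimension scores, or None if any of them is missing."""
    values = list(dimension_scores.values())
    if any(v is None for v in values):
        return None
    return sum(values) / len(values)


print(dimension_score([2, 2, 2, 2]))
print(dimension_score([1, None, 3]))
```

Under this convention a single unanswered item removes the whole dimension score, and with it the Index, which is one reason imputing at the level of dimensions is attractive.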
Results
A total of 1,372 members of the Parkinson's Disease Society (PDS) were surveyed, and 851 (62.03%) questionnaires were returned. Of the 851 questionnaires returned, complete data on the PDQ were available in 715 (84.02%), and it is this sub-set of the data that is analysed here (called COMPLETE). The average age of respondents was 70.48 years (SD 9.57; min = 32.89, max = 90.57, n = 713). A total of 432 (60.42%) of the respondents were male and 283 (39.58%) female. Descriptive statistics for the eight dimensions of the PDQ-39 are summarised in Table 1.
The second stage in testing EM on the PDQ-39 involved removing data from COMPLETE. The distribution of missing data in the subsequent dataset, VALUES_REMOVED, replicated the pattern of missing data in the original dataset (ORIGINAL). In the original dataset, 125 cases were missing a score on at least one dimension because of missing data. Consequently, data in COMPLETE were deleted in 125 cases: the cases for deletion were chosen randomly, but the deletion of data exactly mirrored that in the original dataset. The distribution of dimensions with complete data and descriptive statistics for this dataset before imputation are shown in Appendix 1 in the supplementary data on the journal website (http://www.ageing.oupjournals.org/). As the number of cases with missing data in VALUES_REMOVED is exactly the same as in ORIGINAL, but VALUES_REMOVED is built from the smaller COMPLETE dataset, the proportion of cases with missing data is somewhat higher. Consequently, VALUES_REMOVED contains a higher proportion of missing data than is typical of PDQ-39 datasets: this strategy was deliberately chosen to test the data imputation algorithm in datasets with higher than typical amounts of missing data. Descriptive statistics are broadly similar to those gained from the dataset as a whole.
The third stage in testing EM on the PDQ-39 involved applying the EM algorithm to the dataset. Descriptive statistics are summarised in Table 2. Mean differences between original and imputed scores for all of the eight dimensions were very small (Table 2). Correlations between the two datasets were high [intra-class correlation coefficient (ICC) = 0.95 or higher, P<0.001].
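The comparison reported above rests on two quantities per dimension: the mean difference between original and imputed scores, and an intra-class correlation. A minimal Python sketch of both follows. The article does not say which form of ICC was computed, so the one-way random-effects form used here is an assumption, and the function names are illustrative.

```python
from statistics import fmean


def mean_difference(original, imputed):
    """Mean of (imputed - original) across cases."""
    return fmean([i - o for o, i in zip(original, imputed)])


def icc_oneway(original, imputed):
    """One-way random-effects ICC for two scores per case.

    ICC = (MSB - MSW) / (MSB + (k - 1) * MSW) with k = 2 scores per case,
    where MSB and MSW are the between-case and within-case mean squares.
    """
    n = len(original)
    k = 2
    case_means = [(o + i) / 2 for o, i in zip(original, imputed)]
    grand_mean = fmean(case_means)
    ms_between = k * sum((m - grand_mean) ** 2 for m in case_means) / (n - 1)
    ms_within = sum((o - m) ** 2 + (i - m) ** 2
                    for o, i, m in zip(original, imputed, case_means)) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Toy scores for one dimension: only the third case was imputed.
original = [12.5, 40.0, 55.0, 80.0, 22.5]
imputed = [12.5, 40.0, 50.0, 80.0, 22.5]
print(mean_difference(original, imputed), icc_oneway(original, imputed))
```

Because unimputed cases contribute identical pairs of scores, the ICC is pulled towards 1 when only a minority of cases carry imputed values, which is worth bearing in mind when reading the coefficients. For the paired t-test and Wilcoxon signed-rank test named in the analysis plan, SciPy provides `scipy.stats.ttest_rel` and `scipy.stats.wilcoxon`.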
The procedures outlined above were then repeated, but missing data were created for another random sample from COMPLETE to create a new VALUES_REMOVED. Data reflected the results gained above with very small differences for all PDQ-39 dimensions between the original dataset and the dataset containing imputations (<±0.4). Correlations between the original data and the imputed data for each dimension were high (ICC = 0.97 or higher, P<0.001).
The fourth step in testing the EM procedure on the PDQ-39 involved removing 10% of data per dimension (from COMPLETE) and then imputing values for these missing data. This led to 297 (41.54%) cases with complete data, and the remainder requiring at least one dimension score to be imputed. Descriptive statistics and mean differences between the original data and the imputed dataset are summarised in Table 3, as well as descriptive statistics for the dataset, including the imputations. Mean differences between original and imputed scores for all of the eight dimensions were very small. Correlations between the two datasets were high (ICC = 0.95 or higher, P<0.001). The procedures outlined above were then repeated, but missing data were created for another 10% random sample for each dimension from COMPLETE: only 307 (42.94%) cases had complete data across all dimensions of the PDQ-39. Data reflected the results gained above with mean differences for all PDQ-39 dimensions between the original dataset and the dataset containing imputations being very small (<±0.35). Similarly, results for all the eight dimensions were highly correlated (ICC = 0.95 or higher, P<0.001).
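As a rough plausibility check on the complete-case counts reported above, suppose (an assumption, not stated in the article) that the 10% deletions were drawn independently for each of the eight dimensions. The expected proportion of cases retaining all eight scores is then:

```latex
P(\text{complete}) = (1 - 0.10)^{8} = 0.9^{8} \approx 0.430,
\qquad 715 \times 0.430 \approx 308
```

This is in line with the 297 (41.54%) and 307 (42.94%) complete cases observed in the two runs.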
The fifth stage in testing the EM algorithm was similar to that mentioned above, but the 10% of cases were all removed from respondents who scored in the bottom third of scores on the PDQ-39 Index, creating a dataset called SEVERELY_ILL. This led to 492 (68.81%) cases with complete data, and the remainder requiring at least one dimension score to be imputed. Results in this analysis were similar to those from the random deletions undertaken previously. Mean differences between original and imputed scores for all of the eight dimensions were very small (see Appendix 2 in the supplementary data on the journal website http://www.ageing.oupjournals.org/). Correlations between the two datasets were high (ICC = 0.97 or higher, P<0.001). The procedures outlined above were then repeated, but missing data were created for another 10% random sample from the bottom third of scores on the PDQ Index for each dimension from COMPLETE: only 489 (68.39%) cases had complete data across all dimensions of the PDQ-39. Data reflected the results gained above with mean differences for all PDQ-39 dimensions between the original dataset and the dataset containing imputations being very small (<±0.88). Similarly, results for all the eight dimensions were highly correlated (ICC = 0.97 or higher, P<0.001).
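The complete-case counts for this stage can be checked in the same back-of-envelope way, under assumptions the article does not spell out: that the deletions on each dimension amount to 10% of all cases, that they fall independently across dimensions, and that they are confined to exactly one third of respondents. Each eligible case then loses a given dimension with probability 0.10 / (1/3) = 0.30, so:

```latex
P(\text{complete}) = \tfrac{2}{3} + \tfrac{1}{3}\,(1 - 0.30)^{8}
                   = \tfrac{2}{3} + \tfrac{1}{3}\,(0.0576)
                   \approx 0.686,
\qquad 715 \times 0.686 \approx 490
```

This agrees closely with the 492 (68.81%) and 489 (68.39%) complete cases reported for the two runs.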
In the sixth stage in testing the EM algorithm on the PDQ-39, 200 cases from the original dataset were randomly selected and 10% of responses to each dimension randomly removed from this sub-set and then imputed. This dataset is called DATA_FULLSMALL. The purpose of this procedure was to see whether data imputation works on datasets smaller than those previously analysed here. The results of this procedure are reported in Appendix 3 (in the supplementary data on the journal website http://www.ageing.oupjournals.org/). Only 88 respondents (44.0%) had complete data after random deletion of data. All PDQ-39 dimensions had ranges of 0–100 for both the original 200 case dataset and the dataset containing imputations. Results for all the eight dimensions were highly correlated (ICC = 0.93 or higher, P<0.001). The procedures outlined above were then repeated, but missing data were created for another 10% random sample for each dimension from the 200 cases of the sample: only 82 (41%) of the respondents had complete data for the PDQ-39. Data reflected the results gained above, with the mean differences for all PDQ-39 dimensions between the original dataset and the dataset containing imputations being small (<±0.70). Similarly, results for all the eight dimensions were highly correlated (ICC = 0.95 or higher, P<0.001).
Discussion
This article has evaluated a widely used missing data algorithm (EM) for use with the PDQ-39. Missing data can lead to problems in analyses of data and, in treatment trials, can cause loss of power and are potentially a source of bias. Furthermore, common use of list-wise deletion (i.e. completely omitting cases with missing data) can not only lead to bias but could be viewed as unethical, as patients have given their time to complete the survey. However, it is possible that imputation of values may severely skew results and hence be inappropriate, and that simply reporting results based on the data available is the most appropriate solution [26]. The latter procedure can, however, lead to reduced sample size and reduced power, hence the decision to undertake the analyses that have been reported in this article.
The procedure used here imputes dimension scores and not individual item responses. This approach was adopted because previous analyses have shown that the eight dimensions of the PDQ-39 are tapping aspects of a broad general underlying phenomenon (overall health). Indeed, the concept of unidimensionality underlies the PDQ-39 Single Index Score that is calculated by summing scores for the eight dimensions [23]. Consequently, one would expect a pattern to dimension scores, and the results gained here would support this hypothesis. Indeed, differences between the original full dataset and the consequent datasets in which data have been deleted and then imputed are very small and are unlikely to be meaningful [22].
We would not advise using this, or any other data imputation algorithm, on datasets smaller (n < 200) than those presented here. Indeed, the use of EM on small datasets can produce misleading variance in the distribution of data [27]. Furthermore, datasets with very large amounts of missing data (in excess of 10%) must be viewed with caution, as this is a high rate of missing data atypical of the PDQ-39 and is likely to reflect some aspect of the sample surveyed or the method of data collection. Indeed, the PDQ-39 exhibits low levels of missing data compared with generic instruments that have been used in elderly samples [28]. However, in the analyses undertaken here, we have adopted conservative (i.e. large) estimates for missing data, and the success in imputing the data scores suggests that the imputation method will work in more typical datasets where less data are likely to be missing. Results reported here provide encouraging evidence of a method of imputation for data on health-related quality of life. Furthermore, we have explored the possibility that missing data are systematically related to the overall health status of individuals, and the algorithm performs well.
In conclusion, we would recommend EM as a method for imputing PDQ data in surveys and trials. The technique is implemented in a wide variety of software packages including, for example, SPSS. However, use of this technique should not be at the expense of trying to determine what the reasons for such missing data may be and whether strategies may exist for improving data collection. Researchers would be advised to undertake a sensitivity analysis by comparing results gained from their original dataset, which includes the missing data, with those gained after implementing the EM algorithm. In any instance where meaningful differences were to be found between the two datasets, this may raise questions about the appropriateness of this technique and would suggest that non-responders may differ from responders in an important respect.
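The sensitivity analysis recommended above could take a form like the following Python sketch, which compares the available-case mean of one dimension with its mean after imputation. The function name and the `threshold` parameter are hypothetical. What counts as a meaningful difference is left to the analyst and is not fixed by this article.

```python
from statistics import fmean


def sensitivity_check(with_missing, imputed, threshold):
    """Compare the mean of the available scores with the mean after imputation.

    with_missing: scores for one dimension, None where the score is missing.
    imputed: the same scores after imputation (no None values).
    threshold: smallest difference the analyst regards as meaningful.
    """
    available = [v for v in with_missing if v is not None]
    available_mean = fmean(available)
    imputed_mean = fmean(imputed)
    difference = imputed_mean - available_mean
    return {
        "available_case_mean": available_mean,
        "imputed_mean": imputed_mean,
        "difference": difference,
        "meaningful": abs(difference) >= threshold,
    }


# Toy example: one missing score, imputed as 50.
print(sensitivity_check([10, None, 30], [10, 50, 30], threshold=5))
```

A flagged difference does not by itself show that the imputation is wrong; as the text notes, it may instead indicate that non-responders differ from responders.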
Key points
- Missing data on health status questionnaires can reduce sample size and power in trials which include such measures as outcomes.
- The PDQ-39 is one of the most widely used measures of outcome for studies evaluating the impact of PD.
- Reports of missing data on the PDQ-39 are generally low, but ideally such data could be imputed.
- EM is a widely implemented method for imputing data, found in software such as SPSS.
- In this study, EM was found to be an accurate method of imputing missing data on the PDQ-39 in datasets with sample sizes >200 cases.
Sources of Funding
We thank the Parkinson's Disease Society for funding the original studies on which the data used in this article are based.
Conflict of interest
None.
References
1. Peto V, Jenkinson C, Fitzpatrick R, Greenhall R. The development and validation of a short measure of functioning and well-being for individuals with Parkinson's disease. Qual Life Res 1995; 4: 241–8.
2. Jenkinson C, Fitzpatrick R, Peto V. The Parkinson's Disease Questionnaire: User Manual for the PDQ-39, PDQ-8 and the PDQ Summary Index. Oxford: University of Oxford Health Services Research Unit, 1998.
3. Marinus J, Ramaker C, van-Hilten JJ, Stiggelbout AM. Health related quality of life in Parkinson's disease: a systematic review. J Neurol Neurosurg Psychiatry 2002; 72: 241–8.
4. Damiano AM, Snyder C, Strausser B, Willian MK. A review of health related quality-of-life concepts and measures for Parkinson's disease. Qual Life Res 1999; 8: 235–43.
5. Katsarou Z, Bostantjopoulou S, Peto V, Alevriadou A, Kiosseoglou G. Quality of life in Parkinson's disease: Greek translation and validation of the Parkinson's disease questionnaire (PDQ-39). Qual Life Res 2001; 10: 159–63.
6. Martinez-Martin P, Frades-Payo B, Fontan-Tirado C, Martinez-Sarries F, Guerrero M, del-Ser-Quijano T. Valoracion de la calidad de vida en la enfermedad de Parkinson mediante el PDQ-39. Estudio piloto. Neurologia 1997; 12: 56–60.
7. Martinez-Martin P, Frades B, Jimenez-Jimenez F et al. The PDQ-39 Spanish version: reliability and correlation with the short-form health survey (SF-36). Neurologia 1999; 14: 159–63.
8. Bushnell DM, Martin ML. Quality of life and Parkinson's disease: translation and validation of the US Parkinson's Disease Questionnaire (PDQ-39). Qual Life Res 1999; 8: 345–50.
9. Sobstyl M, Zabek M, Koziara H, Kadziolka B. Evaluation of quality of life in Parkinson's disease treatment. Neurol Neurochir Polska 2003; 37 (Suppl. 5): 221–30.
10. Jenkinson C, Fitzpatrick R, Norquist J, Findley L, Hughes K. Cross cultural validation of the Parkinson's Disease Questionnaire: tests of data quality, score reliability, response rate and scaling assumptions in America, Canada, Japan, Italy and Spain. J Clin Epidemiol 2003; 56: 843–7.
11. Wade DT, Gage H, Owen C, Trend P, Grossmith C, Kaye J. Multidisciplinary rehabilitation for people with Parkinson's disease: a randomised controlled trial. J Neurol Neurosurg Psychiatry 2003; 74: 158–62.
12. Gershanik O, Emre M, Bernhard G, Sauer D. Efficacy and safety of levodopa with entacapone in Parkinson's disease patients suboptimally controlled with levodopa alone, in daily clinical practice. An international, multicentre, open label study. Prog Neuropsychopharmacol Biol Psychiatry 2003; 27: 963–71.
13. Berzon RA. Understanding and using health related quality of life instruments within clinical research studies. In: Staquet MJ, Hays RD, Fayers PM, eds. Quality of Life Assessment in Clinical Trials: Methods and Practice. Oxford: Oxford University Press, 1998.
14. Friedman LM, Furberg CD, DeMets DL. Fundamentals of Clinical Trials. New York: Springer, 1998.
15. Jenkinson C, Levvy G, Fitzpatrick R, Garratt A. The Amyotrophic Lateral Sclerosis Assessment Questionnaire (ALSAQ-40): tests of data quality, score reliability and response rate in a survey of patients. J Neurol Sci 2000; 180: 94–100.
16. Efficace F, Bottomley A, Vanvoorden V, Blazeby J. Methodological issues in assessing health-related quality of life of colorectal cancer patients in randomised controlled trials. Eur J Cancer 2004; 40: 187–97.
17. Vercherin P, Gutknecht C, Guillemin F, Ecochard R, Mennen L-I, Mercier M. Non-reponses aux questionnaires de qualite de vie SF-36 dans un echantillon de l'etude SU.VI.MAX. Rev Epidemiol Sante Publique 2003; 51: 513–25.
18. Fairclough D, Peterson H, Chang V. Why are missing quality of life data a problem in clinical trials of cancer therapy? Stat Med 1998; 17: 667–77.
19. Schafer JL. Analysis of Incomplete Multivariate Data. London: Chapman & Hall, 1997.
20. Little RJA, Rubin DB. The analysis of social science data with missing values. In: Fox S, Long JS, eds. Modern Methods of Data Analysis. Newbury Park, CA: Sage, 1990.
21. Fayers PM, Machin D. Quality of Life. Assessment, Analysis and Interpretation. Chichester: John Wiley and Sons, 2000.
22. Kopp I, Lorenz W, Rothmund M, Koller M. Relation between severe illness and non-completion of quality-of-life questionnaires by patients with rectal cancer. J R Soc Med 2003; 96: 442–8.
23. Jenkinson C, Fitzpatrick R, Peto V, Greenhall R, Hyman N. The PDQ-39: development of a Parkinson's Disease summary index score. Age Ageing 1997; 26: 353–7.
24. Peto V, Jenkinson C, Fitzpatrick R. Determining minimally important differences for the Parkinson's Disease Questionnaire (PDQ-39). Age Ageing 2001; 30: 299–302.
25. Peto V, Jenkinson C, Fitzpatrick R. PDQ-39: a review of the development, validation and application of a Parkinson's disease quality of life questionnaire and its associated measures. J Neurol 1998; 245 (Suppl. 1): S10–14.
26. Von Hippel PT. Biases in SPSS 12.00 Missing Value Analysis. Am Statistician 2004; 58: 160–4.
27. Gelman A, King G, Lin C. Not asked and not answered: multiple imputation for multiple surveys. J Am Stat Assoc 1999; 93: 846–57.
28. Kosinski M, Bayliss M, Bjorner JB, Ware JE. Improving estimates of SF-36 Health Survey scores for respondents with missing data. Medical Outcomes Trust Monitor, 2000; 5: 8–10.