Research Output

Mining trauma injury data with imputed values.

  Methods are reported for analyzing trauma injury data with missing values, collected at a UK hospital. One measure of injury severity, the Glasgow coma score (GCS), which is known to be associated with patient death, is missing for 12% of patients in the dataset. To include these patients in the analysis, three data imputation techniques are used to estimate the missing values. The imputed datasets are analyzed by an artificial neural network and by logistic regression, and the results are compared in terms of sensitivity, specificity, positive predictive value and negative predictive value. Although there is little distinction between the results for the three imputation methods on the overall dataset, the hot-deck imputation method appears to give more accurate results than the model-based or propensity score imputation methods when the comparison is restricted to the subset of patients with imputed GCS values. The results show that imputation does not reduce overall predictive accuracy following a data-mining analysis, demonstrating that all cases may be included when analyzing these trauma injury data.
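
  The four comparison measures named in the abstract can all be derived from the counts of a two-by-two confusion matrix. The following is a minimal sketch rather than the authors' code: it assumes that patient death is treated as the positive class, the function name is hypothetical, and the counts are invented for illustration and are not results from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute four standard measures from confusion-matrix counts.

    tp: deaths correctly predicted      fp: survivors predicted to die
    tn: survivors correctly predicted   fn: deaths predicted to survive
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of actual deaths detected
        "specificity": tn / (tn + fp),  # share of actual survivors detected
        "ppv": tp / (tp + fp),          # share of predicted deaths that occurred
        "npv": tn / (tn + fn),          # share of predicted survivals that occurred
    }


# Hypothetical counts for illustration only; not taken from the paper.
metrics = diagnostic_metrics(tp=40, fp=10, tn=140, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

  Computing the same four values separately for the whole dataset and for the subset of patients with imputed GCS would mirror the comparison the abstract describes.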

  • Date:

    31 October 2009

  • Publisher:

    Wiley InterScience

  • DOI:

    10.1002/sam.10044

  • ISSN:

    1932-1864

  • Library of Congress:

    RA Public aspects of medicine

  • Dewey Decimal Classification:

    610.7 Medical education, research & nursing


Penny, K. I. & Chesney, T. (2009). Mining trauma injury data with imputed values. Statistical Analysis and Data Mining. 2, 246-254. doi:10.1002/sam.10044. ISSN 1932-1864



Data mining; artificial neural network; logistic regression; missing data imputation; trauma injury
