The reliability of clinical measurements is critical to medical research and clinical practice. Newly proposed methods are assessed in terms of their reliability, which includes repeatability and intra- and interobserver reproducibility. In general, new methods that provide repeatable and reproducible results are compared with established methods in clinical use. This paper describes common statistical methods for assessing reliability and agreement between methods, including the intraclass correlation coefficient, coefficient of variation, Bland-Altman plot, limits of agreement, percent agreement, and the kappa statistic. These methods are more appropriate for estimating reliability than hypothesis testing or simple correlation. However, some reliability indices, especially unscaled ones, do not clearly define the acceptable level of error in the actual size and units of the measurement. The Bland-Altman plot is particularly useful for method comparison studies because it assesses the relationship between the differences and the magnitude of the paired measurements, the bias (as the mean difference), and the degree of agreement (as the limits of agreement) between two methods or conditions (e.g., observers). Caution is needed when handling heteroscedasticity of the differences between two measurements, when using the means of repeated measurements per method in method comparison studies, and when comparing reliability between different studies. In addition, independence of the measuring processes, the combined use of different estimation approaches, clear descriptions of the calculations used to produce the indices, and clinical acceptability should be emphasized in reliability assessment and method comparison studies.
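The bias and limits of agreement described above can be sketched in a few lines. This is a minimal illustration, not code from the paper: the function name and the paired readings are hypothetical, and the standard 95% limits (bias ± 1.96 × SD of the differences) are assumed.

```python
import statistics

def bland_altman(a, b):
    """Bias and 95% limits of agreement for paired measurements a and b."""
    diffs = [x - y for x, y in zip(a, b)]       # per-pair differences
    bias = statistics.mean(diffs)               # mean difference (bias)
    sd = statistics.stdev(diffs)                # SD of the differences
    # 95% limits of agreement: bias +/- 1.96 * SD
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)

# hypothetical paired readings from two methods
m1 = [10.2, 11.5, 9.8, 12.0, 10.9, 11.1]
m2 = [10.0, 11.9, 9.5, 12.3, 11.0, 10.8]
bias, (lower, upper) = bland_altman(m1, m2)
```

In a full analysis one would also plot each difference against the pair mean to check for heteroscedasticity, i.e., whether the spread of the differences grows with the magnitude of the measurement.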
This study examined the item analysis and reliability of the Family Screening Test, which was developed to assess family characteristics and the degree of family psychopathology, from March 15 to May 30, 1996.
The subjects were 467 housewives in Seoul selected by random sampling. The Family Screening Test is a self-report measure composed of 11 subscales rated on a 5-point scale.
1) 98 items were selected by item analysis and reliability evaluation from the original 1999 items, which were categorized into 11 subscales.
2) The internal consistency coefficients (Cronbach's alpha) and the Spearman-Brown split-half correlation coefficients were greater than 0.70 for all scales except scales 8, 9, 10, and 11.
3) The Cronbach's alpha and split-half reliability coefficients for scales 8, 9, 10, and 11 were less than 0.70; these scales were evaluated as having insufficient corrected item-total correlation coefficients.
It was suggested that scales 8, 9, 10, and 11 need additional items to increase their reliability coefficients through complementary revision work.
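The internal consistency index reported above can be computed from the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The sketch below is a minimal illustration with hypothetical 5-point responses; it is not the study's data or code.

```python
import statistics

def cronbach_alpha(items):
    """Cronbach's alpha; `items` is a list of item-score columns (one list per item)."""
    k = len(items)
    item_vars = [statistics.variance(col) for col in items]   # sample variance per item
    totals = [sum(scores) for scores in zip(*items)]          # total score per respondent
    total_var = statistics.variance(totals)                   # variance of total scores
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# hypothetical 5-point responses: 3 items x 5 respondents
items = [
    [4, 3, 5, 2, 4],
    [4, 2, 5, 3, 4],
    [3, 3, 4, 2, 5],
]
alpha = cronbach_alpha(items)
```

A value above the conventional 0.70 threshold, as used in this study, indicates acceptable internal consistency for the scale.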