# Statistics Colloquium: Dr. James Livesey

## U.S. Census Bureau

11:00 AM - 12:00 PM

**Title:** An Overlap Measure of Classification Performance

**Abstract:** The receiver operating characteristic (ROC) curve is a popular and important method for determining diagnostic cut-points (thresholds) when there are K = 2 groups and M = 1 variable. The area under the ROC curve (AUROC) is an overall measure of the diagnostic accuracy of the classification procedure. Giacoletti and Heyse (2011) recommended nonparametric kernel density estimation and a measure of distribution overlap as a useful supplement to ROC/AUROC methods. Distribution overlap was introduced by Bradley (1985) as the area shared under two probability density functions; it ranges from 0 (nonoverlapping densities) to 1 (identical densities). Royston and Altman (2010) recently described the use of the overlapping coefficient for assessing discrimination in the logistic regression model. For K > 2 groups, the ROC/AUROC is not readily scalable, and measures of the overlapping coefficient have not been fully developed. This presentation will develop overlap measures for K > 2 groups and M > 1 variables and consider them as performance metrics for classification. The methods will be applied to two real data examples, including the famous Fisher Iris data, to illustrate their favorable interpretive properties and effective data visualization.
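For readers unfamiliar with the overlapping coefficient, the two-group, one-variable case described above can be sketched as follows. This is only an illustration under stated assumptions, not the speaker's method: the function names `gaussian_kde` and `overlap_coefficient`, the Gaussian kernel, the normal-reference bandwidth, and the grid-based trapezoidal integration are all choices made here for the example. The quantity estimated is the integral of min(f1(x), f2(x)), where f1 and f2 are kernel density estimates for the two groups.

```python
import math
import random
import statistics


def gaussian_kde(sample):
    """Return a Gaussian kernel density estimate of `sample` as a function.

    Assumption: the bandwidth uses the normal-reference rule of thumb,
    h = 1.06 * sd * n^(-1/5); other bandwidth choices are equally valid.
    """
    n = len(sample)
    h = 1.06 * statistics.stdev(sample) * n ** (-1 / 5)
    norm = 1.0 / (n * h * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)

    return density


def overlap_coefficient(a, b, grid_size=512):
    """Estimate the area shared under the two estimated densities.

    Integrates min(f_a, f_b) with the trapezoidal rule on an evenly
    spaced grid that extends three pooled standard deviations beyond
    the observed range (an arbitrary padding chosen for this sketch).
    """
    f, g = gaussian_kde(a), gaussian_kde(b)
    pooled = list(a) + list(b)
    pad = 3 * statistics.stdev(pooled)
    lo, hi = min(pooled) - pad, max(pooled) + pad
    step = (hi - lo) / (grid_size - 1)
    heights = [min(f(lo + i * step), g(lo + i * step)) for i in range(grid_size)]
    return step * (sum(heights) - 0.5 * (heights[0] + heights[-1]))


if __name__ == "__main__":
    # Simulated data, used purely for illustration.
    rng = random.Random(0)
    group_1 = [rng.gauss(0.0, 1.0) for _ in range(200)]
    group_2 = [rng.gauss(1.0, 1.0) for _ in range(200)]
    print(round(overlap_coefficient(group_1, group_2), 3))
```

Values near 1 indicate groups that are hard to tell apart on this variable, and values near 0 indicate groups that separate almost perfectly, matching the 0-to-1 range given in the abstract.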