Two Example Annotated Areas of Whole-Slide Images Taken From the CAMELYON16 DatasetĮFigure 2. This mean is obtained by joining all the diagnoses of the pathologists WTC and computing the resulting ROC curve as if it were 1 person analyzing 11 × 129 = 1419 cases.ĮFigure 1. B, The mean ROC curve was computed using the pooled mean technique. To generate estimates of sensitivity and specificity for each pathologist, negative was defined as confidence levels of definitely normal and probably normal all others as positive. All the pathologists WTC scored glass slide images using 5 levels of confidence: definitely normal, probably normal, equivocal, probably tumor, definitely tumor. The top 2 deep learning–based systems outperform all the pathologists WTC in this study. A, A machine-learning system achieves superior performance to a pathologist if the operating point of the pathologist lies below the ROC curve of the system. Task 2 was measured on the 129 whole-slide images (for algorithms and the pathologist WTC) and corresponding glass slides (for 11 pathologists WOTC) in the test data set, which 49 contained metastatic regions. The blue in the axes on the left panels correspond with the blue on the axes in the right panels. AUC indicates area under the receiver operating characteristic curve CAMELYON16, Cancer Metastases in Lymph Nodes Challenge 2016 CULab, Chinese University Lab HMS, Harvard Medical School MGH, Massachusetts General Hospital MIT, Massachusetts Institute of Technology WOTC, without time constraint WTC, with time constraint ROC, receiver operator characteristic.