Taiwan Images for Malaria Eradication
What is TIME
TIME, which aims to facilitate the development of automated malaria diagnosis, is the first clinically validated database of more than eight thousand digitized thin blood smear images, collected from 36 patients, with reliable annotations of more than 24,000 malaria-infected red blood cells made by experienced microscopists. TIME is intended to provide a publicly available benchmark database for developing and comparing algorithms for automated malaria diagnosis.
[Read the paper]: link to paper
Malaria, caused by parasites of the genus Plasmodium, has been a serious and re-emerging global health issue. The gold standard for malaria diagnosis is microscopic examination of stained peripheral blood smears. Rapid and accurate diagnosis is key to malaria treatment and prognosis. Much effort has been devoted to establishing automated microscopic diagnostics for malaria, but the challenge of achieving expert-level diagnostic performance in real-world clinical settings has yet to be solved.
In TIME, we provide (1) a large database of highly diverse malaria images, (2) high-quality reference standards annotated by multiple experienced microscopists, and (3) performance metrics of human experts for comparison.
How did we collect and annotate TIME
All data in TIME were acquired from ex vivo peripheral blood samples of patients notified to Taiwan Centers for Disease Control for suspected malaria infection. Blood smears were prepared in accordance with WHO guidelines. An Olympus VS120 Virtual Slide Microscope was used to scan the blood smears and acquire pictures via its digital camera. Each whole slide was divided into grids of 2 mm x 2 mm, and the 100X objective with oil immersion was used to scan the slide into Olympus virtual slide images (.vsi). Olympus VS-ASW software was then used to convert the scanned .vsi files into JPEG image files of 2048 x 2048 pixels.
We devised annotation guidelines based on WHO guidelines to standardize the annotation process.
Target of annotation
Bounding boxes are annotated on digitized thin blood smear images for:
1. Confirmed malaria-infected red blood cells, and
2. Highly suspected malaria-infected blood cells.
The bounding box of a malaria-infected red blood cell is annotated by following the steps below:
Bounding box annotation: Find an infected blood cell containing a parasite whose characteristics are identifiable, such as a nucleus and cytoplasm. Create a bounding box that includes the whole infected blood cell. The box should be as close to the exact size of the cell as possible. If the infected blood cell overlaps with other blood cells, the bounding box should cover as much as possible of the area containing the whole infected blood cell.
Confirm the label for the bounding box: After the bounding box is created, it is labelled according to the species and life-cycle stage of the infecting malaria parasite. Labels include the species of malaria (P. falciparum, P. vivax, P. ) and four different life-cycle stages (ring form/trophozoites, schizont, gametocyte), for example, P. falciparum/ring form or P. vivax/gametocyte. The label “Indeterminate” is used when the blood cell is suspicious for infection but cannot be confirmed with full confidence.
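As an illustration of the two steps above, an annotation can be thought of as a box plus a "species/stage" label. The sketch below is only one possible representation: the class name BoundingBox, its field names, and the helper split_label are hypothetical and are not the file format of TIME or of the annotation tool.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

INDETERMINATE = "Indeterminate"  # label for suspicious but unconfirmed cells


@dataclass(frozen=True)
class BoundingBox:
    """One annotated cell (hypothetical layout), in pixel coordinates."""
    x_min: int
    y_min: int
    x_max: int
    y_max: int
    label: str  # e.g. "P. falciparum/ring form" or "Indeterminate"

    def area(self) -> int:
        # Degenerate boxes (max <= min) are treated as having zero area.
        width = max(0, self.x_max - self.x_min)
        height = max(0, self.y_max - self.y_min)
        return width * height


def split_label(label: str) -> Tuple[Optional[str], Optional[str]]:
    """Split "species/stage" at the first slash; Indeterminate has neither part."""
    if label == INDETERMINATE:
        return None, None
    species, _, stage = label.partition("/")
    return species.strip(), stage.strip()


box = BoundingBox(10, 20, 110, 140, "P. falciparum/ring form")
species, stage = split_label(box.label)
```

Splitting at the first slash only keeps a stage name such as "ring form/trophozoites" intact.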
Validation of annotation
To establish a baseline understanding of the variation that exists among different annotators, we are the first to examine the inter-rater reliability of experts’ annotations. (Please refer to our paper for details.) Because of the moderate-to-high agreement among our experts’ annotations at both cell level and image level, we are confident of the reliability of the reference ground truth generated in our database. In addition, to give the reference standard for the clinical validation set further credibility, a majority decision among the panel of experts was followed by discussion for adjudication, which further improved the reliability of our annotations.
The bounding boxes were annotated with an in-house annotation tool, which can be accessed here. [link to annotation tool]
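The majority-decision step can be sketched as follows. This is a minimal illustration, not the procedure used in the paper: the function name majority_label is hypothetical, and the rule assumed here (a label wins only with a strict majority of the votes, otherwise the cell goes to discussion) is an assumption.

```python
from collections import Counter
from typing import List, Optional


def majority_label(votes: List[str]) -> Optional[str]:
    """Return the label chosen by a strict majority of experts.

    Returns None when no label has more than half of the votes, which in
    this sketch marks the cell as needing adjudication by discussion.
    """
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count * 2 > len(votes) else None


agreed = majority_label(["ring form", "ring form", "schizont"])
disputed = majority_label(["ring form", "schizont"])
```

Here agreed is "ring form", while disputed is None because neither label has more than half of the votes.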
Our baseline model
In our project, we proposed a new approach using a one-stage detection framework to detect malaria-infected cells and recognize life-cycle stages directly from thin blood smear images. An object-detection convolutional neural network was trained and showed promising sensitivity and specificity in malaria detection, especially in ring form detection. Compared with practicing microscopists, our algorithm demonstrated an AUC of 0.997 with an error rate of 3.4% in malaria detection, and an AUC of 0.995 with an error rate of 2.4% in ring form detection, indicating expert-level performance.
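For readers comparing their own algorithms against these figures, the two metrics can be computed along the following lines. This is a sketch under stated assumptions, not the evaluation code behind the numbers above: the function names, the 0.5 threshold, and the definition of error rate as the fraction of misclassified examples at that threshold are all assumptions, and the scores below are made-up toy values.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise ranking (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Fraction of (positive, negative) pairs in which the positive ranks higher.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def error_rate(scores: Sequence[float], labels: Sequence[int],
               threshold: float = 0.5) -> float:
    """Fraction of examples misclassified at the given score threshold."""
    wrong = sum((s >= threshold) != bool(y) for s, y in zip(scores, labels))
    return wrong / len(labels)


toy_scores = [0.9, 0.8, 0.3, 0.1]  # toy model outputs, not TIME results
toy_labels = [1, 0, 1, 0]          # 1 = infected, 0 = not infected
toy_auc = auc(toy_scores, toy_labels)
toy_err = error_rate(toy_scores, toy_labels)
```

The pairwise form is quadratic in the number of examples, which is acceptable for a small check but not for a full benchmark run.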
Use agreement / License
The data may be freely used under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).
At the moment, the database contains P. falciparum only and does not include other species of malaria (e.g., P. vivax, P. ovale) because of local epidemiology. Although P. falciparum is the most characteristic and severe species of malaria, we are seeking to expand the diversity of malaria species in the database by collaborating with academic institutions and Centers for Disease Control worldwide. If you are interested in collaboration, please contact us: [link to us]
Taiwan Centers for Disease Control | Taiwan AI Labs
Copyright 2019. All rights reserved.