Received date: March 30, 2015; Accepted April 23, 2015; Published April 25, 2015
Citation: Deepak RU, Kumar RR, Byju NB, Sharathkumar PN, Pournami C, et al. (2015) Computer Assisted Pap Smear Analyser for Cervical Cancer Screening using Quantitative Microscopy. J Cytol Histol S3:010. doi: 10.4172/2157-7099.S3-010
Copyright: © 2015 Deepak RU, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Cervical cancer is the third most common cancer among women. The bulk of the cancer burden falls on low- and middle-income countries, where screening is mostly opportunistic rather than systematic. Among the available screening methods, cytology-based screening using the Pap smear test is by far the most widely followed and accepted. In countries where organized screening using the Pap test has been introduced, incidence of and mortality from the disease have declined significantly. Although the method is effective in controlling the disease, it is difficult to implement in practice because it is resource intensive, requiring trained professionals skilled enough to identify a handful of abnormal cells among a few hundred thousand cells. This motivates automating the screening methodology. Since the 1960s, numerous projects have developed automated screening systems, leading to a couple of commercial products. Still, these have had limited impact on screening in most of the world. This paper describes a screening system developed by our group in an effort to create a cost-effective screening system that could be widely deployed. The system digitizes Pap smear slides, carries out cell-level and smear-level analysis on the digitized smear and finally classifies the smear as either normal or suspicious. Clearly normal smears were screened out without any human intervention, while suspicious smears were sent for expert cytologist review. A low-cost monolayer slide preparation technique has also been identified, which produces monolayer slides of quality comparable to that of commercial systems at much lower cost. The computer-aided Pap smear analyzer has been under validation at the Regional Cancer Centre (RCC), Thiruvananthapuram, India since May 2011. Since then a total of 1107 smears covering all abnormal and normal categories have been evaluated, with a specificity of 60% and an overall sensitivity of 80%.
The system produces even higher sensitivity of 93% and 95% for the HSIL and SCC grades respectively. Each slide used for validation underwent two blinded review arms: first conventional manual cytology by qualified cytologists, and second automated Pap smear analysis. The accuracy of the automated analysis was benchmarked using the manual review result as the gold standard. The system has been found to reduce the workload of cytologists by almost 60% and has been designed to be operated by a semi-skilled person. A fully automated system can be built on the results obtained with the present system by adding a slide loader, scanner, bar-code reader, sensors etc. If designed and built cost-effectively, such a system can increase slide processing throughput and reduce the dependency on human labour, thereby significantly reducing the cost per slide and making it feasible to extend screening to many more women.
Computer assisted screening; Cervical cancer; Pap smear; Mega Funnel Technique; Image analysis; Classification; Cost effective screening system
Cervical cancer is one of those rare cancer groups which can be diagnosed and fully cured if detected at onset. Even so, more than 525,000 women are diagnosed with cervical cancer and more than 265,000 die from the disease every year. 85% of these deaths occur in low- and middle-income countries, the reason being poor access to screening and treatment services. Globally there are 2 billion women in the age group where screening is relevant and who need screening every three years. The Pap smear test, invented by Dr. George Papanicolaou in 1940, is by far the most widely followed screening technique; it can detect cervical cancer at an early and easily curable stage by studying the cells naturally exfoliating from the cervix. Screening based on the Pap test (or Pap smear) has led to a dramatic reduction in the mortality rate for women who have been tested regularly in countries with an effective screening program [6-11].
Human Papilloma Virus (HPV) and its numerous strains, around [...] method. Studies show that even a single screening in a lifetime substantially reduces the risk of cervical cancer incidence. However, competing health care priorities, insufficient financial resources, weak health systems, and limited numbers of trained providers have made high coverage for cervical cancer screening difficult to achieve in most low- and middle-income countries [2,20-23].
Visual screening of a Pap smear involves careful scrutiny of several thousand Fields of View (FOV) under a microscope, which together contain a few hundred thousand cells, in order to identify a few abnormal cells. Screening is a most demanding function of the human eye-brain axis; it is exhausting and fatigue producing. According to the Clinical Laboratory Improvement Amendments (CLIA) of 1988, cytotechnologists, who screen the specimens, should not process more than 100 slides per day, because fatigue and habituation degrade the quality of screening and can result in a high number of false positives and negatives. To give reasonable protection against undetected cervical cancers, eligible women need to be screened regularly. With 2 billion women in the relevant age groups, screening programs generate enormous numbers of samples to analyze. Educating and financing sufficient numbers of human screeners creates great practical and economic problems, which has led to substantial interest in automating the task. Furthermore, the human eye-brain axis is not good at appreciating the early nuclear changes which are the first indications of neoplastic transformation. Quantitative microscopy is much better suited for detection and objective measurement of early changes of malignancy.
Ever since the first appearance of computers, significant development efforts have aimed at supplementing or replacing the human visual inspection of Pap smears by computer analysis [27-29]. But the problem turned out to be a lot harder than expected. From the first automated system in the 1950s it took almost another half century before the first commercially successful system appeared.
The Cytoanalyzer, built during the 1950s, was the first attempt at automating Pap smear screening. Although the system was able to distinguish the morphological difference between normal and malignant cells, it produced too many false alarms. Another early attempt was CYBEST, developed during the 1970s, which was able to detect malignancy based on morphological features but had problems with the chromatin features, primarily caused by poorly focused images. During the 1980s quite a number of systems such as BioPEPR, FAZYTAN, LEYTAS and DIASCANNER were developed. Although some of these systems reported accuracy comparable with conventional visual screening, none was successful owing to lack of cost effectiveness. More recently, research on Pap smear images has used assisted segmentation in which free-lying cells with no interference from inflammatory cells were handpicked. Such work may require significantly more effort to develop into a field-deployable screening system.
Two United States Food and Drug Administration (FDA) approved automated machines were developed in the 1990s: the AutoPap 300 QC (NeoPath, Redmond, WA, USA) and the PapNet (Neuromedical Systems Inc., Suffern, NY, USA). Both systems were designed to work with conventional cytology slides. AutoCyte also developed a machine known as the AutoCyte-Screen, which was able to read AutoCyte-Prep slides (now BD SurePath LBC). The experience gained from these early commercial efforts led to the merger of the companies into TriPath Imaging Inc. (Burlington, NC, USA), and the first generation products were replaced by the AutoPap Primary Screening System, which is now known as the BD FocalPoint GS Imaging System (BD Diagnostics, Franklin Lakes, NJ, USA). Cytyc also developed an interactive system with a computer prescreen that selected the most abnormal looking objects on each specimen for human inspection. In 2003 they received FDA approval for their ThinPrep Imaging System, and in 2007 they became part of the Hologic Company. The system is marketed for increasing detection of abnormalities by improved specimen preparation and by screening both visually and by machine [30,32]. Even after numerous attempts, automated screening is still not sufficiently cost-effective to completely replace visual screening, judging from the relatively limited penetration of automated screening systems in screening operations worldwide.
The basis for Pap smear screening is that cancerous or precancerous abnormal cells have larger nuclei and more irregular shape and chromatin structure than normal cells, as seen in Figures 1a and 1b. However, the task is not as simple as it sounds, because cells in the specimen, even when prepared by a mono-layering technique, are often folded, overlapped, covered by blood cells or other artifacts, and clustered, as in Figure 1c. Moreover, as the task involves analyzing a few hundred thousand cells in search of malignant ones, even a very low false positive rate at the cell level will result in nearly all specimens being classified as malignant. To address these problems, the automated screening system uses advanced image acquisition, processing and classification techniques coupled with a novel monolayer slide preparation technique, detailed in the subsequent sections, to provide a solution which can be adopted for mass screening of cervical cancer.
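The false positive argument above can be made concrete with a short calculation. The sketch below assumes, purely for illustration, that every normal cell is misclassified independently with the same per-cell false positive rate and that a specimen is flagged as soon as one cell is flagged; neither this simplification nor the example rates and cell count come from the study.

```python
def prob_specimen_flagged(per_cell_fp_rate: float, n_cells: int) -> float:
    """Probability that at least one of n_cells normal cells is falsely flagged.

    Assumes independent, identically distributed per-cell errors, which is an
    illustrative simplification rather than a property of the described system.
    """
    return 1.0 - (1.0 - per_cell_fp_rate) ** n_cells


# Illustrative cell count standing in for "a few hundred thousand cells".
N_CELLS = 200_000

for rate in (1e-3, 1e-4, 1e-5, 1e-6):
    p = prob_specimen_flagged(rate, N_CELLS)
    print(f"per-cell FP rate {rate:.0e}: P(normal specimen flagged) = {p:.3f}")
```

Under these assumptions, a rule that flags a slide on any single suspicious cell is only usable when the per-cell false positive rate is far below one in a hundred thousand, which is one motivation for combining cell-level results with smear-level analysis.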
Pap smear collection
Pap smears were obtained from women attending the early cancer detection clinic and cancer detection camps of RCC. Cervical scrapes were obtained using a cervicobrush, and the cells were preserved in the vials provided with the SurePath Liquid Based Cytology (LBC) system. A separate scrape of cells was obtained for the Mega-Funnel Technique (MFT) from a selected group of women whose consent was taken in advance. The samples were processed in the SurePath system according to the manufacturer's instructions and by MFT as described below.
Mega-funnel specimen preparation technique
Each cell sample, in 10 mL of preservative solution containing 50% alcohol, glacial acetic acid and a mucolytic agent, was homogenized in a vortex for 30 seconds, followed by centrifugation at 2000 RPM for 5 minutes. The cell pellet was mixed well with 1 mL of preservative solution, and 200-300 μL of the sample was then cyto-centrifuged onto a coated slide using a mega-funnel. The smears were fixed in 95% alcohol for 15-30 minutes and stained using the classical Pap staining method, producing a specimen of dimension 22 mm×15 mm. A total of 60 MFT slides were prepared and compared against a commercial LBC system; the MFT slides produced images of quality comparable to that of the commercial LBC system. The gross appearance of slides and magnified views of smears prepared using the different preparation techniques are shown in Figure 2.
Field of view selection
Specimens prepared on glass slides were magnified through a 40X lens to accurately quantify nuclear chromatin distribution, which resulted in an average of 2000 FOVs being needed to cover the whole specimen. Data acquisition in this work was manual, and as it was impractical to cover the whole specimen with manual repositioning between FOVs, only interesting FOVs were selected from each specimen. The image data was acquired by a person skilled enough to operate a microscope who, after scanning the entire specimen, selected 40 FOVs based on the relative density of stain and nuclear enlargement. Each FOV was optimally focused manually before acquisition.
Each FOV selected and focused manually was digitized using an image acquisition utility, e-Smear, developed by our team (Government of India, Copyright Registration No. SW-6416/2013), which controls camera parameters, digitizes FOVs, logs patient clinical details, creates a systematically organized repository of Pap smears, records image annotations and generates statistical reports. The microscope used was a Leica DM2500 with a plan apochromat objective of magnification 40X and numerical aperture 0.65. The camera used in the digital microscope was a Leica DFC495 producing RGB images with a spatial resolution of 3264×2448 pixels and a sensor pixel size of 2.7 μm. The whole CMOS sensor of the DFC495 has a physical dimension of 8.81 mm×6.61 mm. To capture the maximum possible area in a FOV, a demagnifier of 0.63x magnification was also used, resulting in an effective pixel size of approximately 0.1 μm. The workstations which host e-Smear and the slide analysis software were quad-core Dell desktops with 4 GB of RAM running a 32-bit operating system.
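The effective pixel size quoted above follows from the sensor pixel size divided by the total optical magnification. The short sketch below reproduces that arithmetic from the stated camera and objective parameters and also derives the specimen area covered by one FOV; the function and constant names are ours and are introduced only for illustration.

```python
# Acquisition parameters as stated in the text.
SENSOR_PIXEL_UM = 2.7          # sensor pixel pitch in micrometres
OBJECTIVE_MAG = 40.0           # objective magnification
DEMAGNIFIER = 0.63             # demagnifier in the camera path
IMAGE_WIDTH_PX, IMAGE_HEIGHT_PX = 3264, 2448


def effective_pixel_size_um(sensor_pixel_um: float, objective_mag: float,
                            relay_mag: float) -> float:
    """Size of one image pixel projected back onto the specimen plane."""
    return sensor_pixel_um / (objective_mag * relay_mag)


def fov_size_um(width_px: int, height_px: int, pixel_um: float) -> tuple:
    """Width and height of one field of view on the specimen, in micrometres."""
    return width_px * pixel_um, height_px * pixel_um


pixel_um = effective_pixel_size_um(SENSOR_PIXEL_UM, OBJECTIVE_MAG, DEMAGNIFIER)
fov_w_um, fov_h_um = fov_size_um(IMAGE_WIDTH_PX, IMAGE_HEIGHT_PX, pixel_um)
print(f"effective pixel size: {pixel_um:.3f} um")
print(f"field of view: {fov_w_um:.1f} um x {fov_h_um:.1f} um")
```

The same pixel pitch multiplied by the pixel counts gives the stated sensor dimensions (3264 × 2.7 μm ≈ 8.81 mm and 2448 × 2.7 μm ≈ 6.61 mm), which is a useful consistency check on the quoted parameters.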
Pap Image analysis
The images acquired from e-Smear were transferred to an image processing station where each image undergoes a series of processing and analysis steps to finally classify the specimen as either normal or suspicious. A flow chart of the Pap image analysis is shown in Figure 3.
Preprocessing and segmentation
A Laplacian of Gaussian (LoG) filter was used for detecting objects in the Pap smear image. The Laplacian operator applied to the image highlights regions of rapid intensity change and was used for edge detection. In order to reduce its sensitivity to noise, the Laplacian operator was applied to an image that had first been smoothed by a Gaussian filter. Red blood cells (RBCs) were removed using color information from the true color RGB input image.
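As an illustration of the LoG step, the following minimal NumPy sketch builds a sampled LoG kernel and applies it to a grayscale image. It is not the implementation used in the system: the kernel radius, the zero-mean correction, the edge padding and the threshold-based `detect_dark_blobs` helper are all assumptions made for this example. Dark objects such as stained nuclei on a bright background give a positive response at their centre.

```python
import numpy as np


def log_kernel(sigma: float) -> np.ndarray:
    """Sampled Laplacian-of-Gaussian kernel, truncated at 3*sigma (assumed)."""
    radius = int(np.ceil(3 * sigma))
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx ** 2 + yy ** 2
    kernel = (r2 - 2 * sigma ** 2) / sigma ** 4 * np.exp(-r2 / (2 * sigma ** 2))
    # Force a zero sum so that flat background regions give zero response.
    return kernel - kernel.mean()


def log_filter(gray: np.ndarray, sigma: float) -> np.ndarray:
    """Filter a 2-D grayscale image with the LoG kernel (same output size)."""
    kernel = log_kernel(sigma)
    radius = kernel.shape[0] // 2
    padded = np.pad(gray.astype(float), radius, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, kernel.shape)
    # The kernel is symmetric, so correlation and convolution coincide.
    return np.einsum("ijkl,kl->ij", windows, kernel)


def detect_dark_blobs(gray: np.ndarray, sigma: float, threshold: float) -> np.ndarray:
    """Boolean mask of pixels whose LoG response exceeds a chosen threshold."""
    return log_filter(gray, sigma) > threshold
```

In practice `sigma` would be matched to the expected nuclear radius in pixels, and the response would be followed by the color-based RBC removal mentioned above, which this sketch does not attempt.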
Feature extraction and ranking
The heart of the quantification and automation task is to determine what is to be measured and how it should be measured. Over the past 50 years of quantitative cytometry quite a large set of features has been tried and tested for various applications. Around 40 mathematical features which can accurately describe the morphology, texture and densitometry of cervical epithelial cells were identified heuristically. All the identified features were ranked using histogram analysis and the Mahalanobis maximization function, which is the ratio of the difference in means to the sum of the standard deviations of normal and abnormal cells.
A hierarchical multi-stage classification approach was followed for separating normal smears from suspicious smears. In the first stage, artifacts, microbes and other debris were separated from epithelial cells [36,37]. The epithelial cells were then analyzed using a set of mathematical features to distinguish suspicious cells from the rest. Apart from the cell-level classification, cell clusters were detected for careful scrutiny, and significant diagnostic information was gathered from the counts of neutrophils and koilocytes. Finally, the cell distribution of the whole specimen was analyzed for deviation from the normal cell distribution. The final classification decision was made by a specimen-level classifier taking input from the cell-level and slide-level classifiers. A flow chart is shown in Figure 4.
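The ranking criterion described above, the difference in class means divided by the sum of the class standard deviations, can be sketched as follows. The use of the absolute difference and of the sample standard deviation are our assumptions, as are the feature names in the usage note; none of them are specified in the text.

```python
from statistics import mean, stdev


def separation_score(normal_values, abnormal_values) -> float:
    """Difference in class means divided by the sum of class standard deviations."""
    difference = abs(mean(normal_values) - mean(abnormal_values))
    return difference / (stdev(normal_values) + stdev(abnormal_values))


def rank_features(normal: dict, abnormal: dict) -> list:
    """Return (feature name, score) pairs, best-separating feature first.

    `normal` and `abnormal` map each feature name to the list of values
    measured on normal and abnormal cells respectively.
    """
    scores = {name: separation_score(normal[name], abnormal[name]) for name in normal}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

For example, with hypothetical measurements in which nuclear area differs strongly between the two classes while roundness overlaps, `rank_features` would place the nuclear area feature first; a larger score means the two class histograms overlap less.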
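The staged structure described above can be summarized in a small sketch. It only mirrors the order of the stages: the `DetectedObject` fields, the score threshold, the minimum count of suspicious cells and the way the slide-level flag is combined are hypothetical placeholders, not the decision rules of the actual classifiers.

```python
from dataclasses import dataclass


@dataclass
class DetectedObject:
    is_epithelial: bool        # stage 1 result: False for artifacts, microbes, debris
    abnormality_score: float   # stage 2 result: hypothetical cell-level score in [0, 1]


def classify_specimen(objects, cell_threshold=0.5, min_suspicious_cells=3,
                      slide_level_suspicious=False) -> str:
    """Return 'suspicious' or 'normal' for one specimen (illustrative rule only)."""
    # Stage 1: keep epithelial cells, discard everything else.
    epithelial = [obj for obj in objects if obj.is_epithelial]
    # Stage 2: cell-level decision on the remaining cells.
    suspicious_cells = [obj for obj in epithelial
                        if obj.abnormality_score >= cell_threshold]
    # Stage 3: specimen-level decision combining cell-level and slide-level inputs.
    if slide_level_suspicious or len(suspicious_cells) >= min_suspicious_cells:
        return "suspicious"
    return "normal"
```

Requiring more than one suspicious cell before flagging a specimen is one simple way to limit the effect of isolated cell-level false positives; the real system additionally uses cluster detection, neutrophil and koilocyte counts and the specimen cell distribution, which are represented here only by the single `slide_level_suspicious` flag.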
The preprocessing, segmentation, feature extraction and classification modules were integrated with a Graphical User Interface (GUI) application called CerviSCAN (Government of India, Copyright Registration No. SW-7352/2013).
Ground truth collection
A Cell Marker utility (Government of India, Copyright Registration No. SW-7458/2013) was developed and used by a team of experienced cytotechnologists to obtain ground truth. The Cell Marker is a GUI application used for ground truth generation, visualization of segmentation results, feature extraction, training set creation and visualization of classification results. A total of 15,708 malignant cells were hand marked by cytotechnologists, and close to 300,000 normal cells from normal smears, verified by cytotechnologists, were auto marked using the Cell Marker. A set of 3092 cells, comprising 2935 normal cells and 157 abnormal cells of all grades, was used to train the classification algorithm. The study protocol was approved by the Human Ethics Committee (HEC) of RCC, Thiruvananthapuram (HEC No. 22/2009). The evaluation protocol is elaborated in Figure 5.
All the Pap smears used for the system validation were obtained from women attending the Early Cancer Detection Centres (ECDC) of RCC from different places in Kerala like Karuanagapally, Ernakulam and Palakkad apart from routine examination in RCC, Thiruvananthapuram. The smears were collected after obtaining informed consent as per the recommendation of the HEC of RCC. The semi-automated system for screening of cervical cancer was used in RCC, Division of Cancer Research, since March 2011.
1107 Pap smears were used for the validation. Each slide in the study was manually screened by cytologists with over 25 years of experience and the ground truth was recorded. Smears were also analyzed in parallel by the automated system using image processing methods. Manual cytology was considered as the gold standard for benchmarking the efficacy of the automated analysis. All abnormal smears were biopsy proven. Table 1 describes distribution of slides used for validation.
| Smear category | Slide count |
| Negative for Intraepithelial Lesion or Malignancy (NILM) | 934 |
| Atypical squamous cells of undetermined significance (ASC-US) | 43 |
| Low grade squamous intraepithelial lesion (LSIL) | 22 |
| Atypical squamous cells – cannot exclude HSIL (ASC-H) | 9 |
| High grade squamous intraepithelial lesion (HSIL) | 60 |
| Squamous cell carcinoma (SCC) | 39 |
| Total number of slides | 1107 |
Table 1: Category wise slide count.
The number of smears correctly classified and misclassified is described in Table 2. True positives are those abnormal smears which are classified as suspicious and sent for cytologist’s review by automated analysis. True negatives are normal smears which were classified as normal and require no further human intervention. False positives and negatives are misclassified smears. Not processed smears are smears which were rejected from automated analysis either because of poor image quality or insufficient number of image fields.
Table 2: Classification Statistics.
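The sensitivity and specificity figures reported in this paper follow from the four classification outcomes defined above. The helper functions below state those definitions explicitly; the counts in the usage note are hypothetical and are not the study's data.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of abnormal smears that were classified as suspicious."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """Fraction of normal smears that were classified as normal."""
    return true_negatives / (true_negatives + false_positives)
```

With hypothetical counts of 80 true positives and 20 false negatives, `sensitivity(80, 20)` evaluates to 0.8; with 60 true negatives and 40 false positives, `specificity(60, 40)` evaluates to 0.6. Smears that were not processed are excluded from both ratios.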
The system screened out 60% of the normal smears, which need no further human review, and classified 80% of the abnormal cases as suspicious, which need further expert human review, as shown in Table 3. A detailed analysis of accuracy for normal smears and for the different precursors of cervical cancer is given in Table 4.
Table 3: Accuracy of automated analysis.
Comparison with commercial systems
In a randomized controlled trial by Kitchener et al., automation-assisted and manual cervical screening were extensively studied and compared. The automation-assisted systems used in the trial were the Becton Dickinson (BD) FocalPoint Slide Profiler (Becton Dickinson, Franklin Lakes, NJ, USA) and the ThinPrep Imaging System (Hologic, Bedford, MA, USA). The primary outcome of the trial was the sensitivity of automation-assisted reading relative to manual reading, which is detailed in Table 5. As is evident from comparing Tables 4 and 5, the system described in this paper produced sensitivity comparable to, or in the case of HSIL even better than, that of the existing commercial systems. Slides classified as No Further Review (NFR) by the commercial automated system amounted to 22%, roughly one third of the 60% of normal smears screened out by our system.
Table 4: Accuracy on different Specimen category.
Table 5: Results of field trial by Kitchener et al.
The automated image analyzer described in this article effectively screens out 60% of normal cases (similar to the NFR category of the commercial system), which require no further human intervention. It can thus reduce the workload of cytologists by up to 60% and extend screening to many more women when deployed for population screening. Furthermore, as seen in Table 4, high grade lesions such as HSIL and SCC are detected with higher sensitivity, 93% and 95% respectively, which largely limits the false negatives to less severe cases such as LSIL. As the disease takes a decade to progress to carcinoma in situ, a systematic implementation of routine screening will considerably reduce the chance of the disease going undetected.
This work has demonstrated a system capable of detecting early pre-malignant changes of cervical smears with acceptable classification performance. However, the operation of the system needs to be made more time-efficient before large scale deployment. We here outline some of the considerations that will be taken into account for that work.
Motorized microscope: The existing system was designed in a semi-automated fashion, where a semi-skilled person operates the microscope, positions the stage, focuses and acquires the images, while the analysis is taken care of by the image analysis platform. A more sophisticated approach is full automation of slide loading and scanning, where a robotic arm transfers each slide from the slide tray to a scanning space in which stepper or piezo controlled motors move the slide in the XY direction for FOV hopping. Image focus on each FOV is controlled by moving either the stage or the objective in the Z direction. Throughput of the system depends to a large extent on the speed of image acquisition, which requires motorized mechanical movement. However, as the automated platform can work 24/7, excluding time for routine maintenance, human throughput can very well be exceeded. In the field trial by Kitchener et al., adoption of an automation-assisted system resulted in a productivity increase of 60%-80%.
Malignancy associated changes: An alternative to an exhaustive scan of the complete specimen is analysis of the field effect, or malignancy associated changes (MAC) [41,42], which refers to subtle changes in normal cells present in malignant smears. These discoveries were confirmed in the early research on automated cervical screening [43,44]. If the MAC approach is adopted, only a small subset of cells from each smear needs to be analyzed instead of scanning the complete smear. For MAC analysis it is essential to analyze the chromatin pattern in great detail. Highly accurate artefact removal with perfect focus is a prerequisite for convincingly demonstrating that MAC alone can detect early premalignant changes with sufficient sensitivity.
Field trials: The system needs to undergo an extensive independent evaluation on around 10,000 smears.
Image analysis throughput: Image analysis throughput can be improved by porting CPU intensive operations to graphics processing units (GPU) where hundreds of dedicated highly parallel cores make the system more efficient with only a marginal increase in cost.
Building a cost effective system: The goal of our project is to demonstrate that a cost effective system for Pap smear screening can be built. Based on the experience gained from our study we have made a rough estimate of the component costs for a final automated system. It can be based on a standard microscope with minimal modification for integrating a digital camera, quality optics, a motorized stage and illumination. Furthermore, commercial XY motorized stages are now available with a travel range sufficient to load multiple slides, in which case human intervention is required perhaps every 2 hours or so, just to reload the slide tray. Such cost optimized systems, which avoid the need for expensive embellishments, can very well be implemented for under $34,000; Table 6 lists approximate component-wise costs. From an economic angle, if such a system is able to screen a moderate target of 20,000 women per year, the screening cost for each smear can be reduced to under $2 from $7, which is the current cost of slide screening in India. This saving amounts to $100,000 per year, or alternatively the possibility of offering screening to many more women.
| Item Description | Cost in kUSD* |
| Motorized XY stage | 8.5 |
| Z focusing system | 4 |
| Workstation with GPU card | 4 |
*Costs mentioned are indicative figures obtained from websites.
Table 6: Component-wise cost of commercially available products.
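The savings estimate in the cost discussion can be checked with a few lines of arithmetic. The sketch below uses the figures given in the text (20,000 smears per year, $7 manual cost, $2 automated cost, $34,000 system cost); the payback calculation is our addition and assumes, for simplicity, that running costs such as staff, consumables and maintenance are already included in the per-smear cost.

```python
def annual_savings_usd(smears_per_year: int, manual_cost_usd: float,
                       automated_cost_usd: float) -> float:
    """Yearly saving from replacing manual screening with automated screening."""
    return smears_per_year * (manual_cost_usd - automated_cost_usd)


def payback_years(system_cost_usd: float, savings_per_year_usd: float) -> float:
    """Time needed for the savings to cover the one-off system cost."""
    return system_cost_usd / savings_per_year_usd


savings = annual_savings_usd(20_000, 7.0, 2.0)
print(f"annual savings: ${savings:,.0f}")
print(f"payback period: {payback_years(34_000, savings):.2f} years")
```

Since the text quotes the automated cost as "under $2" and the system cost as "under $34,000", the computed saving is a lower bound and the payback period an upper bound under the stated assumptions.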
To better serve dispersed populations in low resource settings, a model with a Centralized Smear Analysis Station (CSAS) and multiple Satellite Smear Collection Centres (SSCC) is suggested. The CSAS should have all the equipment mentioned in Table 6, while an SSCC needs only person(s) collecting cervical smears in a buffer, which will be transferred to the CSAS. The number of CSAS and SSCC can be decided based on the population to be screened and on resource availability. The analysis station contains desktop-grade computers, which have now evolved into stable products requiring very little maintenance; if support is required at all, it will be readily available. The same applies to the microscope and its accessories. The microscope XY stage will be the only component requiring occasional maintenance due to wear and tear caused by heavy duty slide scanning, which is easily addressed at a centralized location.
The work was funded by the Department of Electronics and Information Technology, Ministry of Communication and Information Technology, Government of India. The work also received financial support from the Swedish Research Council under the Swedish Research Links Programme. The authors wish to thank Dr. N. SreedeviAmma, Rtd. Professor and Former Head of the Department of Pathology and Additional Director of the Regional Cancer Centre (RCC), Thiruvananthapuram, and Dr. B. Chandralekha, Consultant Pathologist and Former Head of Pathology, RCC, for critically reviewing the work.