TY - JOUR
T1 - Machine learning to reveal an astute risk predictive framework for Gynecologic Cancer and its impact on women psychology: Bangladeshi perspective
AU - Asaduzzaman, Sayed
AU - Ahmed, Md Raihan
AU - Rehana, Hasin
AU - Chakraborty, Setu
AU - Islam, Md Shariful
AU - Bhuiyan, Touhid
PY - 2021/4/24
Y1 - 2021/4/24
N2 - Background: In this research, an astute system was developed using machine learning and data mining approaches to predict the risk level of cervical and ovarian cancer in association with stress. Results: To identify the contributing factors and subfactors, several machine learning models (Logistic Regression, Random Forest, AdaBoost, Naïve Bayes, Neural Network, kNN, CN2 Rule Inducer, Decision Tree, and Quadratic Classifier) were compared using standard metrics such as F1, AUC, and classification accuracy (CA). For certainty, information gain, gain ratio, and Gini index were reported for both cervical and ovarian cancer. Attributes were ranked using different feature selection evaluators, and further analysis was then carried out on the most significant factors. Factors such as children, age at first intercourse, age of husband, Pap test, and age are the most significant factors for cervical cancer. For ovarian cancer, genital area infection, pregnancy problems, use of drugs, abortion, and the number of children are important factors. Conclusion: The resulting factors were merged, categorized, and weighted according to their significance level. The categorized factors were indexed using a ranker algorithm, which assigns each of them a weight. An algorithm was then formulated that can be used to predict the risk level of cervical and ovarian cancer in relation to women's mental health. The research is expected to have a substantial impact in low-income countries like Bangladesh, where most women are unaware of these cancers. As these two can be described as the most sensitive cancers for women, developing an application from the algorithm may also help to reduce women's mental stress. More data and parameters will be added in future research.
KW - Algorithms
KW - Bayes Theorem
KW - Child
KW - Female
KW - Humans
KW - Logistic Models
KW - Machine Learning
KW - Neoplasms
KW - Neural Networks, Computer
KW - Pregnancy
DO - 10.1186/s12859-021-04131-6
M3 - Article
C2 - 33894739
SN - 1471-2105
VL - 22
SP - 213
JO - BMC Bioinformatics
JF - BMC Bioinformatics
IS - 1
ER -