Machine Learning in Healthcare
- ali@fuzzywireless.com
- Mar 4, 2022
Bauder and Khoshgoftaar (2018) presented a machine learning framework for detecting fraud in medical claims filed in the US. They used publicly available medical claims data from the Centers for Medicare and Medicaid Services covering 2012 through 2015. Fraud labels came from a second source: the Office of Inspector General's list of excluded individuals and entities, who are barred from the Medicare program because of past fraudulent activity. It is important to note that although these individuals and entities are not allowed to take part in the Medicare program, some continued to practice and were not suspended despite convictions. The objective of the study is to flag fraud and help save money spent through Medicare and Medicaid (Bauder & Khoshgoftaar, 2018).
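The labelling step described above, marking a claim as fraudulent when its provider appears on the exclusion list, might look roughly like the sketch below. It assumes both datasets share a provider identifier; the summary does not say which key the authors matched on, and the function name and sample identifiers are hypothetical.

```python
def label_providers(claim_provider_ids, excluded_ids):
    """Return 1 (fraud) for each claim whose provider is on the exclusion list, else 0.

    Both arguments are iterables of provider identifiers. The shared
    identifier is an assumption made for illustration only.
    """
    excluded = set(excluded_ids)  # set gives constant-time membership checks
    return [1 if pid in excluded else 0 for pid in claim_provider_ids]


# Hypothetical identifiers, not real data.
claims = ["A1", "B2", "C3", "B2"]
exclusion_list = ["B2", "Z9"]
labels = label_providers(claims, exclusion_list)
```

A label built this way only marks known exclusions, so any provider not on the list is treated as non-fraudulent by default.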
A supervised machine learning algorithm, random forest, was used to detect fraud in the medical claims (Bauder & Khoshgoftaar, 2018). Because fraudulent claims are far fewer than non-fraudulent ones, Bauder and Khoshgoftaar (2018) sampled the data at several class distributions (99.9:0.1, 99:1, 95:5, 90:10, 75:25, 65:35 and 50:50) to improve the predictions. Recall and false positive rate were measured to gauge the effectiveness of each, and analysis of variance (ANOVA) was then used to test statistical significance. The 90:10 distribution gave the best predictions, followed by 95:5, 75:25, 65:35, 50:50, 99:1 and 99.9:0.1.
Several supervised machine learning algorithms (Gradient Boosted Machine, Random Forest, Deep Neural Network, and Naïve Bayes) and unsupervised machine learning algorithms (autoencoder, Mahalanobis distance, KNN, and LOF) were also compared to identify the best algorithm for fraud detection in medical claims. The authors also found that some types of medical providers are harder to learn than others, because the number of procedures varies across provider types and adds complexity (Bauder & Khoshgoftaar, 2018).
Kareem, Ahmad and Sarlan (2017) also presented a machine learning framework for identifying fraudulent health claims. A Support Vector Machine (SVM), a supervised machine learning algorithm, was used to classify claims as fraudulent or non-fraudulent using association rule mining.
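The class-distribution sampling and the two metrics reported for the Bauder and Khoshgoftaar study can be illustrated with a small sketch. It assumes the ratios were reached by randomly under-sampling the non-fraudulent class while keeping every fraudulent case, which the summary above does not state explicitly. The function names and toy labels are hypothetical, and no classifier is trained here.

```python
import random


def undersample(labels, majority_share, seed=0):
    """Return indices keeping all fraud cases (label 1) plus a random subset
    of non-fraud cases (label 0) so that non-fraud makes up roughly
    `majority_share` of the result (e.g. 0.90 for a 90:10 distribution)."""
    if not 0 < majority_share < 1:
        raise ValueError("majority_share must be strictly between 0 and 1")
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    # Solve n_neg / (n_neg + n_pos) = majority_share for n_neg.
    n_neg = round(len(pos) * majority_share / (1 - majority_share))
    n_neg = min(n_neg, len(neg))  # cannot keep more negatives than exist
    rng = random.Random(seed)
    return sorted(pos + rng.sample(neg, n_neg))


def recall_and_fpr(y_true, y_pred):
    """Recall = TP / (TP + FN); false positive rate = FP / (FP + TN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return recall, fpr


# Toy data: 10 fraud cases among 1,000 claims (made up for illustration).
toy_labels = [1] * 10 + [0] * 990
kept = undersample(toy_labels, majority_share=0.90)
recall, fpr = recall_and_fpr([1, 1, 0, 0, 0], [1, 0, 1, 0, 0])
```

In practice the sampled indices would select the training rows for the classifier, and recall and false positive rate would be computed on held-out predictions for each distribution before comparing them.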
References:
Kareem, S., Ahmad, R. & Sarlan, A. (2017). Framework for the identification of fraudulent health insurance claims using association rule mining. 2017 IEEE Conference on Big Data and Analytics.
Bauder, R. & Khoshgoftaar, T. (2017). Medicare fraud detection using machine learning methods. 2017 16th IEEE International Conference on Machine Learning and Applications.