Medical Diagnostics using Big Data Analytics
- ali@fuzzywireless.com
- Mar 4, 2022
Sowmya and Sravanthi (2017) define big data as data that is so large that it cannot be processed efficiently using traditional data processing approaches. Hadoop, an open-source framework, is one of the most widely used tools for big data, among several others. Key attributes of big data are volume, value, variety, velocity and veracity (Marr, 2014).
Biomedical data is regarded as one of the leading examples of big data due to the sheer volume of medical health records, diagnoses, patient monitoring data, etc. (). A huge volume of data is generated by biomedical imaging techniques such as computed tomography (CT), magnetic resonance imaging (MRI), diffusion tensor imaging (DTI), single photon emission computed tomography (SPECT), functional magnetic resonance imaging (fMRI) and so on. Similarly, biomedical signal data from electroencephalography (EEG) and electrocardiography (ECG) is used to understand brain and cardiovascular activity, respectively. Patient health records, in the form of symptoms, lab results, diagnoses, etc., are also a large-volume dataset that can be used for tracking and preventing the spread of diseases. Lastly, biomedical genome data is an enormous dataset, at 100 gigabytes for just one human genome, and helps in identifying the individual characteristics of people ().
Nair and Ganesh (2016) processed MRI data and concluded that traditional processing techniques are not efficient for such time-critical datasets. Lee, Yao, Shrestha, Gullberg and Seo (2014) used GraphX on Spark to process medical imaging data, implementing maximum likelihood expectation maximization and reducing processing time significantly. Neshatpour, Koohi, Farahmand, Joshi, Rafatirad, Sasan and Homayoun (2016) used the Hadoop streaming environment to implement MapReduce with K-Means and Laplacian filtering algorithms for processing CT scans, X-ray images, etc., which sped up processing compared with traditional approaches. Wee and Zahid (2015) processed a large ECG dataset using MapReduce techniques in a cloud environment to reduce the processing time. Jatmiko et al. (2016) also highlighted that Hadoop, with its distributed file system, is a preferred platform for batch processing, whereas Storm is better suited to real-time streaming data.
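To make the MapReduce formulation of K-Means more concrete, here is a minimal single-process sketch in plain Python. It is not the implementation from Neshatpour et al. (2016) and does not use Hadoop; the function names (map_phase, reduce_phase, kmeans) and the toy pixel intensities are assumptions made purely for illustration. The map step assigns each pixel to its nearest centroid, and the reduce step averages the pixels assigned to each centroid.

```python
from collections import defaultdict


def map_phase(pixels, centroids):
    """Map: emit (index of nearest centroid, (intensity, 1)) for each pixel."""
    for value in pixels:
        nearest = min(range(len(centroids)),
                      key=lambda i: abs(value - centroids[i]))
        yield nearest, (value, 1)


def reduce_phase(pairs, centroids):
    """Reduce: sum intensities and counts per centroid, then average."""
    totals = defaultdict(lambda: [0.0, 0])
    for key, (value, count) in pairs:
        totals[key][0] += value
        totals[key][1] += count
    # A centroid with no assigned pixels keeps its previous position.
    return [totals[i][0] / totals[i][1] if totals[i][1] else centroids[i]
            for i in range(len(centroids))]


def kmeans(pixels, centroids, iterations=10):
    """Run a fixed number of map/reduce rounds (hypothetical driver)."""
    for _ in range(iterations):
        centroids = reduce_phase(map_phase(pixels, centroids), centroids)
    return centroids


# Toy grayscale intensities: a dark region and a bright region.
pixels = [10, 12, 11, 200, 205, 198]
print(kmeans(pixels, [0.0, 255.0]))
```

In a real Hadoop streaming job the map and reduce steps would run as separate processes over partitions of the image data, which is where the reported speed-up comes from; the sketch only shows how the algorithm splits into the two phases.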
In summary, big data analytics tools can substantially reduce the processing time of medical data, which ultimately supports quicker diagnosis. Similarly, analysis of patient health records and data from wearable devices can help identify health issues earlier than would otherwise be possible.
References:
Wee, K., & Zahid, S. (2015). Cloud computing for ECG analysis using MapReduce. 2015 4th International Conference on Advanced Computer Science Applications and Technologies, 115-120.
Lee, J., Yao, Y., Shrestha, U., Gullberg, G., & Seo, Y. (2014). Handling big data in medical imaging: Iterative reconstruction with large scale automated parallel computation. 2014 IEEE Nuclear Science Symposium and Medical Imaging Conference, 1-4.
Neshatpour, K., Koohi, A., Farahmand, F., Joshi, R., Rafatirad, S., Sasan, A., & Homayoun, H. (2016). Big biomedical image processing hardware acceleration: A case study for k-means and image filtering. 2016 IEEE International Symposium on Circuits and Systems, 1134-1137.
Nair, S., & Ganesh, N. (2016). An exploratory study on big data processing: A case study from a biomedical informatics. 2016 3rd MEC International Conference on Big Data and Smart City, 1-4.
Jatmiko, W., Arsa, D., Wisesa, H., Jati, G., & Ma'sum, M. (2016). A review of big data analytics in the biomedical field. 2016 International Workshop on Big Data and Information Security.
Sowmya, M., & Sravanthi, N. (2017). Big data: An overview of features, tools, techniques and applications. International Journal of Engineering Science and Computing, 7(6), 13644-13647.
Marr, B. (2014). Big data: The 5 Vs everyone must know. Retrieved from https://www.linkedin.com/pulse/20140306073407-64875646-big-data-the-5-vs-everyone-must-know