
Data Analytics Lifecycle

  • ali@fuzzywireless.com
  • Mar 4, 2022
  • 3 min read

The data analytics life cycle is divided into six major phases: discovery, data preparation, model planning, model building, communication, and operations (EMC Education Services, 2015). Movement between the phases can be either forward or backward. In the discovery phase, the problem and the available data sources are investigated and analyzed in the right context. Key stakeholders are identified in this phase to set the success criteria and identify the associated risks. Rules are also put in place for communicating with stakeholders, so that any change of course can be approved without delay. The focus and scope of the project are discussed in terms of time, resources, risks, people, and so on. An initial hypothesis is developed in this phase, along with the identification of data sources (2015).


In the data preparation phase, data is extracted, transformed, and loaded into an environment for further processing (EMC Education Services, 2015). The quality of the data depends heavily on how it is handled and preprocessed in this phase. Missing or incomplete data is conditioned here. Hadoop (Apache Hadoop, 2018) and other similarly capable tools can be used in the data preparation phase.
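As a rough illustration of what conditioning missing or incomplete data can look like, here is a minimal Python sketch. The records, the field names (`id`, `value`), and the choice of mean imputation are all assumptions made for this example; they are not taken from the book, and a real project would pick a strategy that suits its data.

```python
from statistics import mean

# Hypothetical raw records; None marks a missing entry.
raw_records = [
    {"id": "A", "value": 42.0},
    {"id": "B", "value": None},
    {"id": "C", "value": 38.0},
    {"id": None, "value": 40.0},
]


def condition(records, key_field, value_field):
    """Drop records without a key, then fill missing values with the mean."""
    # A record that cannot be identified is treated as unusable here.
    keyed = [r for r in records if r[key_field] is not None]
    # The mean is computed only from values that were actually observed.
    observed = [r[value_field] for r in keyed if r[value_field] is not None]
    fill = mean(observed)
    return [
        {**r, value_field: fill if r[value_field] is None else r[value_field]}
        for r in keyed
    ]


clean_records = condition(raw_records, "id", "value")
for record in clean_records:
    print(record)
```

Dropping and imputing are only two of several possible choices, and either one can change the final outcome, which is one reason decisions made in this phase are worth recording.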


After the data preparation phase, model planning is performed by identifying the variables and their relationships in the context of the initial hypothesis (EMC Education Services, 2015). Analytical models such as linear regression, logistic regression, decision trees, and neural networks are selected based on the data. Popular tools in this phase include R and Python (R, 2018; Python, 2018).


In the model building phase, the analytical model is trained and tested on the dataset to determine its accuracy (EMC Education Services, 2015). The model can be adjusted in this phase by varying its parameter values. If the selected model does not perform well, the team can go back to the previous phase and select another model that is appropriate for the business case. Common tools used in the model building phase are R, Python, MATLAB, etc. (2015).
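The train, adjust, and test loop described above can be sketched in a few lines of Python. This is a toy example under stated assumptions: the data is made up, and the "model" is a hypothetical one-parameter threshold classifier chosen for brevity, not one of the models named in the book.

```python
# Hypothetical labelled data: (feature value, class label).
train = [(1.0, 0), (2.0, 0), (3.0, 0), (6.0, 1), (7.0, 1), (8.0, 1)]
test = [(2.5, 0), (4.0, 0), (5.5, 1), (9.0, 1)]


def accuracy(threshold, rows):
    """Fraction of rows whose label matches the prediction 'x >= threshold'."""
    hits = sum(1 for x, label in rows if int(x >= threshold) == label)
    return hits / len(rows)


def fit_threshold(rows, candidates):
    """Vary the single parameter and keep the value with the best accuracy."""
    return max(candidates, key=lambda t: accuracy(t, rows))


# Candidate parameter values to try (an assumption for this sketch).
candidates = [2.0, 4.5, 7.0]

best_threshold = fit_threshold(train, candidates)
test_accuracy = accuracy(best_threshold, test)

print("selected threshold:", best_threshold)
print("accuracy on held-out data:", test_accuracy)
```

The parameter is chosen on the training rows only and accuracy is reported on rows the model has not seen. If the held-out accuracy were poor, that would be the signal to return to the previous phase and choose a different model.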


In the communication phase, the results are interpreted to determine success or failure against the established criteria (EMC Education Services, 2015). Data is visualized using graphics, maps, charts, etc. to present the overall findings. Business intelligence tools such as Tableau can be used in this phase. The results are shared with cross-functional teams as well as stakeholders to establish the success or failure of the project (2015).
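Checking results against the criteria agreed in the discovery phase can be as simple as the following Python sketch. The metric names, targets, and measured values are hypothetical and exist only to show the comparison.

```python
# Hypothetical success criteria agreed with stakeholders during discovery.
criteria = {"accuracy": 0.90, "recall": 0.80}

# Hypothetical results measured after model building.
results = {"accuracy": 0.93, "recall": 0.76}


def evaluate(measured, targets):
    """Return a per-metric pass/fail map and an overall verdict."""
    report = {name: measured[name] >= target for name, target in targets.items()}
    return report, all(report.values())


report, project_succeeded = evaluate(results, criteria)

for name, passed in report.items():
    status = "met" if passed else "not met"
    print(f"{name}: {results[name]:.2f} vs target {criteria[name]:.2f} ({status})")
print("overall:", "success" if project_succeeded else "failure")
```

A plain summary like this can sit alongside the charts and maps, so stakeholders see exactly which criteria were met and which were not.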


Finally, in the operational phase, the project is deployed at a wider scale, based on success in the previous phases and acceptance from stakeholders (EMC Education Services, 2015). Implementation can be carried out in a controlled fashion to validate the results in the field before wider deployment. Technical specifications, code, and analytical slides are the usual outputs of this phase (2015).


Given this life cycle, stakeholders need to be involved across all phases. In the discovery phase, stakeholder involvement ensures that the scope, business objective, and success criteria are properly set. Data preparation, model planning, and model building are more technical phases, but the data may require changes that affect the final outcome, so major stakeholders should be informed of any changes and findings from these phases. The communication and operational phases both require formal acceptance from stakeholders to establish success before wider in-field deployment.


References:


Apache Hadoop (2018). Apache Hadoop. Retrieved from https://hadoop.apache.org/

EMC Education Services (2015). Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing, and Presenting Data. Hoboken, NJ: John Wiley & Sons, Inc.


Python (2018). Python. Retrieved from https://www.python.org/

R (2018). What is R? Retrieved from https://www.r-project.org/about.html
