Decision Trees
- ali@fuzzywireless.com
- Mar 3, 2022
Decision trees are non-parametric supervised learning algorithms used for classification and regression problems (Scikit-learn, 2018). A decision tree resembles an upside-down tree, with the root at the top and the leaves at the bottom (Gupta, 2017). It is used for classification when the target variable is discrete and for regression when the target variable is continuous.
As a classification example, suppose we want to know whether an umbrella is needed on a given day. The root of the tree can be the rain itself: is it raining or not? If it is not raining, no umbrella is needed. If it is raining, the next condition could be whether the person is going out. If the person is not going out, no umbrella is needed; if the person is going out, the next condition could be whether the destination is outdoors or indoors. If the destination is indoors, no umbrella is needed; if it is outdoors, the next condition could be whether it is windy. If it is windy, an umbrella would not be helpful; if it is not windy, an umbrella is needed.
A regression example, on the other hand, could be estimating the price of a home for sale based on conditions such as neighborhood, school rating, and so on (Gupta, 2017).
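The umbrella example can be written out as a hand-built tree, which may make the root-to-leaf structure easier to see. This is only a sketch: the function name `needs_umbrella` and its four boolean parameters are made up for illustration, and a real decision tree would learn these splits from data rather than have them coded by hand.

```python
def needs_umbrella(raining: bool, going_out: bool, outdoor: bool, windy: bool) -> bool:
    """Walk the umbrella decision tree from the root to a leaf.

    Hypothetical helper for illustration; each `if` is one internal node
    and each `return` is one leaf.
    """
    # Root node: is it raining?
    if not raining:
        return False  # leaf: no rain, no umbrella
    # Second level: is the person going out?
    if not going_out:
        return False  # leaf: staying in, no umbrella
    # Third level: is the destination outdoors?
    if not outdoor:
        return False  # leaf: indoor destination, no umbrella
    # Fourth level: an umbrella only helps when it is not windy.
    return not windy


if __name__ == "__main__":
    # Raining, going out, outdoor destination, no wind: umbrella needed.
    print(needs_umbrella(raining=True, going_out=True, outdoor=True, windy=False))
    # Same day but windy: the umbrella would not help.
    print(needs_umbrella(raining=True, going_out=True, outdoor=True, windy=True))
```

Only one of the five leaves answers "yes", which reflects how each condition along the path narrows the cases that remain.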
It is important to note that a tree can grow too large when the features of the data set lead to many splits, which results in a complex tree that overfits (Gupta, 2017). To improve performance, the tree is pruned by removing branches that split on features of low importance. A key advantage of decision trees is their visual simplicity: the algorithm is easy to understand and requires minimal data preparation. Another advantage is the ability to predict discrete as well as continuous variables. Relationships between the independent variables do not affect the algorithm's performance. On the other hand, decision trees are prone to overfitting, which makes results difficult to generalize. Some features can also bias the tree, so the data set requires careful balancing (Gupta, 2017).
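The effect of tree size on overfitting can be checked with a small experiment. The sketch below assumes scikit-learn is installed and uses a synthetic data set from `make_classification`; the sample sizes, the noise level, and the value `ccp_alpha=0.01` are arbitrary choices for illustration, not settings taken from the sources cited here. It uses scikit-learn's cost-complexity pruning, which is one pruning method among several.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, noisy data (illustrative assumption, not from the article).
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=3, flip_y=0.1, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Unrestricted tree: keeps splitting until the training leaves are pure.
full_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Pruned tree: ccp_alpha > 0 removes branches whose benefit is too small.
pruned_tree = DecisionTreeClassifier(ccp_alpha=0.01, random_state=0).fit(X_train, y_train)

for name, tree in [("full", full_tree), ("pruned", pruned_tree)]:
    print(
        f"{name}: leaves={tree.get_n_leaves()}, "
        f"train accuracy={tree.score(X_train, y_train):.2f}, "
        f"test accuracy={tree.score(X_test, y_test):.2f}"
    )
```

A large gap between training and test accuracy for the full tree is the usual sign of overfitting; whether pruning narrows that gap depends on the data and on the chosen `ccp_alpha`.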
References
Scikit-learn. (2018). Decision trees. Retrieved from https://scikit-learn.org/stable/modules/tree.html
Gupta, P. (2017). Decision trees in machine learning. Retrieved from https://towardsdatascience.com/decision-trees-in-machine-learning-641b9c4e8052
