In everyday life, analogies with trees are common. Trees, made of roots, trunks, branches, and leaves, often represent growth.

A decision tree is an algorithm used in machine learning to build classification and regression models.

The decision tree gets its name because, like an upside-down tree, it begins at a single root and branches out to show different outcomes.

Because machine learning is centred on solving problems, decision trees are valuable: they help us visualise a model's reasoning and adjust how we train it.

Here is what you need to know about decision trees in machine learning.

## Decision Tree: Definition

A decision tree is a graphical representation of a decision-making process. It is a flowchart-like tree structure where each internal node represents a feature (or attribute), each branch represents a decision rule, and each leaf node represents an outcome.

The topmost node in a decision tree is known as the root node. The tree learns to partition the data based on attribute values, splitting each resulting subset in turn in a process called recursive partitioning.

For example, a decision tree for determining whether or not to play golf might look like this:

- Root node: Is the weather sunny?
  - Yes: Play golf.
  - No: Don't play golf.

In this example, the root node tests whether the weather is sunny, and the two leaf nodes give the resulting decision about playing golf. Decision trees can be used for both classification and regression tasks.

Decision trees are a popular and effective machine learning algorithm because they are easy to understand and interpret and can handle both continuous and categorical data.

They are also relatively robust to missing data and are relatively fast to train and predict. However, they can be prone to overfitting if they are not pruned correctly.
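The golf example above can be sketched as a plain Python function: each `if` plays the role of an internal node, and each `return` is a leaf. The function name and labels are illustrative, not part of any library.

```python
def play_golf(weather: str) -> str:
    """Tiny decision tree: one root test, two leaves."""
    # Root node: Is the weather sunny?
    if weather == "sunny":
        return "Play golf"       # leaf on the "yes" branch
    return "Don't play golf"     # leaf on the "no" branch
```

In practice the tree structure is learned from data rather than hand-written, but the learned model evaluates in exactly this nested-conditional fashion.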

## Importance of Decision Trees in Machine Learning

Decision trees are a popular and effective machine learning algorithm because they have several advantages:

- Easy to understand and interpret: Decision trees are easy to understand and interpret because they can be visualised as a flowchart. This makes them a valuable tool for explaining predictions to non-technical stakeholders.
- Can handle both continuous and categorical data: Decision trees can handle both continuous and categorical data, which makes them a versatile tool for many machine learning tasks.
- Can handle missing data: Decision trees can handle missing data, for example by creating a separate branch for missing values.
- Fast to train and predict: Decision trees are relatively fast to train and predict, especially for small to medium-sized datasets.
- Robust to noise: Decision trees are relatively robust to noise and can handle outliers well.
- Can be used for both classification and regression tasks: Decision trees can be used for both classification and regression tasks.
- Can be used for feature selection: Decision trees can identify the essential features for a prediction task, which can be helpful for feature selection.

Despite these advantages, decision trees have some limitations. They can be prone to overfitting if they are not pruned properly, and their splits can be biased toward certain features, for example those with many distinct values.

However, these issues can be mitigated by pruning the tree and weighting the features appropriately.
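To make pruning concrete, here is a minimal sketch of one simple pruning rule: collapse an internal node whose two leaf children make the same prediction, since that split adds complexity without changing the output. Real libraries use more sophisticated criteria (such as cost-complexity pruning); the dict-based tree representation and all names here are illustrative assumptions.

```python
# Leaves are {"predict": label}; internal nodes are
# {"test": question, "yes": subtree, "no": subtree}.

def prune(node):
    """Bottom-up pruning: remove splits whose branches agree."""
    if "predict" in node:                 # leaf: nothing to prune
        return node
    node["yes"] = prune(node["yes"])      # prune subtrees first
    node["no"] = prune(node["no"])
    yes, no = node["yes"], node["no"]
    if "predict" in yes and "predict" in no and yes["predict"] == no["predict"]:
        return {"predict": yes["predict"]}   # redundant split: collapse to a leaf
    return node

tree = {
    "test": "x > 10",
    "yes": {"predict": "A"},
    "no": {"test": "y > 5",               # both branches agree, so this split is pruned
           "yes": {"predict": "B"},
           "no": {"predict": "B"}},
}
pruned = prune(tree)
```

After pruning, the redundant inner node is replaced by a single leaf predicting "B", while the informative root split is kept.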

## Decision tree terminology

Here are some common terms used in decision trees:

- Root node: The root node is the topmost node in a decision tree. It represents the entire population or sample, which further gets divided into two or more homogeneous sets.
- Splitting: Splitting is dividing a node into two or more sub-nodes.
- Decision node: A decision node is a node that has two or more branches. It represents a decision point in the tree.
- Leaf/terminal node: A leaf or terminal node is a node that has no branches. It represents the final decision or prediction made by the tree.
- Pruning: Pruning is the process of removing unnecessary nodes from the tree to improve its accuracy and reduce overfitting.
- Branch: A branch is a sub-section of the tree representing a possible outcome or decision.
- Parent node: A parent node is a node that has one or more child nodes.
- Child node: A child node is connected to a parent node and located below it in the tree.
- Depth: The depth of a node is the number of edges from the root node to the node.
- Level: The level of a node counts the layers from the root: the root sits on the first level, its children on the second, and so on.
- Impurity: Impurity measures how mixed the classes are in a node. A node is pure if all the samples belong to the same category.
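The impurity idea can be made concrete with the Gini index, one standard impurity measure: $1 - \sum_k p_k^2$, where $p_k$ is the fraction of samples in class $k$. A pure node scores 0; the helper below is a small illustration, not a library function.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a node: 1 - sum of squared class proportions."""
    n = len(labels)
    counts = Counter(labels)                      # samples per class
    return 1.0 - sum((c / n) ** 2 for c in counts.values())
```

A node containing only one class scores 0.0 (pure), while a 50/50 two-class split scores 0.5, the maximum for two classes. Splits are typically chosen to reduce this value as much as possible.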

## Decision trees in machine learning: Two types

Decision trees in machine learning come in two types: classification trees and regression trees. Together, these two kinds of algorithms make up the family of "classification and regression trees" (CART), a term occasionally used to refer to both.

The difference lies in what they predict: classification trees predict a class label, while regression trees predict a numeric value.

### Classification trees with two examples

Classification trees are used to predict a categorical response. For example, a classification tree might be used to predict whether or not a customer will churn.

Here is an example of a classification tree for predicting whether a customer will churn or not based on their monthly charges and contract type:

- Root node: Is the monthly charge greater than $70?
  - Yes: Predict "churn".
  - No: Go to the next node.
- Next node: Is the contract type "month-to-month"?
  - Yes: Predict "churn".
  - No: Predict "not churn".

In this example, the root node tests the monthly charge, the next node tests the contract type, and each leaf gives the churn prediction.
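Written out as code, the churn tree above is just nested conditionals; the function name, thresholds, and labels come straight from the example and are illustrative only.

```python
def predict_churn(monthly_charge: float, contract_type: str) -> str:
    """Classification tree from the churn example, as nested conditionals."""
    # Root node: Is the monthly charge greater than $70?
    if monthly_charge > 70:
        return "churn"
    # Next node: Is the contract type "month-to-month"?
    if contract_type == "month-to-month":
        return "churn"
    return "not churn"
```

Note that the output is always one of a fixed set of labels, which is exactly what makes this a classification tree rather than a regression tree.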

Here is another example of a classification tree for predicting whether a patient has diabetes or not based on their age, BMI, and blood pressure:

- Root node: Is the BMI greater than 30?
  - Yes: Go to the next node.
  - No: Predict "no diabetes".
- Next node: Is the blood pressure greater than 120/80?
  - Yes: Go to the next node.
  - No: Predict "no diabetes".
- Next node: Is the age greater than 40?
  - Yes: Predict "diabetes".
  - No: Predict "no diabetes".

In this example, the root node tests the BMI, the subsequent nodes test blood pressure and age, and each leaf gives the diabetes prediction.

### Regression trees with two examples

Regression trees are used to predict a continuous response. For example, a regression tree might be used to predict the price of a house based on its characteristics, such as size, location, and number of bedrooms.

Here is an example of a regression tree for predicting the price of a house based on its size and location:

- Root node: Is the size of the house greater than 2000 sq. ft.?
  - Yes: Go to the next node.
  - No: Predict a price of $200,000.
- Next node: Is the location in a high-income neighbourhood?
  - Yes: Predict a price of $500,000.
  - No: Predict a price of $300,000.

In this example, the root node tests the size of the house, the next node tests its location, and each leaf gives the predicted price.

Here is another example of a regression tree for predicting the fuel efficiency of a car based on its engine size and weight:

- Root node: Is the engine size greater than 2.5 litres?
  - Yes: Go to the next node.
  - No: Predict a fuel efficiency of 30 mpg.
- Next node: Is the car's weight greater than 3000 lbs?
  - Yes: Predict a fuel efficiency of 20 mpg.
  - No: Predict a fuel efficiency of 25 mpg.

In this example, the root node tests the engine size, the next node tests the car's weight, and each leaf gives the predicted fuel efficiency.
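The fuel-efficiency tree shows the key difference from a classification tree: the leaves hold numbers, not labels. As a sketch, with thresholds and values taken from the example above (function and parameter names are illustrative):

```python
def predict_mpg(engine_litres: float, weight_lbs: float) -> float:
    """Regression tree from the fuel-efficiency example: numeric leaves."""
    # Root node: Is the engine size greater than 2.5 litres?
    if engine_litres <= 2.5:
        return 30.0
    # Next node: Is the car's weight greater than 3000 lbs?
    if weight_lbs > 3000:
        return 20.0
    return 25.0
```

In a learned regression tree, each leaf value is typically the mean of the training responses that fall into that leaf, rather than a hand-picked number.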