How to Design a Better Decision Tree With Pruning

January 7, 2022

Decision tree (DT) analysis is a general-purpose predictive modeling tool and one of the simplest and most useful structures in machine learning. As the name indicates, a DT is a tree of decisions: it is built by an algorithm that repeatedly splits a data set according to different conditions. Decision trees are among the most commonly used techniques for supervised learning.

But before moving on to designing decision trees with pruning, let’s make sure we understand what a decision tree is.

Understanding Decision Trees

A decision tree is a supervised machine learning algorithm used for both classification and regression problems. It makes predictions by following a set of nested if-else conditions. Because decision trees handle both kinds of problem, one widely used algorithm for growing them is called CART (Classification and Regression Trees). It is not the only one: other algorithms, such as ID3 and C4.5, have also been proposed. Whatever the algorithm, the aim is to create a model that predicts the value of a target variable by learning simple decision rules inferred from the data features.
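To make the "nested if-else" idea concrete, here is a minimal hand-written sketch of the rules a small classification tree might end up with. The function name, the two features, the thresholds, and the class labels are all hypothetical and chosen only for illustration; a real tree learns its rules from data.

```python
def predict_species(petal_length_cm: float, petal_width_cm: float) -> str:
    """Apply hand-written rules the way a small fitted tree would.

    All thresholds and labels are illustrative assumptions, not learned values.
    """
    if petal_length_cm <= 2.5:
        # Left branch of the root: a leaf, so we predict immediately.
        return "setosa"
    else:
        # Right branch of the root: one more binary test before a leaf.
        if petal_width_cm <= 1.7:
            return "versicolor"
        else:
            return "virginica"


print(predict_species(1.4, 0.2))  # setosa
print(predict_species(4.5, 1.4))  # versicolor
print(predict_species(6.0, 2.2))  # virginica
```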

Each internal node of a decision tree represents a decision, and each leaf represents a prediction.

Figure: CART Model Visualized

In the above diagram, the root node is the start of the decision tree, at depth zero. Below it are child (internal) nodes, where binary decisions are made. Finally, there are leaf nodes, where a category is predicted.

The algorithm that builds a decision tree aims to predict a target variable from the input variables and their attributes. The tree's structure is formed by a series of binary splits, starting at the root node and continuing along the branches. To reach a leaf node, where the prediction is made, a sample is tested at the root and at each internal node along its path.
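The root, internal-node, and leaf structure described above can be sketched as a small data structure. This is only one possible representation: the `Node` class, its field names, and the example thresholds are assumptions made for this illustration, not part of any library.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class Node:
    """Hypothetical tree node: either an internal binary test or a leaf."""
    feature: Optional[int] = None      # index of the feature tested (internal nodes)
    threshold: Optional[float] = None  # go left if x[feature] <= threshold
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    prediction: Optional[str] = None   # set only on leaf nodes

    def is_leaf(self) -> bool:
        return self.prediction is not None


def predict(node: Node, x: Sequence[float]) -> Tuple[str, int]:
    """Walk from the root to a leaf; return the prediction and the leaf's depth."""
    depth = 0  # the root sits at depth zero
    while not node.is_leaf():
        node = node.left if x[node.feature] <= node.threshold else node.right
        depth += 1
    return node.prediction, depth


# A root with one leaf child and one internal child (illustrative values).
root = Node(
    feature=0, threshold=2.5,
    left=Node(prediction="A"),
    right=Node(feature=1, threshold=1.7,
               left=Node(prediction="B"),
               right=Node(prediction="C")),
)

print(predict(root, [1.0, 0.2]))  # ('A', 1)
print(predict(root, [5.0, 2.0]))  # ('C', 2)
```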

Assumptions While Building Decision Trees

Some of the assumptions that you have to make while working with decision trees are:

  • At the start, the whole training set is treated as the root.
  • Categorical feature values are preferred. Continuous values are converted to discrete ones before the decision tree model is built.
  • Records are distributed recursively on the basis of attribute values.
  • Statistical measures are used to decide which attributes are placed at the root and at the internal nodes of the tree.
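As an illustration of the last assumption, the sketch below uses Gini impurity, one common statistical measure for this purpose, to pick the attribute and threshold for a single split. The helper names `gini` and `best_split` and the toy data are assumptions for this example; other measures, such as information gain, follow the same pattern.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best binary split.

    Every observed value of every feature is tried as a threshold, and the
    split with the lowest weighted impurity of its two children wins.
    """
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must send samples both ways
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, t, score)
    return best


# Toy data: feature 0 separates the classes cleanly, feature 1 does not.
rows = [[1.0, 5.0], [2.0, 1.0], [3.0, 4.0], [4.0, 2.0]]
labels = ["a", "a", "b", "b"]
print(best_split(rows, labels))  # (0, 2.0, 0.0)
```

Applying `best_split` recursively to the left and right subsets is what distributes the records down the tree.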

Decision trees can be represented in Sum of Products (SOP) form, also known as Disjunctive Normal Form.

For a given class, each branch from the root to a leaf node labelled with that class forms a conjunction (product) of attribute tests, and the different branches ending in that class together form a disjunction (sum).
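A short sketch may help show how a tree reads as a disjunction of conjunctions. The tree below is written as nested tuples of the form (feature, threshold, left, right), with plain strings as leaves. That encoding, the feature names, and the helper functions are all hypothetical and exist only for this illustration.

```python
def paths_to_class(tree, target, conditions=()):
    """Collect one conjunction (a list of tests) per leaf labelled `target`."""
    if isinstance(tree, str):  # a leaf holds a class label
        return [list(conditions)] if tree == target else []
    feature, threshold, left, right = tree
    return (
        paths_to_class(left, target, conditions + (f"{feature} <= {threshold}",))
        + paths_to_class(right, target, conditions + (f"{feature} > {threshold}",))
    )


def to_dnf(tree, target):
    """Join the tests on each path with AND, and the paths with OR."""
    clauses = [" AND ".join(path) for path in paths_to_class(tree, target)]
    return " OR ".join(f"({clause})" for clause in clauses)


# Illustrative tree with two leaves for each class.
tree = ("age", 30,
        ("income", 50, "reject", "approve"),
        ("debt", 10, "approve", "reject"))

print(to_dnf(tree, "approve"))
# (age <= 30 AND income > 50) OR (age > 30 AND debt <= 10)
```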

Why Decision Tree?

A decision tree mirrors the process a person follows when making a decision in real life, which makes it easy to understand. That makes it well suited to decision-making problems in machine learning. One reason it is so commonly used to train an ML model is that a decision tree lays out the possible outcomes of a problem explicitly. Moreover, it requires less data cleaning than many other algorithms.

But decision trees also have a major limitation: overfitting.

Overfitting in Decision Trees

Overfitting is a significant practical difficulty in decision trees. A decision tree tends to overfit the training data if it is allowed to grow to its maximum depth. Overfitting occurs when the tree is grown to fit every sample in the training data set perfectly. As a result, the tree ends up with branches that encode strict rules based on sparse data, and this reduces prediction accuracy on samples that are not part of the training set. The deeper the tree grows, the more complex the sequence of decision rules becomes. Limiting the maximum depth is the easiest way to simplify a tree and handle overfitting.
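The effect can be observed with a quick experiment. The sketch below assumes scikit-learn is available and uses a synthetic data set with some label noise. The data set sizes, the noise level, and the choice of max_depth=3 are arbitrary assumptions for illustration, and the exact scores will vary with the data.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data; flip_y adds label noise that a deep tree will memorize.
X, y = make_classification(n_samples=500, n_features=10, n_informative=4,
                           flip_y=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# One tree grown without limits, one capped at a small depth.
full = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
shallow = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)

for name, model in [("full", full), ("max_depth=3", shallow)]:
    print(name,
          "depth:", model.get_depth(),
          "train acc:", round(model.score(X_train, y_train), 3),
          "test acc:", round(model.score(X_test, y_test), 3))
```

The unrestricted tree fits the training set perfectly. Comparing its test accuracy with its training accuracy shows how much of that fit fails to carry over to unseen samples.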

But how can we improve our tree model in a more precise way? Let’s find out!

How To Prevent Overfitting in Decision Trees with Pruning?

Pruning is a technique used to reduce overfitting in decision trees. It simplifies the tree by eliminating its weakest rules. It comes in two forms:

  • Pre-pruning stops the tree at an early stage by setting constraints that limit its growth. To do that, we can set parameters such as min_samples_split, min_samples_leaf, or max_depth, choosing their values through hyperparameter tuning.
  • Post-pruning, or backward pruning, is applied after the decision tree is built. It is used when the tree has grown very deep and the model is overfitting. Here, branches are removed through cost-complexity pruning, which in scikit-learn (where the parameter names above come from) is controlled by the ccp_alpha parameter rather than by growth limits such as max_depth.
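For the pre-pruning case, a minimal sketch of hyperparameter tuning might look like the following. It assumes scikit-learn and a synthetic data set, and the grid values are arbitrary starting points chosen for this example rather than recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Candidate growth constraints (illustrative values only).
param_grid = {
    "max_depth": [2, 4, 6, None],
    "min_samples_split": [2, 10, 20],
    "min_samples_leaf": [1, 5, 10],
}

# Try every combination with 5-fold cross-validation and keep the best one.
search = GridSearchCV(DecisionTreeClassifier(random_state=0), param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```

After fitting, `search.best_estimator_` holds a tree refit on all the data with the selected constraints.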

Cost-complexity pruning starts with the unpruned tree. It then generates a sequence of progressively smaller subtrees and selects the best one via cross-validation.
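One way to carry out this procedure is sketched below with scikit-learn's cost-complexity pruning path. The data set is synthetic, and treating the alphas computed on the full data as candidates for cross-validation is a simplification made for this example.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=8, flip_y=0.1, random_state=0)

# Each alpha on the path corresponds to one subtree of the fully grown tree.
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X, y)
# Guard against tiny negative values caused by floating-point error.
alphas = [max(float(a), 0.0) for a in path.ccp_alphas]

# Score each candidate alpha with 5-fold cross-validation.
scores = [
    cross_val_score(
        DecisionTreeClassifier(random_state=0, ccp_alpha=a), X, y, cv=5
    ).mean()
    for a in alphas
]
best_alpha = alphas[scores.index(max(scores))]

full = DecisionTreeClassifier(random_state=0).fit(X, y)
pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=best_alpha).fit(X, y)

print("best alpha:", best_alpha)
print("leaves before pruning:", full.get_n_leaves())
print("leaves after pruning:", pruned.get_n_leaves())
```

Larger values of ccp_alpha prune more aggressively, so the last alpha on the path reduces the tree to a single node.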

Pruning should yield a subtree that is optimal, i.e., one with the best accuracy on unseen data, and the search for that subtree must be computationally tractable. Hence, pruning should not only reduce overfitting but also make the decision tree less complex and easier to understand and explain than the unpruned tree, while maintaining its performance.

Takeaway

Now we know that decision trees are supervised machine learning algorithms. Unlike some supervised learning algorithms, they can be used to solve both classification and regression problems.

Decision trees build models that predict values or classes using simple decision rules derived from training data. The other takeaway of this post is pruning: by setting constraints on growth or removing weak branches, pruning reduces a tree's complexity and helps it make accurate predictions on new data.


