Decision Tree

The decision tree is one of the most popular and widely used machine learning algorithms.

Decision trees are used for both classification and regression problems.

Why Decision trees?

There are plenty of other algorithms available, so why choose decision trees?

Well, there might be many reasons, but I believe a few stand out:

  1. Decision trees often mimic human-level thinking, so it is simple to understand the data and make good interpretations.
  2. Decision trees actually let you see the logic behind how the data is interpreted (unlike black-box algorithms such as SVM, NN, etc.).

For example: if we are classifying a bank loan application for a customer, the decision tree may look like this:


Here we can see the logic by which it makes the decision.

It’s simple and clear.

So what is the decision tree?


A decision tree is a tree where each node represents a feature (attribute), each link (branch) represents a decision (rule), and each leaf represents an outcome (a categorical or continuous value).

The whole idea is to create a tree like this for the entire dataset and arrive at a single outcome at every leaf (or minimize the error at every leaf).
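Since the tree is just nodes (features), branches (rules), and leaves (outcomes), a minimal sketch of that structure in Python could look like this (the class and field names are my own, not from any library):

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class Node:
    # An internal node splits on a feature; a leaf carries an outcome.
    feature: Optional[str] = None   # e.g. "outlook"; None for a leaf
    children: Dict[str, "Node"] = field(default_factory=dict)  # branch value -> subtree
    outcome: Optional[str] = None   # class label at a leaf, e.g. "yes"
```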

There are a couple of algorithms available to build a decision tree (both metrics are sketched in code right after this list):

  1. CART (Classification and Regression Trees) → uses the Gini index (classification) as its metric.
  2. ID3 (Iterative Dichotomiser 3) → uses the entropy function and information gain as its metrics.
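To make the two metrics concrete, here is a small sketch (helper names are my own, assuming plain Python lists of class labels) of how the Gini index and entropy are computed:

```python
import math
from collections import Counter

def gini(labels):
    """Gini index used by CART: 1 - sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Entropy used by ID3: -sum of p * log2(p) over class proportions p."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

labels = ["yes"] * 9 + ["no"] * 5   # the class counts of the weather dataset used below
print(round(gini(labels), 3))       # 0.459
print(round(entropy(labels), 3))    # 0.940
```

Both measures are 0 for a pure node and grow as the class mix approaches 50/50; ID3, which we use below, works with entropy.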

Classification using the ID3 algorithm

Let’s take a famous dataset from the machine learning world: the weather dataset (playing a game, Y or N, based on the weather conditions).

We have four X values (outlook, temp, humidity, and windy), all categorical, and one y value (play: Y or N), which is also categorical.

So we need to learn the mapping between X and y (which is what machine learning always does).
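For reference, here is the classic 14-row weather (play-tennis) dataset as it is commonly published, transcribed into plain Python (treat the exact rows as an illustrative transcription, not as taken from this article):

```python
# Columns: outlook, temp, humidity, windy, play
data = [
    ("sunny",    "hot",  "high",   False, "no"),
    ("sunny",    "hot",  "high",   True,  "no"),
    ("overcast", "hot",  "high",   False, "yes"),
    ("rainy",    "mild", "high",   False, "yes"),
    ("rainy",    "cool", "normal", False, "yes"),
    ("rainy",    "cool", "normal", True,  "no"),
    ("overcast", "cool", "normal", True,  "yes"),
    ("sunny",    "mild", "high",   False, "no"),
    ("sunny",    "cool", "normal", False, "yes"),
    ("rainy",    "mild", "normal", False, "yes"),
    ("sunny",    "mild", "normal", True,  "yes"),
    ("overcast", "mild", "high",   True,  "yes"),
    ("overcast", "hot",  "normal", False, "yes"),
    ("rainy",    "mild", "high",   True,  "no"),
]
X = [row[:4] for row in data]   # the four categorical features
y = [row[4] for row in data]    # the target: play yes/no (9 yes, 5 no)
```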

This is a binary classification problem, so let’s build the tree using the ID3 algorithm.

To create a tree, we first need a root node, and we know that nodes are features/attributes (outlook, temp, humidity, and windy),

so which one do we pick first?

Determine the attribute that best classifies the training data and use this attribute at the root of the tree. Then repeat the process for each branch.

This means we are performing top-down, greedy search through the space of possible decision trees.

Okay, so how do we choose the best attribute?

In ID3, use the attribute with the highest information gain.

“In order to define information gain precisely, we begin by defining a measure commonly used in information theory, called entropy, that characterizes the (im)purity of an arbitrary collection of examples.”
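Concretely, for a set S with class proportions p_i, Entropy(S) = −Σ p_i · log₂(p_i), and the information gain of an attribute A is Gain(S, A) = Entropy(S) − Σ_v (|S_v| / |S|) · Entropy(S_v), where S_v is the subset of S taking value v for A. A small sketch (helper names are mine), applied to the outlook column of the weather data transcribed earlier:

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Gain(S, A) = Entropy(S) - sum over values v of (|Sv|/|S|) * Entropy(Sv)."""
    subsets = defaultdict(list)
    for value, label in zip(feature_values, labels):
        subsets[value].append(label)
    n = len(labels)
    remainder = sum(len(s) / n * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder

outlook = ["sunny", "sunny", "overcast", "rainy", "rainy", "rainy", "overcast",
           "sunny", "sunny", "rainy", "sunny", "overcast", "overcast", "rainy"]
play = ["no", "no", "yes", "yes", "yes", "no", "yes",
        "no", "yes", "yes", "yes", "yes", "yes", "no"]
print(round(information_gain(outlook, play), 3))  # ~0.247
```

With this transcription, outlook has the highest gain of the four attributes (humidity ≈ 0.152, windy ≈ 0.048, temp ≈ 0.029), so it becomes the root of the tree.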

Algorithm:

Generate_decision_tree. Generate a decision tree from the training tuples of data partition D.

Input:

  • Data partition, D, which is a set of training tuples and their associated class labels;
  • attribute_list, the set of candidate attributes;
  • Attribute_selection_method, a procedure to determine the splitting criterion that “best” partitions the data tuples into individual classes. This criterion consists of a splitting attribute and, possibly, either a split point or a splitting subset.

Output: A decision tree.

Method:

(1) create a node N;
(2) if tuples in D are all of the same class, C, then
(3)     return N as a leaf node labeled with the class C;
(4) if attribute_list is empty then
(5)     return N as a leaf node labeled with the majority class in D; // majority voting
(6) apply Attribute_selection_method(D, attribute_list) to find the “best” splitting criterion;
(7) label node N with the splitting criterion;
(8) if splitting_attribute is discrete-valued and multiway splits are allowed then // not restricted to binary trees
(9)     attribute_list ← attribute_list − splitting_attribute; // remove the splitting attribute
(10) for each outcome j of the splitting criterion // partition the tuples and grow subtrees for each partition
(11)     let Dj be the set of data tuples in D satisfying outcome j; // a partition
(12)     if Dj is empty then
(13)         attach a leaf labeled with the majority class in D to node N;
(14)     else attach the node returned by Generate_decision_tree(Dj, attribute_list) to node N;
endfor
(15) return N;
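Putting the pieces together, here is a minimal runnable sketch of the method above in Python. The structure and names are mine (dict-based nodes, multiway splits on categorical attributes, information gain as the Attribute_selection_method), not the textbook’s exact formulation:

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_attribute(rows, labels, attributes):
    # Attribute_selection_method: pick the attribute with the highest information gain.
    def gain(attr):
        subsets = defaultdict(list)
        for row, label in zip(rows, labels):
            subsets[row[attr]].append(label)
        remainder = sum(len(s) / len(labels) * entropy(s) for s in subsets.values())
        return entropy(labels) - remainder
    return max(attributes, key=gain)

def generate_decision_tree(rows, labels, attributes):
    # Steps (2)-(3): all tuples in the same class -> leaf labeled with that class.
    if len(set(labels)) == 1:
        return labels[0]
    # Steps (4)-(5): no attributes left -> leaf labeled with the majority class.
    if not attributes:
        return Counter(labels).most_common(1)[0][0]
    # Steps (6)-(7): choose and record the "best" splitting attribute.
    attr = best_attribute(rows, labels, attributes)
    node = {attr: {}}
    remaining = [a for a in attributes if a != attr]  # step (9)
    # Steps (10)-(14): one subtree per observed value of the splitting attribute
    # (iterating only over observed values means Dj is never empty here).
    for value in set(row[attr] for row in rows):
        sub_rows = [r for r in rows if r[attr] == value]
        sub_labels = [l for r, l in zip(rows, labels) if r[attr] == value]
        node[attr][value] = generate_decision_tree(sub_rows, sub_labels, remaining)
    return node

# Tiny demo (two rows, one attribute); branch order may vary:
tree = generate_decision_tree(
    [{"outlook": "sunny"}, {"outlook": "overcast"}],
    ["no", "yes"],
    ["outlook"],
)
print(tree)  # {'outlook': {'sunny': 'no', 'overcast': 'yes'}}
```

Run over the 14-row weather data (with each row as a dict of the four attributes), the root it selects is outlook, matching the hand calculation above.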
