Decision Tree Workflow

This entry is part 3 of 6 in the series Decision Trees

Are you working in Python? Do you want to build a decision tree?

Let’s work through this workflow with a particular dataset in mind. We’ll use a bank churn dataset found on Kaggle.

Python Workflow

Understand the Objective
Know your data
Import statements
Read the Data
Exploratory Data Analysis (EDA)
Select an Evaluation Metric
Feature Engineering
Split the Data
Train a Baseline Model
Tune the Model (GridSearchCV)

Your objective may be to predict consumer behavior. Perhaps you work at a bank and want to predict whether a customer will churn. Churn means leaving the company. The bank doesn’t want that, and if it can predict the type of customer who is likely to leave, it can take action to reduce that probability.

Know your data by reading the data dictionary, if one exists.

Import the Python libraries you need into your project.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# This function displays the splits of the tree
from sklearn.tree import plot_tree

from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix
from sklearn.metrics import recall_score, precision_score, f1_score, accuracy_score
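To give a feel for how these imports come together in the later steps (split, baseline, tuning), here is a minimal sketch. It does not use the Kaggle dataset: the DataFrame is synthetic, the column names (age, balance, num_products, churned) are hypothetical, and the choice of F1 as the scoring metric and the parameter grid are assumptions made only for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Synthetic stand-in for the churn data; all column names are hypothetical.
rng = np.random.default_rng(42)
n = 500
df = pd.DataFrame({
    "age": rng.integers(18, 80, size=n),
    "balance": rng.uniform(0, 200_000, size=n).round(2),
    "num_products": rng.integers(1, 5, size=n),
})
# A made-up rule plus noise, so the tree has a pattern to learn.
churn_prob = np.where((df["age"] > 50) & (df["num_products"] == 1), 0.8, 0.15)
df["churned"] = (rng.random(n) < churn_prob).astype(int)

# Separate the features from the target.
X = df.drop(columns="churned")
y = df["churned"]

# Split the data, keeping the churn ratio similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Train a baseline model with default settings.
baseline = DecisionTreeClassifier(random_state=42)
baseline.fit(X_train, y_train)
y_pred = baseline.predict(X_test)

scores = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
}
print(scores)

# Tune the model with a small, illustrative grid (values are assumptions).
param_grid = {"max_depth": [2, 4, 6, None], "min_samples_leaf": [1, 5, 20]}
search = GridSearchCV(
    DecisionTreeClassifier(random_state=42),
    param_grid,
    scoring="f1",
    cv=5,
)
search.fit(X_train, y_train)
tuned_f1 = f1_score(y_test, search.best_estimator_.predict(X_test))
print(search.best_params_, tuned_f1)
```

With the real dataset, X and y would come from the columns described in its data dictionary, and the scoring metric would follow from the objective.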

Perform EDA on the data. You can start with a few pandas methods on the DataFrame, such as head(), info(), and describe().
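A first pass at EDA might look like the sketch below. The small DataFrame is made up so the example runs on its own, and its column names are hypothetical. The missing-value and class-balance checks at the end are extra suggestions, not steps from this post.

```python
import pandas as pd

# Tiny made-up frame standing in for the churn data (hypothetical columns).
df = pd.DataFrame({
    "age": [42, 35, 58, 29, 47],
    "balance": [0.0, 83807.86, 159660.80, 0.0, 125510.82],
    "country": ["France", "Spain", "France", "Germany", "Spain"],
    "churned": [1, 0, 1, 0, 0],
})

# First rows: a quick look at what the values look like.
print(df.head())

# Column names, dtypes, and non-null counts (info() prints directly).
df.info()

# Summary statistics for the numeric columns.
print(df.describe())

# Optional extras: missing values per column and the share of each class.
missing = df.isna().sum()
churn_rate = df["churned"].value_counts(normalize=True)
print(missing)
print(churn_rate)
```

The class share is worth checking early, because an imbalanced target affects which evaluation metric makes sense.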
