Machine Learning Workflow


This entry is part 3 of 5 in the series Machine Learning Overview

Machine learning workflows define which phases are implemented during a machine learning project.

Typical phases include data collection, data pre-processing, building datasets, model training and refinement, evaluation, and deployment to production. One common way to break the workflow down is into the following eight steps:

  1. Plan
  2. Get the data
  3. Exploratory Data Analysis (EDA)
  4. Prepare the data
  5. Choose a model and train it
  6. Evaluate and adjust your model as needed
  7. Communicate your findings/predictions
  8. Launch, monitor and maintain your system
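To make the steps above more concrete, here is a minimal sketch of steps 2 through 7 in plain Python. Everything in it is an assumption made for illustration: the data is synthetic (generated as y ≈ 3x + 2 plus noise), the model is a simple one-variable linear regression fitted by ordinary least squares, and the function names (`get_data`, `explore`, `prepare`, `train`, `evaluate`, `run_workflow`) are hypothetical. Planning (step 1) and launching and monitoring (step 8) happen outside the code, so they are not shown.

```python
import random
import statistics


def get_data(n=200, seed=0):
    """Step 2: get the data. Here it is synthetic: y = 3x + 2 plus noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.uniform(0, 10)
        y = 3.0 * x + 2.0 + rng.gauss(0, 1.0)
        rows.append((x, y))
    return rows


def explore(rows):
    """Step 3: EDA. Summarise the data before modelling it."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    return {
        "n": len(rows),
        "x_mean": statistics.mean(xs),
        "y_mean": statistics.mean(ys),
        "y_stdev": statistics.stdev(ys),
    }


def prepare(rows, test_fraction=0.25, seed=0):
    """Step 4: prepare the data. Shuffle, then split into train and test sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def train(rows):
    """Step 5: choose a model and train it (least-squares line fit)."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    x_mean = statistics.mean(xs)
    y_mean = statistics.mean(ys)
    covariance = sum((x - x_mean) * (y - y_mean) for x, y in rows)
    variance = sum((x - x_mean) ** 2 for x in xs)
    slope = covariance / variance
    intercept = y_mean - slope * x_mean
    return slope, intercept


def evaluate(model, rows):
    """Step 6: evaluate on held-out data using MSE and R squared."""
    slope, intercept = model
    ys = [y for _, y in rows]
    y_mean = statistics.mean(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in rows)
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    return {"mse": ss_res / len(rows), "r2": 1 - ss_res / ss_tot}


def run_workflow():
    """Run steps 2 to 6 in order and return what each one produced."""
    rows = get_data()
    summary = explore(rows)
    train_rows, test_rows = prepare(rows)
    model = train(train_rows)
    metrics = evaluate(model, test_rows)
    return summary, model, metrics


if __name__ == "__main__":
    # Step 7: communicate the findings.
    summary, model, metrics = run_workflow()
    print("Data summary:", summary)
    print("Fitted slope and intercept:", model)
    print("Held-out metrics:", metrics)
```

In a real project, step 6 often sends you back to step 4 or 5: if the held-out metrics are poor, you adjust the data preparation or try a different model and evaluate again.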

We can compare this to the Data Analytics Lifecycle. The ML workflow is similar to the data analytics lifecycle, except that it adds two steps in the middle of the process: choosing a model (step 5) and evaluating that model (step 6). The names of the phases differ, but they describe essentially the same activities. For example, the Ask phase of the data analytics lifecycle corresponds to the Plan step of this workflow.

We can also compare this to the PACE framework (Plan, Analyze, Construct, Execute). The first two steps above fall inside PACE’s Plan phase. The EDA and data preparation steps fall into the Analyze phase, an important phase that includes formatting and cleaning the data. Choosing a model and evaluating it fall under the Construct phase. Finally, the communicate and launch steps fall under the Execute phase.
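The mapping described above can be written down as a small lookup table. This is only an illustrative sketch: the names `WORKFLOW_STEPS`, `PACE_PHASE_BY_STEP`, and `steps_by_phase` are made up for this example and are not part of any PACE tooling.

```python
# The eight workflow steps, in order (step numbers start at 1).
WORKFLOW_STEPS = [
    "Plan",
    "Get the data",
    "Exploratory Data Analysis (EDA)",
    "Prepare the data",
    "Choose a model and train it",
    "Evaluate and adjust your model as needed",
    "Communicate your findings/predictions",
    "Launch, monitor and maintain your system",
]

# Which PACE phase each workflow step falls under.
PACE_PHASE_BY_STEP = {
    1: "Plan",
    2: "Plan",
    3: "Analyze",
    4: "Analyze",
    5: "Construct",
    6: "Construct",
    7: "Execute",
    8: "Execute",
}


def steps_by_phase():
    """Group the workflow steps under their PACE phase, preserving order."""
    grouped = {}
    for number, name in enumerate(WORKFLOW_STEPS, start=1):
        grouped.setdefault(PACE_PHASE_BY_STEP[number], []).append(name)
    return grouped


if __name__ == "__main__":
    for phase, steps in steps_by_phase().items():
        print(f"{phase}: {', '.join(steps)}")
```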

Larger, more complex projects may require a data engineer in the early stages. What does the workflow of a data engineer look like? Joe Reis and Matt Housley’s book Fundamentals of Data Engineering includes a diagram that lays out the stages of that work. Have a look at our post called Data Engineering Lifecycle.
