- Multiple Linear Regression Introduction
- Multiple Linear Regression for Penguins
Multiple linear regression allows us to have many independent variables that are associated with one continuous dependent variable Y. Multiple linear regression is sometimes just called multiple regression, without the “linear”. Multiple linear regression is an analytical technique that estimates the relationship between one continuous dependent variable and two or more independent variables. Notes that the dependent variable is continuous, not discrete.
It’s best to go back and be sure you understand simple linear regression before tackling multiple linear regression. You will need to understand the difference between independent and dependent variables. You should also know what logistic regression is. You should know the equation of a straight line with its coefficients and its slope. Here is the multiple linear regression general formula.
Use Cases
Suppose you are selling services and you are wondering about your customer satisfaction. You have some data from a survey. What contributes to high customer satisfaction? Is it cost? Perhaps it’s customer response time (fast). Perhaps it’s the number of service types offered.
Another use case would be predicting house prices. Things that determine house prices might be square footage, lot square footage, number of bedrooms, number of bathrooms, proximity (distance) to a lake, river or ocean, proximity to a public school and so on.
Another use case may analyzing the effectiveness of different advertisements and how those advertisements affect the number of website clicks you are getting. We might be able to determine which advertisement is the most effective.
Another use case might be identifying a company’s high-value customers. Which customers are likely to make multiple purchases in a year? Which website pages are these high-value customers looking at? Which products do they prefer?