EDA Validating with Python: The six main practices of Exploratory Data Analysis (EDA) are discovering, structuring, cleaning, […] March 30, 2024 in Data Analytics tagged validation / validate by Mike
Interquartile Range (IQR): The middle 50% of your data is called the interquartile range, or IQR. The interquartile range is the […] March 27, 2024 in Statistics tagged range / quartile / median / irq / Statistics by Mike
HR Analytics Job Prediction Project: This is a dataset on Kaggle. The title of the Kaggle project is the same as the title […] March 27, 2024 in Data Analytics tagged HR / data / resources / employee / human / dataset by Mike
Distribution Plots in Seaborn: I created a project in a Jupyter Notebook under Anaconda that’s called Distribution Plots in Seaborn. For this […] March 26, 2024 in Data Visualization tagged chart / distribution / plot / seaborn by Mike
One-Hot Encoding of Categorical Variables: Are you a data analyst or are you working on a data analysis project and you are wondering […] March 24, 2024 in Data Analytics tagged data / convert / encode / ordinal / hierarchy / categorical / dummies by Mike
Python Floating Point Numbers: One important thing to know when you are programming in Python is how Python deals with floating point […] March 21, 2024 in Python tagged point / round / number / float by Mike
Decision Tree – Only Six Rows: This is a very simple example of building a decision tree model on a very small dataset that […] March 20, 2024 in Machine Learning tagged decision / simple / tree / python / sklearn / decisiontreeclassifier / Decision Trees by Mike
Decision Tree Workflow: Are you working in Python? Do you want to build a decision tree? Let’s work through this workflow […] March 20, 2024 in Machine Learning tagged decision / tree / python / Decision Trees by Mike
Decision Trees and Random Forests: A random forest is a collection of decision trees whose results are aggregated into one final result. Their […] March 19, 2024 in Machine Learning tagged random / decision / tree / forest / Decision Trees by Mike
Titanic Logistic Regression: This post discusses building a logistic regression model on the Titanic dataset provided by Kaggle. […] March 18, 2024 in Data Analytics tagged table / pivot / regression / impute / logistic / Titanic by Mike
Data Imputation of Age: The six main practices of Exploratory Data Analysis (EDA) are discovering, structuring, cleaning, […] March 17, 2024 in Python tagged groupby / function / boxplot / search / seaborn / replace / impute / clean / apply / python / missing / pandas / Pandas EDA Cleaning by Mike
Seaborn Style and Color: On the seaborn website there is an article called Controlling Figure Aesthetics. There is another article called Choosing […] March 12, 2024 in Python tagged data / style / colour / visualization / color / chart / seaborn / matplotlib / Seaborn by Mike