The Limitations of Data


This entry is part 3 of 6 in the series Data

Data can be very insightful, but it has limitations. When you are looking at a group of datasets, there are several things to keep in mind.

  • Dirty Data
  • Incomplete (missing) Data
  • Misaligned Data
  • Tell a Clear Story

What Can the Data Analyst Do?

Use your good judgment. Before presenting data to others, make sure it is complete. For example, ask “are there enough survey results?” Clean the data. Check the definitions of your metrics, because the same label may be computed differently. Does total mean everyone who registered for the program or everyone who completed it? Closely review those business rules. Also consider the source: could the data be biased because it is self-reported instead of supervised?
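As a rough illustration of these checks, here is a minimal Python sketch. The records and field names (registered, completed, score) are made up for this example and do not come from any real program. It counts missing values per field and shows how two reasonable definitions of "total" give different numbers.

```python
# Hypothetical program records; names and fields are assumptions for illustration.
records = [
    {"name": "Ana", "registered": True, "completed": True, "score": 88},
    {"name": "Ben", "registered": True, "completed": False, "score": None},
    {"name": "Cy", "registered": True, "completed": True, "score": 75},
    {"name": "Di", "registered": True, "completed": False, "score": None},
]


def missing_counts(rows):
    """Return a dict mapping each field to the number of rows where it is None."""
    counts = {}
    for row in rows:
        for field, value in row.items():
            counts[field] = counts.get(field, 0) + (1 if value is None else 0)
    return counts


# Two possible business rules for "total".
total_registered = sum(1 for r in records if r["registered"])
total_completed = sum(1 for r in records if r["completed"])

print("Missing values per field:", missing_counts(records))
print("Total (registered):", total_registered)
print("Total (completed):", total_completed)
```

If a report only says "total", the reader cannot tell which of the two numbers it means, so it helps to state the business rule next to the metric.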

Missing Data

One option is to obtain the missing data and insert it. If the data is internal, try to find the missing values by asking the right questions of the right people. When that is not possible, have a look at the YouTube video called Data Analyst Interview Questions And Answers | Data Analytics Interview Questions | Simplilearn and skip to question seven, at about 5:30. They suggest listwise deletion, average imputation, regression substitution (multiple-regression analysis), and multiple imputation. You may decide to impute the missing data. For an example of imputing missing age data, see the post called Data Imputation of Age.
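To make the two simplest of those options concrete, here is a small Python sketch of listwise deletion and average (mean) imputation. The rows and the helper names listwise_delete and mean_impute are hypothetical, written only for this example. Regression substitution and multiple imputation need a statistics library and are not shown.

```python
from statistics import mean

# Hypothetical survey rows; None marks a missing age.
rows = [
    {"id": 1, "age": 34},
    {"id": 2, "age": None},
    {"id": 3, "age": 28},
    {"id": 4, "age": 40},
]


def listwise_delete(rows, field):
    """Drop every row where the given field is missing."""
    return [r for r in rows if r[field] is not None]


def mean_impute(rows, field):
    """Return copies of the rows with missing values replaced by the observed mean."""
    observed = [r[field] for r in rows if r[field] is not None]
    fill = mean(observed)  # raises StatisticsError if nothing was observed
    return [{**r, field: fill if r[field] is None else r[field]} for r in rows]


print(listwise_delete(rows, "age"))
print(mean_impute(rows, "age"))
```

Listwise deletion shrinks the dataset, while mean imputation keeps every row but reduces the spread of the values. Which trade-off is acceptable depends on how much data is missing and why.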

The first post in this series is called An Introduction to Data.

