Online Learning Platform

What kinds of data quality problems?

Examples of data quality problems:

  • Noise and outliers
  • Missing values
  • Duplicate data

Noise and outliers: Noise is the modification of original values. Examples: the distortion of a person’s voice on a poor phone connection, and "snow" on a television screen.

Outliers are data objects whose characteristics are considerably different from those of most other data objects in the data set.
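The definition of an outlier above can be illustrated with a short sketch. The lesson does not prescribe a detection method, so the interquartile-range rule used here (with the conventional multiplier 1.5) is an assumption chosen for illustration, and the function name `iqr_outliers` and the sample ages are hypothetical.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR].

    The IQR rule is one common convention, assumed here for illustration;
    the lesson itself does not name a specific method.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical ages; 240 is most likely a data entry error.
ages = [23, 25, 27, 29, 31, 33, 35, 240]
print(iqr_outliers(ages))  # [240]
```

A flagged value is only a candidate for review. Whether it is an error or a genuine but unusual observation has to be decided from the context of the data.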

 

Missing values: Values can be missing because the information was not collected (e.g., people decline to give their age and weight) or because an attribute is not applicable to all cases (e.g., annual income is not applicable to children).
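The two reasons for missing values can be told apart in code, as the sketch below suggests. It assumes records are stored as dictionaries with `None` for a missing value. The sample records, the `NOT_APPLICABLE` marker, and the age threshold of 18 for "child" are all hypothetical choices made for illustration.

```python
NOT_APPLICABLE = "N/A"  # hypothetical marker for "attribute does not apply"

# Hypothetical records; None stands for a missing value.
records = [
    {"name": "A", "age": 34, "weight": 70.5, "income": 52000},
    {"name": "B", "age": None, "weight": None, "income": 41000},  # declined to answer
    {"name": "C", "age": 9, "weight": 30.0, "income": None},      # a child
]


def missing_counts(rows):
    """Count how many rows have None for each attribute."""
    counts = {}
    for row in rows:
        for key, value in row.items():
            if value is None:
                counts[key] = counts.get(key, 0) + 1
    return counts


def mark_not_applicable(rows, adult_age=18):
    """Return copies of rows where a child's missing income is marked N/A.

    The threshold adult_age=18 is an assumption, not taken from the lesson.
    """
    result = []
    for row in rows:
        row = dict(row)  # copy so the input is left unchanged
        if row["income"] is None and row["age"] is not None and row["age"] < adult_age:
            row["income"] = NOT_APPLICABLE
        result.append(row)
    return result


print(missing_counts(records))
cleaned = mark_not_applicable(records)
print(missing_counts(cleaned))
```

Separating "not applicable" from "not collected" matters because only the second kind is a gap that further data collection or estimation could fill.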

Duplicate data: A data set may include data objects that are duplicates, or near duplicates, of one another. This may happen when merging data from heterogeneous sources. Example: the same person appearing with multiple email addresses.
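The email example can be sketched as follows. This is a minimal illustration, assuming that two records describe the same person when their normalized name and birth date match. That matching key, the field names, and the sample people are hypothetical, and real merges usually need more careful matching.

```python
from collections import defaultdict


def normalize(name):
    """Lowercase a name and collapse repeated or trailing whitespace."""
    return " ".join(name.lower().split())


def duplicate_candidates(rows):
    """Group emails by (normalized name, birth date); keep groups of 2 or more.

    Matching on name plus birth date is an assumption made for illustration.
    """
    groups = defaultdict(list)
    for row in rows:
        key = (normalize(row["name"]), row["birth_date"])
        groups[key].append(row["email"])
    return {key: emails for key, emails in groups.items() if len(emails) > 1}


# Hypothetical records merged from two different sources.
people = [
    {"name": "Jane Doe", "birth_date": "1990-04-12", "email": "jane@example.com"},
    {"name": "jane  doe ", "birth_date": "1990-04-12", "email": "jdoe@example.org"},
    {"name": "John Roe", "birth_date": "1985-01-30", "email": "john@example.com"},
]

print(duplicate_candidates(people))
```

The groups returned are candidates only. Two different people can share a name and birth date, so a merge step would normally include a manual or rule-based check.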

Statlearner

