

Data Preparation for Analytics

Data preparation is an essential yet often under-emphasized step in the data mining process. People tend to focus on knowledge discovery, but without sufficient preparation of the data, the return on that effort will be limited. Preparing data without adequate skill and knowledge can lead to poor modeling results.

This class offers in-depth coverage of data preparation techniques and takes a step-by-step approach using a variety of tools, with practical illustrations on real data sets. The hands-on exercises reinforce the concepts and offer first-hand experience in cleaning, filtering, and preparing data for mining and for predictive or descriptive modeling. The goal is to transform the datasets so that their information content is best exposed to the mining tool.

Topics include:

  • Prerequisites to good data preparation
  • Dealing with variables
    • Sparsity
    • Monotonicity
    • Increasing dimensionality
    • Anachronisms
    • Missing values
    • Outliers
  • Normalization, transformation, feature extraction, and feature reduction
  • Building mineable datasets
  • Data separation
  • Dealing with imbalanced data

Practical experience:

  • Hands-on data mining projects

Software: WEKA is used for class assignments. There is no additional cost for this product.

Course typically offered: Online in Winter and Summer

Prerequisites: Fundamentals of Data Mining or equivalent experience required.

Next Steps: Upon completion of this course, consider taking Data Mining: Advanced Concepts and Algorithms.

More Information: For more information about this course, please contact unex-techdata@ucsd.edu.


  • Course Number: CSE-41261
  • Credit: 2.00 unit(s)


