Statistics is used to draw conclusions about data and provides a foundation for more sophisticated data analysis techniques. Viewing questions about data from a statistical perspective allows data scientists to create more predictable algorithms that convert data into knowledge effectively. As such, it is essential for data analysts to have a strong understanding of both descriptive and inferential statistics.

In this course, students will gain a comprehensive introduction to the statistical theories and techniques necessary for successful data mining and analysis. Particular attention will be paid to topics critical to data analytics, such as descriptive and inferential statistics, probability, linear and multiple regression, hypothesis testing, Bayes' Theorem, and principal component analysis. This course prepares students for subsequent Data Mining courses.

Topics include:

  • Descriptive statistics
  • Two-variable relationships
  • Probability
  • Bayes' Theorem
  • Probability distributions
  • Sampling distributions
  • Confidence intervals
  • One- and two-sample hypothesis testing
  • Categorical data
  • Least-squares regression inference
  • Principal component analysis (PCA)

Practical experience:

  • Organize, summarize, and present data
  • Describe the relation between two variables
  • Work with sample data to make inferences about the underlying population
  • Gain an understanding of linear algebra

Software: Students will use MyStatLab and StatCrunch to complete assignments. Both are included with the purchase of the required course textbook, and the instructor will provide access instructions on the course start date.

Course typically offered: Online in Fall, Winter, Spring and Summer (every quarter)

Prerequisites: Strong understanding of college algebra required

Next steps: Upon completion of this course, consider taking Fundamentals of Data Mining to continue learning.

More Information: For more information about this course, please contact unex-techdata@ucsd.edu.

Course Number: CSE-41264
Credit: 3.00 unit(s)
Related Certificate Programs: Data Mining for Advanced Analytics

