Text Mining

With vast amounts of unstructured data available on the web and stored in databases, and the promise that this data will yield insights unavailable from structured data, text mining has become an indispensable addition to traditional predictive analytics.

In this course, students will learn practical techniques for text extraction and text mining in a data mining context, including document clustering and classification, information retrieval, and the enhancement of structured data. Emphasis will be placed on the practical use of text mining in business. In addition, basic concepts of textual information such as tokenization, part-of-speech tagging, and disambiguation will be covered.

Topics include:

  • Structured vs. unstructured learning
  • CRISP-DM
  • Data sources
  • Dictionaries and lexicons
  • Text parsing
  • Regular expressions
  • Structured data from unstructured data
  • Document clustering and classification
  • Sentiment analysis

Practical experience:

  • Working with R
  • Working with unstructured text
  • Prepping text data for modeling
  • Visualizing text data

Software: Students will use R in this course. There is no additional cost for this product.

Course typically offered: Online in Fall and Spring

Prerequisites: Introduction to R Programming or equivalent knowledge required.

Next Steps: Upon completion of this course, consider taking other courses in data science to continue learning.

More Information: For more information about this course, please contact unex-techdata@ucsd.edu.


  • Course number: CSE-41151
  • Credit: 2.00 unit(s)

