Natural Language Processing

The goal of Natural Language Processing (NLP) is to understand the semantics of text. Only when a computer understands what a text actually means can it take the action that was intended.

In this course you will learn the mathematical fundamentals of NLP, including regular expressions and the vectorization process. The course covers the count vectorizer, cosine similarity, and TF-IDF (Term Frequency-Inverse Document Frequency). The semantics of text will be analyzed using Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA).
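As a rough illustration of how regular-expression tokenization, count vectors, and cosine similarity fit together, here is a minimal sketch in plain Python. The function names, the token pattern, and the sample sentences are assumptions made for this example and are not taken from the course materials; in practice a library such as scikit-learn would typically handle these steps.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract runs of letters, digits, and apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())


def count_vectors(docs):
    """Return the sorted vocabulary and one term-count vector per document."""
    tokenized = [tokenize(doc) for doc in docs]
    vocab = sorted({token for tokens in tokenized for token in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Counter returns 0 for terms that do not occur in this document.
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_u * norm_v)


# Hypothetical sample documents, chosen only for illustration.
docs = [
    "The cat sat on the mat.",
    "The cat lay on the rug.",
    "Stock prices fell sharply today.",
]

vocab, vectors = count_vectors(docs)
sim_cats = cosine_similarity(vectors[0], vectors[1])
sim_other = cosine_similarity(vectors[0], vectors[2])

print(f"{sim_cats:.2f}")   # the two cat sentences share several terms
print(f"{sim_other:.2f}")  # no shared terms, so the similarity is zero
```

Raw counts give frequent words such as "the" a large influence on the similarity score; TF-IDF addresses this by down-weighting terms that appear in many documents.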

Deep learning tools (TensorFlow + Keras) will be used to generate word embeddings such as Word2Vec. The course also covers Transformer models, including GPT (Generative Pre-trained Transformer) and BERT (Bidirectional Encoder Representations from Transformers).

Key topics:

  • Fundamental mathematical analysis of NLP
  • Tokenization using Regular Expressions
  • Vectorization of words: Count Vectorizer + TF-IDF
  • Cosine similarity between words
  • Understand the meaning of text using Latent Semantic Analysis
  • Topic Modeling using Latent Dirichlet Allocation
  • Machine Learning: Naïve Bayes for text classification
  • Deep Learning: Word embeddings with Word2Vec + GloVe
  • Deep Learning: Generative neural networks: GPT-1/2/3 (Generative Pre-trained Transformer)
  • Deep Learning: NLP analysis for search engines: BERT (Bidirectional Encoder Representations from Transformers)

Practical experience:

  • Google Cloud Platform (GCP) and programming in Python
  • Understanding of current machine learning models for NLP

Course typically offered: Online, during the Summer and Winter academic quarters.

Software: Students will use Python to complete hands-on assignments. Python is free and open source.

Prerequisites: Introduction to Programming (CSE-40028) or a basic working knowledge of Python. Students must have access to a web-enabled computer.

Next steps: After completing this course, students are encouraged to consider additional coursework in the Machine Learning Methods or Python Programming certificate programs.

Contact: For more information about this course, please contact unex-techdata@ucsd.edu.

Course Number: CSE-41344
Credit: 3.00 unit(s)
Related Certificate Programs: Machine Learning Methods, Selected Topics in Artificial Intelligence
