CSE 432/532 Machine Learning

Catalog description:

This course introduces the process, methods, and computing tools fundamental to machine learning. Students will write code that operates on large real-world datasets to accomplish tasks such as predicting outcomes, discovering associations, and identifying similar groups. Students will complete a term project showcasing the different steps of the machine learning process, from data cleaning to the extraction of accurate models and the visualization of results.

Prerequisite: 

CSE 274 

Required topics (approximate weeks allocated): 

  • Introduction to the course, including logistics and syllabus (0.5)
  • Setting up technologies used in this course, including Anaconda and Jupyter notebooks (0.5)
  • Principles of Python programming in a professional environment (2)
    • Control flow in Python (loops, functions, conditionals)
    • Handling of data through files or in-memory structures (lists, associative arrays)
    • Use of the Pandas library for mapping and filtering
  • Essentials of data cleaning and transformation (2.5)
    • Detecting outliers
    • Filling in missing values
    • Feature engineering
    • Dimensionality reduction
    • Data balancing
  • Overview of key machine learning tasks, e.g. classification, clustering (0.5)
  • Standard techniques for classification (3)
    • Decision trees
    • Support vector machines
    • Ensemble learning (e.g. random forests)
  • Overview of possible course projects (0.5)
  • Unsupervised learning (3)
    • Artificial neural networks on TensorFlow
    • Clustering
  • Visualizing machine learning models (1.5)
    • Principles of scientific visualization applied to machine learning
    • Programming visualizations within a machine learning workflow

Learning Outcomes:

  1. Describe how to create accurate and generalizable models from large and messy datasets.
  2. Implement code to clean data and derive a model using an appropriate machine learning algorithm.
  3. Present solutions to stakeholders using visualizations and professional machine learning workflows.
  4. Write machine learning applications using techniques learned independently from online resources (graduate students only).