CSE 443/543 High Performance Computing (3 credits)
Catalog description:
Introduction to the practical use of multi-processor workstations and supercomputing clusters. Developing and using parallel programs for solving computationally intensive problems. The course builds on basic concepts of programming and problem solving.
Prerequisite:
CSE 381
Required topics (approximate weeks allocated):
- Introduction to parallel programming and high performance distributed computing (HPDC) (1)
- Motivation for HPDC
- Review of parallel programs and platforms
- Implicit parallelism and limitations of instruction level parallelism (ILP)
- Survey of architecture of commonly used HPDC platforms
- Concurrency and parallelism (1)
- Introduction to concurrency and parallelism
- Levels of parallelism
- Instruction level parallelism
- SIMD versus MIMD
- Review of C programming language and the Linux environment (1.5)
- Review of basic programming constructs
- Relating Java/C++ syntax and semantics to the C language
- Introduction to problem solving using the C language
- Introduction to Linux
- C programming using Linux
- C structures (see the sketch below)
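
A minimal illustrative sketch of the C-structures material above (the type and field names are made up for illustration, not course code):

    #include <stdio.h>
    #include <string.h>

    /* A simple aggregate type of the kind covered in this unit. */
    struct particle {
        double x, y, z;   /* position components */
        double mass;
        char   name[16];
    };

    int main(void) {
        struct particle p = { 1.0, 2.0, 3.0, 0.5, "" };
        strncpy(p.name, "proton", sizeof p.name - 1);
        printf("%s at (%g, %g, %g), mass %g\n",
               p.name, p.x, p.y, p.z, p.mass);
        return 0;
    }
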
- Exploring instruction level parallelism (1)
- Review of instruction level parallelism and sources of hazards
- Concepts of hazard elimination via code restructuring (dependency reduction, loop unrolling; see the sketch below)
- Timing and statistical comparison of the performance of C programs
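
The hazard-elimination idea above can be sketched in C: unrolling a reduction loop with multiple independent accumulators shortens the loop-carried dependency chain. This is an illustrative sketch, not course code (it assumes n is a multiple of 4 and reassociates floating-point additions):

    #include <stddef.h>

    /* Baseline: one accumulator creates a loop-carried dependency,
       so each addition must wait for the previous one. */
    double sum_simple(const double *a, size_t n) {
        double s = 0.0;
        for (size_t i = 0; i < n; i++)
            s += a[i];
        return s;
    }

    /* Unrolled: four independent accumulators let the CPU overlap
       additions, improving instruction level parallelism. */
    double sum_unrolled(const double *a, size_t n) {
        double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
        for (size_t i = 0; i < n; i += 4) {   /* assumes n % 4 == 0 */
            s0 += a[i];
            s1 += a[i + 1];
            s2 += a[i + 2];
            s3 += a[i + 3];
        }
        return (s0 + s1) + (s2 + s3);
    }

Timing both versions over large arrays, repeated for statistical comparison, is the experiment this unit builds toward.
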
- Introduction to parallel programming (2)
- Principles of parallel algorithms
- Effects of synchronization and communication latencies
- Overview of physical and logical communication topologies
- Using MPE for parallel graphical visualization (parallel libraries)
- Introduction to the message-passing paradigm (0.5)
- Principles of message-passing programming
- The building blocks of message passing
- Programming in MPI (3)
- Introduction to MPI: The Message Passing Interface
- MPI fundamentals
- Partitioning data versus partitioning control
- Blocking communications and parallelism
- MPI communication models
- Blocking vs. non-blocking communication and their impact on parallelism (see the sketch following this topic block)
- Developing MPI programs that exchange derived data types
- Creating MPI programs that use structure derived data types
- Review of portability and interoperability issues
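
A minimal sketch of the point-to-point models above, contrasting blocking and non-blocking calls (illustrative values; run with at least two ranks, e.g. mpiexec -n 2):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[]) {
        int rank, value = 0;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Blocking: MPI_Send/MPI_Recv return only when the buffer
           is safe to reuse. */
        if (rank == 0) {
            value = 42;
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("blocking: rank 1 received %d\n", value);
        }

        /* Non-blocking: the call returns immediately; computation
           can overlap communication until MPI_Wait. */
        MPI_Request req;
        if (rank == 0) {
            MPI_Isend(&value, 1, MPI_INT, 1, 1, MPI_COMM_WORLD, &req);
            /* ... useful work could go here ... */
            MPI_Wait(&req, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Irecv(&value, 1, MPI_INT, 0, 1, MPI_COMM_WORLD, &req);
            /* ... useful work could go here ... */
            MPI_Wait(&req, MPI_STATUS_IGNORE);
            printf("non-blocking: rank 1 received %d\n", value);
        }

        MPI_Finalize();
        return 0;
    }
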
- Performance profiling (1)
- Using software tools for performance profiling (a timing sketch follows this topic block)
- Performance profiling of MPI programs
- Speedup anomalies in parallel algorithms
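
Whatever profiling tool is used, the simplest measurement underlying this unit is wall-clock timing with MPI_Wtime; a minimal sketch (the loop is a stand-in for real work):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[]) {
        MPI_Init(&argc, &argv);

        double start = MPI_Wtime();
        volatile double x = 0.0;          /* stand-in workload */
        for (long i = 1; i <= 50000000L; i++)
            x += 1.0 / (double)i;
        double elapsed = MPI_Wtime() - start;

        printf("elapsed: %.6f s\n", elapsed);
        MPI_Finalize();
        return 0;
    }

Repeating such measurements across runs and ranks supports the statistical comparisons and the speedup-anomaly discussion above.
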
- Collective communications (2)
- Introduction to collective communications
- Distributed debugging
- Introduction to MPI scatter/gather operations (see the sketch following this topic block)
- Exploring the complete collective communication operations in MPI
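
A minimal sketch of the scatter/gather operations above: the root distributes one integer to each rank, every rank transforms its piece, and the root gathers the results (illustrative values; run with any number of ranks):

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char *argv[]) {
        int rank, size, mine = 0;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        int *data = NULL;
        if (rank == 0) {                     /* root prepares input */
            data = malloc(size * sizeof *data);
            for (int i = 0; i < size; i++)
                data[i] = i;
        }

        /* One element per rank out, one element per rank back. */
        MPI_Scatter(data, 1, MPI_INT, &mine, 1, MPI_INT, 0, MPI_COMM_WORLD);
        mine *= 2;                           /* local work */
        MPI_Gather(&mine, 1, MPI_INT, data, 1, MPI_INT, 0, MPI_COMM_WORLD);

        if (rank == 0) {
            for (int i = 0; i < size; i++)
                printf("result[%d] = %d\n", i, data[i]);
            free(data);
        }
        MPI_Finalize();
        return 0;
    }
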
- Scalability and performance (1)
- Understanding notions of scalability and performance
- Metrics of scalability and performance (standard definitions follow this topic block)
- Asymptotic analysis of scalability and performance
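
For reference, the standard definitions behind these metrics, where T_1 is the serial run time, T_p the run time on p processors, and f the inherently serial fraction of the work:

    S(p) = \frac{T_1}{T_p}, \qquad
    E(p) = \frac{S(p)}{p}, \qquad
    S(p) \le \frac{1}{f + (1 - f)/p} \quad \text{(Amdahl's law)}
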
- Exams/Reviews (1)
Learning Outcomes:
- Identify various forms of parallelism, their application, advantages, and drawbacks
- Describe the spectrum of parallelism available for high performance computing
- Compare and contrast the different forms of parallelism
- Identify applications that can take advantage of a given type of parallelism and vice versa
- Identify suitable hardware platforms on which the various forms of parallelism can be effectively realized
- Describe the concept of semantic gap as it pertains to high level languages and HPDC platforms
- Effectively utilize instruction level parallelism
- Describe the concept of instruction level parallelism
- Identify the sources of hazards that impact instruction level parallelism using a contemporary high level programming language
- Apply source-code level software transformations to minimize hazards and improve instruction level parallelism
- Compare performance effects of various source-code level software transformations using a performance profiler
- Effectively utilize multi-core CPUs and multithreading
- Describe the concept of multi-core architectures
- Describe the concepts of threads and distinguish between processes & threads
- Demonstrate the creation of threads using OpenMP compiler directives
- Demonstrate the process of converting a serial program to a data parallel application
- Demonstrate the process of converting a serial program to a task parallel application
- Describe race conditions and side effects
- Demonstrate the process of resolving race conditions using OpenMP critical sections (see the sketch following this outcome group)
- Describe the performance tradeoff of using critical sections
- Describe the process of identifying and using multiple independent critical sections
- Measure performance gains of multithreading
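
A minimal OpenMP sketch tying several of these outcomes together: a shared-accumulator race resolved first with a critical section, then with the faster reduction idiom (illustrative values; compile with -fopenmp):

    #include <omp.h>
    #include <stdio.h>

    int main(void) {
        const int n = 1000000;

        /* Critical section: correct, but serializes the update and
           limits speedup (the performance tradeoff noted above). */
        double sum1 = 0.0;
        #pragma omp parallel for
        for (int i = 1; i <= n; i++) {
            double term = 1.0 / i;
            #pragma omp critical
            sum1 += term;
        }

        /* Reduction: each thread keeps a private partial sum that is
           combined at the end, avoiding the race without a lock. */
        double sum2 = 0.0;
        #pragma omp parallel for reduction(+:sum2)
        for (int i = 1; i <= n; i++)
            sum2 += 1.0 / i;

        printf("critical: %.6f  reduction: %.6f\n", sum1, sum2);
        return 0;
    }
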
- Specify, trace, and implement parallel and distributed programs using the Message Passing Interface (MPI) that solve a stated problem in a clean, robust, efficient, and scalable manner
- Describe the SPMD programming model
- Trace, create, compile, and run an MPI parallel program on a contemporary supercomputing cluster using PBS
- Describe, trace, and implement programs that use MPI's point-to-point blocking communications
- Describe, trace, and implement programs that use MPI's non-blocking communications
- Describe, trace, and implement programs that use collective communications
- Describe, trace, and implement programs that use derived data types, including vector and structure derived data types (see the sketch following this outcome group)
- Use third-party libraries compatible with MPI to develop programs
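
A minimal sketch of a structure derived data type (the struct and its fields are hypothetical, for illustration; run with at least two ranks):

    #include <mpi.h>
    #include <stdio.h>
    #include <stddef.h>

    struct sample {        /* hypothetical record */
        int    id;
        double reading;
    };

    int main(int argc, char *argv[]) {
        int rank;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Describe the struct's memory layout to MPI. */
        int          lengths[2] = { 1, 1 };
        MPI_Aint     displs[2]  = { offsetof(struct sample, id),
                                    offsetof(struct sample, reading) };
        MPI_Datatype types[2]   = { MPI_INT, MPI_DOUBLE };
        MPI_Datatype sample_type;
        MPI_Type_create_struct(2, lengths, displs, types, &sample_type);
        MPI_Type_commit(&sample_type);

        struct sample s;
        if (rank == 0) {
            s.id = 7; s.reading = 3.14;
            MPI_Send(&s, 1, sample_type, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&s, 1, sample_type, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("received id=%d reading=%f\n", s.id, s.reading);
        }

        MPI_Type_free(&sample_type);
        MPI_Finalize();
        return 0;
    }
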
- Describe and empirically demonstrate concepts of parallel efficiency and scalability
- Describe the concepts of speedup, efficiency, and scalability
- Describe the analytical metrics of speedup, efficiency, and scalability
- Identify efficient and scalable parallel programs (or algorithms) using asymptotic time complexities
- Use a performance profiler to empirically measure and compare efficiency, scalability, and speedup metrics of parallel programs
- Use profile data to improve speedup, efficiency, and scalability of a parallel program
Graduate students:
Students taking the course for graduate credit will be expected to apply course concepts to solve computationally demanding problems, analyze experimental results, and draw inferences to verify hypotheses.