Course Overview
Learn the basics of OpenACC, a high-level, directive-based programming model for GPUs. This course is for anyone with some C/C++ or Fortran experience who is interested in accelerating the performance of their applications beyond the limits of CPU-only programming. In this course, you’ll learn:
- How to profile and optimize your CPU-only applications to identify hot spots for acceleration
- How to use OpenACC directives to GPU-accelerate your codebase
- How to optimize data movement between the CPU and GPU accelerator
Upon completion, you'll be ready to use OpenACC to GPU-accelerate CPU-only applications.
Please note that once a booking has been confirmed, it is non-refundable. This means that after you have confirmed your seat for an event, it cannot be cancelled and no refund will be issued, regardless of attendance.
Prerequisites
- Basic C/C++ or Fortran competency, including familiarity with variable types, loops, conditional statements, functions, and array manipulations.
- No previous knowledge of GPU programming is assumed.
Course Objectives
- Profile and optimize your CPU-only applications to identify hot spots for acceleration.
- Use OpenACC directives to GPU-accelerate your codebase.
- Optimize data movement between the CPU and GPU accelerator.
Outline: Fundamentals of Accelerated Computing with OpenACC (FACO)
Introduction
- Meet the instructor.
- Create an account at courses.nvidia.com/join.
Introduction to Parallel Programming
- Learn about parallelism in a conceptual way, as well as how to express it with OpenACC. Topics that will be covered are as follows:
- Introduction to parallelism
- The goals of OpenACC
- Basic parallelization of code using OpenACC
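As a rough illustration of what "basic parallelization of code using OpenACC" can look like, here is a minimal sketch in C. The function name `saxpy` and its signature are illustrative choices, not taken from the course material. The pragma takes effect only when the code is built with an OpenACC-capable compiler (for example `nvc -acc` or `gcc -fopenacc`, assuming one of those toolchains is installed); otherwise it is ignored and the loop runs sequentially with the same result.

```c
#include <assert.h>

/* Computes y[i] = a * x[i] + y[i] for i in [0, n).
 * Each iteration touches a different element, so the iterations are
 * independent and the loop is a natural candidate for parallelization.
 * The name `saxpy` and this signature are illustrative only. */
void saxpy(int n, float a, const float *restrict x, float *restrict y)
{
    assert(n >= 0);

    /* `parallel loop` asks the compiler to run the loop in parallel.
     * copyin: x is read-only on the device.
     * copy:   y is sent to the device and brought back afterwards. */
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

The only change relative to the sequential version is the single directive line, which is the main appeal of the directive-based approach: the original loop stays readable and still compiles without OpenACC.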
Profiling with OpenACC
- Learn how to build and compile an OpenACC code, the importance of profiling, and how to use the NVIDIA Nsight™ Systems profiler. Topics that will be covered are as follows:
- Compiling sequential and OpenACC code
- The importance of code profiling
- Profiling sequential and OpenACC multicore code
- Technical introduction to the code used in introductory modules
Introduction to OpenACC Directives
- Learn how to parallelize your code with OpenACC directives and understand the differences between the parallel, kernels, and loop directives. Topics that will be covered are as follows:
- The Parallel directive
- The Kernels directive
- The Loop directive
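The difference between the three directives can be sketched by writing the same loop three ways. This is a hedged illustration rather than course code: the function names are hypothetical, and the exact code a compiler generates for each variant depends on the toolchain.

```c
#include <assert.h>

/* Variant 1: `parallel loop`.
 * The programmer asserts that the loop is safe to parallelize and the
 * compiler generates parallel code for it without further analysis. */
void scale_parallel(int n, float s, float *restrict v)
{
    assert(n >= 0);
    #pragma acc parallel loop copy(v[0:n])
    for (int i = 0; i < n; ++i)
        v[i] *= s;
}

/* Variant 2: `kernels`.
 * The compiler analyzes the region and decides for itself which loops
 * are safe to parallelize; loops it cannot prove safe stay sequential. */
void scale_kernels(int n, float s, float *restrict v)
{
    assert(n >= 0);
    #pragma acc kernels copy(v[0:n])
    {
        for (int i = 0; i < n; ++i)
            v[i] *= s;
    }
}

/* Variant 3: a `parallel` region with a separate `loop` directive.
 * `parallel` alone starts redundant execution across gangs; the `loop`
 * directive is what shares the iterations among them. */
void scale_parallel_then_loop(int n, float s, float *restrict v)
{
    assert(n >= 0);
    #pragma acc parallel copy(v[0:n])
    {
        #pragma acc loop
        for (int i = 0; i < n; ++i)
            v[i] *= s;
    }
}
```

All three variants compute the same result. The practical trade-off is who carries the responsibility for correctness: with `parallel` it is the programmer, with `kernels` it is the compiler's dependence analysis.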
GPU Programming with OpenACC
- Learn about the differences between GPUs and multicore CPUs, and manage memory with CUDA Unified Memory. Topics that will be covered are as follows:
- Definition of a GPU
- Basic OpenACC data management
- CUDA Unified Memory
- Profiling GPU applications
Data Management with OpenACC
- Learn how to explicitly manage data movement with OpenACC data directives to reduce data transfers. Topics that will be covered are as follows:
- OpenACC data directive/clauses
- OpenACC structured data region
- OpenACC unstructured data region
- OpenACC update directive
- Data management with C/C++ Structs/Classes
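To make the idea of reducing data transfers more concrete, the following sketch wraps an iterative computation in a structured data region and uses the update directive to refresh a single host value per iteration. The function `smooth`, its parameters, and the smoothing stencil are all hypothetical and chosen only for illustration.

```c
#include <assert.h>

/* Repeatedly replaces each interior element of `a` with the average of
 * its two neighbours, recording the middle element after every pass.
 * `tmp` is scratch space of length n; `history` has length iters.
 * All names here are illustrative, not from the course material. */
void smooth(int n, int iters, float *restrict a, float *restrict tmp,
            float *restrict history)
{
    assert(n >= 3 && iters >= 0);
    const int mid = n / 2;

    /* Structured data region: `a` is copied to the device once on entry
     * and back once on exit; `tmp` is only allocated on the device.
     * Without this region, every compute construct below would move
     * the arrays back and forth on each iteration. */
    #pragma acc data copy(a[0:n]) create(tmp[0:n])
    {
        for (int it = 0; it < iters; ++it) {
            #pragma acc parallel loop present(a[0:n], tmp[0:n])
            for (int i = 1; i < n - 1; ++i)
                tmp[i] = 0.5f * (a[i - 1] + a[i + 1]);

            #pragma acc parallel loop present(a[0:n], tmp[0:n])
            for (int i = 1; i < n - 1; ++i)
                a[i] = tmp[i];

            /* update: pull just one element back to the host so the
             * host code on the next line sees the current value. */
            #pragma acc update self(a[mid:1])
            history[it] = a[mid];
        }
    }
}
```

An unstructured data region expresses the same lifetime with `#pragma acc enter data copyin(...)` and `#pragma acc exit data copyout(...)`, which is useful when allocation and deallocation happen in different functions, such as a C++ constructor and destructor.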
Loop Optimizations with OpenACC
- Understand the various levels of parallelism on a GPU and learn ways to extract more parallelism with OpenACC by optimizing loops in your code. Topics that will be covered are as follows:
- Seq/Auto clause
- Independent clause
- Reduction clause
- Collapse clause
- Tile clause
- Gang, Worker, Vector
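The loop clauses listed above can be illustrated with three small kernels. This is a sketch under assumptions: the function names, array layouts (row-major), and the tile size of 16 are illustrative choices, and the best clause and tile size for a real code depend on the hardware and should be confirmed by profiling.

```c
#include <assert.h>

/* reduction: every iteration adds into `sum`, which would be a race
 * without the clause. reduction(+:sum) gives each thread a private
 * partial sum and combines them at the end. */
double dot(int n, const double *restrict x, const double *restrict y)
{
    assert(n >= 0);
    double sum = 0.0;
    #pragma acc parallel loop reduction(+:sum) copyin(x[0:n], y[0:n])
    for (int i = 0; i < n; ++i)
        sum += x[i] * y[i];
    return sum;
}

/* tile: splits the tightly nested i/j loops into 16x16 blocks so that
 * neighbouring iterations work on nearby memory. collapse(2) could be
 * used in the same position to merge the two loops into one instead. */
void transpose(int rows, int cols, const float *restrict a,
               float *restrict t)
{
    assert(rows >= 0 && cols >= 0);
    #pragma acc parallel loop tile(16, 16) \
        copyin(a[0:rows * cols]) copyout(t[0:rows * cols])
    for (int i = 0; i < rows; ++i)
        for (int j = 0; j < cols; ++j)
            t[j * rows + i] = a[i * cols + j];
}

/* gang / vector: rows are distributed across gangs (coarse-grained),
 * and the inner dot product of each row is spread across vector lanes
 * (fine-grained) with its own reduction. */
void matvec(int rows, int cols, const float *restrict m,
            const float *restrict x, float *restrict y)
{
    assert(rows >= 0 && cols >= 0);
    #pragma acc parallel loop gang \
        copyin(m[0:rows * cols], x[0:cols]) copyout(y[0:rows])
    for (int i = 0; i < rows; ++i) {
        float acc = 0.0f;
        #pragma acc loop vector reduction(+:acc)
        for (int j = 0; j < cols; ++j)
            acc += m[i * cols + j] * x[j];
        y[i] = acc;
    }
}
```

The `seq`, `auto`, and `independent` clauses go in the same position as `gang` or `vector` on a loop directive: `seq` forces a loop to run sequentially, `independent` asserts that its iterations have no dependences, and `auto` leaves the decision to the compiler.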
Final Review
- Review key learnings and answer questions.
- Complete the assessment and earn a certificate.
- Complete the workshop survey.