SDST4013 Applied high-performance computing and parallel programming (6 credits) Academic Year 2025
Offering Department SCDS (Department of Statistics and Actuarial Science) Quota 15
Course Co-ordinator Prof L Qu, SCDS (Department of Statistics and Actuarial Science) <liangqqu@hku.hk>
Teachers Involved (Prof L Qu, Statistics & Actuarial Science)
Course Objectives High-performance computing (HPC) and parallel processing are ubiquitous in modern computing. This course aims to equip students with in-demand skills and knowledge in high-performance and parallel computing, together with hands-on parallel programming experience on real parallel machines and HPC systems. The course begins with an introduction to HPC, which helps students understand what HPC is and how to navigate real HPC systems. Next, students will learn the fundamentals of the hardware architectures that support HPC (e.g., shared-memory systems and distributed-memory clusters). Finally, parallel programming tools such as MPI, OpenMP and CUDA will be discussed in connection with domain-specific problems.
Course Contents & Topics The course will cover:
- Introduction to high-performance computing
- Basic C/C++ programming and common Linux commands
- Parallel programming basics
- Distributed memory programming with MPI
- Shared-memory programming with OpenMP
- GPU architecture and CUDA programming
Course Learning Outcomes
On successful completion of this course, students should be able to:

CLO 1 Gain foundational knowledge of HPC architectures and systems, including navigating a typical Linux-based HPC environment, understanding SLURM job scheduling, and comprehending basic HPC concepts, preparing students for future HPC interactions and usage.
CLO 2 Understand the fundamentals of parallel programming and acquire the ability to measure and analyze the performance of parallel systems, as well as assess and evaluate application scalability, including weak and strong scaling.
CLO 3 Explore distributed-memory parallel programming using MPI, enabling students to develop efficient parallel applications for distributed-memory systems.
CLO 4 Investigate shared-memory parallel programming with OpenMP, allowing students to harness the power of multi-core processors and shared-memory systems.
CLO 5 Learn CUDA programming for GPU acceleration, laying the groundwork for students to optimize computationally intensive tasks using GPUs.
CLO 6 Gain hands-on experience in designing, implementing, and optimizing HPC and parallel computing applications using real-world problems and datasets.
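For reference on CLO 2, strong and weak scaling are conventionally quantified as sketched below. These are standard textbook definitions, not material taken from the course itself. The symbols are notation assumed here for illustration: T_N(p) is the run time for problem size N on p processors, S(p) is the speedup, E(p) is the parallel efficiency, and f is the serial fraction of the work.

```latex
% Strong scaling: the total problem size N is fixed while p increases.
S(p) = \frac{T_{N}(1)}{T_{N}(p)}, \qquad
E_{\text{strong}}(p) = \frac{S(p)}{p}

% Amdahl's law: upper bound on strong-scaling speedup
% when a fraction f of the work is inherently serial.
S(p) \le \frac{1}{f + \frac{1 - f}{p}}

% Weak scaling: the problem size grows in proportion to p
% (fixed work per processor), so the ideal run time stays constant.
E_{\text{weak}}(p) = \frac{T_{N}(1)}{T_{pN}(p)}
```

In both cases an efficiency of 1 corresponds to ideal scaling; measured values typically fall below 1 because of communication, synchronization and load imbalance.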
Pre-requisites (and Co-requisites and Impermissible combinations)
Passed in (COMP2113 or COMP2119 or COMP2396) and (SDST3600 or SDST3612); and
Not for students who have passed APAI4013 or are already enrolled in this course.
Only for students admitted in 2025 and thereafter.
Course to PLO Mapping
Offer in 2025 - 2026 Y 2nd sem Examination May
Offer in 2026 - 2027 Y
Course Grade A+ to F
Grade Descriptors
A Demonstrate thorough mastery at an advanced level of extensive knowledge and skills required for attaining all the course learning outcomes. Show strong analytical and critical abilities and logical thinking, with evidence of original thought, and ability to apply knowledge to a wide range of complex, familiar and unfamiliar situations. Apply highly effective organizational and presentational skills.
B Demonstrate substantial command of a broad range of knowledge and skills required for attaining at least most of the course learning outcomes. Show evidence of analytical and critical abilities and logical thinking, and ability to apply knowledge to familiar and some unfamiliar situations. Apply effective organizational and presentational skills.
C Demonstrate general but incomplete command of knowledge and skills required for attaining most of the course learning outcomes. Show evidence of some analytical and critical abilities and logical thinking, and ability to apply knowledge to most familiar situations. Apply moderately effective organizational and presentational skills.
D Demonstrate partial but limited command of knowledge and skills required for attaining some of the course learning outcomes. Show evidence of some coherent and logical thinking, but with limited analytical and critical abilities. Show limited ability to apply knowledge to solve problems. Apply limited or barely effective organizational and presentational skills.
Fail Demonstrate little or no evidence of command of knowledge and skills required for attaining the course learning outcomes. Lack of analytical and critical abilities, logical and coherent thinking. Show very little or no ability to apply knowledge to solve problems. Organization and presentational skills are minimally effective or ineffective.
Communication-intensive Course N
Course Type Lecture-based course
Course Teaching & Learning Activities
Activities Details No. of Hours
Lectures 36.0
Tutorials 12.0
Reading / Self study 100.0
Assessment Methods and Weighting
Methods Details Weighting in final course grade (%) Assessment Methods to CLO Mapping
Assignments Coursework (assignments, tutorials, class test(s) and a project) 60.0 1,2,3,4,5,6
Examination One 2-hour written examination 40.0 1,2,3,4,5
Required/recommended reading and online materials
Hager G, Wellein G. Introduction to high performance computing for scientists and engineers[M]. CRC Press, 2010.
Barney B. Introduction to parallel computing[J]. Lawrence Livermore National Laboratory, 2010, 6(13): 10.
Course Website http://moodle.hku.hk
Additional Course Information