Enquiry for Course Details
ASAI4013 Applied high-performance computing and parallel programming (6 credits) Academic Year 2025
Offering Department SCDS (Department of Statistics and Actuarial Science) Quota 30
Course Co-ordinator Prof L Qu, SCDS (Department of Statistics and Actuarial Science) <liangqqu@hku.hk>
Teachers Involved (Prof L Qu, Statistics & Actuarial Science)
Course Objectives High-Performance Computing (HPC) and parallel processing are ubiquitous in modern computing. The aim of this course is to provide in-demand skills and knowledge in the field of high-performance and parallel computing with hands-on parallel programming experience on real parallel machines and HPC systems. The course will begin with an introduction to HPC, including SLURM job scheduling and fundamental HPC concepts, to prepare students to effectively use real HPC systems. Next, different parallel programming tools like MPI and OpenMP will be discussed in connection with domain-specific problems. Finally, students will explore CUDA programming for GPU acceleration, GPU architectures, and techniques for parallel training of deep neural networks.
Course Contents & Topics The course will cover:
- Introduction to high-performance computing
- Basic C/C++ programming and common Linux commands
- Parallel programming basics
- Distributed memory programming with MPI
- Shared-memory programming with OpenMP
- GPU architecture and CUDA programming
Course Learning Outcomes
On successful completion of this course, students should be able to:

CLO 1 Demonstrate foundational knowledge of HPC architectures and systems, including navigating a typical Linux-based HPC environment, understanding SLURM job scheduling, and comprehending basic HPC concepts, in preparation for future HPC use.
CLO 2 Understand the fundamentals of parallel programming and acquire the ability to measure and analyze the performance of parallel systems, as well as assess and evaluate application scalability, including weak and strong scaling.
CLO 3 Use distributed-memory parallel programming with MPI to develop efficient parallel applications for distributed-memory systems.
CLO 4 Use shared-memory parallel programming with OpenMP to harness the power of multi-core processors and shared-memory systems.
CLO 5 Use CUDA programming for GPU acceleration as groundwork for optimizing computationally intensive tasks on GPUs.
CLO 6 Gain hands-on experience in designing, implementing, and optimizing HPC and parallel computing applications using real-world problems and datasets.
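As a rough illustration of the strong and weak scaling measurements named in CLO 2, the sketch below computes speedup and parallel efficiency from wall-clock timings using the standard textbook definitions. It is not taken from the course materials: the function names and sample timings are hypothetical, and Python is used only to keep the arithmetic short.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Speedup S(p) = T(1) / T(p) for a fixed total problem size."""
    return t_serial / t_parallel


def strong_scaling_efficiency(t_serial: float, t_parallel: float, workers: int) -> float:
    """Strong scaling: problem size is fixed, so ideal speedup equals `workers`."""
    return speedup(t_serial, t_parallel) / workers


def weak_scaling_efficiency(t_base: float, t_scaled: float) -> float:
    """Weak scaling: work per worker is fixed, so ideal runtime stays at `t_base`."""
    return t_base / t_scaled


if __name__ == "__main__":
    # Hypothetical timings in seconds, for illustration only.
    print("strong-scaling efficiency on 4 workers:",
          strong_scaling_efficiency(120.0, 40.0, 4))
    print("weak-scaling efficiency:",
          weak_scaling_efficiency(50.0, 62.5))
```

An efficiency close to 1.0 indicates that adding workers is paying off; values well below 1.0 point to serial bottlenecks or communication overhead.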
Pre-requisites (and Co-requisites and Impermissible combinations)
Passed in (COMP2113 or COMP2119 or COMP2396) and (SDST3600 or SDST3612); and
Not for students who have passed SDST4013 or who are already enrolled in this course.
For BASc(AppliedAI) students only.
Only for students admitted in 2025 and thereafter.
Course Status with Related Major/Minor/Professional Core 2U000C00 Course not offered under any Major/Minor/Professional core
2025 Bachelor of Arts and Sciences in Applied Artificial Intelligence (Disciplinary Elective)
Course to PLO Mapping 2025 Bachelor of Arts and Sciences in Applied Artificial Intelligence <PLO 1,2>
Offer in 2025 - 2026 Y 2nd sem Examination May
Offer in 2026 - 2027 Y
Course Grade A+ to F
Grade Descriptors
A Demonstrate thorough mastery at an advanced level of extensive knowledge and skills required for attaining all the course learning outcomes. Show strong analytical and critical abilities and logical thinking, with evidence of original thought, and ability to apply knowledge to a wide range of complex, familiar and unfamiliar situations. Apply highly effective organizational and presentational skills.
B Demonstrate substantial command of a broad range of knowledge and skills required for attaining at least most of the course learning outcomes. Show evidence of analytical and critical abilities and logical thinking, and ability to apply knowledge to familiar and some unfamiliar situations. Apply effective organizational and presentational skills.
C Demonstrate general but incomplete command of knowledge and skills required for attaining most of the course learning outcomes. Show evidence of some analytical and critical abilities and logical thinking, and ability to apply knowledge to most familiar situations. Apply moderately effective organizational and presentational skills.
D Demonstrate partial but limited command of knowledge and skills required for attaining some of the course learning outcomes. Show evidence of some coherent and logical thinking, but with limited analytical and critical abilities. Show limited ability to apply knowledge to solve problems. Apply limited or barely effective organizational and presentational skills.
Fail Demonstrate little or no evidence of command of knowledge and skills required for attaining the course learning outcomes. Lack analytical and critical abilities and logical, coherent thinking. Show very little or no ability to apply knowledge to solve problems. Apply organizational and presentational skills that are minimally effective or ineffective.
Communication-intensive Course N
Course Type Lecture-based course
Course Teaching & Learning Activities
Activities | Details | No. of Hours
Lectures | | 36.0
Tutorials | | 12.0
Reading / Self study | | 100.0
Assessment Methods and Weighting
Methods | Details | Weighting in final course grade (%) | Assessment Methods to CLO Mapping
Assignments | Coursework (assignments, tutorials, class test(s) and a project) | 60.0 | 1,2,3,4,5,6
Examination | One 2-hour written examination | 40.0 | 1,2,3,4,5
Required/recommended reading and online materials
Hager G, Wellein G. Introduction to high performance computing for scientists and engineers[M]. CRC Press, 2010.
Barney B. Introduction to parallel computing[J]. Lawrence Livermore National Laboratory, 2010, 6(13): 10.
Course Website http://moodle.hku.hk
Additional Course Information