MTR and HKU Sign MoU on Railway Operation Big Data Analysis

The University of Hong Kong (HKU) and the MTR Corporation (MTR) signed a Memorandum of Understanding on December 19, 2018 to identify and collaborate on research issues of mutual interest in railway operation and maintenance engineering.


MTR is looking for leading technologies to advance its capabilities in data collection and analysis, predictive and prescriptive maintenance, condition-based monitoring systems, artificial intelligence and robotics for the operation of its railway network. HKU will use its technologies and other facilities in collaboration with MTR for this purpose.


For details, please visit:


HKU and Hospital Authority AI identifies 95% of large vessel occlusion stroke cases, with hope of diagnosis two hours earlier

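(Ming Pao News | May 8, 2018) HKU and the Hospital Authority collaborated last year to study, for the first time with big data, the records of 300 stroke patients from 2016. AI analysis of the patients' medical histories, radiological images and other data correctly identified 95% of the patients with large vessel occlusion stroke.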




A new Master in Data Science is scheduled to be offered in September 2018

The Master of Data Science (MDASC) is a taught master's programme jointly offered by the Department of Statistics and Actuarial Science (host) and the Department of Computer Science.


Its interdisciplinarity promotes the applications of computer technology, operational research, statistical modelling, and simulation to decision-making and problem-solving in all organizations and enterprises within the private and public sectors.


For details, please visit:


New Types of Talent, Technological Breakthroughs

From Recruit on November 10, 2017

Mobile app MPF OpAl: Your Portable Investment Guide

Designed by Dr Philip Yu of the HKU Department of Statistics and Actuarial Science, the powerful portfolio optimization software "PORTimizer®" and the MPF mobile app "MPF Optimal Allocation (MPF OpAl)" were demonstrated at the launch of the "HKU x Cyberport FinTech Nucleus" recently.


MPF Optimal Allocation (MPF OpAl) is a new mobile app that aims to provide statistical suggestions on how to re-allocate MPF investments so that users can better manage their own MPF for retirement. Users can view the monthly market price information of each MPF fund, record their own portfolios, track changes in their portfolios, optimize their portfolios according to their preferred equity contribution using the newly developed GPQ method, and project the future performance of the optimized portfolios.


It is hoped that this new mobile app can provide optimal allocation guidance to MPF holders.


MPF OpAl is currently being upgraded, and is expected to be released with new elements. Stay tuned for the latest updates.


HKU Press release:
About PORTimizer and MPF OpAl:

(The above news has also been covered in different media and newspapers)

FinTech Hackathon

Congratulations to our HKU students, Mr. Jia You (PhD candidate in Statistics), Mr. Renjie Lu (PhD candidate in Statistics), Mr. Brian Chan (MPhil candidate in Physics who plans to pursue a PhD in Statistics) and Mr. Haofeng Li (PhD in Computer Science), who teamed up and won the second runner-up prize of HKD10,000 in the "FinTech Hackathon", a competition organised by Hang Seng Bank on May 27-28, 2017. The competition drew over 110 contestants from eight different tertiary institutions.


Leveraging the "Haccelerator" platform co-organised by the Hong Kong Monetary Authority and Cyberport, this year's Hang Seng Bank FinTech Hackathon focused on applications of machine learning in banking. Each team of contestants had to pick one challenge to work on and showcase its project and prototype on stage after 24 hours. For more event details, please visit:

Access Articles on Big Data!

Technometrics publishes papers that describe new statistical techniques, illustrate innovative applications of known statistical methods, or review methods, issues, or philosophy in a particular area of statistics or science. Many of the problems in industry today concern the analysis of huge data sets that leads to improved quality or a better understanding of the manufacturing or development process. Recent advances in data acquisition technologies have led to massive amounts of data being collected routinely in the physical, chemical, and engineering sciences as well as in information science and technology. Because of their large volume and complicated structure, big data are difficult to handle using traditional database management and statistical analysis tools. This creates many new challenges for statisticians in describing and analyzing such data properly, and this issue aims to discuss those challenges.


Please enjoy Free Access to the special issue on Big Data until October 31, 2016.

Building on research strengths@HKU - Built  

Deep Learning for Text analytics

Most big data are unstructured, such as legal documents, news and social media data, resulting in a large amount of textual data. This project aims to develop new methodologies for analyzing textual data. Recently, Dr Philip Yu and his team developed a Bayesian learning algorithm for parsing, a natural language processing (NLP) method that performs syntactic analysis of sentences. The results have been published in Yu and Tang (2015). They are now developing advanced deep learning models for machine translation with external phrase memory, which they will apply to Chinese-English translation.


Yu, P.L.H. and Tang, Y. (2015). Bayesian finite mixture models for probabilistic context-free grammars. In Computational Linguistics and Intelligent Text Processing: Proceedings Part I of CICLing 2015 (A. Gelbukh, Ed.), Lecture Notes in Computer Science 9041, 201-212.

Analysis of High frequency financial data

A large amount of high-frequency financial data is available, including tick-by-tick traded prices and volumes, bid and ask orders, etc. Analysis of such high-frequency data has applications in volatility forecasting, risk management, portfolio selection and quantitative trading. One of the major problems is the modeling of high-dimensional realized covariance matrices. A paper on this topic (Yu, Li and Ng, 2016) has been accepted for publication in a top journal, the Journal of Business and Economic Statistics.
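
As a rough illustration of the object being modelled, a realized covariance matrix for one trading day is commonly computed as the sum of outer products of intraday return vectors. The sketch below uses made-up prices and hypothetical function names. It shows only this basic estimator under the assumption of prices observed on a common time grid, not the model developed in the cited paper.

```python
import math


def log_returns(prices):
    """Log returns between consecutive intraday prices of one asset."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def realized_covariance(price_series):
    """Sum of outer products of intraday return vectors.

    price_series: one price list per asset, all sampled on the same
    intraday time grid (an assumption of this sketch).
    """
    returns = [log_returns(p) for p in price_series]
    n = len(returns)
    return [
        [sum(x * y for x, y in zip(returns[i], returns[j])) for j in range(n)]
        for i in range(n)
    ]


# Toy intraday prices for two hypothetical assets.
prices_a = [100.0, 100.5, 100.2, 100.8]
prices_b = [50.0, 50.1, 49.9, 50.3]
rc = realized_covariance([prices_a, prices_b])
```

The diagonal entries of `rc` are the realized variances of the two assets and the off-diagonal entry is their realized covariance. With many assets the matrix becomes high-dimensional, which is the setting the paragraph above refers to.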


Professors W.K. Li, Jeff Yao and Dr Philip L.H. Yu are developing dynamic component models for high-dimensional realized covariance matrices. Work in the pipeline includes:

  1. Development of a generalized conditional autoregressive Wishart model for multivariate realized volatility;
  2. A method to forecast high-dimensional realized volatility matrices using a factor model;
  3. Development of a spiked model for large volatility matrix estimation from noisy high-frequency data.

At the same time, Dr Yu is exploring the feasibility of implementing deep learning in portfolio selection and quantitative trading.


Yu, P.L.H., Li, W.K. and Ng, F.C. (2016). The generalized conditional autoregressive Wishart model for multivariate stochastic volatility. To appear in Journal of Business and Economic Statistics.

Statistical/Machine Learning in Astrophysics

Recently, Dr Philip L.H. Yu has been working with Dr. Pablo Saz Parkinson on the classification and ranking of Fermi LAT gamma-ray sources using machine learning techniques. A paper was published in the Astrophysical Journal in 2016. Although the data set studied in this paper is not very large, the work opens up a new area of multidisciplinary research, as a huge amount of data on the characteristics of the sources is collected around the world.


Saz Parkinson, P.M., Xu, H., Yu, P.L.H., Salvetti, D., Marelli, M. and Falcone, A.D. (2016). Classification and ranking of Fermi LAT Gamma-ray sources from the 3FGL catalog using machine learning techniques. Astrophysical Journal. 820(1), 8.

Mixed-state Markov models in image motion analysis

Digital images and video sequences are typical big data, since they are large in volume and endowed with a variety of complex structures. Image analysis for these data has been a very active discipline for several decades. One powerful tool for analyzing a video sequence is to extract the dynamic changes from one frame to the next by computing the time-differentials between subsequent images, that is, the image motions. In a series of works, we have introduced a novel class of stochastic models for such motion fields from image sequences, namely mixed-state statistical models. The development of a complete theory for these models has made it possible to take spatial and temporal interaction into account. These mixed-state models are then used for motion texture analysis based on instantaneous apparent motion maps extracted from dynamic textures. The fitted mixed-state models are further used for motion texture recognition, classification and tracking.
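
The time-differential step described above can be sketched in a few lines. This is a minimal illustration on tiny made-up grayscale frames with hypothetical function names. It covers only the frame-to-frame difference, not the mixed-state models themselves.

```python
def frame_difference(prev_frame, next_frame):
    """Pixel-wise time differential between two equally sized frames."""
    return [
        [b - a for a, b in zip(row_prev, row_next)]
        for row_prev, row_next in zip(prev_frame, next_frame)
    ]


def mean_absolute_motion(motion_map):
    """Average magnitude of the differential over all pixels."""
    values = [abs(v) for row in motion_map for v in row]
    return sum(values) / len(values)


# Two toy 2x3 frames: a bright pixel moves one position to the right.
frame_0 = [[10, 10, 10],
           [10, 20, 10]]
frame_1 = [[10, 10, 10],
           [10, 10, 20]]

motion = frame_difference(frame_0, frame_1)
activity = mean_absolute_motion(motion)
```

Here `motion` is zero wherever nothing changed and non-zero only at the two pixels affected by the moving bright spot.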


An example of classification of video sequences through the corresponding motion maps is given in the picture below. The picture shows a small sample from a large library of video sequences, where one representative image (cover image) is displayed for each video. Here the whole set of videos is to be classified into meaningful classes according to some similarity measures. These similarity measures are based on the corresponding image motion maps, analysed using our new models.