Seminar by Mr. Junyang LIN from Alibaba Group
Date: Tuesday, 5 December 2023
Time: 10:30 a.m. – 11:30 a.m.
Venue: RR301, Run Run Shaw Building

Title: Qwen: Towards a generalist model
Abstract:
This talk introduces Qwen (Tongyi Qianwen, 通义千问), the series of large language and multimodal models published and open-sourced by Alibaba Group. The Qwen models have achieved competitive performance against both open-source and proprietary LLMs and LMMs in benchmark and human evaluations. The talk gives a brief overview of the model series and then delves into the details of building the LLMs and LMMs, including pretraining, alignment, multimodal extension, and open-sourcing. It also points out current limitations and discusses future work for both the research community and industry in this field.
About the speaker:
Junyang Lin is a staff engineer at Alibaba Group and currently a leader of the Qwen Team. His research focuses on natural language processing and multimodal representation learning, with an emphasis on large-scale pretraining, and his work has received around 3,000 citations. Recently, his team released and open-sourced the Qwen series, including the large language model Qwen, the large vision-language model Qwen-VL, and the large audio-language model Qwen-Audio. Previously, he worked on large-scale pretraining with a focus on multimodal pretraining and developed the open-source models OFA and Chinese-CLIP, among others.
Paper: https://arxiv.org/abs/2309.16609
GitHub: https://github.com/QwenLM