DPhil in Computer Vision, University of Oxford
January 2024 – Present
Supervisor: Prof. Philip Torr
Welcome to my website.
My research focuses on multimodal foundation models, including multimodal large language models and generative models, and on how the two interact.
I am particularly interested in a vision-first path toward general intelligence, where visual pre-training plays a central role.
Recent representative papers are highlighted.