DPhil in Computer Vision, University of Oxford
January 2024 – Present
Supervisor: Prof. Philip Torr
Welcome to my website.
My research focuses on multimodal foundation models, including multimodal large language models and generative models, as well as the interactions between them.
I am particularly interested in multimodal approaches to general intelligence, especially how vision, through large-scale pre-training, can contribute to its development.
Recent representative papers are highlighted.