Publications
(* indicates equal contribution)
MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge
Linxi Fan,
Guanzhi Wang*,
Yunfan Jiang*,
Ajay Mandlekar,
Yuncong Yang,
Haoyi Zhu,
Andrew Tang,
De-An Huang,
Yuke Zhu†,
Anima Anandkumar†
Neural Information Processing Systems (NeurIPS) Datasets and Benchmarks Track, 2022
✨ Outstanding Paper Award ✨
[paper]  
[project page]  
[code]  
We introduced MineDojo, a new framework based on the popular Minecraft game for building generally capable, open-ended embodied agents.
VIMA: General Robot Manipulation with Multimodal Prompts
Yunfan Jiang,
Agrim Gupta*,
Zichen "Charles" Zhang*,
Guanzhi Wang*,
Yongqiang Dou,
Yanjun Chen,
Li Fei-Fei,
Anima Anandkumar,
Yuke Zhu,
Linxi "Jim" Fan
Neural Information Processing Systems (NeurIPS) Foundation Models for Decision Making Workshop, 2022 (Oral Presentation)
[paper]  
[project page]  
[code]  
We introduced a novel multimodal prompting formulation that converts diverse robot manipulation tasks into a uniform sequence modeling problem.
SECANT: Self-Expert Cloning for Zero-Shot Generalization of Visual Policies
Linxi Fan,
Guanzhi Wang,
De-An Huang,
Zhiding Yu,
Li Fei-Fei,
Yuke Zhu,
Anima Anandkumar
International Conference on Machine Learning (ICML), 2021
[paper]  
[project page]  
[code]  
We proposed SECANT, a novel self-expert cloning technique that leverages image augmentation in two stages to decouple robust representation learning from policy optimization.
iGibson 1.0: a Simulation Environment for Interactive Tasks in Large Realistic Scenes
Bokui Shen*,
Fei Xia*,
Chengshu Li*,
Roberto Martín-Martín*,
Linxi Fan,
Guanzhi Wang,
Claudia D’Arpino,
Shyamal Buch,
Sanjana Srivastava,
Lyne P. Tchapmi,
Micael E. Tchapmi,
Kent Vainio,
Josiah Wong,
Li Fei-Fei,
Silvio Savarese
International Conference on Intelligent Robots and Systems (IROS), 2021
[paper]  
[project page]  
[code]  
We presented iGibson, a novel simulation environment for developing interactive robotic agents in large-scale realistic scenes.
Deep Video Matting via Spatio-Temporal Alignment and Aggregation
Yanan Sun,
Guanzhi Wang*,
Qiao Gu*,
Chi-Keung Tang,
Yu-Wing Tai
Conference on Computer Vision and Pattern Recognition (CVPR), 2021
[paper]  
[code]  
[dataset]
We proposed a deep learning-based video matting framework that employs a novel spatio-temporal feature aggregation module.
RubiksNet: Learnable 3D-Shift for Efficient Video Action Recognition
Linxi Fan*,
Shyamal Buch*,
Guanzhi Wang,
Ryan Cao,
Yuke Zhu,
Juan Carlos Niebles,
Li Fei-Fei
European Conference on Computer Vision (ECCV), 2020
[paper]  
[project page]  
[video]  
[supplementary]  
[code]  
We proposed RubiksNet, an efficient architecture for video action recognition built on a learnable 3D spatiotemporal shift operation (RubiksShift).
LADN: Local Adversarial Disentangling Network for Facial Makeup and De-Makeup
Qiao Gu*,
Guanzhi Wang*,
Mang Tik Chiu,
Yu-Wing Tai,
Chi-Keung Tang
International Conference on Computer Vision (ICCV), 2019
[paper]  
[project page]  
[code]  
[dataset]
We proposed a local adversarial disentangling network for facial makeup and de-makeup, which uses multiple overlapping local discriminators within a content-style disentangling network.
Conference Reviewer: NeurIPS 2022, ICLR 2022, ICCV 2021, CVPR 2021, ECCV 2020
Awards
NeurIPS Outstanding Paper Award (2022)
Kortschak Scholar (2021)
Stanford Human-Centered AI Google Cloud Credits Grant (2021)
Stanford Human-Centered AI AWS Cloud Credits Award (2020)
HKUST Academic Achievement Medal (2019)
Talent Development Scholarship (2019)
Reaching Out Award (2018)
High Fashion Charitable Foundation Exchange Scholarship (2018)
Overseas Learning Experience Scholarship (2018)
Dean’s List (2015-2019)
University Recruitment Scholarship (2015-2019)