Xiaolong Wang

Associate Professor, UC San Diego [GitHub] [Google Scholar] [CV]

Selected Papers

Ri-Zhao Qiu*, Shiqi Yang*, Xuxin Cheng*, Chaitanya Chawla, Jialong Li, Tairan He, Ge Yan, David J. Yoon, Ryan Hoque, Lars Paulsen, Ge Yang, Jian Zhang, Sha Yi, Guanya Shi, Xiaolong Wang.
Humanoid Policy ~ Human Policy.
Conference on Robot Learning (CoRL), 2025.

[arXiv] [project page] [code] [data]

Sha Yi*, Xueqian Bai*, Adabhav Singh, Jianglong Ye, Michael T Tolley, Xiaolong Wang.
Co-Design of Soft Gripper with Neural Physics.
Conference on Robot Learning (CoRL), 2025.

[arXiv] [project page]

Jianglong Ye*, Keyi Wang*, Chengjing Yuan, Ruihan Yang, Yiquan Li, Jiyue Zhu, Yuzhe Qin, Xueyan Zou, and Xiaolong Wang.
Dex1B: Learning with 1B Demonstrations for Dexterous Manipulation.
Robotics: Science and Systems (RSS), 2025.

[arXiv] [project page]

Yu Sun*, Xinhao Li*, Karan Dalal*, Jiarui Xu, Arjun Vikram, Genghan Zhang, Yann Dubois, Xinlei Chen†, Xiaolong Wang†, Sanmi Koyejo†, Tatsunori Hashimoto†, Carlos Guestrin†.
Learning to (Learn at Test Time): RNNs with Expressive Hidden States.
International Conference on Machine Learning (ICML), 2025.
Spotlight

[arXiv] [jax code] [pytorch code]

Jialong Li, Xuxin Cheng, Tianshu Huang, Shiqi Yang, Ri-Zhao Qiu, Xiaolong Wang.
AMO: Adaptive Motion Optimization for Hyper-Dexterous Humanoid Whole-Body Control.
Robotics: Science and Systems (RSS), 2025.

[arXiv] [project page] [models and code]

An-Chieh Cheng*, Yandong Ji*, Zhaojing Yang*, Zaitian Gongye, Xueyan Zou, Jan Kautz, Erdem Bıyık, Hongxu Yin, Sifei Liu, Xiaolong Wang.
NaVILA: Legged Robot Vision-Language-Action Model for Navigation.
Robotics: Science and Systems (RSS), 2025.

[arXiv] [project page] [locomotion code]

Karan Dalal*, Daniel Koceja*, Gashon Hussein*, Jiarui Xu*, Yue Zhao+, Youjin Song+, Shihao Han, Ka Chun Cheung, Jan Kautz, Carlos Guestrin, Tatsunori Hashimoto, Sanmi Koyejo, Yejin Choi, Yu Sun, Xiaolong Wang.
One-Minute Video Generation with Test-Time Training.
Conference on Computer Vision and Pattern Recognition (CVPR), 2025.

[arXiv] [project page] [code]

Chenhao Lu*, Xuxin Cheng*, Jialong Li*, Shiqi Yang, Mazeyu Ji, Chengjing Yuan, Ge Yang, Sha Yi, Xiaolong Wang.
Mobile-TeleVision: Predictive Motion Priors for Humanoid Whole-Body Control.
International Conference on Robotics and Automation (ICRA), 2025.

[arXiv] [project page] [code]

Isabella Liu, Hao Su†, Xiaolong Wang†.
Dynamic Gaussians Mesh: Consistent Mesh Reconstruction from Monocular Videos.
International Conference on Learning Representations (ICLR), 2025.

[arXiv] [project page] [code]

Jun Wang*, Ying Yuan*, Haichuan Che*, Haozhi Qi*, Yi Ma, Jitendra Malik, Xiaolong Wang.
Lessons from Learning to Spin “Pens”.
Conference on Robot Learning (CoRL), 2024.

[arXiv] [project page] [code]

Xuxin Cheng*, Jialong Li*, Shiqi Yang, Ge Yang, Xiaolong Wang.
Open-TeleVision: Teleoperation with Immersive Active Visual Feedback.
Conference on Robot Learning (CoRL), 2024.

[arXiv] [project page] [code] [hardware] [dataset]

Minghuan Liu*, Zixuan Chen*, Xuxin Cheng, Yandong Ji, Ruihan Yang, Xiaolong Wang.
Visual Whole-Body Control for Legged Loco-Manipulation.
Conference on Robot Learning (CoRL), 2024.
Oral Presentation

[arXiv] [project page] [code]

Adrian Remonda, Nicklas Hansen, Ayoub Raji, Nicola Musiu, Marko Bertogna, Eduardo E. Veas, Xiaolong Wang.
A Simulation Benchmark for Autonomous Racing with Large-Scale Human Data.
Conference on Neural Information Processing Systems (NeurIPS), 2024.

[arXiv] [project page] [code] [dataset]

An-Chieh Cheng, Hongxu Yin, Yang Fu, Qiushan Guo, Ruihan Yang, Jan Kautz, Xiaolong Wang, Sifei Liu.
SpatialRGPT: Grounded Spatial Reasoning in Vision-Language Models.
Conference on Neural Information Processing Systems (NeurIPS), 2024.

[arXiv] [project page] [code]

Xuxin Cheng*, Yandong Ji*, Junming Chen, Ruihan Yang, Ge Yang, Xiaolong Wang.
Expressive Whole-Body Control for Humanoid Robots.
Robotics: Science and Systems (RSS), 2024.

[arXiv] [project page] [code]

Ruihan Yang, Yejin Kim, Aniruddha Kembhavi, Xiaolong Wang, Kiana Ehsani.
Harmonic Mobile Manipulation.
International Conference on Intelligent Robots and Systems (IROS), 2024.
Oral Presentation
Best Paper Award on Mobile Manipulation

[arXiv] [project page] [code]

Yinbo Chen, Oliver Wang, Richard Zhang, Eli Shechtman, Xiaolong Wang†, Michael Gharbi†.
Image Neural Field Diffusion Models.
Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
Highlight

[arXiv] [project page] [code]

Yang Fu, Sifei Liu, Amey Kulkarni, Jan Kautz, Alexei A. Efros, Xiaolong Wang.
COLMAP-Free 3D Gaussian Splatting.
Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
Highlight

[arXiv] [project page] [code]

Ying Yuan*, Haichuan Che*, Yuzhe Qin*, Binghao Huang, Zhao-Heng Yin, Kang-Won Lee, Yi Wu, Soo-Chul Lim, Xiaolong Wang.
Robot Synesthesia: In-Hand Manipulation with Visuotactile Sensing.
International Conference on Robotics and Automation (ICRA), 2024.

[arXiv] [project page] [code]

Nicklas Hansen, Hao Su*, Xiaolong Wang*.
TD-MPC2: Scalable, Robust World Models for Continuous Control.
International Conference on Learning Representations (ICLR), 2024.
Spotlight

[arXiv] [project page] [code] [models] [dataset]

Binghao Huang*, Yuanpei Chen*, Tianyu Wang, Yuzhe Qin, Yaodong Yang, Nikolay Atanasov, Xiaolong Wang.
Dynamic Handover: Throw and Catch with Bimanual Hands.
Conference on Robot Learning (CoRL), 2023.

[arXiv] [project page] [code]

Yanjie Ze*, Ge Yan*, Yueh-Hua Wu*, Annabella Macaluso, Yuying Ge, Jianglong Ye, Nicklas Hansen, Li Erran Li, Xiaolong Wang.
GNFactor: Multi-Task Real Robot Learning with Generalizable Neural Feature Fields.
Conference on Robot Learning (CoRL), 2023.
Oral Presentation

[arXiv] [project page] [code]

Yunhai Feng*, Nicklas Hansen*, Ziyan Xiong*, Chandramouli Rajagopalan, Xiaolong Wang.
Finetuning Offline World Models in the Real World.
Conference on Robot Learning (CoRL), 2023.
Oral Presentation

[arXiv] [project page] [code]

Yuzhe Qin, Wei Yang, Binghao Huang, Karl Van Wyk, Hao Su, Xiaolong Wang, Yu-Wei Chao, Dieter Fox.
AnyTeleop: A General Vision-Based Dexterous Robot Arm-Hand Teleoperation System.
Robotics: Science and Systems (RSS), 2023.

[arXiv] [project page] [retargeting code] [visualizer code]

Zhao-Heng Yin*, Binghao Huang*, Yuzhe Qin, Qifeng Chen, Xiaolong Wang.
Rotating without Seeing: Towards In-hand Dexterity through Touch.
Robotics: Science and Systems (RSS), 2023.

[arXiv] [project page] [video] [code]

Ruihan Yang, Ge Yang, Xiaolong Wang.
Neural Volumetric Memory for Visual Locomotion Control.
Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
Highlight

[arXiv] [project page] [video]

Jiarui Xu, Sifei Liu*, Arash Vahdat*, Wonmin Byeon, Xiaolong Wang, Shalini De Mello.
Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models.
Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
Highlight

[arXiv] [project page] [video] [demo] [code]

Yuzhe Qin*, Binghao Huang*, Zhao-Heng Yin, Hao Su, Xiaolong Wang.
DexPoint: Generalizable Point Cloud Reinforcement Learning for Sim-to-Real Dexterous Manipulation.
Conference on Robot Learning (CoRL), 2022.

[arXiv] [project page] [video] [code]

Sateesh Kumar, Jonathan Zamora*, Nicklas Hansen*, Rishabh Jangir, Xiaolong Wang.
Graph Inverse Reinforcement Learning from Diverse Videos.
Conference on Robot Learning (CoRL), 2022.
Oral Presentation

[arXiv] [project page] [video] [code]

Yang Fu, Xiaolong Wang.
Category-Level 6D Object Pose Estimation in the Wild: A Semi-Supervised Learning Approach and A New Dataset.
Conference on Neural Information Processing Systems (NeurIPS), 2022.

[arXiv] [project page] [dataset] [Wild6D code]

Yuzhe Qin*, Yueh-Hua Wu*, Shaowei Liu, Hanwen Jiang, Ruihan Yang, Yang Fu, Xiaolong Wang.
DexMV: Imitation Learning for Dexterous Manipulation from Human Videos.
European Conference on Computer Vision (ECCV), 2022.

[arXiv] [project page] [video] [code]

Yuzhe Qin, Hao Su*, Xiaolong Wang*.
From One Hand to Multiple Hands: Imitation Learning for Dexterous Manipulation from Single-Camera Teleoperation.
Robotics and Automation Letters (RA-L), 2022.
International Conference on Intelligent Robots and Systems (IROS), 2022.

[arXiv] [project page] [video] [code]

Nicklas Hansen, Xiaolong Wang*, Hao Su*.
Temporal Difference Learning for Model Predictive Control.
International Conference on Machine Learning (ICML), 2022.

[arXiv] [project page] [video] [code]

Xuanchi Ren, Xiaolong Wang.
Look Outside the Room: Synthesizing A Consistent Long-Term 3D Scene Video from A Single Image.
Conference on Computer Vision and Pattern Recognition (CVPR), 2022.

[arXiv] [project page] [video] [code]

Jiarui Xu, Shalini De Mello, Sifei Liu, Wonmin Byeon, Thomas Breuel, Jan Kautz, Xiaolong Wang.
GroupViT: Semantic Segmentation Emerges from Text Supervision.
Conference on Computer Vision and Pattern Recognition (CVPR), 2022.

[arXiv] [project page] [video] [code] [huggingface colab] [huggingface demo]

Ruihan Yang*, Minghao Zhang*, Nicklas Hansen, Huazhe Xu, Xiaolong Wang.
Learning Vision-Guided Quadrupedal Locomotion End-to-End with Cross-Modal Transformers.
International Conference on Learning Representations (ICLR), 2022.
Spotlight Presentation

[arXiv] [project page] [video] [code]

Zihang Lai, Sifei Liu, Alexei A. Efros, Xiaolong Wang.
Video Autoencoder: self-supervised disentanglement of static 3D structure and motion.
International Conference on Computer Vision (ICCV), 2021.
Oral Presentation

[arXiv] [project page] [code] [video]

Jiarui Xu, Xiaolong Wang.
Rethinking Self-supervised Correspondence Learning: A Video Frame-level Similarity Perspective.
International Conference on Computer Vision (ICCV), 2021.
Oral Presentation

[arXiv] [project page] [code]

Hanwen Jiang*, Shaowei Liu*, Jiashun Wang, Xiaolong Wang.
Hand-Object Contact Consistency Reasoning for Human Grasps Generation.
International Conference on Computer Vision (ICCV), 2021.
Oral Presentation

[arXiv] [project page] [code]

Yinbo Chen, Sifei Liu, Xiaolong Wang.
Learning Continuous Image Representation with Local Implicit Image Function.
Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
Oral Presentation

[arXiv] [project page] [code]

Qiang Zhang, Tete Xiao, Alexei A. Efros, Lerrel Pinto, Xiaolong Wang.
Learning Cross-domain Correspondence for Control with Dynamics Cycle-consistency.
International Conference on Learning Representations (ICLR), 2021.
Oral Presentation

[arXiv] [project page] [code]

Nicklas Hansen, Rishabh Jangir, Yu Sun, Guillem Alenyà, Pieter Abbeel, Alexei A. Efros, Lerrel Pinto, Xiaolong Wang.
Self-Supervised Policy Adaptation during Deployment.
International Conference on Learning Representations (ICLR), 2021.
Spotlight Presentation

[arXiv] [project page] [code] [bair blog post]

Yu Sun, Xiaolong Wang, Zhuang Liu, John Miller, Alexei A. Efros, Moritz Hardt.
Test-Time Training with Self-Supervision for Generalization under Distribution Shifts.
International Conference on Machine Learning (ICML), 2020.

[arXiv] [code and project page] [BibTeX]

Xiaolong Wang*, Allan Jabri* and Alexei A. Efros.
Learning Correspondence from the Cycle-consistency of Time.
Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
Oral Presentation

[project page] [slides] [result video] [oral talk]
[arXiv] [BibTeX] [code]

Xiaolong Wang, Ross Girshick, Abhinav Gupta, and Kaiming He.
Non-local Neural Networks.
Conference on Computer Vision and Pattern Recognition (CVPR), 2018.

[arXiv] [BibTeX] [code]

Xiaolong Wang*, Rohit Girdhar*, and Abhinav Gupta.
Binge Watching: Scaling Affordance Learning from Sitcoms.
Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
Spotlight Presentation

[pdf] [BibTeX] [dataset] [project page] [spotlight video]

Xiaolong Wang and Abhinav Gupta.
Unsupervised Learning of Visual Representations using Videos.
International Conference on Computer Vision (ICCV), 2015.

[pdf] [BibTeX] [code] [model] [mined_patches] [project page] [spotlight video]