Xiaolong Wang

Assistant Professor, UC San Diego [GitHub] [Google Scholar] [CV]

Selected Papers


Chenhao Lu*, Xuxin Cheng*, Jialong Li*, Shiqi Yang, Mazeyu Ji, Chengjing Yuan, Ge Yang, Sha Yi, Xiaolong Wang.
Mobile-TeleVision: Predictive Motion Priors for Humanoid Whole-Body Control.
International Conference on Robotics and Automation (ICRA), 2025.

[arXiv] [project page] [code]

Isabella Liu, Hao Su†, Xiaolong Wang†.
Dynamic Gaussians Mesh: Consistent Mesh Reconstruction from Monocular Videos.
International Conference on Learning Representations (ICLR), 2025.

[arXiv] [project page] [code]

Shiqi Yang, Minghuan Liu, Yuzhe Qin, Runyu Ding, Jialong Li, Xuxin Cheng, Ruihan Yang, Sha Yi, Xiaolong Wang.
ACE: A Cross-Platform Visual-Exoskeletons System for Low-Cost Dexterous Teleoperation.
Conference on Robot Learning (CoRL), 2024.

[arXiv] [project page] [code] [hardware design]

Jun Wang*, Ying Yuan*, Haichuan Che*, Haozhi Qi*, Yi Ma, Jitendra Malik, Xiaolong Wang.
Lessons from Learning to Spin “Pens”.
Conference on Robot Learning (CoRL), 2024.

[arXiv] [project page] [code]

Xuxin Cheng*, Jialong Li*, Shiqi Yang, Ge Yang, Xiaolong Wang.
Open-TeleVision: Teleoperation with Immersive Active Visual Feedback.
Conference on Robot Learning (CoRL), 2024.

[arXiv] [project page] [code] [hardware] [dataset]

Minghuan Liu*, Zixuan Chen*, Xuxin Cheng, Yandong Ji, Ruihan Yang, Xiaolong Wang.
Visual Whole-Body Control for Legged Loco-Manipulation.
Conference on Robot Learning (CoRL), 2024.
Oral Presentation

[arXiv] [project page] [code]

Adrian Remonda, Nicklas Hansen, Ayoub Raji, Nicola Musiu, Marko Bertogna, Eduardo E. Veas, Xiaolong Wang.
A Simulation Benchmark for Autonomous Racing with Large-Scale Human Data.
Conference on Neural Information Processing Systems (NeurIPS), 2024.

[arXiv] [project page] [code] [dataset]

An-Chieh Cheng, Hongxu Yin, Yang Fu, Qiushan Guo, Ruihan Yang, Jan Kautz, Xiaolong Wang, Sifei Liu.
SpatialRGPT: Grounded Spatial Reasoning in Vision-Language Models.
Conference on Neural Information Processing Systems (NeurIPS), 2024.

[arXiv] [project page]

Yu Sun*, Xinhao Li*, Karan Dalal*, Jiarui Xu, Arjun Vikram, Genghan Zhang, Yann Dubois, Xinlei Chen†, Xiaolong Wang†, Sanmi Koyejo†, Tatsunori Hashimoto†, Carlos Guestrin†.
Learning to (Learn at Test Time): RNNs with Expressive Hidden States.
arXiv, 2024.

[arXiv] [jax code] [pytorch code]

Runyu Ding*, Yuzhe Qin*, Jiyue Zhu*, Chengzhe Jia, Shiqi Yang, Ruihan Yang, Xiaojuan Qi, Xiaolong Wang.
Bunny-VisionPro: Real-Time Bimanual Dexterous Teleoperation for Imitation Learning.
arXiv, 2024.

[arXiv] [project page] [code] [Hand URDFs]

Xuxin Cheng*, Yandong Ji*, Junming Chen, Ruihan Yang, Ge Yang, Xiaolong Wang.
Expressive Whole-Body Control for Humanoid Robots.
Robotics: Science and Systems (RSS), 2024.

[arXiv] [project page] [code]

Ruihan Yang, Yejin Kim, Aniruddha Kembhavi, Xiaolong Wang, Kiana Ehsani.
Harmonic Mobile Manipulation.
International Conference on Intelligent Robots and Systems (IROS), 2024.
Oral Presentation
Best Paper Award on Mobile Manipulation

[arXiv] [project page]

Yinbo Chen, Oliver Wang, Richard Zhang, Eli Shechtman, Xiaolong Wang†, Michael Gharbi†.
Image Neural Field Diffusion Models.
Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
Highlight

[arXiv] [project page]

Yang Fu, Sifei Liu, Amey Kulkarni, Jan Kautz, Alexei A. Efros, Xiaolong Wang.
COLMAP-Free 3D Gaussian Splatting.
Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
Highlight

[arXiv] [project page] [code]

Jiarui Xu, Xingyi Zhou, Shen Yan, Xiuye Gu, Anurag Arnab, Chen Sun, Xiaolong Wang, Cordelia Schmid.
Pixel Aligned Language Models.
Conference on Computer Vision and Pattern Recognition (CVPR), 2024.

[arXiv] [project page] [code]

Ying Yuan*, Haichuan Che*, Yuzhe Qin*, Binghao Huang, Zhao-Heng Yin, Kang-Won Lee, Yi Wu, Soo-Chul Lim, Xiaolong Wang.
Robot Synesthesia: In-Hand Manipulation with Visuotactile Sensing.
International Conference on Robotics and Automation (ICRA), 2024.

[arXiv] [project page] [code]

Nicklas Hansen, Hao Su*, Xiaolong Wang*.
TD-MPC2: Scalable, Robust World Models for Continuous Control.
International Conference on Learning Representations (ICLR), 2024.
Spotlight

[arXiv] [project page] [code] [models] [dataset]

Binghao Huang*, Yuanpei Chen*, Tianyu Wang, Yuzhe Qin, Yaodong Yang, Nikolay Atanasov, Xiaolong Wang.
Dynamic Handover: Throw and Catch with Bimanual Hands.
Conference on Robot Learning (CoRL), 2023.

[arXiv] [project page] [code]

Yanjie Ze*, Ge Yan*, Yueh-Hua Wu*, Annabella Macaluso, Yuying Ge, Jianglong Ye, Nicklas Hansen, Li Erran Li, Xiaolong Wang.
GNFactor: Multi-Task Real Robot Learning with Generalizable Neural Feature Fields.
Conference on Robot Learning (CoRL), 2023.
Oral Presentation

[arXiv] [project page] [code]

Yunhai Feng*, Nicklas Hansen*, Ziyan Xiong*, Chandramouli Rajagopalan, Xiaolong Wang.
Finetuning Offline World Models in the Real World.
Conference on Robot Learning (CoRL), 2023.
Oral Presentation

[arXiv] [project page] [code]

Yuzhe Qin, Wei Yang, Binghao Huang, Karl Van Wyk, Hao Su, Xiaolong Wang, Yu-Wei Chao, Dieter Fox.
AnyTeleop: A General Vision-Based Dexterous Robot Arm-Hand Teleoperation System.
Robotics: Science and Systems (RSS), 2023.

[arXiv] [project] [retargeting code] [visualizer code]

Zhao-Heng Yin*, Binghao Huang*, Yuzhe Qin, Qifeng Chen, Xiaolong Wang.
Rotating without Seeing: Towards In-hand Dexterity through Touch.
Robotics: Science and Systems (RSS), 2023.

[arXiv] [project page] [video] [code]

Ruihan Yang, Ge Yang, Xiaolong Wang.
Neural Volumetric Memory for Visual Locomotion Control.
Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
Highlight

[arXiv] [project page] [video]

Jiarui Xu, Sifei Liu*, Arash Vahdat*, Wonmin Byeon, Xiaolong Wang, Shalini De Mello.
Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models.
Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
Highlight

[arXiv] [project page] [video] [demo] [code]

Yuzhe Qin*, Binghao Huang*, Zhao-Heng Yin, Hao Su, Xiaolong Wang.
DexPoint: Generalizable Point Cloud Reinforcement Learning for Sim-to-Real Dexterous Manipulation.
Conference on Robot Learning (CoRL), 2022.

[arXiv] [project page] [video] [code]

Sateesh Kumar, Jonathan Zamora*, Nicklas Hansen*, Rishabh Jangir, Xiaolong Wang.
Graph Inverse Reinforcement Learning from Diverse Videos.
Conference on Robot Learning (CoRL), 2022.
Oral Presentation

[arXiv] [project page] [video] [code]

Yang Fu, Xiaolong Wang.
Category-Level 6D Object Pose Estimation in the Wild: A Semi-Supervised Learning Approach and A New Dataset.
Conference on Neural Information Processing Systems (NeurIPS), 2022.

[arXiv] [project page] [dataset] [Wild6D code]

Yuzhe Qin*, Yueh-Hua Wu*, Shaowei Liu, Hanwen Jiang, Ruihan Yang, Yang Fu, Xiaolong Wang.
DexMV: Imitation Learning for Dexterous Manipulation from Human Videos.
European Conference on Computer Vision (ECCV), 2022.

[arXiv] [project page] [video] [code]

Yuzhe Qin, Hao Su*, Xiaolong Wang*.
From One Hand to Multiple Hands: Imitation Learning for Dexterous Manipulation from Single-Camera Teleoperation.
Robotics and Automation Letters (RA-L), 2022.
International Conference on Intelligent Robots and Systems (IROS), 2022.

[arXiv] [project page] [video] [code]

Nicklas Hansen, Xiaolong Wang*, Hao Su*.
Temporal Difference Learning for Model Predictive Control.
International Conference on Machine Learning (ICML), 2022.

[arXiv] [project page] [video] [code]

Xuanchi Ren, Xiaolong Wang.
Look Outside the Room: Synthesizing A Consistent Long-Term 3D Scene Video from A Single Image.
Conference on Computer Vision and Pattern Recognition (CVPR), 2022.

[arXiv] [project page] [video] [code]

Jiarui Xu, Shalini De Mello, Sifei Liu, Wonmin Byeon, Thomas Breuel, Jan Kautz, Xiaolong Wang.
GroupViT: Semantic Segmentation Emerges from Text Supervision.
Conference on Computer Vision and Pattern Recognition (CVPR), 2022.

[arXiv] [project page] [video] [code] [huggingface colab] [huggingface demo]

Ruihan Yang*, Minghao Zhang*, Nicklas Hansen, Huazhe Xu, Xiaolong Wang.
Learning Vision-Guided Quadrupedal Locomotion End-to-End with Cross-Modal Transformers.
International Conference on Learning Representations (ICLR), 2022.
Spotlight Presentation

[arXiv] [project page] [video] [code]

Zihang Lai, Sifei Liu, Alexei A. Efros, Xiaolong Wang.
Video Autoencoder: self-supervised disentanglement of static 3D structure and motion.
International Conference on Computer Vision (ICCV), 2021.
Oral Presentation

[arXiv] [project page] [code] [video]

Jiarui Xu, Xiaolong Wang.
Rethinking Self-supervised Correspondence Learning: A Video Frame-level Similarity Perspective.
International Conference on Computer Vision (ICCV), 2021.
Oral Presentation

[arXiv] [project page] [code]

Hanwen Jiang*, Shaowei Liu*, Jiashun Wang, Xiaolong Wang.
Hand-Object Contact Consistency Reasoning for Human Grasps Generation.
International Conference on Computer Vision (ICCV), 2021.
Oral Presentation

[arXiv] [project page] [code]

Yinbo Chen, Sifei Liu, Xiaolong Wang.
Learning Continuous Image Representation with Local Implicit Image Function.
Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
Oral Presentation

[arXiv] [project page] [code]

Qiang Zhang, Tete Xiao, Alexei A. Efros, Lerrel Pinto, Xiaolong Wang.
Learning Cross-domain Correspondence for Control with Dynamics Cycle-consistency.
International Conference on Learning Representations (ICLR), 2021.
Oral Presentation

[arXiv] [project page] [code]

Nicklas Hansen, Rishabh Jangir, Yu Sun, Guillem Alenyà, Pieter Abbeel, Alexei A. Efros, Lerrel Pinto, Xiaolong Wang.
Self-Supervised Policy Adaptation during Deployment.
International Conference on Learning Representations (ICLR), 2021.
Spotlight Presentation

[arXiv] [project page] [code] [bair blog post]

Yu Sun, Xiaolong Wang, Zhuang Liu, John Miller, Alexei A. Efros, Moritz Hardt.
Test-Time Training with Self-Supervision for Generalization under Distribution Shifts.
International Conference on Machine Learning (ICML), 2020.

[arXiv] [code and project page] [BibTeX]

Xiaolong Wang*, Allan Jabri* and Alexei A. Efros.
Learning Correspondence from the Cycle-consistency of Time.
Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
Oral Presentation

[project page] [slides] [result video] [oral talk]
[arXiv] [BibTeX] [code]

Xiaolong Wang, Ross Girshick, Abhinav Gupta, and Kaiming He.
Non-local Neural Networks.
Conference on Computer Vision and Pattern Recognition (CVPR), 2018.

[arXiv] [BibTeX] [code]

Xiaolong Wang*, Rohit Girdhar*, and Abhinav Gupta.
Binge Watching: Scaling Affordance Learning from Sitcoms.
Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
Spotlight Presentation

[pdf] [BibTeX] [dataset] [project page] [spotlight video]

Xiaolong Wang and Abhinav Gupta.
Unsupervised Learning of Visual Representations using Videos.
International Conference on Computer Vision (ICCV), 2015.

[pdf] [BibTeX] [code] [model] [mined_patches] [project page] [spotlight video]