Xiaolong Wang

Ri-Zhao Qiu*, Shiqi Yang*, Xuxin Cheng*, Chaitanya Chawla, Jialong Li, Tairan He, Ge Yan, David J. Yoon, Ryan Hoque, Lars Paulsen, Ge Yang, Jian Zhang, Sha Yi, Guanya Shi, Xiaolong Wang.
Humanoid Policy ~ Human Policy.
Conference on Robot Learning (CoRL), 2025.

[arXiv] [project page] [code] [data]

Sha Yi*, Xueqian Bai*, Adabhav Singh, Jianglong Ye, Michael T Tolley, Xiaolong Wang.
Co-Design of Soft Gripper with Neural Physics.
Conference on Robot Learning (CoRL), 2025.

[arXiv] [project page]

Binghao Huang, Jie Xu, Iretiayo Akinola, Wei Yang, Balakumar Sundaralingam, Rowland O'Flaherty, Dieter Fox, Xiaolong Wang, Arsalan Mousavian, Yu-Wei Chao, Yunzhu Li.
VT-Refine: Learning Bimanual Assembly with Visuo-Tactile Feedback via Simulation Fine-Tuning.
Conference on Robot Learning (CoRL), 2025.

[arXiv] [project page]

Runyu Ding*, Yuzhe Qin*, Jiyue Zhu*, Chengzhe Jia, Shiqi Yang, Ruihan Yang, Xiaojuan Qi, Xiaolong Wang.
Bunny-VisionPro: Real-Time Bimanual Dexterous Teleoperation for Imitation Learning.
International Conference on Intelligent Robots and Systems (IROS), 2025.

[arXiv] [project page] [code] [Hand URDFs]

Ri-Zhao Qiu, Yafei Hu, Ge Yang, Yuchen Song, Yang Fu, Jianglong Ye, Jiteng Mu, Ruihan Yang, Nikolay Atanasov, Sebastian Scherer, Xiaolong Wang.
Learning Generalizable Feature Fields for Mobile Manipulation.
International Conference on Intelligent Robots and Systems (IROS), 2025.

[arXiv] [project page]

Ge Yan*, Yueh-Hua Wu*, Xiaolong Wang.
DNAct: Diffusion Guided Multi-Task 3D Policy Learning.
International Conference on Intelligent Robots and Systems (IROS), 2025.

[arXiv] [project page]

Qi Wu, Zipeng Fu, Xuxin Cheng, Xiaolong Wang, Chelsea Finn.
Helpful DoggyBot: Open-World Object Fetching using Legged Robots and Vision-Language Models.
International Conference on Intelligent Robots and Systems (IROS), 2025.

[arXiv] [project page]

Jianglong Ye*, Keyi Wang*, Chengjing Yuan, Ruihan Yang, Yiquan Li, Jiyue Zhu, Yuzhe Qin, Xueyan Zou, and Xiaolong Wang.
Dex1B: Learning with 1B Demonstrations for Dexterous Manipulation.
Robotics: Science and Systems (RSS), 2025.

[arXiv] [project page]

Yu Sun*, Xinhao Li*, Karan Dalal*, Jiarui Xu, Arjun Vikram, Genghan Zhang, Yann Dubois, Xinlei Chen†, Xiaolong Wang†, Sanmi Koyejo†, Tatsunori Hashimoto†, Carlos Guestrin†.
Learning to (Learn at Test Time): RNNs with Expressive Hidden States.
International Conference on Machine Learning (ICML), 2025.
Spotlight

[arXiv] [jax code] [pytorch code]

Jialong Li, Xuxin Cheng, Tianshu Huang, Shiqi Yang, Ri-Zhao Qiu, Xiaolong Wang.
AMO: Adaptive Motion Optimization for Hyper-Dexterous Humanoid Whole-Body Control.
Robotics: Science and Systems (RSS), 2025.

[arXiv] [project page] [models and code]

An-Chieh Cheng*, Yandong Ji*, Zhaojing Yang*, Zaitian Gongye, Xueyan Zou, Jan Kautz, Erdem Bıyık, Hongxu Yin, Sifei Liu, Xiaolong Wang.
NaVILA: Legged Robot Vision-Language-Action Model for Navigation.
Robotics: Science and Systems (RSS), 2025.

[arXiv] [project page] [locomotion code]

Karan Dalal*, Daniel Koceja*, Gashon Hussein*, Jiarui Xu*, Yue Zhao+, Youjin Song+, Shihao Han, Ka Chun Cheung, Jan Kautz, Carlos Guestrin, Tatsunori Hashimoto, Sanmi Koyejo, Yejin Choi, Yu Sun, Xiaolong Wang.
One-Minute Video Generation with Test-Time Training.
Conference on Computer Vision and Pattern Recognition (CVPR), 2025.

[arXiv] [project page] [code]

Jiteng Mu, Nuno Vasconcelos, Xiaolong Wang.
EditAR: Unified Conditional Generation with Autoregressive Models.
Conference on Computer Vision and Pattern Recognition (CVPR), 2025.

[arXiv] [project page]

Hongjun Wang, Wonmin Byeon, Jiarui Xu, Jingwei Gu, Xiaolong Wang, Kai Han, Jan Kautz, Sifei Liu.
Parallel Sequence Modeling via Generalized Spatial Propagation Network.
Conference on Computer Vision and Pattern Recognition (CVPR), 2025.

[arXiv] [project page] [code]

Zhijian Liu, Ligeng Zhu, Baifeng Shi, Zhuoyang Zhang, Yuming Lou, Shang Yang, Haocheng Xi, Shiyi Cao, Yuxian Gu, Dacheng Li, Xiuyu Li, Yunhao Fang, Yukang Chen, Cheng-Yu Hsieh, De-An Huang, An-Chieh Cheng, Vishwesh Nath, Jinyi Hu, Sifei Liu, Ranjay Krishna, Daguang Xu, Xiaolong Wang, Pavlo Molchanov, Jan Kautz, Hongxu Yin, Song Han, Yao Lu.
NVILA: Efficient Frontier Visual Language Models.
Conference on Computer Vision and Pattern Recognition (CVPR), 2025.

[arXiv] [code]

Chenhao Lu*, Xuxin Cheng*, Jialong Li*, Shiqi Yang, Mazeyu Ji, Chengjing Yuan, Ge Yang, Sha Yi, Xiaolong Wang.
Mobile-TeleVision: Predictive Motion Priors for Humanoid Whole-Body Control.
International Conference on Robotics and Automation (ICRA), 2025.

[arXiv] [project page] [code]

Ri-Zhao Qiu*, Yuchen Song*, Xuanbin Peng*, Sai Aneesh Suryadevara, Ge Yang, Minghuan Liu, Mazeyu Ji, Chengzhe Jia, Ruihan Yang, Xueyan Zou, Xiaolong Wang.
WildLMa: Long Horizon Loco-Manipulation in the Wild.
International Conference on Robotics and Automation (ICRA), 2025.

[arXiv] [project page] [hardware]

Tairan He, Wenli Xiao, Toru Lin, Zhengyi Luo, Zhenjia Xu, Zhenyu Jiang, Jan Kautz, Changliu Liu, Guanya Shi, Xiaolong Wang, Linxi Fan, Yuke Zhu.
Hover: Versatile Neural Whole-Body Controller for Humanoid Robots.
International Conference on Robotics and Automation (ICRA), 2025.

[arXiv] [project page] [code]

Cheng-Chun Hsu, Bowen Wen, Jie Xu, Yashraj Narang, Xiaolong Wang, Yuke Zhu, Joydeep Biswas, Stan Birchfield.
SPOT: SE(3) Pose Trajectory Diffusion for Object-Centric Manipulation.
International Conference on Robotics and Automation (ICRA), 2025.

[arXiv] [project page]

Xueyan Zou, Yuchen Song, Ri-Zhao Qiu, Xuanbin Peng, Jianglong Ye, Sifei Liu, Xiaolong Wang.
M3: 3D-Spatial MultiModal Memory.
International Conference on Learning Representations (ICLR), 2025.

[arXiv] [project page] [code]

Runjie Yan*, Yinbo Chen*, Xiaolong Wang.
Consistent Flow Distillation for Text-to-3D Generation.
International Conference on Learning Representations (ICLR), 2025.

[arXiv] [project page] [code]

Nicklas Hansen, Jyothir S V, Vlad Sobal, Yann LeCun, Xiaolong Wang†, Hao Su†.
Hierarchical World Models as Visual Whole-Body Humanoid Controllers.
International Conference on Learning Representations (ICLR), 2025.

[arXiv] [project page] [code]

Isabella Liu, Hao Su†, Xiaolong Wang†.
Dynamic Gaussians Mesh: Consistent Mesh Reconstruction from Monocular Videos.
International Conference on Learning Representations (ICLR), 2025.

[arXiv] [project page] [code]

Renhao Wang, Yu Sun, Yossi Gandelsman, Xinlei Chen, Alexei A. Efros, Xiaolong Wang.
Test-Time Training on Video Streams.
Journal of Machine Learning Research (JMLR), 2025.

[arXiv] [project page] [code and dataset]

Mazeyu Ji*, Ri-Zhao Qiu*, Xueyan Zou, Xiaolong Wang.
GraspSplats: Efficient Manipulation with 3D Feature Splatting.
Conference on Robot Learning (CoRL), 2024.

[arXiv] [project page] [code]

Xialin He*, Chengjing Yuan*, Wenxuan Zhou, Ruihan Yang, David Held, Xiaolong Wang.
Visual Manipulation with Legs.
Conference on Robot Learning (CoRL), 2024.

[arXiv] [project page]

Shiqi Yang, Minghuan Liu, Yuzhe Qin, Runyu Ding, Jialong Li, Xuxin Cheng, Ruihan Yang, Sha Yi, Xiaolong Wang.
ACE: A Cross-Platform Visual-Exoskeletons System for Low-Cost Dexterous Teleoperation.
Conference on Robot Learning (CoRL), 2024.

[arXiv] [project page] [code] [hardware design]

Jun Wang*, Ying Yuan*, Haichuan Che*, Haozhi Qi*, Yi Ma, Jitendra Malik, Xiaolong Wang.
Lessons from Learning to Spin “Pens”.
Conference on Robot Learning (CoRL), 2024.

[arXiv] [project page] [code]

Xuxin Cheng*, Jialong Li*, Shiqi Yang, Ge Yang, Xiaolong Wang.
Open-TeleVision: Teleoperation with Immersive Active Visual Feedback.
Conference on Robot Learning (CoRL), 2024.

[arXiv] [project page] [code] [hardware] [dataset]

Minghuan Liu*, Zixuan Chen*, Xuxin Cheng, Yandong Ji, Ruihan Yang, Xiaolong Wang.
Visual Whole-Body Control for Legged Loco-Manipulation.
Conference on Robot Learning (CoRL), 2024.
Oral Presentation

[arXiv] [project page] [code]

Ruihan Yang*, Zhuoqun Chen*, Jianhan Ma*, Chongyi Zheng*, Yiyu Chen, Quan Nguyen, Xiaolong Wang.
Generalized Animal Imitator: Agile Locomotion with Versatile Motion Prior.
Conference on Robot Learning (CoRL), 2024.
Workshop on Towards Reliable and Deployable Learning-Based Robotic Systems, CoRL 2023
Workshop Best Paper Award

[arXiv] [project page]

Adrian Remonda, Nicklas Hansen, Ayoub Raji, Nicola Musiu, Marko Bertogna, Eduardo E. Veas, Xiaolong Wang.
A Simulation Benchmark for Autonomous Racing with Large-Scale Human Data.
Conference on Neural Information Processing Systems (NeurIPS), 2024.

[arXiv] [project page] [code] [dataset]

An-Chieh Cheng, Hongxu Yin, Yang Fu, Qiushan Guo, Ruihan Yang, Jan Kautz, Xiaolong Wang, Sifei Liu.
SpatialRGPT: Grounded Spatial Reasoning in Vision-Language Models.
Conference on Neural Information Processing Systems (NeurIPS), 2024.

[arXiv] [project page] [code]

Ri-Zhao Qiu, Ge Yang, Weijia Zeng, Xiaolong Wang.
Feature Splatting: Language-Driven Physics-Based Scene Synthesis and Editing.
European Conference on Computer Vision (ECCV), 2024.

[arXiv] [project page] [code]

Jiteng Mu, Michaël Gharbi, Richard Zhang, Eli Shechtman, Nuno Vasconcelos, Xiaolong Wang, Taesung Park.
Editable Image Elements for Controllable Synthesis.
European Conference on Computer Vision (ECCV), 2024.

[arXiv] [project page]

Kaiwen Jiang, Yang Fu, Mukund Varma T, Yash Belhe, Xiaolong Wang, Hao Su, Ravi Ramamoorthi.
A Construct-Optimize Approach to Sparse View Synthesis without Camera Pose.
ACM SIGGRAPH, 2024.

[arXiv] [project page] [code]

Xuxin Cheng*, Yandong Ji*, Junming Chen, Ruihan Yang, Ge Yang, Xiaolong Wang.
Expressive Whole-Body Control for Humanoid Robots.
Robotics: Science and Systems (RSS), 2024.

[arXiv] [project page] [code]

Ruihan Yang, Yejin Kim, Aniruddha Kembhavi, Xiaolong Wang, Kiana Ehsani.
Harmonic Mobile Manipulation.
International Conference on Intelligent Robots and Systems (IROS), 2024.
Oral Presentation
Best Paper Award on Mobile Manipulation

[arXiv] [project page] [code]

Kang-Won Lee, Yuzhe Qin, Xiaolong Wang, Soo-Chul Lim.
DexTouch: Learning to Seek and Manipulate Objects with Tactile Dexterity.
Robotics and Automation Letters (RA-L), 2024.

[arXiv] [project page]

Yinbo Chen, Oliver Wang, Richard Zhang, Eli Shechtman, Xiaolong Wang†, Michaël Gharbi†.
Image Neural Field Diffusion Models.
Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
Highlight

[arXiv] [project page] [code]

Mengqi Zhang*, Yang Fu*, Zheng Ding, Sifei Liu, Zhuowen Tu, Xiaolong Wang.
HOIDiffusion: Generating Realistic 3D Hand-Object Interaction Data.
Conference on Computer Vision and Pattern Recognition (CVPR), 2024.

[arXiv] [project page] [code]

Hongchi Xia*, Yang Fu*, Sifei Liu, Xiaolong Wang.
RGBD Objects in the Wild: Scaling Real-World 3D Object Learning from RGB-D Videos.
Conference on Computer Vision and Pattern Recognition (CVPR), 2024.

[arXiv] [project page] [dataset]

Yang Fu, Sifei Liu, Amey Kulkarni, Jan Kautz, Alexei A. Efros, Xiaolong Wang.
COLMAP-Free 3D Gaussian Splatting.
Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
Highlight

[arXiv] [project page] [code]

Jiarui Xu, Xingyi Zhou, Shen Yan, Xiuye Gu, Anurag Arnab, Chen Sun, Xiaolong Wang, Cordelia Schmid.
Pixel Aligned Language Models.
Conference on Computer Vision and Pattern Recognition (CVPR), 2024.

[arXiv] [project page] [code]

Jun Wang*, Yuzhe Qin*, Kaiming Kuang, Yigit Korkmaz, Akhilan Gurumoorthy, Hao Su, Xiaolong Wang.
CyberDemo: Augmenting Simulated Human Demonstration for Real-World Dexterous Manipulation.
Conference on Computer Vision and Pattern Recognition (CVPR), 2024.

[arXiv] [project page]

Jiarui Xu, Yossi Gandelsman, Amir Bar, Jianwei Yang, Jianfeng Gao, Trevor Darrell, Xiaolong Wang.
IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks.
Transactions on Machine Learning Research (TMLR), 2024.

[arXiv] [project page]

Yang Fu, Shalini De Mello, Xueting Li, Amey Kulkarni, Jan Kautz, Xiaolong Wang, Sifei Liu.
3D Reconstruction with Generalizable Neural Fields using Scene Priors.
International Conference on Learning Representations (ICLR), 2024.

[arXiv] [project page]

An-Chieh Cheng, Xueting Li, Sifei Liu†, Xiaolong Wang†.
TUVF: Learning Generalizable Texture UV Radiance Fields.
International Conference on Learning Representations (ICLR), 2024.

[arXiv] [project page] [code]

Ying Yuan*, Haichuan Che*, Yuzhe Qin*, Binghao Huang, Zhao-Heng Yin, Kang-Won Lee, Yi Wu, Soo-Chul Lim, Xiaolong Wang.
Robot Synesthesia: In-Hand Manipulation with Visuotactile Sensing.
International Conference on Robotics and Automation (ICRA), 2024.

[arXiv] [project page] [code]

Entong Su, Chengzhe Jia, Yuzhe Qin, Wenxuan Zhou, Annabella Macaluso, Binghao Huang, Xiaolong Wang.
Sim2Real Manipulation on Unknown Objects with Tactile-based Reinforcement Learning.
International Conference on Robotics and Automation (ICRA), 2024.

[arXiv] [project page]

Nicklas Hansen, Hao Su*, Xiaolong Wang*.
TD-MPC2: Scalable, Robust World Models for Continuous Control.
International Conference on Learning Representations (ICLR), 2024.
Spotlight

[arXiv] [project page] [code] [models] [dataset]

Lirui Wang, Yiyang Ling*, Zhecheng Yuan*, Mohit Shridhar, Chen Bao, Yuzhe Qin, Bailin Wang, Huazhe Xu, Xiaolong Wang.
GenSim: Generating Robotic Simulation Tasks via Large Language Models.
International Conference on Learning Representations (ICLR), 2024.
Spotlight
Workshop on Language Grounding and Robot Learning, CoRL 2023
Workshop Best Paper Award

[arXiv] [project page] [code] [demo]

Open X-Embodiment.
Open X-Embodiment: Robotic Learning Datasets and RT-X Models.
International Conference on Robotics and Automation (ICRA), 2024.
Best Paper Award

[arXiv] [project page]

Binghao Huang*, Yuanpei Chen*, Tianyu Wang, Yuzhe Qin, Yaodong Yang, Nikolay Atanasov, Xiaolong Wang.
Dynamic Handover: Throw and Catch with Bimanual Hands.
Conference on Robot Learning (CoRL), 2023.

[arXiv] [project page] [code]

Yanjie Ze*, Ge Yan*, Yueh-Hua Wu*, Annabella Macaluso, Yuying Ge, Jianglong Ye, Nicklas Hansen, Li Erran Li, Xiaolong Wang.
GNFactor: Multi-Task Real Robot Learning with Generalizable Neural Feature Fields.
Conference on Robot Learning (CoRL), 2023.
Oral Presentation

[arXiv] [project page] [code]

Yunhai Feng*, Nicklas Hansen*, Ziyan Xiong*, Chandramouli Rajagopalan, Xiaolong Wang.
Finetuning Offline World Models in the Real World.
Conference on Robot Learning (CoRL), 2023.
Oral Presentation

[arXiv] [project page] [code]

Yueh-Hua Wu, Xiaolong Wang*, Masashi Hamaya*.
Elastic Decision Transformer.
Conference on Neural Information Processing Systems (NeurIPS), 2023.

[arXiv] [project page] [code]

Zehao Zhu, Jiashun Wang, Yuzhe Qin, Deqing Sun, Varun Jampani, Xiaolong Wang.
ContactArt: Learning 3D Interaction Priors for Category-level Articulated Object and Hand Poses Estimation.
International Conference on 3D Vision (3DV), 2024.
Oral Presentation

[arXiv] [project page] [data explorer]

Jiteng Mu, Shen Sang, Nuno Vasconcelos, Xiaolong Wang.
ActorsNeRF: Animatable Few-shot Human Rendering with Generalizable NeRFs.
International Conference on Computer Vision (ICCV), 2023.

[arXiv] [project page] [code]

Jianglong Ye, Naiyan Wang, Xiaolong Wang.
FeatureNeRF: Learning Generalizable NeRFs by Distilling Foundation Models.
International Conference on Computer Vision (ICCV), 2023.

[arXiv] [project page] [code]

Nicklas Hansen*, Zhecheng Yuan*, Yanjie Ze*, Tongzhou Mu*, Aravind Rajeswaran+, Hao Su+, Huazhe Xu+, Xiaolong Wang+.
On Pre-Training for Visuo-Motor Control: Revisiting a Learning-from-Scratch Baseline.
International Conference on Machine Learning (ICML), 2023.

[arXiv] [code]

Yang Fu, Ishan Misra, Xiaolong Wang.
MonoNeRF: Learning Generalizable NeRFs from Monocular Videos without Camera Poses.
International Conference on Machine Learning (ICML), 2023.

[arXiv] [project page]

Xuanchen Lu, Xiaolong Wang, Judith E. Fan.
Learning Dense Correspondences between Photos and Sketches.
International Conference on Machine Learning (ICML), 2023.

[arXiv] [project] [code]

Zhao-Heng Yin*, Binghao Huang*, Yuzhe Qin, Qifeng Chen, Xiaolong Wang.
Rotating without Seeing: Towards In-hand Dexterity through Touch.
Robotics: Science and Systems (RSS), 2023.

[arXiv] [project page] [video] [code]

Yuzhe Qin, Wei Yang, Binghao Huang, Karl Van Wyk, Hao Su, Xiaolong Wang, Yu-Wei Chao, Dieter Fox.
AnyTeleop: A General Vision-Based Dexterous Robot Arm-Hand Teleoperation System.
Robotics: Science and Systems (RSS), 2023.

[arXiv] [project] [retargeting code] [visualizer code]

Ruihan Yang, Ge Yang, Xiaolong Wang.
Neural Volumetric Memory for Visual Locomotion Control.
Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
Highlight

[arXiv] [project page] [video]

Jiarui Xu, Sifei Liu*, Arash Vahdat*, Wonmin Byeon, Xiaolong Wang, Shalini De Mello.
Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models.
Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
Highlight

[arXiv] [project page] [video] [demo] [code]

Yuying Ge, Annabella Macaluso, Li Erran Li, Ping Luo, Xiaolong Wang.
Policy Adaptation from Foundation Model Feedback.
Conference on Computer Vision and Pattern Recognition (CVPR), 2023.

[arXiv] [project page]

Chen Bao, Helin Xu, Yuzhe Qin, Xiaolong Wang.
DexArt: Benchmarking Generalizable Dexterous Manipulation with Articulated Objects.
Conference on Computer Vision and Pattern Recognition (CVPR), 2023.

[arXiv] [project page] [code]

Jiashun Wang, Xueting Li, Sifei Liu, Shalini De Mello, Orazio Gallo, Xiaolong Wang, Jan Kautz.
Zero-shot Pose Transfer for Unrigged Stylized 3D Characters.
Conference on Computer Vision and Pattern Recognition (CVPR), 2023.

[pdf] [project page]

Jianglong Ye*, Jiashun Wang*, Binghao Huang, Yuzhe Qin, Xiaolong Wang.
Learning Continuous Grasping Function with a Dexterous Hand from Human Demonstrations.
Robotics and Automation Letters (RA-L), 2023.
International Conference on Intelligent Robots and Systems (IROS), 2023.

[arXiv] [project page] [video] [code]

Yanjie Ze*, Nicklas Hansen*, Yinbo Chen, Mohit Jain, Xiaolong Wang.
Visual Reinforcement Learning with Self-Supervised 3D Representations.
Robotics and Automation Letters (RA-L), 2023.
International Conference on Intelligent Robots and Systems (IROS), 2023.

[arXiv] [project page] [code]

Chenhongyi Yang*, Jiarui Xu*, Shalini De Mello, Elliot J. Crowley, Xiaolong Wang.
GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation.
International Conference on Learning Representations (ICLR), 2023.
Spotlight Presentation

[arXiv] [code]

Nicklas Hansen, Yixin Lin, Hao Su, Xiaolong Wang, Vikash Kumar, Aravind Rajeswaran.
MoDem: Accelerating Visual Model-Based Reinforcement Learning with Demonstrations.
International Conference on Learning Representations (ICLR), 2023.

[arXiv] [project page] [code]

Kaifeng Zhang, Yang Fu, Shubhankar Borse, Hong Cai, Fatih Porikli, Xiaolong Wang.
Self-Supervised Geometric Correspondence for Category-Level 6D Object Pose Estimation in the Wild.
International Conference on Learning Representations (ICLR), 2023.

[arXiv] [project page] [code]

Yuzhe Qin*, Binghao Huang*, Zhao-Heng Yin, Hao Su, Xiaolong Wang.
DexPoint: Generalizable Point Cloud Reinforcement Learning for Sim-to-Real Dexterous Manipulation.
Conference on Robot Learning (CoRL), 2022.

[arXiv] [project page] [video] [code]

Sateesh Kumar, Jonathan Zamora*, Nicklas Hansen*, Rishabh Jangir, Xiaolong Wang.
Graph Inverse Reinforcement Learning from Diverse Videos.
Conference on Robot Learning (CoRL), 2022 (Oral Presentation).

[arXiv] [project page] [video] [code]

Yueh-Hua Wu*, Jiashun Wang*, Xiaolong Wang.
Learning Generalizable Dexterous Manipulation from Human Grasp Affordance.
Conference on Robot Learning (CoRL), 2022.

[arXiv] [project page] [code]

Yang Fu, Xiaolong Wang.
Category-Level 6D Object Pose Estimation in the Wild: A Semi-Supervised Learning Approach and A New Dataset.
Conference on Neural Information Processing Systems (NeurIPS), 2022.

[arXiv] [project page] [dataset] [Wild6D code]

Yinbo Chen, Xiaolong Wang.
Transformers as Meta-Learners for Implicit Neural Representations.
European Conference on Computer Vision (ECCV), 2022.

[arXiv] [project page] [code]

Xueting Li, Xiaolong Wang, Ming-Hsuan Yang, Alexei Efros, Sifei Liu.
Scraping Textures from Natural Images for Synthesis and Editing.
European Conference on Computer Vision (ECCV), 2022.

[pdf]

Yuzhe Qin*, Yueh-Hua Wu*, Shaowei Liu, Hanwen Jiang, Ruihan Yang, Yang Fu, Xiaolong Wang.
DexMV: Imitation Learning for Dexterous Manipulation from Human Videos.
European Conference on Computer Vision (ECCV), 2022.

[arXiv] [project page] [video] [code]

Hanzhe Hu*, Yinbo Chen*, Jiarui Xu, Shubhankar Borse, Hong Cai, Fatih Porikli, Xiaolong Wang.
Learning Implicit Feature Alignment Function for Semantic Segmentation.
European Conference on Computer Vision (ECCV), 2022.

[arXiv] [code]

Yuzhe Qin, Hao Su*, Xiaolong Wang*.
From One Hand to Multiple Hands: Imitation Learning for Dexterous Manipulation from Single-Camera Teleoperation.
Robotics and Automation Letters (RA-L), 2022.
International Conference on Intelligent Robots and Systems (IROS), 2022.

[arXiv] [project page] [video] [code]

Jianglong Ye, Yuntao Chen, Naiyan Wang, Xiaolong Wang.
Online Adaptation for Implicit Object Tracking and Shape Reconstruction in the Wild.
Robotics and Automation Letters (RA-L), 2022.
International Conference on Intelligent Robots and Systems (IROS), 2022.

[arXiv] [project page] [video]

Chieko Sarah Imai*, Minghao Zhang*, Yuchen Zhang*, Marcin Kierebiński, Ruihan Yang, Yuzhe Qin, Xiaolong Wang.
Vision-Guided Quadrupedal Locomotion in the Wild with Multi-Modal Delay Randomization.
International Conference on Intelligent Robots and Systems (IROS), 2022.

[arXiv] [project page] [video] [code]

Nicklas Hansen, Xiaolong Wang*, Hao Su*.
Temporal Difference Learning for Model Predictive Control.
International Conference on Machine Learning (ICML), 2022.

[arXiv] [project page] [video] [code]

Zeyuan Chen, Yinbo Chen, Jingwen Liu, Xingqian Xu, Vidit Goel, Zhangyang Wang, Humphrey Shi, Xiaolong Wang.
VideoINR: Learning Video Implicit Neural Representation for Continuous Space-Time Super-Resolution.
Conference on Computer Vision and Pattern Recognition (CVPR), 2022.

[arXiv] [project page] [video] [code]

Jianglong Ye, Yuntao Chen, Naiyan Wang, Xiaolong Wang.
GIFS: Neural Implicit Function for General Shape Representation.
Conference on Computer Vision and Pattern Recognition (CVPR), 2022.

[arXiv] [project page] [video] [code]

Shaowei Liu, Subarna Tripathi, Somdeb Majumdar, Xiaolong Wang.
Joint Hand Motion and Interaction Hotspots Prediction from Egocentric Videos.
Conference on Computer Vision and Pattern Recognition (CVPR), 2022.

[arXiv] [project page] [video] [dataset] [code]

Jiteng Mu, Shalini De Mello, Zhiding Yu, Nuno Vasconcelos, Xiaolong Wang, Jan Kautz, Sifei Liu.
CoordGAN: Self-Supervised Dense Correspondences Emerge from GANs.
Conference on Computer Vision and Pattern Recognition (CVPR), 2022.

[arXiv] [project page] [video] [code]

Xuanchi Ren, Xiaolong Wang.
Look Outside the Room: Synthesizing A Consistent Long-Term 3D Scene Video from A Single Image.
Conference on Computer Vision and Pattern Recognition (CVPR), 2022.

[arXiv] [project page] [video] [code]

Jiarui Xu, Shalini De Mello, Sifei Liu, Wonmin Byeon, Thomas Breuel, Jan Kautz, Xiaolong Wang.
GroupViT: Semantic Segmentation Emerges from Text Supervision.
Conference on Computer Vision and Pattern Recognition (CVPR), 2022.

[arXiv] [project page] [video] [code] [huggingface colab] [huggingface demo]

Rishabh Jangir*, Nicklas Hansen*, Sambaran Ghosal, Mohit Jain, Xiaolong Wang.
Look Closer: Bridging Egocentric and Third-Person Views with Transformers for Robotic Manipulation.
Robotics and Automation Letters (RA-L), 2022.
International Conference on Robotics and Automation (ICRA), 2022.

[arXiv] [project page] [video] [code]

Ruihan Yang*, Minghao Zhang*, Nicklas Hansen, Huazhe Xu, Xiaolong Wang.
Learning Vision-Guided Quadrupedal Locomotion End-to-End with Cross-Modal Transformers.
International Conference on Learning Representations (ICLR), 2022 (Spotlight Presentation).

[arXiv] [project page] [video] [code]

Xueting Li, Shalini De Mello, Xiaolong Wang, Ming-Hsuan Yang, Jan Kautz, Sifei Liu.
Learning Continuous Environment Fields via Implicit Functions.
International Conference on Learning Representations (ICLR), 2022.

[arXiv]

Jiashun Wang, Huazhe Xu, Medhini Narasimhan, Xiaolong Wang.
Multi-Person 3D Motion Prediction with Multi-Range Transformers.
Conference on Neural Information Processing Systems (NeurIPS), 2021.

[pdf] [project page] [code]

Yizhuo Li*, Miao Hao*, Zonglin Di*, Nitesh B. Gundavarapu, Xiaolong Wang.
Test-Time Personalization with a Transformer for Human Pose Estimation.
Conference on Neural Information Processing Systems (NeurIPS), 2021.

[arXiv] [project page] [code]

Nicklas Hansen, Hao Su, Xiaolong Wang.
Stabilizing Deep Q-Learning with ConvNets and Vision Transformers under Data Augmentation.
Conference on Neural Information Processing Systems (NeurIPS), 2021.

[arXiv] [project page] [code]

Tianjun Zhang, Huazhe Xu, Xiaolong Wang, Yi Wu, Kurt Keutzer, Joseph E. Gonzalez, Yuandong Tian.
NovelD: A Simple yet Effective Exploration Criterion.
Conference on Neural Information Processing Systems (NeurIPS), 2021.

[pdf] [Talk] [code]

Zihang Lai, Sifei Liu, Alexei A. Efros, Xiaolong Wang.
Video Autoencoder: Self-Supervised Disentanglement of Static 3D Structure and Motion.
International Conference on Computer Vision (ICCV), 2021 (Oral Presentation).

[arXiv] [project page] [code] [video]

Jiarui Xu, Xiaolong Wang.
Rethinking Self-supervised Correspondence Learning: A Video Frame-level Similarity Perspective.
International Conference on Computer Vision (ICCV), 2021 (Oral Presentation).

[arXiv] [project page] [code]

Hanwen Jiang*, Shaowei Liu*, Jiashun Wang, Xiaolong Wang.
Hand-Object Contact Consistency Reasoning for Human Grasps Generation.
International Conference on Computer Vision (ICCV), 2021 (Oral Presentation).

[arXiv] [project page] [code]

Jiteng Mu, Weichao Qiu, Adam Kortylewski, Alan Yuille, Nuno Vasconcelos, Xiaolong Wang.
A-SDF: Learning Disentangled Signed Distance Functions for Articulated Shape Representation.
International Conference on Computer Vision (ICCV), 2021.

[arXiv] [project page] [code]

Haiping Wu, Xiaolong Wang.
Contrastive Learning of Image Representations with Cross-Video Cycle-Consistency.
International Conference on Computer Vision (ICCV), 2021.

[arXiv] [project page] [code]

Yinbo Chen, Zhuang Liu, Huijuan Xu, Trevor Darrell, Xiaolong Wang.
Meta-Baseline: Exploring Simple Meta-Learning for Few-Shot Learning.
International Conference on Computer Vision (ICCV), 2021.

[arXiv] [code]

Xin Wang, Thomas E. Huang, Benlin Liu, Fisher Yu, Xiaolong Wang, Joseph E. Gonzalez, Trevor Darrell.
Robust Object Detection via Instance-Level Temporal Cycle Confusion.
International Conference on Computer Vision (ICCV), 2021.

[arXiv] [project page] [code]

Tete Xiao, Colorado J Reed, Xiaolong Wang, Kurt Keutzer, Trevor Darrell.
Region Similarity Representation Learning.
International Conference on Computer Vision (ICCV), 2021.

[arXiv] [code]

Elad Levi, Tete Xiao, Xiaolong Wang, Trevor Darrell.
Rethinking Preventing Class-Collapsing in Metric Learning with Margin-Based Losses.
International Conference on Computer Vision (ICCV), 2021.

[pdf]

Ilija Radosavovic, Xiaolong Wang, Lerrel Pinto, Jitendra Malik.
State-Only Imitation Learning for Dexterous Manipulation.
International Conference on Intelligent Robots and Systems (IROS), 2021.

[arXiv] [project page] [Talk]

Amir Bar, Roei Herzig, Xiaolong Wang, Anna Rohrbach, Gal Chechik, Trevor Darrell, Amir Globerson.
Compositional Video Synthesis with Action Graphs.
International Conference on Machine Learning (ICML), 2021.

[arXiv] [project page] [code]

Yinbo Chen, Sifei Liu, Xiaolong Wang.
Learning Continuous Image Representation with Local Implicit Image Function.
Conference on Computer Vision and Pattern Recognition (CVPR), 2021 (Oral Presentation).

[arXiv] [project page] [code]

Jiashun Wang, Huazhe Xu, Jingwei Xu, Sifei Liu, Xiaolong Wang.
Synthesizing Long-Term 3D Human Motion and Interaction in 3D Scenes.
Conference on Computer Vision and Pattern Recognition (CVPR), 2021.

[arXiv] [project page] [code]

Shaowei Liu*, Hanwen Jiang*, Jiarui Xu, Sifei Liu, Xiaolong Wang.
Semi-Supervised 3D Hand-Object Poses Estimation with Interactions in Time.
Conference on Computer Vision and Pattern Recognition (CVPR), 2021.

[arXiv] [project page] [code]

Nicklas Hansen, Xiaolong Wang.
Generalization in Reinforcement Learning by Soft Data Augmentation.
International Conference on Robotics and Automation (ICRA), 2021.

[arXiv] [project page] [code]

Qiang Zhang, Tete Xiao, Alexei A. Efros, Lerrel Pinto, Xiaolong Wang.
Learning Cross-domain Correspondence for Control with Dynamics Cycle-consistency.
International Conference on Learning Representations (ICLR), 2021 (Oral Presentation).

[arXiv] [project page] [code]

Nicklas Hansen, Rishabh Jangir, Yu Sun, Guillem Alenyà, Pieter Abbeel, Alexei A. Efros, Lerrel Pinto, Xiaolong Wang.
Self-Supervised Policy Adaptation during Deployment.
International Conference on Learning Representations (ICLR), 2021 (Spotlight Presentation).

[arXiv] [project page] [code] [bair blog post]

Tete Xiao, Xiaolong Wang, Alexei A. Efros, Trevor Darrell.
What Should Not Be Contrastive in Contrastive Learning.
International Conference on Learning Representations (ICLR), 2021.

[arXiv]

Haozhi Qi, Xiaolong Wang, Deepak Pathak, Yi Ma, Jitendra Malik.
Learning Long-term Visual Dynamics with Region Proposal Interaction Networks.
International Conference on Learning Representations (ICLR), 2021.

[arXiv] [code] [project page] [Talk]

Yunfei Li, Huazhe Xu, Yilin Wu, Xiaolong Wang, Yi Wu.
Solving Compositional Reinforcement Learning Problems via Task Reduction.
International Conference on Learning Representations (ICLR), 2021.

[arXiv] [code] [project page]

Zhenggang Tang, Chao Yu, Boyuan Chen, Huazhe Xu, Xiaolong Wang, Fei Fang, Simon Shaolei Du, Yu Wang, Yi Wu.
Discovering Diverse Multi-Agent Strategic Behavior via Reward Randomization.
International Conference on Learning Representations (ICLR), 2021.

[arXiv] [code] [project page]

Ruihan Yang, Huazhe Xu, Yi Wu, Xiaolong Wang.
Multi-Task Reinforcement Learning with Soft Modularization.
Conference on Neural Information Processing Systems (NeurIPS), 2020.

[pdf] [code] [project page] [Talk]

Xueting Li, Sifei Liu, Shalini De Mello, Kihwan Kim, Xiaolong Wang, Ming-Hsuan Yang, Jan Kautz.
Online Adaptation for Consistent Mesh Reconstruction in the Wild.
Conference on Neural Information Processing Systems (NeurIPS), 2020.

[pdf] [project page]

Jingwei Xu, Huazhe Xu, Bingbing Ni, Xiaokang Yang, Xiaolong Wang, Trevor Darrell.
Hierarchical Style-based Networks for Motion Synthesis.
European Conference on Computer Vision (ECCV), 2020.

[arXiv] [project page] [BibTeX]

Yu Sun, Xiaolong Wang, Zhuang Liu, John Miller, Alexei A. Efros, Moritz Hardt.
Test-Time Training with Self-Supervision for Generalization under Distribution Shifts.
International Conference on Machine Learning (ICML), 2020.

[arXiv] [code and project page] [BibTeX]

Haozhi Qi, Chong You, Xiaolong Wang, Yi Ma, Jitendra Malik.
Deep Isometric Learning for Visual Recognition.
International Conference on Machine Learning (ICML), 2020.

[arXiv] [code] [project page] [BibTeX]

Qian Long*, Zihan Zhou*, Abhinav Gupta, Fei Fang, Yi Wu†, Xiaolong Wang†.
Evolutionary Population Curriculum for Scaling Multi-Agent Reinforcement Learning.
International Conference on Learning Representations (ICLR), 2020.

[arXiv] [project page] [BibTeX] [code]

Joanna Materzynska, Tete Xiao, Roei Herzig, Huijuan Xu†, Xiaolong Wang†, Trevor Darrell†.
Something-Else: Compositional Action Recognition with Spatial-Temporal Interaction Networks.
Conference on Computer Vision and Pattern Recognition (CVPR), 2020.

[arXiv] [project page] [BibTeX] [dataset annotation]

Xueting Li*, Sifei Liu*, Shalini De Mello, Xiaolong Wang, Jan Kautz, and Ming-Hsuan Yang.
Joint-task Self-supervised Learning for Temporal Correspondence.
Conference on Neural Information Processing Systems (NeurIPS), 2019.

[arXiv] [project page] [BibTeX] [code]

Xiaolong Wang*, Allan Jabri* and Alexei A. Efros.
Learning Correspondence from the Cycle-consistency of Time.
Conference on Computer Vision and Pattern Recognition (CVPR), 2019 (Oral Presentation).
(*indicates equal contributions.)

[project page] [slides] [result video] [oral talk]
[arXiv] [BibTeX] [code]

Xueting Li, Sifei Liu, Kihwan Kim, Xiaolong Wang, Ming-Hsuan Yang, and Jan Kautz.
Putting Humans in a Scene: Learning Affordance in 3D Indoor Environments.
Conference on Computer Vision and Pattern Recognition (CVPR), 2019.

[arXiv] [BibTeX]

Wei Yang, Xiaolong Wang, Ali Farhadi, Abhinav Gupta and Roozbeh Mottaghi.
Visual Semantic Navigation using Scene Priors.
International Conference on Learning Representations (ICLR), 2019.

[arXiv] [video] [BibTeX]

Xiaolong Wang and Abhinav Gupta.
Videos as Space-Time Region Graphs.
European Conference on Computer Vision (ECCV), 2018.

[arXiv] [BibTeX]

Tian Ye, Xiaolong Wang, James Davidson, and Abhinav Gupta.
Interpretable Intuitive Physics Model.
European Conference on Computer Vision (ECCV), 2018.

[pdf] [BibTeX] [code] [techxplore]

Xiaolong Wang, Ross Girshick, Abhinav Gupta, and Kaiming He.
Non-local Neural Networks.
Conference on Computer Vision and Pattern Recognition (CVPR), 2018.

[arXiv] [BibTeX] [code]

Xiaolong Wang*, Yufei Ye*, and Abhinav Gupta.
Zero-shot Recognition via Semantic Embeddings and Knowledge Graphs.
Conference on Computer Vision and Pattern Recognition (CVPR), 2018. (*indicates equal contributions.)

[arXiv] [BibTeX] [code]

Wei Yang, Wanli Ouyang, Xiaolong Wang, Jimmy Ren, Hongsheng Li and Xiaogang Wang.
3D Human Pose Estimation in the Wild by Adversarial Learning.
Conference on Computer Vision and Pattern Recognition (CVPR), 2018.

[arXiv] [BibTeX]

Xiaolong Wang, Kaiming He, and Abhinav Gupta.
Transitive Invariance for Self-supervised Visual Representation Learning.
International Conference on Computer Vision (ICCV), 2017.

[pdf] [BibTeX] [caffe_model(RGB order input)] [caffe_prototxt]

Yuan Yuan, Xiaodan Liang, Xiaolong Wang, Dit-Yan Yeung, and Abhinav Gupta.
Temporal Dynamic Graph LSTM for Action-driven Video Object Detection.
International Conference on Computer Vision (ICCV), 2017.

[pdf] [BibTeX] [dataset]

Xiaolong Wang*, Rohit Girdhar*, and Abhinav Gupta.
Binge Watching: Scaling Affordance Learning from Sitcoms.
Conference on Computer Vision and Pattern Recognition (CVPR), 2017 (spotlight presentation). (*indicates equal contributions.)

[pdf] [BibTeX] [dataset] [project page] [spotlight video]

Xiaolong Wang, Abhinav Shrivastava, and Abhinav Gupta.
A-Fast-RCNN: Hard Positive Generation via Adversary for Object Detection.
Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

[pdf] [BibTeX] [code]

Xiaolong Wang and Abhinav Gupta.
Generative Image Modeling using Style and Structure Adversarial Networks.
European Conference on Computer Vision (ECCV), 2016.

[pdf] [BibTeX] [code] [models and dataset]

Gunnar A. Sigurdsson, Gül Varol, Xiaolong Wang, Ivan Laptev, Ali Farhadi, Abhinav Gupta.
Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding.
European Conference on Computer Vision (ECCV), 2016.

[pdf] [BibTeX] [dataset]

Xiaolong Wang, Ali Farhadi, and Abhinav Gupta.
Actions ~ Transformations.
Conference on Computer Vision and Pattern Recognition (CVPR), 2016.

[pdf] [BibTeX] [dataset]

Xiaolong Wang and Abhinav Gupta.
Unsupervised Learning of Visual Representations using Videos.
International Conference on Computer Vision (ICCV), 2015.

[pdf] [BibTeX] [code] [model] [mined_patches] [project page] [spotlight video]

Xiaolong Wang, David F. Fouhey, and Abhinav Gupta.
Designing Deep Networks for Surface Normal Estimation.
Conference on Computer Vision and Pattern Recognition (CVPR), 2015.

[pdf] [BibTeX] [results for NYU Depth V2] [code and models] [project page]

David F. Fouhey, Xiaolong Wang, and Abhinav Gupta.
In Defense of the Direct Perception of Affordances.
arXiv, 2015.

[pdf]

Xiaolong Wang, Liliang Zhang, Liang Lin, Zhujin Liang, and Wangmeng Zuo.
Deep Joint Task Learning for Generic Object Extraction.
Advances in Neural Information Processing Systems (NIPS), 2014.

[pdf] [dataset] [test code] [results]

Keze Wang, Xiaolong Wang, and Liang Lin.
Deep Structured Models for 3D Human Activity Recognition.
ACM International Conference on Multimedia (MM), 2014. (full paper, oral presentation)

[pdf]

Zhujin Liang, Xiaolong Wang, Rui Huang, and Liang Lin.
An Expressive Deep Model for Parsing Human Action from a Single Image.
International Conference on Multimedia and Expo (ICME), 2014. (oral presentation, Best Student Paper Award)

[pdf]

Xiaolong Wang, Liang Lin, Lichao Huang, and Shuicheng Yan.
Incorporating Structural Alternatives and Sharing into Hierarchy for Multiclass Object Recognition and Detection.
Conference on Computer Vision and Pattern Recognition (CVPR), 2013.

[pdf]

Xiaolong Wang and Liang Lin.
Dynamical And-Or Graph Learning for Object Shape Modeling and Detection.
Advances in Neural Information Processing Systems (NIPS), 2012.

[pdf]

Liang Lin, Xiaolong Wang, Wei Yang, and Jian-Huang Lai.
Learning Contour-Fragment-based Shape Model with And-Or Tree Representation.
Conference on Computer Vision and Pattern Recognition (CVPR), 2012.

[pdf]

Wei Yang, Xiaolong Wang, Liang Lin, Chengying Gao.
Interactive CT image segmentation with online discriminative learning.
International Conference on Image Processing (ICIP), 2011.

[pdf]