Cosmos World Foundation Model Platform for Physical AI
arXiv 2025
NVIDIA
Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control
arXiv 2025
NVIDIA
World Simulation with Video Foundation Models for Physical AI
arXiv 2025
NVIDIA
Objaverse++: Curated 3D Object Dataset with Quality Annotations
ICCV 2025
Chendi Lin, Heshan Liu, Qunshu Lin, Zachary Bright, Shitao Tang, Yihui He, Minghao Liu, Ling Zhu, Cindy Le
CamFreeDiff: Camera-free Image to Panorama Generation with Diffusion Model
CVPR 2025
Xiaoding Yuan, Shitao Tang, Kejie Li, Alan Yuille, Peng Wang
BESA: Pruning Large Language Models with Blockwise Parameter-Efficient Sparsity Allocation
ICLR 2024
Peng Xu, Wenqi Shao, Mengzhao Chen, Shitao Tang, Kaipeng Zhang, Peng Gao, Fengwei An, Yu Qiao, Ping Luo
MVDiffusion++: A Dense High-resolution Multi-view Diffusion Model for Single to Sparse-view 3D Object Reconstruction
ECCV 2024
Shitao Tang*, Jiacheng Chen*, Dilin Wang*, Chengzhou Tang, Fuyang Zhang, Yuchen Fan, Vikas Chandra, Rakesh Ranjan, Yasutaka Furukawa
MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion
NeurIPS 2023 Spotlight
Shitao Tang*, Fuyang Zhang*, Jiacheng Chen, Peng Wang, Yasutaka Furukawa
NeuMap: Neural Coordinate Mapping by Auto-Transdecoder for Camera Localization
CVPR 2023
Shitao Tang, Sicong Tang, Andrea Tagliasacchi, Ping Tan, Yasutaka Furukawa
RenderNet: Visual Relocalization Using Virtual Viewpoints in Large-Scale Indoor Environments
arXiv 2022
Shitao Tang, Chengzhou Tang, Ping Tan
QuadTree Attention for Vision Transformers
ICLR 2022
Shitao Tang*, Jiahui Zhang*, Siyu Zhu, Ping Tan
Learning Camera Localization via Dense Scene Matching
CVPR 2021
Shitao Tang, Chengzhou Tang, Rui Huang, Siyu Zhu, Ping Tan
Channel Equilibrium Networks for Learning Deep Representation
ICML 2020
Wenqi Shao*, Shitao Tang*, Xingang Pan, Ping Tan, Xiaogang Wang, Ping Luo
Learning Efficient Detector with Semi-supervised Adaptive Distillation
BMVC 2019
Shitao Tang, Litong Feng, Wenqi Shao, Zhanghui Kuang, Wei Zhang, Yimin Chen
Fast Video Shot Transition Localization with Deep Structured Models
ACCV 2018
Shitao Tang, Litong Feng, Zhanghui Kuang, Yimin Chen, Wei Zhang