Lab4AI: List of Papers Awaiting Reproduction

Before applying for a task, please read our contribution process and reward rules carefully.

Paper Title & Authors | Venue & Year | Type | Links | Status / Action
Attention is all you need
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, Illia Polosukhin
NIPS 2017 | Paper | Code | ✅ Completed
Improving language understanding by generative pre-training
Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever
OpenAI 2018 | Paper | 💡 Not yet reproduced
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
NAACL 2019 | Paper | 💡 Not yet reproduced
Language models are unsupervised multitask learners
Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, Ilya Sutskever
OpenAI 2019 | Paper | 💡 Not yet reproduced
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu
JMLR 2020 | Paper | Code | 💡 Not yet reproduced
From Local to Global: A GraphRAG Approach to Query-Focused Summarization
Darren Edge, Ha Trinh, Newman Cheng, Joshua Bradley, Alex Chao, Apurva Mody, Steven Truitt, Dasha Metropolitansky, Robert Osazuwa Ness, Jonathan Larson
2024 | Paper | Code | 💡 Not yet reproduced
Language Models are Few-Shot Learners
Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel Ziegler, Jeffrey Wu, Clemens Winter, Chris Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei
NIPS 2020 | Paper | Code | 💡 Not yet reproduced
LoRA: Low-Rank Adaptation of Large Language Models
Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen
ICLR 2022 | Paper | Code | 💡 Not yet reproduced
Finetuned Language Models are Zero-Shot Learners
Jason Wei, Maarten Bosma, Vincent Zhao, Kelvin Guu, Adams Wei Yu, Brian Lester, Nan Du, Andrew M. Dai, Quoc V Le
ICLR 2022 | Paper | Code | 💡 Not yet reproduced
LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample
2023 | Paper | Code | 💡 Not yet reproduced
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang, Jason Wei, Dale Schuurmans, Quoc V Le, Ed H. Chi, Sharan Narang, Aakanksha Chowdhery, Denny Zhou
ICLR 2023 | Paper | Code | 💡 Not yet reproduced
Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers
Damai Dai, Yutao Sun, Li Dong, Yaru Hao, Shuming Ma, Zhifang Sui, Furu Wei
ACL 2023 | Paper | Code | 💡 Not yet reproduced
Toolformer: Language Models Can Teach Themselves to Use Tools
Timo Schick, Jane Dwivedi-Yu, Roberto Dessi, Roberta Raileanu, Maria Lomeli, Eric Hambro, Luke Zettlemoyer, Nicola Cancedda, Thomas Scialom
NIPS 2023 | Paper | Code | 💡 Not yet reproduced
DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature
Eric Mitchell, Yoonho Lee, Alexander Khazatsky, Christopher D Manning, Chelsea Finn
PMLR 2023 | Paper | Code | 💡 Not yet reproduced
Recitation-augmented language models
Zhiqing Sun, Xuezhi Wang, Yi Tay, Yiming Yang, Denny Zhou
ICLR 2023 | Paper | Code | 💡 Not yet reproduced
Self-Instruct: Aligning Language Models with Self-Generated Instructions
Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, Hannaneh Hajishirzi
ACL 2023 | Paper | Code | 💡 Not yet reproduced
Automatic chain of thought prompting in large language models
Zhuosheng Zhang, Aston Zhang, Mu Li, Alex Smola
ICLR 2023 | Paper | Code | 💡 Not yet reproduced
REALM: retrieval-augmented language model pre-training
Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-Wei Chang
ICML 2020 | Paper | Code | 💡 Not yet reproduced
Language Is Not All You Need: Aligning Perception with Language Models
Shaohan Huang, Li Dong, Wenhui Wang, Yaru Hao, Saksham Singhal, Shuming Ma, Tengchao Lv, Lei Cui, Owais Khan Mohammed, Barun Patra, Qiang Liu, Kriti Aggarwal, Zewen Chi, Nils Bjorck, Vishrav Chaudhary, Subhojit Som, Xia Song, Furu Wei
NIPS 2023 | Paper | Code | 💡 Not yet reproduced
Automatic Prompt Augmentation and Selection with Chain-of-Thought from Labeled Data
Kashun Shum, Shizhe Diao, Tong Zhang
EMNLP 2023 | Paper | Code | 💡 Not yet reproduced
Weakly Supervised Semantic Segmentation via Alternate Self-Dual Teaching
Dingwen Zhang, Wenyuan Zeng, Guangyu Guo, Chaowei Fang, Lechao Cheng, Ming-Ming Cheng, Junwei Han
TIP 2025 | Paper | 💡 Not yet reproduced
Text to Image for Multi-Label Image Recognition with Joint Prompt-Adapter Learning
Chun-Mei Feng, Kai Yu, Xinxing Xu, Salman Khan, Rick Siow Mong Goh, Wangmeng Zuo, Yong Liu
TPAMI 2024 | Paper | 💡 Not yet reproduced
Segment Concealed Objects with Incomplete Supervision
Chunming He, Kai Li, Yachao Zhang, Ziyun Yang, Youwei Pang, Longxiang Tang, Chengyu Fang, Yulun Zhang, Linghe Kong, Xiu Li, Sina Farsiu
TPAMI 2025 | Paper | Code | 💡 Not yet reproduced
Event-based Stereo Depth Estimation: A Survey
Suman Ghosh, Guillermo Gallego
TPAMI 2025 | Paper | 💡 Not yet reproduced
Efficient Low-Resolution Face Recognition via Bridge Distillation
Shiming Ge, Shengwei Zhao, Chenyu Li, Yu Zhang, Jia Li
TIP 2024 | Paper | 💡 Not yet reproduced
Towards Deviation-Robust Agent Navigation via Perturbation-Aware Contrastive Learning
Bingqian Lin, Yanxin Long, Yi Zhu, Fengda Zhu, Xiaodan Liang, Qixiang Ye, Liang Lin
TPAMI 2023 | Paper | Code | 💡 Not yet reproduced
Bayesian Estimate of Mean Proper Scores for Diversity-Enhanced Active Learning
Wei Tan, Lan Du, Wray Buntine
TPAMI 2023 | Paper | 💡 Not yet reproduced
Paragraph-to-Image Generation with Information-Enriched Diffusion Model
Weijia Wu, Zhuang Li, Yefei He, Mike Zheng Shou, Chunhua Shen, Lele Cheng, Yan Li, Tingting Gao, Di Zhang
IJCV 2025 | Paper | Code | 💡 Not yet reproduced
Inherit with Distillation and Evolve with Contrast: Exploring Class Incremental Semantic Segmentation Without Exemplar Memory
Danpei Zhao, Bo Yuan, Zhenwei Shi
TPAMI 2023 | Paper | 💡 Not yet reproduced
Dual Compensation Residual Networks for Class Imbalanced Learning
Ruibing Hou, Hong Chang, Bingpeng Ma, Shiguang Shan, Xilin Chen
TPAMI 2023 | Paper | 💡 Not yet reproduced
End-to-end Alternating Optimization for Real-World Blind Super Resolution
Zhengxiong Luo, Yan Huang, Shang Li, Liang Wang, Tieniu Tan
IJCV 2023 | Paper | Code | 💡 Not yet reproduced
YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection
Yuming Chen, Xinbin Yuan, Jiabao Wang, Ruiqi Wu, Xiang Li, Qibin Hou, Ming-Ming Cheng
TPAMI 2025 | Paper | Code | 💡 Not yet reproduced
A Survey on Graph Neural Networks for Time Series: Forecasting, Classification, Imputation, and Anomaly Detection
Ming Jin, Huan Yee Koh, Qingsong Wen, Daniele Zambon, Cesare Alippi, Geoffrey I. Webb, Irwin King, Shirui Pan
TPAMI 2024 | Paper | Code | 💡 Not yet reproduced
SplatFlow: Learning Multi-frame Optical Flow via Splatting
Bo Wang, Yifan Zhang, Jian Li, Yang Yu, Zhenping Sun, Li Liu, Dewen Hu
IJCV 2024 | Paper | Code | 💡 Not yet reproduced
Towards Expressive Spectral-Temporal Graph Neural Networks for Time Series Forecasting
Ming Jin, Guangsi Shi, Yuan-Fang Li, Bo Xiong, Tian Zhou, Flora D. Salim, Liang Zhao, Lingfei Wu, Qingsong Wen, Shirui Pan
TPAMI 2025 | Paper | 💡 Not yet reproduced
Efficient Halftoning via Deep Reinforcement Learning
Haitian Jiang, Dongliang Xiong, Xiaowen Jiang, Li Ding, Liang Chen, Kai Huang
TIP 2023 | Paper | 💡 Not yet reproduced
PAC-Bayes Bounds for Bandit Problems: A Survey and Experimental Comparison
Hamish Flynn, David Reeb, Melih Kandemir, Jan Peters
TPAMI 2023 | Paper | 💡 Not yet reproduced
Salient Object Detection via Dynamic Scale Routing
Zhenyu Wu, Shuai Li, Chenglizhao Chen, Hong Qin, Aimin Hao
TIP 2022 | Paper | Code | 💡 Not yet reproduced
Twin Contrastive Learning for Online Clustering
Yunfan Li, Mouxing Yang, Dezhong Peng, Taihao Li, Jiantao Huang, Xi Peng
IJCV 2022 | Paper | 💡 Not yet reproduced
Kernel-Based Generalized Median Computation for Consensus Learning
Andreas Nienkötter, Xiaoyi Jiang
TPAMI 2022 | Paper | 💡 Not yet reproduced
A Tale of HodgeRank and Spectral Method: Target Attack Against Rank Aggregation Is the Fixed Point of Adversarial Game
Ke Ma, Qianqian Xu, Jinshan Zeng, Guorong Li, Xiaochun Cao, Qingming Huang
TPAMI 2022 | Paper | 💡 Not yet reproduced
Boosting Night-time Scene Parsing with Learnable Frequency
Zhifeng Xie, Sen Wang, Ke Xu, Zhizhong Zhang, Xin Tan, Yuan Xie, Lizhuang Ma
TIP 2023 | Paper | 💡 Not yet reproduced
SiamMask: A Framework for Fast Online Object Tracking and Segmentation
Weiming Hu, Qiang Wang, Li Zhang, Luca Bertinetto, Philip H. S. Torr
TPAMI 2022 | Paper | 💡 Not yet reproduced
SERE: Exploring Feature Self-relation for Self-supervised Transformer
Zhong-Yu Li, Shanghua Gao, Ming-Ming Cheng
TPAMI 2023 | Paper | Code | 💡 Not yet reproduced
Parallel and Distributed Graph Neural Networks: An In-Depth Concurrency Analysis
Maciej Besta, Torsten Hoefler
TPAMI 2023 | Paper | 💡 Not yet reproduced
Efficient Burst Raw Denoising with Variance Stabilization and Multi-frequency Denoising Network
Dasong Li, Yi Zhang, Ka Lung Law, Xiaogang Wang, Hongwei Qin, Hongsheng Li
IJCV 2022 | Paper | 💡 Not yet reproduced
Incomplete Gamma Kernels: Generalizing Locally Optimal Projection Operators
Patrick Stotko, Michael Weinmann, Reinhard Klein
TPAMI 2024 | Paper | Code | 💡 Not yet reproduced
From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
Tianwei Yin, Qiang Zhang, Richard Zhang, William T. Freeman, Fredo Durand, Eli Shechtman, Xun Huang
CVPR 2025 | Paper | Code | 💡 Not yet reproduced
FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention
Guangxuan Xiao, Tianwei Yin, William T. Freeman, Frédo Durand, Song Han
IJCV 2024 | Paper | Code | 💡 Not yet reproduced
ROGRAG: A Robustly Optimized GraphRAG Framework
Zhefan Wang, Huanjun Kong, Jie Ying, Wanli Ouyang, Nanqing Dong
ACL 2025 | Paper | Code | 💡 Not yet reproduced
Dual Encoder GAN Inversion for High-Fidelity 3D Head Reconstruction from Single Images
Bahri Batuhan Bilecen, Ahmet Berke Gokmen, Aysegul Dundar
NIPS 2024 | Paper | Code | 💡 Not yet reproduced
Can We Leave Deepfake Data Behind in Training Deepfake Detector
Jikang Cheng, Zhiyuan Yan, Ying Zhang, Yuhao Luo, Zhongyuan Wang, Chen Li
NIPS 2024 | Paper | Code | 💡 Not yet reproduced
VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset
Sihan Chen, Handong Li, Qunbo Wang, Zijia Zhao, Mingzhen Sun, Xinxin Zhu, Jing Liu
NIPS 2023 | Paper | Code | 💡 Not yet reproduced
MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors
Riku Murai, Eric Dexheimer, Andrew J. Davison
CVPR 2025 | Paper | Code | 💡 Not yet reproduced
MAGiC-SLAM: Multi-Agent Gaussian Globally Consistent SLAM
Vladimir Yugay, Theo Gevers, Martin R. Oswald
CVPR 2025 | Paper | Code | 💡 Not yet reproduced
Murre: Multi-view Reconstruction via SfM-guided Monocular Depth Estimation
Haoyu Guo, He Zhu, Sida Peng, Haotong Lin, Yunzhi Yan, Tao Xie, Wenguan Wang, Xiaowei Zhou, Hujun Bao
CVPR 2025 | Paper | Code | 💡 Not yet reproduced
STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution
Rui Xie, Yinhong Liu, Penghao Zhou, Chen Zhao, Jun Zhou, Kai Zhang, Zhenyu Zhang, Jian Yang, Zhenheng Yang, Ying Tai
ICCV 2025 | Paper | Code | 💡 Not yet reproduced
SemanticDraw: Towards Real-Time Interactive Content Creation from Image Diffusion Models
Jaerin Lee, Daniel Sungho Jung, Kanggeon Lee, Kyoung Mu Lee
CVPR 2025 | Paper | Code | 💡 Not yet reproduced
HSMR: Reconstructing Humans with a Biomechanically Accurate Skeleton
Yan Xia, Xiaowei Zhou, Etienne Vouga, Qixing Huang, Georgios Pavlakos
CVPR 2025 | Paper | Code | 💡 Not yet reproduced
RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness
Tianyu Yu, Haoye Zhang, Qiming Li, Qixin Xu, Yuan Yao, Da Chen, Xiaoman Lu, Ganqu Cui, Yunkai Dang, Taiwen He, Xiaocheng Feng, Jun Song, Bo Zheng, Zhiyuan Liu, Tat-Seng Chua, Maosong Sun
CVPR 2025 | Paper | Code | 💡 Not yet reproduced
DFormer: Rethinking RGBD Representation Learning for Semantic Segmentation
Bowen Yin, Xuying Zhang, Zhongyu Li, Li Liu, Ming-Ming Cheng, Qibin Hou
CVPR 2025 | Paper | Code | 💡 Not yet reproduced
GaussianCity: Generative Gaussian Splatting for Unbounded 3D City Generation
Haozhe Xie, Zhaoxi Chen, Fangzhou Hong, Ziwei Liu
CVPR 2025 | Paper | Code | 💡 Not yet reproduced
PhysTwin: Physics-Informed Reconstruction and Simulation of Deformable Objects from Videos
Hanxiao Jiang, Hao-Yu Hsu, Kaifeng Zhang, Hsin-Ni Yu, Shenlong Wang, Yunzhu Li
ICCV 2025 | Paper | Code | 💡 Not yet reproduced
UniGoal: Towards Universal Zero-shot Goal-oriented Navigation
Hang Yin, Xiuwei Xu, Lingqing Zhao, Ziwei Wang, Jie Zhou, Jiwen Lu
CVPR 2025 | Paper | Code | 💡 Not yet reproduced
Momentum-GS: Momentum Gaussian Self-Distillation for High-Quality Large Scene Reconstruction
Jixuan Fan, Wanhua Li, Yifei Han, Yansong Tang
ICCV 2025 | Paper | Code | 💡 Not yet reproduced
MINIMA: Modality Invariant Image Matching
Jiangwei Ren, Xingyu Jiang, Zizhuo Li, Dingkang Liang, Xin Zhou, Xiang Bai
CVPR 2025 | Paper | Code | 💡 Not yet reproduced
Uni4D: Unifying Visual Foundation Models for 4D Modeling from a Single Video
David Yifan Yao, Albert J. Zhai, Shenlong Wang
CVPR 2025 | Paper | Code | 💡 Not yet reproduced
Diffusion Renderer: Neural Inverse and Forward Rendering with Video Diffusion Models
Ruofan Liang, Zan Gojcic, Huan Ling, Jacob Munkberg, Jon Hasselgren, Zhi-Hao Lin, Jun Gao, Alexander Keller, Nandita Vijaykumar, Sanja Fidler, Zian Wang
CVPR 2025 | Paper | Code | 💡 Not yet reproduced
Linear Programming Bounds on k-Uniform States
Yu Ning, Fei Shi, Tao Luo, Xiande Zhang
ICCV 2025 | Paper | Code | 💡 Not yet reproduced
LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation
Jiaxiang Tang, Zhaoxi Chen, Xiaokang Chen, Tengfei Wang, Gang Zeng, Ziwei Liu
ECCV 2024 | Paper | Code | 💡 Not yet reproduced
VideoMamba: State Space Model for Efficient Video Understanding
Kunchang Li, Xinhao Li, Yi Wang, Yinan He, Yali Wang, Limin Wang, Yu Qiao
ECCV 2024 | Paper | Code | 💡 Not yet reproduced
DriveLM: Driving with Graph Visual Question Answering
Chonghao Sima, Katrin Renz, Kashyap Chitta, Li Chen, Hanxue Zhang, Chengen Xie, Jens Beißwenger, Ping Luo, Andreas Geiger, Hongyang Li
ECCV 2024 | Paper | Code | 💡 Not yet reproduced
GRiT: A Generative Region-to-text Transformer for Object Understanding
Jialian Wu, Jianfeng Wang, Zhengyuan Yang, Zhe Gan, Zicheng Liu, Junsong Yuan, Lijuan Wang
ECCV 2024 | Paper | Code | 💡 Not yet reproduced
PointLLM: Empowering Large Language Models to Understand Point Clouds
Runsen Xu, Xiaolong Wang, Tai Wang, Yilun Chen, Jiangmiao Pang, Dahua Lin
ECCV 2024 | Paper | Code | 💡 Not yet reproduced
nuCraft: Crafting High Resolution 3D Semantic Occupancy for Unified 3D Scene Understanding
Benjin Zhu, Zhe Wang, Hongsheng Li
ECCV 2024 | Paper | Code | 💡 Not yet reproduced
Adversarial Diffusion Distillation
Axel Sauer, Dominik Lorenz, Andreas Blattmann, Robin Rombach
ECCV 2024 | Paper | Code | 💡 Not yet reproduced
Generative Image Dynamics
Zhengqi Li, Richard Tucker, Noah Snavely, Aleksander Holynski
CVPR 2024 Best Paper | Paper | Code | 💡 Not yet reproduced
Rich Human Feedback for Text-to-Image Generation
Youwei Liang, Junfeng He, Gang Li, Peizhao Li, Arseniy Klimovskiy, Nicholas Carolan, Jiao Sun, Jordi Pont-Tuset, Sarah Young, Feng Yang, Junjie Ke, Krishnamurthy Dj Dvijotham, Katie Collins, Yiwen Luo, Yang Li, Kai J Kohlhoff, Deepak Ramachandran, Vidhya Navalpakkam
CVPR 2024 Best Paper | Paper | Code | 💡 Not yet reproduced
Mip-Splatting: Alias-free 3D Gaussian Splatting
Zehao Yu, Anpei Chen, Binbin Huang, Torsten Sattler, Andreas Geiger
CVPR 2024 Best Student Paper | Paper | Code | 💡 Not yet reproduced
BioCLIP: A Vision Foundation Model for the Tree of Life
Samuel Stevens, Jiaman Wu, Matthew J Thompson, Elizabeth G Campolongo, Chan Hee Song, David Edward Carlyn, Li Dong, Wasila M Dahdul, Charles Stewart, Tanya Berger-Wolf, Wei-Lun Chao, Yu Su
CVPR 2024 Best Student Paper | Paper | Code | 💡 Not yet reproduced
4D Gaussian Splatting for Real-Time Dynamic Scene Rendering
Guanjun Wu, Taoran Yi, Jiemin Fang, Lingxi Xie, Xiaopeng Zhang, Wei Wei, Wenyu Liu, Qi Tian, Xinggang Wang
CVPR 2024 | Paper | Code | 💡 Not yet reproduced
Depth Anything: Unleashing The Power of Large-Scale Unlabeled Data
Lihe Yang, Bingyi Kang, Zilong Huang, Xiaogang Xu, Jiashi Feng, Hengshuang Zhao
CVPR 2024 | Paper | Code | 💡 Not yet reproduced
LISA: Reasoning Segmentation Via Large Language Model
Xin Lai, Zhuotao Tian, Yukang Chen, Yanwei Li, Yuhui Yuan, Shu Liu, Jiaya Jia
CVPR 2024 | Paper | Code | 💡 Not yet reproduced
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks
Zhe Chen, Jiannan Wu, Wenhai Wang, Weijie Su, Guo Chen, Sen Xing, Muyan Zhong, Qinglong Zhang, Xizhou Zhu, Lewei Lu, Bin Li, Ping Luo, Tong Lu, Yu Qiao, Jifeng Dai
CVPR 2024 | Paper | Code | 💡 Not yet reproduced
MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark
Xiang Yue, Yuansheng Ni, Kai Zhang, Tianyu Zheng, Ruoqi Liu, Ge Zhang, Samuel Stevens, Dongfu Jiang, Weiming Ren, Yuxuan Sun, Cong Wei, Botao Yu, Ruibin Yuan, Renliang Sun, Ming Yin, Boyuan Zheng, Zhenzhu Yang, Yibo Liu, Wenhao Huang, Huan Sun, Yu Su, Wenhu Chen
CVPR 2024 | Paper | Code | 💡 Not yet reproduced
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
Yunyang Xiong, Bala Varadarajan, Lemeng Wu, Xiaoyu Xiang, Fanyi Xiao, Chenchen Zhu, Xiaoliang Dai, Dilin Wang, Fei Sun, Forrest Iandola, Raghuraman Krishnamoorthi, Vikas Chandra
CVPR 2024 | Paper | Code | 💡 Not yet reproduced
Improved Baselines with Visual Instruction Tuning (LLaVA-1.5)
Haotian Liu, Chunyuan Li, Yuheng Li, Yong Jae Lee
CVPR 2024 | Paper | Code | 💡 Not yet reproduced
DemoFusion: Democratising High-Resolution Image Generation With No $$$
Ruoyi Du, Dongliang Chang, Timothy Hospedales, Yi-Zhe Song, Zhanyu Ma
CVPR 2024 | Paper | Code | 💡 Not yet reproduced
ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models
Lukas Höllein, Aljaž Božič, Norman Müller, David Novotny, Hung-Yu Tseng, Christian Richardt, Michael Zollhöfer, Matthias Nießner
CVPR 2024 | Paper | Code | 💡 Not yet reproduced
OmniGlue: Generalizable Feature Matching with Foundation Model Guidance
Hanwen Jiang, Arjun Karpur, Bingyi Cao, Qixing Huang, Andre Araujo
CVPR 2024 | Paper | Code | 💡 Not yet reproduced
DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks
Jiaxin Zhang, Dezhi Peng, Chongyu Liu, Peirong Zhang, Lianwen Jin
CVPR 2024 | Paper | Code | 💡 Not yet reproduced
MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training
Pavan Kumar Anasosalu Vasu, Hadi Pouransari, Fartash Faghri, Raviteja Vemulapalli, Oncel Tuzel
CVPR 2024 | Paper | Code | 💡 Not yet reproduced
Describing Differences in Image Sets with Natural Language
Lisa Dunlap, Yuhui Zhang, Xiaohan Wang, Ruiqi Zhong, Trevor Darrell, Jacob Steinhardt, Joseph E. Gonzalez, Serena Yeung-Levy
CVPR 2024 | Paper | Code | 💡 Not yet reproduced
XFeat: Accelerated Features for Lightweight Image Matching
Guilherme Potje, Felipe Cadar, Andre Araujo, Renato Martins, Erickson R. Nascimento
CVPR 2024 | Paper | Code | 💡 Not yet reproduced
pixelSplat: 3D Gaussian Splats from Image Pairs for Scalable Generalizable 3D Reconstruction
David Charatan, Sizhe Li, Andrea Tagliasacchi, Vincent Sitzmann
CVPR 2024 | Paper | 💡 Not yet reproduced
GPT4Point: A Unified Framework for Point-Language Understanding and Generation
Zhangyang Qi, Ye Fang, Zeyi Sun, Xiaoyang Wu, Tong Wu, Jiaqi Wang, Dahua Lin, Hengshuang Zhao
CVPR 2024 | Paper | 💡 Not yet reproduced
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks
Bin Xiao, Haiping Wu, Weijian Xu, Xiyang Dai, Houdong Hu, Yumao Lu, Michael Zeng, Ce Liu, Lu Yuan
CVPR 2024 | Paper | 💡 Not yet reproduced
Identity-Preserving Text-to-Video Generation by Frequency Decomposition
Shenghai Yuan, Jinfa Huang, Xianyi He, Yunyuan Ge, Yujun Shi, Liuhan Chen, Jiebo Luo, Li Yuan
CVPR 2025 | Paper | Code | 💡 Not yet reproduced
Cinemo: Consistent and Controllable Image Animation with Motion Diffusion Models
Xin Ma, Yaohui Wang, Gengyun Jia, Xinyuan Chen, Yuan-Fang Li, Cunjian Chen, Yu Qiao
CVPR 2025 | Paper | Code | 💡 Not yet reproduced
X-Dyna: Expressive Dynamic Human Image Animation
Di Chang, Hongyi Xu, You Xie, Yipeng Gao, Zhengfei Kuang, Shengqu Cai, Chenxu Zhang, Guoxian Song, Chao Wang, Yichun Shi, Zeyuan Chen, Shijie Zhou, Linjie Luo, Gordon Wetzstein, Mohammad Soleymani
CVPR 2025 | Paper | Code | 💡 Not yet reproduced
PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation
Qiyao Xue, Xiangyu Yin, Boyuan Yang, Wei Gao
CVPR 2025 | Paper | Code | 💡 Not yet reproduced
Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model
Feng Liu, Shiwei Zhang, Xiaofeng Wang, Yujie Wei, Haonan Qiu, Yuzhong Zhao, Yingya Zhang, Qixiang Ye, Fang Wan
CVPR 2025 | Paper | Code | 💡 Not yet reproduced
AR-Diffusion: Asynchronous Video Generation with Auto-Regressive Diffusion
Mingzhen Sun, Weining Wang, Gen Li, Jiawei Liu, Jiahui Sun, Wanquan Feng, Shanshan Lao, SiYu Zhou, Qian He, Jing Liu
CVPR 2025 | Paper | Code | 💡 Not yet reproduced
Number it: Temporal Grounding Videos like Flipping Manga
Yongliang Wu, Xinting Hu, Yuyang Sun, Yizhou Zhou, Wenbo Zhu, Fengyun Rao, Bernt Schiele, Xu Yang
CVPR 2025 | Paper | Code | 💡 Not yet reproduced
Edit Away and My Face Will not Stay: Personal Biometric Defense against Malicious Generative Editing
Hanhui Wang, Yihua Zhang, Ruizheng Bai, Yue Zhao, Sijia Liu, Zhengzhong Tu
CVPR 2025 | Paper | Code | 💡 Not yet reproduced
h-Edit: Effective and Flexible Diffusion-Based Editing via Doob’s h-Transform
Toan Nguyen, Kien Do, Duc Kieu, Thin Nguyen
CVPR 2025 | Paper | Code | 💡 Not yet reproduced
OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels
Meng Lou, Yizhou Yu
CVPR 2025 oral | Paper | Code | 💡 Not yet reproduced
Alias-Free Latent Diffusion Models: Improving Fractional Shift Equivariance of Diffusion Latent Space
Yifan Zhou, Zeqi Xiao, Shuai Yang, Xingang Pan
CVPR 2025 oral | Paper | Code | 💡 Not yet reproduced
3D Student Splatting and Scooping
Jialin Zhu, Jiangbei Yue, Feixiang He, He Wang
CVPR 2025 oral | Paper | Code | 💡 Not yet reproduced
CAP4D: Creating Animatable 4D Portrait Avatars with Morphable Multi-View Diffusion Models
Felix Taubner, Ruihang Zhang, Mathieu Tuli, David B. Lindell
CVPR 2025 oral | Paper | Code | 💡 Not yet reproduced
Multi-view Reconstruction via SfM-guided Monocular Depth Estimation
Haoyu Guo, He Zhu, Sida Peng, Haotong Lin, Yunzhi Yan, Tao Xie, Wenguan Wang, Xiaowei Zhou, Hujun Bao
CVPR 2025 oral | Paper | Code | 💡 Not yet reproduced
Closed-Loop Supervised Fine-Tuning of Tokenized Traffic Models
Zhejun Zhang, Peter Karkus, Maximilian Igl, Wenhao Ding, Yuxiao Chen, Boris Ivanovic, Marco Pavone
CVPR 2025 oral | Paper | Code | 💡 Not yet reproduced
CustAny: Customizing Anything from A Single Example
Lingjie Kong, Kai Wu, Xiaobin Hu, Wenhui Han, Jinlong Peng, Chengming Xu, Donghao Luo, Mengtian Li, Jiangning Zhang, Chengjie Wang, Yanwei Fu
CVPR 2025 oral | Paper | Code | 💡 Not yet reproduced
VGGT: Visual Geometry Grounded Transformer
Jianyuan Wang, Minghao Chen, Nikita Karaev, Andrea Vedaldi, Christian Rupprecht, David Novotny
CVPR 2025 oral, Award Candidate | Paper | Code | 💡 Not yet reproduced
Navigation World Models
Amir Bar, Gaoyue Zhou, Danny Tran, Trevor Darrell, Yann LeCun
CVPR 2025 oral | Paper | Code | 💡 Not yet reproduced
MegaSaM: Accurate, Fast, and Robust Structure and Motion from Casual Dynamic Videos
Zhengqi Li, Richard Tucker, Forrester Cole, Qianqian Wang, Linyi Jin, Vickie Ye, Angjoo Kanazawa, Aleksander Holynski, Noah Snavely
CVPR 2025 oral | Paper | Code | 💡 Not yet reproduced
FoundationStereo: Zero-Shot Stereo Matching
Bowen Wen, Matthew Trepte, Joseph Aribido, Jan Kautz, Orazio Gallo, Stan Birchfield
CVPR 2025 oral, Award Candidate | Paper | Code | 💡 Not yet reproduced
The PanAf-FGBG Dataset: Understanding the Impact of Backgrounds in Wildlife Behaviour Recognition
Otto Brookes, Maksim Kukushkin, Majid Mirmehdi, Colleen Stephens, Paula Dieguez, Thurston C. Hicks, Sorrel Jones, Kevin Lee, Maureen S. McCarthy, Amelia Meier, Emmanuelle Normand, Erin G. Wessling, Roman M. Wittig, Kevin Langergraber, Klaus Zuberbühler, Lukas Boesch, Thomas Schmid, Mimi Arandjelovic, Hjalmar Kühl, Tilo Burghardt
CVPR 2025 oral, Award Candidate | Paper | Code | 💡 Not yet reproduced
Improving Diffusion Inverse Problem Solving with Decoupled Noise Annealing
Bingliang Zhang, Wenda Chu, Julius Berner, Chenlin Meng, Anima Anandkumar, Yang Song
CVPR 2025 oral | Paper | Code | 💡 Not yet reproduced
MV-DUSt3R+: Single-Stage Scene Reconstruction from Sparse Views In 2 Seconds
Zhenggang Tang, Yuchen Fan, Dilin Wang, Hongyu Xu, Rakesh Ranjan, Alexander Schwing, Zhicheng Yan
CVPR 2025 oral | Paper | Code | 💡 Not yet reproduced
DIFIX3D+: Improving 3D Reconstructions with Single-Step Diffusion Models
Jay Zhangjie Wu, Yuxuan Zhang, Haithem Turki, Xuanchi Ren, Jun Gao, Mike Zheng Shou, Sanja Fidler, Zan Gojcic, Huan Ling
CVPR 2025 oral, Award Candidate | Paper | Code | 💡 Not yet reproduced
DIFFUSIONRENDERER: Neural Inverse and Forward Rendering with Video Diffusion Models
Ruofan Liang, Zan Gojcic, Huan Ling, Jacob Munkberg, Jon Hasselgren, Zhi-Hao Lin, Jun Gao, Alexander Keller, Nandita Vijaykumar, Sanja Fidler, Zian Wang
CVPR 2025 oral | Paper | Code | 💡 Not yet reproduced
OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation
Pengfei Zhou, Xiaopeng Peng, Jiajun Song, Chuanhao Li, Zhaopan Xu, Yue Yang, Ziyao Guo, Hao Zhang, Yuqi Lin, Yefei He, Lirui Zhao, Shuo Liu, Tianhua Li, Yuxuan Xie, Xiaojun Chang, Yu Qiao, Wenqi Shao, Kaipeng Zhang
CVPR 2025 oral | Paper | Code | 💡 Not yet reproduced
RandAR: Decoder-only Autoregressive Visual Generation in Random Orders
Ziqi Pang, Tianyuan Zhang, Fujun Luan, Yunze Man, Hao Tan, Kai Zhang, William T. Freeman, Yu-Xiong Wang
CVPR 2025 oral | Paper | Code | 💡 Not yet reproduced
AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea
Qifan Yu, Wei Chow, Zhongqi Yue, Kaihang Pan, Yang Wu, Xiaoyang Wan, Juncheng Li, Siliang Tang, Hanwang Zhang, Yueting Zhuang
CVPR 2025 oral | Paper | Code | 💡 Not yet reproduced
VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection
Songhao Han, Wei Huang, Hairong Shi, Le Zhuo, Xiu Su, Shifeng Zhang, Xu Zhou, Xiaojuan Qi, Yue Liao, Si Liu
CVPR 2025 oral | Paper | Code | 💡 Not yet reproduced
SegEarth-OV: Towards Training-Free Open-Vocabulary Segmentation for Remote Sensing Images
Kaiyu Li, Ruixun Liu, Xiangyong Cao, Xueru Bai, Feng Zhou, Deyu Meng, Zhi Wang
CVPR 2025 oral | Paper | Code | 💡 Not yet reproduced
Minority-Focused Text-to-Image Generation via Prompt Optimization
Soobin Um, Jong Chul Ye
CVPR 2025 oral | Paper | Code | 💡 Not yet reproduced
Autoregressive Distillation of Diffusion Transformers
Yeongmin Kim, Sotiris Anagnostidis, Yuming Du, Edgar Schönfeld, Jonas Kohler, Markos Georgopoulos, Albert Pumarola, Ali Thabet, Artsiom Sanakoyeu
CVPR 2025 oral | Paper | Code | 💡 Not yet reproduced
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models
Matt Deitke, Christopher Clark, Sangho Lee, Rohun Tripathi, Yue Yang, Jae Sung Park, Mohammadreza Salehi, Niklas Muennighoff, Kyle Lo, Luca Soldaini, Jiasen Lu, Taira Anderson, Erin Bransom, Kiana Ehsani, Huong Ngo, YenSung Chen, Ajay Patel, Mark Yatskar, Chris Callison-Burch, Andrew Head, Rose Hendrix, Favyen Bastani, Eli VanderBilt, Nathan Lambert, Yvonne Chou, Arnavi Chheda, Jenna Sparks, Sam Skjonsberg, Michael Schmitz, Aaron Sarnat, Byron Bischoff, Pete Walsh, Chris Newell, Piper Wolters, Tanmay Gupta, Kuo-Hao Zeng, Jon Borchardt, Dirk Groeneveld, Crystal Nam, Sophie Lebrecht, Caitlin Wittlif, Carissa Schoenick, Oscar Michel, Ranjay Krishna, Luca Weihs, Noah A. Smith, Hannaneh Hajishirzi, Ross Girshick, Ali Farhadi, Aniruddha Kembhavi
CVPR 2025 oral | Paper | Code | 💡 Not yet reproduced
Continuous 3D Perception Model with Persistent State
Qianqian Wang, Yifei Zhang, Aleksander Holynski, Alexei A. Efros, Angjoo Kanazawa
CVPR 2025 oral | Paper | Code | 💡 Not yet reproduced
Q-Eval-100K: Evaluating Visual Quality and Alignment Level for Text-to-Vision Content
Zicheng Zhang, Tengchuan Kou, Shushi Wang, Chunyi Li, Wei Sun, Wei Wang, Xiaoyu Li, Zongyu Wang, Xuezhi Cao, Xiongkuo Min, Xiaohong Liu, Guangtao Zhai
CVPR 2025 oral | Paper | Code | 💡 Not yet reproduced
Video-XL: Extra-Long Vision Language Model for Hour-Scale Video Understanding
Yan Shu, Zheng Liu, Peitian Zhang, Minghao Qin, Junjie Zhou, Zhengyang Liang, Tiejun Huang, Bo Zhao
CVPR 2025 oral | Paper | Code | 💡 Not yet reproduced
Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using Real-Time Warped Noise
Ryan Burgert, Yuancheng Xu, Wenqi Xian, Oliver Pilarski, Pascal Clausen, Mingming He, Li Ma, Yitong Deng, Lingxiao Li, Mohsen Mousavi, Michael Ryoo, Paul Debevec, Ning Yu
CVPR 2025 oral | Paper | Code | 💡 Not yet reproduced
CleanDIFT: Diffusion Features without Noise
Nick Stracke, Stefan Andreas Baumann, Kolja Bauer, Frank Fundel, Björn Ommer
CVPR 2025 oral | Paper | Code | 💡 Not yet reproduced
CraftsMan3D: High-fidelity Mesh Generation with 3D Native Diffusion and Interactive Geometry Refiner
Weiyu Li, Jiarui Liu, Hongyu Yan, Rui Chen, Yixun Liang, Xuelin Chen, Ping Tan, Xiaoxiao Long
CVPR 2025 oral | Paper | Code | 💡 Not yet reproduced
DreamRelation: Bridging Customization and Relation Generation
Qingyu Shi, Lu Qi, Jianzong Wu, Jinbin Bai, Jingbo Wang, Yunhai Tong, Xiangtai Li
CVPR 2025 oral | Paper | Code | 💡 Not yet reproduced
Generative Multimodal Pretraining with Discrete Diffusion Timestep Tokens
Kaihang Pan, Wang Lin, Zhongqi Yue, Tenglong Ao, Liyu Jia, Wei Zhao, Juncheng Li, Siliang Tang, Hanwang Zhang
CVPR 2025 oral | Paper | Code | 💡 Not yet reproduced
3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion
Zhaoxi Chen, Jiaxiang Tang, Yuhao Dong, Ziang Cao, Fangzhou Hong, Yushi Lan, Tengfei Wang, Haozhe Xie, Tong Wu, Shunsuke Saito, Liang Pan, Dahua Lin, Ziwei Liu
CVPR 2025 highlight | Paper | Code | 💡 Not yet reproduced
AnySat: One Earth Observation Model for Many Resolutions, Scales, and Modalities
Guillaume Astruc, Nicolas Gonthier, Clement Mallet, Loic Landrieu
CVPR 2025 highlight | Paper | Code | 💡 Not yet reproduced
Cross-modal Causal Relation Alignment for Video Question Grounding
Weixing Chen, Yang Liu, Binglin Chen, Jiandong Su, Yongsen Zheng, Liang Lin
CVPR 2025 highlight | Paper | Code | 💡 Not yet reproduced
DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos
Wenbo Hu, Xiangjun Gao, Xiaoyu Li, Sijie Zhao, Xiaodong Cun, Yong Zhang, Long Quan, Ying Shan
CVPR 2025 highlight | Paper | Code | 💡 Not yet reproduced
DroneSplat: 3D Gaussian Splatting for Robust 3D Reconstruction from In-the-Wild Drone Imagery
Jiadong Tang, Yu Gao, Dianyi Yang, Liqi Yan, Yufeng Yue, Yi Yang
CVPR 2025 highlight | Paper | Code | 💡 Not yet reproduced
Empowering Vector Graphics with Consistently Arbitrary Viewing and View-dependent Visibility
Yidi Li, Jun Xiao, Zhengda Lu, Yiqun Wang, Haiyong Jiang
CVPR 2025 highlight | Paper | Code | 💡 Not yet reproduced
Extrapolating and Decoupling Image-to-Video Generation Models: Motion Modeling is Easier Than You Think
Jie Tian, Xiaoye Qu, Zhenyi Lu, Wei Wei, Sichen Liu, Yu Cheng
CVPR 2025 highlight | Paper | Code | 💡 Not yet reproduced
ETAP: Event-based Tracking of Any Point
Friedhelm Hamann, Daniel Gehrig, Filbert Febryanto, Kostas Daniilidis, Guillermo Gallego
CVPR 2025 highlight | Paper | Code | 💡 Not yet reproduced
Flash3D: Super-scaling Point Transformers through Joint Hardware-Geometry Locality
Liyan Chen, Gregory P. Meyer, Zaiwei Zhang, Eric M. Wolff, Paul Vernaza
CVPR 2025 highlight | Paper | Code | 💡 Not yet reproduced
Flowing from Words to Pixels: A Noise-Free Framework for Cross-Modality Evolution
Qihao Liu, Xi Yin, Alan Yuille, Andrew Brown, Mannat Singh
CVPR 2025 highlight | Paper | Code | 💡 Not yet reproduced
Generative Photography: Scene-Consistent Camera Control for Realistic Text-to-Image Synthesis
Yu Yuan, Xijun Wang, Yichen Sheng, Prateek Chennuri, Xingguang Zhang, Stanley Chan
CVPR 2025 highlight | Paper | Code | 💡 Not yet reproduced
Generative Multiview Relighting for 3D Reconstruction under Extreme Illumination Variation
Hadi Alzayer, Philipp Henzler, Jonathan T. Barron, Jia-Bin Huang, Pratul P. Srinivasan, Dor Verbin
CVPR 2025 highlight | Paper | Code | 💡 Not yet reproduced
Generative Densification: Learning to Densify Gaussians for High-Fidelity Generalizable 3D Reconstruction
Seungtae Nam, Xiangyu Sun, Gyeongjin Kang, Younggeun Lee, Seungjun Oh, Eunbyung Park
CVPR 2025 highlight | Paper | Code | 💡 Not yet reproduced
GEN3C: 3D-Informed World-Consistent Video Generation with Precise Camera Control
Xuanchi Ren, Tianchang Shen, Jiahui Huang, Huan Ling, Yifan Lu, Merlin Nimier-David, Thomas Müller, Alexander Keller, Sanja Fidler, Jun Gao
CVPR 2025 highlight | Paper | Code | 💡 Not yet reproduced
Holmes-VAU: Towards Long-term Video Anomaly Understanding at Any Granularity
Huaxin Zhang, Xiaohao Xu, Xiang Wang, Jialong Zuo, Xiaonan Huang, Changxin Gao, Shanjun Zhang, Li Yu, Nong Sang
CVPR 2025 highlight | Paper | Code | 💡 Not yet reproduced
HELVIPAD: A Real-World Dataset for Omnidirectional Stereo Depth Estimation
Mehdi Zayene, Jannik Endres, Albias Havolli, Charles Corbière, Salim Cherkaoui, Alexandre Kontouli, Alexandre Alahi
CVPR 2025 highlight | Paper | Code | 💡 Not yet reproduced
ImViD: Immersive Volumetric Videos for Enhanced VR Engagement
Zhengxian Yang, Shi Pan, Shengqi Wang, Haoxiang Wang, Li Lin, Guanjun Li, Zhengqi Wen, Borong Lin, Jianhua Tao, Tao Yu
CVPR 2025 highlight | Paper | Code | 💡 Not yet reproduced
Inst3D-LMM: Instance-Aware 3D Scene Understanding with Multi-modal Instruction Tuning
Hanxun Yu, Wentong Li, Song Wang, Junbo Chen, Jianke Zhu
CVPR 2025 highlight | Paper | Code | 💡 Not yet reproduced
Learning 4D Panoptic Scene Graph Generation from Rich 2D Visual Scene
Shengqiong Wu, Hao Fei, Jingkang Yang, Xiangtai Li, Juncheng Li, Hanwang Zhang, Tat-seng Chua
CVPR 2025 highlight | Paper | Code | 💡 Not yet reproduced
Mamba as a Bridge: Where Vision Foundation Models Meet Vision Language Models for Domain-Generalized Semantic Segmentation
Xin Zhang, Robby T. Tan
CVPR 2025 highlight | Paper | Code | 💡 Not yet reproduced
MammAlps: A Multi-view Video Behavior Monitoring Dataset of Wild Mammals in the Swiss Alps
Valentin Gabeff, Haozhe Qi, Brendan Flaherty, Gencer Sumbül, Alexander Mathis, Devis Tuia
CVPR 2025 highlight | Paper | Code | 💡 Not yet reproduced
Matrix3D: Large Photogrammetry Model All-in-One
Yuanxun Lu, Jingyang Zhang, Tian Fang, Jean-Daniel Nahmias, Yanghai Tsin, Long Quan, Xun Cao, Yao Yao, Shiwei Li
CVPR 2025 highlight | Paper | Code | 💡 Not yet reproduced
MITracker: Multi-View Integration for Visual Object Tracking
Mengjie Xu, Yitao Zhu, Haotian Jiang, Jiaming Li, Zhenrong Shen, Sheng Wang, Haolin Huang, Xinyu Wang, Qing Yang, Han Zhang, Qian Wang
CVPR 2025 highlight | Paper | Code | 💡 Not yet reproduced
Open-Canopy: Towards Very High Resolution Forest Monitoring
Fajwel Fogel, Yohann Perron, Nikola Besic, Laurent Saint-André, Agnès Pellissier-Tanon, Martin Schwartz, Thomas Boudras, Ibrahim Fayad, Alexandre d'Aspremont, Loic Landrieu, Philippe Ciais
CVPR 2025 highlight 原文 | 代码 💡 未复现
OpenHumanVid: A Large-Scale High-Quality Dataset for Enhancing Human-Centric Video Generation
Hui Li, Mingwang Xu, Yun Zhan, Shan Mu, Jiaye Li, Kaihui Cheng, Yuxuan Chen, Tan Chen, Mao Ye, Jingdong Wang, Siyu Zhu
CVPR 2025 highlight 原文 | 代码 💡 未复现
OpticalNet: An Optical Imaging Dataset and Benchmark Beyond the Diffraction Limit
Benquan Wang, Ruyi An, Jin-Kyu So, Sergei Kurdiumov, Eng Aik Chan, Giorgio Adamo, Yuhan Peng, Yewen Li, Bo An
CVPR 2025 highlight 原文 | 代码 💡 未复现
Optimizing for the Shortest Path in Denoising Diffusion Model
Ping Chen, Xingpeng Zhang, Zhaoxiang Liu, Huan Hu, Xiang Liu, Kai Wang, Min Wang, Yanlin Qian, Shiguo Lian
CVPR 2025 highlight 原文 | 代码 💡 未复现
Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval
Yuanmin Tang, Xiaoting Qin, Jue Zhang, Jing Yu, Gaopeng Gou, Gang Xiong, Qingwei Ling, Saravan Rajmohan, Dongmei Zhang, Qi Wu
CVPR 2025 highlight 原文 | 代码 💡 未复现
SmartCLIP: Modular Vision-language Alignment with Identification Guarantees
Shaoan Xie, Lingjing Kong, Yujia Zheng, Yu Yao, Zeyu Tang, Eric P. Xing, Guangyi Chen, Kun Zhang
CVPR 2025 highlight 原文 | 代码 💡 未复现
Structured 3D Latents for Scalable and Versatile 3D Generation
Jianfeng Xiang, Zelong Lv, Sicheng Xu, Yu Deng, Ruicheng Wang, Bowen Zhang, Dong Chen, Xin Tong, Jiaolong Yang
CVPR 2025 highlight 原文 | 代码 💡 未复现
StyleSSP: Sampling StartPoint Enhancement for Training-free Diffusion-based Method for Style Transfer
Ruojun Xu, Weijie Xi, Xiaodi Wang, Yongbo Mao, Zach Cheng
CVPR 2025 highlight 原文 | 代码 💡 未复现
Towards Autonomous Micromobility through Scalable Urban Simulation
Wayne Wu, Honglin He, Chaoyuan Zhang, Jack He, Seth Z. Zhao, Ran Gong, Quanyi Li, Bolei Zhou
CVPR 2025 highlight 原文 | 代码 💡 未复现
UIBDiffusion: Universal Imperceptible Backdoor Attack for Diffusion Models
Yuning Han, Bingyin Zhao, Rui Chu, Feng Luo, Biplab Sikdar, Yingjie Lao
CVPR 2025 highlight 原文 | 代码 💡 未复现
UCOD-DPL: Unsupervised Camouflaged Object Detection via Dynamic Pseudo-label Learning
Weiqi Yan, Lvhai Chen, Huaijia Kou, Shengchuan Zhang, Yan Zhang, Liujuan Cao
CVPR 2025 highlight 原文 | 代码 💡 未复现
Video Depth Anything: Consistent Depth Estimation for Super-Long Videos
Sili Chen, Hengkai Guo, Shengnan Zhu, Feihu Zhang, Zilong Huang, Jiashi Feng, Bingyi Kang
CVPR 2025 highlight 原文 | 代码 💡 未复现
World-consistent Video Diffusion with Explicit 3D Modeling
Qihang Zhang, Shuangfei Zhai, Miguel Ángel Bautista Martin, Kevin Miao, Alexander Toshev, Joshua Susskind, Jiatao Gu
CVPR 2025 highlight 原文 | 代码 💡 未复现
Your ViT is Secretly an Image Segmentation Model
Tommie Kerssies, Niccolò Cavagnero, Alexander Hermans, Narges Norouzi, Giuseppe Averta, Bastian Leibe, Gijs Dubbelman, Daan de Geus
CVPR 2025 highlight 原文 | 代码 💡 未复现
WonderWorld: Interactive 3D Scene Generation from a Single Image
Hong-Xing Yu, Haoyi Duan, Charles Herrmann, William T. Freeman, Jiajun Wu
CVPR 2025 highlight 原文 | 代码 💡 未复现
Relightable Gaussian Codec Avatars
Shunsuke Saito, Gabriel Schwartz, Tomas Simon, Junxuan Li, Giljoo Nam
CVPR 2024 oral 原文 | 代码 💡 未复现
Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following
Yutong Feng, Biao Gong, Di Chen, Yujun Shen, Yu Liu, Jingren Zhou
CVPR 2024 oral 原文 | 代码 💡 未复现
Rethinking Inductive Biases for Surface Normal Estimation
Gwangbin Bae, Andrew J. Davison
CVPR 2024 oral 原文 | 代码 💡 未复现
PaSCo: Urban 3D Panoptic Scene Completion with Uncertainty Awareness
Anh-Quan Cao, Angela Dai, Raoul de Charette
CVPR 2024 oral 原文 | 代码 💡 未复现
Transcriptomics-guided Slide Representation Learning in Computational Pathology
Guillaume Jaume, Lukas Oldenburg, Anurag Vaidya, Richard J. Chen, Drew F. K. Williamson, Thomas Peeters, Andrew H. Song, Faisal Mahmood
CVPR 2024 oral 原文 | 代码 💡 未复现
DiffusionLight: Light Probes for Free by Painting a Chrome Ball
Pakkapon Phongthawee, Worameth Chinchuthakun, Nontaphat Sinsunthithet, Amit Raj, Varun Jampani, Pramook Khungurn, Supasorn Suwajanakorn
CVPR 2024 oral 原文 | 代码 💡 未复现
URHand: Universal Relightable Hands
Zhaoxi Chen, Gyeongsik Moon, Kaiwen Guo, Chen Cao, Stanislav Pidhorskyi, Tomas Simon, Rohan Joshi, Yuan Dong, Yichen Xu, Bernardo Pires, He Wen, Lucas Evans, Bo Peng, Julia Buffalini, Autumn Trimble, Kevyn McPhail, Melissa Schoeller, Shoou-I Yu, Javier Romero, Michael Zollhöfer, Yaser Sheikh, Ziwei Liu, Shunsuke Saito
CVPR 2024 oral 原文 | 代码 💡 未复现
Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation
Bingxin Ke, Anton Obukhov, Shengyu Huang, Nando Metzger, Rodrigo Caye Daudt, Konrad Schindler
CVPR 2024 oral 原文 | 代码 💡 未复现
PhysGaussian: Physics-Integrated 3D Gaussians for Generative Dynamics
Tianyi Xie, Zeshun Zong, Yuxing Qiu, Xuan Li, Yutao Feng, Yin Yang, Chenfanfu Jiang
CVPR 2024 oral 原文 | 代码 💡 未复现
HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting
Xian Liu, Xiaohang Zhan, Jiaxiang Tang, Ying Shan, Gang Zeng, Dahua Lin, Xihui Liu, Ziwei Liu
CVPR 2024 oral 原文 | 代码 💡 未复现
Prompt Highlighter: Interactive Control for Multi-Modal LLMs
Yuechen Zhang, Shengju Qian, Bohao Peng, Shu Liu, Jiaya Jia
CVPR 2024 oral 原文 | 代码 💡 未复现
Cooperation Does Matter: Exploring Multi-Order Bilateral Relations for Audio-Visual Segmentation
Qi Yang, Xing Nie, Tong Li, Pengfei Gao, Ying Guo, Cheng Zhen, Pengfei Yan, Shiming Xiang
CVPR 2024 oral 原文 | 代码 💡 未复现
Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution
Shangchen Zhou, Peiqing Yang, Jianyi Wang, Yihang Luo, Chen Change Loy
CVPR 2024 oral 原文 | 代码 💡 未复现
Putting the Object Back into Video Object Segmentation
Ho Kei Cheng, Seoung Wug Oh, Brian Price, Joon-Young Lee, Alexander Schwing
CVPR 2024 oral 原文 | 代码 💡 未复现
InstanceDiffusion: Instance-level Control for Image Generation
Xudong Wang, Trevor Darrell, Sai Saketh Rambhatla, Rohit Girdhar, Ishan Misra
CVPR 2024 oral 原文 | 代码 💡 未复现
OMG-Seg: Is One Model Good Enough For All Segmentation?
Xiangtai Li, Haobo Yuan, Wei Li, Henghui Ding, Size Wu, Wenwei Zhang, Yining Li, Kai Chen, Chen Change Loy
CVPR 2024 oral 原文 | 代码 💡 未复现
Towards Language-Driven Video Inpainting via Multimodal Large Language Models
Jianzong Wu, Xiangtai Li, Chenyang Si, Shangchen Zhou, Jingkang Yang, Jiangning Zhang, Yining Li, Kai Chen, Yunhai Tong, Ziwei Liu, Chen Change Loy
CVPR 2024 oral 原文 | 代码 💡 未复现
VBench: Comprehensive Benchmark Suite for Video Generative Models
Ziqi Huang, Fan Zhang, Xiaojie Xu, Yinan He, Jiashuo Yu, Ziyue Dong, Qianli Ma, Nattapol Chanpaisit, Chenyang Si, Yuming Jiang, Yaohui Wang, Xinyuan Chen, Ying-Cong Chen, Limin Wang, Dahua Lin, Yu Qiao, Ziwei Liu
CVPR 2024 oral 原文 | 代码 💡 未复现
PIGEON: Predicting Image Geolocations
Lukas Haas, Michal Skreta, Silas Alberti, Chelsea Finn
CVPR 2024 oral 原文 | 代码 💡 未复现
DiffPortrait3D: Controllable Diffusion for Zero-Shot Portrait View Synthesis
Yuming Gu, You Xie, Hongyi Xu, Guoxian Song, Yichun Shi, Di Chang, Jing Yang, Linjie Luo
CVPR 2024 highlight 原文 | 代码 💡 未复现
Domain Prompt Learning with Quaternion Networks
Qinglong Cao, Zhengqin Xu, Yuntian Chen, Chao Ma, Xiaokang Yang
CVPR 2024 highlight 原文 | 代码 💡 未复现
DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing
Yujun Shi, Chuhui Xue, Jun Hao Liew, Jiachun Pan, Hanshu Yan, Wenqing Zhang, Vincent Y. F. Tan, Song Bai
CVPR 2024 highlight 原文 | 代码 💡 未复现
Fast ODE-based Sampling for Diffusion Models in Around 5 Steps
Zhenyu Zhou, Defang Chen, Can Wang, Chun Chen
CVPR 2024 highlight 原文 | 代码 💡 未复现
FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis
Feng Liang, Bichen Wu, Jialiang Wang, Licheng Yu, Kunpeng Li, Yinan Zhao, Ishan Misra, Jia-Bin Huang, Peizhao Zhang, Peter Vajda, Diana Marculescu
CVPR 2024 highlight 原文 | 代码 💡 未复现
General Object Foundation Model for Images and Videos at Scale
Junfeng Wu, Yi Jiang, Qihao Liu, Zehuan Yuan, Xiang Bai, Song Bai
CVPR 2024 highlight 原文 | 代码 💡 未复现
Insect-Foundation: A Foundation Model and Large-scale 1M Dataset for Visual Insect Understanding
Hoang-Quan Nguyen, Thanh-Dat Truong, Xuan Bac Nguyen, Ashley Dowling, Xin Li, Khoa Luu
CVPR 2024 highlight 原文 | 代码 💡 未复现
Move as You Say, Interact as You Can: Language-guided Human Motion Generation with Scene Affordance
Zan Wang, Yixin Chen, Baoxiong Jia, Puhao Li, Jinlu Zhang, Jingze Zhang, Tengyu Liu, Yixin Zhu, Wei Liang, Siyuan Huang
CVPR 2024 highlight 原文 | 代码 💡 未复现
Object Recognition as Next Token Prediction
Kaiyu Yue, Bor-Chun Chen, Jonas Geiping, Hengduo Li, Tom Goldstein, Ser-Nam Lim
CVPR 2024 highlight 原文 | 代码 💡 未复现
RAVE: Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models
Ozgur Kara, Bariscan Kurtkaya, Hidir Yesiltepe, James M. Rehg, Pinar Yanardag
CVPR 2024 highlight 原文 | 代码 💡 未复现
Readout Guidance: Learning Control from Diffusion Features
Grace Luo, Trevor Darrell, Oliver Wang, Dan B Goldman, Aleksander Holynski
CVPR 2024 highlight 原文 | 代码 💡 未复现
Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark
Ziyang Chen, Israel D. Gebru, Christian Richardt, Anurag Kumar, William Laney, Andrew Owens, Alexander Richard
CVPR 2024 highlight 原文 | 代码 💡 未复现
RichDreamer: A Generalizable Normal-Depth Diffusion Model for Detail Richness in Text-to-3D
Lingteng Qiu, Guanying Chen, Xiaodong Gu, Qi Zuo, Mutian Xu, Yushuang Wu, Weihao Yuan, Zilong Dong, Liefeng Bo, Xiaoguang Han
CVPR 2024 highlight 原文 | 代码 💡 未复现
RobustSAM: Segment Anything Robustly on Degraded Images
Wei-Ting Chen, Yu-Jiet Vong, Sy-Yen Kuo, Sizhuo Ma, Jian Wang
CVPR 2024 highlight 原文 | 代码 💡 未复现
Scaling Up Dynamic Human-Scene Interaction Modeling
Nan Jiang, Zhiyuan Zhang, Hongjie Li, Xiaoxuan Ma, Zan Wang, Yixin Chen, Tengyu Liu, Yixin Zhu, Siyuan Huang
CVPR 2024 highlight 原文 | 代码 💡 未复现
SDDGR: Stable Diffusion-based Deep Generative Replay for Class Incremental Object Detection
Junsu Kim, Hoseong Cho, Jihyeon Kim, Yihalem Yimolal Tiruneh, Seungryul Baek
CVPR 2024 highlight 原文 | 代码 💡 未复现
SpatialTracker: Tracking Any 2D Pixels in 3D Space
Yuxi Xiao, Qianqian Wang, Shangzhan Zhang, Nan Xue, Sida Peng, Yujun Shen, Xiaowei Zhou
CVPR 2024 highlight 原文 | 代码 💡 未复现
TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models
Yushi Huang, Ruihao Gong, Jing Liu, Tianlong Chen, Xianglong Liu
CVPR 2024 highlight 原文 | 代码 💡 未复现
Text Is MASS: Modeling as Stochastic Embedding for Text-Video Retrieval
Jiamian Wang, Guohao Sun, Pichao Wang, Dongfang Liu, Sohail Dianat, Majid Rabbani, Raghuveer Rao, Zhiqiang Tao
CVPR 2024 highlight 原文 | 代码 💡 未复现
Towards Learning a Generalist Model for Embodied Navigation
Duo Zheng, Shijia Huang, Lin Zhao, Yiwu Zhong, Liwei Wang
CVPR 2024 highlight 原文 | 代码 💡 未复现
UniMODE: Unified Monocular 3D Object Detection
Zhuoling Li, Xiaogang Xu, Ser-Nam Lim, Hengshuang Zhao
CVPR 2024 highlight 原文 | 代码 💡 未复现
Unsupervised Keypoints from Pretrained Diffusion Models
Eric Hedlin, Gopal Sharma, Shweta Mahajan, Xingzhe He, Hossam Isack, Abhishek Kar, Helge Rhodin, Andrea Tagliasacchi, Kwang Moo Yi
CVPR 2024 highlight 原文 | 代码 💡 未复现
VA3: Virtually Assured Amplification Attack on Probabilistic Copyright Protection for Text-to-Image Generative Models
Xiang Li, Qianli Shen, Kenji Kawaguchi
CVPR 2024 highlight 原文 | 代码 💡 未复现
VecFusion: Vector Font Generation with Diffusion
Vikas Thamizharasan, Difan Liu, Shantanu Agarwal, Matthew Fisher, Michael Gharbi, Oliver Wang, Alec Jacobson, Evangelos Kalogerakis
CVPR 2024 highlight 原文 | 代码 💡 未复现
Wonder3D: Single Image to 3D using Cross-Domain Diffusion
Xiaoxiao Long, Yuan-Chen Guo, Cheng Lin, Yuan Liu, Zhiyang Dou, Lingjie Liu, Yuexin Ma, Song-Hai Zhang, Marc Habermann, Christian Theobalt, Wenping Wang
CVPR 2024 highlight 原文 | 代码 💡 未复现
VTimeLLM: Empower LLM to Grasp Video Moments
Bin Huang, Xin Wang, Hong Chen, Zihan Song, Wenwu Zhu
CVPR 2024 highlight 原文 | 代码 💡 未复现
LHM: Large Animatable Human Reconstruction Model from a Single Image in Seconds
Lingteng Qiu, Xiaodong Gu, Peihao Li, Qi Zuo, Weichao Shen, Junfei Zhang, Kejie Qiu, Weihao Yuan, Guanying Chen, Zilong Dong, Liefeng Bo
ICCV 2025 poster 原文 | 代码 💡 未复现
EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer
Yuxuan Zhang, Yirui Yuan, Yiren Song, Haofan Wang, Jiaming Liu
ICCV 2025 poster 原文 | 代码 💡 未复现
MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images
Yuedong Chen, Haofei Xu, Chuanxia Zheng, Bohan Zhuang, Marc Pollefeys, Andreas Geiger, Tat-Jen Cham, Jianfei Cai
ECCV 2024 oral 原文 | 代码 💡 未复现
FSGS: Real-Time Few-shot View Synthesis using Gaussian Splatting
Zehao Zhu, Zhiwen Fan, Yifan Jiang, Zhangyang Wang
ECCV 2024 poster 原文 | 代码 💡 未复现
ZigMa: A DiT-style Zigzag Mamba Diffusion Model
Vincent Tao Hu, Stefan Andreas Baumann, Ming Gui, Olga Grebenkova, Pingchuan Ma, Johannes Schusterbauer, Björn Ommer
ECCV 2024 poster 原文 | 代码 💡 未复现
Bridging Synthetic and Real Worlds for Pre-training Scene Text Detectors
Tongkun Guan, Wei Shen, Xue Yang, Xuehui Wang, Xiaokang Yang
ECCV 2024 poster 原文 | 代码 💡 未复现
PosFormer: Recognizing Complex Handwritten Mathematical Expression with Position Forest Transformer
Tongkun Guan, Chengyu Lin, Wei Shen, Xiaokang Yang
ECCV 2024 poster 原文 | 代码 💡 未复现
Fully Sparse 3D Occupancy Prediction
Haisong Liu, Yang Chen, Haiguang Wang, Zetong Yang, Tianyu Li, Jia Zeng, Li Chen, Hongyang Li, Limin Wang
ECCV 2024 poster 原文 | 代码 💡 未复现
NeRF-MAE: Masked AutoEncoders for Self-Supervised 3D Representation Learning for Neural Radiance Fields
Muhammad Zubair Irshad, Sergey Zakharov, Vitor Guizilini, Adrien Gaidon, Zsolt Kira, Rares Ambrus
ECCV 2024 poster 原文 | 代码 💡 未复现
ControlCap: Controllable Region-level Captioning
Yuzhong Zhao, Yue Liu, Zonghao Guo, Weijia Wu, Chen Gong, Fang Wan, Qixiang Ye
ECCV 2024 poster 原文 | 代码 💡 未复现
GiT: Towards Generalist Vision Transformer through Universal Language Interface
Haiyang Wang, Hao Tang, Li Jiang, Shaoshuai Shi, Muhammad Ferjad Naeem, Hongsheng Li, Bernt Schiele, Liwei Wang
ECCV 2024 oral 原文 | 代码 💡 未复现
Relation DETR: Exploring Explicit Position Relation Prior for Object Detection
Xiuquan Hou, Meiqin Liu, Senlin Zhang, Ping Wei, Badong Chen, Xuguang Lan
ECCV 2024 oral 原文 | 代码 💡 未复现
FairDomain: Achieving Fairness in Cross-Domain Medical Image Segmentation and Classification
Yu Tian, Congcong Wen, Min Shi, Muhammad Muneeb Afzal, Hao Huang, Muhammad Osama Khan, Yan Luo, Yi Fang, Mengyu Wang
ECCV 2024 poster 原文 | 代码 💡 未复现
OneRestore: A Universal Restoration Framework for Composite Degradation
Yu Guo, Yuan Gao, Yuxu Lu, Huilin Zhu, Ryan Wen Liu, Shengfeng He
ECCV 2024 poster 原文 | 代码 💡 未复现
VideoStudio: Generating Consistent-Content and Multi-Scene Videos
Fuchen Long, Zhaofan Qiu, Ting Yao, Tao Mei
ECCV 2024 poster 原文 | 代码 💡 未复现
Zero-shot Object Counting with Good Exemplars
Huilin Zhu, Jingling Yuan, Zhengwei Yang, Yu Guo, Zheng Wang, Xian Zhong, Shengfeng He
ECCV 2024 poster 原文 | 代码 💡 未复现
SPVLoc: Semantic Panoramic Viewport Matching for 6D Camera Localization in Unseen Environments
Niklas Gard, Anna Hilsmann, Peter Eisert
ECCV 2024 oral 原文 | 代码 💡 未复现
Stereo Any Video: Temporally Consistent Stereo Matching
Junpeng Jing, Weixun Luo, Ye Mao, Krystian Mikolajczyk
ICCV 2025 highlight 原文 | 代码 💡 未复现
InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity
Liming Jiang, Qing Yan, Yumin Jia, Zichuan Liu, Hao Kang, Xin Lu
ICCV 2025 highlight 原文 | 代码 💡 未复现
MeshAnything V2: Artist-Created Mesh Generation With Adjacent Mesh Tokenization
Yiwen Chen, Yikai Wang, Yihao Luo, Zhengyi Wang, Zilong Chen, Jun Zhu, Chi Zhang, Guosheng Lin
ICCV 2025 poster 原文 | 代码 💡 未复现
From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers
Jiacheng Liu, Chang Zou, Yuanhuiyi Lyu, Junjie Chen, Linfeng Zhang
ICCV 2025 poster 原文 | 代码 💡 未复现
MotionFollower: Editing Video Motion via Lightweight Score-Guided Diffusion
Shuyuan Tu, Qi Dai, Zihao Zhang, Sicheng Xie, Zhi-Qi Cheng, Chong Luo, Xintong Han, Zuxuan Wu, Yu-Gang Jiang
ICCV 2025 poster 原文 | 代码 💡 未复现
AnimateAnyMesh: A Feed-Forward 4D Foundation Model for Text-Driven Universal Mesh Animation
Zijie Wu, Chaohui Yu, Fan Wang, Xiang Bai
ICCV 2025 poster 原文 | 代码 💡 未复现
VSSD: Vision Mamba with Non-Causal State Space Duality
Yuheng Shi, Minjing Dong, Mingjia Li, Chang Xu
ICCV 2025 poster 原文 | 代码 💡 未复现
FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model
Yukang Cao, Chenyang Si, Jinghao Wang, Ziwei Liu
ICCV 2025 poster 原文 | 代码 💡 未复现
GENMO: A GENeralist Model for Human MOtion
Jiefeng Li, Jinkun Cao, Haotian Zhang, Davis Rempe, Jan Kautz, Umar Iqbal, Ye Yuan
ICCV 2025 poster 原文 | 代码 💡 未复现
Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers
Weiming Ren, Wentao Ma, Huan Yang, Cong Wei, Ge Zhang, Wenhu Chen
ICCV 2025 poster 原文 | 代码 💡 未复现
Medical World Model: Generative Simulation of Tumor Evolution for Treatment Planning
Yijun Yang, Zhao-Yang Wang, Qiuping Liu, Shuwen Sun, Kang Wang, Rama Chellappa, Zongwei Zhou, Alan Yuille, Lei Zhu, Yu-Dong Zhang, Jieneng Chen
ICCV 2025 poster 原文 | 代码 💡 未复现
Where, What, Why: Towards Explainable Driver Attention Prediction
Yuchen Zhou, Jiayu Tang, Xiaoyan Xiao, Yueyao Lin, Linkai Liu, Zipeng Guo, Hao Fei, Xiaobo Xia, Chao Gou
ICCV 2025 poster 原文 | 代码 💡 未复现
I Open at the Close: A Deep Reinforcement Learning Evaluation of Open Streets Initiatives
R. Teal Witter, Lucas Rosenblatt
AAAI 2024 Technical 原文 | 代码 💡 未复现
MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records
Scott L. Fleming, Alejandro Lozano, William J. Haberkorn, Jenelle A. Jindal, Eduardo Reis, Rahul Thapa, Louis Blankemeier, Julian Z. Genkins, Ethan Steinberg, Ashwin Nayak, Birju Patel, Chia-Chun Chiang, Alison Callahan, Zepeng Huo, Sergios Gatidis, Scott Adams, Oluseyi Fayanju, Shreya J. Shah, Thomas Savage, Ethan Goh, Akshay S. Chaudhari, Nima Aghaeepour, Christopher Sharp, Michael A. Pfeffer, Percy Liang, Jonathan H. Chen, Keith E. Morse, Emma P. Brunskill, Jason A. Fries, Nigam H. Shah
AAAI 2023 Technical 原文 | 代码 💡 未复现
Multi-Modal Discussion Transformer: Integrating Text, Images and Graph Transformers to Detect Hate Speech on Social Media
Liam Hebert, Gaurav Sahu, Yuxuan Guo, Nanda Kishore Sreenivas, Lukasz Golab, Robin Cohen
AAAI 2024 Technical 原文 | 代码 💡 未复现
Quantile-Regression-Ensemble: A Deep Learning Algorithm for Downscaling Extreme Precipitation
Thomas Bailie, Yun Sing Koh, Neelesh Rampal, Peter B. Gibson
AAAI 2024 Technical 原文 💡 未复现
Spatial-Logic-Aware Weakly Supervised Learning for Flood Mapping on Earth Imagery
Zelin Xu, Tingsong Xiao, Wenchong He, Yu Wang, Zhe Jiang, Shigang Chen, Yiqun Xie, Xiaowei Jia, Da Yan, Yang Zhou
AAAI 2024 Technical 原文 | 代码 💡 未复现
Vector Field Oriented Diffusion Model for Crystal Material Generation
Astrid Klipfel, Yael Fregier, Adlane Sayede, Zied Bouraoui
AAAI 2023 Technical 原文 | 代码 💡 未复现
Periodic Graph Transformers for Crystal Material Property Prediction
Keqiang Yan, Yi Liu, Yuchao Lin, Shuiwang Ji
AAAI 2022 Technical 原文 | 代码 💡 未复现
ContraNovo: A Contrastive Learning Approach to Enhance De Novo Peptide Sequencing
Zhi Jin, Sheng Xu, Xiang Zhang, Tianze Ling, Nanqing Dong, Wanli Ouyang, Zhiqiang Gao, Cheng Chang, Siqi Sun
AAAI 2023 Technical 原文 | 代码 💡 未复现
Dual-Channel Learning Framework for Drug-Drug Interaction Prediction via Relation-Aware Heterogeneous Graph Transformer
Xiaorui Su, Pengwei Hu, Zhu-Hong You, Philip S. Yu, Lun Hu
AAAI 2024 Technical 原文 | 代码 💡 未复现
Explore 3D Dance Generation via Reward Model from Automatically-Ranked Demonstrations
Zilin Wang, Haolin Zhuang, Lu Li, Yinmin Zhang, Junjie Zhong, Jun Chen, Yu Yang, Boshi Tang, Zhiyong Wu
AAAI 2023 Technical 原文 | 代码 💡 未复现
Heterogeneous Graph Reasoning for Fact Checking over Texts and Tables
Haisong Gong, Weizhi Xu, Shu Wu, Qiang Liu, Liang Wang
AAAI 2024 Technical 原文 | 代码 💡 未复现
PosDiffNet: Positional Neural Diffusion for Point Cloud Registration in a Large Field of View with Perturbations
Rui She, Sijie Wang, Qiyu Kang, Kai Zhao, Yang Song, Wee Peng Tay, Tianyu Geng
AAAI 2024 Technical 原文 | 代码 💡 未复现
SeGA: Preference-Aware Self-Contrastive Learning with Prompts for Anomalous User Detection on Twitter
Ying-Ying Chang, Wei-Yao Wang, Wen-Chih Peng
AAAI 2023 Technical 原文 | 代码 💡 未复现
Text-Guided Molecule Generation with Diffusion Language Model
Haisong Gong, Qiang Liu, Shu Wu, Liang Wang
AAAI 2024 Technical 原文 | 代码 💡 未复现
Unsupervised Gene-Cell Collective Representation Learning with Optimal Transport
Jixiang Yu, Nanjun Chen, Ming Gao, Xiangtao Li, Ka-Chun Wong
AAAI 2024 Technical 原文 💡 未复现
Bi-directional Adapter for Multi-modal Tracking
Bing Cao, Junliang Guo, Pengfei Zhu, Qinghua Hu
AAAI 2023 Technical 原文 | 代码 💡 未复现
DanceAnyWay: Synthesizing Beat-Guided 3D Dances with Randomized Temporal Contrastive Learning
Aneesh Bhattacharya, Manas Paranjape, Uttaran Bhattacharya, Aniket Bera
AAAI 2024 Technical 原文 | 代码 💡 未复现
Deep Linear Array Pushbroom Image Restoration: A Degradation Pipeline and Jitter-Aware Restoration Network
Zida Chen, Ziran Zhang, Haoying Li, Menghao Li, Yueting Chen, Qi Li, Huajun Feng, Zhihai Xu, Shiqi Chen
AAAI 2024 Technical 原文 | 代码 💡 未复现
DiffSED: Sound Event Detection with Denoising Diffusion
Swapnil Bhosale, Sauradip Nag, Diptesh Kanojia, Jiankang Deng, Xiatian Zhu
AAAI 2023 Technical 原文 | 代码 💡 未复现
Domain-Controlled Prompt Learning
Qinglong Cao, Zhengqin Xu, Yuntian Chen, Chao Ma, Xiaokang Yang
AAAI 2023 Technical 原文 | 代码 💡 未复现
DreamIdentity: Improved Editability for Efficient Face-identity Preserved Image Generation
Zhuowei Chen, Shancheng Fang, Wei Liu, Qian He, Mengqi Huang, Yongdong Zhang, Zhendong Mao
AAAI 2023 Technical 原文 | 代码 💡 未复现
DreamStyler: Paint by Style Inversion with Text-to-Image Diffusion Models
Namhyuk Ahn, Junsoo Lee, Chunggi Lee, Kunhee Kim, Daesik Kim, Seung-Hun Nam, Kibeom Hong
AAAI 2023 Technical 原文 | 代码 💡 未复现
Evaluate Geometry of Radiance Fields with Low-frequency Color Prior
Namhyuk Ahn, Junsoo Lee, Chunggi Lee, Kunhee Kim, Daesik Kim, Seung-Hun Nam, Kibeom Hong
AAAI 2024 Technical 原文 | 代码 💡 未复现
EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE
Junyi Chen, Longteng Guo, Jia Sun, Shuai Shao, Zehuan Yuan, Liang Lin, Dongyu Zhang
AAAI 2024 Technical 原文 | 代码 💡 未复现
Exploiting Polarized Material Cues for Robust Car Detection
Wen Dong, Haiyang Mei, Ziqi Wei, Ao Jin, Sen Qiu, Qiang Zhang, Xin Yang
AAAI 2024 Technical 原文 | 代码 💡 未复现
Federated Modality-specific Encoders and Multimodal Anchors for Personalized Brain Tumor Segmentation
Qian Dai, Dong Wei, Hong Liu, Jinghan Sun, Liansheng Wang, Yefeng Zheng
AAAI 2024 Technical 原文 | 代码 💡 未复现
Markerless Multi-view 3D Human Pose Estimation: a survey
Ana Filipa Rodrigues Nogueira, Hélder P. Oliveira, Luís F. Teixeira
AAAI 2024 Technical 原文 | 代码 💡 未复现
Generating and Reweighting Dense Contrastive Patterns for Unsupervised Anomaly Detection
Songmin Dai, Yifan Wu, Xiaoqiang Li, Xiangyang Xue
AAAI 2023 Technical 原文 | 代码 💡 未复现
SimROD: A Simple Baseline for Raw Object Detection with Global and Local Enhancements
Haiyang Xie, Xi Shen, Shihua Huang, Qirui Wang, Zheng Wang
AAAI 2025 Technical 原文 | 代码 💡 未复现
Hyp-OW: Exploiting Hierarchical Structure Learning with Hyperbolic Distance Enhances Open World Object Detection
Thang Doan, Xin Li, Sima Behpour, Wenbin He, Liang Gou, Liu Ren
AAAI 2024 Technical 原文 | 代码 💡 未复现
iDet3D: Towards Efficient Interactive Object Detection for LiDAR Point Clouds
Dongmin Choi, Wonwoo Cho, Kangyeol Kim, Jaegul Choo
AAAI 2023 Technical 原文 | 代码 💡 未复现
Image Safeguarding: Reasoning with Conditional Vision Language Model and Obfuscating Unsafe Content Counterfactually
Mazal Bethany, Brandon Wherry, Nishant Vishwamitra, Peyman Najafirad
AAAI 2024 Technical 原文 | 代码 💡 未复现
Improving Diffusion-Based Image Synthesis with Context Prediction
Ling Yang, Jingwei Liu, Shenda Hong, Zhilong Zhang, Zhilin Huang, Zheming Cai, Wentao Zhang, Bin Cui
AAAI 2024 Technical 原文 💡 未复现
UniPre3D: Unified Pre-training of 3D Point Cloud Models with Cross-Modal Gaussian Splatting
Ziyi Wang, Yanran Zhang, Jie Zhou, Jiwen Lu
AAAI 2025 Technical 原文 | 代码 💡 未复现
SpatioTemporal Difference Network for Video Depth Super-Resolution
Zhengxue Wang, Yuan Wu, Xiang Li, Zhiqiang Yan, Jian Yang
AAAI 2025 Technical 原文 | 代码 💡 未复现
Iterative Token Evaluation and Refinement for Real-World Super-Resolution
Chaofeng Chen, Shangchen Zhou, Liang Liao, Haoning Wu, Wenxiu Sun, Qiong Yan, Weisi Lin
AAAI 2023 Technical 原文 | 代码 💡 未复现
LDMVFI: Video Frame Interpolation with Latent Diffusion Models
Duolikun Danier, Fan Zhang, David Bull
AAAI 2023 Technical 原文 | 代码 💡 未复现
Joint Demosaicing and Denoising for Spike Camera
Yanchen Dong, Ruiqin Xiong, Jing Zhao, Jian Zhang, Xiaopeng Fan, Shuyuan Zhu, Tiejun Huang
AAAI 2024 Technical 原文 | 代码 💡 未复现
De-LightSAM: Modality-Decoupled Lightweight SAM for Generalizable Medical Segmentation
Qing Xu, Jiaxuan Li, Xiangjian He, Chenxin Li, Fiseha B. Tesem, Wenting Duan, Zhen Chen, Rong Qu, Jonathan M. Garibaldi, Chang Wen Chen
AAAI 2024 Technical 原文 | 代码 💡 未复现
LogoStyleFool: Vitiating Video Recognition Systems via Logo Style Transfer
Yuxin Cao, Ziyu Zhao, Xi Xiao, Derui Wang, Minhui Xue, Jin Lu
AAAI 2024 Technical 原文 💡 未复现
M-BEV: Masked BEV Perception for Robust Autonomous Driving
Siran Chen, Yue Ma, Yu Qiao, Yali Wang
AAAI 2023 Technical 原文 | 代码 💡 未复现
MeDM: Mediating Image Diffusion Models for Video-to-Video Translation with Temporal Correspondence Guidance
Ernie Chu, Tzuhsuan Huang, Shuo-Yen Lin, Jun-Cheng Chen
AAAI 2023 Technical 原文 | 代码 💡 未复现
Mimic: Speaking Style Disentanglement for Speech-Driven 3D Facial Animation
Hui Fu, Zeqing Wang, Ke Gong, Keze Wang, Tianshui Chen, Haojie Li, Haifeng Zeng, Wenxiong Kang
AAAI 2023 Technical 原文 | 代码 💡 未复现
NILUT: Conditional Neural Implicit 3D Lookup Tables for Image Enhancement
Marcos V. Conde, Javier Vazquez-Corral, Michael S. Brown, Radu Timofte
AAAI 2023 Technical 原文 | 代码 💡 未复现
A Range-Null Space Decomposition Approach for Fast and Flexible Spectral Compressive Imaging
Junyu Wang, Shijie Wang, Ruijie Zhang, Zengqiang Zheng, Wenyu Liu, Xinggang Wang
AAAI 2023 Technical 原文 | 代码 💡 未复现
Mixed Degradation Image Restoration via Local Dynamic Optimization and Conditional Embedding
Yubin Gu, Yuan Meng, Xiaoshuai Sun, Jiayi Ji, Weijian Ruan, Rongrong Ji
AAAI 2024 Technical 原文 | 代码 💡 未复现
Revisiting Point Cloud Completion: Are We Ready For The Real-World?
Stuti Pathak, Prashant Kumar, Dheeraj Baiju, Nicholus Mboga, Gunther Steenackers, Rudi Penne
AAAI 2024 Technical 原文 💡 未复现
PuzzleFusion++: Auto-agglomerative 3D Fracture Assembly by Denoise and Verify
Zhengqing Wang, Jiacheng Chen, Yasutaka Furukawa
AAAI 2024 Technical 原文 | 代码 💡 未复现
PNeSM: Arbitrary 3D Scene Stylization via Prompt-Based Neural Style Mapping
Jiafu Chen, Wei Xing, Jiakai Sun, Tianyi Chu, Yiling Huang, Boyan Ji, Lei Zhao, Huaizhong Lin, Haibo Chen, Zhizhong Wang
AAAI 2024 Technical 原文 | 代码 💡 未复现
PPEA-Depth: Progressive Parameter-Efficient Adaptation for Self-Supervised Monocular Depth Estimation
Yue-Jiang Dong, Yuan-Chen Guo, Ying-Tian Liu, Fang-Lue Zhang, Song-Hai Zhang
AAAI 2024 Technical 原文 | 代码 💡 未复现
ResMatch: Residual Attention Learning for Local Feature Matching
Yuxin Deng, Jiayi Ma
AAAI 2023 Technical 原文 | 代码 💡 未复现
Bidirectional Multi-Scale Implicit Neural Representations for Image Deraining
Xiang Chen, Jinshan Pan, Jiangxin Dong
AAAI 2024 Technical 原文 | 代码 💡 未复现
Self-Supervised Bird’s Eye View Motion Prediction with Cross-Modality Signals
Shaoheng Fang, Zuhong Liu, Mingyu Wang, Chenxin Xu, Yiqi Zhong, Siheng Chen
AAAI 2024 Technical 原文 | 代码 💡 未复现
SHaRPose: Sparse High-Resolution Representation for Human Pose Estimation
Xiaoqi An, Lin Zhao, Chen Gong, Nannan Wang, Di Wang, Jian Yang
AAAI 2023 Technical 原文 | 代码 💡 未复现
Simple Image-level Classification Improves Open-vocabulary Object Detection
Ruohuan Fang, Guansong Pang, Xiao Bai
AAAI 2023 Technical 原文 | 代码 💡 未复现
SparseGNV: Generating Novel Views of Indoor Scenes with Sparse Input Views
Weihao Cheng, Yan-Pei Cao, Ying Shan
AAAI 2023 Technical 原文 | 代码 💡 未复现
A Systematic Investigation on Deep Learning-Based Omnidirectional Image and Video Super-Resolution
Qianqian Zhao, Chunle Guo, Tianyi Zhang, Junpei Zhang, Peiyang Jia, Tan Su, Wenjie Jiang, Chongyi Li
AAAI 2025 Technical 原文 | 代码 💡 未复现
TDeLTA: A Light-weight and Robust Table Detection Method based on Learning Text Arrangement
Yang Fan, Xiangping Wu, Qingcai Chen, Heng Li, Yan Huang, Zhixiang Cai, Qitian Wu
AAAI 2023 Technical 原文 💡 未复现
Variance-insensitive and Target-preserving Mask Refinement for Interactive Image Segmentation
Chaowei Fang, Ziyin Zhou, Junye Chen, Hanjing Su, Qingyao Wu, Guanbin Li
AAAI 2023 Technical 原文 💡 未复现
VIXEN: Visual Text Comparison Network for Image Difference Captioning
Alexander Black, Jing Shi, Yifei Fan, Tu Bui, John Collomosse
AAAI 2024 Technical 原文 | 代码 💡 未复现
Weak Distribution Detectors Lead to Stronger Generalizability of Vision-Language Prompt Tuning
Kun Ding, Haojian Zhang, Qiang Yu, Ying Wang, Shiming Xiang, Chunhong Pan
AAAI 2024 Technical 原文 | 代码 💡 未复现
WebVLN: Vision-and-Language Navigation on Websites
Qi Chen, Dileepa Pitawela, Chongyang Zhao, Gengze Zhou, Hsiang-Ting Chen, Qi Wu
AAAI 2023 Technical 原文 | 代码 💡 未复现
WeditGAN: Few-Shot Image Generation via Latent Space Relocation
Yuxuan Duan, Li Niu, Yan Hong, Liqing Zhang
AAAI 2024 Technical 原文 | 代码 💡 未复现
3D Visibility-aware Generalizable Neural Radiance Fields for Interacting Hands
Xuan Huang, Hanhui Li, Zejun Yang, Zhisheng Wang, Xiaodan Liang
AAAI 2024 Technical 原文 | 代码 💡 未复现
MAC: A Benchmark for Multiple Attribute Compositional Zero-Shot Learning
Xuan Huang, Hanhui Li, Zejun Yang, Zhisheng Wang, Xiaodan Liang
AAAI 2025 Technical 原文 | 代码 💡 未复现
A General Implicit Framework for Fast NeRF Composition and Rendering
Xinyu Gao, Ziyi Yang, Yunlu Zhao, Yuxiang Sun, Xiaogang Jin, Changqing Zou
AAAI 2024 Technical 原文 | 代码 💡 未复现
A User-Friendly Framework for Generating Model-Preferred Prompts in Text-to-Image Synthesis
Nailei Hei, Qianyu Guo, Zihao Wang, Yan Wang, Haofen Wang, Wenqiang Zhang
AAAI 2024 Technical 原文 | 代码 💡 未复现
AnomalyGPT: Detecting Industrial Anomalies Using Large Vision-Language Models
Zhaopeng Gu, Bingke Zhu, Guibo Zhu, Yingying Chen, Ming Tang, Jinqiao Wang
AAAI 2023 Technical 原文 | 代码 💡 未复现
Arbitrary-Scale Video Super-Resolution with Structural and Textural Priors
Wei Shang, Dongwei Ren, Wanying Zhang, Yuming Fang, Wangmeng Zuo, Kede Ma
AAAI 2024 Technical 原文 | 代码 💡 未复现
BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions
Wenbo Hu, Yifan Xu, Yi Li, Weiyue Li, Zeyuan Chen, Zhuowen Tu
AAAI 2023 Technical 原文 | 代码 💡 未复现
Depth-Guided Robust and Fast Point Cloud Fusion NeRF for Sparse Input Views
Shuai Guo, Qiuwen Wang, Yijie Gao, Rong Xie, Li Song
AAAI 2024 Technical 原文 | 代码 💡 未复现
Distribution Matching for Multi-Task Learning of Classification Tasks: a Large-Scale Study on Faces & Beyond
Dimitrios Kollias, Viktoriia Sharmanska, Stefanos Zafeiriou
AAAI 2024 Technical 原文 💡 未复现
Open-Vocabulary HOI Detection with Interaction-aware Prompt and Concept Calibration
Ting Lei, Shaofeng Yin, Qingchao Chen, Yuxin Peng, Yang Liu
AAAI 2025 Technical 原文 | 代码 💡 未复现
Expand-and-Quantize: Unsupervised Semantic Segmentation Using High-Dimensional Space and Product Quantization
Jiyoung Kim, Kyuhong Shim, Insu Lee, Byonghyo Shim
AAAI 2023 Technical 原文 💡 未复现
Expediting Contrastive Language-Image Pretraining via Self-distilled Encoders
Bumsoo Kim, Jinhyung Kim, Yeonsik Jo, Seung Hwan Kim
AAAI 2023 Technical 原文 | 代码 💡 未复现
Exploring Self- and Cross-Triplet Correlations for Human-Object Interaction Detection
Weibo Jiang, Weihong Ren, Jiandong Tian, Liangqiong Qu, Zhiyong Wang, Honghai Liu
AAAI 2024 Technical 原文 | 代码 💡 未复现
Fine-Grained Multi-View Hand Reconstruction Using Inverse Rendering
Qijun Gan, Wentong Li, Jinwei Ren, Jianke Zhu
AAAI 2024 Technical 原文 | 代码 💡 未复现
Frequency-Adaptive Pan-Sharpening with Mixture of Experts
Xuanhua He, Keyu Yan, Rui Li, Chengjun Xie, Jie Zhang, Man Zhou
AAAI 2024 Technical 原文 | 代码 💡 未复现
Frequency-Controlled Diffusion Model for Versatile Text-Guided Image-to-Image Translation
Xiang Gao, Zhengbo Xu, Junhan Zhao, Jiaying Liu
AAAI 2025 Technical Paper | Code 💡 Not yet reproduced
GSN: Generalisable Segmentation in Neural Radiance Field
Vinayak Gupta, Rahul Goel, Sirikonda Dhawal, P. J. Narayanan
AAAI 2024 Technical Paper | Code 💡 Not yet reproduced
Hand-Centric Motion Refinement for 3D Hand-Object Interaction via Hierarchical Spatial-Temporal Modeling
Yuze Hao, Jianrong Zhang, Tao Zhuo, Fuan Wen, Hehe Fan
AAAI 2024 Technical Paper | Code 💡 Not yet reproduced
High-Fidelity Diffusion-based Image Editing
Chen Hou, Guoqiang Wei, Zhibo Chen
AAAI 2024 Technical Paper 💡 Not yet reproduced
HuTuMotion: Human-Tuned Navigation of Latent Motion Diffusion Models with Minimal Feedback
Gaoge Han, Shaoli Huang, Mingming Gong, Jinglei Tang
AAAI 2023 Technical Paper | Code 💡 Not yet reproduced
Exploring Phrase-Level Grounding with Text-to-Image Diffusion Model
Danni Yang, Ruohan Dong, Jiayi Ji, Yiwei Ma, Haowei Wang, Xiaoshuai Sun, Rongrong Ji
AAAI 2024 Technical Paper | Code 💡 Not yet reproduced
2DMamba: Efficient State Space Model for Image Representation with Applications on Giga-Pixel Whole Slide Image Classification
Jingwei Zhang, Anh Tien Nguyen, Xi Han, Vincent Quoc-Huy Trinh, Hong Qin, Dimitris Samaras, Mahdi S. Hosseini
CVPR 2025 poster Paper | Code 💡 Not yet reproduced
3D Dental Model Segmentation with Geometrical Boundary Preserving
Shufan Xi, Zexian Liu, Junlin Chang, Hongyu Wu, Xiaogang Wang, Aimin Hao
CVPR 2025 poster Paper | Code 💡 Not yet reproduced
3D Gaussian Head Avatars with Expressive Dynamic Appearances by Compact Tensorial Representations
Yating Wang, Xuan Wang, Ran Yi, Yanbo Fan, Jichen Hu, Jingcheng Zhu, Lizhuang Ma
CVPR 2025 poster Paper | Code 💡 Not yet reproduced
3D Gaussian Inpainting with Depth-Guided Cross-View Consistency
Sheng-Yu Huang, Zi-Ting Chou, Yu-Chiang Frank Wang
CVPR 2025 poster Paper | Code 💡 Not yet reproduced
3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination
Jianing Yang, Xuweiyi Chen, Nikhil Madaan, Madhavan Iyengar, Shengyi Qian, David F. Fouhey, Joyce Chai
CVPR 2025 poster Paper | Code 💡 Not yet reproduced
3D-GSW: 3D Gaussian Splatting for Robust Watermarking
Youngdong Jang, Hyunje Park, Feng Yang, Heeju Ko, Euijin Choo, Sangpil Kim
CVPR 2025 poster Paper | Code 💡 Not yet reproduced
3D-HGS: 3D Half-Gaussian Splatting
Haolin Li, Jinyang Liu, Mario Sznaier, Octavia Camps
CVPR 2025 poster Paper | Code 💡 Not yet reproduced
3D-LLaVA: Towards Generalist 3D LMMs with Omni Superpoint Transformer
Jiajun Deng, Tianyu He, Li Jiang, Tianyu Wang, Feras Dayoub, Ian Reid
CVPR 2025 poster Paper | Code 💡 Not yet reproduced
3D-Mem: 3D Scene Memory for Embodied Exploration and Reasoning
Yuncong Yang, Han Yang, Jiachen Zhou, Peihao Chen, Hongxin Zhang, Yilun Du, Chuang Gan
CVPR 2025 poster Paper | Code 💡 Not yet reproduced
3DENHANCER: Consistent Multi-View Diffusion for 3D Enhancement
Yihang Luo, Shangchen Zhou, Yushi Lan, Xingang Pan, Chen Change Loy
CVPR 2025 poster Paper | Code 💡 Not yet reproduced
3DGUT: Enabling Distorted Cameras and Secondary Rays in Gaussian Splatting
Qi Wu, Janick Martinez Esturo, Ashkan Mirzaei, Nicolas Moenne-Loccoz, Zan Gojcic
CVPR 2025 poster Paper | Code 💡 Not yet reproduced
4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models
Wanhua Li, Renping Zhou, Jiawei Zhou, Yingwei Song, Johannes Herter, Minghan Qin, Gao Huang, Hanspeter Pfister
CVPR 2025 poster Paper | Code 💡 Not yet reproduced
4DTAM: Non-Rigid Tracking and Mapping via Dynamic Surface Gaussians
Hidenobu Matsuki, Gwangbin Bae, Andrew J. Davison
CVPR 2025 poster Paper | Code 💡 Not yet reproduced
A Bias-Free Training Paradigm for More General AI-generated Image Detection
Fabrizio Guillaro, Giada Zingarini, Ben Usman, Avneesh Sud, Davide Cozzolino, Luisa Verdoliva
CVPR 2025 poster Paper | Code 💡 Not yet reproduced
A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning
Xin Wen, Bingchen Zhao, Yilun Chen, Jiangmiao Pang, Xiaojuan Qi
CVPR 2025 poster Paper | Code 💡 Not yet reproduced