Before claiming a task, please read our contribution process and reward rules carefully.
Paper title & authors | Venue & year | Type | Links | Status / action |
---|---|---|---|---|
Attention is all you need Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, Illia Polosukhin |
NIPS 2017 | Paper | Code | ✅ Completed | |
Improving language understanding by generative pre-training Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever |
OpenAI 2018 | Paper | 💡 Not yet reproduced | |
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova |
NAACL 2019 | Paper | 💡 Not yet reproduced | |
Language models are unsupervised multitask learners Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, Ilya Sutskever |
OpenAI 2019 | Paper | 💡 Not yet reproduced | |
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu |
JMLR 2020 | Paper | Code | 💡 Not yet reproduced | |
From Local to Global: A GraphRAG Approach to Query-Focused Summarization Darren Edge, Ha Trinh, Newman Cheng, Joshua Bradley, Alex Chao, Apurva Mody, Steven Truitt, Dasha Metropolitansky, Robert Osazuwa Ness, Jonathan Larson |
2024 | Paper | Code | 💡 Not yet reproduced | |
Language Models are Few-Shot Learners Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel Ziegler, Jeffrey Wu, Clemens Winter, Chris Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei |
NIPS 2020 | Paper | Code | 💡 Not yet reproduced | |
LoRA: Low-Rank Adaptation of Large Language Models Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen |
ICLR 2022 | Paper | Code | 💡 Not yet reproduced | |
Finetuned Language Models are Zero-Shot Learners Jason Wei, Maarten Bosma, Vincent Zhao, Kelvin Guu, Adams Wei Yu, Brian Lester, Nan Du, Andrew M. Dai, Quoc V Le |
ICLR 2022 | Paper | Code | 💡 Not yet reproduced | |
LLaMA: Open and Efficient Foundation Language Models Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample |
2023 | Paper | Code | 💡 Not yet reproduced | |
Self-Consistency Improves Chain of Thought Reasoning in Language Models Xuezhi Wang, Jason Wei, Dale Schuurmans, Quoc V Le, Ed H. Chi, Sharan Narang, Aakanksha Chowdhery, Denny Zhou |
ICLR 2023 | Paper | Code | 💡 Not yet reproduced | |
Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers Damai Dai, Yutao Sun, Li Dong, Yaru Hao, Shuming Ma, Zhifang Sui, Furu Wei |
ACL 2023 | Paper | Code | 💡 Not yet reproduced | |
Toolformer: Language Models Can Teach Themselves to Use Tools Timo Schick, Jane Dwivedi-Yu, Roberto Dessi, Roberta Raileanu, Maria Lomeli, Eric Hambro, Luke Zettlemoyer, Nicola Cancedda, Thomas Scialom |
NIPS 2023 | Paper | Code | 💡 Not yet reproduced | |
DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature Eric Mitchell, Yoonho Lee, Alexander Khazatsky, Christopher D Manning, Chelsea Finn |
PMLR 2023 | Paper | Code | 💡 Not yet reproduced | |
Recitation-augmented language models Zhiqing Sun, Xuezhi Wang, Yi Tay, Yiming Yang, Denny Zhou |
ICLR 2023 | Paper | Code | 💡 Not yet reproduced | |
Self-Instruct: Aligning Language Models with Self-Generated Instructions Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, Hannaneh Hajishirzi |
ACL 2023 | Paper | Code | 💡 Not yet reproduced | |
Automatic chain of thought prompting in large language models Zhuosheng Zhang, Aston Zhang, Mu Li, Alex Smola |
ICLR 2023 | Paper | Code | 💡 Not yet reproduced | |
REALM: retrieval-augmented language model pre-training Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-Wei Chang |
ICML 2020 | Paper | Code | 💡 Not yet reproduced | |
Language Is Not All You Need: Aligning Perception with Language Models Shaohan Huang, Li Dong, Wenhui Wang, Yaru Hao, Saksham Singhal, Shuming Ma, Tengchao Lv, Lei Cui, Owais Khan Mohammed, Barun Patra, Qiang Liu, Kriti Aggarwal, Zewen Chi, Nils Bjorck, Vishrav Chaudhary, Subhojit Som, Xia Song, Furu Wei |
NIPS 2023 | Paper | Code | 💡 Not yet reproduced | |
Automatic Prompt Augmentation and Selection with Chain-of-Thought from Labeled Data Kashun Shum, Shizhe Diao, Tong Zhang |
EMNLP 2023 | Paper | Code | 💡 Not yet reproduced | |
Weakly Supervised Semantic Segmentation via Alternate Self-Dual Teaching Dingwen Zhang, Wenyuan Zeng, Guangyu Guo, Chaowei Fang, Lechao Cheng, Ming-Ming Cheng, Junwei Han |
TIP 2025 | Paper | 💡 Not yet reproduced | |
Text to Image for Multi-Label Image Recognition with Joint Prompt-Adapter Learning Chun-Mei Feng, Kai Yu, Xinxing Xu, Salman Khan, Rick Siow Mong Goh, Wangmeng Zuo, Yong Liu |
TPAMI 2024 | Paper | 💡 Not yet reproduced | |
Segment Concealed Objects with Incomplete Supervision Chunming He, Kai Li, Yachao Zhang, Ziyun Yang, Youwei Pang, Longxiang Tang, Chengyu Fang, Yulun Zhang, Linghe Kong, Xiu Li, Sina Farsiu |
TPAMI 2025 | Paper | Code | 💡 Not yet reproduced | |
Event-based Stereo Depth Estimation: A Survey Suman Ghosh, Guillermo Gallego |
TPAMI 2025 | Paper | 💡 Not yet reproduced | |
Efficient Low-Resolution Face Recognition via Bridge Distillation Shiming Ge, Shengwei Zhao, Chenyu Li, Yu Zhang, Jia Li |
TIP 2024 | Paper | 💡 Not yet reproduced | |
Towards Deviation-Robust Agent Navigation via Perturbation-Aware Contrastive Learning Bingqian Lin, Yanxin Long, Yi Zhu, Fengda Zhu, Xiaodan Liang, Qixiang Ye, Liang Lin |
TPAMI 2023 | Paper | Code | 💡 Not yet reproduced | |
Bayesian Estimate of Mean Proper Scores for Diversity-Enhanced Active Learning Wei Tan, Lan Du, Wray Buntine |
TPAMI 2023 | Paper | 💡 Not yet reproduced | |
Paragraph-to-Image Generation with Information-Enriched Diffusion Model Weijia Wu, Zhuang Li, Yefei He, Mike Zheng Shou, Chunhua Shen, Lele Cheng, Yan Li, Tingting Gao, Di Zhang |
IJCV 2025 | Paper | Code | 💡 Not yet reproduced | |
Inherit with Distillation and Evolve with Contrast: Exploring Class Incremental Semantic Segmentation Without Exemplar Memory Danpei Zhao, Bo Yuan, Zhenwei Shi |
TPAMI 2023 | Paper | 💡 Not yet reproduced | |
Dual Compensation Residual Networks for Class Imbalanced Learning Ruibing Hou, Hong Chang, Bingpeng Ma, Shiguang Shan, Xilin Chen |
TPAMI 2023 | Paper | 💡 Not yet reproduced | |
End-to-end Alternating Optimization for Real-World Blind Super Resolution Zhengxiong Luo, Yan Huang, Shang Li, Liang Wang, Tieniu Tan |
IJCV 2023 | Paper | Code | 💡 Not yet reproduced | |
YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection Yuming Chen, Xinbin Yuan, Jiabao Wang, Ruiqi Wu, Xiang Li, Qibin Hou, Ming-Ming Cheng |
TPAMI 2025 | Paper | Code | 💡 Not yet reproduced | |
A Survey on Graph Neural Networks for Time Series: Forecasting, Classification, Imputation, and Anomaly Detection Ming Jin, Huan Yee Koh, Qingsong Wen, Daniele Zambon, Cesare Alippi, Geoffrey I. Webb, Irwin King, Shirui Pan |
TPAMI 2024 | Paper | Code | 💡 Not yet reproduced | |
SplatFlow: Learning Multi-frame Optical Flow via Splatting Bo Wang, Yifan Zhang, Jian Li, Yang Yu, Zhenping Sun, Li Liu, Dewen Hu |
IJCV 2024 | Paper | Code | 💡 Not yet reproduced | |
Towards Expressive Spectral-Temporal Graph Neural Networks for Time Series Forecasting Ming Jin, Guangsi Shi, Yuan-Fang Li, Bo Xiong, Tian Zhou, Flora D. Salim, Liang Zhao, Lingfei Wu, Qingsong Wen, Shirui Pan |
TPAMI 2025 | Paper | 💡 Not yet reproduced | |
Efficient Halftoning via Deep Reinforcement Learning Haitian Jiang, Dongliang Xiong, Xiaowen Jiang, Li Ding, Liang Chen, Kai Huang |
TIP 2023 | Paper | 💡 Not yet reproduced | |
PAC-Bayes Bounds for Bandit Problems: A Survey and Experimental Comparison Hamish Flynn, David Reeb, Melih Kandemir, Jan Peters |
TPAMI 2023 | Paper | 💡 Not yet reproduced | |
Salient Object Detection via Dynamic Scale Routing Zhenyu Wu, Shuai Li, Chenglizhao Chen, Hong Qin, Aimin Hao |
TIP 2022 | Paper | Code | 💡 Not yet reproduced | |
Twin Contrastive Learning for Online Clustering Yunfan Li, Mouxing Yang, Dezhong Peng, Taihao Li, Jiantao Huang, Xi Peng |
IJCV 2022 | Paper | 💡 Not yet reproduced | |
Kernel-Based Generalized Median Computation for Consensus Learning Andreas Nienkötter, Xiaoyi Jiang |
TPAMI 2022 | Paper | 💡 Not yet reproduced | |
A Tale of HodgeRank and Spectral Method: Target Attack Against Rank Aggregation Is the Fixed Point of Adversarial Game Ke Ma, Qianqian Xu, Jinshan Zeng, Guorong Li, Xiaochun Cao, Qingming Huang |
TPAMI 2022 | Paper | 💡 Not yet reproduced | |
Boosting Night-time Scene Parsing with Learnable Frequency Zhifeng Xie, Sen Wang, Ke Xu, Zhizhong Zhang, Xin Tan, Yuan Xie, Lizhuang Ma |
TIP 2023 | Paper | 💡 Not yet reproduced | |
SiamMask: A Framework for Fast Online Object Tracking and Segmentation Weiming Hu, Qiang Wang, Li Zhang, Luca Bertinetto, Philip H. S. Torr |
TPAMI 2022 | Paper | 💡 Not yet reproduced | |
SERE: Exploring Feature Self-relation for Self-supervised Transformer Zhong-Yu Li, Shanghua Gao, Ming-Ming Cheng |
TPAMI 2023 | Paper | Code | 💡 Not yet reproduced | |
Parallel and Distributed Graph Neural Networks: An In-Depth Concurrency Analysis Maciej Besta, Torsten Hoefler |
TPAMI 2023 | Paper | 💡 Not yet reproduced | |
Efficient Burst Raw Denoising with Variance Stabilization and Multi-frequency Denoising Network Dasong Li, Yi Zhang, Ka Lung Law, Xiaogang Wang, Hongwei Qin, Hongsheng Li |
IJCV 2022 | Paper | 💡 Not yet reproduced | |
Incomplete Gamma Kernels: Generalizing Locally Optimal Projection Operators Patrick Stotko, Michael Weinmann, Reinhard Klein |
TPAMI 2024 | Paper | Code | 💡 Not yet reproduced | |
From Slow Bidirectional to Fast Autoregressive Video Diffusion Models Tianwei Yin, Qiang Zhang, Richard Zhang, William T. Freeman, Fredo Durand, Eli Shechtman, Xun Huang |
CVPR 2025 | Paper | Code | 💡 Not yet reproduced | |
FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention Guangxuan Xiao, Tianwei Yin, William T. Freeman, Frédo Durand, Song Han |
IJCV 2024 | Paper | Code | 💡 Not yet reproduced | |
ROGRAG: A Robustly Optimized GraphRAG Framework Zhefan Wang, Huanjun Kong, Jie Ying, Wanli Ouyang, Nanqing Dong |
ACL 2025 | Paper | Code | 💡 Not yet reproduced | |
Dual Encoder GAN Inversion for High-Fidelity 3D Head Reconstruction from Single Images Bahri Batuhan Bilecen, Ahmet Berke Gokmen, Aysegul Dundar |
NIPS 2024 | Paper | Code | 💡 Not yet reproduced | |
Can We Leave Deepfake Data Behind in Training Deepfake Detector Jikang Cheng, Zhiyuan Yan, Ying Zhang, Yuhao Luo, Zhongyuan Wang, Chen Li |
NIPS 2024 | Paper | Code | 💡 Not yet reproduced | |
VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset Sihan Chen, Handong Li, Qunbo Wang, Zijia Zhao, Mingzhen Sun, Xinxin Zhu, Jing Liu |
NIPS 2023 | Paper | Code | 💡 Not yet reproduced | |
MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors Riku Murai, Eric Dexheimer, Andrew J. Davison |
CVPR 2025 | Paper | Code | 💡 Not yet reproduced | |
MAGiC-SLAM: Multi-Agent Gaussian Globally Consistent SLAM Vladimir Yugay, Theo Gevers, Martin R. Oswald |
CVPR 2025 | Paper | Code | 💡 Not yet reproduced | |
Murre: Multi-view Reconstruction via SfM-guided Monocular Depth Estimation Haoyu Guo, He Zhu, Sida Peng, Haotong Lin, Yunzhi Yan, Tao Xie, Wenguan Wang, Xiaowei Zhou, Hujun Bao |
CVPR 2025 | Paper | Code | 💡 Not yet reproduced | |
STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution Rui Xie, Yinhong Liu, Penghao Zhou, Chen Zhao, Jun Zhou, Kai Zhang, Zhenyu Zhang, Jian Yang, Zhenheng Yang, Ying Tai |
ICCV 2025 | Paper | Code | 💡 Not yet reproduced | |
SemanticDraw: Towards Real-Time Interactive Content Creation from Image Diffusion Models Jaerin Lee, Daniel Sungho Jung, Kanggeon Lee, Kyoung Mu Lee |
CVPR 2025 | Paper | Code | 💡 Not yet reproduced | |
HSMR: Reconstructing Humans with a Biomechanically Accurate Skeleton Yan Xia, Xiaowei Zhou, Etienne Vouga, Qixing Huang, Georgios Pavlakos |
CVPR 2025 | Paper | Code | 💡 Not yet reproduced | |
RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness Tianyu Yu, Haoye Zhang, Qiming Li, Qixin Xu, Yuan Yao, Da Chen, Xiaoman Lu, Ganqu Cui, Yunkai Dang, Taiwen He, Xiaocheng Feng, Jun Song, Bo Zheng, Zhiyuan Liu, Tat-Seng Chua, Maosong Sun |
CVPR 2025 | Paper | Code | 💡 Not yet reproduced | |
DFormer: Rethinking RGBD Representation Learning for Semantic Segmentation Bowen Yin, Xuying Zhang, Zhongyu Li, Li Liu, Ming-Ming Cheng, Qibin Hou |
CVPR 2025 | Paper | Code | 💡 Not yet reproduced | |
GaussianCity: Generative Gaussian Splatting for Unbounded 3D City Generation Haozhe Xie, Zhaoxi Chen, Fangzhou Hong, Ziwei Liu |
CVPR 2025 | Paper | Code | 💡 Not yet reproduced | |
PhysTwin: Physics-Informed Reconstruction and Simulation of Deformable Objects from Videos Hanxiao Jiang, Hao-Yu Hsu, Kaifeng Zhang, Hsin-Ni Yu, Shenlong Wang, Yunzhu Li |
ICCV 2025 | Paper | Code | 💡 Not yet reproduced | |
UniGoal: Towards Universal Zero-shot Goal-oriented Navigation Hang Yin, Xiuwei Xu, Lingqing Zhao, Ziwei Wang, Jie Zhou, Jiwen Lu |
CVPR 2025 | Paper | Code | 💡 Not yet reproduced | |
Momentum-GS: Momentum Gaussian Self-Distillation for High-Quality Large Scene Reconstruction Jixuan Fan, Wanhua Li, Yifei Han, Yansong Tang |
ICCV 2025 | Paper | Code | 💡 Not yet reproduced | |
MINIMA: Modality Invariant Image Matching Jiangwei Ren, Xingyu Jiang, Zizhuo Li, Dingkang Liang, Xin Zhou, Xiang Bai |
CVPR 2025 | Paper | Code | 💡 Not yet reproduced | |
Uni4D: Unifying Visual Foundation Models for 4D Modeling from a Single Video David Yifan Yao, Albert J. Zhai, Shenlong Wang |
CVPR 2025 | Paper | Code | 💡 Not yet reproduced | |
Diffusion Renderer: Neural Inverse and Forward Rendering with Video Diffusion Models Ruofan Liang, Zan Gojcic, Huan Ling, Jacob Munkberg, Jon Hasselgren, Zhi-Hao Lin, Jun Gao, Alexander Keller, Nandita Vijaykumar, Sanja Fidler, Zian Wang |
CVPR 2025 | Paper | Code | 💡 Not yet reproduced | |
Linear Programming Bounds on k-Uniform States Yu Ning, Fei Shi, Tao Luo, Xiande Zhang |
ICCV 2025 | Paper | Code | 💡 Not yet reproduced | |
LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation Jiaxiang Tang, Zhaoxi Chen, Xiaokang Chen, Tengfei Wang, Gang Zeng, Ziwei Liu |
ECCV 2024 | Paper | Code | 💡 Not yet reproduced | |
VideoMamba: State Space Model for Efficient Video Understanding Kunchang Li, Xinhao Li, Yi Wang, Yinan He, Yali Wang, Limin Wang, Yu Qiao |
ECCV 2024 | Paper | Code | 💡 Not yet reproduced | |
DriveLM: Driving with Graph Visual Question Answering Chonghao Sima, Katrin Renz, Kashyap Chitta, Li Chen, Hanxue Zhang, Chengen Xie, Jens Beißwenger, Ping Luo, Andreas Geiger, Hongyang Li |
ECCV 2024 | Paper | Code | 💡 Not yet reproduced | |
GRiT: A Generative Region-to-text Transformer for Object Understanding Jialian Wu, Jianfeng Wang, Zhengyuan Yang, Zhe Gan, Zicheng Liu, Junsong Yuan, Lijuan Wang |
ECCV 2024 | Paper | Code | 💡 Not yet reproduced | |
PointLLM: Empowering Large Language Models to Understand Point Clouds Runsen Xu, Xiaolong Wang, Tai Wang, Yilun Chen, Jiangmiao Pang, Dahua Lin |
ECCV 2024 | Paper | Code | 💡 Not yet reproduced | |
nuCraft: Crafting High Resolution 3D Semantic Occupancy for Unified 3D Scene Understanding Benjin Zhu, Zhe Wang, Hongsheng Li |
ECCV 2024 | Paper | Code | 💡 Not yet reproduced | |
Adversarial Diffusion Distillation Axel Sauer, Dominik Lorenz, Andreas Blattmann, Robin Rombach |
ECCV 2024 | Paper | Code | 💡 Not yet reproduced | |
Generative Image Dynamics Zhengqi Li, Richard Tucker, Noah Snavely, Aleksander Holynski |
CVPR 2024 | Best Paper | Paper | Code | 💡 Not yet reproduced |
Rich Human Feedback for Text-to-Image Generation Youwei Liang, Junfeng He, Gang Li, Peizhao Li, Arseniy Klimovskiy, Nicholas Carolan, Jiao Sun, Jordi Pont-Tuset, Sarah Young, Feng Yang, Junjie Ke, Krishnamurthy Dj Dvijotham, Katie Collins, Yiwen Luo, Yang Li, Kai J Kohlhoff, Deepak Ramachandran, Vidhya Navalpakkam |
CVPR 2024 | Best Paper | Paper | Code | 💡 Not yet reproduced |
Mip-Splatting: Alias-free 3D Gaussian Splatting Zehao Yu, Anpei Chen, Binbin Huang, Torsten Sattler, Andreas Geiger |
CVPR 2024 | Best Student Paper | Paper | Code | 💡 Not yet reproduced |
BioCLIP: A Vision Foundation Model for the Tree of Life Samuel Stevens, Jiaman Wu, Matthew J Thompson, Elizabeth G Campolongo, Chan Hee Song, David Edward Carlyn, Li Dong, Wasila M Dahdul, Charles Stewart, Tanya Berger-Wolf, Wei-Lun Chao, Yu Su |
CVPR 2024 | Best Student Paper | Paper | Code | 💡 Not yet reproduced |
4D Gaussian Splatting for Real-Time Dynamic Scene Rendering Guanjun Wu, Taoran Yi, Jiemin Fang, Lingxi Xie, Xiaopeng Zhang, Wei Wei, Wenyu Liu, Qi Tian, Xinggang Wang |
CVPR 2024 | Paper | Code | 💡 Not yet reproduced | |
Depth Anything: Unleashing The Power of Large-Scale Unlabeled Data Lihe Yang, Bingyi Kang, Zilong Huang, Xiaogang Xu, Jiashi Feng, Hengshuang Zhao |
CVPR 2024 | Paper | Code | 💡 Not yet reproduced | |
LISA: Reasoning Segmentation Via Large Language Model Xin Lai, Zhuotao Tian, Yukang Chen, Yanwei Li, Yuhui Yuan, Shu Liu, Jiaya Jia |
CVPR 2024 | Paper | Code | 💡 Not yet reproduced | |
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks Zhe Chen, Jiannan Wu, Wenhai Wang, Weijie Su, Guo Chen, Sen Xing, Muyan Zhong, Qinglong Zhang, Xizhou Zhu, Lewei Lu, Bin Li, Ping Luo, Tong Lu, Yu Qiao, Jifeng Dai |
CVPR 2024 | Paper | Code | 💡 Not yet reproduced | |
MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark Xiang Yue, Yuansheng Ni, Kai Zhang, Tianyu Zheng, Ruoqi Liu, Ge Zhang, Samuel Stevens, Dongfu Jiang, Weiming Ren, Yuxuan Sun, Cong Wei, Botao Yu, Ruibin Yuan, Renliang Sun, Ming Yin, Boyuan Zheng, Zhenzhu Yang, Yibo Liu, Wenhao Huang, Huan Sun, Yu Su, Wenhu Chen |
CVPR 2024 | Paper | Code | 💡 Not yet reproduced | |
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything Yunyang Xiong, Bala Varadarajan, Lemeng Wu, Xiaoyu Xiang, Fanyi Xiao, Chenchen Zhu, Xiaoliang Dai, Dilin Wang, Fei Sun, Forrest Iandola, Raghuraman Krishnamoorthi, Vikas Chandra |
CVPR 2024 | Paper | Code | 💡 Not yet reproduced | |
Improved Baselines with Visual Instruction Tuning (LLaVA-1.5) Haotian Liu, Chunyuan Li, Yuheng Li, Yong Jae Lee |
CVPR 2024 | Paper | Code | 💡 Not yet reproduced | |
DemoFusion: Democratising High-Resolution Image Generation With No $$$ Ruoyi Du, Dongliang Chang, Timothy Hospedales, Yi-Zhe Song, Zhanyu Ma |
CVPR 2024 | Paper | Code | 💡 Not yet reproduced | |
ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models Lukas Höllein, Aljaž Božič, Norman Müller, David Novotny, Hung-Yu Tseng, Christian Richardt, Michael Zollhöfer, Matthias Nießner |
CVPR 2024 | Paper | Code | 💡 Not yet reproduced | |
OmniGlue: Generalizable Feature Matching with Foundation Model Guidance Hanwen Jiang, Arjun Karpur, Bingyi Cao, Qixing Huang, Andre Araujo |
CVPR 2024 | Paper | Code | 💡 Not yet reproduced | |
DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks Jiaxin Zhang, Dezhi Peng, Chongyu Liu, Peirong Zhang, Lianwen Jin |
CVPR 2024 | Paper | Code | 💡 Not yet reproduced | |
MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training Pavan Kumar Anasosalu Vasu, Hadi Pouransari, Fartash Faghri, Raviteja Vemulapalli, Oncel Tuzel |
CVPR 2024 | Paper | Code | 💡 Not yet reproduced | |
Describing Differences in Image Sets with Natural Language Lisa Dunlap, Yuhui Zhang, Xiaohan Wang, Ruiqi Zhong, Trevor Darrell, Jacob Steinhardt, Joseph E. Gonzalez, Serena Yeung-Levy |
CVPR 2024 | Paper | Code | 💡 Not yet reproduced | |
XFeat: Accelerated Features for Lightweight Image Matching Guilherme Potje, Felipe Cadar, Andre Araujo, Renato Martins, Erickson R. Nascimento |
CVPR 2024 | Paper | Code | 💡 Not yet reproduced | |
pixelSplat: 3D Gaussian Splats from Image Pairs for Scalable Generalizable 3D Reconstruction David Charatan, Sizhe Li, Andrea Tagliasacchi, Vincent Sitzmann |
CVPR 2024 | Paper | 💡 Not yet reproduced | |
GPT4Point: A Unified Framework for Point-Language Understanding and Generation Zhangyang Qi, Ye Fang, Zeyi Sun, Xiaoyang Wu, Tong Wu, Jiaqi Wang, Dahua Lin, Hengshuang Zhao |
CVPR 2024 | Paper | 💡 Not yet reproduced | |
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks Bin Xiao, Haiping Wu, Weijian Xu, Xiyang Dai, Houdong Hu, Yumao Lu, Michael Zeng, Ce Liu, Lu Yuan |
CVPR 2024 | Paper | 💡 Not yet reproduced | |
Identity-Preserving Text-to-Video Generation by Frequency Decomposition Shenghai Yuan, Jinfa Huang, Xianyi He, Yunyuan Ge, Yujun Shi, Liuhan Chen, Jiebo Luo, Li Yuan |
CVPR 2025 | Paper | Code | 💡 Not yet reproduced | |
Cinemo: Consistent and Controllable Image Animation with Motion Diffusion Models Xin Ma, Yaohui Wang, Gengyun Jia, Xinyuan Chen, Yuan-Fang Li, Cunjian Chen, Yu Qiao |
CVPR 2025 | Paper | Code | 💡 Not yet reproduced | |
X-Dyna: Expressive Dynamic Human Image Animation Di Chang, Hongyi Xu, You Xie, Yipeng Gao, Zhengfei Kuang, Shengqu Cai, Chenxu Zhang, Guoxian Song, Chao Wang, Yichun Shi, Zeyuan Chen, Shijie Zhou, Linjie Luo, Gordon Wetzstein, Mohammad Soleymani |
CVPR 2025 | Paper | Code | 💡 Not yet reproduced | |
PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation Qiyao Xue, Xiangyu Yin, Boyuan Yang, Wei Gao |
CVPR 2025 | Paper | Code | 💡 Not yet reproduced | |
Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model Feng Liu, Shiwei Zhang, Xiaofeng Wang, Yujie Wei, Haonan Qiu, Yuzhong Zhao, Yingya Zhang, Qixiang Ye, Fang Wan |
CVPR 2025 | Paper | Code | 💡 Not yet reproduced | |
AR-Diffusion: Asynchronous Video Generation with Auto-Regressive Diffusion Mingzhen Sun, Weining Wang, Gen Li, Jiawei Liu, Jiahui Sun, Wanquan Feng, Shanshan Lao, SiYu Zhou, Qian He, Jing Liu |
CVPR 2025 | Paper | Code | 💡 Not yet reproduced | |
Number it: Temporal Grounding Videos like Flipping Manga Yongliang Wu, Xinting Hu, Yuyang Sun, Yizhou Zhou, Wenbo Zhu, Fengyun Rao, Bernt Schiele, Xu Yang |
CVPR 2025 | Paper | Code | 💡 Not yet reproduced | |
Edit Away and My Face Will not Stay: Personal Biometric Defense against Malicious Generative Editing Hanhui Wang, Yihua Zhang, Ruizheng Bai, Yue Zhao, Sijia Liu, Zhengzhong Tu |
CVPR 2025 | Paper | Code | 💡 Not yet reproduced | |
h-Edit: Effective and Flexible Diffusion-Based Editing via Doob’s h-Transform Toan Nguyen, Kien Do, Duc Kieu, Thin Nguyen |
CVPR 2025 | Paper | Code | 💡 Not yet reproduced | |
OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels Meng Lou, Yizhou Yu |
CVPR 2025 | oral | Paper | Code | 💡 Not yet reproduced |
Alias-Free Latent Diffusion Models: Improving Fractional Shift Equivariance of Diffusion Latent Space Yifan Zhou, Zeqi Xiao, Shuai Yang, Xingang Pan |
CVPR 2025 | oral | Paper | Code | 💡 Not yet reproduced |
3D Student Splatting and Scooping Jialin Zhu, Jiangbei Yue, Feixiang He, He Wang |
CVPR 2025 | oral | Paper | Code | 💡 Not yet reproduced |
CAP4D: Creating Animatable 4D Portrait Avatars with Morphable Multi-View Diffusion Models Felix Taubner, Ruihang Zhang, Mathieu Tuli, David B. Lindell |
CVPR 2025 | oral | Paper | Code | 💡 Not yet reproduced |
Multi-view Reconstruction via SfM-guided Monocular Depth Estimation Haoyu Guo, He Zhu, Sida Peng, Haotong Lin, Yunzhi Yan, Tao Xie, Wenguan Wang, Xiaowei Zhou, Hujun Bao |
CVPR 2025 | oral | Paper | Code | 💡 Not yet reproduced |
Closed-Loop Supervised Fine-Tuning of Tokenized Traffic Models Zhejun Zhang, Peter Karkus, Maximilian Igl, Wenhao Ding, Yuxiao Chen, Boris Ivanovic, Marco Pavone |
CVPR 2025 | oral | Paper | Code | 💡 Not yet reproduced |
CustAny: Customizing Anything from A Single Example Lingjie Kong, Kai Wu, Xiaobin Hu, Wenhui Han, Jinlong Peng, Chengming Xu, Donghao Luo, Mengtian Li, Jiangning Zhang, Chengjie Wang, Yanwei Fu |
CVPR 2025 | oral | Paper | Code | 💡 Not yet reproduced |
VGGT: Visual Geometry Grounded Transformer Jianyuan Wang, Minghao Chen, Nikita Karaev, Andrea Vedaldi, Christian Rupprecht, David Novotny |
CVPR 2025 | oral, Award Candidate | Paper | Code | 💡 Not yet reproduced |
Navigation World Models Amir Bar, Gaoyue Zhou, Danny Tran, Trevor Darrell, Yann LeCun |
CVPR 2025 | oral | Paper | Code | 💡 Not yet reproduced |
MegaSaM: Accurate, Fast, and Robust Structure and Motion from Casual Dynamic Videos Zhengqi Li, Richard Tucker, Forrester Cole, Qianqian Wang, Linyi Jin, Vickie Ye, Angjoo Kanazawa, Aleksander Holynski, Noah Snavely |
CVPR 2025 | oral | Paper | Code | 💡 Not yet reproduced |
FoundationStereo: Zero-Shot Stereo Matching Bowen Wen, Matthew Trepte, Joseph Aribido, Jan Kautz, Orazio Gallo, Stan Birchfield |
CVPR 2025 | oral, Award Candidate | Paper | Code | 💡 Not yet reproduced |
The PanAf-FGBG Dataset: Understanding the Impact of Backgrounds in Wildlife Behaviour Recognition Otto Brookes, Maksim Kukushkin, Majid Mirmehdi, Colleen Stephens, Paula Dieguez, Thurston C. Hicks, Sorrel Jones, Kevin Lee, Maureen S. McCarthy, Amelia Meier, Emmanuelle Normand, Erin G. Wessling, Roman M. Wittig, Kevin Langergraber, Klaus Zuberbühler, Lukas Boesch, Thomas Schmid, Mimi Arandjelovic, Hjalmar Kühl, Tilo Burghardt |
CVPR 2025 | oral, Award Candidate | Paper | Code | 💡 Not yet reproduced |
Improving Diffusion Inverse Problem Solving with Decoupled Noise Annealing Bingliang Zhang, Wenda Chu, Julius Berner, Chenlin Meng, Anima Anandkumar, Yang Song |
CVPR 2025 | oral | Paper | Code | 💡 Not yet reproduced |
MV-DUSt3R+: Single-Stage Scene Reconstruction from Sparse Views In 2 Seconds Zhenggang Tang, Yuchen Fan, Dilin Wang, Hongyu Xu, Rakesh Ranjan, Alexander Schwing, Zhicheng Yan |
CVPR 2025 | oral | Paper | Code | 💡 Not yet reproduced |
DIFIX3D+: Improving 3D Reconstructions with Single-Step Diffusion Models Jay Zhangjie Wu, Yuxuan Zhang, Haithem Turki, Xuanchi Ren, Jun Gao, Mike Zheng Shou, Sanja Fidler, Zan Gojcic, Huan Ling |
CVPR 2025 | oral, Award Candidate | Paper | Code | 💡 Not yet reproduced |
DIFFUSIONRENDERER: Neural Inverse and Forward Rendering with Video Diffusion Models Ruofan Liang, Zan Gojcic, Huan Ling, Jacob Munkberg, Jon Hasselgren, Zhi-Hao Lin, Jun Gao, Alexander Keller, Nandita Vijaykumar, Sanja Fidler, Zian Wang |
CVPR 2025 | oral | Paper | Code | 💡 Not yet reproduced |
OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation Pengfei Zhou, Xiaopeng Peng, Jiajun Song, Chuanhao Li, Zhaopan Xu, Yue Yang, Ziyao Guo, Hao Zhang, Yuqi Lin, Yefei He, Lirui Zhao, Shuo Liu, Tianhua Li, Yuxuan Xie, Xiaojun Chang, Yu Qiao, Wenqi Shao, Kaipeng Zhang |
CVPR 2025 | oral | Paper | Code | 💡 Not yet reproduced |
RandAR: Decoder-only Autoregressive Visual Generation in Random Orders Ziqi Pang, Tianyuan Zhang, Fujun Luan, Yunze Man, Hao Tan, Kai Zhang, William T. Freeman, Yu-Xiong Wang |
CVPR 2025 | oral | Paper | Code | 💡 Not yet reproduced |
AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea Qifan Yu, Wei Chow, Zhongqi Yue, Kaihang Pan, Yang Wu, Xiaoyang Wan, Juncheng Li, Siliang Tang, Hanwang Zhang, Yueting Zhuang |
CVPR 2025 | oral | Paper | Code | 💡 Not yet reproduced |
VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection Songhao Han, Wei Huang, Hairong Shi, Le Zhuo, Xiu Su, Shifeng Zhang, Xu Zhou, Xiaojuan Qi, Yue Liao, Si Liu |
CVPR 2025 | oral | Paper | Code | 💡 Not yet reproduced |
SegEarth-OV: Towards Training-Free Open-Vocabulary Segmentation for Remote Sensing Images Kaiyu Li, Ruixun Liu, Xiangyong Cao, Xueru Bai, Feng Zhou, Deyu Meng, Zhi Wang |
CVPR 2025 | oral | Paper | Code | 💡 Not yet reproduced |
Minority-Focused Text-to-Image Generation via Prompt Optimization Soobin Um, Jong Chul Ye |
CVPR 2025 | oral | Paper | Code | 💡 Not yet reproduced |
Autoregressive Distillation of Diffusion Transformers Yeongmin Kim, Sotiris Anagnostidis, Yuming Du, Edgar Schönfeld, Jonas Kohler, Markos Georgopoulos, Albert Pumarola, Ali Thabet, Artsiom Sanakoyeu |
CVPR 2025 | oral | Paper | Code | 💡 Not yet reproduced |
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models Matt Deitke, Christopher Clark, Sangho Lee, Rohun Tripathi, Yue Yang, Jae Sung Park, Mohammadreza Salehi, Niklas Muennighoff, Kyle Lo, Luca Soldaini, Jiasen Lu, Taira Anderson, Erin Bransom, Kiana Ehsani, Huong Ngo, YenSung Chen, Ajay Patel, Mark Yatskar, Chris Callison-Burch, Andrew Head, Rose Hendrix, Favyen Bastani, Eli VanderBilt, Nathan Lambert, Yvonne Chou, Arnavi Chheda, Jenna Sparks, Sam Skjonsberg, Michael Schmitz, Aaron Sarnat, Byron Bischoff, Pete Walsh, Chris Newell, Piper Wolters, Tanmay Gupta, Kuo-Hao Zeng, Jon Borchardt, Dirk Groeneveld, Crystal Nam, Sophie Lebrecht, Caitlin Wittlif, Carissa Schoenick, Oscar Michel, Ranjay Krishna, Luca Weihs, Noah A. Smith, Hannaneh Hajishirzi, Ross Girshick, Ali Farhadi, Aniruddha Kembhavi |
CVPR 2025 | oral | Paper | Code | 💡 Not yet reproduced |
Continuous 3D Perception Model with Persistent State Qianqian Wang, Yifei Zhang, Aleksander Holynski, Alexei A. Efros, Angjoo Kanazawa |
CVPR 2025 | oral | Paper | Code | 💡 Not yet reproduced |
Q-Eval-100K: Evaluating Visual Quality and Alignment Level for Text-to-Vision Content Zicheng Zhang, Tengchuan Kou, Shushi Wang, Chunyi Li, Wei Sun, Wei Wang, Xiaoyu Li, Zongyu Wang, Xuezhi Cao, Xiongkuo Min, Xiaohong Liu, Guangtao Zhai |
CVPR 2025 | oral | Paper | Code | 💡 Not yet reproduced |
Video-XL: Extra-Long Vision Language Model for Hour-Scale Video Understanding Yan Shu, Zheng Liu, Peitian Zhang, Minghao Qin, Junjie Zhou, Zhengyang Liang, Tiejun Huang, Bo Zhao |
CVPR 2025 | oral | Paper | Code | 💡 Not yet reproduced |
Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using Real-Time Warped Noise Ryan Burgert, Yuancheng Xu, Wenqi Xian, Oliver Pilarski, Pascal Clausen, Mingming He, Li Ma, Yitong Deng, Lingxiao Li, Mohsen Mousavi, Michael Ryoo, Paul Debevec, Ning Yu |
CVPR 2025 | oral | Paper | Code | 💡 Not yet reproduced |
CleanDIFT: Diffusion Features without Noise Nick Stracke, Stefan Andreas Baumann, Kolja Bauer, Frank Fundel, Björn Ommer |
CVPR 2025 | oral | Paper | Code | 💡 Not yet reproduced |
CraftsMan3D: High-fidelity Mesh Generation with 3D Native Diffusion and Interactive Geometry Refiner Weiyu Li, Jiarui Liu, Hongyu Yan, Rui Chen, Yixun Liang, Xuelin Chen, Ping Tan, Xiaoxiao Long |
CVPR 2025 | oral | Paper | Code | 💡 Not yet reproduced |
DreamRelation: Bridging Customization and Relation Generation Qingyu Shi, Lu Qi, Jianzong Wu, Jinbin Bai, Jingbo Wang, Yunhai Tong, Xiangtai Li |
CVPR 2025 | oral | Paper | Code | 💡 Not yet reproduced |
Generative Multimodal Pretraining with Discrete Diffusion Timestep Tokens Kaihang Pan, Wang Lin, Zhongqi Yue, Tenglong Ao, Liyu Jia, Wei Zhao, Juncheng Li, Siliang Tang, Hanwang Zhang |
CVPR 2025 | oral | Paper | Code | 💡 Not yet reproduced |
3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion Zhaoxi Chen, Jiaxiang Tang, Yuhao Dong, Ziang Cao, Fangzhou Hong, Yushi Lan, Tengfei Wang, Haozhe Xie, Tong Wu, Shunsuke Saito, Liang Pan, Dahua Lin, Ziwei Liu |
CVPR 2025 | highlight | Paper | Code | 💡 Not yet reproduced |
AnySat: One Earth Observation Model for Many Resolutions, Scales, and Modalities Guillaume Astruc, Nicolas Gonthier, Clement Mallet, Loic Landrieu |
CVPR 2025 | highlight | Paper | Code | 💡 Not yet reproduced |
Cross-modal Causal Relation Alignment for Video Question Grounding Weixing Chen, Yang Liu, Binglin Chen, Jiandong Su, Yongsen Zheng, Liang Lin |
CVPR 2025 | highlight | Paper | Code | 💡 Not yet reproduced |
DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos Wenbo Hu, Xiangjun Gao, Xiaoyu Li, Sijie Zhao, Xiaodong Cun, Yong Zhang, Long Quan, Ying Shan |
CVPR 2025 | highlight | Paper | Code | 💡 Not yet reproduced |
DroneSplat: 3D Gaussian Splatting for Robust 3D Reconstruction from In-the-Wild Drone Imagery Jiadong Tang, Yu Gao, Dianyi Yang, Liqi Yan, Yufeng Yue, Yi Yang |
CVPR 2025 | highlight | Paper | Code | 💡 Not yet reproduced |
Empowering Vector Graphics with Consistently Arbitrary Viewing and View-dependent Visibility Yidi Li, Jun Xiao, Zhengda Lu, Yiqun Wang, Haiyong Jiang |
CVPR 2025 | highlight | Paper | Code | 💡 Not yet reproduced |
Extrapolating and Decoupling Image-to-Video Generation Models: Motion Modeling is Easier Than You Think Jie Tian, Xiaoye Qu, Zhenyi Lu, Wei Wei, Sichen Liu, Yu Cheng |
CVPR 2025 | highlight | Paper | Code | 💡 Not yet reproduced |
ETAP: Event-based Tracking of Any Point Friedhelm Hamann, Daniel Gehrig, Filbert Febryanto, Kostas Daniilidis, Guillermo Gallego |
CVPR 2025 | highlight | Paper | Code | 💡 Not yet reproduced |
Flash3D: Super-scaling Point Transformers through Joint Hardware-Geometry Locality Liyan Chen, Gregory P. Meyer, Zaiwei Zhang, Eric M. Wolff, Paul Vernaza |
CVPR 2025 | highlight | Paper | Code | 💡 Not yet reproduced |
Flowing from Words to Pixels: A Noise-Free Framework for Cross-Modality Evolution Qihao Liu, Xi Yin, Alan Yuille, Andrew Brown, Mannat Singh |
CVPR 2025 | highlight | Paper | Code | 💡 Not yet reproduced |
Generative Photography: Scene-Consistent Camera Control for Realistic Text-to-Image Synthesis Yu Yuan, Xijun Wang, Yichen Sheng, Prateek Chennuri, Xingguang Zhang, Stanley Chan |
CVPR 2025 | highlight | Paper | Code | 💡 Not yet reproduced |
Generative Multiview Relighting for 3D Reconstruction under Extreme Illumination Variation Hadi Alzayer, Philipp Henzler, Jonathan T. Barron, Jia-Bin Huang, Pratul P. Srinivasan, Dor Verbin |
CVPR 2025 | highlight | Paper | Code | 💡 Not yet reproduced |
Generative Densification: Learning to Densify Gaussians for High-Fidelity Generalizable 3D Reconstruction Seungtae Nam, Xiangyu Sun, Gyeongjin Kang, Younggeun Lee, Seungjun Oh, Eunbyung Park |
CVPR 2025 | highlight | Paper | Code | 💡 Not yet reproduced |
GEN3C: 3D-Informed World-Consistent Video Generation with Precise Camera Control Xuanchi Ren, Tianchang Shen, Jiahui Huang, Huan Ling, Yifan Lu, Merlin Nimier-David, Thomas Müller, Alexander Keller, Sanja Fidler, Jun Gao |
CVPR 2025 | highlight | Paper | Code | 💡 Not yet reproduced |
Holmes-VAU: Towards Long-term Video Anomaly Understanding at Any Granularity Huaxin Zhang, Xiaohao Xu, Xiang Wang, Jialong Zuo, Xiaonan Huang, Changxin Gao, Shanjun Zhang, Li Yu, Nong Sang |
CVPR 2025 | highlight | Paper | Code | 💡 Not yet reproduced |
HELVIPAD: A Real-World Dataset for Omnidirectional Stereo Depth Estimation Mehdi Zayene, Jannik Endres, Albias Havolli, Charles Corbière, Salim Cherkaoui, Alexandre Kontouli, Alexandre Alahi |
CVPR 2025 | highlight | 原文 | 代码 | 💡 未复现 |
ImViD: Immersive Volumetric Videos for Enhanced VR Engagement Zhengxian Yang, Shi Pan, Shengqi Wang, Haoxiang Wang, Li Lin, Guanjun Li, Zhengqi Wen, Borong Lin, Jianhua Tao, Tao Yu |
CVPR 2025 | highlight | 原文 | 代码 | 💡 未复现 |
Inst3D-LMM: Instance-Aware 3D Scene Understanding with Multi-modal Instruction Tuning Hanxun Yu, Wentong Li, Song Wang, Junbo Chen, Jianke Zhu |
CVPR 2025 | highlight | 原文 | 代码 | 💡 未复现 |
Learning 4D Panoptic Scene Graph Generation from Rich 2D Visual Scene Shengqiong Wu, Hao Fei, Jingkang Yang, Xiangtai Li, Juncheng Li, Hanwang Zhang, Tat-seng Chua |
CVPR 2025 | highlight | 原文 | 代码 | 💡 未复现 |
Mamba as a Bridge: Where Vision Foundation Models Meet Vision Language Models for Domain-Generalized Semantic Segmentation Xin Zhang, Robby T. Tan |
CVPR 2025 | highlight | 原文 | 代码 | 💡 未复现 |
MammAlps: A Multi-view Video Behavior Monitoring Dataset of Wild Mammals in the Swiss Alps Valentin Gabeff, Haozhe Qi, Brendan Flaherty, Gencer Sumbül, Alexander Mathis, Devis Tuia |
CVPR 2025 | highlight | 原文 | 代码 | 💡 未复现 |
Matrix3D: Large Photogrammetry Model All-in-One Yuanxun Lu, Jingyang Zhang, Tian Fang, Jean-Daniel Nahmias, Yanghai Tsin, Long Quan, Xun Cao, Yao Yao, Shiwei Li |
CVPR 2025 | highlight | 原文 | 代码 | 💡 未复现 |
MITracker: Multi-View Integration for Visual Object Tracking Mengjie Xu, Yitao Zhu, Haotian Jiang, Jiaming Li, Zhenrong Shen, Sheng Wang, Haolin Huang, Xinyu Wang, Qing Yang, Han Zhang, Qian Wang |
CVPR 2025 | highlight | 原文 | 代码 | 💡 未复现 |
Open-Canopy: Towards Very High Resolution Forest Monitoring Fajwel Fogel, Yohann Perron, Nikola Besic, Laurent Saint-André, Agnès Pellissier-Tanon, Martin Schwartz, Thomas Boudras, Ibrahim Fayad, Alexandre d'Aspremont, Loic Landrieu, Philippe Ciais |
CVPR 2025 | highlight | 原文 | 代码 | 💡 未复现 |
OpenHumanVid: A Large-Scale High-Quality Dataset for Enhancing Human-Centric Video Generation Hui Li, Mingwang Xu, Yun Zhan, Shan Mu, Jiaye Li, Kaihui Cheng, Yuxuan Chen, Tan Chen, Mao Ye, Jingdong Wang, Siyu Zhu |
CVPR 2025 | highlight | 原文 | 代码 | 💡 未复现 |
OpticalNet: An Optical Imaging Dataset and Benchmark Beyond the Diffraction Limit Benquan Wang, Ruyi An, Jin-Kyu So, Sergei Kurdiumov, Eng Aik Chan, Giorgio Adamo, Yuhan Peng, Yewen Li, Bo An |
CVPR 2025 | highlight | 原文 | 代码 | 💡 未复现 |
Optimizing for the Shortest Path in Denoising Diffusion Model Ping Chen, Xingpeng Zhang, Zhaoxiang Liu, Huan Hu, Xiang Liu, Kai Wang, Min Wang, Yanlin Qian, Shiguo Lian |
CVPR 2025 | highlight | 原文 | 代码 | 💡 未复现 |
Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval Yuanmin Tang, Xiaoting Qin, Jue Zhang, Jing Yu, Gaopeng Gou, Gang Xiong, Qingwei Ling, Saravan Rajmohan, Dongmei Zhang, Qi Wu |
CVPR 2025 | highlight | 原文 | 代码 | 💡 未复现 |
SmartCLIP: Modular Vision-language Alignment with Identification Guarantees Shaoan Xie, Lingjing Kong, Yujia Zheng, Yu Yao, Zeyu Tang, Eric P. Xing, Guangyi Chen, Kun Zhang |
CVPR 2025 | highlight | 原文 | 代码 | 💡 未复现 |
Structured 3D Latents for Scalable and Versatile 3D Generation Jianfeng Xiang, Zelong Lv, Sicheng Xu, Yu Deng, Ruicheng Wang, Bowen Zhang, Dong Chen, Xin Tong, Jiaolong Yang |
CVPR 2025 | highlight | 原文 | 代码 | 💡 未复现 |
StyleSSP: Sampling StartPoint Enhancement for Training-free Diffusion-based Method for Style Transfer Ruojun Xu, Weijie Xi, Xiaodi Wang, Yongbo Mao, Zach Cheng |
CVPR 2025 | highlight | 原文 | 代码 | 💡 未复现 |
Towards Autonomous Micromobility through Scalable Urban Simulation Wayne Wu, Honglin He, Chaoyuan Zhang, Jack He, Seth Z. Zhao, Ran Gong, Quanyi Li, Bolei Zhou |
CVPR 2025 | highlight | 原文 | 代码 | 💡 未复现 |
UIBDiffusion: Universal Imperceptible Backdoor Attack for Diffusion Models Yuning Han, Bingyin Zhao, Rui Chu, Feng Luo, Biplab Sikdar, Yingjie Lao |
CVPR 2025 | highlight | 原文 | 代码 | 💡 未复现 |
UCOD-DPL: Unsupervised Camouflaged Object Detection via Dynamic Pseudo-label Learning Weiqi Yan, Lvhai Chen, Huaijia Kou, Shengchuan Zhang, Yan Zhang, Liujuan Cao |
CVPR 2025 | highlight | 原文 | 代码 | 💡 未复现 |
Video Depth Anything: Consistent Depth Estimation for Super-Long Videos Sili Chen, Hengkai Guo, Shengnan Zhu, Feihu Zhang, Zilong Huang, Jiashi Feng, Bingyi Kang |
CVPR 2025 | highlight | 原文 | 代码 | 💡 未复现 |
World-consistent Video Diffusion with Explicit 3D Modeling Qihang Zhang, Shuangfei Zhai, Miguel Ángel Bautista Martin, Kevin Miao, Alexander Toshev, Joshua Susskind, Jiatao Gu |
CVPR 2025 | highlight | 原文 | 代码 | 💡 未复现 |
Your ViT is Secretly an Image Segmentation Model Tommie Kerssies, Niccolò Cavagnero, Alexander Hermans, Narges Norouzi, Giuseppe Averta, Bastian Leibe, Gijs Dubbelman, Daan de Geus |
CVPR 2025 | highlight | 原文 | 代码 | 💡 未复现 |
WonderWorld: Interactive 3D Scene Generation from a Single Image Hong-Xing Yu, Haoyi Duan, Charles Herrmann, William T. Freeman, Jiajun Wu |
CVPR 2025 | highlight | 原文 | 代码 | 💡 未复现 |
Relightable Gaussian Codec Avatars Shunsuke Saito, Gabriel Schwartz, Tomas Simon, Junxuan Li, Giljoo Nam |
CVPR 2024 | oral | 原文 | 代码 | 💡 未复现 |
Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following Yutong Feng, Biao Gong, Di Chen, Yujun Shen, Yu Liu, Jingren Zhou |
CVPR 2024 | oral | 原文 | 代码 | 💡 未复现 |
Rethinking Inductive Biases for Surface Normal Estimation Gwangbin Bae, Andrew J. Davison |
CVPR 2024 | oral | 原文 | 代码 | 💡 未复现 |
PaSCo: Urban 3D Panoptic Scene Completion with Uncertainty Awareness Anh-Quan Cao, Angela Dai, Raoul de Charette |
CVPR 2024 | oral | 原文 | 代码 | 💡 未复现 |
Transcriptomics-guided Slide Representation Learning in Computational Pathology Guillaume Jaume, Lukas Oldenburg, Anurag Vaidya, Richard J. Chen, Drew F. K. Williamson, Thomas Peeters, Andrew H. Song, Faisal Mahmood |
CVPR 2024 | oral | 原文 | 代码 | 💡 未复现 |
DiffusionLight: Light Probes for Free by Painting a Chrome Ball Pakkapon Phongthawee, Worameth Chinchuthakun, Nontaphat Sinsunthithet, Amit Raj, Varun Jampani, Pramook Khungurn, Supasorn Suwajanakorn |
CVPR 2024 | oral | 原文 | 代码 | 💡 未复现 |
URHand: Universal Relightable Hands Zhaoxi Chen, Gyeongsik Moon, Kaiwen Guo, Chen Cao, Stanislav Pidhorskyi, Tomas Simon, Rohan Joshi, Yuan Dong, Yichen Xu, Bernardo Pires, He Wen, Lucas Evans, Bo Peng, Julia Buffalini, Autumn Trimble, Kevyn McPhail, Melissa Schoeller, Shoou-I Yu, Javier Romero, Michael Zollhöfer, Yaser Sheikh, Ziwei Liu, Shunsuke Saito |
CVPR 2024 | oral | 原文 | 代码 | 💡 未复现 |
Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation Bingxin Ke, Anton Obukhov, Shengyu Huang, Nando Metzger, Rodrigo Caye Daudt, Konrad Schindler |
CVPR 2024 | oral | 原文 | 代码 | 💡 未复现 |
PhysGaussian: Physics-Integrated 3D Gaussians for Generative Dynamics Tianyi Xie, Zeshun Zong, Yuxing Qiu, Xuan Li, Yutao Feng, Yin Yang, Chenfanfu Jiang |
CVPR 2024 | oral | 原文 | 代码 | 💡 未复现 |
HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting Xian Liu, Xiaohang Zhan, Jiaxiang Tang, Ying Shan, Gang Zeng, Dahua Lin, Xihui Liu, Ziwei Liu |
CVPR 2024 | oral | 原文 | 代码 | 💡 未复现 |
Prompt Highlighter: Interactive Control for Multi-Modal LLMs Yuechen Zhang, Shengju Qian, Bohao Peng, Shu Liu, Jiaya Jia |
CVPR 2024 | oral | 原文 | 代码 | 💡 未复现 |
Cooperation Does Matter: Exploring Multi-Order Bilateral Relations for Audio-Visual Segmentation Qi Yang, Xing Nie, Tong Li, Pengfei Gao, Ying Guo, Cheng Zhen, Pengfei Yan, Shiming Xiang |
CVPR 2024 | oral | 原文 | 代码 | 💡 未复现 |
Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution Shangchen Zhou, Peiqing Yang, Jianyi Wang, Yihang Luo, Chen Change Loy |
CVPR 2024 | oral | 原文 | 代码 | 💡 未复现 |
Putting the Object Back into Video Object Segmentation Ho Kei Cheng, Seoung Wug Oh, Brian Price, Joon-Young Lee, Alexander Schwing |
CVPR 2024 | oral | 原文 | 代码 | 💡 未复现 |
InstanceDiffusion: Instance-level Control for Image Generation Xudong Wang, Trevor Darrell, Sai Saketh Rambhatla, Rohit Girdhar, Ishan Misra |
CVPR 2024 | oral | 原文 | 代码 | 💡 未复现 |
OMG-Seg: Is One Model Good Enough For All Segmentation? Xiangtai Li, Haobo Yuan, Wei Li, Henghui Ding, Size Wu, Wenwei Zhang, Yining Li, Kai Chen, Chen Change Loy |
CVPR 2024 | oral | 原文 | 代码 | 💡 未复现 |
Towards Language-Driven Video Inpainting via Multimodal Large Language Models Jianzong Wu, Xiangtai Li, Chenyang Si, Shangchen Zhou, Jingkang Yang, Jiangning Zhang, Yining Li, Kai Chen, Yunhai Tong, Ziwei Liu, Chen Change Loy |
CVPR 2024 | oral | 原文 | 代码 | 💡 未复现 |
VBench: Comprehensive Benchmark Suite for Video Generative Models Ziqi Huang, Fan Zhang, Xiaojie Xu, Yinan He, Jiashuo Yu, Ziyue Dong, Qianli Ma, Nattapol Chanpaisit, Chenyang Si, Yuming Jiang, Yaohui Wang, Xinyuan Chen, Ying-Cong Chen, Limin Wang, Dahua Lin, Yu Qiao, Ziwei Liu |
CVPR 2024 | oral | 原文 | 代码 | 💡 未复现 |
PIGEON: Predicting Image Geolocations Lukas Haas, Michal Skreta, Silas Alberti, Chelsea Finn |
CVPR 2024 | oral | 原文 | 代码 | 💡 未复现 |
DiffPortrait3D: Controllable Diffusion for Zero-Shot Portrait View Synthesis Yuming Gu, You Xie, Hongyi Xu, Guoxian Song, Yichun Shi, Di Chang, Jing Yang, Linjie Luo |
CVPR 2024 | highlight | 原文 | 代码 | 💡 未复现 |
Domain Prompt Learning with Quaternion Networks Qinglong Cao, Zhengqin Xu, Yuntian Chen, Chao Ma, Xiaokang Yang |
CVPR 2024 | highlight | 原文 | 代码 | 💡 未复现 |
DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing Yujun Shi, Chuhui Xue, Jun Hao Liew, Jiachun Pan, Hanshu Yan, Wenqing Zhang, Vincent Y. F. Tan, Song Bai |
CVPR 2024 | highlight | 原文 | 代码 | 💡 未复现 |
Fast ODE-based Sampling for Diffusion Models in Around 5 Steps Zhenyu Zhou, Defang Chen, Can Wang, Chun Chen |
CVPR 2024 | highlight | 原文 | 代码 | 💡 未复现 |
FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis Feng Liang, Bichen Wu, Jialiang Wang, Licheng Yu, Kunpeng Li, Yinan Zhao, Ishan Misra, Jia-Bin Huang, Peizhao Zhang, Peter Vajda, Diana Marculescu |
CVPR 2024 | highlight | 原文 | 代码 | 💡 未复现 |
General Object Foundation Model for Images and Videos at Scale Junfeng Wu, Yi Jiang, Qihao Liu, Zehuan Yuan, Xiang Bai, Song Bai |
CVPR 2024 | highlight | 原文 | 代码 | 💡 未复现 |
Insect-Foundation: A Foundation Model and Large-scale 1M Dataset for Visual Insect Understanding Hoang-Quan Nguyen, Thanh-Dat Truong, Xuan Bac Nguyen, Ashley Dowling, Xin Li, Khoa Luu |
CVPR 2024 | highlight | 原文 | 代码 | 💡 未复现 |
Move as You Say, Interact as You Can: Language-guided Human Motion Generation with Scene Affordance Zan Wang, Yixin Chen, Baoxiong Jia, Puhao Li, Jinlu Zhang, Jingze Zhang, Tengyu Liu, Yixin Zhu, Wei Liang, Siyuan Huang |
CVPR 2024 | highlight | 原文 | 代码 | 💡 未复现 |
Object Recognition as Next Token Prediction Kaiyu Yue, Bor-Chun Chen, Jonas Geiping, Hengduo Li, Tom Goldstein, Ser-Nam Lim |
CVPR 2024 | highlight | 原文 | 代码 | 💡 未复现 |
RAVE: Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models Ozgur Kara, Bariscan Kurtkaya, Hidir Yesiltepe, James M. Rehg, Pinar Yanardag |
CVPR 2024 | highlight | 原文 | 代码 | 💡 未复现 |
Readout Guidance: Learning Control from Diffusion Features Grace Luo, Trevor Darrell, Oliver Wang, Dan B Goldman, Aleksander Holynski |
CVPR 2024 | highlight | 原文 | 代码 | 💡 未复现 |
Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark Ziyang Chen, Israel D. Gebru, Christian Richardt, Anurag Kumar, William Laney, Andrew Owens, Alexander Richard |
CVPR 2024 | highlight | 原文 | 代码 | 💡 未复现 |
RichDreamer: A Generalizable Normal-Depth Diffusion Model for Detail Richness in Text-to-3D Lingteng Qiu, Guanying Chen, Xiaodong Gu, Qi Zuo, Mutian Xu, Yushuang Wu, Weihao Yuan, Zilong Dong, Liefeng Bo, Xiaoguang Han |
CVPR 2024 | highlight | 原文 | 代码 | 💡 未复现 |
RobustSAM: Segment Anything Robustly on Degraded Images Wei-Ting Chen, Yu-Jiet Vong, Sy-Yen Kuo, Sizhuo Ma, Jian Wang |
CVPR 2024 | highlight | 原文 | 代码 | 💡 未复现 |
Scaling Up Dynamic Human-Scene Interaction Modeling Nan Jiang, Zhiyuan Zhang, Hongjie Li, Xiaoxuan Ma, Zan Wang, Yixin Chen, Tengyu Liu, Yixin Zhu, Siyuan Huang |
CVPR 2024 | highlight | 原文 | 代码 | 💡 未复现 |
SDDGR: Stable Diffusion-based Deep Generative Replay for Class Incremental Object Detection Junsu Kim, Hoseong Cho, Jihyeon Kim, Yihalem Yimolal Tiruneh, Seungryul Baek |
CVPR 2024 | highlight | 原文 | 代码 | 💡 未复现 |
SpatialTracker: Tracking Any 2D Pixels in 3D Space Yuxi Xiao, Qianqian Wang, Shangzhan Zhang, Nan Xue, Sida Peng, Yujun Shen, Xiaowei Zhou |
CVPR 2024 | highlight | 原文 | 代码 | 💡 未复现 |
TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models Yushi Huang, Ruihao Gong, Jing Liu, Tianlong Chen, Xianglong Liu |
CVPR 2024 | highlight | 原文 | 代码 | 💡 未复现 |
Text Is MASS: Modeling as Stochastic Embedding for Text-Video Retrieval Jiamian Wang, Guohao Sun, Pichao Wang, Dongfang Liu, Sohail Dianat, Majid Rabbani, Raghuveer Rao, Zhiqiang Tao |
CVPR 2024 | highlight | 原文 | 代码 | 💡 未复现 |
Towards Learning a Generalist Model for Embodied Navigation Duo Zheng, Shijia Huang, Lin Zhao, Yiwu Zhong, Liwei Wang |
CVPR 2024 | highlight | 原文 | 代码 | 💡 未复现 |
UniMODE: Unified Monocular 3D Object Detection Zhuoling Li, Xiaogang Xu, SerNam Lim, Hengshuang Zhao |
CVPR 2024 | highlight | 原文 | 代码 | 💡 未复现 |
Unsupervised Keypoints from Pretrained Diffusion Models Eric Hedlin, Gopal Sharma, Shweta Mahajan, Xingzhe He, Hossam Isack, Abhishek Kar, Helge Rhodin, Andrea Tagliasacchi, Kwang Moo Yi |
CVPR 2024 | highlight | 原文 | 代码 | 💡 未复现 |
VA3: Virtually Assured Amplification Attack on Probabilistic Copyright Protection for Text-to-Image Generative Models Xiang Li, Qianli Shen, Kenji Kawaguchi |
CVPR 2024 | highlight | 原文 | 代码 | 💡 未复现 |
VecFusion: Vector Font Generation with Diffusion Vikas Thamizharasan, Difan Liu, Shantanu Agarwal, Matthew Fisher, Michael Gharbi, Oliver Wang, Alec Jacobson, Evangelos Kalogerakis |
CVPR 2024 | highlight | 原文 | 代码 | 💡 未复现 |
Wonder3D: Single Image to 3D using Cross-Domain Diffusion Xiaoxiao Long, Yuan-Chen Guo, Cheng Lin, Yuan Liu, Zhiyang Dou, Lingjie Liu, Yuexin Ma, Song-Hai Zhang, Marc Habermann, Christian Theobalt, Wenping Wang |
CVPR 2024 | highlight | 原文 | 代码 | 💡 未复现 |
VTimeLLM: Empower LLM to Grasp Video Moments Bin Huang, Xin Wang, Hong Chen, Zihan Song, Wenwu Zhu |
CVPR 2024 | highlight | 原文 | 代码 | 💡 未复现 |
LHM: Large Animatable Human Reconstruction Model from a Single Image in Seconds Lingteng Qiu, Xiaodong Gu, Peihao Li, Qi Zuo, Weichao Shen, Junfei Zhang, Kejie Qiu, Weihao Yuan, Guanying Chen, Zilong Dong, Liefeng Bo |
ICCV 2025 | poster | 原文 | 代码 | 💡 未复现 |
EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer Yuxuan Zhang, Yirui Yuan, Yiren Song, Haofan Wang, Jiaming Liu |
ICCV 2025 | poster | 原文 | 代码 | 💡 未复现 |
MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images Yuedong Chen, Haofei Xu, Chuanxia Zheng, Bohan Zhuang, Marc Pollefeys, Andreas Geiger, Tat-Jen Cham, Jianfei Cai |
ECCV 2024 | oral | 原文 | 代码 | 💡 未复现 |
FSGS: Real-Time Few-shot View Synthesis using Gaussian Splatting Zehao Zhu, Zhiwen Fan, Yifan Jiang, Zhangyang Wang |
ECCV 2024 | poster | 原文 | 代码 | 💡 未复现 |
ZigMa: A DiT-style Zigzag Mamba Diffusion Model Vincent Tao Hu, Stefan Andreas Baumann, Ming Gui, Olga Grebenkova, Pingchuan Ma, Johannes Schusterbauer, Björn Ommer |
ECCV 2024 | poster | 原文 | 代码 | 💡 未复现 |
Bridging Synthetic and Real Worlds for Pre-training Scene Text Detectors Tongkun Guan, Wei Shen, Xue Yang, Xuehui Wang, Xiaokang Yang |
ECCV 2024 | poster | 原文 | 代码 | 💡 未复现 |
PosFormer: Recognizing Complex Handwritten Mathematical Expression with Position Forest Transformer Tongkun Guan, Chengyu Lin, Wei Shen, Xiaokang Yang |
ECCV 2024 | poster | 原文 | 代码 | 💡 未复现 |
Fully Sparse 3D Occupancy Prediction Haisong Liu, Yang Chen, Haiguang Wang, Zetong Yang, Tianyu Li, Jia Zeng, Li Chen, Hongyang Li, Limin Wang |
ECCV 2024 | poster | 原文 | 代码 | 💡 未复现 |
NeRF-MAE: Masked AutoEncoders for Self-Supervised 3D Representation Learning for Neural Radiance Fields Muhammad Zubair Irshad, Sergey Zakharov, Vitor Guizilini, Adrien Gaidon, Zsolt Kira, Rares Ambrus |
ECCV 2024 | poster | 原文 | 代码 | 💡 未复现 |
ControlCap: Controllable Region-level Captioning Yuzhong Zhao, Yue Liu, Zonghao Guo, Weijia Wu, Chen Gong, Fang Wan, Qixiang Ye |
ECCV 2024 | poster | 原文 | 代码 | 💡 未复现 |
GiT: Towards Generalist Vision Transformer through Universal Language Interface Haiyang Wang, Hao Tang, Li Jiang, Shaoshuai Shi, Muhammad Ferjad Naeem, Hongsheng Li, Bernt Schiele, Liwei Wang |
ECCV 2024 | oral | 原文 | 代码 | 💡 未复现 |
Relation DETR: Exploring Explicit Position Relation Prior for Object Detection Xiuquan Hou, Meiqin Liu, Senlin Zhang, Ping Wei, Badong Chen, Xuguang Lan |
ECCV 2024 | oral | 原文 | 代码 | 💡 未复现 |
FairDomain: Achieving Fairness in Cross-Domain Medical Image Segmentation and Classification Yu Tian, Congcong Wen, Min Shi, Muhammad Muneeb Afzal, Hao Huang, Muhammad Osama Khan, Yan Luo, Yi Fang, Mengyu Wang |
ECCV 2024 | poster | 原文 | 代码 | 💡 未复现 |
OneRestore: A Universal Restoration Framework for Composite Degradation Yu Guo, Yuan Gao, Yuxu Lu, Huilin Zhu, Ryan Wen Liu, Shengfeng He |
ECCV 2024 | poster | 原文 | 代码 | 💡 未复现 |
VideoStudio: Generating Consistent-Content and Multi-Scene Videos Fuchen Long, Zhaofan Qiu, Ting Yao, Tao Mei |
ECCV 2024 | poster | 原文 | 代码 | 💡 未复现 |
Zero-shot Object Counting with Good Exemplars Huilin Zhu, Jingling Yuan, Zhengwei Yang, Yu Guo, Zheng Wang, Xian Zhong, Shengfeng He |
ECCV 2024 | poster | 原文 | 代码 | 💡 未复现 |
SPVLoc: Semantic Panoramic Viewport Matching for 6D Camera Localization in Unseen Environments Niklas Gard, Anna Hilsmann, Peter Eisert |
ECCV 2024 | oral | 原文 | 代码 | 💡 未复现 |
Stereo Any Video: Temporally Consistent Stereo Matching Junpeng Jing, Weixun Luo, Ye Mao, Krystian Mikolajczyk |
ICCV 2025 | highlight | 原文 | 代码 | 💡 未复现 |
InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity Liming Jiang, Qing Yan, Yumin Jia, Zichuan Liu, Hao Kang, Xin Lu |
ICCV 2025 | highlight | 原文 | 代码 | 💡 未复现 |
MeshAnything V2: Artist-Created Mesh Generation With Adjacent Mesh Tokenization Yiwen Chen, Yikai Wang, Yihao Luo, Zhengyi Wang, Zilong Chen, Jun Zhu, Chi Zhang, Guosheng Lin |
ICCV 2025 | poster | 原文 | 代码 | 💡 未复现 |
From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers Jiacheng Liu, Chang Zou, Yuanhuiyi Lyu, Junjie Chen, Linfeng Zhang |
ICCV 2025 | poster | 原文 | 代码 | 💡 未复现 |
MotionFollower: Editing Video Motion via Lightweight Score-Guided Diffusion Shuyuan Tu, Qi Dai, Zihao Zhang, Sicheng Xie, Zhi-Qi Cheng, Chong Luo, Xintong Han, Zuxuan Wu, Yu-Gang Jiang |
ICCV 2025 | poster | 原文 | 代码 | 💡 未复现 |
AnimateAnyMesh: A Feed-Forward 4D Foundation Model for Text-Driven Universal Mesh Animation Zijie Wu, Chaohui Yu, Fan Wang, Xiang Bai |
ICCV 2025 | poster | 原文 | 代码 | 💡 未复现 |
VSSD: Vision Mamba with Non-Causal State Space Duality Yuheng Shi, Minjing Dong, Mingjia Li, Chang Xu |
ICCV 2025 | poster | 原文 | 代码 | 💡 未复现 |
FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model Yukang Cao, Chenyang Si, Jinghao Wang, Ziwei Liu |
ICCV 2025 | poster | 原文 | 代码 | 💡 未复现 |
GENMO: A GENeralist Model for Human MOtion Jiefeng Li, Jinkun Cao, Haotian Zhang, Davis Rempe, Jan Kautz, Umar Iqbal, Ye Yuan |
ICCV 2025 | poster | 原文 | 代码 | 💡 未复现 |
Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers Weiming Ren, Wentao Ma, Huan Yang, Cong Wei, Ge Zhang, Wenhu Chen |
ICCV 2025 | poster | 原文 | 代码 | 💡 未复现 |
Medical World Model: Generative Simulation of Tumor Evolution for Treatment Planning Yijun Yang, Zhao-Yang Wang, Qiuping Liu, Shuwen Sun, Kang Wang, Rama Chellappa, Zongwei Zhou, Alan Yuille, Lei Zhu, Yu-Dong Zhang, Jieneng Chen |
ICCV 2025 | poster | 原文 | 代码 | 💡 未复现 |
Where, What, Why: Towards Explainable Driver Attention Prediction Yuchen Zhou, Jiayu Tang, Xiaoyan Xiao, Yueyao Lin, Linkai Liu, Zipeng Guo, Hao Fei, Xiaobo Xia, Chao Gou |
ICCV 2025 | poster | 原文 | 代码 | 💡 未复现 |
I Open at the Close: A Deep Reinforcement Learning Evaluation of Open Streets Initiatives R. Teal Witter, Lucas Rosenblatt |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records Scott L. Fleming, Alejandro Lozano, William J. Haberkorn, Jenelle A. Jindal, Eduardo Reis, Rahul Thapa, Louis Blankemeier, Julian Z. Genkins, Ethan Steinberg, Ashwin Nayak, Birju Patel, Chia-Chun Chiang, Alison Callahan, Zepeng Huo, Sergios Gatidis, Scott Adams, Oluseyi Fayanju, Shreya J. Shah, Thomas Savage, Ethan Goh, Akshay S. Chaudhari, Nima Aghaeepour, Christopher Sharp, Michael A. Pfeffer, Percy Liang, Jonathan H. Chen, Keith E. Morse, Emma P. Brunskill, Jason A. Fries, Nigam H. Shah |
AAAI 2023 | Technical | 原文 | 代码 | 💡 未复现 |
Multi-Modal Discussion Transformer: Integrating Text, Images and Graph Transformers to Detect Hate Speech on Social Media Liam Hebert, Gaurav Sahu, Yuxuan Guo, Nanda Kishore Sreenivas, Lukasz Golab, Robin Cohen |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
Quantile-Regression-Ensemble: A Deep Learning Algorithm for Downscaling Extreme Precipitation Thomas Bailie, Yun Sing Koh, Neelesh Rampal, Peter B. Gibson |
AAAI 2024 | Technical | 原文 | 💡 未复现 |
Spatial-Logic-Aware Weakly Supervised Learning for Flood Mapping on Earth Imagery Zelin Xu, Tingsong Xiao, Wenchong He, Yu Wang, Zhe Jiang, Shigang Chen, Yiqun Xie, Xiaowei Jia, Da Yan, Yang Zhou |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
Vector Field Oriented Diffusion Model for Crystal Material Generation Astrid Klipfel, Yael Fregier, Adlane Sayede, Zied Bouraoui |
AAAI 2023 | Technical | 原文 | 代码 | 💡 未复现 |
Periodic Graph Transformers for Crystal Material Property Prediction Keqiang Yan, Yi Liu, Yuchao Lin, Shuiwang Ji |
AAAI 2022 | Technical | 原文 | 代码 | 💡 未复现 |
ContraNovo: A Contrastive Learning Approach to Enhance De Novo Peptide Sequencing Zhi Jin, Sheng Xu, Xiang Zhang, Tianze Ling, Nanqing Dong, Wanli Ouyang, Zhiqiang Gao, Cheng Chang, Siqi Sun |
AAAI 2023 | Technical | 原文 | 代码 | 💡 未复现 |
Dual-Channel Learning Framework for Drug-Drug Interaction Prediction via Relation-Aware Heterogeneous Graph Transformer Xiaorui Su, Pengwei Hu, Zhu-Hong You, Philip S. Yu, Lun Hu |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
Explore 3D Dance Generation via Reward Model from Automatically-Ranked Demonstrations Zilin Wang, Haolin Zhuang, Lu Li, Yinmin Zhang, Junjie Zhong, Jun Chen, Yu Yang, Boshi Tang, Zhiyong Wu |
AAAI 2023 | Technical | 原文 | 代码 | 💡 未复现 |
Heterogeneous Graph Reasoning for Fact Checking over Texts and Tables Haisong Gong, Weizhi Xu, Shu Wu, Qiang Liu, Liang Wang |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
PosDiffNet: Positional Neural Diffusion for Point Cloud Registration in a Large Field of View with Perturbations Rui She, Sijie Wang, Qiyu Kang, Kai Zhao, Yang Song, Wee Peng Tay, Tianyu Geng |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
SeGA: Preference-Aware Self-Contrastive Learning with Prompts for Anomalous User Detection on Twitter Ying-Ying Chang, Wei-Yao Wang, Wen-Chih Peng |
AAAI 2023 | Technical | 原文 | 代码 | 💡 未复现 |
Text-Guided Molecule Generation with Diffusion Language Model Haisong Gong, Qiang Liu, Shu Wu, Liang Wang |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
Unsupervised Gene-Cell Collective Representation Learning with Optimal Transport Jixiang Yu, Nanjun Chen, Ming Gao, Xiangtao Li, Ka-Chun Wong |
AAAI 2024 | Technical | 原文 | 💡 未复现 |
Bi-directional Adapter for Multi-modal Tracking Bing Cao, Junliang Guo, Pengfei Zhu, Qinghua Hu |
AAAI 2023 | Technical | 原文 | 代码 | 💡 未复现 |
DanceAnyWay: Synthesizing Beat-Guided 3D Dances with Randomized Temporal Contrastive Learning Aneesh Bhattacharya, Manas Paranjape, Uttaran Bhattacharya, Aniket Bera |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
Deep Linear Array Pushbroom Image Restoration: A Degradation Pipeline and Jitter-Aware Restoration Network Zida Chen, Ziran Zhang, Haoying Li, Menghao Li, Yueting Chen, Qi Li, Huajun Feng, Zhihai Xu, Shiqi Chen |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
DiffSED: Sound Event Detection with Denoising Diffusion Swapnil Bhosale, Sauradip Nag, Diptesh Kanojia, Jiankang Deng, Xiatian Zhu |
AAAI 2023 | Technical | 原文 | 代码 | 💡 未复现 |
Domain-Controlled Prompt Learning Qinglong Cao, Zhengqin Xu, Yuntian Chen, Chao Ma, Xiaokang Yang |
AAAI 2023 | Technical | 原文 | 代码 | 💡 未复现 |
DreamIdentity: Improved Editability for Efficient Face-identity Preserved Image Generation Zhuowei Chen, Shancheng Fang, Wei Liu, Qian He, Mengqi Huang, Yongdong Zhang, Zhendong Mao |
AAAI 2023 | Technical | 原文 | 代码 | 💡 未复现 |
DreamStyler: Paint by Style Inversion with Text-to-Image Diffusion Models Namhyuk Ahn, Junsoo Lee, Chunggi Lee, Kunhee Kim, Daesik Kim, Seung-Hun Nam, Kibeom Hong |
AAAI 2023 | Technical | 原文 | 代码 | 💡 未复现 |
Evaluate Geometry of Radiance Fields with Low-frequency Color Prior Qihang Fang, Yafei Song, Keqiang Li, Li Shen, Huaiyu Wu, Gang Xiong, Liefeng Bo |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE Junyi Chen, Longteng Guo, Jia Sun, Shuai Shao, Zehuan Yuan, Liang Lin, Dongyu Zhang |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
Exploiting Polarized Material Cues for Robust Car Detection Wen Dong, Haiyang Mei, Ziqi Wei, Ao Jin, Sen Qiu, Qiang Zhang, Xin Yang |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
Federated Modality-specific Encoders and Multimodal Anchors for Personalized Brain Tumor Segmentation Qian Dai, Dong Wei, Hong Liu, Jinghan Sun, Liansheng Wang, Yefeng Zheng |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
Markerless Multi-view 3D Human Pose Estimation: a survey Ana Filipa Rodrigues Nogueira, Hélder P. Oliveira, Luís F. Teixeira |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
Generating and Reweighting Dense Contrastive Patterns for Unsupervised Anomaly Detection Songmin Dai, Yifan Wu, Xiaoqiang Li, Xiangyang Xue |
AAAI 2023 | Technical | 原文 | 代码 | 💡 未复现 |
SimROD: A Simple Baseline for Raw Object Detection with Global and Local Enhancements Haiyang Xie, Xi Shen, Shihua Huang, Qirui Wang, Zheng Wang |
AAAI 2025 | Technical | 原文 | 代码 | 💡 未复现 |
Hyp-OW: Exploiting Hierarchical Structure Learning with Hyperbolic Distance Enhances Open World Object Detection Thang Doan, Xin Li, Sima Behpour, Wenbin He, Liang Gou, Liu Ren |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
iDet3D: Towards Efficient Interactive Object Detection for LiDAR Point Clouds Dongmin Choi, Wonwoo Cho, Kangyeol Kim, Jaegul Choo |
AAAI 2023 | Technical | 原文 | 代码 | 💡 未复现 |
Image Safeguarding: Reasoning with Conditional Vision Language Model and Obfuscating Unsafe Content Counterfactually Mazal Bethany, Brandon Wherry, Nishant Vishwamitra, Peyman Najafirad |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
Improving Diffusion-Based Image Synthesis with Context Prediction Ling Yang, Jingwei Liu, Shenda Hong, Zhilong Zhang, Zhilin Huang, Zheming Cai, Wentao Zhang, Bin Cui |
AAAI 2024 | Technical | 原文 | 💡 未复现 |
UniPre3D: Unified Pre-training of 3D Point Cloud Models with Cross-Modal Gaussian Splatting Ziyi Wang, Yanran Zhang, Jie Zhou, Jiwen Lu |
AAAI 2025 | Technical | 原文 | 代码 | 💡 未复现 |
SpatioTemporal Difference Network for Video Depth Super-Resolution Zhengxue Wang, Yuan Wu, Xiang Li, Zhiqiang Yan, Jian Yang |
AAAI 2025 | Technical | 原文 | 代码 | 💡 未复现 |
Iterative Token Evaluation and Refinement for Real-World Super-Resolution Chaofeng Chen, Shangchen Zhou, Liang Liao, Haoning Wu, Wenxiu Sun, Qiong Yan, Weisi Lin |
AAAI 2023 | Technical | 原文 | 代码 | 💡 未复现 |
LDMVFI: Video Frame Interpolation with Latent Diffusion Models Duolikun Danier, Fan Zhang, David Bull |
AAAI 2023 | Technical | 原文 | 代码 | 💡 未复现 |
Joint Demosaicing and Denoising for Spike Camera Yanchen Dong, Ruiqin Xiong, Jing Zhao, Jian Zhang, Xiaopeng Fan, Shuyuan Zhu, Tiejun Huang |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
De-LightSAM: Modality-Decoupled Lightweight SAM for Generalizable Medical Segmentation Qing Xu, Jiaxuan Li, Xiangjian He, Chenxin Li, Fiseha B. Tesem, Wenting Duan, Zhen Chen, Rong Qu, Jonathan M. Garibaldi, Chang Wen Chen |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
LogoStyleFool: Vitiating Video Recognition Systems via Logo Style Transfer Yuxin Cao, Ziyu Zhao, Xi Xiao, Derui Wang, Minhui Xue, Jin Lu |
AAAI 2024 | Technical | 原文 | 💡 未复现 |
M-BEV: Masked BEV Perception for Robust Autonomous Driving Siran Chen, Yue Ma, Yu Qiao, Yali Wang |
AAAI 2023 | Technical | 原文 | 代码 | 💡 未复现 |
MeDM: Mediating Image Diffusion Models for Video-to-Video Translation with Temporal Correspondence Guidance Ernie Chu, Tzuhsuan Huang, Shuo-Yen Lin, Jun-Cheng Chen |
AAAI 2023 | Technical | 原文 | 代码 | 💡 未复现 |
Mimic: Speaking Style Disentanglement for Speech-Driven 3D Facial Animation Hui Fu, Zeqing Wang, Ke Gong, Keze Wang, Tianshui Chen, Haojie Li, Haifeng Zeng, Wenxiong Kang |
AAAI 2023 | Technical | 原文 | 代码 | 💡 未复现 |
NILUT: Conditional Neural Implicit 3D Lookup Tables for Image Enhancement Marcos V. Conde, Javier Vazquez-Corral, Michael S. Brown, Radu Timofte |
AAAI 2023 | Technical | 原文 | 代码 | 💡 未复现 |
A Range-Null Space Decomposition Approach for Fast and Flexible Spectral Compressive Imaging Junyu Wang, Shijie Wang, Ruijie Zhang, Zengqiang Zheng, Wenyu Liu, Xinggang Wang |
AAAI 2023 | Technical | 原文 | 代码 | 💡 未复现 |
Mixed Degradation Image Restoration via Local Dynamic Optimization and Conditional Embedding Yubin Gu, Yuan Meng, Xiaoshuai Sun, Jiayi Ji, Weijian Ruan, Rongrong Ji |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
Revisiting Point Cloud Completion: Are We Ready For The Real-World? Stuti Pathak, Prashant Kumar, Dheeraj Baiju, Nicholus Mboga, Gunther Steenackers, Rudi Penne |
AAAI 2024 | Technical | 原文 | 💡 未复现 |
PuzzleFusion++: Auto-agglomerative 3D Fracture Assembly by Denoise and Verify Zhengqing Wang, Jiacheng Chen, Yasutaka Furukawa |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
PNeSM: Arbitrary 3D Scene Stylization via Prompt-Based Neural Style Mapping Jiafu Chen, Wei Xing, Jiakai Sun, Tianyi Chu, Yiling Huang, Boyan Ji, Lei Zhao, Huaizhong Lin, Haibo Chen, Zhizhong Wang |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
PPEA-Depth: Progressive Parameter-Efficient Adaptation for Self-Supervised Monocular Depth Estimation Yue-Jiang Dong, Yuan-Chen Guo, Ying-Tian Liu, Fang-Lue Zhang, Song-Hai Zhang |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
ResMatch: Residual Attention Learning for Local Feature Matching Yuxin Deng, Jiayi Ma |
AAAI 2023 | Technical | 原文 | 代码 | 💡 未复现 |
Bidirectional Multi-Scale Implicit Neural Representations for Image Deraining Xiang Chen, Jinshan Pan, Jiangxin Dong |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
Self-Supervised Bird’s Eye View Motion Prediction with Cross-Modality Signals Shaoheng Fang, Zuhong Liu, Mingyu Wang, Chenxin Xu, Yiqi Zhong, Siheng Chen |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
SHaRPose: Sparse High-Resolution Representation for Human Pose Estimation Xiaoqi An, Lin Zhao, Chen Gong, Nannan Wang, Di Wang, Jian Yang |
AAAI 2023 | Technical | 原文 | 代码 | 💡 未复现 |
Simple Image-level Classification Improves Open-vocabulary Object Detection Ruohuan Fang, Guansong Pang, Xiao Bai |
AAAI 2023 | Technical | 原文 | 代码 | 💡 未复现 |
SparseGNV: Generating Novel Views of Indoor Scenes with Sparse Input Views Weihao Cheng, Yan-Pei Cao, Ying Shan |
AAAI 2023 | Technical | 原文 | 代码 | 💡 未复现 |
A Systematic Investigation on Deep Learning-Based Omnidirectional Image and Video Super-Resolution Qianqian Zhao, Chunle Guo, Tianyi Zhang, Junpei Zhang, Peiyang Jia, Tan Su, Wenjie Jiang, Chongyi Li |
AAAI 2025 | Technical | 原文 | 代码 | 💡 未复现 |
TDeLTA: A Light-weight and Robust Table Detection Method based on Learning Text Arrangement Yang Fan, Xiangping Wu, Qingcai Chen, Heng Li, Yan Huang, Zhixiang Cai, Qitian Wu |
AAAI 2023 | Technical | 原文 | 💡 未复现 |
Variance-insensitive and Target-preserving Mask Refinement for Interactive Image Segmentation Chaowei Fang, Ziyin Zhou, Junye Chen, Hanjing Su, Qingyao Wu, Guanbin Li |
AAAI 2023 | Technical | 原文 | 💡 未复现 |
VIXEN: Visual Text Comparison Network for Image Difference Captioning Alexander Black, Jing Shi, Yifei Fan, Tu Bui, John Collomosse |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
Weak Distribution Detectors Lead to Stronger Generalizability of Vision-Language Prompt Tuning Kun Ding, Haojian Zhang, Qiang Yu, Ying Wang, Shiming Xiang, Chunhong Pan |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
WebVLN: Vision-and-Language Navigation on Websites Qi Chen, Dileepa Pitawela, Chongyang Zhao, Gengze Zhou, Hsiang-Ting Chen, Qi Wu |
AAAI 2023 | Technical | 原文 | 代码 | 💡 未复现 |
WeditGAN: Few-Shot Image Generation via Latent Space Relocation Yuxuan Duan, Li Niu, Yan Hong, Liqing Zhang |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
3D Visibility-aware Generalizable Neural Radiance Fields for Interacting Hands Xuan Huang, Hanhui Li, Zejun Yang, Zhisheng Wang, Xiaodan Liang |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
MAC: A Benchmark for Multiple Attribute Compositional Zero-Shot Learning Xuan Huang, Hanhui Li, Zejun Yang, Zhisheng Wang, Xiaodan Liang |
AAAI 2025 | Technical | 原文 | 代码 | 💡 未复现 |
A General Implicit Framework for Fast NeRF Composition and Rendering Xinyu Gao, Ziyi Yang, Yunlu Zhao, Yuxiang Sun, Xiaogang Jin, Changqing Zou |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
A User-Friendly Framework for Generating Model-Preferred Prompts in Text-to-Image Synthesis Nailei Hei, Qianyu Guo, Zihao Wang, Yan Wang, Haofen Wang, Wenqiang Zhang |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
AnomalyGPT: Detecting Industrial Anomalies Using Large Vision-Language Models Zhaopeng Gu, Bingke Zhu, Guibo Zhu, Yingying Chen, Ming Tang, Jinqiao Wang |
AAAI 2023 | Technical | 原文 | 代码 | 💡 未复现 |
Arbitrary-Scale Video Super-Resolution with Structural and Textural Priors Wei Shang, Dongwei Ren, Wanying Zhang, Yuming Fang, Wangmeng Zuo, Kede Ma |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions Wenbo Hu, Yifan Xu, Yi Li, Weiyue Li, Zeyuan Chen, Zhuowen Tu |
AAAI 2023 | Technical | 原文 | 代码 | 💡 未复现 |
Depth-Guided Robust and Fast Point Cloud Fusion NeRF for Sparse Input Views Shuai Guo, Qiuwen Wang, Yijie Gao, Rong Xie, Li Song |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
Distribution Matching for Multi-Task Learning of Classification Tasks: a Large-Scale Study on Faces & Beyond Dimitrios Kollias, Viktoriia Sharmanska, Stefanos Zafeiriou |
AAAI 2024 | Technical | 原文 | 💡 未复现 |
Open-Vocabulary HOI Detection with Interaction-aware Prompt and Concept Calibration Ting Lei, Shaofeng Yin, Qingchao Chen, Yuxin Peng, Yang Liu |
AAAI 2025 | Technical | 原文 | 代码 | 💡 未复现 |
Expand-and-Quantize: Unsupervised Semantic Segmentation Using High-Dimensional Space and Product Quantization Jiyoung Kim, Kyuhong Shim, Insu Lee, Byonghyo Shim |
AAAI 2023 | Technical | 原文 | 💡 未复现 |
Expediting Contrastive Language-Image Pretraining via Self-distilled Encoders Bumsoo Kim, Jinhyung Kim, Yeonsik Jo, Seung Hwan Kim |
AAAI 2023 | Technical | 原文 | 代码 | 💡 未复现 |
Exploring Self- and Cross-Triplet Correlations for Human-Object Interaction Detection Weibo Jiang, Weihong Ren, Jiandong Tian, Liangqiong Qu, Zhiyong Wang, Honghai Liu |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
Fine-Grained Multi-View Hand Reconstruction Using Inverse Rendering Qijun Gan, Wentong Li, Jinwei Ren, Jianke Zhu |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
Frequency-Adaptive Pan-Sharpening with Mixture of Experts Xuanhua He, Keyu Yan, Rui Li, Chengjun Xie, Jie Zhang, Man Zhou |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
Frequency-Controlled Diffusion Model for Versatile Text-Guided Image-to-Image Translation Xiang Gao, Zhengbo Xu, Junhan Zhao, Jiaying Liu |
AAAI 2025 | Technical | 原文 | 代码 | 💡 未复现 |
GSN: Generalisable Segmentation in Neural Radiance Field Vinayak Gupta, Rahul Goel, Sirikonda Dhawal, P. J. Narayanan |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
Hand-Centric Motion Refinement for 3D Hand-Object Interaction via Hierarchical Spatial-Temporal Modeling Yuze Hao, Jianrong Zhang, Tao Zhuo, Fuan Wen, Hehe Fan |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
High-Fidelity Diffusion-based Image Editing Chen Hou, Guoqiang Wei, Zhibo Chen |
AAAI 2024 | Technical | 原文 | 💡 未复现 |
HuTuMotion: Human-Tuned Navigation of Latent Motion Diffusion Models with Minimal Feedback Gaoge Han, Shaoli Huang, Mingming Gong, Jinglei Tang |
AAAI 2023 | Technical | 原文 | 代码 | 💡 未复现 |
Exploring Phrase-Level Grounding with Text-to-Image Diffusion Model Danni Yang, Ruohan Dong, Jiayi Ji, Yiwei Ma, Haowei Wang, Xiaoshuai Sun, Rongrong Ji |
AAAI 2024 | Technical | 原文 | 代码 | 💡 未复现 |
2DMamba: Efficient State Space Model for Image Representation with Applications on Giga-Pixel Whole Slide Image Classification Jingwei Zhang, Anh Tien Nguyen, Xi Han, Vincent Quoc-Huy Trinh, Hong Qin, Dimitris Samaras, Mahdi S. Hosseini |
CVPR 2025 | poster | 原文 | 代码 | 💡 未复现 |
3D Dental Model Segmentation with Geometrical Boundary Preserving Shufan Xi, Zexian Liu, Junlin Chang, Hongyu Wu, Xiaogang Wang, Aimin Hao |
CVPR 2025 | poster | 原文 | 代码 | 💡 未复现 |
3D Gaussian Head Avatars with Expressive Dynamic Appearances by Compact Tensorial Representations Yating Wang, Xuan Wang, Ran Yi, Yanbo Fan, Jichen Hu, Jingcheng Zhu, Lizhuang Ma |
CVPR 2025 | poster | 原文 | 代码 | 💡 未复现 |
3D Gaussian Inpainting with Depth-Guided Cross-View Consistency Sheng-Yu Huang, Zi-Ting Chou, Yu-Chiang Frank Wang |
CVPR 2025 | poster | 原文 | 代码 | 💡 未复现 |
3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination Jianing Yang, Xuweiyi Chen, Nikhil Madaan, Madhavan Iyengar, Shengyi Qian, David F. Fouhey, Joyce Chai |
CVPR 2025 | poster | 原文 | 代码 | 💡 未复现 |
3D-GSW: 3D Gaussian Splatting for Robust Watermarking Youngdong Jang, Hyunje Park, Feng Yang, Heeju Ko, Euijin Choo, Sangpil Kim |
CVPR 2025 | poster | 原文 | 代码 | 💡 未复现 |
3D-HGS: 3D Half-Gaussian Splatting Haolin Li, Jinyang Liu, Mario Sznaier, Octavia Camps |
CVPR 2025 | poster | 原文 | 代码 | 💡 未复现 |
3D-LLaVA: Towards Generalist 3D LMMs with Omni Superpoint Transformer Jiajun Deng, Tianyu He, Li Jiang, Tianyu Wang, Feras Dayoub, Ian Reid |
CVPR 2025 | poster | 原文 | 代码 | 💡 未复现 |
3D-Mem: 3D Scene Memory for Embodied Exploration and Reasoning Yuncong Yang, Han Yang, Jiachen Zhou, Peihao Chen, Hongxin Zhang, Yilun Du, Chuang Gan |
CVPR 2025 | poster | 原文 | 代码 | 💡 未复现 |
3DENHANCER: Consistent Multi-View Diffusion for 3D Enhancement Yihang Luo, Shangchen Zhou, Yushi Lan, Xingang Pan, Chen Change Loy |
CVPR 2025 | poster | 原文 | 代码 | 💡 未复现 |
3DGUT: Enabling Distorted Cameras and Secondary Rays in Gaussian Splatting Qi Wu, Janick Martinez Esturo, Ashkan Mirzaei, Nicolas Moenne-Loccoz, Zan Gojcic |
CVPR 2025 | poster | 原文 | 代码 | 💡 未复现 |
4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models Wanhua Li, Renping Zhou, Jiawei Zhou, Yingwei Song, Johannes Herter, Minghan Qin, Gao Huang, Hanspeter Pfister |
CVPR 2025 | poster | 原文 | 代码 | 💡 未复现 |
4DTAM: Non-Rigid Tracking and Mapping via Dynamic Surface Gaussians Hidenobu Matsuki, Gwangbin Bae, Andrew J. Davison |
CVPR 2025 | poster | 原文 | 代码 | 💡 未复现 |
A Bias-Free Training Paradigm for More General AI-generated Image Detection Fabrizio Guillaro, Giada Zingarini, Ben Usman, Avneesh Sud, Davide Cozzolino, Luisa Verdoliva |
CVPR 2025 | poster | 原文 | 代码 | 💡 未复现 |
A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning Xin Wen, Bingchen Zhao, Yilun Chen, Jiangmiao Pang, Xiaojuan Qi |
CVPR 2025 | poster | 原文 | 代码 | 💡 未复现 |