CVPR'23 最新 70 篇论文分方向整理｜包含目标检测、图像处理、人脸、医学影像、半监督

2023-03-20 10:10 作者:极市平台 0人读过 | 我要投稿

编辑丨极市平台

CVPR2023已经放榜，今年有2360篇，接收率为25.78%。在CVPR2023正式会议召开前，为了让大家更快地获取和学习到计算机视觉前沿技术，极市对CVPR023 最新论文进行追踪，包括分研究方向的论文、代码汇总以及论文技术直播分享。

CVPR 2023 论文分方向整理目前在极市社区持续更新中，已累计更新了158篇，项目地址：https://www.cvmart.net/community/detail/7422

以下是最近更新的 CVPR 2023 论文，包含目标检测、图像处理、人脸、场景重建、医学影像、半监督学习/弱监督学习/无监督学习/自监督学习等方向。

可打包下载：https://www.cvmart.net/community/detail/7429

检测

2D目标检测(2D Object Detection

[1]CapDet: Unifying Dense Captioning and Open-World Detection Pretraining

paper：https://arxiv.org/abs/2303.02489

[2]Enhanced Training of Query-Based Object Detection via Selective Query Recollection

paper：https://arxiv.org/abs/2212.07593

code：https://github.com/Fangyi-Chen/SQR

[3]DETRs with Hybrid Matching

paper：https://arxiv.org/abs/2207.13080

code：https://github.com/HDETR

[4]YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors(YOLOv7

paper：https://arxiv.org/abs/2207.02696

code：https://github.com/WongKinYiu/yolov7

视频目标检测(Video Object Detection

[1]SCOTCH and SODA: A Transformer Video Shadow Detection Framework
paper：https://arxiv.org/abs/2211.06885

3D目标检测(3D object detection

[1]MSMDFusion: Fusing LiDAR and Camera at Multiple Scales with Multi-Depth Seeds for 3D Object Detection

paper：https://arxiv.org/abs/2209.03102

code：https://github.com/sxjyjay/msmdfusion

[2]Uni3D: A Unified Baseline for Multi-dataset 3D Object Detection

paper：https://arxiv.org/abs/2303.06880

code：https://github.com/PJLab-ADG/3DTrans

[3]LoGoNet: Towards Accurate 3D Object Detection with Local-to-Global Cross-Modal Fusion

paper：https://arxiv.org/abs/2303.03595

code：https://github.com/sankin97/LoGoNet

[4]ConQueR: Query Contrast Voxel-DETR for 3D Object Detection(3D 目标检测的Query Contrast Voxel-DETR

paper：https://arxiv.org/abs/2212.07289

code：https://github.com/poodarchu/ConQueR

显著性目标检测(Saliency Object Detection

[1]Texture-guided Saliency Distilling for Unsupervised Salient Object Detection

paper：https://arxiv.org/abs/2207.05921

code：https://github.com/moothes/A2S-v2

车道线检测(Lane Detection

[1]BEV-LaneDet: a Simple and Effective 3D Lane Detection Baseline

paper：https://arxiv.org/abs/2210.06006

异常检测(Anomaly Detection

[1]Block Selection Method for Using Feature Norm in Out-of-distribution Detection

paper：https://arxiv.org/abs/2212.02295

[2]Lossy Compression for Robust Unsupervised Time-Series Anomaly Detection

paper：https://arxiv.org/abs/2212.02303

[3]Multimodal Industrial Anomaly Detection via Hybrid Fusion

paper：https://arxiv.org/abs/2303.00601

code：https://github.com/nomewang/M3DM

分割(Segmentation

图像分割(Image Segmentation

[1]MP-Former: Mask-Piloted Transformer for Image Segmentation
paper：https://arxiv.org/abs/2303.07336

code：https://github.com/IDEA-Research/MP-Former

[2]Interactive Segmentation as Gaussian Process Classification

paper：https://arxiv.org/abs/2302.14578

语义分割(Semantic Segmentation

[1]Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP

paper：http://arxiv.org/abs/2210.04150

code：https://github.com/facebookresearch/ov-seg

[2]Efficient Semantic Segmentation by Altering Resolutions for Compressed Videos

paper：https://arxiv.org/abs/2303.07224

code：https://github.com/THU-LYJ-Lab/AR-Seg

[3]SCPNet: Semantic Scene Completion on Point Cloud

paper：https://arxiv.org/abs/2303.06884

[4]On Calibrating Semantic Segmentation Models: Analyses and An Algorithm

paper：https://arxiv.org/abs/2212.12053

[5]Learning Open-vocabulary Semantic Segmentation Models From Natural Language Supervision

paper：https://arxiv.org/abs/2301.09121

[6]Revisiting Weak-to-Strong Consistency in Semi-Supervised Semantic Segmentation

paper：https://arxiv.org/abs/2208.09910

code：https://github.com/LiheYoung/UniMatch

[7]Foundation Model Drives Weakly Incremental Learning for Semantic Segmentation

paper：https://arxiv.org/abs/2302.14250

实例分割(Instance Segmentation

[1]ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution

paper：https://arxiv.org/abs/2303.00246

[22]PolyFormer: Referring Image Segmentation as Sequential Polygon Generation(PolyFormer：将图像分割表述为顺序多边形生成

paper：https://arxiv.org/abs/2302.07387

目标跟踪(Object Tracking

[1]Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking

paper：https://arxiv.org/abs/2203.14360v2

code：https://github.com/noahcao/OC_SORT

[2]Focus On Details: Online Multi-object Tracking with Diverse Fine-grained Representation

paper：https://arxiv.org/abs/2302.14589

[3]Referring Multi-Object Tracking

paper：https://arxiv.org/abs/2303.03366

[4]Simple Cues Lead to a Strong Multi-Object Tracker

paper：https://arxiv.org/abs/2206.04656

图像处理(Image Processing

超分辨率(Super Resolution

[1]Denoising Diffusion Probabilistic Models for Robust Image Super-Resolution in the Wild(野外鲁棒图像超分辨率的去噪扩散概率模型

paper：https://arxiv.org/abs/2302.07864

project：https://sihyun.me/PVDM/

图像复原/图像增强/图像重建(Image Restoration/Image Reconstruction

[1]Learning Distortion Invariant Representation for Image Restoration from A Causality Perspective

paper：https://arxiv.org/abs/2303.06859

code：https://github.com/lixinustc/Casual-IRDIL

[2]DR2: Diffusion-based Robust Degradation Remover for Blind Face Restoration

paper：https://arxiv.org/abs/2303.06885

[3]Robust Unsupervised StyleGAN Image Restoration

paper：https://arxiv.org/abs/2302.06733

[4]Raw Image Reconstruction with Learned Compact Metadata

paper：https://arxiv.org/abs/2302.12995

[5]Efficient and Explicit Modelling of Image Hierarchies for Image Restoration

paper：https://arxiv.org/abs/2303.00748

code：https://github.com/ofsoundof/GRL-Image-Restoration

[6]Imagic: Text-Based Real Image Editing with Diffusion Models

paper：https://arxiv.org/abs/2210.09276

project：https://imagic-editing.github.io/

[7]High-resolution image reconstruction with latent diffusion models from human brain activity

paper：https://www.biorxiv.org/content/10.1101/2022.11.18.517004v2

project：https://sites.google.com/view/stablediffusion-with-brain/

[8]Solving 3D Inverse Problems using Pre-trained 2D Diffusion Models

paper：https://arxiv.org/abs/2211.10655

图像去噪/去模糊/去雨去雾(Image Denoising

[1]Uncertainty-Aware Unsupervised Image Deblurring with Deep Residual Prior

paper：https://arxiv.org/abs/2210.05361

[2]Polarized Color Image Denoising using Pocoformer

paper：https://arxiv.org/abs/2207.00215

[3]Blur Interpolation Transformer for Real-World Motion from Blur

paper：https://arxiv.org/abs/2211.11423

code：https://github.com/zzh-tech/BiT

[4]Structured Kernel Estimation for Photon-Limited Deconvolution

paper：https://arxiv.org/abs/2303.03472

code：https://github.com/sanghviyashiitb/structured-kernel-cvpr23

图像编辑/图像修复(Image Edit/Inpainting

[1]LANIT: Language-Driven Image-to-Image Translation for Unlabeled Data

paper：https://arxiv.org/abs/2208.14889

code：https://github.com/KU-CVLAB/LANIT

图像质量评估(Image Quality Assessment

[1]CR-FIQA: Face Image Quality Assessment by Learning Sample Relative Classifiability

paper：https://arxiv.org/abs/2112.06592

[2]Quality-aware Pre-trained Models for Blind Image Quality Assessment

paper：https://arxiv.org/abs/2303.00521

图像配准(Image Registration

[1]Indescribable Multi-modal Spatial Evaluator

paper：https://arxiv.org/abs/2303.00369

code：https://github.com/Kid-Liet/IMSE/pulse

人脸(Face

人脸生成/合成/重建/编辑(Face Generation/Face Synthesis/Face Reconstruction/Face Editing

[1]A Hierarchical Representation Network for Accurate and Detailed Face Reconstruction from In-The-Wild Images

paper：https://arxiv.org/abs/2302.14434

[2]MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation(MetaPortrait：具有快速个性化适应的身份保持谈话头像生成

paper：https://arxiv.org/abs/2212.08062

code：https://github.com/Meta-Portrait/MetaPortrait

人脸伪造/反欺骗(Face Forgery/Face Anti-Spoofing

[1]Physical-World Optical Adversarial Attacks on 3D Face Recognition

paper：https://arxiv.org/abs/2205.13412

医学影像(Medical Imaging

[1]Deep Feature In-painting for Unsupervised Anomaly Detection in X-ray Images

paper：https://arxiv.org/pdf/2111.13495.pdf

code：https://github.com/tiangexiang/SQUID

[2]Label-Free Liver Tumor Segmentation

paper：https://arxiv.org/pdf/2210.14845.pdf

code：https://github.com/MrGiovanni/SyntheticTumors

图像生成/图像合成(Image Generation/Image Synthesis

[1]DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation

paper：https://arxiv.org/abs/2208.12242

code：https://github.com/PaddlePaddle/PaddleNLP/tree/develop/ppdiffusers/examples/dreambooth

[2]Progressive Open Space Expansion for Open-Set Model Attribution

paper：https://arxiv.org/abs/2303.06877

code：https://github.com/tianyunyoung/pose

[3]Person Image Synthesis via Denoising Diffusion Model

paper：https://arxiv.org/abs/2211.12500

[4]Solving 3D Inverse Problems using Pre-trained 2D Diffusion Models(使用预训练的 2D 扩散模型解决 3D 逆问题

paper：https://arxiv.org/abs/2211.10655

[5]Parallel Diffusion Models of Operator and Image for Blind Inverse Problems(盲反问题算子和图像的并行扩散模型

paper：https://arxiv.org/abs/2211.10656

场景重建/视图合成/新视角合成(Novel View Synthesis

[1]3D Video Loops from Asynchronous Input

paper：https://arxiv.org/abs/2303.05312

code：https://github.com/limacv/VideoLoop3D

[2]NeRFLiX: High-Quality Neural View Synthesis by Learning a Degradation-Driven Inter-viewpoint MiXer

paper：https://arxiv.org/abs/2303.06919

code：https://t.co/uNiTd9ujCv

[3]NeRF-Gaze: A Head-Eye Redirection Parametric Model for Gaze Estimation

paper：https://arxiv.org/abs/2212.14710

[4]Renderable Neural Radiance Map for Visual Navigation

paper：https://arxiv.org/abs/2303.00304

[5]Real-Time Neural Light Field on Mobile Devices

paper：https://arxiv.org/abs/2212.08057

project：https://snap-research.github.io/MobileR2L/

[6]Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures

paper：https://arxiv.org/abs/2211.07600

code：https://github.com/eladrich/latent-nerf

[7]NoPe-NeRF: Optimising Neural Radiance Field with No Pose Prior

paper：https://arxiv.org/abs/2212.07388

project：https://nope-nerf.active.vision/

多模态学习(Multi-Modal Learning

[1]Align and Attend: Multimodal Summarization with Dual Contrastive Losses

paper：https://arxiv.org/abs/2303.07284

code：https://boheumd.github.io/A2Summ/

[2]Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information(通过最大化多模态互信息实现一体化预训练

paper：https://arxiv.org/abs/2211.09807

code：https://github.com/OpenGVLab/M3I-Pretraining

[3]Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks(Uni-Perceiver v2：用于大规模视觉和视觉语言任务的通才模型

paper：https://arxiv.org/abs/2211.09808

code：https://github.com/fundamentalvision/Uni-Perceiver

半监督学习/弱监督学习/无监督学习/自监督学习(Self-supervised Learning/Semi-supervised Learning)

[1]The Dialog Must Go On: Improving Visual Dialog via Generative Self-Training

paper：https://arxiv.org/abs/2205.12502

code：https://github.com/gicheonkang/gst-visdial

[2]Three Guidelines You Should Know for Universally Slimmable Self-Supervised Learning

paper：https://arxiv.org/abs/2303.06870

code：https://github.com/megvii-research/US3L-CVPR2023

[3]Mask3D: Pre-training 2D Vision Transformers by Learning Masked 3D Priors

paper：https://arxiv.org/abs/2302.14746

[4]Siamese Image Modeling for Self-Supervised Vision Representation Learning

paper：https://arxiv.org/abs/2206.01204

code：https://github.com/fundamentalvision/Siamese-Image-Modeling

[5]Cut and Learn for Unsupervised Object Detection and Instance Segmentation

paper：https://arxiv.org/abs/2301.11320

project：http://people.eecs.berkeley.edu

标签：人工智能计算机视觉深度学习