
CVPR'23: Latest 89 Papers Bundled for Download | Covering Video Object Detection, Keypoint Detection, Anomaly Detection, and More

2023-04-04 11:46 Author: 极市平台

Editor: 极市平台

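The CVPR 2023 results are out: 2,360 papers were accepted this year, an acceptance rate of 25.78%. Ahead of the main conference, to help readers reach and learn from the latest computer vision research sooner, 极市 is tracking the newest CVPR 2023 papers. This includes roundups of papers and code organized by research direction, as well as live-streamed technical talks on the papers.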

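The by-direction roundup of CVPR 2023 papers is being updated continuously in the 极市 community and has covered 693 papers so far. Project address: cvmart.net/community/de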

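Below are the most recently added CVPR 2023 papers, covering detection, segmentation, faces, video processing, medical imaging, neural network architecture, multimodal learning, few-shot learning, and other directions.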

Bundle download address: cvmart.net/community/de

2D Object Detection

[1]What Can Human Sketches Do for Object Detection?
paper:arxiv.org/abs/2303.1514

Video Object Detection

[1]Collaborative Noisy Label Cleaner: Learning Scene-aware Trailers for Multi-modal Highlight Detection in Movies
paper:arxiv.org/abs/2303.1476 code:github.com/tencentyoutu

[2]3D Video Object Detection with Learnable Object-Centric Global Optimization
paper:arxiv.org/abs/2303.1541 code:github.com/jiaweihe1996

3D Object Detection

[1]Learned Two-Plane Perspective Prior based Image Resampling for Efficient Object Detection
paper:arxiv.org/abs/2303.1431

[2]Adaptive Sparse Convolutional Networks with Global Context Enhancement for Faster Object Detection on Drone Images
paper:arxiv.org/abs/2303.1448 code:github.com/cuogeihong/c

[3]Viewpoint Equivariance for Multi-View 3D Object Detection
paper:arxiv.org/abs/2303.1454 code:github.com/tri-ml/vedet

Camouflaged Object Detection

[1]Feature Shrinkage Pyramid for Camouflaged Object Detection with Transformers
paper:arxiv.org/abs/2303.1481 code:github.com/zhouhuang23/

Keypoint Detection

[1]Unified Keypoint-based Action Recognition Framework via Structured Keypoint Pooling
paper:arxiv.org/abs/2303.1527

Anomaly Detection

[1]WinCLIP: Zero-/Few-Shot Anomaly Classification and Segmentation
paper:arxiv.org/abs/2303.1481

[2]SimpleNet: A Simple Network for Image Anomaly Detection and Localization
paper:arxiv.org/abs/2303.1514 code:github.com/donaldrr/sim

[3]Prompt-Guided Zero-Shot Anomaly Action Recognition using Pretrained Deep Skeleton Features
paper:arxiv.org/abs/2303.1516

Image Segmentation

[1]Parameter Efficient Local Implicit Image Function Network for Face Segmentation
paper:arxiv.org/abs/2303.1512

[2]EFEM: Equivariant Neural Field Expectation Maximization for 3D Object Segmentation Without Scene Supervision
paper:arxiv.org/abs/2303.1544

Panoptic Segmentation

[1]You Only Segment Once: Towards Real-Time Panoptic Segmentation
paper:arxiv.org/abs/2303.1465 code:github.com/hujiecpp/yos

Semantic Segmentation

[1]Both Style and Distortion Matter: Dual-Path Unsupervised Domain Adaptation for Panoramic Semantic Segmentation
paper:arxiv.org/abs/2303.1436

[2]Instant Domain Augmentation for LiDAR Semantic Segmentation
paper:arxiv.org/abs/2303.1437

[3]Leveraging Hidden Positives for Unsupervised Semantic Segmentation
paper:arxiv.org/abs/2303.1501 code:github.com/hynnsk/hp

Instance Segmentation

[1]DoNet: Deep De-overlapping Network for Cytology Instance Segmentation
paper:arxiv.org/abs/2303.1437

[2]The Devil is in the Points: Weakly Semi-Supervised Instance Segmentation via Point-Guided Mask Representation
paper:arxiv.org/abs/2303.1506

Video Object Segmentation

[1]Spatio-Temporal Pixel-Level Contrastive Learning-based Source-Free Domain Adaptation for Video Semantic Segmentation
paper:arxiv.org/abs/2303.1436 code:github.com/shaoyuanlo/s

Dense Prediction

[1]Ensemble-based Blackbox Attacks on Dense Prediction
paper:arxiv.org/abs/2303.1430

[2]Ambiguity-Resistant Semi-Supervised Learning for Dense Object Detection
paper:arxiv.org/abs/2303.1496 code:github.com/PaddlePaddle

Video Processing

[1]Affordance Grounding from Demonstration Video to Target Image
paper:arxiv.org/abs/2303.1464 code:github.com/showlab/affo

[2]Frame Flexible Network
paper:arxiv.org/abs/2303.1481 code:github.com/bespontaneou

[3]Joint Video Multi-Frame Interpolation and Deblurring under Unknown Exposure Time
paper:arxiv.org/abs/2303.1504 code:github.com/shangwei5/vi

Human Parsing/Human Pose Estimation

[1]ScarceNet: Animal Pose Estimation with Scarce Annotations
paper:arxiv.org/abs/2303.1502 code:github.com/chaneyddtt/s

[2]Human Pose Estimation in Extremely Low-Light Conditions
paper:arxiv.org/abs/2303.1541

Super Resolution

[1]Learning Generative Structure Prior for Blind Text Image Super-resolution
paper:arxiv.org/abs/2303.1472 code:github.com/csxmli2016/m

[2]Learning to Zoom and Unzoom
paper:arxiv.org/abs/2303.1539

Image Restoration/Image Enhancement/Image Reconstruction

[1]Visual-Tactile Sensing for In-Hand Object Reconstruction
paper:arxiv.org/abs/2303.1449

[2]3D-Aware Multi-Class Image-to-Image Translation with NeRFs
paper:arxiv.org/abs/2303.1501 code:github.com/sen-mao/3di2

Image Shadow Removal/Image Reflection Removal

[1]Nighttime Smartphone Reflective Flare Removal Using Optical Center Symmetry Prior
paper:arxiv.org/abs/2303.1504 code:github.com/ykdai/Bracke

Image Denoising/Deblurring/Deraining/Dehazing

[1]Curricular Contrastive Regularization for Physics-aware Single Image Dehazing
paper:arxiv.org/abs/2303.1421 code:github.com/yuzheng9/c2p

[2]Spatially Adaptive Self-Supervised Learning for Real-World Image Denoising
paper:arxiv.org/abs/2303.1493 code:github.com/nagejacob/sp

Face Generation/Face Synthesis/Face Reconstruction/Face Editing

[1]OTAvatar: One-shot Talking Face Avatar with Controllable Tri-plane Rendering
paper:arxiv.org/abs/2303.1466

[2]High-fidelity 3D Human Digitization from Single 2K Resolution Images
paper:arxiv.org/abs/2303.1510

[3]FaceLit: Neural 3D Relightable Faces
paper:arxiv.org/abs/2303.1543

Image & Video Retrieval/Video Understanding

[1]Zero-Shot Everything Sketch-Based Image Retrieval, and in Explainable Style
paper:arxiv.org/abs/2303.1434 code:github.com/buptlinfy/zs

[2]Selective Structured State-Spaces for Long-Form Video Understanding
paper:arxiv.org/abs/2303.1452

Action/Activity Recognition, Detection, Segmentation, and Localization

[1]3Mformer: Multi-order Multi-mode Transformer for Skeletal Action Recognition
paper:arxiv.org/abs/2303.1447

Person Re-Identification/Detection

[1]Diverse Embedding Expansion Network and Low-Light Cross-Modality Benchmark for Visible-Infrared Person Re-identification
paper:arxiv.org/abs/2303.1448 code:github.com/zyk100/llcm

Medical Imaging

[1]Label-Free Liver Tumor Segmentation
paper:arxiv.org/abs/2303.1486 code:github.com/mrgiovanni/s

[2]Image Quality-aware Diagnosis via Meta-knowledge Co-embedding
paper:arxiv.org/abs/2303.1503

Image Generation/Image Synthesis

[1]Unsupervised Domain Adaption with Pixel-level Discriminator for Image-aware Layout Generation
paper:arxiv.org/abs/2303.1437

[2]Freestyle Layout-to-Image Synthesis
paper:arxiv.org/abs/2303.1441 code:github.com/essunny310/f

Point Cloud

[1]Unsupervised Inference of Signed Distance Functions from Single Sparse Point Clouds without Learning Priors
paper:arxiv.org/abs/2303.1450

[2]NeuralPCI: Spatio-temporal Neural Field for 3D Point Cloud Multi-frame Non-linear Interpolation
paper:arxiv.org/abs/2303.1512 code:github.com/ispc-lab/neu

[3]Recognizing Rigid Patterns of Unlabeled Point Clouds by Complete and Continuous Isometry Invariants with no False Negatives and no False Positives
paper:arxiv.org/abs/2303.1538

3D Reconstruction

[1]PAniC-3D: Stylized Single-view 3D Reconstruction from Portraits of Anime Characters
paper:arxiv.org/abs/2303.1458 code:github.com/shuhongchen/

Scene Reconstruction/View Synthesis/Novel View Synthesis

[1]DyLiN: Making Light Field Networks Dynamic
paper:arxiv.org/abs/2303.1424

[2]FlexNeRF: Photorealistic Free-viewpoint Rendering of Moving Humans from Sparse Views
paper:arxiv.org/abs/2303.1436

[3]NeRF-DS: Neural Radiance Fields for Dynamic Specular Objects
paper:arxiv.org/abs/2303.1443 code:github.com/jokeryan/ner

[4]SUDS: Scalable Urban Dynamic Scenes
paper:arxiv.org/abs/2303.1453

[5]JAWS: Just A Wild Shot for Cinematic Transfer in Neural Radiance Fields
paper:arxiv.org/abs/2303.1542

Knowledge Distillation

[1]Generalization Matters: Loss Minima Flattening via Parameter Hybridization for Efficient Online Knowledge Distillation
paper:arxiv.org/abs/2303.1466

Neural Network Structure Design

[1]Bridging Precision and Confidence: A Train-Time Loss for Calibrating Object Detection
paper:arxiv.org/abs/2303.1440 code:github.com/akhtarvision

[2]Compacting Binary Neural Networks by Sparse Kernel Selection
paper:arxiv.org/abs/2303.1447

Graph Neural Networks (GNN)

[1]Mind the Label Shift of Augmentation-based Graph OOD Generalization
paper:arxiv.org/abs/2303.1485

Image Compression

[1]Learned Image Compression with Mixed Transformer-CNN Architectures
paper:arxiv.org/abs/2303.1497 code:github.com/jmliu206/lic

Model Training/Generalization

[1]Active Finetuning: Exploiting Annotation Budget in the Pretraining-Finetuning Paradigm
paper:arxiv.org/abs/2303.1438 code:github.com/yichen928/ac

[2]CFA: Class-wise Calibrated Fair Adversarial Training
paper:arxiv.org/abs/2303.1446 code:github.com/pku-ml/cfa

Vision-Language

[1]VILA: Learning Image Aesthetics from User Comments with Vision-Language Pretraining
paper:arxiv.org/abs/2303.1430

[2]Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning
paper:arxiv.org/abs/2303.1436 code:github.com/jpthu17/HBI

[3]IFSeg: Image-free Semantic Segmentation via Vision-Language Model
paper:arxiv.org/abs/2303.1439 code:github.com/alinlab/ifse

[4]Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective
paper:arxiv.org/abs/2303.1496 code:github.com/zwx8981/liqe

Dataset

[1]CelebV-Text: A Large-Scale Facial Text-Video Dataset
paper:arxiv.org/abs/2303.1471 code:github.com/CelebV-Text/

[2]On the Importance of Accurate Geometry Data for Dense 3D Vision Tasks
paper:arxiv.org/abs/2303.1484 code:github.com/junggy/hamme

[3]Towards Artistic Image Aesthetics Assessment: a Large-scale Dataset and a New Method
paper:arxiv.org/abs/2303.1516 code:github.com/dreemurr-t/b

[4]Recovering 3D Hand Mesh Sequence from a Single Blurry Image: A New Dataset and Temporal Unfolding
paper:arxiv.org/abs/2303.1541 code:github.com/jaehakim97/b

Few-shot Learning/Zero-shot Learning

[1]Hierarchical Dense Correlation Distillation for Few-Shot Segmentation
paper:arxiv.org/abs/2303.1465

[2]ZBS: Zero-shot Background Subtraction via Instance-level Background Modeling and Foreground Selection
paper:arxiv.org/abs/2303.1467 code:github.com/casia-iva-la

[3]Learning Attention as Disentangler for Compositional Zero-shot Learning
paper:arxiv.org/abs/2303.1511 code:github.com/haoosz/ade-c

[4]Progressive Semantic-Visual Mutual Adaption for Generalized Zero-Shot Learning
paper:arxiv.org/abs/2303.1532 code:github.com/manliucoder/

Continual Learning/Life-long Learning

[1]Preserving Linear Separability in Continual Learning by Backward Feature Projection
paper:arxiv.org/abs/2303.1459

Scene Graph Prediction

[1]VL-SAT: Visual-Linguistic Semantics Assisted Training for 3D Semantic Scene Graph Prediction in Point Cloud
paper:arxiv.org/abs/2303.1440 code:github.com/wz7in/cvpr20

Visual Localization/Pose Estimation

[1]Gazeformer: Scalable, Effective and Fast Prediction of Goal-Directed Human Attention
paper:arxiv.org/abs/2303.1527

Visual Reasoning/VQA

[1]MD-VQA: Multi-Dimensional Quality Assessment for UGC Live Videos
paper:arxiv.org/abs/2303.1493 code:github.com/zzc-1998/md-

Transfer Learning/Domain Adaptation

[1]BlackVIP: Black-Box Visual Prompting for Robust Transfer Learning
paper:arxiv.org/abs/2303.1477 code:github.com/changdaeoh/b

Contrastive Learning

[1]Revisiting Multimodal Representation in Contrastive Learning: From Patch and Token Embeddings to Finite Discrete Tokens
paper:arxiv.org/abs/2303.1486

Semi-supervised/Weakly-supervised/Unsupervised/Self-supervised Learning

[1]Detecting Backdoors in Pre-trained Encoders
paper:arxiv.org/abs/2303.1518 code:github.com/giantseaweed

Neural Network Interpretability

[1]IDGI: A Framework to Eliminate Explanation Noise from Integrated Gradients
paper:arxiv.org/abs/2303.1424 code:github.com/yangruo1226/

Federated Learning

[1]The Resource Problem of Using Linear Layer Leakage Attack in Federated Learning
paper:arxiv.org/abs/2303.1486

Other

[1]DeepVecFont-v2: Exploiting Transformers to Synthesize Vector Fonts with Higher Quality
paper:arxiv.org/abs/2303.1458 code:github.com/yizhiwang96/

[2]PDPP: Projected Diffusion for Procedure Planning in Instructional Videos
paper:arxiv.org/abs/2303.1467

[3]Disentangling Writer and Character Styles for Handwriting Generation
paper:arxiv.org/abs/2303.1473 code:github.com/dailenson/sd

[4]Continuous Intermediate Token Learning with Implicit Motion Manifold for Keyframe Based Motion Interpolation
paper:arxiv.org/abs/2303.1492

[5]DANI-Net: Uncalibrated Photometric Stereo by Differentiable Shadow Handling, Anisotropic Reflectance Modeling, and Neural Inverse Rendering
paper:arxiv.org/abs/2303.1510 code:github.com/lmozart/cvpr

[6]Multi-Granularity Archaeological Dating of Chinese Bronze Dings Based on a Knowledge-Guided Relation Graph
paper:arxiv.org/abs/2303.1526 code:github.com/zhourixin/br

[7]Handwritten Text Generation from Visual Archetypes
paper:arxiv.org/abs/2303.1526 code:github.com/aimagelab/va

