Computer Vision – ECCV 2022

Computer Vision – ECCV 2022 (Lecture Notes in Computer Science)

17th European Conference on Computer Vision, 2022 / Proceedings, Part 35
Category: Foundations of computer science and technology

List price:

¥1082.5

Delivery time: usually arrives 3-5 weeks after payment.

Authors

Shai Avidan, Gabriel Brostow, Moustapha Cissé

Publisher

Springer Berlin Heidelberg

Publication date

November 21, 2022

Binding

Paperback

ISBN

9783031198328

Pages

745

Language

English

Table of contents

Efficient One-Stage Video Object Detection by Exploiting Temporal Consistency
Leveraging Action Affinity and Continuity for Semi-Supervised Temporal Action Segmentation
Spotting Temporally Precise, Fine-Grained Events in Video
Unified Fully and Timestamp Supervised Temporal Action Segmentation via Sequence to Sequence Translation
Efficient Video Transformers with Spatial-Temporal Token Selection
Long Movie Clip Classification with State-Space Video Models
Prompting Visual-Language Models for Efficient Video Understanding
Asymmetric Relation Consistency Reasoning for Video Relation Grounding
Self-Supervised Social Relation Representation for Human Group Detection
K-Centered Patch Sampling for Efficient Video Recognition
A Deep Moving-Camera Background Model
GraphVid: It Only Takes a Few Nodes to Understand a Video
Delta Distillation for Efficient Video Processing
MorphMLP: An Efficient MLP-Like Backbone for Spatial-Temporal Representation Learning
COMPOSER: Compositional Reasoning of Group Activity in Videos with Keypoint-Only Modality
E-NeRV: Expedite Neural Video Representation with Disentangled Spatial-Temporal Context
TDViT: Temporal Dilated Video Transformer for Dense Video Tasks
Semi-Supervised Learning of Optical Flow by Flow Supervisor
Flow Graph to Video Grounding for Weakly-Supervised Multi-step Localization
Deep 360° Optical Flow Estimation Based on Multi-Projection Fusion
MaCLR: Motion-Aware Contrastive Learning of Representations for Videos
Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection
Frozen CLIP Models Are Efficient Video Learners
PIP: Physical Interaction Prediction via Mental Simulation with Span Selection
Panoramic Vision Transformer for Saliency Detection in 360° Videos
Bayesian Tracking of Video Graphs Using Joint Kalman Smoothing and Registration
Motion Sensitive Contrastive Learning for Self-Supervised Video Representation
Dynamic Temporal Filtering in Video Models
Tip-Adapter: Training-Free Adaption of CLIP for Few-Shot Classification
Temporal Lift Pooling for Continuous Sign Language Recognition
MORE: Multi-Order RElation Mining for Dense Captioning in 3D Scenes
SiRi: A Simple Selective Retraining Mechanism for Transformer-Based Visual Grounding
Cross-Modal Prototype Driven Network for Radiology Report Generation
TM2T: Stochastic and Tokenized Modeling for the Reciprocal Generation of 3D Human Motions and Texts
SeqTR: A Simple Yet Universal Network for Visual Grounding
VTC: Improving Video-Text Retrieval with User Comments
FashionViL: Fashion-Focused Vision-and-Language Representation Learning
Weakly Supervised Grounding for VQA in Vision-Language Transformers
Automatic Dense Annotation of Large-Vocabulary Sign Language Videos
MILES: Visual BERT Pre-training with Injected Language Semantics for Video-Text Retrieval
GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval
A Simple and Robust Correlation Filtering Method for Text-Based Person Search