Quick Overview:
Adaptive Spatial-Temporal Window: Unlocking the Potential of Event Cameras in Heterogeneous Velocity Scenarios. Zhipeng Sui, ...
Disentangle-then-Align: Non-Iterative Hybrid Multimodal Image Registration via Cross-Scale Feature Disentanglement.
OMG-Bench: A New Challenging Benchmark for Skeleton-based Online Micro Hand Gesture Recognition.
CVPR 2026 Tape - Detailed Overview & Context
In this video, we introduce a novel video object detection framework called D2FANet. D2FANet is the first framework to jointly ...
NeuroFlow: Toward Unified Visual Encoding and Decoding from Neural Activity.
MixerCSeg: An Efficient Mixer Architecture for Crack Segmentation via Decoupled Mamba Attention.
[CVPR 2026] Hear What You See: Video-to-Audio Generation with Diffusion Transformer and STAR-DPO
Title: MU-GeNeRF: Multi-view Uncertainty-guided Generalizable Neural Radiance Fields for Distractor-aware Scene ...
AnchorSplat: Feed-Forward 3D Gaussian Splatting with 3D Geometric Priors
Authors: Matteo Ballegeer, Dries F. Benoit
Google Scholar: ...
Towards Open-Vocabulary Industrial Defect Understanding with a Large-Scale Multimodal Dataset.
Video2Robo: 3DGS-based Synthetic Data from One Video Enables Scalable Robot Learning
Project page: ...