Physics-aware Video Instance Removal
Please download the dataset from GIVE-Challenge-Dataset.
This challenge is jointly organized by Texas A&M University, Visko Platform, and Abaka AI.
📅 Important Dates
🔍 Challenge Overview
The 1st Workshop on Video Generative Models: Benchmarks and Evaluation (VGBE) will be held in June 2026 in conjunction with CVPR 2026.
Recent advances in video generative models, such as Sora, Veo, and Wan, have demonstrated an unprecedented ability to generate high-fidelity content. However, moving toward practical workflows requires pushing video editing beyond simple object deletion. In real-world scenarios, removing an object is complex because objects interact dynamically with their surroundings. A truly realistic removal requires modeling and regenerating these environmental interactions—such as shadows, reflections, water ripples, or secondary motion propagation—to maintain physical plausibility.
This challenge focuses on physics-aware video restoration. Unlike traditional inpainting, participants must ensure that the "void" left by a removed object is filled with content that is not only visually consistent but also physically coherent with the rest of the scene. This requires a deep semantic understanding of how objects influence their environment through lighting, physics, and geometry.
Hosting this challenge accelerates the development of models capable of sophisticated, physically-grounded video manipulation. It provides a standardized benchmark to evaluate how effectively these systems can restore complex environments while maintaining perfect temporal stability.
The top-ranked participants will receive awards and be invited to present their solutions at the associated VGBE workshop at CVPR 2026. The results of the challenge will be published as part of the VGBE 2026 workshop (CVPR Proceedings).
📋 Task Definition
Task: Physics-aware Video Instance Removal
Given an Input Video, a Text Prompt describing the object to remove, and a Segmentation Mask, the model must generate a video that is:
- Physically Aware: Realistically reflects physical changes (shadows, ripples, etc.) caused by the object's removal.
- Temporally Coherent: Maintains stability and visual realism across all frames without flickering.
- Exclusive in Editing: Preserves all unrelated regions of the video perfectly.
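As a rough self-check for the "Exclusive in Editing" requirement, the sketch below measures how much the edited video differs from the input outside the segmentation mask. It is not the official metric: the function name `outside_mask_error`, the nested-list frame layout, and the use of mean absolute difference are all assumptions made for illustration. Note that a physics-aware removal is expected to change some pixels outside the mask (for example, where a shadow or reflection of the removed object used to be), so a nonzero value is not necessarily a defect.

```python
def outside_mask_error(original, edited, mask):
    """Mean absolute difference over pixels where the mask is 0.

    Hypothetical layout: each argument is a list of frames, a frame is a
    list of rows, and a row is a list of numbers (grayscale intensity for
    the videos, 0 or 1 for the mask, where 1 marks the object to remove).
    """
    if not (len(original) == len(edited) == len(mask)):
        raise ValueError("original, edited and mask must have the same frame count")

    total, count = 0.0, 0
    for orig_f, edit_f, mask_f in zip(original, edited, mask):
        for orig_row, edit_row, mask_row in zip(orig_f, edit_f, mask_f):
            for o, e, m in zip(orig_row, edit_row, mask_row):
                if m == 0:  # pixel lies outside the masked object
                    total += abs(o - e)
                    count += 1
    # If every pixel is masked there is nothing to compare.
    return total / count if count else 0.0
```

In practice an array library would replace the nested loops; plain lists are used here only to keep the sketch dependency-free.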
Output Specifications
To ensure fairness and standardized evaluation, all submissions must adhere to the following technical constraints:
- Frames: The generated video must have exactly the same number of frames as the input video.
- Resolution:
  - Minimum: 480p (e.g., $854 \times 480$).
  - Recommended: 720p (e.g., $1280 \times 720$) or higher.
- Aspect Ratio: The output video must preserve the aspect ratio of the input video. Cropping the frame or distorting the aspect ratio will result in significant score deductions.
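The constraints above can be checked before submission with a small script. The following is a minimal sketch, not an official validator: `VideoInfo`, `check_output_spec`, the reading of "480p" as a shorter side of at least 480 pixels, and the 1% aspect-ratio tolerance are all assumptions. The tolerance is there because common sizes such as $854 \times 480$ only approximate 16:9.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoInfo:
    """Hypothetical container for metadata read from a video file."""
    num_frames: int
    width: int
    height: int


def check_output_spec(inp, out, min_short_side=480, aspect_tol=0.01):
    """Return a list of human-readable problems (empty if none were found).

    min_short_side and aspect_tol are assumed values, not official ones.
    """
    problems = []

    # Frames: the output must match the input frame count exactly.
    if out.num_frames != inp.num_frames:
        problems.append(
            f"frame count {out.num_frames} != input {inp.num_frames}"
        )

    # Resolution: assume "480p" means the shorter side is at least 480 px.
    if min(out.width, out.height) < min_short_side:
        problems.append(
            f"resolution {out.width}x{out.height} is below {min_short_side}p"
        )

    # Aspect ratio: compare with a small relative tolerance.
    in_ar = inp.width / inp.height
    out_ar = out.width / out.height
    if abs(out_ar - in_ar) / in_ar > aspect_tol:
        problems.append(
            f"aspect ratio {out_ar:.4f} differs from input {in_ar:.4f}"
        )

    return problems
```

For example, `check_output_spec(VideoInfo(120, 1920, 1080), VideoInfo(120, 1280, 720))` returns an empty list, while an output with a different frame count or a square frame would be reported. The metadata itself could be read with any video tool; that step is left out here.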
Recommended Baselines / Architectures
We encourage participants to explore or build upon recent efficient architectures, such as:
- DiffuEraser: A Diffusion Model for Video Inpainting
- ROSE: Remove Objects with Side Effects in Videos
- Any closed-source or open-source model / pipeline is welcome.
📊 Evaluation
The evaluation process consists of two primary components:
- Automated Evaluation (VBench): We utilize VBench to provide an objective assessment of video quality and perceptual fidelity.
- Human Evaluation: A panel of experts will score each entry across four key dimensions:
  - Physical Awareness (55%): Realism of physical restoration (shadows, reflections, etc.).
  - Instruction Following (15%): Is the correct object removed as specified?
  - Rendering Quality (15%): Is the video visually and temporally coherent?
  - Exclusivity of Edit (15%): Are unrelated regions preserved without artifacts?
Human Evaluation Score: Calculated as the weighted sum of the four dimensions above.
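The weighted sum can be illustrated with a short sketch. The weights come from the list above; the dictionary keys, the function name `human_eval_score`, and the rating scale used in the example are assumptions, since the scale the expert panel uses is not stated here.

```python
# Weights taken from the four evaluation dimensions; key names are illustrative.
WEIGHTS = {
    "physical_awareness": 0.55,
    "instruction_following": 0.15,
    "rendering_quality": 0.15,
    "exclusivity_of_edit": 0.15,
}


def human_eval_score(scores):
    """Weighted sum of per-dimension scores.

    `scores` maps each key in WEIGHTS to a numeric rating; the rating scale
    is whatever the evaluators use (not specified in the challenge text).
    """
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
```

With an assumed 1 to 5 scale, an entry rated 5 on Physical Awareness and 3 on the other three dimensions would score $0.55 \times 5 + 3 \times (0.15 \times 3) = 4.1$, which shows how strongly the physical-awareness rating dominates the result.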
Final Score Calculation
To balance objective performance with human-centric quality, the final ranking is determined by:
🏆 Awards
We have established a total prize pool of $1,000 USD:
- Highest Score Award (Champion): $500 USD + Award Certificate
- Innovation Award: $500 USD + Award Certificate. Recognizes technically novel or methodologically inspiring contributions. A technical report is required.
📧 Issues & Contact
- Technical Discussions: Please use the community forum on the official challenge page.
- Inquiries: Contact the organizing committee at tcve-cvpr-2026@googlegroups.com.