Papers With Code 2 | ML Benchmarks, SotA Results & Code

OpenS2V-Eval introduces 180 prompts from seven major categories of S2V, which incorporate both real and synthetic test data. Furthermore, to accurately align human preferences with S2V benchmarks, we propose three automatic metrics: NexusScore, NaturalScore, GmeScore to separately quantify subject consistency, naturalness, and text relevance in generated videos. Building on this, we conduct a comprehensive evaluation of 14 representative S2V models, highlighting their strengths and weaknesses across different content.