    AI Papers

    The Creativity Paradox: Unlocking AI Diversity by Bypassing Human Bias

    How a simple prompting trick called Verbalized Sampling overcomes the "Typicality Bias" that makes LLMs predictable. The Root Cause: Research identifies "Typicality Bias"—a cognitive psychological tendency...

    Beyond Perception: GLM-4.6V Bridges the Gap Between Visual Understanding and Executable Action

    Introducing a new era of open-source multimodal models featuring native tool use, massive context windows, and real-world agentic capabilities. A Dual-Model Release: The GLM-4.6V series launches...

    Thinking Before Editing: How VideoCoF is Revolutionizing AI Video Generation

    Bridging the gap between precision and flexibility with a novel "Chain-of-Frames" approach. The Precision Paradox: Current video editing AI faces a critical trade-off between expert models...

    Reshaping Reality Through Your Eyes: The EgoEdit Revolution in AI-Driven Augmented Reality

    Moving beyond handcrafted graphics to real-time, text-guided world building. Bridging the AR Gap: EgoEdit addresses the unique challenges of first-person (egocentric) footage—such as rapid motion and...

    Vidi2: The AI That Sees, Edits, and Understands Video Better Than the Giants

    ByteDance’s new model redefines video creation with unprecedented spatio-temporal precision. Beyond Simple Editing: Vidi2 can ingest hours of raw footage and a simple prompt to autonomously...

    SteadyDancer: The Future of Flawless Human Image Animation

    Revolutionizing video generation by solving the "first-frame" problem and harmonizing motion with identity. Paradigm Shift: SteadyDancer moves away from the flawed Reference-to-Video (R2V) model to an...

    Dynalang and the Power of Language-Driven World Modeling

    Bridging Words and Visions to Create Smarter, More Adaptive Agents in a Complex World. Unified Prediction Framework: Dynalang redefines AI agents by using language not...

    AI Generation: The Dawn of Self-Regulating Language Models

    How AutoDeco Eliminates Manual Tweaks and Ushers in Truly End-to-End AI Creativity. Challenging the Status Quo: Current large language models (LLMs) aren't truly "end-to-end" due...

    LLMs Can Get “Brain Rot”! How Junk Web Data is Poisoning AI’s Mind

    Unraveling the Hidden Dangers of Low-Quality Training Data in the Age of AI. The Brain Rot Hypothesis: Researchers propose and test a theory that exposing...

    How Chunk-GRPO Transforms Generation from Step-by-Step to Smarter Chunks

    Unlocking Superior Image Quality and Alignment in Flow-Matching Models. Overcoming GRPO's Core Flaws: Traditional Group Relative Policy Optimization (GRPO) excels in flow-matching-based text-to-image (T2I) generation...

    How Open-o3 Video Brings Precision to Dynamic Scenes

    Unlocking Spatio-Temporal Intelligence for Smarter Video Understanding. Bridging the Evidence Gap: Open-o3 Video introduces explicit spatio-temporal grounding, highlighting timestamps, objects, and bounding boxes to make...

    Can AI Gamble Away Its Future? Uncovering Addiction-Like Behaviors in Large Language Models

    Exploring How Advanced AI Might Mirror Human Gambling Flaws in High-Stakes Financial Worlds. Cognitive Echoes of Addiction: Large language models (LLMs) replicate human gambling distortions...