    AI Papers

    The Blueprint of Life: MIT’s New AI Predicts Embryo Development Minute by Minute

    A breakthrough deep-learning model tracks 5,000 fruit fly cells with 90% accuracy, paving the way for early disease detection in human tissues. A "Dual-Graph" Innovation: MIT...

    ByteDance’s Dolphin-v2 Revolutionizes Document Parsing

    A new 3B parameter model uses a novel "analyze-then-parse" approach to master complex layouts with pixel-level precision. Universal Understanding: Dolphin-v2 is a lightweight (3B parameter) model...

    Through Their Eyes: How EgoX Turns Third-Person Video into Immersive First-Person Reality

    Unlocking the power of immersive storytelling, robotics, and AR by synthesizing realistic egocentric perspectives from standard footage. Immersive Transformation: EgoX is a groundbreaking framework that generates...

    The Creativity Paradox: Unlocking AI Diversity by Bypassing Human Bias

    How a simple prompting trick called Verbalized Sampling overcomes the "Typicality Bias" that makes LLMs predictable. The Root Cause: Research identifies "Typicality Bias"—a cognitive psychological tendency...

    Beyond Perception: GLM-4.6V Bridges the Gap Between Visual Understanding and Executable Action

    Introducing a new era of open-source multimodal models featuring native tool use, massive context windows, and real-world agentic capabilities. A Dual-Model Release: The GLM-4.6V series launches...

    Thinking Before Editing: How VideoCoF is Revolutionizing AI Video Generation

    Bridging the gap between precision and flexibility with a novel "Chain-of-Frames" approach. The Precision Paradox: Current video editing AI faces a critical trade-off between expert models...

    Reshaping Reality Through Your Eyes: The EgoEdit Revolution in AI-Driven Augmented Reality

    Moving beyond handcrafted graphics to real-time, text-guided world building. Bridging the AR Gap: EgoEdit addresses the unique challenges of first-person (egocentric) footage—such as rapid motion and...

    Vidi2: The AI That Sees, Edits, and Understands Video Better Than the Giants

    ByteDance’s new model redefines video creation with unprecedented spatio-temporal precision. Beyond Simple Editing: Vidi2 can ingest hours of raw footage and a simple prompt to autonomously...

    SteadyDancer: The Future of Flawless Human Image Animation

    Revolutionizing video generation by solving the "first-frame" problem and harmonizing motion with identity. Paradigm Shift: SteadyDancer moves away from the flawed Reference-to-Video (R2V) model to an...

    Dynalang and the Power of Language-Driven World Modeling

    Bridging words and vision to create smarter, more adaptive agents in a complex world. Unified Prediction Framework: Dynalang redefines AI agents by using language not...

    AI Generation: The Dawn of Self-Regulating Language Models

    How AutoDeco eliminates manual tweaks and ushers in truly end-to-end AI creativity. Challenging the Status Quo: Current large language models (LLMs) aren't truly "end-to-end" due...

    LLMs Can Get “Brain Rot”! How Junk Web Data is Poisoning AI’s Mind

    Unraveling the hidden dangers of low-quality training data in the age of AI. The Brain Rot Hypothesis: Researchers propose and test a theory that exposing...