    AI Papers

    AI Generation: The Dawn of Self-Regulating Language Models

    How AutoDeco Eliminates Manual Tweaks and Ushers in Truly End-to-End AI Creativity. Challenging the Status Quo: Current large language models (LLMs) aren't truly "end-to-end" due...

    LLMs Can Get “Brain Rot”! How Junk Web Data is Poisoning AI’s Mind

    Unraveling the Hidden Dangers of Low-Quality Training Data in the Age of AI. The Brain Rot Hypothesis: Researchers propose and test a theory that exposing...

    How Chunk-GRPO Transforms Generation from Step-by-Step to Smarter Chunks

    Unlocking Superior Image Quality and Alignment in Flow-Matching Models. Overcoming GRPO's Core Flaws: Traditional Group Relative Policy Optimization (GRPO) excels in flow-matching-based text-to-image (T2I) generation...

    How Open-o3 Video Brings Precision to Dynamic Scenes

    Unlocking Spatio-Temporal Intelligence for Smarter Video Understanding. Bridging the Evidence Gap: Open-o3 Video introduces explicit spatio-temporal grounding, highlighting timestamps, objects, and bounding boxes to make...

    Can AI Gamble Away Its Future? Uncovering Addiction-Like Behaviors in Large Language Models

    Exploring How Advanced AI Might Mirror Human Gambling Flaws in High-Stakes Financial Worlds. Cognitive Echoes of Addiction: Large language models (LLMs) replicate human gambling distortions...

    Apriel-1.5-15B-Thinker: Mid-Training is All You Need

    Revolutionizing AI Reasoning with Smarter Design, Not Bigger Scale. Progressive Training Pipeline: Starting from the Pixtral-12B base, it employs depth upscaling, staged continual pre-training on...

    MCPMark Puts Large Language Models to the Ultimate Test

    Pushing the Boundaries of AI Interaction in a World of Complex Workflows. Realistic Benchmarking for Real-World Challenges: MCPMark introduces 127 expertly crafted tasks that simulate...

    Video Storytelling: LONGLIVE Ushers in Real-Time Interactive Long Video Generation

    Breaking Barriers in AI-Driven Content Creation with Autoregressive Efficiency and User-Controlled Narratives. Overcoming Key Challenges: LONGLIVE addresses the efficiency and quality hurdles in long video...

    AI’s Grasp on Reality: OmniWorld Ushers in a New Era of 4D World Modeling

    Discover how a groundbreaking dataset is bridging the gap between static data and dynamic worlds, empowering machines to predict, reconstruct, and interact with our...

    Does DINOv3 Revolutionize Medical Imaging?

    A Comprehensive Benchmark on 2D/3D Classification and Segmentation. Impressive Transferability: DINOv3, trained solely on natural images, delivers outstanding performance in medical vision tasks like CT...

    AuriStream: Echoing the Human Ear in AI Speech Revolution

    Bridging Biology and Bytes: How a New Model is Redefining Speech Processing with Cochlear Magic. Mimicking Nature's Blueprint: AuriStream introduces a two-stage framework inspired by...

    Digital Hair: Sketch Your Way to Realistic Strands

    Unleashing Creativity in Computer Graphics with Intuitive, Precision-Driven Hair Generation. Precision Meets User-Friendliness: Traditional text or image-based methods for generating hair strands often fall short...