    AI Papers

    Yume: Dreaming Worlds into Existence

    AI-Driven Exploration with Interactive, Infinite Realities from a Single Image

    Innovative Framework: Yume introduces a preview version of an interactive world generation model that transforms...

    Everyday Assistance: GR-3 and the Dawn of Generalist Robots

    ByteDance's Cutting-Edge VLA Model Promises Smarter, More Adaptable Machines for Real-World Tasks

    Breakthrough in Generalization: GR-3 excels at handling novel objects, environments, and abstract instructions,...

    Digital Faces: FantasyPortrait’s Leap in Expressive Multi-Character Animation

    Unveiling a New Era of AI-Powered Portrait Magic with Diffusion Transformers

    Overcoming Animation Hurdles: FantasyPortrait tackles the longstanding challenges in creating expressive facial animations from...

    Gemini 2.5: Redefining AI with Cutting-Edge Brilliance

    Unleashing Advanced Reasoning, Multimodality, and Agentic Power in the Next-Gen AI Frontier

    The Gemini 2.X family, including Gemini 2.5 Pro and Flash, alongside Gemini 2.0...

    HalluSegBench: Unmasking the Mirage in Visual Segmentation

    A New Benchmark to Challenge Vision-Language Models with Counterfactual Reasoning

    HalluSegBench introduces a pioneering benchmark to evaluate hallucinations in vision-language segmentation models, using a novel...

    Matrix-Game: Revolutionizing Interactive Game World Generation

    A Breakthrough World Foundation Model for Controllable Minecraft Environments

    Innovative Model Introduction: Matrix-Game is a cutting-edge interactive world foundation model with over 17 billion parameters, designed...

    Unraveling the Digital Playground: Generative AI’s Impact on Children

    Navigating the Promises and Perils of AI in Young Lives

    Generative AI is increasingly integrated into children's lives through tools like ChatGPT and DALL-E, with...

    Astra: Robot Navigation with Smart, Adaptive Tech

    How Hierarchical Multimodal Learning Powers the Future of Mobile Robots

    Astra introduces a groundbreaking dual-model architecture, Astra-Global and Astra-Local, to tackle the challenges of robot...

    Crafting the Future: PartCrafter’s Pioneering Approach to 3D Mesh Innovation

    Transforming Single Images into Decomposable 3D Models with Unprecedented Precision

    PartCrafter introduces a groundbreaking approach to 3D modeling by generating multiple semantically meaningful and geometrically...

    ComfyUI-Copilot: AI Art Creation with Intelligent Automation

    Streamlining Workflow Development for Beginners and Experts Alike

    ComfyUI-Copilot is an innovative plugin, powered by a large language model (LLM), designed to simplify the complexities of ComfyUI, an...

    Video Creation: The Power of Temporal In-Context Fine-Tuning

    Unlocking Versatile Control in Video Diffusion Models with TIC-FT

    Temporal In-Context Fine-Tuning (TIC-FT) introduces a groundbreaking, efficient method for adapting pretrained video diffusion models to...

    DexUMI: Robotics with the Human Hand as the Ultimate Interface

    Bridging the Embodiment Gap for Dexterous Robot Manipulation

    DexUMI is a groundbreaking framework that uses the human hand as a universal interface to transfer dexterous...