AI Papers

Video Storytelling: LONGLIVE Ushers in Real-Time Interactive Long Video Generation

Breaking Barriers in AI-Driven Content Creation with Autoregressive Efficiency and User-Controlled Narratives
Overcoming Key Challenges: LONGLIVE addresses the efficiency and quality hurdles in long video...

AI’s Grasp on Reality: OmniWorld Ushers in a New Era of 4D World Modeling

Discover how a groundbreaking dataset is bridging the gap between static data and dynamic worlds, empowering machines to predict, reconstruct, and interact with our...

Does DINOv3 Revolutionize Medical Imaging?

A Comprehensive Benchmark on 2D/3D Classification and Segmentation
Impressive Transferability: DINOv3, trained solely on natural images, delivers outstanding performance in medical vision tasks like CT...

AuriStream: Echoing the Human Ear in AI Speech Revolution

Bridging Biology and Bytes: How a New Model is Redefining Speech Processing with Cochlear Magic
Mimicking Nature's Blueprint: AuriStream introduces a two-stage framework inspired by...

Digital Hair: Sketch Your Way to Realistic Strands

Unleashing Creativity in Computer Graphics with Intuitive, Precision-Driven Hair Generation
Precision Meets User-Friendliness: Traditional text or image-based methods for generating hair strands often fall short...

The GHz Spiking Photonic Chip That Mimics the Brain

Breaking Barriers in Neuromorphic Computing with Ultra-Fast, Event-Driven Intelligence
Pioneering Photonic Innovation: This breakthrough introduces the first silicon-compatible photonic spiking neural network (PSNN) chip, featuring...

Magentic-UI: AI Agents with Human Oversight

Bridging the Gap Between AI Autonomy and Human Control for Safer, Smarter Collaboration
Empowering Human-AI Synergy: Magentic-UI introduces a human-in-the-loop approach that combines AI's efficiency...

Digital Realms: HunyuanWorld 1.0 Brings Words and Images to Life in 3D

A Game-Changing Framework That Merges Creativity and Technology for Immersive, Explorable Worlds
Bridging the Gap in 3D Generation: HunyuanWorld 1.0 overcomes the limitations of traditional...

Yume: Dreaming Worlds into Existence

AI-Driven Exploration with Interactive, Infinite Realities from a Single Image
Innovative Framework: Yume introduces a preview version of an interactive world generation model that transforms...

Everyday Assistance: GR-3 and the Dawn of Generalist Robots

ByteDance's Cutting-Edge VLA Model Promises Smarter, More Adaptable Machines for Real-World Tasks
Breakthrough in Generalization: GR-3 excels at handling novel objects, environments, and abstract instructions,...

Digital Faces: FantasyPortrait’s Leap in Expressive Multi-Character Animation

Unveiling a New Era of AI-Powered Portrait Magic with Diffusion Transformers
Overcoming Animation Hurdles: FantasyPortrait tackles the longstanding challenges in creating expressive facial animations from...

Gemini 2.5: Redefining AI with Cutting-Edge Brilliance

Unleashing Advanced Reasoning, Multimodality, and Agentic Power in the Next-Gen AI Frontier
The Gemini 2.X family, including Gemini 2.5 Pro and Flash, alongside Gemini 2.0...