AI Papers

Meta: Transforming Text-to-Image Customization with Synthetic Data

How Multi-Image Synthetic Data and Shared Attention Mechanisms Are Redefining AI-Generated Imagery. Synthetic Dataset Innovation: A new Synthetic Customization Dataset (SynCD) leverages 3D assets and...

Virus: The Silent Threat to AI Safety – How Harmful Fine-Tuning Attacks Bypass Guardrails

Large Language Models Are Vulnerable to Stealthy Attacks That Undermine Safety Alignment. Guardrails Aren’t Enough: Despite guardrail moderation systems designed to filter harmful data, a...

DiffSplat: 3D Content Creation by Bridging 2D and 3D Worlds

How Image Diffusion Models Are Now Powering Scalable, High-Fidelity 3D Gaussian Splat Generation. 2D to 3D Leap: DiffSplat repurposes massive image diffusion models to generate...

AI Takes the Director’s Chair: How Multi-Agent Systems Are Transforming Film Production

From Script to Screen: The Rise of LLM-Powered Virtual Filmmaking. Automated Creativity: FilmAgent leverages AI agents to mimic traditional film crew roles, streamlining tasks from...

Video Depth Anything: A Breakthrough in Long-Video Depth Estimation

ByteDance introduces a state-of-the-art solution for consistent, high-quality depth estimation in extended videos. Long-Video Capability: Video Depth Anything tackles the challenge of estimating depth for videos...

Amphion: Unlocking Creativity in Audio and Music

An open-source toolkit making advanced audio generation accessible to all. Accessible Innovation: Amphion simplifies audio, music, and speech generation for beginners and experts alike. Comprehensive Tools: It supports...

DeepSeek R1: Open-Source AI That Challenges the Status Quo

China's DeepSeek R1 delivers state-of-the-art reasoning at a fraction of the cost, setting a new standard for open-source AI innovation. Revolutionary Reasoning: DeepSeek R1, trained...

Textoon: Crafting 2D Cartoon Magic in Seconds with Text Prompts

A breakthrough framework for generating vibrant Live2D cartoon characters using the power of AI-driven text-to-image technology. Innovative Character Generation: Textoon transforms textual descriptions into vivid...

X-Dyna: Redefining Human Animation with Expressive Dynamics and Realistic Motion

A zero-shot pipeline brings lifelike human animation with dynamic details and enhanced realism. Innovative Animation Pipeline: X-Dyna integrates facial expressions, body movements, and environmental dynamics...

The Future of Virtual Try-On: A Game-Changing Single-Network Approach

How a Modality-Specific Normalization Strategy Is Redefining Scalable and High-Quality Virtual Try-On. Revolutionizing Virtual Try-On (VTON): A new single-network paradigm eliminates the need for dual networks,...

SynthLight: Adobe’s AI Breakthrough in Portrait Relighting

Using diffusion models and synthetic data, Adobe's SynthLight redefines portrait relighting with realistic, dynamic lighting effects. Revolutionary Relighting: SynthLight uses AI-driven diffusion models to simulate...

AnyStory: Alibaba’s Breakthrough in Unified Text-to-Image Personalization

A revolutionary approach to single and multi-subject text-to-image generation that retains fidelity and aligns seamlessly with textual input. Unified Approach: AnyStory introduces a unified method...