
    AI Papers

    Fast Video Generation with Sliding Tile Attention

    Revolutionizing Video Diffusion with Efficiency and Speed. Sliding Tile Attention (STA) drastically reduces the computational cost of video generation in Diffusion Transformers (DiTs) by focusing...

    OmniHuman-1: The Future of AI-Generated Human Animation

    Can ByteDance’s Breakthrough Outperform OpenAI’s Sora and Google’s Veo? OmniHuman-1 is a revolutionary AI model that transforms a single image into a lifelike video of a...

    AI Predicts Cancer Outcomes Using Clinical Notes and Genomic Data

    How Artificial Intelligence Is Transforming Cancer Prognosis and Treatment. AI-powered models using clinical notes and genomic data can predict cancer survival and treatment outcomes with...

    ACECODER: Acing Coder RL via Automated Test-Case Synthesis

    Code Model Training with Reinforcement Learning and Automated Test-Case Generation. Unlocking RL Potential in Code Models: ACECODER addresses the untapped potential of reinforcement learning (RL)...

    Meta: Transforming Text-to-Image Customization with Synthetic Data

    How Multi-Image Synthetic Data and Shared Attention Mechanisms Are Redefining AI-Generated Imagery. Synthetic Dataset Innovation: A new Synthetic Customization Dataset (SynCD) leverages 3D assets and...

    Virus: The Silent Threat to AI Safety – How Harmful Fine-Tuning Attacks Bypass Guardrails

    Large Language Models Are Vulnerable to Stealthy Attacks That Undermine Safety Alignment. Guardrails Aren’t Enough: Despite guardrail moderation systems designed to filter harmful data, a...

    DIFFSPLAT: 3D Content Creation by Bridging 2D and 3D Worlds

    How Image Diffusion Models Are Now Powering Scalable, High-Fidelity 3D Gaussian Splat Generation. 2D to 3D Leap: DiffSplat repurposes massive image diffusion models to generate...

    AI Takes the Director’s Chair: How Multi-Agent Systems Are Changing Film Production

    From Script to Screen: The Rise of LLM-Powered Virtual Filmmaking. Automated Creativity: FilmAgent leverages AI agents to mimic traditional film crew roles, streamlining tasks from...

    Video Depth Anything: A Breakthrough in Long-Video Depth Estimation

    ByteDance introduces a state-of-the-art solution for consistent, high-quality depth estimation in extended videos. Long-Video Capability: Video Depth Anything tackles the challenge of estimating depth for videos...

    Amphion: Unlocking Creativity in Audio and Music

    An open-source toolkit making advanced audio generation accessible to all. Accessible Innovation: Amphion simplifies audio, music, and speech generation for beginners and experts alike. Comprehensive Tools: It supports...

    DeepSeek R1: Open-Source AI That Challenges the Status Quo

    China's DeepSeek R1 delivers state-of-the-art reasoning at a fraction of the cost, setting a new standard for open-source AI innovation. Revolutionary Reasoning: DeepSeek R1, trained...

    Textoon: Crafting 2D Cartoon Magic in Seconds with Text Prompts

    A breakthrough framework for generating vibrant Live2D cartoon characters using the power of AI-driven text-to-image technology. Innovative Character Generation: Textoon transforms textual descriptions into vivid...