    AI Papers

    EMMA Image Generation with Multi-Modal Prompts

    How the New AI Model Balances Text and Visual Inputs for Superior Results
    EMMA integrates multi-modal prompts, combining text with additional visual cues for image...

    Can AI Understand Commonsense?

    Challenging Text-to-Image Models with Real-Life Scenarios
    Commonsense-T2I evaluates whether text-to-image models can produce images that are consistent with common sense. Current state-of-the-art models struggle with accuracy, highlighting a...

    Physics3D: 3D Object Simulation with Video Diffusion Models

    Bridging the Gap Between Real and Virtual Physics
    Physics3D integrates physical properties into 3D object modeling for realistic simulations. The method utilizes a video diffusion model...

    Imitative Editing: Transforming Image Editing with MimicBrush

    Image Editing with AI-Powered Imitative Techniques
    MimicBrush introduces "imitative editing," allowing users to edit images using reference images without the need for detailed text descriptions. The...

    Microsoft presents VALL-E 2: The Next Step in Zero-Shot Text-to-Speech Synthesis

    Achieving Human Parity with Advanced Neural Codec Language Models
    Human Parity Achieved: VALL-E 2 marks the first instance of achieving human parity in zero-shot text-to-speech synthesis.
    Enhanced...

    Advancing AI Art: Multistep Consistency Distillation of Latent Diffusion Models

    New Model Enhances Image Synthesis Speed and Quality
    Unified Model: MLCM offers a single model for various sampling steps, improving efficiency.
    Progressive Training: Enhances inter-segment consistency, boosting image...

    Snapchat presents SF-V: Single Forward Video Generation Model Video Synthesis

    Adversarial training reduces computational costs while maintaining high-quality video generation.
    Efficiency Boost: The new SF-V model achieves video generation in a single step, significantly speeding up...

    Future You: AI-Generated Future Self Chats Reduce Anxiety and Boost Wellbeing

    Interactive Conversations with AI-Generated Future Selves Enhance Mental Health
    AI-Powered Future Self: The "Future You" intervention uses AI to create a realistic, interactive conversation with...

    MatMul-Free Models: A New Frontier in Efficient Language Processing

    Eliminating Matrix Multiplication in Language Models Reduces Computational Costs While Maintaining Performance
    Significant Memory Savings: MatMul-free models reduce memory usage by up to 61% during...

    SketchDeco: Simplifying Sketch Colorization with AI

    New AI tool SketchDeco simplifies the process of adding color to black-and-white sketches, combining precision with user-friendly design.
    Intuitive Control with Region Masks and Color...

    VideoTetris: Text-to-Video Generation with Compositional Prompts

    New AI model VideoTetris tackles the challenge of generating complex, long-form videos from text prompts, offering improved spatial and temporal composition.
    Enhanced Video Generation: VideoTetris...

    Photo-Inspired Diffusion Operators: A New Approach in Visual Content Generation

    Leveraging the Semantic Power of CLIP for Enhanced Image Manipulation
    Introduction of pOps Framework: pOps trains specific semantic operators directly on CLIP image embeddings, allowing...