    AI Papers

    TexGen: 3D Texture Generation with Multi-view Sampling

    Innovative Framework Enhances Texture Quality and Consistency for 3D Meshes
    Seamless Textures: TexGen eliminates prominent seams and excessive smoothing in 3D textures using a multi-view sampling...

    Tora: Video Generation with Trajectory-Oriented Diffusion Transformers

    Exploring Tora’s Potential in Motion-Controllable Video Creation
    Innovative Framework: Tora integrates text, image, and trajectory inputs for precise motion-controlled video generation. High Fidelity: Achieves high-quality video output with...

    The Llama 3 Herd of Models

    Multilinguality, Coding, Reasoning, and Tool Usage in a New Set of AI Foundation Models
    Llama 3's Capabilities: The Llama 3 models support multilinguality, coding, reasoning, and...

    Cycle3D: High-quality and Consistent Image-to-3D Generation

    Advancing 3D Content Creation through a Generation-Reconstruction Cycle
    Cycle3D combines 2D diffusion-based generation with 3D reconstruction for superior image-to-3D conversion. The framework enhances the quality and...

    Theia: Distilling Diverse Vision Foundation Models for Robot Learning

    Enhancing Robot Learning with Rich Visual Representations
    Theia leverages multiple vision foundation models to improve robot learning. The model outperforms previous approaches with less training data...

    Text2Place: Affordance Aware Human Guided Placement

    Advancing Realistic Human Insertion in Diverse Backgrounds
    Text2Place generates realistic human placements in various scenes using text guidance. The method utilizes semantic masks and subject-conditioned inpainting...

    HoloDreamer: Transforming Text into 3D Panoramic Worlds

    Advancing 3D Scene Generation with Holistic Text-to-Image Models
    HoloDreamer generates highly consistent 3D panoramic scenes from text descriptions. The framework combines multiple diffusion models with 3D...

    AccDiffusion: An Accurate Method for Higher-Resolution Image Generation

    Solving Object Repetition in High-Resolution Image Generation
    AccDiffusion addresses the issue of object repetition in patch-wise higher-resolution image generation. The method uses patch-content-aware prompts and dilated...

    Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model

    Innovating Object Addition in Images with Text Guidance Alone
    Diffree enables seamless text-guided object addition without compromising background consistency. The model leverages the OABench dataset, enhancing...

    ViPer: Visual Personalization of Generative Models via Individual Preference Learning

    Tailoring AI-Generated Images to Individual Tastes
    ViPer personalizes image generation by capturing and applying individual visual preferences. The system uses user comments to infer visual likes...

    HumanVid: Demystifying Training Data for Camera-Controllable Human Image Animation

    A New Benchmark for Realistic Human Image Animation and Camera Control
    HumanVid introduces the first large-scale, high-quality dataset tailored for human image animation, combining real-world...

    T2V-CompBench: Setting a New Standard for Compositional Text-to-Video Generation

    Introducing the First Comprehensive Benchmark for Complex Video Generation from Text Prompts
    T2V-CompBench offers the first benchmark tailored for compositional text-to-video generation. The benchmark includes diverse...