    AI Papers

    The Mamba in the Llama: Hybrid Models for Speed and Efficiency

    How Distilling Transformers into Linear RNNs Enhances Performance and Speeds Up Inference. Distilling Transformers into Linear RNNs: The research demonstrates that it is possible to...

    MagicMan: 3D Human Creation with AI-Powered View Synthesis

    Discover how MagicMan's innovative AI brings life to 3D humans from just a single image, offering unparalleled quality and consistency in digital reconstruction. Single-Image to...

    Sapiens from Meta: Redefining Human Vision Models for the Future of AI

    How Sapiens is transforming human-centric AI with groundbreaking performance in 2D pose estimation, depth, segmentation, and more. Comprehensive Human Vision Models: Sapiens offers a suite of...

    DreamCinema: The Future of Filmmaking is Here with AI-Powered Cinematic Transfers

    How DreamCinema is revolutionizing film creation by allowing anyone to create high-quality 3D movies with free cameras and AI-generated characters. Cinematic Elements at Your Fingertips: DreamCinema...

    SpaRP: 3D Object Reconstruction from Sparse Views

    Fast, accurate, and user-controlled 3D modeling with SpaRP—making 3D content creation easier than ever. Fast 3D Reconstruction: SpaRP reconstructs 3D textured models from sparse, unposed images...

    MeshFormer: 3D Mesh Generation with Efficient, High-Quality Results

    How MeshFormer pushes the boundaries of 3D reconstruction with smart architecture, fewer resources, and better performance. Efficient 3D Mesh Generation: MeshFormer introduces a unified, single-stage model...

    SurgSAM-2: A New Era of Real-Time Surgical Video Segmentation

    How SurgSAM-2 revolutionizes surgical precision with efficient, real-time video processing and segmentation. Cutting-Edge Efficiency: SurgSAM-2 introduces an Efficient Frame Pruning (EFP) mechanism to improve both speed...

    TurboEdit: Real-Time Image Editing with Text Prompts

    How TurboEdit brings instant, precise image manipulation through cutting-edge diffusion models. Instant Image Editing: TurboEdit uses a few-step diffusion model and an innovative encoder-based inversion technique,...

    Unveiling xGen-MM (BLIP-3): The Future of Open Large Multimodal Models

    How xGen-MM is revolutionizing AI with cutting-edge datasets, powerful multimodal models, and open-source innovation. Advanced AI Framework: xGen-MM (BLIP-3) is a state-of-the-art framework for building Large...

    Google DeepMind Explores a New Frontier in Image Classification with Flexible Visual Memory

    A new approach to dynamic AI that blends neural networks with a database-like memory system for adaptable image classification. Dynamic Knowledge Representation: Google DeepMind proposes...

    DeepSeek-Prover V1.5: Enhancing Theorem Proving with Reinforcement Learning and Advanced Search Techniques

    New advancements in AI-powered proof assistants bring a 63.5% success rate in formal theorem proving benchmarks. Reinforcement Learning Feedback Boosts Performance: DeepSeek-Prover V1.5 leverages reinforcement learning...

    FancyVideo Aims to Revolutionize Video Generation with Enhanced Temporal Consistency

    New cross-frame textual guidance module promises more dynamic and coherent videos from AI models. Temporal Logic Improvements: FancyVideo introduces a new framework to improve temporal consistency...