    AI Papers

    Faster, Sharper, and Smarter: Infinity Outpaces Diffusion Models in Quality and Speed

    Infinity: Redefining High-Resolution Text-to-Image Synthesis with Bitwise AutoRegressive Modeling. Innovative Framework: Infinity introduces bitwise token modeling, infinite-vocabulary tokenization, and self-correction mechanisms to overcome traditional AutoRegressive model...

    Game On: DeepMind’s MAV Model Brings Grandmaster-Level AI to Chess and Beyond

    From Hallucination-Free Play to Grandmaster Elo Ratings, MAV Redefines AI Strategy and Planning. Integrated Decision-Making: The Multi-Action-Value (MAV) model combines state tracking, planning, and action evaluation...

    SNOOPI: Setting a New Benchmark for One-Step Diffusion Models

    Introducing Dynamic Guidance and Negative Prompt Integration for Superior Image Generation. Enhanced Stability: SNOOPI introduces Proper Guidance - SwiftBrush (PG-SB) to stabilize training by dynamically adjusting...

    Google’s Advanced AI Model Delivers Faster, More Accurate Weather Forecasts

    GenCast: Revolutionizing Weather Forecasting with AI Precision. State-of-the-Art Forecasting: GenCast, Google’s new AI weather model, predicts weather conditions and risks with unprecedented accuracy up to 15...

    AnchorCrafter: Transforming Product Promotion with Human-Object Interactive Videos

    A New Era of Automation for Anchor-Style Advertising and Consumer Engagement. Revolutionizing Product Promotion Videos: AnchorCrafter brings a new level of automation to anchor-style advertising by...

    CAT4D: Bringing Dynamic 3D Scenes to Life from Monocular Videos

    Revolutionizing 4D Scene Generation with Multi-View Video Diffusion Models. Reimagining the World in 4D: CAT4D transforms standard monocular videos into dynamic 3D scenes, offering unprecedented realism...

    Breaking the Puzzle from Nvidia: LLM Efficiency for Real-World Applications

    How NVIDIA’s Puzzle Framework Redefines Language Model Optimization for Scalable AI. Cost-Effective AI Scalability: NVIDIA’s Puzzle framework tackles the growing issue of high inference costs in...

    QwQ-32B: Alibaba’s Open Answer to OpenAI’s Reasoning Model

    Challenging established norms with a “reasoning-first” AI that reflects its creators’ culture and ambition. A New Contender in Reasoning AI: Alibaba’s QwQ-32B-Preview aims to rival OpenAI’s...

    Meta’s ROICtrl: Transforming Visual Generation with Precise Instance Control

    A game-changing approach to multi-instance generation using ROI-Unpool and diffusion models. Enhanced Instance Control: ROICtrl allows for precise control of multiple instances in visual generation by...

    ShowUI from Microsoft: GUI Interaction with Vision-Language-Action AI

    A breakthrough in digital workflow assistants, bridging human-like perception and action for seamless GUI navigation. Enhanced Human-Like Interaction: ShowUI introduces a novel vision-language-action model, enabling more...

    OmniControl: A Leap in Image-Conditioned Diffusion Transformers

    Streamlined, scalable, and precise: OmniControl reshapes how we generate and control images using AI. OmniControl introduces an efficient framework for image-conditioned control in diffusion models, requiring...

    From Image to 3D in Seconds: Adobe’s DiffusionGS Model

    Adobe introduces DiffusionGS, a breakthrough in fast and scalable image-to-3D creation. Adobe unveils DiffusionGS, a cutting-edge 3D diffusion model, generating consistent 3D outputs from single 2D...