    AI Papers

    QwQ-32B: Alibaba’s Open Answer to OpenAI’s Reasoning Model

    Challenging established norms with a “reasoning-first” AI that reflects its creators’ culture and ambition. A New Contender in Reasoning AI: Alibaba’s QwQ-32B-Preview aims to rival OpenAI’s...

    Meta’s ROICtrl: Transforming Visual Generation with Precise Instance Control

    A game-changing approach to multi-instance generation using ROI-Unpool and diffusion models. Enhanced Instance Control: ROICtrl allows for precise control of multiple instances in visual generation by...

    ShowUI from Microsoft: GUI Interaction with Vision-Language-Action AI

    A breakthrough in digital workflow assistants, bridging human-like perception and action for seamless GUI navigation. Enhanced Human-Like Interaction: ShowUI introduces a novel vision-language-action model, enabling more...

    OmniControl: A Leap in Image-Conditioned Diffusion Transformers

    Streamlined, scalable, and precise—OmniControl reshapes how we generate and control images using AI. OmniControl introduces an efficient framework for image-conditioned control in diffusion models, requiring...

    From Image to 3D in Seconds: Adobe’s DiffusionGS Model

    Adobe introduces DiffusionGS, a breakthrough in fast and scalable image-to-3D creation. Adobe unveils DiffusionGS, a cutting-edge 3D diffusion model, generating consistent 3D outputs from single 2D...

    AIMV2: Apple’s Multimodal Revolution in Vision Encoding

    Redefining AI with scalable pre-training for images and text integration. Apple introduces AIMV2, a family of large-scale vision encoders excelling in multimodal tasks. AIMV2 leverages autoregressive...

    Alibaba’s Marco-o1: Pioneering Open-Ended Reasoning in AI

    With advanced techniques like Chain-of-Thought and Monte Carlo Tree Search, Marco-o1 sets a new standard for tackling complex, ambiguous challenges. Beyond the Metrics: Marco-o1 addresses the...

    FlipSketch: Breathing Life Into Your Doodles

    Sketch animation with AI-powered simplicity and creativity. Effortless Animation: FlipSketch transforms static sketches into smooth animations with just a drawing and a text description. AI Innovation: Combines text-to-video...

    RedPajama: The Future of Transparent and Open-Source Language Model Training

    How RedPajama datasets are redefining AI development with transparency, scalability, and versatility. Transparency in AI Training: RedPajama introduces an unprecedented level of openness in dataset composition,...

    AnimateAnything: Transforming Video Creation with Seamless Control and Precision

    The groundbreaking framework for consistent, customizable video generation opens new doors for filmmakers and VR designers. Versatile Control: AnimateAnything enables precise video manipulation through camera trajectories,...

    SAMPart3D: A Breakthrough in Zero-Shot 3D Object Segmentation for Complex Models

    Achieving scalable and flexible part-level segmentation without text prompts, SAMPart3D enables advanced 3D editing and model customization. Text-Free, Scalable Segmentation: SAMPart3D removes the need for...

    OMNI-EDIT: The Ultimate Image Editor with Multi-Task Capabilities for Any Aspect Ratio

    OMNI-EDIT leverages specialist guidance to tackle seven unique editing tasks, achieving unprecedented accuracy and quality in real-world image editing. Multi-Task Capability: OMNI-EDIT is designed to...