Advancing 3D Scene Generation with Holistic Text-to-Image Models
HoloDreamer generates highly consistent 3D panoramic scenes from text descriptions.
The framework combines multiple diffusion models with 3D...
Solving Object Repetition in High-Resolution Image Generation
AccDiffusion addresses object repetition in patch-wise higher-resolution image generation.
The method uses patch-content-aware prompts and dilated...
Innovating Object Addition in Images with Text Guidance Alone
Diffree enables seamless text-guided object addition without compromising background consistency.
The model leverages the OABench dataset, enhancing...
Tailoring AI-Generated Images to Individual Tastes
ViPer personalizes image generation by capturing and applying individual visual preferences.
The system uses user comments to infer visual likes...
A New Benchmark for Realistic Human Image Animation and Camera Control
HumanVid introduces the first large-scale, high-quality dataset tailored for human image animation, combining real-world...
Introducing the First Comprehensive Benchmark for Complex Video Generation from Text Prompts
T2V-CompBench offers the first benchmark tailored for compositional text-to-video generation.
The benchmark includes diverse...
Revolutionizing Long-Form Video Generation with MovieDreamer
MovieDreamer combines autoregressive models and diffusion rendering for long-duration video generation.
The framework ensures narrative coherence and character consistency across...
Revolutionizing the Virtual Fashion Experience
OutfitAnyone utilizes a two-stream conditional diffusion model for lifelike virtual try-on experiences.
The technology adapts to various body shapes and poses,...
New Apple AI Models Outperform Competitors Mistral and Hugging Face
Apple releases DCLM models on Hugging Face, featuring 7-billion- and 1.4-billion-parameter variants.
The...
Real-Time, Fine-Grained 3D Scene Manipulation Made Possible
Click-Gaussian enables rapid and accurate segmentation of 3D Gaussians.
The Global Feature-guided Learning (GFL) method enhances segmentation accuracy.
The method...
Expanding Context Windows in Open-Source Code Models
IBM introduces Granite code models supporting up to 128K token context windows.
Lightweight continual pretraining and instruction tuning enhance...
Study Shows AI-Generated Responses Outperform Human Doctors in Empathy but Raise Readability Concerns
Superior Empathy: AI-generated responses rated higher in empathy compared to those written...