Tech

HuatuoGPT-Vision Injects Medical Visual Knowledge into Multimodal Models

New dataset boosts medical capabilities of large language models
PubMedVision dataset refines medical image-text pairs to enhance multimodal large language models (MLLMs). HuatuoGPT-Vision, trained on PubMedVision,...

AI Outperforms University Students in Exam Study

AI-generated answers achieve higher grades and evade detection in university exams
AI-generated answers scored higher than real students in undergraduate exams. 94% of AI essays went...

AI Voice Clones Threaten Jobs of 5,000 Australian Actors

The rise of AI vocal technology could upend creative fields like audiobooks and voice acting
AI voice clones are beginning to replace human voice actors...

Amazon Strengthens AI Capabilities by Hiring Top Executives from Adept and Licensing Its Technology

Amazon bolsters its AGI development with strategic hires and licensing agreements
Strategic Hiring and Licensing: Amazon hires key executives from Adept and licenses its AI...

Bridging the Gap in AI: OMG-LLaVA’s Comprehensive Image and Text Reasoning Capabilities

Integrating pixel-level understanding with powerful reasoning for advanced multimodal interactions
Unified Model Architecture: OMG-LLaVA combines image-level, object-level, and pixel-level reasoning within a single framework, enhancing...

YOUDREAM: Text-to-3D Animal Generation

A breakthrough in 3D generation with text-to-image diffusion models
YOUDREAM generates high-quality, anatomically controllable 3D animals using a text-to-image diffusion model guided by 2D views...

Florence-2: Vision Tasks with Unified Representation

Florence-2 integrates diverse vision and vision-language tasks through a novel prompt-based model.
Florence-2 utilizes a unified, prompt-based approach for various vision and vision-language tasks. The model...

Apple’s New AI Models: Boosting On-Device and Server Capabilities

Apple introduces AI advancements with on-device and server models for enhanced user experience
Apple unveils a 3 billion parameter on-device language model and a more...

Claude 3.5 Sonnet: A New Era in AI Intelligence

Anthropic’s Latest AI Model Outshines Competitors with Speed, Cost Efficiency, and Versatility
Superior Performance: Claude 3.5 Sonnet outperforms competitor models and its predecessor in key evaluations,...

Glyph-ByT5-v2 Sets New Standard for Multilingual Text Rendering

The breakthrough in visual text rendering supports 10 languages with improved aesthetic quality
Multilingual Capability: Glyph-ByT5-v2 and Glyph-SDXL-v2 accurately render text in 10 languages. Enhanced Aesthetics: The models...

People Struggle to Differentiate Between Humans and AI in Five-Minute Chats

The Surprising Accuracy of GPT-4 in Mimicking Human Conversation
Confounding Conversations: In Turing test experiments, participants mistook GPT-4 for a human 54% of the time. Rising Concerns: The...

ChatGPT’s Impact on the Freelance Market: A Looming Challenge

How AI is Reshaping the Demand for Digital Freelancers
Significant Decline: A 21% drop in demand for digital freelancers in writing and coding since ChatGPT's launch. Automation-Prone...