A Collaborative Effort from UC Berkeley, CMU, Stanford, and UC San Diego Surpasses Expectations
A team of researchers from UC Berkeley, CMU, Stanford, and UC San Diego has introduced Vicuna-13B, an open-source chatbot that, in an evaluation judged by GPT-4, achieves more than 90% of the quality of OpenAI’s ChatGPT and Google Bard. Trained by fine-tuning LLaMA on user-shared conversations from ShareGPT, Vicuna-13B outperforms other models such as LLaMA and Stanford Alpaca in over 90% of cases in the same evaluation. The cost of training the chatbot was around $300, and the team has made the training and serving code, as well as an online demo, publicly available for non-commercial use.
The Vicuna-13B project was inspired by the Meta LLaMA and Stanford Alpaca projects and aimed to enhance the dataset and create an easy-to-use, scalable infrastructure. The team collected approximately 70K conversations from ShareGPT.com and fine-tuned the LLaMA base model with a series of optimizations and adjustments, which enabled the chatbot to generate more detailed and well-structured answers compared to Alpaca.
To evaluate Vicuna-13B, the team used GPT-4 as a judge, asking it to compare Vicuna-13B’s answers with those of other models. By that measure, Vicuna-13B reaches more than 90% of ChatGPT’s quality. The training process included memory optimizations, multi-round conversation handling, and cost reduction by leveraging spot instances.
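To make the two headline figures concrete, here is a minimal sketch of how judge scores could be turned into a "percent of ChatGPT quality" number and an "outperforms in X% of cases" number. The aggregation rules (ratio of summed scores, strict wins per question), the function names, and the sample scores are all assumptions for illustration; the article does not state how the team aggregated GPT-4's ratings.

```python
from typing import List, Tuple

# Each pair is (candidate_score, baseline_score) for one question,
# as a judge model might assign them. The values below are made up.
ScorePairs = List[Tuple[float, float]]


def relative_quality(pairs: ScorePairs) -> float:
    """Summed candidate score divided by summed baseline score (assumed rule)."""
    candidate_total = sum(c for c, _ in pairs)
    baseline_total = sum(b for _, b in pairs)
    if baseline_total == 0:
        raise ValueError("baseline scores sum to zero; ratio is undefined")
    return candidate_total / baseline_total


def win_rate(pairs: ScorePairs) -> float:
    """Fraction of questions where the candidate scores strictly higher."""
    if not pairs:
        raise ValueError("no score pairs given")
    wins = sum(1 for c, b in pairs if c > b)
    return wins / len(pairs)


if __name__ == "__main__":
    hypothetical = [(9.0, 10.0), (8.0, 8.0), (9.0, 8.0), (7.0, 9.0)]
    print(f"relative quality: {relative_quality(hypothetical):.2%}")
    print(f"win rate: {win_rate(hypothetical):.2%}")
```

The two numbers answer different questions: the ratio compares overall score mass against one baseline, while the win rate counts per-question victories, so a model can score high on one and low on the other.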
The serving system built for Vicuna-13B is capable of serving multiple models with distributed workers and supports the integration of GPU workers from both on-premise clusters and the cloud. Despite some limitations, such as difficulties with reasoning, mathematics, and safety concerns, Vicuna-13B is expected to serve as a starting point for future research in this field.
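The idea of serving several models through distributed workers can be illustrated with a small sketch. This is not the project's serving code: the `Controller` class, its method names, the round-robin policy, and the worker addresses are hypothetical, chosen only to show how a dispatcher might route requests to GPU workers regardless of whether they run on-premise or in the cloud.

```python
from collections import defaultdict
from typing import DefaultDict, List


class Controller:
    """Toy dispatcher: tracks which workers serve which model (hypothetical)."""

    def __init__(self) -> None:
        self._workers: DefaultDict[str, List[str]] = defaultdict(list)
        self._next_index: DefaultDict[str, int] = defaultdict(int)

    def register(self, model: str, address: str) -> None:
        """Add a worker for a model; duplicate registrations are ignored."""
        if address not in self._workers[model]:
            self._workers[model].append(address)

    def pick(self, model: str) -> str:
        """Return the next worker for the model in round-robin order."""
        workers = self._workers.get(model)
        if not workers:
            raise LookupError(f"no worker registered for model {model!r}")
        index = self._next_index[model] % len(workers)
        self._next_index[model] = index + 1
        return workers[index]


if __name__ == "__main__":
    controller = Controller()
    # Addresses are placeholders; on-premise and cloud workers look the same.
    controller.register("vicuna-13b", "onprem-gpu-0:21002")
    controller.register("vicuna-13b", "cloud-gpu-0:21002")
    controller.register("alpaca-13b", "cloud-gpu-1:21002")
    for _ in range(3):
        print(controller.pick("vicuna-13b"))
```

Because the dispatcher only stores an address per worker, adding capacity from a different environment is a matter of registering another address under the same model name.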
The team has released the training, serving, and evaluation code on GitHub and plans to share the model weights in the future. They have also set up a Discord server and Twitter account for updates and encourage the community to test the chatbot through their online demo.
As the Vicuna-13B project progresses, the team behind it is committed to refining the chatbot and addressing its current limitations. If the team keeps improving the model and incorporating feedback from users and the research community, Vicuna-13B has the potential to become an even more powerful AI tool.
Future research directions may include enhancing the chatbot’s capabilities in areas such as reasoning and mathematical problem-solving. Additionally, efforts to optimize safety, mitigate potential toxicity, and reduce biases within the model will be essential to ensure responsible AI development.
Key points:
• Vicuna-13B is an open-source chatbot that reaches more than 90% of ChatGPT’s quality in a GPT-4-judged evaluation
• Fine-tuned from LLaMA on user-shared conversations from ShareGPT
• Outperforms LLaMA and Stanford Alpaca in that evaluation and cost only around $300 to train
• Training and serving code and online demo available for non-commercial use
As more researchers and engineers contribute to the project, Vicuna-13B could inspire new open-source chatbot solutions and promote a collaborative approach to AI research. By sharing their work and encouraging feedback, the team hopes to foster innovation and growth within the AI community, pushing the boundaries of what chatbot technology can achieve.
Official Website: https://vicuna.lmsys.org