
    OpenLLaMA: A Permissively Licensed Open Source Reproduction of LLaMA Language Model

    Public preview of the 7B OpenLLaMA model trained on 200 billion tokens released, with PyTorch and JAX weights available

    • OpenLLaMA is an open-source reproduction of Meta AI’s LLaMA large language model, trained on the RedPajama dataset.
    • Public preview of the 7B OpenLLaMA model has been released, with both PyTorch and JAX weights available for download on HuggingFace Hub.
    • OpenLLaMA shows comparable performance to the original LLaMA and GPT-J across most tasks, and outperforms them in some instances.

    The OpenLLaMA project has released an open-source reproduction of Meta AI’s LLaMA large language model under a permissive license. The public preview of the 7B OpenLLaMA model has been trained on 200 billion tokens, and both PyTorch and JAX weights of the pre-trained model are provided.

    The OpenLLaMA model is trained on the RedPajama dataset released by Together, a reproduction of the LLaMA training dataset containing over 1.2 trillion tokens. It follows the same preprocessing steps and training hyperparameters as the original LLaMA paper, with the only difference being the use of the RedPajama dataset instead of the original LLaMA dataset. The models are trained on cloud TPU-v4s using EasyLM, a JAX-based training pipeline developed for training and fine-tuning language models.

    In terms of evaluation, OpenLLaMA was assessed on a wide range of tasks using lm-evaluation-harness, with results compared against the original LLaMA model and GPT-J, a 6B-parameter model trained on the Pile dataset by EleutherAI. OpenLLaMA demonstrates comparable performance to the original LLaMA and GPT-J across most tasks and outperforms them in certain cases. Performance is expected to improve further once training on the full 1 trillion tokens is completed.
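    For readers who want to run a similar comparison themselves, the sketch below shows one way to drive lm-evaluation-harness from Python. The checkpoint path, task list, and the "hf-causal" adapter name are illustrative assumptions (the harness API has changed across versions), not the exact configuration used for the OpenLLaMA evaluation.

        # Minimal sketch: evaluating a causal LM checkpoint with lm-evaluation-harness.
        # The checkpoint path and task list are illustrative, not the exact OpenLLaMA setup;
        # the "hf-causal" adapter name corresponds to the v0.3-era harness API.
        from lm_eval import evaluator

        results = evaluator.simple_evaluate(
            model="hf-causal",                                       # HuggingFace causal-LM adapter
            model_args="pretrained=/path/to/open_llama_7b_preview",  # hypothetical local checkpoint
            tasks=["hellaswag", "arc_easy", "boolq"],
            num_fewshot=0,
            batch_size=8,
        )

        # Print the per-task metrics reported by the harness.
        for task_name, metrics in results["results"].items():
            print(task_name, metrics)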

    To encourage community feedback, a preview checkpoint of the weights has been released, available for download from HuggingFace Hub. The weights are provided in two formats: an EasyLM format for use with the EasyLM framework and a PyTorch format for use with the HuggingFace Transformers library. Instructions for using the weights in both frameworks can be found in their respective documentation.
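    As an illustration of the PyTorch path, the snippet below sketches how a LLaMA-style checkpoint is typically loaded with the HuggingFace Transformers library. The repository id is a placeholder; substitute the actual name the checkpoint is published under on HuggingFace Hub.

        # Minimal sketch: loading a LLaMA-style checkpoint with HuggingFace Transformers.
        # "openlm-research/open_llama_7b_preview" is a placeholder repository id, not
        # necessarily the name of the released checkpoint.
        import torch
        from transformers import LlamaForCausalLM, LlamaTokenizer

        model_path = "openlm-research/open_llama_7b_preview"  # placeholder id
        tokenizer = LlamaTokenizer.from_pretrained(model_path)
        model = LlamaForCausalLM.from_pretrained(
            model_path,
            torch_dtype=torch.float16,  # half precision so a 7B model fits on a single GPU
            device_map="auto",          # requires the `accelerate` package
        )

        prompt = "Q: What is the largest animal?\nA:"
        inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
        output = model.generate(**inputs, max_new_tokens=32)
        print(tokenizer.decode(output[0], skip_special_tokens=True))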

    The training framework EasyLM and the preview checkpoint weights are permissively licensed under the Apache 2.0 license. The current checkpoint is only a preview of the complete OpenLLaMA release; the focus now is on completing training on the entire RedPajama dataset. In addition to the 7B model, a smaller 3B model is in development to cater to low-resource use cases. Stay tuned for future updates and releases.

    GitHub
