Dolly 2.0 is a 12-billion-parameter language model derived from the EleutherAI Pythia model series.
It was fine-tuned exclusively on a new dataset of high-quality, human-generated instruction-following data.
This dataset was crowdsourced among Databricks employees.
Databricks is now open-sourcing all of Dolly 2.0, including the training code, the dataset, and the model weights, all licensed for commercial use.
This means that any organization can create, own, and customize powerful LLMs that can talk to people, without paying for API access or sharing data with third parties.
Dolly 2.0 is freely available for researchers and developers to use and has been used in a variety of natural language processing applications, such as question answering, text completion, and language translation.
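To make "instruction-following data" more concrete, the sketch below shows what a single human-written record might look like and how it could be flattened into one training string. The class name, field names, and text layout here are illustrative assumptions, not the actual schema or prompt format used for Dolly 2.0.

```python
from dataclasses import dataclass


@dataclass
class InstructionRecord:
    """One human-written example. Field names are hypothetical."""
    instruction: str   # what the person asks the model to do
    response: str      # the human-written answer
    context: str = ""  # optional reference text the answer relies on


def to_training_text(record: InstructionRecord) -> str:
    """Join the record's fields into a single string (assumed layout)."""
    parts = [f"Instruction: {record.instruction}"]
    if record.context:
        # Only include the context line when the record has one.
        parts.append(f"Context: {record.context}")
    parts.append(f"Response: {record.response}")
    return "\n".join(parts)


example = InstructionRecord(
    instruction="Summarize the passage in one sentence.",
    context="Dolly 2.0 is a 12-billion-parameter language model.",
    response="Dolly 2.0 is a language model with 12 billion parameters.",
)
print(to_training_text(example))
```

Fine-tuning on many such instruction and response pairs is what teaches a base model to follow requests rather than merely continue text.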