Starling-7B: open-source model that beats the rest of the open-source field
This performs remarkably close to GPT-4 on chat benchmarks and already beats Claude-2. At this rate, open source may well outpace proprietary models.
The team behind Starling-7B has also released the Nectar dataset, the Starling-RM-7B-alpha reward model, and the Starling-LM-7B-alpha language model on HuggingFace, along with an online demo. Nectar consists of 183K chat prompts with 7 ranked responses each, yielding 3.8 million pairwise comparisons, and is designed to mitigate positional bias and improve model training.
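The 3.8 million figure follows directly from ranking all 7 responses per prompt: each prompt contributes "7 choose 2" unordered pairs. A quick sanity check of that arithmetic (the 183K and 7-response numbers are from the announcement; the rounding is mine):

```python
from math import comb

# Nectar: 183K chat prompts, 7 ranked responses each (per the release notes)
prompts = 183_000
responses_per_prompt = 7

# Each fully ranked set of 7 responses yields C(7, 2) unordered pairs
pairs_per_prompt = comb(responses_per_prompt, 2)
total_pairs = prompts * pairs_per_prompt

print(pairs_per_prompt)  # 21
print(total_pairs)       # 3843000, i.e. roughly the 3.8M pairwise comparisons cited
```

This is why relatively few prompts can produce millions of preference pairs for reward-model training.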
The project focuses on exploring the potential of RLHF (Reinforcement Learning from Human Feedback) and RLAIF (Reinforcement Learning from AI Feedback) for enhancing LLMs, particularly in chatbot applications. The team has conducted extensive evaluations and comparisons with other models, observing improvements in helpfulness and safety. However, they also note limitations, such as struggles with reasoning and mathematics tasks and susceptibility to jailbreaking prompts.
Details here: https://starling.cs.berkeley.edu/