With a beta release on Tuesday, February 2nd, GPT-NeoX-20B is now the largest publicly accessible language model. At 20 billion parameters, GPT-NeoX-20B is a powerhouse that was trained on The Pile, EleutherAI’s curated collection of datasets.
When EleutherAI developed The Pile, an 825 GB dataset, no public datasets suitable for training language models of this size and quality existed. The Pile is now widely used as a training dataset for many current cutting-edge models, including the Beijing Academy of Artificial Intelligence’s Wu Dao (1.75T parameters, multimodal), AI21’s Jurassic-1 (178B parameters), Anthropic’s language assistant (52B parameters), and Microsoft and NVIDIA’s Megatron-Turing NLG (530B parameters).
Why is the launch of GPT-NeoX-20B significant?
In short, GPT-NeoX-20B is more accessible to developers, researchers, and tech founders because it is fully open-source and less expensive to serve compared to similar models of its size and quality.
- The GPT-NeoX-20B codebase will offer straightforward configuration using YAML files, which enable users to launch training runs across hundreds of GPUs with a single bash command.
- GPT-NeoX-20B will be far cheaper to deploy than GPT-3 on a performance-adjusted basis.
- For developers who are currently using OpenAI’s GPT-3 API, any applications that rely solely on prompting are likely to work with GPT-NeoX-20B with only minor modifications.
Performance Comparison by Model
The following table compares GPT-NeoX-20B's performance with that of other publicly available NLP models at answering factual questions in a variety of domains. GPT-NeoX-20B outperforms its peers by a statistically significant margin:
The EleutherAI team has always been very excited about the potential and importance of language models such as GPT-3, but has been concerned that the closed-source nature and expensive training costs of such models represent significant hurdles to researchers interested in studying and using them.
EleutherAI believes that AI safety is massively important for society to tackle today, and hopes that open access to cutting-edge models will allow more such research to be done on state-of-the-art systems.
“From spam and astroturfing to chatbot addiction, there are clear harms that can manifest from the use of these models already today, and we expect the alignment of future models to be of critical importance. We think the acceleration of safety research is extremely important; and the benefits of having an open source model of this size and quality available for that research outweigh the risks.” – Connor Leahy, EleutherAI
GPT-NeoX-20B is a glimpse into the next generation of what powerful AI systems could look like, and EleutherAI hopes to remove the current barriers to research on the understanding and safety of such models.
How can you get started with EleutherAI’s GPT-NeoX-20B?
If you are looking to serve GPT-NeoX-20B without managing infrastructure, the model is available today via a fully managed API endpoint on GooseAI. GooseAI delivers feature parity with the OpenAI API at up to 70% lower cost.
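To illustrate what "only minor modifications" might look like for a prompt-only application, here is a minimal Python sketch. It builds, but does not send, the same OpenAI-style completion request for two providers. The GooseAI base URL, the engine name `gpt-neo-20b`, the engines-style URL path, and the helper `build_completion_request` are all assumptions made for illustration, not details confirmed in this post; check each provider's documentation before relying on them.

```python
import json
import urllib.request

# Assumed values for illustration only; verify against provider docs.
OPENAI_BASE = "https://api.openai.com/v1"
GOOSEAI_BASE = "https://api.goose.ai/v1"   # assumption
GOOSEAI_ENGINE = "gpt-neo-20b"             # assumption


def build_completion_request(base_url, engine, api_key, prompt, max_tokens=64):
    """Build (but do not send) an OpenAI-style completion request.

    Hypothetical helper: the engines-style path below is assumed.
    """
    body = json.dumps({"prompt": prompt, "max_tokens": max_tokens}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/engines/{engine}/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# The same prompt and parameters, pointed at two different providers.
prompt = "Q: What is The Pile?\nA:"
openai_req = build_completion_request(OPENAI_BASE, "davinci", "OPENAI_KEY", prompt)
goose_req = build_completion_request(GOOSEAI_BASE, GOOSEAI_ENGINE, "GOOSEAI_KEY", prompt)

# Sending is left to the reader, e.g. urllib.request.urlopen(goose_req).
```

In this sketch the request body is identical for both providers; only the base URL, engine name, and API key change, which is the kind of small change the feature-parity claim suggests.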
If you want to manage your own infrastructure, you can leverage the benefits of the industry's broadest range of NVIDIA GPUs, fastest spin-up times, and most responsive auto-scaling on CoreWeave Cloud.