Serve GPT-J with the Industry’s Fastest Spin-up Times & Most Responsive Autoscaling
The Key Advantages of GPT-J
Trained on the Pile, EleutherAI’s 825 GiB language modelling data set, GPT-J is one of the largest open-source NLP models available.
- The quality of EleutherAI’s GPT-J training data helps it compete with, and at times outperform, the 175B-parameter GPT-3, despite operating at just 6B parameters.
- GPT-J is a more practical model to serve at scale, requiring far less memory and horsepower than GPT-3.
"EleutherAI has long had the ambition to create the world's largest open-source language model. We feel this is important for both AI safety and research. For large models, diverse data sets dramatically improve the quality of large models, which is why we chose to train our models on The Pile, a 825GB diverse, open source data set consisting of 22 smaller, high-quality datasets. GPT-J-6B, our largest model to date, is proving to be incredibly powerful given its size and quality." – EleutherAI
CoreWeave Clients Leverage GPT-J for Significant Gains
AI Dungeon CEO Nick Walton leveraged GPT-3 before running into huge frustrations with planning, costs and latency. He turned to CoreWeave to handle his company's compute infrastructure needs and was able to lower costs by 75% and reduce latency by 50%.
Given the quality of EleutherAI’s dataset and the infrastructure advantages on CoreWeave, AI Dungeon expects to realize massive performance gains as they migrate production traffic to GPT-J on CoreWeave.
After benchmarking across CoreWeave's wide variety of compute, AI Dungeon found that the NVIDIA A5000 delivered the best performance-adjusted cost, making it an incredibly efficient GPU for these workloads.
"We're able to serve GPT-J at a much better performance-adjusted cost compared to other language model services given its relative size and quality. CoreWeave's infrastructure and breadth of compute has allowed us to lean into scaling AI Dungeon further than ever before while providing our users an incredible experience with GPT-J at a price they can afford." – Nick Walton
NovelAI is currently serving its ‘Sigurd’ storytelling model, which is based on EleutherAI’s GPT-J and fine-tuned on novels and short stories. NovelAI CEO Eren Doğan has been impressed by how fast CoreWeave’s shared file system loads the GPT-J model.
“We are able to serve requests 3x faster after migrating to CoreWeave, leading to a much better user experience while saving 75% in cloud costs. For the users, this means the generation speeds will never slow down, even when there is peak load.” – Eren Doğan, NovelAI CEO.
Key Feature #1: One-Click Deployment
CoreWeave’s GPT-J Inference Service offers one-click GPT-J deployment, accessible within CoreWeave Apps.
One-click deployment of an API inference service eliminates infrastructure overhead, giving clients access to the full range of CoreWeave GPUs, including NVIDIA A5000s, which deliver an unmatched performance-adjusted cost. CoreWeave’s fleet of approximately 50,000 GPUs, accessible by deploying containerized workloads via Kubernetes, is always ready to work for you. See our full arsenal of hardware and the industry’s best pricing here.
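Once deployed, the service is reachable as an ordinary HTTP inference endpoint. The sketch below shows what a client call might look like; the endpoint URL and payload schema here are illustrative assumptions, not CoreWeave's documented API — the actual URL and schema come from your deployed service.

```python
import json
import urllib.request

# Hypothetical endpoint URL for illustration only; the real URL is
# assigned when you deploy the GPT-J Inference Service from CoreWeave Apps.
ENDPOINT = "https://gpt-j.example.tenant.coreweave.cloud/v1/generate"

def build_request(prompt: str, max_length: int = 128,
                  temperature: float = 0.8) -> urllib.request.Request:
    """Assemble an HTTP request for a text-generation call.

    The payload shape is an assumption for this sketch; consult the
    deployed service's documentation for the exact schema it expects.
    """
    payload = json.dumps({
        "instances": [{"text": prompt}],
        "parameters": {"max_length": max_length, "temperature": temperature},
    }).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Sending the request (requires a live deployment):
# with urllib.request.urlopen(build_request("Once upon a time")) as resp:
#     print(json.load(resp))
```

Because the service is plain HTTP behind Kubernetes, any language or tool that can POST JSON works the same way.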
Key Feature #2: Blazing Fast Spin-Up & Responsive Autoscaling
CoreWeave’s flexible, K8s-native infrastructure delivers the industry’s fastest spin-up times and most responsive autoscaling.
When scaled down, the CoreWeave Inference Service will consume zero resources and incur zero billing.
Scale-to-zero, combined with the industry’s broadest selection of GPUs, allows companies to serve GPT-J faster and more efficiently from anywhere in the world.
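One practical consequence of scale-to-zero: the first request after an idle period triggers a spin-up, so well-behaved clients retry with backoff rather than failing immediately. This is a minimal sketch of that pattern; the delay values and attempt counts are illustrative defaults, not CoreWeave-recommended settings.

```python
import time
import urllib.error
import urllib.request

def backoff_delay(attempt: int, base: float = 2.0) -> float:
    """Exponential backoff schedule: 2s, 4s, 8s, ... (illustrative values)."""
    return base * (2 ** attempt)

def request_with_coldstart_retry(req, attempts: int = 5,
                                 send=urllib.request.urlopen,
                                 sleep=time.sleep):
    """Send a request, retrying to absorb a possible cold start.

    When the service has scaled to zero, the first request may fail or
    stall while a pod spins up; retrying with backoff covers that window.
    `send` and `sleep` are injectable so the logic can be exercised
    without a live endpoint.
    """
    for attempt in range(attempts):
        try:
            return send(req, timeout=30)
        except urllib.error.URLError:
            if attempt == attempts - 1:
                raise  # out of retries; surface the error to the caller
            sleep(backoff_delay(attempt))
```

Thanks to CoreWeave's fast spin-up times, the retry window in practice stays short; the backoff simply keeps the client from hammering the service while it scales up.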
Key Feature #3: The Best Economics in the Industry
- Competitive pricing for on-demand compute.
- Industry's broadest range of NVIDIA GPUs delivering the best performance-adjusted cost. Click here for our latest benchmarks.
- Infrastructure advantages help make sure clients consume compute efficiently, rather than spending money on resources they're not using.
The Future of NLP
CoreWeave is working closely with EleutherAI to help push the boundaries of open source NLP and GPT-J by providing a massive, state-of-the-art DGX A100 distributed training cluster with full InfiniBand connectivity between nodes.
Throughout any project, CoreWeave engineers are an extension of your team, providing DevOps expertise to help you optimize workloads, benchmark them, and find the right GPU resources to train and serve your models.
At CoreWeave, Machine Learning is in our DNA, and our infrastructure reflects it. Whether you are training or deploying models, CoreWeave Cloud will reduce your setup time and improve performance.
CoreWeave is firmly committed to the NLP community and built CoreWeave Cloud with engineers in mind. We continue to empower companies who rely on NLP workloads, helping serve large language models on the best cloud infrastructure in the industry.
I would love to chat with you to see how CoreWeave can solve your GPT-J or NLP challenges. Contact me here.
Head of Business Development