Announcing $106M round led by Salesforce Ventures
I am excited to share that we’ve raised $106M in a new round of financing led by Salesforce Ventures, with participation from Coatue and existing investors Lux Capital, Kleiner Perkins, Emergence Capital, Prosperity7 Ventures, NEA, Greycroft, Definition Capital, Long Journey Ventures, Factory, Scott Banister, and SV Angel. We are also thrilled to have participation from industry luminaries including Clem Delangue, CEO of Hugging Face, and Soumith Chintala, the creator of PyTorch.
Together AI has become one of the world’s favorite AI platforms for open-source models. Our serverless APIs for inference and fine-tuning have over 45,000 registered developers, with traffic growing 3x month over month. Together AI is now widely integrated into AI application development frameworks including LangChain, Vercel, LlamaIndex, MongoDB, and EmbedChain. Customers across industries are choosing our platform for its superior performance, reliability, economics, and our deep expertise in building AI systems that can achieve the highest levels of accuracy and scale.
The new funding will go towards continued expansion of our roadmap, including features to support large enterprises by providing a platform for teams across the organization to easily, effectively and securely build and deploy generative AI solutions on open-source and custom models. We also plan to expand our compute capacity internationally.
"Drawing from our extensive experience and thematic work in the AI infrastructure realm and fostering collaborations with enterprise clients alongside the Salesforce Research team, it's evident: companies are moving towards an open-source approach in AI adoption, often augmented by closed-source technologies," said Paul Drews, Managing Partner, Salesforce Ventures. "Together AI has emerged as a leading solution to meet this open-source demand, and we're fully confident in the team's ability to bring this ambitious vision to fruition."
Generative AI saw a major shift towards open-source in 2023. Models like Llama-2, Mixtral, Qwen, Gemma, and StripedHyena rapidly narrowed the benchmark gap between the best closed and open LLMs, and open models for speech, audio, and vision have in many cases exceeded the capabilities of the best closed models. A strong set of open-source tools and cloud services has emerged that allows millions of developers to build and shape AI systems. Together AI offers a platform to easily develop with and deploy open-source models and productionize them in enterprise-grade applications. Our capabilities grow with the best of what open-source AI has to offer, a frontier that is moving rapidly.
Generative AI is also reshaping the cloud computing landscape. AI compute is disaggregating from traditional hyperscalers, and GPU clouds became a fast-growing category in 2023. Together AI works with over 10 GPU cloud platforms today and offers a seamless AI-centric cloud experience built over a multi-cloud substrate. Our approach to disaggregated cloud computing insulates our customers from GPU shortages and allows them to scale rapidly around the globe. Today, our cloud network includes Crusoe Cloud, Applied Digital, Lambda Labs, Vultr, Oracle Cloud, and ClusterPower. We are expanding capacity with current partners and adding new ones in 2024.
"We believe that the future of generative AI depends on open research, open science, and trust between developers and enterprises," shared Jade Lai, Partner, Coatue. "Together AI has emerged as an open-source leader through its contributions to the community, including research on compute optimizations, novel model architectures, and data curation. Combined with their focus on the developer experience, Together AI has thoughtfully designed the right API abstractions that will position them well to be a de facto platform for developers to build production AI apps in the future."
At our core, we are a research-driven company, and we continue to advance the state of the art in open-source generative AI. We released StripedHyena and Mamba models, demonstrating a clear path for generative AI to move to more efficient architectures. We collaborated with the Arc Institute to create Evo, a long-context biological generative AI model based on the StripedHyena architecture that generalizes across the fundamental languages of biology: DNA, RNA, and proteins. Our Monarch Mixer (M2-BERT) long-context retrieval models support very long context lengths, outperform transformer-based BERT models, and are being deployed to build stronger RAG applications.
We also take a research-driven approach to building our AI platform and products. Our unique focus on optimization at all levels of the stack results in the industry’s fastest and most efficient platform for training, fine-tuning, and serving generative AI models. In just the past few months we released several major product updates: Embeddings, function calling and JSON mode, multiple RAG integrations, a Vercel integration, Together Inference Engine support for Mixtral, LlamaGuard, and new models like Gemma, Qwen, and more. The Together AI Research and Engineering process has been built to rapidly bring the latest innovations in generative AI to our customers with production-grade quality and scalability.
As organizations of all sizes develop their generative AI strategy, they are leveraging open-source models for the greater transparency, control, and ownership of what they build. Many of our customers are seeing success with fine-tuning an existing open-source model on private data and then deploying the model in conjunction with additional data sources through RAG, or in agent-style, multi-step, multi-model workflows.
Generative AI is a new computing platform, and we aim to help developers everywhere build and deploy the next generation of applications. We’re honored to be a part of building an open and collaborative future together with all of you!
— Vipul Ved Prakash, Co-founder and CEO