

Advancing the frontier of open-source AI.

Our research team contributes cutting-edge models, datasets, and optimizations to the open-source community.


RedPajama provides a set of leading open-source foundation models built on the largest-ever open pre-training dataset.

  • 01 RedPajama-Data-30T

    The largest open-source pre-training dataset, used by over 500 leading generative AI models. This dataset and the open research approach used to create the RedPajama models are helping to advance the frontier of open-source AI.

    Learn more
  • 02 RedPajama-7B

    A suite of fully open-source base, instruction-tuned, and chat models.

    The instruct model is the highest-scoring open model on HELM benchmarks, making it ideal for a wide range of tasks. It outperforms LLaMA-7B and state-of-the-art open models such as Falcon-7B (Base and Instruct) and MPT-7B (Base and Instruct) on HELM by 2-9 points.

    Learn more
  • 03 RedPajama-3B

    The smaller RedPajama model is ideally suited for running on the edge, with support for iPhones, Android smartphones, Raspberry Pi, and other devices.

    Learn more


Innovations that make training and inference faster, more scalable, and reliable.

  • 01 FlashAttention-2

    This update to FlashAttention, now broadly used across transformer models, speeds up training and fine-tuning of LLMs by up to 9x and achieves 72% model FLOPs utilization when training on NVIDIA A100s.

    Learn more
  • 02 Sub-quadratic model architectures

    In collaboration with Hazy Research, we are actively working on the next core architecture for generative AI models, one that delivers much faster performance with longer context. Research in this area includes Hyena, Monarch Mixer, and FlashConv.

    Learn more
  • 03 Cocktail SGD

    One of the key challenges in training generative AI models is network communication. To enable faster, more reliable training in a distributed environment, we created Cocktail SGD – a set of optimizations that reduces network communication by 117x.

    Learn more


Read the latest research from our team and academic partners here →