DBRX
DBRX is an open-sourced large language model (LLM) developed by MosaicML, a subsidiary of Databricks, and released on March 27, 2024. It is a mixture-of-experts transformer model with 132 billion parameters in total, of which 36 billion (4 of its 16 experts) are active for each token. The model was released in two variants: a base foundation model and an instruction-tuned model.
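The sparse activation described above can be illustrated with a minimal sketch. The PyTorch layer below is a generic top-k mixture-of-experts feed-forward block, not the DBRX implementation; the dimensions (`d_model`, `d_ff`) and the simple token-loop routing are illustrative assumptions chosen for readability.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Illustrative top-k mixture-of-experts feed-forward layer (not DBRX's actual code)."""

    def __init__(self, d_model: int, d_ff: int, n_experts: int = 16, top_k: int = 4):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (n_tokens, d_model); the router scores each token against every expert
        logits = self.router(x)
        weights, chosen = logits.topk(self.top_k, dim=-1)   # keep the 4 best experts per token
        weights = F.softmax(weights, dim=-1)                 # normalise their mixing weights
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            idx = chosen[:, slot]
            for e in idx.unique().tolist():
                mask = idx == e
                out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[e](x[mask])
        return out

# Only 4 of the 16 expert networks run for any given token, so the parameters
# used per token are a fraction of the layer's total parameter count.
layer = TopKMoE(d_model=512, d_ff=2048)
tokens = torch.randn(8, 512)
print(layer(tokens).shape)  # torch.Size([8, 512])
```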
At the time of its release, DBRX outperformed other prominent open-source models such as Meta's LLaMA 2, Mistral AI's Mixtral, and xAI's Grok on several benchmarks covering language understanding, programming, and mathematics.
It was trained for 2.5 months on 3,072 Nvidia H100 GPUs connected by 3.2 terabytes per second of InfiniBand bandwidth, at a training cost of US$10 million.
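A rough back-of-envelope check of these figures, assuming roughly 30.4 days per month and that all GPUs were in use for the full run, gives the approximate compute scale implied by the reported numbers:

```python
# Assumptions: ~30.4 days per month, full utilisation of all 3,072 GPUs.
gpus = 3072
months = 2.5
hours = months * 30.4 * 24          # ~1,824 wall-clock hours
gpu_hours = gpus * hours            # ~5.6 million H100 GPU-hours
cost_usd = 10_000_000
print(f"GPU-hours: {gpu_hours:,.0f}")
print(f"Implied cost per GPU-hour: ${cost_usd / gpu_hours:.2f}")  # ~$1.80
```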