Open-source language models match proprietary giants with efficiency breakthroughs
- AI21 Labs released Jamba, an efficient open-source language model that interleaves Transformer layers with state space model (Mamba) layers, matching the quality of larger proprietary models at lower cost.
- Jamba fits on a single 80GB GPU and offers a 256K-token context window, the longest of any open model at its release; thanks to the state space layers, its KV cache at full context needs only about 4GB of memory (a minimal loading sketch follows).
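The hybrid architecture does not change how the model is used. Below is a minimal sketch of long-context generation with Hugging Face transformers, assuming the publicly listed ai21labs/Jamba-v0.1 checkpoint; the exact library version requirement is an assumption:

```python
# Minimal sketch: load Jamba and generate text with Hugging Face
# transformers. Assumes the ai21labs/Jamba-v0.1 checkpoint; recent
# transformers releases include native Jamba support (older ones
# may require trust_remote_code=True).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai21labs/Jamba-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the checkpoint's native precision
    device_map="auto",    # place weights on the available GPU(s)
)

prompt = "Summarize the key findings of the following report:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```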
- Databricks introduced DBRX, an open-source "mixture of experts" model (sketched below) that beats GPT-3.5 on standard benchmarks while activating only 36B of its 132B parameters for any given token.
- DBRX generates text faster than comparably sized models such as LLaMA2-70B, since each token runs through only the routed experts rather than the full network.
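To make the mixture-of-experts idea concrete, here is a toy top-k routing layer in PyTorch. The dimensions, expert count, and top-2 routing below are illustrative assumptions, not DBRX's actual configuration:

```python
# Toy top-k mixture-of-experts layer: a router scores the experts for
# each token, and only the k best-scoring experts run. Sizes here are
# illustrative, not DBRX's real configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=64, d_ff=256, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)  # per-token expert scores
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (tokens, d_model)
        scores = self.router(x)                     # (tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)  # keep the k best experts
        weights = F.softmax(weights, dim=-1)        # normalize over the chosen k
        out = torch.zeros_like(x)
        for slot in range(self.k):                  # run only the selected experts
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
        return out

tokens = torch.randn(10, 64)
print(TopKMoE()(tokens).shape)  # torch.Size([10, 64])
```

Because only k of the expert feed-forward blocks run per token, compute scales with the active parameters rather than the total, which is how a large sparse model can generate faster than a smaller dense one.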
- Both models lack the multimodal abilities of leading proprietary systems, but they demonstrate that open-source development can produce efficient, competitive language models.