Posted 3/27/2024, 9:05:40 PM
Anthropic's Claude 3 edges out OpenAI's GPT-4 to top new chatbot benchmark leaderboard
- Researchers from UC Berkeley, UC San Diego, and Carnegie Mellon formed LMSYS to benchmark large language models and chatbots
- LMSYS introduced the Chatbot Arena, which ranks chatbots using crowdsourced head-to-head votes scored with the Elo rating system (see the sketch after this list)
- Claude 3 Opus from Anthropic beat GPT-4 by a slim margin to take the #1 spot
- Claude 3 Haiku reached GPT-4-level performance despite being a much smaller model
- OpenAI is reportedly preparing to launch GPT-5, which sources say will be a significant improvement over GPT-4
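
For context on the ranking mechanics, here is a minimal sketch of how an Elo-style update works when applied to pairwise chatbot votes. The K-factor and starting rating below are illustrative assumptions, not LMSYS's actual parameters.

```python
# Minimal sketch of an Elo-style rating update for pairwise chatbot votes.
# The K-factor (32) and starting rating (1000) are illustrative assumptions,
# not the parameters LMSYS actually uses.

def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that model A beats model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

def update_elo(rating_a: float, rating_b: float, score_a: float,
               k: float = 32.0) -> tuple[float, float]:
    """Return both models' updated ratings after one vote.

    score_a is 1.0 if A wins, 0.0 if B wins, 0.5 for a tie.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b

# Example: two models start even; A wins one crowdsourced vote.
a, b = 1000.0, 1000.0
a, b = update_elo(a, b, score_a=1.0)
print(a, b)  # 1016.0 984.0
```

Because each vote only shifts ratings by a bounded amount, many votes are needed before the leaderboard separates models whose true win rates are close, which is why the margin between the top models is described as slim.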