Posted 3/27/2024, 9:05:40 PM
Anthropic's Claude 3 edges out OpenAI's GPT-4 to top new chatbot benchmark leaderboard
- Researchers from UC Berkeley, UC San Diego, and Carnegie Mellon formed LMSYS to benchmark large language models and chatbots
- LMSYS introduced the Chatbot Arena, which ranks chatbots using crowdsourced head-to-head votes scored with the Elo rating system (see the sketch after this list)
- Claude 3 Opus from Anthropic beat GPT-4 by a slim margin to take the #1 spot
- Claude 3 Haiku reached GPT-4-level performance despite being a much smaller model
- OpenAI is reportedly preparing to launch GPT-5, which sources say will be a significant improvement over GPT-4
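
For context on the ranking mechanics, here is a minimal sketch of how an Elo-style update works when applied to pairwise chatbot votes. The K-factor and starting rating below are illustrative assumptions, not LMSYS's actual parameters.

```python
# Minimal sketch of an Elo-style rating update for pairwise chatbot votes.
# The K-factor (32) and starting rating (1000) are illustrative assumptions,
# not the parameters LMSYS actually uses.

def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that model A beats model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

def update_elo(rating_a: float, rating_b: float, score_a: float,
               k: float = 32.0) -> tuple[float, float]:
    """Return both models' updated ratings after one vote.

    score_a is 1.0 if A wins, 0.0 if B wins, 0.5 for a tie.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b

# Example: two models start even; A wins one crowdsourced vote.
a, b = 1000.0, 1000.0
a, b = update_elo(a, b, score_a=1.0)
print(a, b)  # 1016.0 984.0
```

Because each vote only shifts ratings by a bounded amount, many votes are needed before the leaderboard separates models whose true win rates are close, which is why the margin between the top models is described as slim.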