LLMs Battle in Street Fighter III Tournament to Benchmark AI Decision-Making
• New Street Fighter III AI benchmark pits LLMs against each other in the arcade fighting game
• Developed at a recent hackathon; tests LLMs' ability to make real-time strategic decisions
• GPT 3.5 Turbo won the 8-LLM tournament; Anthropic's claude_3_haiku won the 14-LLM tournament
• Fighting performance was sometimes affected by LLM quirks such as hallucinations
• Questions remain over whether a beat-'em-up game is a useful benchmark or just an interesting distraction