Posted 4/15/2024, 3:14:08 PM

Lack of Standards for Measuring AI Capabilities Creates Confusion for Users

There is no standard way to measure the capabilities of AI systems like ChatGPT, making it hard to evaluate them.
AI companies make vague claims about "improved capabilities" without providing evidence.
The reliability of existing AI tests is itself in doubt.
This lack of measurement poses a practical problem: people don't know which AI tools to use for which tasks.
There is no "Good Housekeeping seal" for verifying the abilities of AI systems.

nytimes.com