Posted 2/14/2024, 2:00:00 PM
Websites Wrestle with Keeping Value from AI Crawlers That Don't Play by Old Rules
- robots.txt is a 30-year-old text file that lets websites signal which crawlers may access their pages, but it carries no legal authority
- It was created as a mutual agreement between sites and search engines to balance value and problems from crawling
- Recently, AI models have changed the equation by extracting huge value from sites' data with no reciprocity
- Many major sites, including the BBC and The New York Times, now block AI crawlers, viewing them as extracting value rather than trading it
- Because robots.txt relies on voluntary compliance, it lacks teeth against unscrupulous crawlers, leading some to call for stronger crawler governance
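
The blocking described above is typically done with a few lines in robots.txt. A minimal sketch of what such a file might look like, using the publicly documented user-agent tokens for OpenAI's and Common Crawl's bots (GPTBot and CCBot) and Google's AI-training opt-out token (Google-Extended), while still admitting everything else:

```
# Disallow known AI-training crawlers site-wide
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /

# All other crawlers (e.g. ordinary search indexing) remain allowed
User-agent: *
Disallow:
```

As the post notes, nothing enforces this: a crawler that ignores the file faces no technical barrier, which is exactly why these directives amount to a request rather than access control.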