AI still not great at generating clean code in API study
A study found that large language models (LLMs) such as GPT-3.5 and GPT-4 misuse APIs at high rates when answering Java coding questions from Stack Overflow. The open model Llama 2 had a failure rate of less than one percent, but only because it offered so few code suggestions.