Study Finds That 52 Percent of ChatGPT Answers to Programming Questions Are Wrong

ekZepp@lemmy.world · 1 month ago

BlameThePeacock@lemmy.ca · 1 month ago

AI will help with that too. It’s going to be able to process entire codebases at a time pretty soon.

Given the visual capabilities now emerging, it can likely also do human-equivalent testing.

One of the biggest AI tricks we haven’t seen much of yet in mainstream use is this kind of automated double-checking, where the model generates an answer and then checks that the answer is valid before actually giving it to a human. Especially in coding, there really isn’t anything stopping it from coming up with an answer, compiling it, running into an error, re-generating, and repeating until the code passes all unit tests or even, potentially, visual inspection.
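
To make that loop concrete, here is a rough Python sketch of generate, test, and retry. Everything in it is made up for illustration: `fake_model` stands in for a real model call, `run_unit_tests` is a toy test suite for a hypothetical `add` function, and `generate_until_valid` is just a name I picked. It is not how any particular product does it.

```python
from typing import Callable, Optional, Tuple


def run_unit_tests(source: str) -> Optional[str]:
    """Run a candidate against a toy test suite.

    Returns None if every test passes, otherwise an error message
    that can be fed back to the generator.
    """
    namespace: dict = {}
    try:
        # "Compile and run" the candidate. A real system would do this
        # in a sandbox; exec on untrusted code is unsafe.
        exec(source, namespace)
        add = namespace["add"]
        assert add(2, 3) == 5, "add(2, 3) should be 5"
        assert add(-1, 1) == 0, "add(-1, 1) should be 0"
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def generate_until_valid(
    generate: Callable[[Optional[str]], str],
    validate: Callable[[str], Optional[str]],
    max_attempts: int = 5,
) -> Tuple[str, int]:
    """Generate, validate, and retry with the error as feedback."""
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        candidate = generate(feedback)
        feedback = validate(candidate)
        if feedback is None:
            return candidate, attempt
    raise RuntimeError(f"no passing answer after {max_attempts} attempts")


def fake_model(feedback: Optional[str]) -> str:
    """Hypothetical stand-in for a model: wrong first, right after feedback."""
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


code, attempts = generate_until_valid(fake_model, run_unit_tests)
print(f"passed after {attempts} attempt(s)")
```

The `max_attempts` cap is the part that matters in practice, since every retry is another model call and another test run, and that is where the cost comes from.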

The big limits on this right now are sheer processing cost and the models’ context lengths. However, those costs are dropping faster than they have for any new tech we’ve seen, and this will likely be trivial in just a few years.