Yeah that makes sense. I know people are concerned about recycling AI output into training inputs, but I don’t know that I’m entirely convinced that’s damning.
The theory behind this is that no ML model is perfect; they will always make some errors. If the errors they make end up in the training data, then future ML models will learn to repeat the old models' errors and add new errors of their own on top.
Over time, ML models will get worse and worse because the quality of the training data will get worse. It’s like a game of Chinese whispers.
No matter how good your photocopier is, a copy of a copy is worse, and it gets worse every time you do it.
GIGO.
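To make the photocopier analogy a bit more concrete, here is a minimal toy sketch in Python (hypothetical sample sizes and setup, not anyone's actual training pipeline): a trivial "model" is refit each generation only on the previous generation's output, and the small fitting errors accumulate rather than cancel out.

```python
# Toy "copy of a copy" simulation (hypothetical, not a real training setup):
# generation 0 is fit on real data; every later generation is fit only on
# samples drawn from the previous generation's model.
import numpy as np

rng = np.random.default_rng(0)

real_data = rng.normal(loc=0.0, scale=1.0, size=200)  # the "original" data
mu, sigma = real_data.mean(), real_data.std()          # model 0

for generation in range(1, 21):
    # The next model trains only on the previous model's output.
    synthetic = rng.normal(loc=mu, scale=sigma, size=200)
    mu, sigma = synthetic.mean(), synthetic.std()
    print(f"gen {generation:2d}: mean={mu:+.3f}  std={sigma:.3f}")

# Each refit introduces a small estimation error, and nothing ever pulls the
# parameters back toward the real data, so they drift further with every
# generation -- the statistical version of photocopying a photocopy.
```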
Yeah, I agree garbage in, garbage out, but I don't know that that's what will happen. If I create a library and then use GPT to generate documentation for it, I'm going to review, edit, and enrich that output as the owner of that library. A great many people are painting this cycle in black and white, implying that any involvement from AI is automatically garbage, and that's fallacious and inaccurate.