This is probably very unlikely and I have no idea what I'm talking about, but: what if feeding it even small amounts of its own content (text produced by a ChatGPT instance) poisons it? It might treat text that already adheres perfectly to its own rules as ideal, locking it in as "correct" instead of learning the small variations real text would provide.
I remember an article warning about this at a large scale, and I'm wondering why it has to be large. If the model is essentially a probability tree, even small shifts to the probabilities would compound further up the branches. But this is blind speculation.
I don't know if small amounts of text could do that, but I can imagine that if LLMs keep getting trained on data generated by themselves and other LLMs (which is likely to become a major source of content on the internet), the quality of their output could degrade significantly over time.
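The degradation described here is often called "model collapse". A very crude way to see the mechanism, without any actual LLM, is to treat a model as just an estimated probability distribution over tokens and repeatedly re-fit it to samples drawn from its own previous estimate. This toy simulation (entirely hypothetical numbers, a 10-token vocabulary standing in for a real model) shows how rare tokens tend to vanish under self-training, because once a token is never sampled in a generation, its estimated probability drops to zero forever:

```python
import random
from collections import Counter

def resample_distribution(probs, n_samples, generations, rng):
    """Toy model-collapse loop: each generation, draw samples from the
    current estimated distribution, then re-estimate the distribution
    from those samples (i.e. 'train' on self-generated data)."""
    vocab = list(range(len(probs)))
    for _ in range(generations):
        draws = rng.choices(vocab, weights=probs, k=n_samples)
        counts = Counter(draws)
        # Any token not drawn this generation gets probability 0
        # and can never come back.
        probs = [counts.get(t, 0) / n_samples for t in vocab]
    return probs

rng = random.Random(0)
# A skewed toy vocabulary: three common tokens plus a long tail
# of seven rare ones (made-up values for illustration).
initial = [0.4, 0.3, 0.2] + [0.1 / 7] * 7
final = resample_distribution(initial, n_samples=100, generations=50, rng=rng)

surviving = sum(1 for p in final if p > 0)
print(f"tokens with nonzero probability after 50 generations: {surviving}/10")
```

With only 100 samples per generation, each rare token is expected to appear about 1.4 times per round, so over 50 rounds most of the tail gets lost to sampling noise, and the distribution narrows toward the common tokens. That matches the intuition above: the change per step is small, but it compounds.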