• DarkCloud@lemmy.world
    2 days ago

    Assuming OpenAI etc. only use data from the public domain is stupid (and contrary to most news reports on the matter). He has literally no idea what the AI was trained on (not even the developers know, because there’s just too much of it for humans to review). They’ve undoubtedly bought vast amounts of data that isn’t readily searchable through public search engines.

    He sounds very ill-informed on the matter of data collection, and probably just had his info/data on a cloud service somewhere whose text was part of the trillions of terabytes LLMs have accessed and trained on.