As AI-generated content fills the Internet, it’s corrupting the training data for models to come. What happens when AI eats itself?

  • blivet@kbin.social · 1 year ago

    So in order for data to be useful to AIs, AI-generated content will have to be flagged as such. Sounds good to me.

    • admiralteal@kbin.social · 1 year ago

      But malicious actors don’t want their generated data to be recognizable to LLMs. They want it to impersonate real people in order to push advertising and misinformation.

      Which means that even if people started flagging LLM-generated content as such, only the most malicious and vile LLM content would be left out there, unflagged, training the models of the future.

      I don’t see any solution to this on the horizon. Pandora is out of the box.

  • TimeSquirrel@kbin.social · 1 year ago

    It’s basically RE:RE:RE:RE:RE:RE email chains and corrupted JPEGs that have been reposted and recompressed a thousand times, but for AI.
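    The recompression analogy can be sketched numerically. This is a toy illustration of my own, not anything from the thread or a real training pipeline: each "generation" applies a lossy transform (a simple neighbor-averaging blur, standing in for JPEG recompression or retraining on generated output) to the previous generation's output, and the fine detail, measured as variance, shrinks toward zero and never comes back.

```python
# Toy model of generational loss (an illustrative assumption, not how any
# real model trains): each pass blurs the previous pass's output, the way
# each JPEG re-save discards a little more of the original image.

def blur(signal):
    """One lossy 'regeneration': replace each sample with a local average
    of itself and its two circular neighbors."""
    n = len(signal)
    return [(signal[i - 1] + signal[i] + signal[(i + 1) % n]) / 3
            for i in range(n)]

def variance(xs):
    """Detail metric: variance of the signal around its mean."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)

# Start with maximal fine detail: an alternating +1/-1 signal (variance 1.0).
signal = [(-1) ** i for i in range(64)]
for generation in range(1, 11):
    signal = blur(signal)

print(f"variance after 10 generations: {variance(signal):.2e}")
# → variance after 10 generations: 2.87e-10
```

For this alternating signal each blur pass scales it by exactly -1/3, so the variance drops by a factor of 9 per generation: ten generations in, essentially nothing of the original detail survives, which is the "reposted a thousand times" effect in miniature.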