I hate to break it to you, but if you’re running an LLM based on (for example) Llama, the training data (corpus) that went into it still consisted of large parts of the Internet.
Running the prompts locally doesn’t change the fact that the model was trained on data that could be considered protected under copyright law.
It’s going to be interesting to see how the law shakes out on this one. An artist going to an art museum and doing studies of the works there for educational purposes (say a contemporary art museum, where the works wouldn’t be in the public domain) is likely fair use - and possibly encouraged as a way for artists to develop their talents. Likewise, musicians practicing (or even performing) other artists’ songs is expected as part of their development. Consider a high school band in a garage, playing someone else’s song to improve their skills.
I know the big difference is that it’s people training versus a machine/LLM training, but that seems to come down less to a copyright issue (which it is in the immediate sense) and more to the question: should an algorithm be entitled to the same protections as a person? If not, what happens if real AI (not just an LLM) is developed? Should those entities be entitled to personhood?
Linux is a lot, lot, lot easier to use now than it was in the 90s.