Would distribution in the form of an AI not constitute a different form of seeding? I think it should.
No, you can’t find any copyrighted text inside the model’s weights.
It’s much more complicated than that. Models have been shown to reproduce verbatim copies of some of their training material, so it can be argued that the weights do in fact encode that material, just in an obfuscated form. It can also be argued that a model’s output is a derivative work of the originals regardless of whether they can be “found inside” the weights, simply by the nature of the process that produced it. As far as I know, there is no legal precedent yet on whether either of these constitutes redistribution of copyrighted material.
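For what it’s worth, this kind of regurgitation is easy to probe yourself. Here’s a minimal sketch using the HuggingFace transformers library; the model name and the (public-domain) passage are placeholder choices on my part, and whether any given model actually completes a given passage verbatim depends entirely on the model and the text:

```python
# Rough memorization probe: prompt a causal LM with the opening of a known
# passage and check whether greedy decoding reproduces the rest verbatim.
# "gpt2" and the Dickens line are placeholders, not claims about any model.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

passage = "It was the best of times, it was the worst of times"
prompt, expected = passage[:26], passage[26:]  # split at a word boundary

inputs = tokenizer(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=16, do_sample=False)
continuation = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:])

# A verbatim continuation is evidence the passage was memorized in training.
print(repr(continuation))
print("verbatim:", continuation.startswith(expected))
```

The published training-data extraction attacks are, as I understand them, essentially this at scale, with more careful prompt selection and checks against the training corpus.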