I think it’s impossible to produce CSAM without training data of CSAM (though this is just an opinion). Young people don’t look like adults when naked so I don’t think there’s any way an AI would hallucinate CSAM without some examples to train on.
In this hypothetical, the AI would be trained on fully clothed adults and children. As well as what many of those same adults look like unclothed. It might not get things completely right on its initial attempt, but with some minor prompting it should be able to get pretty close. That said, the AI will know the correct head size proportions from just the clothed datasets. It could probably even infer limb proportions from the clothed datasets as well.
It could definitely get head and limb proportions correct, but there are some pretty basic changes that happen with puberty that the AI would not be able to reverse engineer.
This is the part of the conversation where I have to admit that you could be right, but I don’t know enough to say one way or the other. And since I have no plans to become a pediatrician, I don’t intend to go find out.
There are legit, non-CSAM types of images that would still make these changes apparent, though. Not every picture of a naked child is CSAM. Family photos from the beach, photos in biology textbooks, even comic-style illustrated children’s books will allow inferences about what real humans look like. So no, I don’t think that an image generation model has to be trained on any CSAM in order to be able to produce convincing CSAM.
This is a fair point - if we allow a model to be trained on non-sexualizing minor nudity it likely could sexualize those models without actually requiring sexualized minors to do so. I’m still not certain if that’s a good thing, but I do agree with you.
Yeah, it certainly still feels icky, especially since a lot of those materials in all likelihood will still have ended up in the model without the original photo subjects knowing about it or consenting. But that’s at least much better than having a model straight up trained on CSAM, and at least hypothetically, there is a way to make this process entirely “clean”.
I think there are two arguments going on here, though
Most people arguing point 1 would be willing to concede point 2, especially since you linked evidence of it.