I would expect a lot of models to struggle with making the pope female, making the pope black, or making a black woman the pope, unless they build in some kind of technique to make replacements. Thing is, a neural net reproduces what you put into it, and I assume the bias is largely towards old white men since those images are way more readily found.
Even targeted prompts, like a zebra with rainbow-colored stripes, had very limited results 6 months ago, with few images where at least 50% of the stripes were not black and white. I had to generate multiple times with a lot of negative terms just to get close. Currently, the first generation from Copilot matches the idea behind my prompt.
Clearly the step made was a big one, and I imagine tuning was done to ensure models are capable of returning more diverse results rather than just what is in the data set. It just has more unexpected results and less historically accurate images for these kinds of prompts, and some that might be quite painful. Still, always being underrepresented in data sets is also quite painful. Hard to get to a perfect product quickly, but there should be a feature somewhere on their backlog to prevent some substitutions by default. Black, female popes when requesting a generated pope? To me that is a horizon-broadening feature. Black, female Nazis when requesting Nazis? Let that not be a default result.
Have you considered that the comment might not be serious, but is highlighting the absurdity of anti-trans folks being so incredibly focused on young people's genitals while decrying others as perverted for wearing clothes that differ from their own?