Abstract

Accurate representation in media is known to improve the well-being of the people who consume it. Generative image models trained on large web-crawled datasets such as LAION are known to produce images with harmful stereotypes and misrepresentations of cultures. We improve inclusive representation in generated images by (1) engaging with communities to collect a culturally representative dataset that we call the Cross-Cultural Understanding Benchmark (CCUB) and (2) proposing a novel Self-Contrastive Fine-Tuning (SCoFT) method that leverages the model’s known biases to self-improve. SCoFT is designed to prevent overfitting on small datasets, encode only high-level information from the data, and shift the generated distribution away from misrepresentations encoded in a pretrained model. Our user study conducted on 51 participants from 5 different countries based on their self-selected national cultural affiliation shows that fine-tuning on CCUB consistently generates images with higher cultural relevance and fewer stereotypes when compared to the Stable Diffusion baseline, which is further improved with our SCoFT technique.

Paper: https://arxiv.org/abs/2401.08053

  • webghost0101@sopuli.xyz (9 months ago, edited)

    This is really cool and a big improvement, but I do see an issue remaining, which is how cultures can change over time. Traditional Japanese versus Otaku is an easy contrasting example.

    In modern times there are people of many skin colors living near me, all of whom I would consider as much a part of my culture as I am myself, but the older generation does talk about how, back in the day, seeing someone with a different color was like seeing a unicorn. If I go back hundreds of years, my own ancestors looked different enough that they would likely consider me an outsider.

    I suppose the best way to solve this is to label training data on both culture and time. But there will always be people who disagree about what does and does not represent their modern culture. There are also quite a number of foods and other things that I would consider part of our modern culture that are actually parts of some other culture's older heritage.

    It's harder to argue about historical culture, but training too much on that could have the side effect of increased stigmatization and bias. It could create a fake link where POC are placed more often in the context of a more primitive society, poorer clothes, and culturally real but nowadays outdated practices.