Artificial intelligence for character design in video games: a study of biases and stereotypes in Midjourney®

Javier Corzo Martínez

https://orcid.org/0009-0001-2013-5899

Spain

Universidad de La Laguna

Holds a degree in Fine Arts from the Universidad de La Laguna (2012) and has been a doctoral candidate at the Faculty of Fine Arts of the Universidad de La Laguna since 2023. Illustrator and concept artist on audiovisual projects for clients such as Sony, Rovio, La Sexta TV, Cartoon Network Latam, BBC, Disney, and RTVE.

Manuel Drago Díaz Alemán

https://orcid.org/0000-0002-2305-8219

Spain

Universidad de La Laguna

Holds a degree in Fine Arts from the Universitat Politècnica de València (1991) and a doctorate in Fine Arts from the Universidad de La Laguna (1995). Head of the Diseño y Fabricación Digital (Digital Design and Fabrication) research group at the ULL.

Jorge de La Torre Cantero

https://orcid.org/0000-0001-5516-0456

Spain

Universidad de La Laguna

Holds a doctorate from the Universitat Politècnica de València, Doctoral Programme in Design, Manufacturing and Management of Industrial Projects. Member of the engineering graphics area of the Departamento de Técnicas y Proyectos en Ingeniería y Arquitectura at the Universidad de La Laguna.

Accepted: 26-02-2025

Published: 31-03-2025

DOI: https://doi.org/10.4995/aniav.2025.23353

Keywords:

Artificial intelligence, concept art, biases, stereotypes, Midjourney

Supporting agencies:

This research received no funding

Abstract:

Generative image AI tools have affected video game art and the audiovisual industry, changing the work of illustrators and concept artists. Numerous studies have shown that these tools reproduce stereotypes of age, gender, and ethnicity, which raises ethical concerns. This study analysed Midjourney®, evaluating the images generated from five categories of prompts. The results revealed structural biases and failures in the NSFW (not safe for work) filters, which at times produced results opposite to their intended purpose. Because Midjourney® reinforces social and cultural stereotypes, the study stresses the need for human oversight in character creation and calls on the developers of this kind of software to address the inequities their products help reinforce.

