ArtELingo: A Million Emotion Annotations of WikiArt with Emphasis on Diversity over Language and Culture

Youssef Mohamed, Mohamed Abdelfattah, Shyma Alhuwaider, Feifan Li, Xiangliang Zhang, Kenneth Ward Church, Mohamed Elhoseiny

EMNLP 2022


Abstract

This paper introduces ArtELingo, a new benchmark and dataset, designed to encourage work on diversity across languages and cultures. Following ArtEmis, a collection of 80k artworks from WikiArt with 0.45M emotion labels and English-only captions, ArtELingo adds another 0.79M annotations in Arabic and Chinese, plus 4.8K in Spanish to evaluate “cultural-transfer” performance. More than 51K artworks have 5 annotations or more in 3 languages. This diversity makes it possible to study similarities and differences across languages and cultures. Further, we investigate captioning tasks, and find diversity improves the performance of baseline models. ArtELingo is publicly available with standard splits and baseline models. We hope our work will help ease future research on multilinguality and culturally-aware AI.



Paper

paper ArXiv 

Code and Dataset

Web Page ArtELingo 

Citation

@InProceedings{mohamed2022artelingo,
      title={ArtELingo: A Million Emotion Annotations of WikiArt with Emphasis
                  on Diversity over Language and Culture},
      author={Mohamed, Youssef and Abdelfattah, Mohamed and Alhuwaider, Shyma and Li, Feifan
                  and Zhang, Xiangliang and Church, Kenneth Ward and Elhoseiny, Mohamed},
      booktitle = {Proceedings of the 2022 Conference on Empirical Methods
                        in Natural Language Processing (EMNLP)},
      year={2022}}