Attending self-attention: A case study of visually grounded supervision in vision-and-language transformers
Publication
Proc. 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: Student Research Workshop