Preprint, Working paper, STL, Computer Science, Computer Vision and Pattern Recognition
Entity-aware cross-modal pretraining for Knowledge-Based Visual Question Answering
Omar Adjali, Olivier Ferret, Sahar Ghannay, Hervé Le Borgne. Entity-aware cross-modal pretraining for Knowledge-Based Visual Question Answering. 2024. ⟨cea-04910767⟩
Published on