Preprint, Working paper, STL, Computer Science, Computer Vision and Pattern Recognition
Entity-aware cross-modal pretraining for Knowledge-Based Visual Question Answering
Omar Adjali, Olivier Ferret, Sahar Ghannay, Hervé Le Borgne. Entity-aware cross-modal pretraining for Knowledge-Based Visual Question Answering. 2024. ⟨cea-04910767⟩