LIPS

Language Interaction Speech and Sign (LIPS)

The LIPS team, made up of researchers in linguistics and language processing, conducts multidisciplinary research into oral languages, both spoken and signed. It cooperates extensively with other teams in the STL department, as well as with other departments in the laboratory.

The team’s scientific challenges concern oral languages, both spoken and signed, with the aim of describing and modelling them linguistically. The team brings together researchers in natural language processing and linguists to focus on the situated dimension of language: we use a variety of data, of different sizes and from different sources, illustrating linguistic variation in all its dimensions, from minimal units to meaning. Multimodal processing is also at the heart of our concerns, whether it combines the written and aural varieties of spoken languages with other visual information (e.g. oculometry) or the written and aural varieties of different languages (e.g. sign language videos with French subtitles). Our work gives rise to a variety of applications, including speech and sign language recognition and synthesis, and dialogue systems. Our research is interdisciplinary by nature and requires skills in signal processing, linguistics and computer science.

The team’s activities are organised around three themes:

Information retrieval in dialogues

Work on multimodal and conversational information retrieval rests on two main pillars: incorporating multimodality into information retrieval systems and studying dialogic interactions. More specifically, this research focuses on how to represent multimodal data, taking context and the various multimodal aspects into account in the representations developed, and on addressing the scarcity of available data. The artificial intelligence methods implemented also tackle the handling of degraded data and continuous and interactive learning, while aiming to make model predictions understandable, with an eye towards explainability.

Sign language modeling and processing

Sign languages (SL), which are low-resource languages, have a linguistic system shaped by their visual-gestural nature: a large amount of information is expressed simultaneously and organized spatially, and iconicity plays a central role. Computer modeling of SL requires designing representations with little available data, while pre-existing models, which are essentially linear, were developed for written or spoken languages and do not cover all aspects of SL. Through projects and PhD theses, and in collaboration with signers of these languages (e.g. deaf translators and journalists), we are tackling the following research questions: How can SL be analysed, represented and processed? How can we take into account the linguistic specifics linked to their visual-gestural nature (multilinearity, spatialization, iconicity)? What types of approach are possible with little French Sign Language (LSF) data? Current projects are detailed on this page.

Speech processing and multilingual variation modeling

Research in this theme aims to understand the variation phenomena that underlie temporal and spatial changes in spoken language and to develop models for use in automatic speech processing. One of our objectives is to structure the information in audio documents by developing models and algorithms that rely on diverse information sources and can serve to detect the presence of speech, identify the language being spoken, characterize the speaker(s), transcribe the speech into text in the same or a different language, or identify specific entities or acoustic events. Concerning speech recognition, our research aims to supplement the word sequence with punctuation and with paralinguistic information such as hesitations, laughter or breath noises. We also study frugal learning techniques and apply them to speech recognition for low-resource languages and tasks.

Publications

  • Conference paper

    Théo Gigant, Camille Guinaudeau, Marc Decombas, Frédéric Dufaux. Mitigating the Impact of Reference Quality on Evaluation of Summarization Systems with Reference-Free Metrics. The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024), Nov 2024, Miami (FL), United States. ⟨hal-04720645⟩

    STL

    Available in free access

  • Conference paper

    Emmanuella Martinod, Michael Filhol. Formal Representation of Interrogation in French Sign Language. Proceedings of the 11th Workshop on representation and processing of Sign Languages, May 2024, Turin, Italy. ⟨hal-04712681⟩

    STL

    Available in free access

  • Conference paper

    Michael Filhol, Thomas von Ascheberg. A software editor for the AZVD graphical Sign Language representation system. Workshop on the representation and processing Sign Language, May 2024, Turin, Italy. ⟨hal-04712674⟩

    STL

    Available in free access

  • Conference paper

    Emmanuella Martinod, Michael Filhol. Examining interrogative marking in French Sign Language with the AZee approach. Clause-type marking in the visual modality, workshop at the Annual Conference of the German Linguistics Society, German Linguistics Society, Feb 2024, Bochum, Germany. ⟨hal-04709019⟩

    STL

    Available in free access

  • Conference paper

    Paritosh Sharma, Camille Challant, Michael Filhol. Facial Expressions for Sign Language Synthesis using FACSHuman and AZee. 11th Workshop on the Representation and Processing of Sign Languages: Evaluation of Sign Language Resources, May 2024, Turin, Italy. ⟨hal-04709105⟩

    STL

    Available in free access

  • Conference paper

    Paritosh Sharma, Michael Filhol. Sign Language Synthesis using Pose Priors. MOCO ’24: 9th International Conference on Movement and Computing, May 2024, Utrecht, Netherlands. pp.1-4, ⟨10.1145/3658852.3659080⟩. ⟨hal-04709203⟩

    STL

    Available in free access

  • Journal article

    Pierre La Rocca, Gaël Guennebaud, Aurélie Bugeau, Anne-Laure Ligozat. Estimating The Carbon Footprint Of Digital Agriculture Deployment: A Parametric Bottom-Up Modelling Approach. Journal of Industrial Ecology, In press, ⟨10.1111/jiec.13568⟩. ⟨hal-04708774⟩

    STL

    Available in free access

  • Journal article

    Fanny Ducel, Aurélie Névéol, Karën Fort. La recherche sur les biais dans les modèles de langue est biaisée : état de l’art en abyme. Revue TAL : traitement automatique des langues, 2024, 64 (3). ⟨hal-04710191⟩

    STL

    Available in free access

  • Conference paper

    Carlos Cuevas Villarmin, Sarah Cohen-Boulakia, Nona Naderi. Reproducibility in Named Entity Recognition: A Case Study Analysis. 2024 IEEE 20th International Conference on e-Science (e-Science), Sep 2024, Osaka, Japan. pp.1-10, ⟨10.1109/e-Science62913.2024.10678721⟩. ⟨hal-04706673⟩

    BioInfo, STL

  • Conference paper

    Rémi Uro, Marie Tahon, David Doukhan, Antoine Laurent, Albert Rilliard. Detecting the terminality of speech-turn boundary for spoken interactions in French TV and Radio content. Interspeech 2024, Itshak Lapidot; Sharon Gannot, Sep 2024, Kos, Greece. pp.3560 – 3564, ⟨10.21437/interspeech.2024-1163⟩. ⟨hal-04694968⟩

    STL

    Available in free access

  • Conference paper

    Donna Erickson, Albert Rilliard, Malin Svensson Lundmark, Adelaide Silva, Leticia Rebollo Couto, et al.. Collecting Mandible Movement in Brazilian Portuguese. Interspeech 2024, Itshak Lapidot; Sharon Gannot, Sep 2024, Kos, Greece. pp.3145-3149, ⟨10.21437/interspeech.2024-1216⟩. ⟨hal-04694958⟩

    STL

    Available in free access

  • Conference paper

    Benjamin Elie, David Doukhan, Rémi Uro, Lucas Ondel Yang, Albert Rilliard, et al.. Articulatory Configurations across Genders and Periods in French Radio and TV archives. Interspeech 2024, Itshak Lapidot; Sharon Gannot, Sep 2024, Kos, Greece. pp.3085-3089, ⟨10.21437/interspeech.2024-1177⟩. ⟨hal-04694868⟩

    STL

    Available in free access

  • Conference paper

    Rémi Uro, Marie Tahon, Jane Wottawa, David Doukhan, Albert Rilliard, et al.. Annotation of Transition-Relevance Places and Interruptions for the Description of Turn-Taking in Conversations in French Media Content. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), Sep 2024, Torino, Italy. pp.1225–1232. ⟨hal-04694997⟩

    STL

    Available in free access

  • Conference paper

    Luc Mottin, Nona Naderi, Anaïs Mottaz, Pierre-André Michel, Gerieke Been, et al.. Comparing Sequence-Based and Literature-Based Pathogenicity Scoring Methods for Human Variants. 34th Medical Informatics Europe Conference, Aug 2024, Athens (Greece), Greece. ⟨10.3233/SHTI240747⟩. ⟨hal-04682928⟩

    STL

    Available in free access

  • Conference paper

    Annelies Braffort, Patrice Dalle. Sign language processing: models, representations, tools for video analysis, for signing avatars and for communication. 2nd International Society for Gesture Studies (ISGS 2005) conference: “Interacting bodies”, 2005, Lyon, France. ⟨hal-04678548⟩

    STL

  • Conference paper

    Mathilde Aguiar, Pierre Zweigenbaum, Nona Naderi. Récentes avancées de l’inférence en langue naturelle pour les essais cliniques. Journée Santé et IA 2024, AFIA; L3I; La Rochelle Université, Jul 2024, La Rochelle, France. ⟨hal-04667736⟩

    STL

    Available in free access

  • Journal article

    Leticia Rebollo Couto, Albert Rilliard. Variación pragmática, traducción audiovisual y estrategias conversacionales para el doblaje: léxico coloquial y palabras tabús. Cadernos de Tradução , 2024, Sex, Taboo, and Swearing: Forbidden Words in Audiovisual Translation, 44 (2), pp.1-28. ⟨10.5007/2175-7968.2024.e99158⟩. ⟨hal-04668979⟩

    STL

    Available in free access

  • Conference poster

    Sylvain Kahane, Claudel Pierre-Louis, Sandra Jagodzińska, Agata Savary. The first Haitian Creole treebank. Peer reviewed poster in the 2nd UniDive Workshop, Feb 2024, Naples, Italy. ⟨hal-04667550⟩

    ILES, STL

  • Conference paper

    Agata Savary, Daniel Zeman, Verginica Barbu Mititelu, Anabela Barreiro, Olesea Caftanatov, et al.. UniDive: A COST Action on Universality, Diversity and Idiosyncrasy in Language Technology. 3rd Annual Meeting of the Special Interest Group on Under-resourced Languages, May 2024, Torino, Italy. ⟨hal-04667545⟩

    ILES, STL

    Available in free access

  • Conference paper

    Najet Hadj Mohamed, Agata Savary, Cherifa Ben Khelil, Jean-Yves Antoine, Iskandar Keskes, et al.. Lexicons Gain the Upper Hand in Arabic MWE Identification. Joint Workshop on Multiword Expressions and Universal Dependencies (MWE-UD) @ LREC-COLING 2024, May 2024, Torino, Italy. ⟨hal-04667546⟩

    ILES, STL

    Available in free access

  • Other scientific publication

    Louis Estève, Agata Savary, Thomas Lavergne. Entropy Behaviour upon Dataset Size Update. 2024. ⟨hal-04666672⟩

    STL

    Available in free access

  • Conference paper

    Bui Van-Tuan, Agata Savary. Cross-type French Multiword Expression Identification with Pre-trained Masked Language Models. 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), May 2024, Turin, Italy. pp.4198-4204. ⟨hal-04667119⟩

    ASARD, ILES, STL

    Available in free access

  • Thesis

    Natalia Kalashnikova. Towards detection of nudges in Human-Human and Human-Machine interactions. Computation and Language [cs.CL]. Université Paris-Saclay, 2024. English. ⟨NNT : 2024UPASG031⟩. ⟨tel-04663129⟩

    STL

    Available in free access

  • Conference paper

    Louis Estève, Agata Savary, Thomas Lavergne. Vector Spaces for Quantifying Disparity of Multiword Expressions in Annotated Text. Association for Computational Linguistics – Student Research Workshop, Aug 2024, Bangkok, Thailand. ⟨hal-04660179⟩

    STL

    Available in free access

  • Journal article

    Annelies Braffort. L’héritage scientifique de Patrice Dalle : le traitement automatique des langues des signes au service de l’enseignement en LSF. La main de Thôt : théories, enjeux et pratiques de la traduction, 2024, 11. ⟨hal-04256752⟩

    STL

    Available in free access

  • Conference paper

    Clément Morand, Aurélie Névéol, Anne-Laure Ligozat. MLCA: a tool for Machine Learning Life Cycle Assessment. 2024 International Conference on ICT for Sustainability (ICT4S), Jun 2024, Stockholm, Sweden. ⟨hal-04643414⟩

    STL

    Available in free access

  • Book chapter

    Philippe Boula de Mareüil, Antonio Romano, Marc Evrard, Alexandre François. Cartografia di innovazioni rispetto al latino attraverso un atlante sonoro dell’Europa. Erica Autelli. Il patrimonio linguistico storico della Liguria 2, InSedicesimo, pp.51-62, 2024. ⟨hal-04644943⟩

    STL

    Available in free access

  • Journal article

    Nassim Naderi, Nona Naderi, Huey Chern Boo, Kuan-Huei Lee, Po-Ju Chen. Editorial: Food tourism: culture, technology, and sustainability. Frontiers in Nutrition, 2024, 11 (1), pp.e42630. ⟨10.3389/fnut.2024.1390676⟩. ⟨hal-04644101⟩

    STL

    Available in free access

  • Preprint, working paper

    Jenny Copara, Nona Naderi, Gilles Falquet, Douglas Teodoro. A data-driven assessment of biomedical terminology evolution using information theoretical and network analysis approaches. 2024. ⟨hal-04644071⟩

    STL

  • Conference paper

    Constant Bonard, Gustave Cortal. Improving Language Models for Emotion Analysis: Insights from Cognitive Science. Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics, Association for Computational Linguistics, Aug 2024, Bangkok, Thailand. pp.264-277. ⟨hal-04624340v3⟩

    STL

    Available in free access

  • Conference paper

    Camille Challant, Michael Filhol. Extension d’AZee avec des règles de production concernant les gestes non-manuels pour la langue des signes française. 35èmes Journées d’Études sur la Parole (JEP 2024) 31ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN 2024) 26ème Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RECITAL 2024), Jul 2024, Toulouse, France. pp.410-421. ⟨hal-04623032⟩

    STL

    Available in free access

  • Conference paper

    Clémence Sebe, Sarah Cohen-Boulakia, Olivier Ferret, Aurélie Névéol. Extraction d’entités nommées décrivant des chaînes de traitement bioinformatiques dans des articles scientifiques en anglais. 35èmes Journées d’Études sur la Parole (JEP 2024) 31ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN 2024) 26ème Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RECITAL 2024), Jul 2024, Toulouse, France. pp.422-434. ⟨hal-04623033⟩

    BioInfo, STL

    Available in free access

  • Conference paper

    Rémi Uro, Albert Rilliard, David Doukhan, Marie Tahon, Antoine Laurent. Évaluation perceptive de l’anticipation de la prise de parole lors d’interactions dialogiques en français. 35èmes Journées d’Études sur la Parole (JEP 2024) 31ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN 2024) 26ème Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RECITAL 2024), Mathieu Balaguer; Nihed Bendahman; Lydia-Mai Ho-dac; Julie Mauclair; Jose G Moreno; Julien Pinquier., Jul 2024, Toulouse, France. pp.390-400. ⟨hal-04623090⟩

    STL

    Available in free access

  • Conference paper

    Marco Naguib, Aurélie Névéol, Xavier Tannier. Reconnaissance d’entités cliniques en few-shot en trois langues. 35èmes Journées d’Études sur la Parole (JEP 2024) 31ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN 2024) 26ème Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RECITAL 2024), Jul 2024, Toulouse, France. pp.169-197. ⟨hal-04623016v2⟩

    STL

    Available in free access

  • Conference paper

    Maxime Fily, Guillaume Wisniewski, Séverine Guillaume, Gilles Adda, Alexis Michaud. Mesure du niveau de proximité entre enregistrements audio et évaluation indirecte du niveau d’abstraction des représentations issues d’un grand modèle de langage. 35èmes Journées d’Études sur la Parole (JEP 2024) 31ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN 2024) 26ème Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RECITAL 2024), Jul 2024, Toulouse, France. pp.112-121. ⟨hal-04623064⟩

    STL

    Available in free access

  • Conference paper

    François Buet, Camille Guinaudeau, Cyril Grouin, Sahar Ghannay, Shin’Ichi Satoh. Utiliser l’explicabilité des modèles pour mettre en évidence les expressions genrées dans la parole. 35èmes Journées d’Études sur la Parole (JEP 2024) 31ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN 2024) 26ème Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RECITAL 2024), Jul 2024, Toulouse, France. pp.695-707. ⟨hal-04623052⟩

    STL

    Available in free access

  • Conference paper

    Atilla Kaan Alkan, Felix Grezes, Cyril Grouin, Fabian Schüssler, Pierre Zweigenbaum. astroECR : enrichissement d’un corpus astrophysique en entités nommées, coréférences et relations sémantiques. 35èmes Journées d’Études sur la Parole (JEP 2024) 31ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN 2024) 26ème Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RECITAL 2024), Jul 2024, Toulouse, France. pp.720-733. ⟨hal-04623049⟩

    STL

    Available in free access

  • Conference paper

    Thomas Gerald, Louis Tamames, Sofiane Ettayeb, Patrick Paroubek, Anne Vilnat. CQuAE : Un nouveau corpus de question-réponse pour l’enseignement. 35èmes Journées d’Études sur la Parole (JEP 2024) 31ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN 2024) 26ème Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RECITAL 2024), Jul 2024, Toulouse, France. pp.50-63. ⟨hal-04623009⟩

    STL

    Available in free access

  • Conference paper

    Pierre Lepagnol, Thomas Gerald, Sahar Ghannay, Christophe Servan, Sophie Rosset. Les petits modèles sont bons : une étude empirique de classification dans un contexte zero-shot. 35èmes Journées d’Études sur la Parole (JEP 2024) 31ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN 2024) 26ème Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RECITAL 2024), Jul 2024, Toulouse, France. pp.113-129. ⟨hal-04623012v2⟩

    STL

    Available in free access

  • Conference paper

    Hugo Boulanger, Nicolas Hiebel, Olivier Ferret, Karën Fort, Aurélie Névéol. Génération contrôlée de cas cliniques en français à partir de données médicales structurées. 35èmes Journées d’Études sur la Parole (JEP 2024) 31ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN 2024) 26ème Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RECITAL 2024), Jul 2024, Toulouse, France. pp.435-448. ⟨hal-04623034⟩

    STL

    Available in free access