LIPS

Language Interaction Speech and Signs (LIPS)

The LIPS team, made up of researchers in linguistics and language processing, conducts multidisciplinary research into oral languages, both spoken and signed. It cooperates extensively with other teams in the STL department, as well as with other departments in the laboratory.

The team’s scientific challenges concern oral languages, both spoken and signed, with the aim of describing and modelling them linguistically. The team brings together researchers in natural language processing and linguists to focus on the situated dimension of language: we use a variety of data, of different sizes and from different sources, illustrating linguistic variation in all its dimensions, from minimal units to meaning. Multimodal processing is also at the heart of our concerns, whether it involves the written and aural varieties of spoken languages together with other visual information (e.g. oculometry), or written and aural varieties of different languages (e.g. sign language videos with French subtitles). Our work gives rise to a variety of applications: speech and sign language recognition and synthesis, and dialogue systems. Our research is interdisciplinary by nature and requires skills in signal processing, linguistics and computer science.

The team’s activities are organised around three themes:

Information retrieval in dialogues

Work on multimodal and conversational information retrieval is centered around two main pillars: incorporating multimodality into information retrieval systems and studying dialogic interactions. In more detail, this research focuses on how to represent multimodal data, taking into account contexts and various multimodal aspects in the developed representations, and on addressing the challenge posed by the scarcity of available data. The artificial intelligence methods implemented also tackle issues related to handling degraded data and to continuous and interactive learning, while aiming to make model predictions understandable, with an eye towards explainability.

Sign language modeling and processing

Sign languages (SL), which are low-resource languages, have a linguistic system shaped by their visual-gestural nature: a large amount of information is expressed simultaneously and organized spatially, and iconicity plays a central role. Computer modeling of SL requires designing representations with little available data, while pre-existing models, which are essentially linear, were developed for written or spoken languages and do not cover all aspects of SL. Through projects and PhD theses, and in collaboration with signers of these languages (e.g. deaf translators and journalists), we are tackling the following research questions: How can SL be analysed, represented and processed? How can we take into account the linguistic specifics linked to their visual-gestural nature (multilinearity, spatialization, iconicity)? What types of approach are possible with little French Sign Language (LSF) data? Current projects are detailed on this page.

Speech processing and multilingual variation modeling

Research in this theme aims to understand the variation phenomena that underlie temporal and spatial changes in spoken language and to develop models for use in automatic speech processing. One of our objectives is to structure the information in audio documents by developing models and algorithms that rely on diverse information sources and can serve to detect the presence of speech, to identify the language being spoken, to characterize the speaker(s), to transcribe the speech into text in the same or a different language, or to identify specific entities or acoustic events. Concerning speech recognition, our research aims to complete the word sequence with punctuation and with paralinguistic information such as hesitations, laughter or breath noises. We also study frugal learning techniques and apply them to speech recognition for low-resourced languages and tasks.

News

Colloquium, Sciences et Technologies des Langues

LT4All 2025: Language Technologies for All

Sciences et Technologies des Langues

EcAuTAL, the STL department's autumn school in NLP

Colloquium, Sciences et Technologies des Langues

"Nudge and conversational agents: ethical, emotional and societal issues" conference


Coordination

Sciences et Technologies des Langues
LIPS (Langue Interaction Parole et Signes)

Vasilescu Ioana
Research director
Team coordinator
Corpus linguistics, spoken language variation, multilingual corpora

Braffort Annelies
Senior researcher
LIPS team co-head
Sign language modelling and processing

Team members

Ascorbe Fernández Pablo

Boucharenc Iskandar
PhD Student

Braffort Annelies
Senior researcher
LIPS team co-head
Sign language modelling and processing

Evrard Marc
Teams: LIPS, M3
Associate Professor

Filhol Michael
CNRS researcher
Sign language modelling and processing

Gauvain Jean-Luc
Emeritus Researcher

Ghannay Sahar

Gigant Théo
PhD Student

Gouiffès Michèle
Teacher-Researcher

Guinaudeau Camille
Associate Professor

Halbout Julie
PhD Student

Jara Aygalic

Kim Mincho

Lamel Lori
Senior Researcher
Phone: 0671016920

Lepagnol Pierre
PhD Student

Lienard Jean-Sylvain
Teams: LIPS, M3
Fellow Researcher

Lincker Elise
PhD Student

Rosset Sophie
Teams: Direction, LIPS
Senior Researcher
LISN Director
Phone: 0169155858

Vasilescu Ioana
Research director
Team coordinator
Corpus linguistics, spoken language variation, multilingual corpora

Publications

Conference paper

Anne-Laure Ligozat. Côté obscur de l’IA : quels bénéfices réels de l’IA pour faire face aux crises environnementales ? GreenDays 2023, Mar 2023, Lyon, France. ⟨hal-05317071⟩

STL

Year of publication 2023

Available in free access

HAL publication

Conference paper

Diandra Fabre, Julie Lascar, Julie Halbout, Yanis Ouakrim, Annelies Braffort, et al. Exploring Sign-level Strategies to Enhance Automatic Translation of French Sign Language. ACM International Conference on Intelligent Virtual Agents, Sep 2025, Berlin, Germany. ⟨10.1145/3742886.3756733⟩. ⟨hal-05280328⟩

AMI (Architectures et modèles pour l'Interaction), STL

Year of publication 2025

Available in free access

HAL publication

Thesis

Marco Naguib. Extraction d’information clinique : méthodes et ressources pour l’adaptation en domaine. Computer Science [cs]. Université Paris-Saclay, 2025. French. ⟨NNT : 2025UPASG054⟩. ⟨tel-05289152⟩

STL

Year of publication 2025

Available in free access

HAL publication

Conference paper

Armand Stricker, Patrick Paroubek. Chitchat as Interference: Adding User Backstories to Task-Oriented Dialogues. The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), ELRA; ICCL, May 2024, Torino, Italy. pp.3203–3214. ⟨hal-05242362⟩

STL

Year of publication 2024

Available in free access

HAL publication

Conference paper

Fanny Ducel, Jeffrey André, Aurélie Névéol, Karën Fort. Introducing MascuLead: the First Gender Bias Leaderboard. EALM 2025 – Ethic and Alignment of (Large) Language Models, Jun 2025, Marseille, France. pp.12-19. ⟨hal-05282981⟩

STL

Year of publication 2025

Available in free access

HAL publication

Conference paper

Fanny Ducel, Nicolas Hiebel, Olivier Ferret, Karën Fort, Aurélie Névéol. « Les femmes ne font pas de crise cardiaque ! » Étude des biais de genre dans les cas cliniques synthétiques en français. 32ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN 2025), Jul 2025, Marseille, France. pp.1. ⟨hal-05282965⟩

STL

Year of publication 2025

Available in free access

HAL publication

Conference paper

Clémentine Bleuze, Fanny Ducel, Maxime Amblard, Karën Fort. « De nos jours, ce sont les résultats qui comptent » : création et étude diachronique d’un corpus de revendications issues d’articles de TAL. TALN 2025 – 32ème Conférence sur le Traitement Automatique des Langues Naturelles, Jul 2025, Marseille, France. ⟨hal-05282966⟩

STL

Year of publication 2025

Available in free access

HAL publication

Thesis

Yajing Feng. Continuous Recognition of Client Emotions from Speech and Text in Real-World Call Center Conversations : a Context-Aware Dataset and Empirical Study. Artificial Intelligence [cs.AI]. Université Paris-Saclay, 2025. English. ⟨NNT : 2025UPASG042⟩. ⟨tel-05241382⟩

STL

Year of publication 2025

Available in free access

HAL publication

Preprint, working paper

Alexander Goldberg, Ihsan Ullah, Thanh Gia Hieu Khuong, Benedictus Kent Rachmat, Zhen Xu, et al. Usefulness of LLMs as an Author Checklist Assistant for Scientific Papers: NeurIPS’24 Experiment. 2025. ⟨hal-05230379⟩

AO, STL

Year of publication 2025

Available in free access

HAL publication

Conference paper

Floris Thiant, Olivia Penas, Yann Leroy, Anne-Laure Ligozat. System analysis of digital service system perimeter and its interdependencies in Life Cycle Assessment. 2025 IEEE International Symposium on Systems Engineering (ISSE), Oct 2025, Palaiseau, France. ⟨hal-05240543⟩

STL

Year of publication 2025

HAL publication

Journal article

Thomas Gerald, Louis Tamames, Sofiane Ettayeb, Ha-Quang Le, Patrick Paroubek, et al. CQuAE: A new Contextualized QUestion Answering corpus on Education domain. Data and Knowledge Engineering, 2024, 151, pp.102305. ⟨10.1016/j.datak.2024.102305⟩. ⟨hal-05242257⟩

STL

Year of publication 2024

HAL publication

Book chapter

Tommaso Raso, Saulo Mendes Santos, Albert Rilliard, João A. Moraes. Defining and Identifying Discourse Markers in Spontaneous Speech. Miguel Oliveira, Jr. Prosodic Interfaces – Interdisciplinary Perspectives on Sound Patterns and Human Interaction, De Gruyter, pp.65-102, 2025, 978-3-11-105990-7. ⟨10.1515/9783111060309-003⟩. ⟨hal-05230528⟩

STL

Year of publication 2025

Available in free access

HAL publication

Conference paper

Clémence Sebe, Sarah Cohen-Boulakia, Olivier Ferret, Aurélie Névéol. Extracting Information in a Low-resource Setting: Case Study on Bioinformatics Workflows. Symposium on Intelligent Data Analysis (IDA 2025), May 2025, Konstanz, Germany. pp.274-287, ⟨10.1007/978-3-031-91398-3_21⟩. ⟨hal-05244222⟩

BioInfo, STL

Year of publication 2025

Available in free access

HAL publication

Journal article

Philippe Boula de Mareüil, Paolo Roseano. A speaking atlas of the languages of the Iberian Peninsula: focus on rhythm and varieties in contact. Dialectologia, 2025, 35, pp.27-54. ⟨10.1344/dialectologia.35.2⟩. ⟨hal-05263043⟩

STL

Year of publication 2025

Available in free access

HAL publication

Conference paper

Gaël Guennebaud, Anne-Laure Ligozat, Anne-Cécile Orgerie, Matthieu Simonin. Evaluating and Reporting the Carbon Footprint of Shared Computing Platforms: Choices and Limits. ISPDC 2025 – 24th IEEE International Symposium on Parallel and Distributed Computing, Jul 2025, Rennes, France. pp.1-7. ⟨hal-05195576⟩

STL

Year of publication 2025

Available in free access

HAL publication

Conference paper

Haohua Dong, Ana Manzano Rodríguez, Camille Guinaudeau, Shin’Ichi Satoh. Fairness Without Labels: Pseudo-Balancing for Bias Mitigation in Face Gender Classification. Second workshop on Fairness and ethics towards transparent AI: facing the chalLEnge through model Debiasing (FAILED) at the 2025 International Conference on Computer Vision, Oct 2025, Honolulu, HI, United States. ⟨hal-05210445⟩

STL

Year of publication 2025

Available in free access

HAL publication

Thesis

Nicolas Hiebel. Création éthique de données textuelles artificielles : application au domaine biomédical. Document and Text Processing. Université Paris-Saclay, 2025. French. ⟨NNT : 2025UPASG033⟩. ⟨tel-05185326⟩

STL

Year of publication 2025

Available in free access

HAL publication

Journal article

Philippe Boula de Mareüil, Alexis Pierrard, Albert Rilliard. Acoustic study of /r/ front fricatives in Bolivian Highland Spanish. Estudios de Fonética Experimental, 2025, 34, pp.41-56. ⟨10.1344/efe-2025-34-41-56⟩. ⟨hal-05157171⟩

STL

Year of publication 2025

Available in free access

HAL publication

Conference paper

Ana Manzano Rodríguez, Camille Guinaudeau, Shin Ichi Satoh. Uncovering Gender Biases in Gender Identification Models for Japanese Data Analysis. Workshop on Demographic Diversity in Computer Vision @ CVPR 2025, Jun 2025, Nashville (Tennessee), United States. ⟨hal-05154054⟩

STL

Year of publication 2025

Available in free access

HAL publication

Thesis

Jiahui Hu. Granular Insights into Financial Discourse : Fine-Grained Opinion Analysis of Expert Texts. Document and Text Processing. Université Paris-Saclay, 2023. English. ⟨NNT : 2023UPASG110⟩. ⟨tel-05153905⟩

AO, STL

Year of publication 2023

Available in free access

HAL publication

Journal article

Philippe Boula de Mareüil, Marc Evrard, Alexandre François, Antonio Romano. Computer modelling of innovations relative to Latin in contemporary Romance dialects. Isogloss. Open Journal of Romance Linguistics, 2025, 11 (3), pp.1-31. ⟨10.5565/rev/isogloss.423⟩. ⟨hal-05144863⟩

STL

Year of publication 2025

Available in free access

HAL publication

Journal article

Anne Baillot, Anne-Laure Ligozat. Introduction. Sobriété numérique. Humanités numériques, 2025, 11, ⟨10.4000/1498x⟩. ⟨hal-05143071⟩

STL

Year of publication 2025

Available in free access

HAL publication

Conference paper

Pierre Lepagnol, Sahar Ghannay, Thomas Gerald, Christophe Servan, Sophie Rosset. Leveraging Information Retrieval to Enhance Spoken Language Understanding Prompts in Few-Shot Learning. Interspeech 2025, Aug 2025, Rotterdam, Netherlands. ⟨10.21437/Interspeech.2025-175⟩. ⟨hal-05095796⟩

STL

Year of publication 2025

Available in free access

HAL publication

Journal article

Agata Savary. NLP-based Study of Universals of Linguistic Idiosyncrasy. Dagstuhl Reports, 2023, 13 (5), pp.64-67. ⟨hal-04323075⟩

ILES, STL

Year of publication 2023

HAL publication

Thesis

Mathieu Laï-King. Qualité des articles de recherche et modèles de langue neuronaux : applications au domaine biomédical. Artificial Intelligence [cs.AI]. Université Paris-Saclay, 2025. French. ⟨NNT : 2025UPASG031⟩. ⟨tel-05079724⟩

STL

Year of publication 2025

Available in free access

HAL publication

Preprint, working paper

Clément Morand, Anne-Laure Ligozat, Aurélie Névéol. Characterizing Goals and Impacts of Digitalization: The Case of Promises in French Healthcare Policies. 2025. ⟨hal-05066176⟩

STL

Year of publication 2025

Available in free access

HAL publication

Conference paper

Luc Mottin, Julien Gobeill, Jeevanthi Liyana Pathirana, Nona Naderi, Anaïs Mottaz, et al. Manuscript Classification to Support the Analysis of Biases in Publication Opportunities. The 35th Medical Informatics Europe Conference, May 2025, Glasgow, United Kingdom. ⟨10.3233/SHTI250475⟩. ⟨hal-05070636⟩

STL

Year of publication 2025

Available in free access

HAL publication

Report

Karin Dassas, Cyrille Bonamy, Bruno Bzeznik, Romaric David, Emmanuelle Frenoux, et al. Estimer l’impact carbone des activités numériques de l’Observatoire de Paris. EcoInfo. 2025, pp.1-47. ⟨hal-05068666⟩

STL

Year of publication 2025

Available in free access

HAL publication

Journal article

Nicolas Hiebel, Olivier Ferret, Karën Fort, Aurélie Névéol. Clinical text generation: Are we there yet? Annual Review of Biomedical Data Science, 2025, 8, pp.173-198. ⟨10.1146/annurev-biodatasci-103123-095202⟩. ⟨hal-05055957⟩

STL

Year of publication 2025

Available in free access

HAL publication

Journal article

Arezoo Saedi, Afsaneh Fatemi, Mohammad Ali Nematbakhsh, Sophie Rosset, Anne Vilnat. Entity search based on consumer preferences leveraging user reviews. Expert Systems with Applications, 2025, 275, pp.126990. ⟨10.1016/j.eswa.2025.126990⟩. ⟨hal-05047109⟩

STL

Year of publication 2025

Available in free access

HAL publication

Conference paper

Foucauld Estignard, Sahar Ghannay, Julien Girard-Satabin, Nicolas Hiebel, Aurélie Névéol. Evaluating the Confidentiality of Synthetic Clinical Texts Generated by Language Models. 23rd International Conference on Artificial Intelligence in Medicine (AIME), Jun 2025, Pavia, Italy. ⟨hal-05046326v2⟩

STL

Year of publication 2025

Available in free access

HAL publication

Conference paper

Lisa Raithel, Philippe Thomas, Bhuvanesh Verma, Roland Roller, Hui-Syuan Yeh, et al. Overview of #SMM4H 2024 – Task 2: Cross-Lingual Few-Shot Relation Extraction for Pharmacovigilance in French, German, and Japanese. The 9th Social Media Mining for Health Research and Applications (SMM4H 2024) Workshop and Shared Tasks, Association for Computational Linguistics, Aug 2024, Bangkok, Thailand. pp.170-182. ⟨hal-04781015⟩

STL

Year of publication 2024

Available in free access

HAL publication

Preprint, working paper

Mathilde Aguiar, Pierre Zweigenbaum, Nona Naderi. Am I eligible? Natural Language Inference for Clinical Trial Patient Recruitment: the Patient’s Point of View. 2025. ⟨hal-04992084⟩

STL

Year of publication 2025

Available in free access

HAL publication

Book chapter

Mathieu Constant, Marie Candito, Yannick Parmentier, Carlos Ramisch, Agata Savary. Construction, exploitation et exploration de ressources linguistiques pour le traitement automatique des expressions polylexicales en français : le projet PARSEME-FR. Lidia Becker; Julia Kuhn; Christina Ossenkop; Claudia Polzin-Haumann; Elton Prifti. Digitale romanistische Sprachwissenschaft: Stand und Perspektiven, Narr Francke Attempto Verlag GmbH + Co. KG, pp.219-250, 2023, Romanistisches Kolloquium, 978-3-8233-8506-6. ⟨hal-04995189⟩

ILES, STL

Year of publication 2023

HAL publication

Thesis

Rémi Uro. Détection et caractérisation des interruptions dans les interactions orales pour la description du comportement des femmes et des hommes dans les contenus audiovisuels. Computation and Language [cs.CL]. Université Paris-Saclay, 2024. French. ⟨NNT : 2024UPASG055⟩. ⟨tel-04994439⟩

STL

Year of publication 2024

Available in free access

HAL publication

Conference paper

Amel Fraisse, Patrick Paroubek, Ramit Goyal, Nassreddine Znaidi. Measuring Multilingualism in Online Public Access Catalogs. The ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL), Dec 2024, Hong Kong, China. ⟨10.1145/3677389.3702544⟩. ⟨hal-04986773⟩

ILES, STL

Year of publication 2024

HAL publication

Conference paper

Manon Scholivet, Agata Savary, Louis Estève, Marie Candito, Carlos Ramisch. SELEXINI – a large and diverse automatically parsed corpus of French. Building and Using Comparable Corpora (BUCC), Jan 2025, Abu Dhabi, United Arab Emirates. ⟨hal-04978746⟩

ILES, STL

Year of publication 2025

Available in free access

HAL publication

Thesis

Hui-Syuan Yeh. Prompt-based Relation Extraction for Pharmacovigilance. Computation and Language [cs.CL]. Université Paris-Saclay, 2024. English. ⟨NNT : 2024UPASG097⟩. ⟨tel-04968043⟩

STL

Year of publication 2024

Available in free access

HAL publication

Report

Sylvain Bouveret, Aurélie Bugeau, Emmanuelle Frenoux, Julien Lefevre, Laurent Lefèvre, et al. Quiz sur les impacts environnementaux du numérique. EcoInfo. 2025, pp.1-5. ⟨hal-04960328v2⟩

STL

Year of publication 2025

Available in free access

HAL publication

Thesis

Camille Challant. Représentation formelle avec AZee et contraintes grammaticales pour la langue des signes française. Formal Languages and Automata Theory [cs.FL]. Université Paris-Saclay, 2024. French. ⟨NNT : 2024UPASG086⟩. ⟨tel-04957486⟩

STL

Year of publication 2024

Available in free access

HAL publication
