Models, Methods, and Multilingualism (M3)

Coordination: Gilles ADDA

The Models, Methods, and Multilingualism team develops models and methods that support both the discovery of fundamental properties of language and the implementation of efficient systems to process it. We are interested in language in all its dimensions and modalities, with a strong emphasis on the multilingual dimension. The methods and models developed by the team are diverse by nature: computational (neural models, stochastic or symbolic methods), linguistic (language typology, linguistic diversity and universals, affects), or societal (accessibility, nudges, language preservation, processing of under-resourced languages and dialects). One of the team’s common aims is to relate language universals to the characteristics of language diversity and variation within a unified vision of linguistic and statistical (“automatic processing”) modeling of languages.

We can illustrate the team’s activity through a structured set of themes and sub-themes:

Universals in multilingual language modeling

Keywords: Linguistic diversity and universality in modeling; Representation of oral languages; Universal phonetic modeling and representation; Unified multilingual modeling and automatic identification of idiomaticity; Large multilingual and multimodal language models; Generative AI; Universal and cultural models of affects; Syntax of oral languages; Quantitative typology; Comparable corpora; Accessibility; Evaluation and resources; Multilingual generic systems: speech recognition, text generation, speech synthesis.

Methods and models for under-resourced languages

Keywords: Documentation of under-resourced languages; Scientific policies for endangered languages; Ethical and societal impact; Automatic processing of under-resourced languages; Massively multilingual models and interlingual transfer; Portability from a well-resourced to an under-resourced language.

Machine learning for NLP

Keywords: Machine learning and inference algorithms for structured prediction; Weak or unsupervised learning; Continuous learning; Representation learning and meta-learning; Learning in the context of affective interactions.

Corpus linguistics, interlingual and intralingual variation

Keywords: Accents, dialects, and varieties: dialectometry (geoprosody) and linguistic cartography; Speaking styles; Variation of prosodic codes between languages and cultures (symbolic codes and socially coded attitudes); Expressive and multimodal prosody: illocutions, attitudes, social affects; Voice, voice strength, vocal quality, social uses.

Modeling of affective behaviors

Keywords: Automatic learning and detection of affective behaviors from paralinguistic and linguistic cues; Adaptation of large acoustic and linguistic models to emotion detection; Detection of abnormal behaviors and nudges in interaction; Ethical and societal impact of affects modeling and nudges.

Coordination

  • Sciences et Technologies des Langues

    M3

    Adda Gilles

    Head of M3

    Engineer and researcher

Team members

Publications

  • Conference paper

    Théo Gigant, Camille Guinaudeau, Marc Decombas, Frédéric Dufaux. Mitigating the Impact of Reference Quality on Evaluation of Summarization Systems with Reference-Free Metrics. The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024), Nov 2024, Miami (FL), United States. ⟨hal-04720645⟩

    STL

    Available in free access

  • Conference paper

    Emmanuella Martinod, Michael Filhol. Formal Representation of Interrogation in French Sign Language. Proceedings of the 11th Workshop on Representation and Processing of Sign Languages, May 2024, Turin, Italy. ⟨hal-04712681⟩

    STL

    Available in free access

  • Conference paper

    Michael Filhol, Thomas von Ascheberg. A software editor for the AZVD graphical Sign Language representation system. Workshop on the Representation and Processing of Sign Languages, May 2024, Turin, Italy. ⟨hal-04712674⟩

    STL

    Available in free access

  • Conference paper

    Emmanuella Martinod, Michael Filhol. Examining interrogative marking in French Sign Language with the AZee approach. Clause-type marking in the visual modality, workshop at the Annual Conference of the German Linguistics Society, German Linguistics Society, Feb 2024, Bochum, Germany. ⟨hal-04709019⟩

    STL

    Available in free access

  • Conference paper

    Paritosh Sharma, Camille Challant, Michael Filhol. Facial Expressions for Sign Language Synthesis using FACSHuman and AZee. 11th Workshop on the Representation and Processing of Sign Languages: Evaluation of Sign Language Resources, May 2024, Turin, Italy. ⟨hal-04709105⟩

    STL

    Available in free access

  • Conference paper

    Paritosh Sharma, Michael Filhol. Sign Language Synthesis using Pose Priors. MOCO ’24: 9th International Conference on Movement and Computing, May 2024, Utrecht, Netherlands. pp.1-4, ⟨10.1145/3658852.3659080⟩. ⟨hal-04709203⟩

    STL

    Available in free access

  • Journal article

    Pierre La Rocca, Gaël Guennebaud, Aurélie Bugeau, Anne-Laure Ligozat. Estimating The Carbon Footprint Of Digital Agriculture Deployment: A Parametric Bottom-Up Modelling Approach. Journal of Industrial Ecology, In press, ⟨10.1111/jiec.13568⟩. ⟨hal-04708774⟩

    STL

    Available in free access

  • Journal article

    Fanny Ducel, Aurélie Névéol, Karën Fort. La recherche sur les biais dans les modèles de langue est biaisée : état de l’art en abyme. Revue TAL : traitement automatique des langues, 2024, 64 (3). ⟨hal-04710191⟩

    STL

    Available in free access

  • Conference paper

    Carlos Cuevas Villarmin, Sarah Cohen-Boulakia, Nona Naderi. Reproducibility in Named Entity Recognition: A Case Study Analysis. 2024 IEEE 20th International Conference on e-Science (e-Science), Sep 2024, Osaka, Japan. pp.1-10, ⟨10.1109/e-Science62913.2024.10678721⟩. ⟨hal-04706673⟩

    BioInfo, STL

  • Conference paper

    Rémi Uro, Marie Tahon, David Doukhan, Antoine Laurent, Albert Rilliard. Detecting the terminality of speech-turn boundary for spoken interactions in French TV and Radio content. Interspeech 2024, Itshak Lapidot; Sharon Gannot, Sep 2024, Kos, Greece. pp.3560-3564, ⟨10.21437/interspeech.2024-1163⟩. ⟨hal-04694968⟩

    STL

    Available in free access