I am a researcher in the ALMAnaCH project-team at Inria Paris, France. I specialise in Machine Translation (MT): I worked on contextual MT during my PhD at the LIMSI laboratory (now LISN) and on MT for low-resource languages during my postdoc at the University of Edinburgh. I currently work on a range of topics in MT and multilingual NLP, focusing mainly on language variation in both historical and contemporary texts (for example, user-generated content and dialectal variation), evaluation and resource creation. I am also a fellow of the PR[AI]RIE-PSAI research institution.
🗣️ This is how you pronounce my name: [ˈɹeɪtʃəl ˈbɔːdn̩]
🗞 News
- 🎤 15/06/26: Keynote talk at EAMT 2026: “Large Language Models and Machine Translation: From Low-Resource to Unseen Languages”. I presented some of our work on data generation, compositional translation, LLM reasoning and explicit reasoning. This research was carried out by Armel Zebaze and Malik Marmonier, co-supervised with Benoît Sagot.
- 📄 09/06: 1 paper accepted at AMTA 2026:
- 📄 05/25: 1 paper accepted at ICML 2026:
- 📄 04/26: 3 papers accepted at EAMT 2026:
  - When the Gold Standard Isn’t Necessarily Standard: Challenges of Evaluating the Translation of User-Generated Content. Nishimwe et al. (Technical track)
  - MetaDocEval: A Contrastive Framework for Evaluating Machine Translation Metrics at the Document-Level. Dahan et al. (Technical track)
  - The MaTOS Pipeline for the Translation of Scientific Abstracts on the HAL Platform. Tsolakis et al. (Implementations and case studies track)
- 📄 04/26: 1 paper accepted at ACL Findings 2026:
  - Gaperon: A Peppered English-French Generative Language Model Suite. Godey et al.
- 📄 03/26: 2 papers accepted at workshops co-located with LREC 2026:
  - Parallel Corpora of Scholarly Documents for English-French Machine Translation. Peng et al. (BUCC)
  - Pre-Editorial Normalization for Automatically Transcribed Medieval Manuscripts in Old French and Latin. Clérice et al. (LT4HALA)
- 📄 02/26: 2 papers accepted at LREC 2026:
- 📄 01/26: 1 paper accepted at the VarDial 2026 workshop:
- 📄 11/25: 4 papers published at EMNLP 2025:
  - Compositional Translation: A Novel LLM-based Approach for Low-resource Machine Translation. Zebaze et al.
  - TopXGen: Topic-Diverse Parallel Data Generation for Low-Resource Machine Translation. Zebaze et al.
  - Explicit Learning and the LLM in Machine Translation. Marmonier et al.
  - AFRIDOC-MT: Document-level MT Corpus for African Languages. Alabi et al.
- 📄 11/25: 4 papers published at WMT 2025:
  - Findings of the WMT25 General Machine Translation Shared Task: Time to Stop Evaluating on Easy Test Sets. Kocmi et al.
  - Self-Retrieval from Distant Contexts for Document-Level Machine Translation. Peng et al.
  - A French Version of the OLDI Seed Corpus. Marmonier et al.
  - RoCS-MT v2 at WMT 2025: Robust Challenge Set for Machine Translation. Bawden & Sagot.
- 🎓 11/25: Matthieu Futeral successfully defended his PhD on Multilingual and Multimodal Language Modelling, supervised by Benoît Sagot, Cordelia Schmid and me.
- 🎓 07/25: Lydia Nishimwe successfully defended her PhD on Robust Neural Machine Translation of User-Generated Content, supervised by Benoît Sagot and me.
- 📄 07/25: 1 paper published at ACL 2025:
- 🏆 06/25: Oriane Nédey won the best paper award at the RECITAL student conference for her paper “La traduction automatique dialectale: état de l’art et étude préliminaire sur le continuum dialectal de l’occitan”. Her PhD is supervised by Benoît Sagot, Thibault Clérice and me.
- 📄 06/25: 1 article published at TALN 2025:
- 📄 06/25: 2 articles published at MT Summit 2025:
  - Investigating Length Issues in Document-level Machine Translation. Peng et al.
  - MaTOS: Machine Translation for Open Science. Bawden et al.
- 📄 04/25: 2 papers published at NAACL 2025: