About me
I am a researcher at Inria, Paris in the ALMAnaCH
team, working on Natural Language Processing (NLP) and
more specifically Machine Translation (MT). I was
previously a research associate at the University of
Edinburgh, working on MT for low-resource languages,
having completed my PhD on the contextual MT of
dialogue at the LIMSI laboratory (now the LISN) in 2018 under the
supervision of Sophie Rosset and Thomas Lavergne.
Research Experience
Please see my CV
for a complete list of my research and teaching experience.
Natural Language Processing
and Machine Translation.
Holder of a "springboard" chair position in the
PRAIRIE
research institute (2021-present)
Working with Alexandra Birch, Rico Sennrich and
Barry Haddow
Boosting for model selection in syntactic parsing
Supervised by Benoit Crabbé
Modelling communicative acts for an embodied conversational agent
Supervised by Frédéric Landragin and Chloé Clavel
Tesnière Lucien (2015), Elements of structural syntax, translated by Timothy Osborne and Sylvain Kahane, John Benjamins, Amsterdam, U. Paris X.
Annotation, development and
semi-automatic correction of a treebank of spoken French for
syntactic dependencies (Rhapsodie Treebank of
Spoken French).
Supervised by Sylvain Kahane
Publications
PhD Thesis
Going beyond the sentence: Contextual Machine Translation of Dialogue
Rachel Bawden
supervised by Sophie Rosset and Thomas Lavergne.
29th November 2018. LIMSI, CNRS, Université Paris-Sud,
Université Paris-Saclay.
Committee members:
- President: Nicolas Sabouret
- Reviewers: Jörg Tiedemann and Loïc
Barrault
- Examiners: Lucia Specia and Andrei Popescu-Belis
Journal articles
Conference papers
Évaluer BLOOM en français.
Rachel Bawden, Hatim Bourfoune, Bertrand Cabot, Nathan Cassereau, Pierre Cornette, Marco Naguib, François Yvon (2024).
In Proceedings of EvalLLM2024 : Atelier sur l'évaluation des modèles génératifs (LLM) et challenge d'extraction d'information few-shot.
Toulouse, France.
An extended version of the paper (technical report) can be found here.
Translate your Own: a Post-Editing Experiment in the NLP domain.
Rachel Bawden, Ziqian Peng, Maud
Bénard, Eric Villemonte de La Clergerie, Raphaël
Esamotunu, Mathilde Huguin, Natalie Kübler, Alexandra
Mestivier, Mona Michelot, Laurent Romary, Lichao Zhu and François Yvon (2024).
In Proceedings of The 25th Annual
Conference of the European Association for Machine
Translation EAMT'24. Pages 431–443. Sheffield, UK.
Reconnaissance des écritures dans les imprimés.
Simon Gabay, Thibault Clérice, Pauline Jacsont, Elina
Leblanc, Marie Jeannot-Tirole, Sonia Solfrini, Sophie
Dolto, Floriane Goy, Carmen Carrasco Luján, Maddalena
Zaglio, Myriam Perregaux, Juliette Janès, Benoît Sagot,
Rachel Bawden, Rasul Dent, Oriane Nédey, Alix Chagué (2024).
In Proceedings of Humanistica
2024-Colloque annuel de l'Association francophone des
humanités numériques. Meknès, Morocco.
Findings of the 2023 conference on machine translation (WMT23): LLMs are here but not quite there yet.
Tom Kocmi, Eleftherios Avramidis, Rachel Bawden, Ondřej Bojar, Anton
Dvorkovich, Christian Federmann, Mark Fishel, Markus
Freitag, Thamme Gowda, Roman Grundkiewicz, Barry Haddow,
Philipp Koehn, Benjamin Marie, Christof Monz, Makoto
Morishita, Kenton Murray, Makoto Nagata, Toshiaki
Nakazawa, Martin Popel, Maja Popović and Mariya Shmatova (2023).
In Proceedings of the Eighth Conference on Machine Translation. WMT'23. Pages 1–42. Singapore.
Findings of the WMT 2023 biomedical translation shared task: Evaluation of ChatGPT 3.5 as a comparison system.
Mariana Neves, Antonio Jimeno Yepes, Aurélie Névéol,
Rachel Bawden, Giorgio Maria Di
Nunzio, Roland Roller, Philippe Thomas, Federica Vezzani,
Maika Vicente Navarro, Lana Yeganova, Dina Wiemann and Cristian Grozea (2023).
In Proceedings of the Eighth Conference on Machine Translation. WMT'23. Pages 43–54. Singapore.
MaTOS: traduction automatique pour la science ouverte.
Maud Bénard, Alexandra Mestivier, Natalie Kübler, Lichao
Zhu, Rachel Bawden, Eric De La Clergerie, Laurent Romary,
Mathilde Huguin, Jean-François Nominé, Ziqian Peng and François Yvon (2023).
In Actes de la 30e Conférence sur le
Traitement Automatique des Langues
Naturelles. TALN'23. Paris, France.
Findings of the WMT 2022 Biomedical Translation Shared Task: Monolingual Clinical Case Reports.
Mariana Neves, Antonio Jimeno Yepes, Amy Siu, Roland Roller, Philippe Thomas, Maika Vicente Navarro, Lana Yeganova, Dina Wiemann, Giorgio Maria Di Nunzio, Federica Vezzani, Christel Gérardin, Rachel Bawden, Darryl Johan Estrada, Salvador Lima-López, Eulàlia Farré-Maduell, Martin Krallinger, Cristian Grozea, Aurélie Névéol (2022).
In Proceedings of the Seventh Conference
on Machine Translation. WMT'22. Pages 694–723. Abu
Dhabi, United Arab Emirates.
Findings of the 2022 conference on machine translation (WMT22).
Tom Kocmi, Rachel Bawden, Ondřej Bojar, Anton Dvorkovich,
Christian Federmann, Mark Fishel, Thamme Gowda, Yvette
Graham, Roman Grundkiewicz, Barry Haddow, Rebecca
Knowles, Philipp Koehn, Christof Monz, Makoto Morishita,
Masaaki Nagata, Toshiaki Nakazawa, Michal Novák, Martin
Popel, Maja Popović (2022).
In Proceedings of the Seventh Conference
on Machine Translation. WMT'22. Pages 1–45. Abu
Dhabi, United Arab Emirates.
Automatic Normalisation of Early Modern French.
Rachel Bawden, Jonathan Poinhos, Eleni Kogkitsidou, Philippe Gambette, Benoît Sagot and Simon Gabay (2022).
In Proceedings of the 13th Language
Resources and Evaluation
Conference. LREC'22. Pages 3354–3366. Marseille, France.
Multitask Prompted Training Enables Zero-Shot Task Generalization.
Victor Sanh, Albert Webson, Colin Raffel, Stephen H. Bach,
Lintang Sutawika, Zaid Alyafeai, Antoine Chaffin, Arnaud
Stiegler, Teven Le Scao, Arun Raja, Manan Dey, M Saiful
Bari, Canwen Xu, Urmish Thakker, Shanya Sharma Sharma,
Eliza Szczechla, Taewoon Kim, Gunjan Chhablani, Nihal
Nayak, Debajyoti Datta, Jonathan Chang, Mike Tian-Jian
Jiang, Han Wang, Matteo Manica, Sheng Shen, Zheng Xin
Yong, Harshit Pandey, Rachel Bawden, Thomas Wang, Trishala Neeraj, Jos Rozen, Abheesht Sharma, Andrea Santilli, Thibault Fevry, Jason Alan Fries, Ryan Teehan, Tali Bers, Stella Biderman, Leo Gao, Thomas Wolf, Alexander M. Rush (2022).
In Proceedings of the 10th International
Conference on Learning Representations. ICLR'22. Online.
Findings of the WMT 2021 biomedical translation shared task: Summaries of animal experiments as new test set.
Lana Yeganova, Dina Wiemann, Mariana Neves, Federica Vezzani, Amy Siu, Iñigo Jauregi
Unanue, Maite Oronoz, Nancy Mah, Aurélie Névéol, David Martinez, Rachel Bawden, Giorgio Maria Di Nunzio, Roland Roller, Philippe Thomas, Cristian Grozea, Olatz Perez de Viñaspre, Maika Vicente Navarro and Antonio Jimeno Yepes (2021).
In Proceedings of the 6th Conference on
Machine Translation. WMT'21. Online.
Expanding the content model of annotation Block.
Alexandre Bartz, Juliette Janès, Laurent Romary, Philippe
Gambette, Rachel Bawden, Pedro
Javier Ortiz Suárez, Benoît Sagot, Simon Gabay (2021).
In Proceedings of Next Gen TEI, 2021-TEI Conference and Members’ Meeting.
Findings of the WMT 2020 Biomedical Translation Shared Task: Basque, Italian and Russian as New Additional Languages.
Rachel Bawden, Giorgio Maria Di Nunzio, Cristian Grozea, Iñigo Jauregi Unanue, Antonio Jimeno Yepes, Nancy Mah, David Martinez, Aurélie Névéol, Mariana Neves, Maite Oronoz, Olatz Perez de Viñaspre, Massimo Piccardi, Roland Roller, Amy Siu, Philippe Thomas, Federica Vezzani, Maika Vicente Navarro, Dina Wiemann, Lana Yeganova (2020).
In Proceedings of the 5th Conference on
Machine Translation. WMT'20. Pages 660–687. Online.
Document-level Neural MT: A Systematic Comparison.
António Lopes, M. Amin Farajian, Rachel Bawden, Michael Zhang and André T. Martins (2020).
In Proceedings of the 22nd
Annual Conference of the European Association for
Machine Translation. EAMT'20. Pages 225–234. Lisbon, Portugal.
Dataset
Findings of the WMT 2019 Biomedical Translation Shared Task: Evaluation for MEDLINE Abstracts and Biomedical Terminologies.
Rachel Bawden, Kevin Bretonnel Cohen, Cristian Grozea, Antonio Jimeno Yepes, Madeleine Kittner, Martin Krallinger, Nancy Mah, Aurélie Névéol, Mariana Neves, Felipe Soares, Amy Siu, Karin Verspoor, and Maika Vicente Navarro (2019).
In Proceedings of the Fourth Conference
on Machine Translation. WMT'19. Florence, Italy.
Global under-resourced media translation (GoURMET).
Alexandra Birch, Barry Haddow, Ivan Titov, Antonio Valerio Miceli Barone, Rachel Bawden, Felipe Sánchez-Martínez, Mikel L. Forcada, Miquel Esplà-Gomis, Víctor Sánchez-Cartagena, Juan Antonio Pérez-Ortiz, Wilker Aziz, Andrew Secker, and Peggy van der Kreeft (2019).
In Proceedings of Machine Translation Summit XVII Volume 2: Translator, Project and User Tracks. Dublin, Ireland.
Preprints
Book chapters
Chapter 4. Microsyntactic annotation.
Sylvain Kahane, Kim Gerdes, and Rachel Bawden.
In Rhapsodie – A Prosodic and Syntactic Treebank for Spoken French.
Eds. Anne Lacheret-Dujour, Sylvain Kahane, and Paola Pietrandrea.
John Benjamins, Amsterdam, 2019.
Chapter 7. Annotation tools for syntax.
Kim Gerdes, Sylvain Kahane, Rachel Bawden, Julie Belião, Eric de la Clergerie, and Ilaine Wang.
In Rhapsodie – A Prosodic and Syntactic Treebank for Spoken French.
Eds. Anne Lacheret-Dujour, Sylvain Kahane, and Paola Pietrandrea.
John Benjamins, Amsterdam, 2019.
Chapter 15. Exploration of the Rhapsodie corpus: Data structure, formats and query tools.
Anne Lacheret-Dujour, Sylvain Kahane, Rachel Bawden, Serge Fleury and Ilaine Wang.
In Rhapsodie – A Prosodic and Syntactic Treebank for Spoken French.
Eds. Anne Lacheret-Dujour, Sylvain Kahane, and Paola Pietrandrea.
John Benjamins, Amsterdam, 2019.
Technical report
Other
Supervision
PhD students
Analogy for multilingual NLP.
Co-supervised with Benoît Sagot.
Evaluation of the machine translation
of scientific documents. Co-supervised with François Yvon (CNRS).
Machine translation of scientific
documents. Co-supervised with François Yvon (CNRS).
Robust Neural Machine Translation. Co-supervised with Benoît Sagot.
Multimodal Machine Translation. Co-supervised with
Ivan Laptev, Benoît Sagot and Cordelia Schmid.
Neural models of language
evolution. Co-supervised with Benoît Sagot and Laurent Romary.
Interns and engineers
Translating with large language models without parallel data for low-resource languages (TraLaLaM project)
Data collection and translation models for a regional language of France (COLaF project)
Domain adaptation for neural machine translation in low-resource settings.
Linguistically inspired language models for closely related languages.
Domain adaptation in NMT.
Contrastive training for NMT models
for lexical disambiguation. Co-supervised with Benoît Sagot.
Investigating the effect of input representations on language sharing in multilingual models.
Exploration of multilingual
and multimodal word embeddings. Co-supervised with Benoît Sagot, Cordelia Schmid and Ivan Laptev.
Automatic tools for improving jurisprudence consistency. Co-supervised with Benoît Sagot and in collaboration with the Cour de Cassation.
Machine Translation of Noisy Texts. Co-supervised with Djamé Seddah.
Improving Low-Resource Neural Machine Translation of Related Languages by Transfer Learning. Co-supervised with Alexandra Birch.
Continuous learning for
Neural Machine Translation from Human
Post-edits. Co-supervised with Alexandra Birch.
Integrating document
structure information into Neural Machine
Translation using cache-based models. Co-supervised with Annie Louis and Bonnie Webber.
Exploiting Predictable Document Substructure in Neural Machine Translation. Co-supervised with Annie Louis and Bonnie Webber.
Teaching
Experience
Lecture on Machine Translation in the context of the
Algorithms for speech and natural language processing course.
4th year lectures, tutorials and practical classes (30hrs/year)
2nd year practical classes (10hrs)
3rd year tutorials and practical classes (27hrs)
3rd year tutorials and practical classes (24hrs/year)
3rd year tutorials and practical classes (28hrs)