About me
I am a researcher at Inria, Paris in the ALMAnaCH
team, working on Natural Language Processing (NLP) and
more specifically Machine Translation (MT). I was
previously a research associate at the University of
Edinburgh, working on MT for low-resource languages,
having completed my PhD on the contextual MT of
dialogue at the LIMSI laboratory (now the LISN) in 2018 under the
supervision of Sophie Rosset and Thomas Lavergne.
Research Experience
Please see my CV
for a complete list of my research and teaching experience.
Natural Language Processing
and Machine Translation.
Holder of a "springboard" chair position in the
PRAIRIE
research institute (2021-present)
Working with Alexandra Birch, Rico Sennrich and
Barry Haddow
Boosting for model selection in syntactic parsing
Supervised by Benoit Crabbé
Modelling communicative acts for an embodied conversational agent
Supervised by Frédéric Landragin and Chloé Clavel
Tesnière Lucien (2015), Elements of structural syntax, translated by Timothy Osborne and Sylvain Kahane, John Benjamins, Amsterdam, U. Paris X.
Annotation, development and
semi-automatic correction of a treebank of spoken French for
syntactic dependencies (Rhapsodie Treebank of
Spoken French).
Supervised by Sylvain Kahane
Publications
PhD Thesis
Going beyond the sentence: Contextual Machine Translation of Dialogue
Rachel Bawden
supervised by Sophie Rosset and Thomas Lavergne.
29th November 2018. LIMSI, CNRS, Université Paris-Sud,
Université Paris-Saclay.
Committee members:
- President: Nicolas Sabouret
- Reviewers: Jörg Tiedemann and Loïc
Barrault
- Examiners: Lucia Specia and Andrei Popescu-Belis
Journal articles
Conference papers
MaTOS: traduction automatique pour la science ouverte.
Maud Bénard, Alexandra Mestivier, Natalie Kubler, Lichao
Zhu, Rachel Bawden, Eric De La Clergerie, Laurent Romary,
Mathilde Huguin, Jean-François Nominé, Ziqian Peng and François Yvon (2023).
In Actes de la 30e Conférence sur le
Traitement Automatique des Langues
Naturelles. TALN'23. Paris, France.
Findings of the WMT 2022 Biomedical Translation Shared Task: Monolingual Clinical Case Reports.
Mariana Neves, Antonio Jimeno Yepes, Amy Siu, Roland Roller, Philippe Thomas, Maika Vicente Navarro, Lana Yeganova, Dina Wiemann, Giorgio Maria Di Nunzio, Federica Vezzani, Christel Gérardin, Rachel Bawden, Darryl Johan Estrada, Salvador Lima-López, Eulàlia Farré-Maduell, Martin Krallinger, Cristian Grozea, Aurélie Névéol (2022).
In Proceedings of the Seventh Conference
on Machine Translation. WMT'22. Pages 694-723. Abu
Dhabi, United Arab Emirates.
Findings of the 2022 conference on machine translation (WMT22).
Tom Kocmi, Rachel Bawden, Ondřej Bojar, Anton Dvorkovich,
Christian Federmann, Mark Fishel, Thamme Gowda, Yvette
Graham, Roman Grundkiewicz, Barry Haddow, Rebecca
Knowles, Philipp Koehn, Christof Monz, Makoto Morishita,
Masaaki Nagata, Toshiaki Nakazawa, Michal Novák, Martin
Popel, Maja Popović (2022).
In Proceedings of the Seventh Conference
on Machine Translation. WMT'22. Pages 1-45. Abu
Dhabi, United Arab Emirates.
Automatic Normalisation of Early Modern French.
Rachel Bawden, Jonathan Poinhos, Eleni Kogkitsidou, Philippe Gambette, Benoît Sagot and Simon Gabay (2022).
In Proceedings of the 13th Language
Resources and Evaluation
Conference. LREC'22. Pages 3354–3366. Marseille, France.
Multitask Prompted Training Enables Zero-Shot Task Generalization.
Victor Sanh, Albert Webson, Colin Raffel, Stephen H. Bach,
Lintang Sutawika, Zaid Alyafeai, Antoine Chaffin, Arnaud
Stiegler, Teven Le Scao, Arun Raja, Manan Dey, M Saiful
Bari, Canwen Xu, Urmish Thakker, Shanya Sharma Sharma,
Eliza Szczechla, Taewoon Kim, Gunjan Chhablani, Nihal
Nayak, Debajyoti Datta, Jonathan Chang, Mike Tian-Jian
Jiang, Han Wang, Matteo Manica, Sheng Shen, Zheng Xin
Yong, Harshit Pandey, Rachel Bawden, Thomas Wang, Trishala Neeraj, Jos Rozen, Abheesht Sharma, Andrea Santilli, Thibault Fevry, Jason Alan Fries, Ryan Teehan, Tali Bers, Stella Biderman, Leo Gao, Thomas Wolf, Alexander M. Rush (2022).
In Proceedings of the 10th International
Conference on Learning Representations. ICLR'22. Online.
Findings of the WMT 2021 biomedical translation shared task: Summaries of animal experiments as new test set.
Lana Yeganova, Dina Wiemann, Mariana Neves, Federica Vezzani, Amy Siu, Iñigo Jauregi
Unanue, Maite Oronoz, Nancy Mah, Aurélie Névéol, David Martinez, Rachel Bawden, Giorgio Maria Di Nunzio, Roland Roller, Philippe Thomas, Cristian Grozea, Olatz Perez de Viñaspre, Maika Vicente Navarro and Antonio Jimeno Yepes (2021).
In Proceedings of the 6th Conference on
Machine Translation. WMT'21. Online.
Expanding the content model of annotationBlock.
Alexandre Bartz, Juliette Janes, Laurent Romary, Philippe
Gambette, Rachel Bawden, Pedro
Javier Ortiz Suárez, Benoît Sagot, Simon Gabay (2021).
In Proceedings of Next Gen TEI, 2021-TEI Conference and Members’ Meeting.
Findings of the WMT 2020 Biomedical Translation Shared Task: Basque, Italian and Russian as New Additional Languages.
Rachel Bawden, Giorgio Maria Di Nunzio, Cristian Grozea, Iñigo Jauregi Unanue, Antonio Jimeno Yepes, Nancy Mah, David Martinez, Aurélie Névéol, Mariana Neves, Maite Oronoz, Olatz Perez de Viñaspre, Massimo Piccardi, Roland Roller, Amy Siu, Philippe Thomas, Federica Vezzani, Maika Vicente Navarro, Dina Wiemann, Lana Yeganova (2020).
In Proceedings of the 5th Conference on
Machine Translation. WMT'20. Pages 660–687. Online.
Document-level Neural MT: A Systematic Comparison.
António Lopes, M. Amin Farajian, Rachel Bawden, Michael Zhang and André T. Martins (2020).
In Proceedings of the 22nd
Annual Conference of the European Association for
Machine Translation. EAMT'20. Pages 225–234. Lisbon, Portugal.
Dataset
Findings of the WMT 2019 Biomedical Translation Shared Task: Evaluation for MEDLINE Abstracts and Biomedical Terminologies.
Rachel Bawden, Kevin Bretonnel Cohen, Cristian Grozea, Antonio Jimeno Yepes, Madeleine Kittner, Martin Krallinger, Nancy Mah, Aurélie Névéol, Mariana Neves, Felipe Soares, Amy Siu, Karin Verspoor, and Maika Vicente Navarro (2019).
In Proceedings of the Fourth Conference
on Machine Translation. WMT'19. Florence, Italy.
Global under-resourced media translation (GoURMET).
Alexandra Birch, Barry Haddow, Ivan Titov, Antonio Valerio Miceli Barone, Rachel Bawden, Felipe Sánchez-Martínez, Mikel L. Forcada, Miquel Esplà-Gomis, Víctor Sánchez-Cartagena, Juan Antonio Pérez-Ortiz, Wilker Aziz, Andrew Secker, and Peggy van der Kreeft (2019).
In Proceedings of Machine Translation Summit XVII Volume 2: Translator, Project and User Tracks. Dublin, Ireland.
Preprints
Book chapters
Chapter 4. Microsyntactic annotation.
Sylvain Kahane, Kim Gerdes, and Rachel Bawden.
In Rhapsodie – A Prosodic and Syntactic Treebank for Spoken French.
Eds. Anne Lacheret-Dujour, Sylvain Kahane, and Paola Pietrandrea.
John Benjamins, Amsterdam, 2019.
Chapter 7. Annotation tools for syntax.
Kim Gerdes, Sylvain Kahane, Rachel Bawden, Julie Belião, Eric de la Clergerie, and Ilaine Wang.
In Rhapsodie – A Prosodic and Syntactic Treebank for Spoken French.
Eds. Anne Lacheret-Dujour, Sylvain Kahane, and Paola Pietrandrea.
John Benjamins, Amsterdam, 2019.
Chapter 15. Exploration of the Rhapsodie corpus: Data structure, formats and query tools.
Anne Lacheret-Dujour, Sylvain Kahane, Rachel Bawden, Serge Fleury and Ilaine Wang.
In Rhapsodie – A Prosodic and Syntactic Treebank for Spoken French.
Eds. Anne Lacheret-Dujour, Sylvain Kahane, and Paola Pietrandrea.
John Benjamins, Amsterdam, 2019.
Technical report
Other
Supervision
PhD students
Analogy for multilingual NLP.
Co-supervised with Benoît Sagot.
Evaluation of the machine translation
of scientific documents. Co-supervised with François Yvon (CNRS).
Machine translation of scientific
documents. Co-supervised with François Yvon (CNRS).
Robust Neural Machine Translation. Co-supervised with Benoît Sagot.
Multimodal Machine Translation. Co-supervised with
Ivan Laptev, Benoît Sagot and Cordelia Schmid.
Neural models of language
evolution. Co-supervised with Benoît Sagot and Laurent Romary.
Interns and engineers
Data collection and translation models for a regional language of France (COLaF project)
Domain adaptation for neural machine translation in low-resource settings.
Linguistically inspired language models for closely related languages.
Domain adaptation in NMT.
Contrastive training for NMT models
for lexical disambiguation. Co-supervised with Benoît Sagot.
Investigating the effect of input representations on language sharing in multilingual models.
Exploration of multilingual
and multimodal word embeddings. Co-supervised with Benoît Sagot, Cordelia Schmid and Ivan Laptev.
Automatic tools for improving jurisprudence consistency. Co-supervised with Benoît Sagot and in collaboration with the Cour de Cassation.
Machine Translation of Noisy Texts. Co-supervised with Djamé Seddah.
Improving Low-Resource Neural Machine Translation of Related Languages by Transfer Learning. Co-supervised with Alexandra Birch.
Continuous learning for
Neural Machine Translation from Human
Post-edits. Co-supervised with Alexandra Birch.
Integrating document
structure information into Neural Machine
Translation using cache-based models. Co-supervised with Annie Louis and Bonnie Webber.
Exploiting Predictable Document Substructure in Neural Machine Translation. Co-supervised with Annie Louis and Bonnie Webber.
Teaching Experience
4th year lectures, tutorials and practical classes (30hrs/year)
2nd year practical classes (10hrs)
3rd year tutorials and practical classes (27hrs)
3rd year tutorials and practical classes (24hrs/year)
3rd year tutorials and practical classes (28hrs)