CNU, Sections 7 (Linguistics) and 27 (Informatics)
2019
PhD in Computer Science
LIMSI, CNRS, Univ. Paris-Sud, Université Paris-Saclay, France
2015 - 2018
MA in Computational Linguistics
Université Denis Diderot (Paris VII)
2013 - 2015
1st year of MA in Linguistic Engineering
Université Sorbonne Nouvelle (Paris III)
2012 - 2013
BA in French and Linguistics
Keble College, University of Oxford
2007 - 2011
Languages
English (Native)
French (Bilingual)
About me
I am a researcher at Inria, Paris in the ALMAnaCH
team, working on Natural Language Processing (NLP) and
more specifically Machine Translation (MT). I was
previously a research associate at the University of
Edinburgh, working on MT for low-resource languages,
having completed my PhD on the contextual MT of
dialogue at the LIMSI laboratory (now the LISN) in 2018 under the
supervision of Sophie Rosset and Thomas Lavergne.
Research Experience
Please see my CV
for a complete list of my research and teaching experience.
Natural Language Processing
and Machine Translation.
Holder of a "springboard" chair position in the
PRAIRIE
research institute (2021-present)
Research Associate
2018 - 2020
ILCC, University of Edinburgh
Supervised by Alexandra
Birch. Working on the MTStretch
fellowship (held by Alexandra Birch) and the
GoURMET EU
project (leader of the work package on morphological modelling)
PhD in Computer Science
2015 - 2018
LIMSI, CNRS, Univ. Paris-Sud, Université Paris-Saclay, France
Working with Alexandra Birch, Rico Sennrich and
Barry Haddow
Master's research placement
February - August 2015
Alpage-Inria, Paris, France
Boosting for model selection in syntactic parsing
Supervised by Benoit Crabbé
Master's research placement
May 2014 - August 2014
LaTTiCe (CNRS/Paris III) and Télécom Paris-Tech, Paris, France
Modelling communicative acts for an embodied conversational agent
Supervised by Frédéric Landragin and Chloé Clavel
Proof-reading and revision
2013
Tesnière Lucien (2015), Elements of structural syntax, translated by Timothy Osborne and Sylvain Kahane, John Benjamins, Amsterdam, U. Paris X.
Master's research placement
March - August 2013
MoDyCo Laboratory (CNRS/Paris X), Paris, France
Annotation, development and
semi-automatic correction of a treebank of spoken French for
syntactic dependencies (Rhapsodie Treebank of
Spoken French). Supervised by Sylvain Kahane
Survey of low-resource machine translation
Barry Haddow, Rachel Bawden, Antonio Valerio Miceli
Barone, Jindřich Helcl, Alexandra Birch (2022)
Computational Linguistics. 48(3):673-732.
À propos des difficultés à traduire automatiquement de longs documents.
Ziqian Peng, Rachel Bawden and François Yvon (2024).
In Proceedings of the 31st Conférence sur
le Traitement Automatique des Langues Naturelles, volume
1: articles longs et prises de position.
TALN'24. Pages 2-21. Toulouse, France.
Évaluer BLOOM en français.
Rachel Bawden, Hatim Bourfoune, Bertrand Cabot, Nathan Cassereau, Pierre Cornette, Marco Naguib, François Yvon (2024).
In Proceedings of EvalLLM2024 : Atelier sur l'évaluation des modèles génératifs (LLM) et challenge d'extraction d'information few-shot.
Toulouse, France.
An extended version of the paper (technical report) can be
found here.
Translate your Own: a Post-Editing Experiment in the NLP domain.
Rachel Bawden, Ziqian Peng, Maud
Bénard, Eric Villemonte de La Clergerie, Raphaël
Esamotunu, Mathilde Huguin, Natalie Kübler, Alexandra
Mestivier, Mona Michelot, Laurent Romary, Lichao Zhu and François Yvon (2024).
In Proceedings of The 25th Annual
Conference of the European Association for Machine
Translation EAMT'24. Pages 431–443. Sheffield, UK.
Making Sentence Embeddings Robust to User-Generated Content.
Lydia Nishimwe, Benoît Sagot and Rachel Bawden (2024).
In Proceedings of the 2024 Joint
International Conference on Computational Linguistics,
Language Resources and Evaluation (LREC-COLING
2024). Pages 10984–10998. Torino, Italia.
Reconnaissance des écritures dans les imprimés.
Simon Gabay, Thibault Clérice, Pauline Jacsont, Elina
Leblanc, Marie Jeannot-Tirole, Sonia Solfrini, Sophie
Dolto, Floriane Goy, Carmen Carrasco Luján, Maddalena
Zaglio, Myriam Perregaux, Juliette Janès, Benoît Sagot,
Rachel Bawden, Rasul Dent, Oriane Nédey, Alix Chagué (2024).
In Proceedings of Humanistica
2024-Colloque annuel de l'Association francophone des
humanités numériques. Meknès, Morocco.
Findings of the 2023 conference on machine translation (WMT23): LLMs are here but not quite there yet.
Tom Kocmi, Eleftherios Avramidis, Rachel Bawden, Ondřej Bojar, Anton
Dvorkovich, Christian Federmann, Mark Fishel, Markus
Freitag, Thamme Gowda, Roman Grundkiewicz, Barry Haddow,
Philipp Koehn, Benjamin Marie, Christof Monz, Makoto
Morishita, Kenton Murray, Masaaki Nagata, Toshiaki
Nakazawa, Martin Popel, Maja Popović and Mariya Shmatova (2023).
In Proceedings of the Eighth Conference on Machine Translation. WMT'23. Pages 1–42. Singapore.
MaTOS: traduction automatique pour la science ouverte.
Maud Bénard, Alexandra Mestivier, Natalie Kübler, Lichao
Zhu, Rachel Bawden, Eric De La Clergerie, Laurent Romary,
Mathilde Huguin, Jean-François Nominé, Ziqian Peng and François Yvon (2023).
In Actes de la 30e Conférence sur le
Traitement Automatique des Langues
Naturelles. TALN'23. Paris, France.
Findings of the WMT 2022 Biomedical Translation Shared Task: Monolingual Clinical Case Reports.
Mariana Neves, Antonio Jimeno Yepes, Amy Siu, Roland Roller, Philippe Thomas, Maika Vicente Navarro, Lana Yeganova, Dina Wiemann, Giorgio Maria Di Nunzio, Federica Vezzani, Christel Gérardin, Rachel Bawden, Darryl Johan Estrada, Salvador Lima-López, Eulàlia Farré-Maduell, Martin Krallinger, Cristian Grozea, Aurélie Névéol (2022).
In Proceedings of the Seventh Conference
on Machine Translation. WMT'22. Pages 694-723. Abu
Dhabi, United Arab Emirates.
Findings of the 2022 conference on machine translation (WMT22).
Tom Kocmi, Rachel Bawden, Ondřej Bojar, Anton Dvorkovich,
Christian Federmann, Mark Fishel, Thamme Gowda, Yvette
Graham, Roman Grundkiewicz, Barry Haddow, Rebecca
Knowles, Philipp Koehn, Christof Monz, Makoto Morishita,
Masaaki Nagata, Toshiaki Nakazawa, Michal Novák, Martin
Popel, Maja Popović (2022).
In Proceedings of the Seventh Conference
on Machine Translation. WMT'22. Pages 1-45. Abu
Dhabi, United Arab Emirates.
Automatic Normalisation of Early Modern French.
Rachel Bawden, Jonathan Poinhos, Eleni Kogkitsidou, Philippe Gambette, Benoît Sagot and Simon Gabay (2022).
In Proceedings of the 13th Language
Resources and Evaluation
Conference. LREC'22. Pages 3354–3366. Marseille, France.
Multitask Prompted Training Enables Zero-Shot Task Generalization.
Victor Sanh, Albert Webson, Colin Raffel, Stephen H. Bach,
Lintang Sutawika, Zaid Alyafeai, Antoine Chaffin, Arnaud
Stiegler, Teven Le Scao, Arun Raja, Manan Dey, M Saiful
Bari, Canwen Xu, Urmish Thakker, Shanya Sharma Sharma,
Eliza Szczechla, Taewoon Kim, Gunjan Chhablani, Nihal
Nayak, Debajyoti Datta, Jonathan Chang, Mike Tian-Jian
Jiang, Han Wang, Matteo Manica, Sheng Shen, Zheng Xin
Yong, Harshit Pandey, Rachel Bawden, Thomas Wang, Trishala Neeraj, Jos Rozen, Abheesht Sharma, Andrea Santilli, Thibault Fevry, Jason Alan Fries, Ryan Teehan, Tali Bers, Stella Biderman, Leo Gao, Thomas Wolf, Alexander M. Rush (2022).
In Proceedings of the 10th International
Conference on Learning Representations. ICLR'22. Online.
Findings of the WMT 2021 biomedical translation shared task: Summaries of animal experiments as new test set.
Lana Yeganova, Dina Wiemann, Mariana Neves, Federica Vezzani, Amy Siu, Iñigo Jauregi
Unanue, Maite Oronoz, Nancy Mah, Aurélie Névéol, David Martinez, Rachel Bawden, Giorgio Maria Di Nunzio, Roland Roller, Philippe Thomas, Cristian Grozea, Olatz Perez de Viñaspre, Maika Vicente Navarro and Antonio Jimeno Yepes (2021).
In Proceedings of the 6th Conference on
Machine Translation. WMT'21. Online.
Expanding the content model of annotation Block.
Alexandre Bartz, Juliette Janes, Laurent Romary, Philippe
Gambette, Rachel Bawden, Pedro
Javier Ortiz Suárez, Benoît Sagot, Simon Gabay (2021).
In Proceedings of Next Gen TEI, 2021-TEI Conference and Members’ Meeting.
Few-shot learning through contextual data augmentation.
Farid Arthaud, Rachel Bawden and Alexandra Birch (2021).
In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers. EACL'21. Online.
Findings of the WMT 2020 Biomedical Translation Shared Task: Basque, Italian and Russian as New Additional Languages.
Rachel Bawden, Giorgio Maria Di Nunzio, Cristian Grozea, Iñigo Jauregi Unanue, Antonio Jimeno Yepes, Nancy Mah, David Martinez, Aurélie Névéol, Mariana Neves, Maite Oronoz, Olatz Perez de Viñaspre, Massimo Piccardi, Roland Roller, Amy Siu, Philippe Thomas, Federica Vezzani, Maika Vicente Navarro, Dina Wiemann, Lana Yeganova (2020).
In Proceedings of the 5th Conference on
Machine Translation. WMT'20. Pages 660–687. Online.
Document-level Neural MT: A Systematic Comparison.
António Lopes, M. Amin Farajian, Rachel Bawden, Michael Zhang and André T. Martins (2020).
In Proceedings of the 22nd
Annual Conference of the European Association for
Machine Translation. EAMT'20. Pages 225–234. Lisbon, Portugal.
Global under-resourced media translation (GoURMET).
Alexandra Birch, Barry Haddow, Ivan Tito, Antonio Valerio Miceli Barone, Rachel Bawden, Felipe Sánchez-Martínez, Mikel L. Forcada, Miquel Esplà-Gomis, Víctor Sánchez-Cartagena, Juan Antonio Pérez-Ortiz, Wilker Aziz, Andrew Secker, and Peggy van der Kreeft (2019).
In Proceedings of Machine Translation Summit XVII Volume 2: Translator, Project and User Tracks. Dublin, Ireland.
Investigating gender adaptation for speech translation.
Rachel Bawden, Guillaume Wisniewski and Hélène Maynard (2016).
In Proceedings of the 23rd Conférence sur le Traitement Automatique des Langues Naturelles. TALN'16. Paris, France.
Chapter 4. Microsyntactic annotation.
Sylvain Kahane, Kim Gerdes, and Rachel Bawden.
In Rhapsodie – A Prosodic and Syntactic Treebank for Spoken French.
Eds. Anne Lacheret-Dujour, Sylvain Kahane, and Paola Pietrandrea.
John Benjamins, Amsterdam, 2019.
Chapter 7. Annotation tools for syntax.
Kim Gerdes, Sylvain Kahane, Rachel Bawden, Julie Belião, Eric de la Clergerie, and Ilaine Wang.
In Rhapsodie – A Prosodic and Syntactic Treebank for Spoken French.
Eds. Anne Lacheret-Dujour, Sylvain Kahane, and Paola Pietrandrea.
John Benjamins, Amsterdam, 2019.
Chapter 15. Exploration of the Rhapsodie corpus: Data structure, formats and query tools.
Anne Lacheret-Dujour, Sylvain Kahane, Rachel Bawden, Serge Fleury and Ilaine Wang.
In Rhapsodie – A Prosodic and Syntactic Treebank for Spoken French.
Eds. Anne Lacheret-Dujour, Sylvain Kahane, and Paola Pietrandrea.
John Benjamins, Amsterdam, 2019.