About me

I am a researcher at Inria, Paris in the ALMAnaCH team, working on Natural Language Processing (NLP) and more specifically Machine Translation (MT). I was previously a research associate at the University of Edinburgh, working on MT for low-resource languages, having completed my PhD on the contextual MT of dialogue at the LIMSI laboratory in 2018 under the supervision of Sophie Rosset and Thomas Lavergne.

Research Experience

Please see my CV for a complete list of my research and teaching experience.

Researcher (Chargée de recherches)

2020 - present
Inria, Paris, within the ALMAnaCH team
Natural Language Processing and Machine Translation.
Holder of a "springboard" chair position in the PRAIRIE research institute (2021-present)

Research Associate

2018 - 2020
ILCC, University of Edinburgh
Supervised by Alexandra Birch. Working on the MTStretch fellowship (held by Alexandra Birch) and the GoURMET EU project (leader of the work package on morphological modelling)

PhD in Computer Science

2015 - 2018
LIMSI, CNRS, Univ. Paris-Sud, Université Paris-Saclay, France
Going beyond the sentence: Contextual Machine Translation of Dialogue
Supervised by Sophie Rosset and Thomas Lavergne

Research visit

May - August 2017
ILCC, University of Edinburgh
Working with Alexandra Birch, Rico Sennrich and Barry Haddow

Master's research placement

February - August 2015
Alpage-Inria, Paris, France
Boosting for model selection in syntactic parsing
Supervised by Benoit Crabbé

Master's research placement

May 2014 - August 2014
LaTTiCe (CNRS/Paris III) and Télécom Paris-Tech, Paris, France
Modelling communicative acts for an embodied conversational agent
Supervised by Frédéric Landragin and Chloé Clavel

Proof-reading and revision

2013

Tesnière, Lucien (2015), Elements of structural syntax, translated by Timothy Osborne and Sylvain Kahane, John Benjamins, Amsterdam, U. Paris X.

Master's research placement

March - August 2013
MoDyCo Laboratory (CNRS/Paris X), Paris, France
Annotation, development and semi-automatic correction of a treebank of spoken French for syntactic dependencies (Rhapsodie Treebank of Spoken French).
Supervised by Sylvain Kahane

Publications

PhD Thesis

Going beyond the sentence: Contextual Machine Translation of Dialogue
Rachel Bawden supervised by Sophie Rosset and Thomas Lavergne.
29th November 2018. LIMSI, CNRS, Université Paris-Sud, Université Paris-Saclay.
Committee members:
  • President: Nicolas Sabouret
  • Reviewers: Jörg Tiedemann and Loïc Barrault
  • Examiners: Lucia Specia and Andrei Popescu-Belis

Journal articles

DiaBLa: A Corpus of Bilingual Spontaneous Written Dialogues for Machine Translation. Rachel Bawden, Sophie Rosset, Thomas Lavergne, Eric Bilinski (2020). Language Resources and Evaluation. DOI: 10.1007/s10579-020-09514-4 Dataset, interface and website
Towards the generation of dialogue acts in socio-affective ECAs.
Rachel Bawden, Chloé Clavel and Frédéric Landragin (2016). Language Resources and Evaluation. 4(2):821-838. Springer Netherlands. First online 31 July 2015. doi:10.1007/s10579-015-9312-9

Conference proceedings

Can Cognate Prediction Be Modelled as a Low-Resource Machine Translation Task? Clémentine Fourrier, Rachel Bawden and Benoît Sagot (2021). In ACL-IJCNLP 2021-Findings of the Association for Computational Linguistics. (ACL-Findings'2021).
Few-shot learning through contextual data augmentation. Farid Arthaud, Rachel Bawden and Alexandra Birch (2021). In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers. (EACL'2021) Online.
A Study in Improving BLEU Reference Coverage with Diverse Automatic Paraphrasing. Rachel Bawden, Biao Zhang, Lisa Yankovskaya, Andre Tättar and Matt Post (2020). In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings. Online.
The University of Edinburgh-Uppsala University's Submission to the WMT 2020 Chat Translation Task. Nikita Moghe, Christian Hardmeier and Rachel Bawden (2020). In Proceedings of the 5th Conference on Machine Translation (WMT'20). Online.
ParBLEU: Augmenting Metrics with Automatic Paraphrases for the WMT'20 Metrics Shared Task. Rachel Bawden, Biao Zhang, Andre Tättar and Matt Post (2020). In Proceedings of the 5th Conference on Machine Translation (WMT'20). Online.
The University of Edinburgh's English-Tamil and English-Inuktitut Submissions to the WMT20 News Translation Task. Rachel Bawden, Alexandra Birch, Radina Dobreva, Arturo Oncevay, Antonio Valerio Miceli Barone and Philip Williams (2020). In Proceedings of the 5th Conference on Machine Translation (WMT'20). Online. Models and code
Findings of the WMT 2020 Biomedical Translation Shared Task: Basque, Italian and Russian as New Additional Languages. Rachel Bawden, Giorgio Maria Di Nunzio, Cristian Grozea, Iñigo Jauregi Unanue, Antonio Jimeno Yepes, Nancy Mah, David Martinez, Aurélie Névéol, Mariana Neves, Maite Oronoz, Olatz Perez de Viñaspre, Massimo Piccardi, Roland Roller, Amy Siu, Philippe Thomas, Federica Vezzani, Maika Vicente Navarro, Dina Wiemann, Lana Yeganova (2020). In Proceedings of the 5th Conference on Machine Translation (WMT'20). Online.
Document-level Neural MT: A Systematic Comparison. António Lopes, M. Amin Farajian, Rachel Bawden, Michael Zhang and André T. Martins (2020). In Proceedings of the 22nd Annual Conference of the European Association for Machine Translation (EAMT'20). Lisbon, Portugal. Dataset
Architecture of a Scalable, Secure and Resilient Translation Platform for Multilingual News Media. Susie Coleman, Andrew Secker, Rachel Bawden, Barry Haddow and Alexandra Birch (2020). In Proceedings of the 1st International Workshop on Language Technology Platforms (IWLTP'20). Marseille, France.
Document Sub-structure in Neural Machine Translation. Radina Dobreva, Jie Zhou, and Rachel Bawden (2020). In Proceedings of the 12th Language Resources and Evaluation Conference (LREC'20). Marseille, France. Datasets
The University of Edinburgh’s Submissions to the WMT19 News Translation Task (Updated version). Rachel Bawden, Nikolay Bogoychev, Ulrich Germann, Roman Grundkiewicz, Faheem Kirefu, Antonio Valerio Miceli Barone, and Alexandra Birch (2019). In Proceedings of the Fourth Conference on Machine Translation (WMT'19). Florence, Italy. Gujarati models
Findings of the WMT 2019 Biomedical Translation Shared Task: Evaluation for MEDLINE Abstracts and Biomedical Terminologies. Rachel Bawden, Kevin Bretonnel Cohen, Cristian Grozea, Antonio Jimeno Yepes, Madeleine Kittner, Martin Krallinger, Nancy Mah, Aurélie Névéol, Mariana Neves, Felipe Soares, Amy Siu, Karin Verspoor, and Maika Vicente Navarro (2019). In Proceedings of the Fourth Conference on Machine Translation (WMT'19). Florence, Italy.
Global under-resourced media translation (GoURMET). Alexandra Birch, Barry Haddow, Ivan Titov, Antonio Valerio Miceli Barone, Rachel Bawden, Felipe Sánchez-Martínez, Mikel L. Forcada, Miquel Esplà-Gomis, Víctor Sánchez-Cartagena, Juan Antonio Pérez-Ortiz, Wilker Aziz, Andrew Secker, and Peggy van der Kreeft (2019). In Proceedings of Machine Translation Summit XVII Volume 2: Translator, Project and User Tracks. Dublin, Ireland.
Evaluating Discourse Phenomena in Neural Machine Translation. Rachel Bawden, Rico Sennrich, Alexandra Birch and Barry Haddow (2018). In Proceedings of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL'18). New Orleans, USA. code and test set.
Detecting context-dependent sentences in parallel corpora. Rachel Bawden, Thomas Lavergne and Sophie Rosset (2018). In Proceedings of the 25th Conférence sur le Traitement Automatique des Langues Naturelles (TALN'18). Rennes, France.
Machine Translation, it's a question of style, innit? The case of English tag questions. Rachel Bawden (2017). In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP'17). Copenhagen, Denmark. code
Machine Translation of Speech-Like Texts: Strategies for the Inclusion of Context. Rachel Bawden (2017). In Proceedings of the 19th REncontres jeunes Chercheurs en Informatique pour le TAL (RECITAL 2017). Orléans, France.
Boosting for Efficient Model Selection for Syntactic Parsing. Rachel Bawden and Benoit Crabbé (2016). In Proceedings of the 26th International Conference on Computational Linguistics (COLING'16). Osaka, Japan.
Investigating gender adaptation for speech translation. Rachel Bawden, Guillaume Wisniewski and Hélène Maynard (2016). In Proceedings of the 23rd Conférence sur le Traitement Automatique des Langues Naturelles (TALN'16). Paris, France.
Correcting and Validating Syntactic Dependency in the Spoken French Treebank Rhapsodie. Rachel Bawden, Marie-Amélie Botalla, Kim Gerdes and Sylvain Kahane (2014). In Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC'14). Reykjavik, Iceland.

Book chapters

Chapter 4. Microsyntactic annotation. Sylvain Kahane, Kim Gerdes, and Rachel Bawden. In Rhapsodie – A Prosodic and Syntactic Treebank for Spoken French. Eds. Anne Lacheret-Dujour, Sylvain Kahane, and Paola Pietrandrea. John Benjamins, Amsterdam, 2019
Chapter 7. Annotation tools for syntax. Kim Gerdes, Sylvain Kahane, Rachel Bawden, Julie Belião, Eric de la Clergerie, and Ilaine Wang. In Rhapsodie – A Prosodic and Syntactic Treebank for Spoken French. Eds. Anne Lacheret-Dujour, Sylvain Kahane, and Paola Pietrandrea. John Benjamins, Amsterdam, 2019
Chapter 15. Exploration of the Rhapsodie corpus: Data structure, formats and query tools. Anne Lacheret-Dujour, Sylvain Kahane, Rachel Bawden, Serge Fleury and Ilaine Wang. In Rhapsodie – A Prosodic and Syntactic Treebank for Spoken French. Eds. Anne Lacheret-Dujour, Sylvain Kahane, and Paola Pietrandrea. John Benjamins, Amsterdam, 2019

Technical report

Protocole de codage microsyntaxique. Sylvain Kahane, Kim Gerdes, Paola Pietrandrea, Christophe Benzitoun, Rachel Bawden, Marie-Amélie Botalla and Adèle Désoyer (2013). Translated into English by Rachel Bawden: Protocol for micro-syntactic coding. Link to the Rhapsodie project website.

Other

Boosting for Model Selection in Syntactic Parsing. Rachel Bawden (2015). Master's thesis. Alpage-Inria. Supervised by Benoit Crabbé.

Supervision

Sonal Sannigrahi

End June 2021 - present, École Polytechnique
Intern
Investigating the effect of input representations on language sharing in multilingual models.

Matthieu Futeral-Peter

May 2021 - present
Master 2 intern, ENSAE and ENS Paris-Saclay
Exploration of multilingual and multimodal word embeddings. Co-supervised with Benoît Sagot, Cordelia Schmid and Ivan Laptev.

Thibault Charmet

September 2020 - present
Research Engineer
Automatic tools for improving jurisprudence consistency. Co-supervised with Benoît Sagot and in collaboration with the Cour de Cassation.

Clémentine Fourrier

September 2020 - present
Inria-funded PhD
Neural models of language evolution. Co-supervised with Benoît Sagot.

Quentin Burthier

September 2020 - present
Master 2 student, ENS Paris-Saclay
Machine Translation of Noisy Texts. Co-supervised with Djamé Seddah.

Ashwani Tanwar

April - August 2020
MSc thesis, University of Edinburgh
Improving Low-Resource Neural Machine Translation of Related Languages by Transfer Learning. Co-supervised with Alexandra Birch.

Farid Arthaud

February - June 2020
Master 1 (ENS, Paris), visiting student at the University of Edinburgh
Continuous learning for Neural Machine Translation from Human Post-edits. Co-supervised with Alexandra Birch.

Radina Dobreva

March - August 2019
MSc thesis, University of Edinburgh
Integrating document structure information into Neural Machine Translation using cache-based models. Co-supervised with Annie Louis and Bonnie Webber.

Jie Zhou

March - August 2019
MSc thesis, University of Edinburgh
Exploiting Predictable Document Substructure in Neural Machine Translation. Co-supervised with Annie Louis and Bonnie Webber.

Teaching Experience

Introduction to NLP

2016-17 and 2017-18
PolyTech Paris-Sud, France

4th year lectures, tutorials and practical classes (30hrs/year)

Introduction to algorithmics and C++

2017-18
PolyTech Paris-Sud, France

2nd year practical classes (10hrs)

Programming with C

2016-17
PolyTech Paris-Sud, France

3rd year tutorials and practical classes (27hrs)

Databases

2015-16 and 2016-17
PolyTech Paris-Sud, France

3rd year tutorials and practical classes (24hrs/year)

Algorithmics and C

2015-16 and 2016-17
PolyTech Paris-Sud, France

3rd year tutorials and practical classes (28hrs)

Tutoring in computer science

2015-16
PolyTech Paris-Sud, France

3rd year (12hrs)

English language assistant

2009-10
4 primary schools, Le Puy-en-Velay, France

Work placement as an English language assistant

2005
Collège St. Martin, Tours, France