Publications | Rachel Bawden

Théo Lasnier, Armel Zebaze, Djamé Seddah, Rachel Bawden, Benoît Sagot (2026). Disentangling meaning from language in LLM-based machine translation.
Thibault Clérice, Rachel Bawden, Anthony Glaise, Ariane Pinche, David Smith (2026). Pre-Editorial Normalization for Automatically Transcribed Medieval Manuscripts in Old French and Latin.
Jesujoba Oluwadara Alabi, Israel Abebe Azime, Miaoran Zhang, Cristina España-Bonet, Rachel Bawden, Dawei Zhu, David Ifeoluwa Adelani, Clement Oyeleke Odoje, Idris Akinade, Iffat Maab, Davis David, Shamsuddeen Hassan Muhammad, Neo Putini, David O. Ademuyiwa, Andrew Caines, Dietrich Klakow (2025). AFRIDOC-MT: Document-level MT Corpus for African Languages. Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 27770–27806.
Malik Marmonier, Rachel Bawden, Benoît Sagot (2025). Explicit Learning and the LLM in Machine Translation. Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 31372–31422.
Matthieu Futeral, Cordelia Schmid, Benoît Sagot, Rachel Bawden (2025). Towards Zero-Shot Multimodal Machine Translation. Findings of the Association for Computational Linguistics: NAACL 2025, 761–778.
Armel Randy Zebaze, Benoît Sagot, Rachel Bawden (2025). In-Context Example Selection via Similarity Search Improves Low-Resource Machine Translation. Findings of the Association for Computational Linguistics: NAACL 2025, 1222–1252.
Matthieu Futeral, Armel Randy Zebaze, Pedro Ortiz Suarez, Julien Abadji, Rémi Lacroix, Cordelia Schmid, Rachel Bawden, Benoît Sagot (2025). mOSCAR: A Large-scale Multilingual and Multimodal Document-level Corpus. Findings of the Association for Computational Linguistics: ACL 2025, 3461–3494.
Armel Randy Zebaze, Benoît Sagot, Rachel Bawden (2025). Compositional Translation: A Novel LLM-based Approach for Low-resource Machine Translation. Findings of the Association for Computational Linguistics: EMNLP 2025, 22328–22357.
Armel Randy Zebaze, Benoît Sagot, Rachel Bawden (2025). TopXGen: Topic-Diverse Parallel Data Generation for Low-Resource Machine Translation. Findings of the Association for Computational Linguistics: EMNLP 2025, 22358–22381.
Benoît Sagot, Slim Ouni, Sam Bigeard, Lucence Ing, Thibault Clérice, Rachel Bawden, Emmanuel Vincent, Malek Yaich, Panagiotis Tsolakis, Juliette Janès, Rasul Dent, Oriane Nédey, Vincent Colotte, Mostafa Sadeghi (2025). COLaF : Corpus et Outils pour les Langues de France et variétés de français. Actes de la session industrielle de CORIA-TALN 2025, 33–47.
Ziqian Peng, Rachel Bawden, François Yvon (2025). Investigating Length Issues in Document-level Machine Translation. Proceedings of Machine Translation Summit XX: Volume 1, 4–23.
Rachel Bawden, Maud Bénard, Éric Clergerie, José Cornejo Cárcamo, Nicolas Dahan, Manon Delorme, Mathilde Huguin, Natalie Kübler, Paul Lerner, Alexandra Mestivier, Joachim Minder, Jean-François Nominé, Ziqian Peng, Laurent Romary, Panagiotis Tsolakis, Lichao Zhu, François Yvon (2025). MaTOS: Machine Translation for Open Science. Proceedings of Machine Translation Summit XX: Volume 2, 103–104.
Ziqian Peng, Rachel Bawden, François Yvon (2025). Self-Retrieval from Distant Contexts for Document-Level Machine Translation. Proceedings of the Tenth Conference on Machine Translation, 220–240.
Tom Kocmi, Ekaterina Artemova, Eleftherios Avramidis, Rachel Bawden, Ondřej Bojar, Konstantin Dranch, Anton Dvorkovich, Sergey Dukanov, Mark Fishel, Markus Freitag, Thamme Gowda, Roman Grundkiewicz, Barry Haddow, Marzena Karpinska, Philipp Koehn, Howard Lakougna, Jessica Lundin, Christof Monz, Kenton Murray, Masaaki Nagata, Stefano Perrella, Lorenzo Proietti, Martin Popel, Maja Popović, Parker Riley, Mariya Shmatova, Steinthór Steingrímsson, Lisa Yankovskaya, Vilém Zouhar (2025). Findings of the WMT25 General Machine Translation Shared Task: Time to Stop Evaluating on Easy Test Sets. Proceedings of the Tenth Conference on Machine Translation, 355–413.
Rachel Bawden & Benoît Sagot (2025). RoCS-MT v2 at WMT 2025: Robust Challenge Set for Machine Translation. Proceedings of the Tenth Conference on Machine Translation, 834–849.
Malik Marmonier, Benoît Sagot, Rachel Bawden (2025). A French Version of the OLDI Seed Corpus. Proceedings of the Tenth Conference on Machine Translation, 1048–1060.
Tom Kocmi, Eleftherios Avramidis, Rachel Bawden, Ondřej Bojar, Konstantin Dranch, Anton Dvorkovich, Sergey Dukanov, Natalia Fedorova, Mark Fishel, Markus Freitag, Thamme Gowda, Roman Grundkiewicz, Barry Haddow, Marzena Karpinska, Philipp Koehn, Howard Lakougna, Jessica Lundin, Kenton Murray, Masaaki Nagata, Stefano Perrella, Lorenzo Proietti, Martin Popel, Maja Popović, Parker Riley, Mariya Shmatova, Steinþór Steingrímsson, Lisa Yankovskaya, Vilém Zouhar (2025). Preliminary Ranking of WMT25 General Machine Translation Systems.
Ziqian Peng, Rachel Bawden, François Yvon (2025). Model Cards for the MaTOS Project.
Armel Zebaze, Rachel Bawden, Benoît Sagot (2025). LLM Reasoning for Machine Translation: Synthetic Data Generation over Thinking Tokens.
Nathan Godey, Wissam Antoun, Rian Touchent, Rachel Bawden, Éric Clergerie, Benoît Sagot, Djamé Seddah (2025). Gaperon: A Peppered English-French Generative Language Model Suite.
Oriane Nédey, Juliette Janès, Rachel Bawden, Thibault Clérice, Benoît Sagot (2025). ForumOccitania: a Corpus of User-Generated Content for Multiple Occitan Varieties.
Lydia Nishimwe, Benoît Sagot, Rachel Bawden (2025). When the Gold Standard isn’t Necessarily Standard: Challenges of Evaluating the Translation of User-Generated Content.
Ziqian Peng, Rachel Bawden, François Yvon (2024). À propos des difficultés à traduire automatiquement de longs documents. Proceedings of the 31st Conférence sur le Traitement Automatique des Langues Naturelles (TALN 2024). Volume 1: articles longs et prises de position, 2–21.
Ziqian Peng, Rachel Bawden, François Yvon (2024). Évaluer BLOOM en français. Proceedings of EvalLLM2024 : Atelier sur l’évaluation des modèles génératifs (LLM) et challenge d’extraction d’information few-shot.
Jesujoba O. Alabi & Rachel Bawden (2024). Exploring Inline Lexicon Injection for Cross-Domain Transfer in Neural Machine Translation. Proceedings of the First International Workshop on Knowledge-Enhanced Machine Translation, 7–20.
Lydia Nishimwe, Benoît Sagot, Rachel Bawden (2024). Making Sentence Embeddings Robust to User-Generated Content. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), 10984–10998.
Lydia Nishimwe, Benoît Sagot, Rachel Bawden (2024). Étude sur la normalisation lexicale de contenus produits par les utilisateurs [A Study on the Lexical Normalisation of User-Generated Content]. Traitement Automatique des Langues, 15–41.
Niyati Bafna, Cristina España-Bonet, Josef van Genabith, Benoît Sagot, Rachel Bawden (2024). When Your Cousin Has the Right Connections: Unsupervised Bilingual Lexicon Induction for Related Data-Imbalanced Languages. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), 17544–17556.
Seth Aycock & Rachel Bawden (2024). Topic-guided Example Selection for Domain Adaptation in LLM-based Machine Translation. Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop, 175–195.
Simon Gabay, Thibault Clérice, Pauline Jacsont, Elina Leblanc, Marie Jeannot-Tirole, Sonia Solfrini, Sophie Dolto, Floriane Goy, Carmen Carrasco Luján, Maddalena Zaglio, Myriam Perregaux, Juliette Janes, Benoît Sagot, Rachel Bawden, Rasul Dent, Oriane Nédey, Alix Chagué (2024). Reconnaissance des écritures dans les imprimés. Proceedings of Humanistica 2024.
Tom Kocmi, Eleftherios Avramidis, Rachel Bawden, Ondřej Bojar, Anton Dvorkovich, Christian Federmann, Mark Fishel, Markus Freitag, Thamme Gowda, Roman Grundkiewicz, Barry Haddow, Marzena Karpinska, Philipp Koehn, Benjamin Marie, Christof Monz, Kenton Murray, Masaaki Nagata, Martin Popel, Maja Popović, Mariya Shmatova, Steinthór Steingrímsson, Vilém Zouhar (2024). Findings of the WMT24 General Machine Translation Shared Task: The LLM Era Is Here but MT Is Not Solved Yet. Proceedings of the Ninth Conference on Machine Translation, 1–46.
Mariana Neves, Cristian Grozea, Philippe Thomas, Roland Roller, Rachel Bawden, Aurélie Névéol, Steffen Castle, Vanessa Bonato, Giorgio Maria Di Nunzio, Federica Vezzani, Maika Vicente Navarro, Lana Yeganova, Antonio Jimeno Yepes (2024). Findings of the WMT 2024 Biomedical Translation Shared Task: Test Sets on Abstract Level. Proceedings of the Ninth Conference on Machine Translation, 124–138.
Rachel Bawden, Ziqian Peng, Maud Bénard, Éric Clergerie, Raphaël Esamotunu, Mathilde Huguin, Natalie Kübler, Alexandra Mestivier, Mona Michelot, Laurent Romary, Lichao Zhu, François Yvon (2024). Translate your Own: a Post-Editing Experiment in the NLP domain. Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 1), 431–443.
Armel Randy Zebaze, Benoît Sagot, Rachel Bawden (2024). Tree of Problems: Improving structured problem solving with compositionality. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 18028–18047.
Rachel Bawden & Benoît Sagot (2023). RoCS-MT: Robustness Challenge Set for Machine Translation. Proceedings of the Eighth Conference on Machine Translation, 198–216.
Tom Kocmi, Eleftherios Avramidis, Rachel Bawden, Ondřej Bojar, Anton Dvorkovich, Christian Federmann, Mark Fishel, Markus Freitag, Thamme Gowda, Roman Grundkiewicz, Barry Haddow, Philipp Koehn, Benjamin Marie, Christof Monz, Makoto Morishita, Kenton Murray, Makoto Nagata, Toshiaki Nakazawa, Martin Popel, Maja Popović, Mariya Shmatova (2023). Findings of the 2023 Conference on Machine Translation (WMT23): LLMs Are Here but Not Quite There Yet. Proceedings of the Eighth Conference on Machine Translation, 1–42.
Mariana Neves, Antonio Jimeno Yepes, Aurélie Névéol, Rachel Bawden, Giorgio Maria Di Nunzio, Roland Roller, Philippe Thomas, Federica Vezzani, Maika Vicente Navarro, Lana Yeganova, Dina Wiemann, Cristian Grozea (2023). Findings of the WMT 2023 Biomedical Translation Shared Task: Evaluation of ChatGPT 3.5 as a Comparison System. Proceedings of the Eighth Conference on Machine Translation, 43–54.
Matthieu Futeral, Cordelia Schmid, Ivan Laptev, Benoît Sagot, Rachel Bawden (2023). Tackling Ambiguity with Images: Improved Multimodal Machine Translation and Contrastive Evaluation. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 5394–5413.
Sonal Sannigrahi & Rachel Bawden (2023). Investigating Lexical Sharing in Multilingual Machine Translation for Indian Languages. Proceedings of the 24th Annual Conference of the European Association for Machine Translation, 181–192.
Rachel Bawden & François Yvon (2023). Investigating the Translation Performance of a Large Multilingual Language Model: the Case of BLOOM. Proceedings of the 24th Annual Conference of the European Association for Machine Translation, 157–170.
Niyati Bafna, Cristina España-Bonet, Josef van Genabith, Benoît Sagot, Rachel Bawden (2023). Cross-lingual Strategies for Low-resource Language Modeling: A Study on Five Indic Dialects. Actes de CORIA-TALN 2023. Actes de la 30e Conférence sur le Traitement Automatique des Langues Naturelles (TALN), volume 1 : travaux de recherche originaux – articles longs, 28–42.
Maud Bénard, Alexandra Mestivier, Natalie Kübler, Lichao Zhu, Rachel Bawden, Eric De La Clergerie, Laurent Romary, Mathilde Huguin, Jean-François Nominé, Ziqian Peng, François Yvon (2023). MaTOS: Traduction automatique pour la science ouverte. Actes de l’atelier “Analyse et Recherche de Textes Scientifiques” (ARTS)@TALN 2023, 8–15.
BigScience Workshop, Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, Stella Biderman, Albert Webson, Pawan Sasanka Ammanamanchi, Thomas Wang, Benoît Sagot, Niklas Muennighoff, Albert Villanova Moral, Olatunji Ruwase, Rachel Bawden, Stas Bekman, Angelina McMillan-Major, Iz Beltagy, Huu Nguyen, Lucile Saulnier, Samson Tan, Pedro Ortiz Suarez, Victor Sanh, Hugo Laurençon, Yacine Jernite, Julien Launay, Margaret Mitchell, Colin Raffel, Aaron Gokaslan, Adi Simhi, Aitor Soroa, Alham Fikri Aji, Amit Alfassy, Anna Rogers, Ariel Kreisberg Nitzav, Canwen Xu, Chenghao Mou, Chris Emezue, Christopher Klamm, Colin Leong, Daniel Strien, David Ifeoluwa Adelani, Dragomir Radev, Eduardo González Ponferrada, Efrat Levkovizh, Ethan Kim, Eyal Bar Natan, Francesco De Toni, Gérard Dupont, Germán Kruszewski, Giada Pistilli, Hady Elsahar, Hamza Benyamina, Hieu Tran, Ian Yu, Idris Abdulmumin, Isaac Johnson, Itziar Gonzalez-Dios, Javier Rosa, Jenny Chim, Jesse Dodge, Jian Zhu, Jonathan Chang, Jörg Frohberg, Joseph Tobing, Joydeep Bhattacharjee, Khalid Almubarak, Kimbo Chen, Kyle Lo, Leandro Von Werra, Leon Weber, Long Phan, Loubna Ben allal, Ludovic Tanguy, Manan Dey, Manuel Romero Muñoz, Maraim Masoud, María Grandury, Mario Šaško, Max Huang, Maximin Coavoux, Mayank Singh, Mike Tian-Jian Jiang, Minh Chien Vu, Mohammad A. Jauhar, Mustafa Ghaleb, Nishant Subramani, Nora Kassner, Nurulaqilla Khamis, Olivier Nguyen, Omar Espejel, Ona Gibert, Paulo Villegas, Peter Henderson, Pierre Colombo, Priscilla Amuok, Quentin Lhoest, Rheza Harliman, Rishi Bommasani, Roberto Luis López, Rui Ribeiro, Salomey Osei, Sampo Pyysalo, Sebastian Nagel, Shamik Bose, Shamsuddeen Hassan Muhammad, Shanya Sharma, Shayne Longpre, Somaieh Nikpoor, Stanislav Silberberg, Suhas Pai, Sydney Zink, Tiago Timponi Torrent, Timo Schick, Tristan Thrush, Valentin Danchev, Vassilina Nikoulina, Veronika Laippala, Violette Lepercq, Vrinda Prabhu, Zaid Alyafeai, Zeerak Talat, Arun Raja, Benjamin Heinzerling, Chenglei Si, Davut Emre Taşar, Elizabeth Salesky, Sabrina J. Mielke, Wilson Y. Lee, Abheesht Sharma, Andrea Santilli, Antoine Chaffin, Arnaud Stiegler, Debajyoti Datta, Eliza Szczechla, Gunjan Chhablani, Han Wang, Harshit Pandey, Hendrik Strobelt, Jason Alan Fries, Jos Rozen, Leo Gao, Lintang Sutawika, M Saiful Bari, Maged S. Al-shaibani, Matteo Manica, Nihal Nayak, Ryan Teehan, Samuel Albanie, Sheng Shen, Srulik Ben-David, Stephen H. Bach, Taewoon Kim, Tali Bers, Thibault Fevry, Trishala Neeraj, Urmish Thakker, Vikas Raunak, Xiangru Tang, Zheng-Xin Yong, Zhiqing Sun, Shaked Brody, Yallow Uri, Hadar Tojarieh, Adam Roberts, Hyung Won Chung, Jaesung Tae, Jason Phang, Ofir Press, Conglong Li, Deepak Narayanan, Hatim Bourfoune, Jared Casper, Jeff Rasley, Max Ryabinin, Mayank Mishra, Minjia Zhang, Mohammad Shoeybi, Myriam Peyrounette, Nicolas Patry, Nouamane Tazi, Omar Sanseviero, Patrick Platen, Pierre Cornette, Pierre François Lavallée, Rémi Lacroix, Samyam Rajbhandari, Sanchit Gandhi, Shaden Smith, Stéphane Requena, Suraj Patil, Tim Dettmers, Ahmed Baruwa, Amanpreet Singh, Anastasia Cheveleva, Anne-Laure Ligozat, Arjun Subramonian, Aurélie Névéol, Charles Lovering, Dan Garrette, Deepak Tunuguntla, Ehud Reiter, Ekaterina Taktasheva, Ekaterina Voloshina, Eli Bogdanov, Genta Indra Winata, Hailey Schoelkopf, Jan-Christoph Kalo, Jekaterina Novikova, Jessica Zosa Forde, Jordan Clive, Jungo Kasai, Ken Kawamura, Liam Hazan, Marine Carpuat, Miruna Clinciu, Najoung Kim, Newton Cheng, Oleg Serikov, Omer Antverg, Oskar Wal, Rui Zhang, Ruochen Zhang, Sebastian Gehrmann, Shachar Mirkin, Shani Pais, Tatiana Shavrina, Thomas Scialom, Tian Yun, Tomasz Limisiewicz, Verena Rieser, Vitaly Protasov, Vladislav Mikhailov, Yada Pruksachatkun, Yonatan Belinkov, Zachary Bamberger, Zdeněk Kasner, Alice Rueda, Amanda Pestana, Amir Feizpour, Ammar Khan, Amy Faranak, Ana Santos, Anthony Hevia, Antigona Unldreaj, Arash Aghagol, Arezoo Abdollahi, Aycha Tammour, Azadeh HajiHosseini, Bahareh Behroozi, Benjamin Ajibade, Bharat Saxena, Carlos Muñoz Ferrandis, Daniel McDuff, Danish Contractor, David Lansky, Davis David, Douwe Kiela, Duong A. Nguyen, Edward Tan, Emi Baylor, Ezinwanne Ozoani, Fatima Mirza, Frankline Ononiwu, Habib Rezanejad, Hessie Jones, Indrani Bhattacharya, Irene Solaiman, Irina Sedenko, Isar Nejadgholi, Jesse Passmore, Josh Seltzer, Julio Bonis Sanz, Livia Dutra, Mairon Samagaio, Maraim Elbadri, Margot Mieskes, Marissa Gerchick, Martha Akinlolu, Michael McKenna, Mike Qiu, Muhammed Ghauri, Mykola Burynok, Nafis Abrar, Nazneen Rajani, Nour Elkott, Nour Fahmy, Olanrewaju Samuel, Ran An, Rasmus Kromann, Ryan Hao, Samira Alizadeh, Sarmad Shubber, Silas Wang, Sourav Roy, Sylvain Viguier, Thanh Le, Tobi Oyebade, Trieu Le, Yoyo Yang, Zach Nguyen, Abhinav Ramesh Kashyap, Alfredo Palasciano, Alison Callahan, Anima Shukla, Antonio Miranda-Escalada, Ayush Singh, Benjamin Beilharz, Bo Wang, Caio Brito, Chenxi Zhou, Chirag Jain, Chuxin Xu, Clémentine Fourrier, Daniel León Periñán, Daniel Molano, Dian Yu, Enrique Manjavacas, Fabio Barth, Florian Fuhrimann, Gabriel Altay, Giyaseddin Bayrak, Gully Burns, Helena U. Vrabec, Imane Bello, Ishani Dash, Jihyun Kang, John Giorgi, Jonas Golde, Jose David Posada, Karthik Rangasai Sivaraman, Lokesh Bulchandani, Lu Liu, Luisa Shinzato, Madeleine Hahn Bykhovetz, Maiko Takeuchi, Marc Pàmies, Maria A Castillo, Marianna Nezhurina, Mario Sänger, Matthias Samwald, Michael Cullan, Michael Weinberg, Michiel De Wolf, Mina Mihaljcic, Minna Liu, Moritz Freidank, Myungsun Kang, Natasha Seelam, Nathan Dahlberg, Nicholas Michio Broad, Nikolaus Muellner, Pascale Fung, Patrick Haller, Ramya Chandrasekhar, Renata Eisenberg, Robert Martin, Rodrigo Canalli, Rosaline Su, Ruisi Su, Samuel Cahyawijaya, Samuele Garda, Shlok S Deshmukh, Shubhanshu Mishra, Sid Kiblawi, Simon Ott, Sinee Sang-aroonsiri, Srishti Kumar, Stefan Schweter, Sushil Bharati, Tanmay Laud, Théo Gigant, Tomoya Kainuma, Wojciech Kusa, Yanis Labrak, Yash Shailesh Bajaj, Yash Venkatraman, Yifan Xu, Yingxin Xu, Yu Xu, Zhe Tan, Zhongli Xie, Zifan Ye, Mathilde Bras, Younes Belkada, Thomas Wolf (2023). BLOOM: A 176B-Parameter Open-Access Multilingual Language Model.
Rachel Bawden, Jonathan Poinhos, Eleni Kogkitsidou, Philippe Gambette, Benoît Sagot, Simon Gabay (2022). Automatic Normalisation of Early Modern French. Proceedings of the Thirteenth Language Resources and Evaluation Conference, 3354–3366.
Simon Gabay, Pedro Ortiz Suarez, Alexandre Bartz, Alix Chagué, Rachel Bawden, Philippe Gambette, Benoît Sagot (2022). From FreEM to D’AlemBERT: a Large Corpus and a Language Model for Early Modern French. Proceedings of the Thirteenth Language Resources and Evaluation Conference, 3367–3374.
Simon Gabay, Pedro Ortiz Suarez, Rachel Bawden, Alexandre Bartz, Philippe Gambette, Benoît Sagot (2022). Le projet FREEM : ressources, outils et enjeux pour l’étude du français d’Ancien Régime (The FreEM project: Resources, tools and challenges for the study of Ancien Régime French). Actes de la 29e Conférence sur le Traitement Automatique des Langues Naturelles. Volume 1 : conférence principale, 154–165.
Thibault Charmet, Inès Cherichi, Matthieu Allain, Urszula Czerwinska, Amaury Fouret, Benoît Sagot, Rachel Bawden (2022). Complex Labelling and Similarity Prediction in Legal Texts: Automatic Analysis of France’s Court of Cassation Rulings. Proceedings of the Thirteenth Language Resources and Evaluation Conference, 4754–4766.
Victor Sanh, Albert Webson, Colin Raffel, Stephen H. Bach, Lintang Sutawika, Zaid Alyafeai, Antoine Chaffin, Arnaud Stiegler, Arun Raja, Manan Dey, M Saiful Bari, Canwen Xu, Urmish Thakker, Shanya Sharma Sharma, Eliza Szczechla, Taewoon Kim, Gunjan Chhablani, Nihal V. Nayak, Debajyoti Datta, Jonathan Chang, Mike Tian-Jian Jiang, Han Wang, Matteo Manica, Sheng Shen, Zheng Xin Yong, Harshit Pandey, Rachel Bawden, Thomas Wang, Trishala Neeraj, Jos Rozen, Abheesht Sharma, Andrea Santilli, Thibault Févry, Jason Alan Fries, Ryan Teehan, Teven Le Scao, Stella Biderman, Leo Gao, Thomas Wolf, Alexander M. Rush (2022). Multitask Prompted Training Enables Zero-Shot Task Generalization. Proceedings of the 10th International Conference on Learning Representations.
Alexandre Bartz, Juliette Janes, Laurent Romary, Philippe Gambette, Rachel Bawden, Pedro Ortiz Suarez, Benoît Sagot, Simon Gabay (2021). Expanding the content model of annotationBlock. Proceedings of Next Gen TEI, 2021 - TEI Conference and Members’ Meeting.
Lana Yeganova, Dina Wiemann, Mariana Neves, Federica Vezzani, Amy Siu, Inigo Jauregi Unanue, Maite Oronoz, Nancy Mah, Aurélie Névéol, David Martinez, Rachel Bawden, Giorgio Maria Di Nunzio, Roland Roller, Philippe Thomas, Cristian Grozea, Olatz Perez-de-Viñaspre, Maika Vicente Navarro, Antonio Jimeno Yepes (2021). Findings of the WMT 2021 Biomedical Translation Shared Task: Summaries of Animal Experiments as New Test Set. Proceedings of the Sixth Conference on Machine Translation, 664–683.
Clémentine Fourrier, Rachel Bawden, Benoît Sagot (2021). Can Cognate Prediction Be Modelled as a Low-Resource Machine Translation Task? Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, 847–861.
Farid Arthaud, Rachel Bawden, Alexandra Birch (2021). Few-shot learning through contextual data augmentation. Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 1049–1062.
Rachel Bawden, Biao Zhang, Lisa Yankovskaya, Andre Tättar, Matt Post (2020). A Study in Improving BLEU Reference Coverage with Diverse Automatic Paraphrasing. Findings of the Association for Computational Linguistics: EMNLP 2020, 918–932.
Rachel Bawden, Biao Zhang, Andre Tättar, Matt Post (2020). ParBLEU: Augmenting Metrics with Automatic Paraphrases for the WMT’20 Metrics Shared Task. Proceedings of the Fifth Conference on Machine Translation, 887–894.
Nikita Moghe, Christian Hardmeier, Rachel Bawden (2020). The University of Edinburgh-Uppsala University’s Submission to the WMT 2020 Chat Translation Task. Proceedings of the Fifth Conference on Machine Translation, 473–478.
Rachel Bawden, Alexandra Birch, Radina Dobreva, Arturo Oncevay, Antonio Valerio Miceli Barone, Philip Williams (2020). The University of Edinburgh’s English-Tamil and English-Inuktitut Submissions to the WMT20 News Translation Task. Proceedings of the Fifth Conference on Machine Translation, 92–99.
Rachel Bawden, Giorgio Maria Di Nunzio, Cristian Grozea, Inigo Jauregi Unanue, Antonio Jimeno Yepes, Nancy Mah, David Martinez, Aurélie Névéol, Mariana Neves, Maite Oronoz, Olatz Perez-de-Viñaspre, Massimo Piccardi, Roland Roller, Amy Siu, Philippe Thomas, Federica Vezzani, Maika Vicente Navarro, Dina Wiemann, Lana Yeganova (2020). Findings of the WMT 2020 Biomedical Translation Shared Task: Basque, Italian and Russian as New Additional Languages. Proceedings of the Fifth Conference on Machine Translation, 660–687.
António Lopes, M. Amin Farajian, Rachel Bawden, Michael Zhang, André F. T. Martins (2020). Document-level Neural MT: A Systematic Comparison. Proceedings of the 22nd Annual Conference of the European Association for Machine Translation, 225–234.
Susie Coleman, Andrew Secker, Rachel Bawden, Barry Haddow, Alexandra Birch (2020). Architecture of a Scalable, Secure and Resilient Translation Platform for Multilingual News Media. Proceedings of the 1st International Workshop on Language Technology Platforms, 16–21.
Radina Dobreva, Jie Zhou, Rachel Bawden (2020). Document Sub-structure in Neural Machine Translation. Proceedings of the Twelfth Language Resources and Evaluation Conference, 3657–3667.
Rachel Bawden, Nikolay Bogoychev, Ulrich Germann, Roman Grundkiewicz, Faheem Kirefu, Antonio Valerio Miceli Barone, Alexandra Birch (2019). The University of Edinburgh’s Submissions to the WMT19 News Translation Task. Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1), 103–115.
Rachel Bawden, Kevin Bretonnel Cohen, Cristian Grozea, Antonio Jimeno Yepes, Madeleine Kittner, Martin Krallinger, Nancy Mah, Aurélie Névéol, Mariana Neves, Felipe Soares, Amy Siu, Karin Verspoor, Maika Vicente Navarro (2019). Findings of the WMT 2019 Biomedical Translation Shared Task: Evaluation for MEDLINE Abstracts and Biomedical Terminologies. Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2), 29–53.
Alexandra Birch, Barry Haddow, Ivan Tito, Antonio Valerio Miceli Barone, Rachel Bawden, Felipe Sánchez-Martínez, Mikel L. Forcada, Miquel Esplà-Gomis, Víctor Sánchez-Cartagena, Juan Antonio Pérez-Ortiz, Wilker Aziz, Andrew Secker, Peggy Kreeft (2019). Global Under-Resourced Media Translation (GoURMET). Proceedings of Machine Translation Summit XVII: Translator, Project and User Tracks, 122–122.
Rachel Bawden, Rico Sennrich, Alexandra Birch, Barry Haddow (2018). Evaluating Discourse Phenomena in Neural Machine Translation. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), 1304–1313.
Rachel Bawden, Thomas Lavergne, Sophie Rosset (2018). Detecting context-dependent sentences in parallel corpora. Actes de la Conférence TALN. Volume 1 - Articles longs, articles courts de TALN, 393–400.
Rachel Bawden (2018). PhD thesis. Going beyond the sentence: Contextual Machine Translation of Dialogue. Université Paris Saclay (COmUE). Supervised by Sophie Rosset and Thomas Lavergne.
Rachel Bawden (2017). Machine Translation, it’s a question of style, innit? The case of English tag questions. Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2507–2512.
Rachel Bawden (2017). Machine Translation of Speech-Like Texts: Strategies for the Inclusion of Context. Actes des 24ème Conférence sur le Traitement Automatique des Langues Naturelles. 19es REncontres jeunes Chercheurs en Informatique pour le TAL (RECITAL 2017), 1–14.
Rachel Bawden & Benoît Crabbé (2016). Boosting for Efficient Model Selection for Syntactic Parsing. Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, 1–11.
Rachel Bawden (2016). Cross-lingual Pronoun Prediction with Linguistically Informed Features. Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers, 564–570.
Rachel Bawden, Guillaume Wisniewski, Hélène Maynard (2016). Investigating gender adaptation for speech translation. Actes de la conférence conjointe JEP-TALN-RECITAL 2016. volume 2 : TALN (Posters), 490–497.
Rachel Bawden (2015). Master’s thesis. Boosting for Model Selection in Syntactic Parsing. Université Paris Diderot-Paris VII. Supervised by Benoît Crabbé.
Rachel Bawden, Marie-Amélie Botalla, Kim Gerdes, Sylvain Kahane (2014). Correcting and Validating Syntactic Dependency in the Spoken French Treebank Rhapsodie. Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC’14), 2320–2325.