- (2025). AFRIDOC-MT: Document-level MT Corpus for African Languages. Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 27770–27806.
- (2025). Explicit Learning and the LLM in Machine Translation. Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 31372–31422.
- (2025). Towards Zero-Shot Multimodal Machine Translation. Findings of the Association for Computational Linguistics: NAACL 2025, 761–778.
- (2025). In-Context Example Selection via Similarity Search Improves Low-Resource Machine Translation. Findings of the Association for Computational Linguistics: NAACL 2025, 1222–1252.
- (2025). mOSCAR: A Large-scale Multilingual and Multimodal Document-level Corpus. Findings of the Association for Computational Linguistics: ACL 2025, 3461–3494.
- (2025). Compositional Translation: A Novel LLM-based Approach for Low-resource Machine Translation. Findings of the Association for Computational Linguistics: EMNLP 2025, 22328–22357.
- (2025). TopXGen: Topic-Diverse Parallel Data Generation for Low-Resource Machine Translation. Findings of the Association for Computational Linguistics: EMNLP 2025, 22358–22381.
- (2025). COLaF : Corpus et Outils pour les Langues de France et variétés de français. Actes de la session industrielle de CORIA-TALN 2025, 33–47.
- (2025). Investigating Length Issues in Document-level Machine Translation. Proceedings of Machine Translation Summit XX: Volume 1, 4–23.
- (2025). MaTOS: Machine Translation for Open Science. Proceedings of Machine Translation Summit XX: Volume 2, 103–104.
- (2025). Self-Retrieval from Distant Contexts for Document-Level Machine Translation. Proceedings of the Tenth Conference on Machine Translation, 220–240.
- (2025). Findings of the WMT25 General Machine Translation Shared Task: Time to Stop Evaluating on Easy Test Sets. Proceedings of the Tenth Conference on Machine Translation, 355–413.
- (2025). RoCS-MT v2 at WMT 2025: Robust Challenge Set for Machine Translation. Proceedings of the Tenth Conference on Machine Translation, 834–849.
- (2025). A French Version of the OLDI Seed Corpus. Proceedings of the Tenth Conference on Machine Translation, 1048–1060.
- (2025). Model Cards for the MaTOS Project.
- (2024). À propos des difficultés à traduire automatiquement de longs documents. Proceedings of the 31st Conférence sur le Traitement Automatique des Langues Naturelles (TALN 2024). Volume 1: articles longs et prises de position, 2–21.
- (2024). Évaluer BLOOM en français. Proceedings of EvalLLM2024 : Atelier sur l’évaluation des modèles génératifs (LLM) et challenge d’extraction d’information few-shot.
- (2024). Exploring Inline Lexicon Injection for Cross-Domain Transfer in Neural Machine Translation. Proceedings of the First International Workshop on Knowledge-Enhanced Machine Translation, 7–20.
- (2024). Making Sentence Embeddings Robust to User-Generated Content. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), 10984–10998.
- (2024). Étude sur la normalisation lexicale de contenus produits par les utilisateurs [A Study on the Lexical Normalisation of User-Generated Content]. Traitement Automatique des Langues, 15–41.
- (2024). When Your Cousin Has the Right Connections: Unsupervised Bilingual Lexicon Induction for Related Data-Imbalanced Languages. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), 17544–17556.
- (2024). Topic-guided Example Selection for Domain Adaptation in LLM-based Machine Translation. Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop, 175–195.
- (2024). Reconnaissance des écritures dans les imprimés. Proceedings of Humanistica 2024.
- (2024). Findings of the WMT24 General Machine Translation Shared Task: The LLM Era Is Here but MT Is Not Solved Yet. Proceedings of the Ninth Conference on Machine Translation, 1–46.
- (2024). Findings of the WMT 2024 Biomedical Translation Shared Task: Test Sets on Abstract Level. Proceedings of the Ninth Conference on Machine Translation, 124–138.
- (2024). Translate your Own: a Post-Editing Experiment in the NLP domain. Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 1), 431–443.
- (2024). Tree of Problems: Improving structured problem solving with compositionality. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 18028–18047.
- (2023). RoCS-MT: Robustness Challenge Set for Machine Translation. Proceedings of the Eighth Conference on Machine Translation, 198–216.
- (2023). Findings of the 2023 Conference on Machine Translation (WMT23): LLMs Are Here but Not Quite There Yet. Proceedings of the Eighth Conference on Machine Translation, 1–42.
- (2023). Findings of the WMT 2023 Biomedical Translation Shared Task: Evaluation of ChatGPT 3.5 as a Comparison System. Proceedings of the Eighth Conference on Machine Translation, 43–54.
- (2023). Tackling Ambiguity with Images: Improved Multimodal Machine Translation and Contrastive Evaluation. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 5394–5413.
- (2023). Investigating Lexical Sharing in Multilingual Machine Translation for Indian Languages. Proceedings of the 24th Annual Conference of the European Association for Machine Translation, 181–192.
- (2023). Investigating the Translation Performance of a Large Multilingual Language Model: the Case of BLOOM. Proceedings of the 24th Annual Conference of the European Association for Machine Translation, 157–170.
- (2023). Cross-lingual Strategies for Low-resource Language Modeling: A Study on Five Indic Dialects. Actes de CORIA-TALN 2023. Actes de la 30e Conférence sur le Traitement Automatique des Langues Naturelles (TALN), volume 1 : travaux de recherche originaux – articles longs, 28–42.
- (2023). MaTOS: Traduction automatique pour la science ouverte. Actes de l’atelier “Analyse et Recherche de Textes Scientifiques” (ARTS)@TALN 2023, 8–15.
- (2022). Automatic Normalisation of Early Modern French. Proceedings of the Thirteenth Language Resources and Evaluation Conference, 3354–3366.
- (2022). From FreEM to D’AlemBERT: a Large Corpus and a Language Model for Early Modern French. Proceedings of the Thirteenth Language Resources and Evaluation Conference, 3367–3374.
- (2022). Le projet FREEM : ressources, outils et enjeux pour l’étude du français d’Ancien Régime (The FREEM project: Resources, tools and challenges for the study of Ancien Régime French). Actes de la 29e Conférence sur le Traitement Automatique des Langues Naturelles. Volume 1 : conférence principale, 154–165.
- (2022). Complex Labelling and Similarity Prediction in Legal Texts: Automatic Analysis of France’s Court of Cassation Rulings. Proceedings of the Thirteenth Language Resources and Evaluation Conference, 4754–4766.
- (2022). Multitask Prompted Training Enables Zero-Shot Task Generalization. Proceedings of the 10th International Conference on Learning Representations.
- (2021). Expanding the content model of annotationBlock. Proceedings of Next Gen TEI, 2021 - TEI Conference and Members’ Meeting.
- (2021). Findings of the WMT 2021 Biomedical Translation Shared Task: Summaries of Animal Experiments as New Test Set. Proceedings of the Sixth Conference on Machine Translation, 664–683.
- (2021). Can Cognate Prediction Be Modelled as a Low-Resource Machine Translation Task? Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, 847–861.
- (2021). Few-shot learning through contextual data augmentation. Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 1049–1062.
- (2020). A Study in Improving BLEU Reference Coverage with Diverse Automatic Paraphrasing. Findings of the Association for Computational Linguistics: EMNLP 2020, 918–932.
- (2020). ParBLEU: Augmenting Metrics with Automatic Paraphrases for the WMT’20 Metrics Shared Task. Proceedings of the Fifth Conference on Machine Translation, 887–894.
- (2020). The University of Edinburgh-Uppsala University’s Submission to the WMT 2020 Chat Translation Task. Proceedings of the Fifth Conference on Machine Translation, 473–478.
- (2020). The University of Edinburgh’s English-Tamil and English-Inuktitut Submissions to the WMT20 News Translation Task. Proceedings of the Fifth Conference on Machine Translation, 92–99.
- (2020). Findings of the WMT 2020 Biomedical Translation Shared Task: Basque, Italian and Russian as New Additional Languages. Proceedings of the Fifth Conference on Machine Translation, 660–687.
- (2020). Document-level Neural MT: A Systematic Comparison. Proceedings of the 22nd Annual Conference of the European Association for Machine Translation, 225–234.
- (2020). Architecture of a Scalable, Secure and Resilient Translation Platform for Multilingual News Media. Proceedings of the 1st International Workshop on Language Technology Platforms, 16–21.
- (2020). Document Sub-structure in Neural Machine Translation. Proceedings of the Twelfth Language Resources and Evaluation Conference, 3657–3667.
- (2019). The University of Edinburgh’s Submissions to the WMT19 News Translation Task. Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1), 103–115.
- (2019). Findings of the WMT 2019 Biomedical Translation Shared Task: Evaluation for MEDLINE Abstracts and Biomedical Terminologies. Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2), 29–53.
- (2019). Global Under-Resourced Media Translation (GoURMET). Proceedings of Machine Translation Summit XVII: Translator, Project and User Tracks, 122.
- (2018). Evaluating Discourse Phenomena in Neural Machine Translation. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), 1304–1313.
- (2018). Detecting context-dependent sentences in parallel corpora. Actes de la Conférence TALN. Volume 1 - Articles longs, articles courts de TALN, 393–400.
- (2018). PhD thesis. Going beyond the sentence: Contextual Machine Translation of Dialogue. Université Paris Saclay (COmUE). Supervised by Sophie Rosset and Thomas Lavergne.
- (2017). Machine Translation, it’s a question of style, innit? The case of English tag questions. Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2507–2512.
- (2017). Machine Translation of Speech-Like Texts: Strategies for the Inclusion of Context. Actes des 24ème Conférence sur le Traitement Automatique des Langues Naturelles. 19es REncontres jeunes Chercheurs en Informatique pour le TAL (RECITAL 2017), 1–14.
- (2016). Boosting for Efficient Model Selection for Syntactic Parsing. Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, 1–11.
- (2016). Cross-lingual Pronoun Prediction with Linguistically Informed Features. Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers, 564–570.
- (2016). Investigating gender adaptation for speech translation. Actes de la conférence conjointe JEP-TALN-RECITAL 2016. volume 2 : TALN (Posters), 490–497.
- (2015). Master’s thesis. Boosting for Model Selection in Syntactic Parsing. Université Paris Diderot-Paris VII. Supervised by Benoît Crabbé.
- (2014). Correcting and Validating Syntactic Dependency in the Spoken French Treebank Rhapsodie. Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC’14), 2320–2325.