publications

* denotes equal contribution

An up-to-date list is available on Google Scholar.

2026

  1. IndicIFEval: A Benchmark for Verifiable Instruction-Following Evaluation in 14 Indic Languages
    Thanmay Jayakumar, Mohammed Safi Ur Rahman Khan, Raj Dabre, Ratish Puduppully, and Anoop Kunchukuttan
    arXiv preprint arXiv:2602.22125, 2026

2025

  1. Natural language processing for dialects of a language: A survey
    Aditya Joshi, Raj Dabre, Diptesh Kanojia, Zhuang Li, Haolan Zhan, Gholamreza Haffari, and 1 more author
    ACM Computing Surveys, 2025
  2. WorldCuisines: A massive-scale benchmark for multilingual and multicultural visual question answering on global cuisines
    Genta Indra Winata, Frederikus Hudi, Patrick Amadeus Irawan, David Anugraha, Rifki Afina Putri, Wang Yutong, and 5 more authors
    In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2025
  3. Cross-lingual auto evaluation for assessing multilingual LLMs
    Sumanth Doddapaneni, Mohammed Safi Ur Rahman Khan, Dilip Venkatesh, Raj Dabre, Anoop Kunchukuttan, and Mitesh M Khapra
    In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
  4. Are Language Models Agnostic to Linguistically Grounded Perturbations? A Case Study of Indic Languages
    Poulami Ghosh, Raj Dabre, and Pushpak Bhattacharyya
    In Findings of the Association for Computational Linguistics: NAACL 2025, 2025
  5. PrahokBART: A Pre-trained Sequence-to-Sequence Model for Khmer Natural Language Generation
    Hour Kaing, Raj Dabre, Haiyue Song, Van-Hien Tran, Hideki Tanaka, and Masao Utiyama
    In Proceedings of the 31st International Conference on Computational Linguistics, 2025
  6. Exploiting word sense disambiguation in large language models for machine translation
    Van-Hien Tran, Raj Dabre, Hour Kaing, Haiyue Song, Hideki Tanaka, and Masao Utiyama
    In Proceedings of the First Workshop on Language Models for Low-Resource Languages, 2025
  7. RomanLens: The role of latent romanization in multilinguality in LLMs
    Alan Saji, Jaavid Aktar Husain, Thanmay Jayakumar, Raj Dabre, Anoop Kunchukuttan, and Ratish Puduppully
    In Findings of the Association for Computational Linguistics: ACL 2025, 2025
  8. IteRABRe: Iterative Recovery-Aided Block Reduction
    Haryo Akbarianto Wibowo, Haiyue Song, Hideki Tanaka, Masao Utiyama, Alham Fikri Aji, and Raj Dabre
    arXiv preprint arXiv:2503.06291, 2025
  9. TikZero: Zero-shot text-guided graphics program synthesis
    Jonas Belouadi, Eddy Ilg, Margret Keuper, Hideki Tanaka, Masao Utiyama, Raj Dabre, and 2 more authors
    In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025
  10. Limited-Resource Adapters Are Regularizers, Not Linguists
    Marcell Fekete, Nathaniel Romney Robinson, Ernests Lavrinovics, Djeride Jean-Baptiste, Raj Dabre, Johannes Bjerva, and 1 more author
    In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2025
  11. CAMMT: Benchmarking culturally aware multimodal machine translation
    Emilio Villa-Cueva, Sholpan Bolatzhanova, Diana Turmakhan, Kareem Elzeky, Henok Biadglign Ademtew, Alham Fikri Aji, and 5 more authors
    arXiv preprint arXiv:2505.24456, 2025
  12. IndicRAGSuite: Large-Scale Datasets and a Benchmark for Indian Language RAG Systems
    Pasunuti Prasanjith, Prathmesh B More, Anoop Kunchukuttan, and Raj Dabre
    arXiv preprint arXiv:2506.01615, 2025
  13. Towards Building Large Scale Datasets and State-of-the-Art Automatic Speech Translation Systems for 14 Indian Languages
    Ashwin Sankar, Sparsh Jain, Nikhil Narasimhan, Devilal Choudhary, Dhairya Suman, Mohammed Safi Ur Rahman Khan, and 3 more authors
    In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
  14. Mark My Words: A Robust Multilingual Model for Punctuation in Text and Speech Transcripts
    Sidharth Pulipaka, Ashwin Sankar, and Raj Dabre
    In Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2025
  15. CycleDistill: Bootstrapping Machine Translation using LLMs with Cyclical Distillation
    Deepon Halder, Thanmay Jayakumar, and Raj Dabre
    In Proceedings of the Twelfth Workshop on Asian Translation (WAT 2025), 2025
  16. Quality Estimation and Post-Editing Using LLMs For Indic Languages: How Good Is It?
    Anushka Singh, Aarya Pakhale, Mitesh M Khapra, and Raj Dabre
    In Proceedings of Machine Translation Summit XX: Volume 1, 2025
  17. BYTF: How Good Are Byte Level N-Gram F-Scores for Automatic Machine Translation Evaluation?
    Raj Dabre, Hour Kaing, and Haiyue Song
    In Proceedings of Machine Translation Summit XX: Volume 1, 2025
  18. When Alignment Hurts: Decoupling Representational Spaces in Multilingual Models
    Ahmed Elshabrawy, Hour Kaing, Haiyue Song, Alham Fikri Aji, Hideki Tanaka, Masao Utiyama, and 1 more author
    arXiv preprint arXiv:2508.12803, 2025
  19. Findings of the first shared task for Creole language machine translation at WMT25
    Nathaniel Robinson, Claire Bizon Monroc, Rasul Dent, Stefan Watson, Kenton Murray, Raj Dabre, and 2 more authors
    In Proceedings of the Tenth Conference on Machine Translation, 2025
  20. The Reasoning Lingua Franca: A Double-Edged Sword for Multilingual AI
    Alan Saji, Raj Dabre, Anoop Kunchukuttan, and Ratish Puduppully
    arXiv preprint arXiv:2510.20647, 2025
  21. RiddleBench: A New Generative Reasoning Benchmark for LLMs
    Deepon Halder, Alan Saji, Thanmay Jayakumar, Ratish Puduppully, Anoop Kunchukuttan, and Raj Dabre
    arXiv preprint arXiv:2510.24932, 2025
  22. Findings of the IWSLT 2025 evaluation campaign
    Victor Agostinelli, Tanel Alumäe, Antonios Anastasopoulos, Luisa Bentivogli, Ondřej Bojar, Claudia Borg, and 5 more authors
    In Proceedings of the 22nd International Conference on Spoken Language Translation (IWSLT 2025), 2025
  23. Data and Model Centric Approaches for Expansion of Large Language Models to New languages
    Anoop Kunchukuttan, Raj Dabre, Rudra Murthy, Mohammed Safi Ur Rahman Khan, and Thanmay Jayakumar
    In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts, 2025
  24. PRALEKHA: Cross-Lingual Document Alignment for Indic Languages
    Sanjay Suryanarayanan, Haiyue Song, Mohammed Safi Ur Rahman Khan, Anoop Kunchukuttan, and Raj Dabre
    In Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2025
  25. Multilingual Iterative Model Pruning: What Matters?
    Haryo Akbarianto Wibowo, Haiyue Song, Hideki Tanaka, Masao Utiyama, Alham Fikri Aji, and Raj Dabre
    In Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2025

2024

  1. A comprehensive analysis of adapter efficiency
    Nandini Mundra, Sumanth Doddapaneni, Raj Dabre, Anoop Kunchukuttan, Ratish Puduppully, and Mitesh M Khapra
    In Proceedings of the 7th Joint International Conference on Data Science & Management of Data (11th ACM IKDD CODS and 29th COMAD), 2024
  2. SciCap+: A knowledge augmented dataset to study the challenges of scientific figure captioning
    Zhishen Yang, Raj Dabre, Hideki Tanaka, and Naoaki Okazaki
    Journal of Natural Language Processing, 2024
  3. CreoleVal: Multilingual multitask benchmarks for Creoles
    Heather Lent, Kushal Tatariya, Raj Dabre, Yiyi Chen, Marcell Fekete, Esther Ploeger, and 5 more authors
    Transactions of the Association for Computational Linguistics, 2024
  4. Bilingual Corpus Mining and Multistage Fine-Tuning for Improving Machine Translation of Lecture Transcripts
    Haiyue Song, Raj Dabre, Chenhui Chu, Atsushi Fujita, and Sadao Kurohashi
    Journal of Information Processing, 2024
  5. MOS-FAD: Improving fake audio detection via automatic mean opinion score prediction
    Wangjin Zhou, Zhengdong Yang, Chenhui Chu, Sheng Li, Raj Dabre, Yi Zhao, and 1 more author
    In ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024
  6. Airavata: Introducing Hindi instruction-tuned LLM
    Jay Gala, Thanmay Jayakumar, Jaavid Aktar Husain, Mohammed Safi Ur Rahman Khan, Diptesh Kanojia, Ratish Puduppully, and 5 more authors
    arXiv preprint arXiv:2401.15006, 2024
  7. An empirical study of in-context learning in LLMs for machine translation
    Pranjal Chitale, Jay Gala, and Raj Dabre
    In Findings of the Association for Computational Linguistics: ACL 2024, 2024
  8. IndicLLMSuite: A Blueprint for Creating Pre-training and Fine-Tuning Datasets for Indian Languages
    Mohammed Safi Ur Rahman Khan, Priyam Mehta, Ananth Sankar, Umashankar Kumaravelan, Sumanth Doddapaneni, Varun Balan G, and 5 more authors
    arXiv e-prints, 2024
  9. DiverSeg: Leveraging Diverse Segmentations with Cross-granularity Alignment for Neural Machine Translation
    Haiyue Song, Zhuoyuan Mao, Raj Dabre, Chenhui Chu, and Sadao Kurohashi
    Journal of Natural Language Processing, 2024
  10. Do Not Worry if You Do Not Have Data: Building Pretrained Language Models Using Translationese
    Meet Doshi, Raj Dabre, and Pushpak Bhattacharyya
    arXiv e-prints, 2024
  11. A morphology-based investigation of positional encodings
    Poulami Ghosh, Shikhar Vashishth, Raj Dabre, and Pushpak Bhattacharyya
    In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
  12. NGLUEni: Benchmarking and adapting pretrained language models for Nguni languages
    Francois Meyer, Haiyue Song, Abhisek Chakrabarty, Jan Buys, Raj Dabre, and Hideki Tanaka
    In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), 2024
  13. Kreyòl-MT: Building MT for Latin American, Caribbean and colonial African creole languages
    Nathaniel Robinson, Raj Dabre, Ammon Shurtz, Rasul Dent, Onenamiyi Onesi, Claire Monroc, and 5 more authors
    In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024
  14. How Good is Zero-Shot MT Evaluation for Low Resource Indian Languages?
    Anushka Singh, Ananya Sai, Raj Dabre, Ratish Puduppully, Anoop Kunchukuttan, and Mitesh M Khapra
    In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2024
  15. How effective is Multi-source pivoting for Translation of Low Resource Indian Languages?
    Pranav Gaikwad, Meet Doshi, Raj Dabre, and Pushpak Bhattacharyya
    arXiv preprint arXiv:2406.13332, 2024
  16. An empirical comparison of vocabulary expansion and initialization approaches for language models
    Nandini Mundra, Aditya Nanda Kishore Khandavally, Raj Dabre, Ratish Puduppully, Anoop Kunchukuttan, and Mitesh M Khapra
    In Proceedings of the 28th Conference on Computational Natural Language Learning, 2024
  17. PUB: A Pragmatics Understanding Benchmark for Assessing LLMs’ Pragmatics Capabilities
    Sravanthi Settaluri, Meet Doshi, Pavan Kalyan Tankala, Rudra Murthy Venkataramana, Raj Dabre, and Pushpak Bhattacharyya
    In Annual Meeting of the Association for Computational Linguistics, 2024
  18. SubMerge: Merging Equivalent Subword Tokenizations for Subword Regularized Models in Neural Machine Translation
    Haiyue Song, Francois Meyer, Raj Dabre, Hideki Tanaka, Chenhui Chu, and Sadao Kurohashi
    In Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 1), 2024
  19. NICT’s Cascaded and End-To-End Speech Translation Systems using Whisper and IndicTrans2 for the Indic Task
    Raj Dabre, and Haiyue Song
    In Proceedings of the 21st International Conference on Spoken Language Translation (IWSLT 2024), 2024
  20. RomanSetu: Efficiently unlocking multilingual capabilities of large language models via romanization
    J Jaavid, Raj Dabre, M Aswanth, Jay Gala, Thanmay Jayakumar, Ratish Puduppully, and 1 more author
    In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
  21. Incorporating Hypernym Features for Improving Low-resource Neural Machine Translation
    Abhisek Chakrabarty, Haiyue Song, Raj Dabre, Hideki Tanaka, and Masao Utiyama
    In Proceedings of the First International Workshop on Knowledge-Enhanced Machine Translation, 2024
  22. How effective is synthetic data and instruction fine-tuning for translation with markup using LLMs?
    Raj Dabre, Haiyue Song, Miriam Exel, Bianka Buschbeck, Johannes Eschbach-Dymanus, and Hideki Tanaka
    In Proceedings of the 16th Conference of the Association for Machine Translation in the Americas (Volume 1: Research Track), 2024
  23. Findings of WMT 2024’s MultiIndic22MT shared task for machine translation of 22 Indian languages
    Raj Dabre, and Anoop Kunchukuttan
    In Proceedings of the Ninth Conference on Machine Translation, 2024
  24. BhasaAnuvaad: A speech translation dataset for 14 Indian languages
    Sparsh Jain, Ashwin Sankar, Devilal Choudhary, Dhairya Suman, Nikhil Narasimhan, Mohammed Safi Ur Rahman Khan, and 3 more authors
    arXiv e-prints, 2024
  25. Pretraining language models using translationese
    Meet Doshi, Raj Dabre, and Pushpak Bhattacharyya
    In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
  26. Leveraging Adapters for Improved Cross-Lingual Transfer for Low-Resource Creole MT
    Marcell Richard Fekete, Ernests Lavrinovics, Nathaniel Romney Robinson, Heather Lent, Raj Dabre, and Johannes Bjerva
    In Proceedings of the Fourth Workshop on Multilingual Representation Learning (MRL 2024), 2024
  27. Machine translation of Marathi dialects: A case study of Kadodi
    Raj Dabre, Mary Dabre, and Teresa Pereira
    In Proceedings of the Eleventh Workshop on Asian Translation (WAT 2024), 2024
  28. CVQA: Culturally-diverse multilingual visual question answering benchmark
    David Orlando Romero Mogrovejo, Chenyang Lyu, Haryo Akbarianto Wibowo, Santiago Góngora, Aishik Mandal, Sukannya Purkayastha, and 5 more authors
    In The Thirty-eighth Conference on Neural Information Processing Systems Datasets and Benchmarks Track, 2024
  29. Pralekha: An Indic document alignment evaluation benchmark
    Sanjay Suryanarayanan, Haiyue Song, Mohammed Safi Ur Rahman Khan, Anoop Kunchukuttan, Mitesh M Khapra, and Raj Dabre
    arXiv e-prints, 2024
  30. Proceedings of the Eleventh Workshop on Asian Translation (WAT 2024)
    Toshiaki Nakazawa, and Isao Goto
    In Proceedings of the Eleventh Workshop on Asian Translation (WAT 2024), 2024
  31. Linguistically Motivated Neural Machine Translation
    Haiyue Song, Hour Kaing, and Raj Dabre
    2024

2023

  1. YANMTT: Yet another neural machine translation toolkit
    Raj Dabre, Diptesh Kanojia, Chinmay Sawant, and Eiichiro Sumita
    In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), 2023
  2. MT metrics correlate with human ratings of simultaneous speech translation
    Dominik Macháček, Ondřej Bojar, and Raj Dabre
    In Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023), 2023
  3. An empirical study of leveraging knowledge distillation for compressing multilingual neural machine translation models
    Varun Gumma, Raj Dabre, and Pratyush Kumar
    In Proceedings of the 24th Annual Conference of the European Association for Machine Translation, 2023
  4. Variable-length neural interlingua representations for zero-shot neural machine translation
    Zhuoyuan Mao, Haiyue Song, Raj Dabre, Chenhui Chu, and Sadao Kurohashi
    In Proceedings of the 1st International Workshop on Multilingual, Multimodal and Multitask Language Generation, 2023
  5. Exploring the impact of layer normalization for zero-shot neural machine translation
    Zhuoyuan Mao, Raj Dabre, Qianying Liu, Haiyue Song, Chenhui Chu, and Sadao Kurohashi
    In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2023
  6. Low-resource multilingual neural translation using linguistic feature-based relevance mechanisms
    Abhisek Chakrabarty, Raj Dabre, Chenchen Ding, Masao Utiyama, and Eiichiro Sumita
    ACM Transactions on Asian and Low-Resource Language Information Processing, 2023
  7. IndicTrans2: Towards high-quality and accessible machine translation models for all 22 scheduled Indian languages
    Jay Gala, Pranjal A Chitale, Raghavan Ak, Varun Gumma, Sumanth Doddapaneni, Aswanth Kumar, and 5 more authors
    arXiv preprint arXiv:2305.16307, 2023
  8. Robustness of multi-source MT to transcription errors
    Dominik Macháček, Peter Polák, Ondřej Bojar, and Raj Dabre
    In Findings of the Association for Computational Linguistics: ACL 2023, 2023
  9. IndicMT eval: A dataset to meta-evaluate machine translation metrics for Indian languages
    Tanay Dixit, Vignesh Nagarajan, Anoop Kunchukuttan, Pratyush Kumar, Mitesh M Khapra, Raj Dabre, and 1 more author
    In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023
  10. SelfSeg: a self-supervised sub-word segmentation method for neural machine translation
    Haiyue Song, Raj Dabre, Chenhui Chu, Sadao Kurohashi, and Eiichiro Sumita
    ACM Transactions on Asian and Low-Resource Language Information Processing, 2023
  11. Turning Whisper into real-time transcription system
    Dominik Macháček, Raj Dabre, and Ondřej Bojar
    In Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics: System Demonstrations, 2023
  12. Overview of the 10th workshop on Asian translation
    Toshiaki Nakazawa, Kazutaka Kinugawa, Hideya Mino, Isao Goto, Raj Dabre, Shohei Higashiyama, and 5 more authors
    In Proceedings of the 10th Workshop on Asian Translation, 2023
  13. A study on the effectiveness of large language models for translation with markup
    Raj Dabre, Bianka Buschbeck, Miriam Exel, and Hideki Tanaka
    In Proceedings of Machine Translation Summit XIX, Vol. 1: Research Track, 2023
  14. NICT-AI4B’s Submission to the Indic MT Shared Task in WMT 2023
    Raj Dabre, Jay Gala, and Pranjal A Chitale
    In Proceedings of the Eighth Conference on Machine Translation, 2023
  15. DecoMT: Decomposed prompting for machine translation between related languages using large language models
    Ratish Puduppully, Anoop Kunchukuttan, Raj Dabre, Aiti Aw, and Nancy Chen
    In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
  16. CTQScorer: Combining multiple features for in-context example selection for machine translation
    Aswanth Kumar, Ratish Puduppully, Raj Dabre, and Anoop Kunchukuttan
    In Findings of the Association for Computational Linguistics: EMNLP 2023, 2023
  17. Developing State-Of-The-Art Massively Multilingual Machine Translation Systems for Related Languages
    Jay Gala, Pranjal A Chitale, and Raj Dabre
    In Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics: Tutorial Abstract, 2023
  18. Large pre-trained language models with multilingual prompt for Japanese natural language tasks
    Haiyue Song, Raj Dabre, Chenhui Chu, and Sadao Kurohashi
    In Proc. 29th Annu. Meet. Conf. Nat. Lang. Process, 2023
  19. Turning Whisper into Real-Time Transcription System (Version 2)
    D Macháček, R Dabre, and O Bojar
    arXiv, 2023

2022

  1. Self-supervised dynamic programming encoding for neural machine translation
    Haiyue Song, Raj Dabre, Chenhui Chu, Sadao Kurohashi, and Eiichiro Sumita
    2022
  2. Fusion of self-supervised learned models for MOS prediction
    Zhengdong Yang, Wangjin Zhou, Chenhui Chu, Sheng Li, Raj Dabre, Raphael Rubino, and 1 more author
    arXiv preprint arXiv:2204.04855, 2022
  3. When do contrastive word alignments improve many-to-many neural machine translation?
    Zhuoyuan Mao, Chenhui Chu, Raj Dabre, Haiyue Song, Zhen Wan, and Sadao Kurohashi
    In Findings of the Association for Computational Linguistics: NAACL 2022, 2022
  4. ACL Rolling Review: A New Format For Centralized Peer Review
    Raj Dabre
    Journal of Natural Language Processing, 2022
  5. IndicBART: A pre-trained model for Indic natural language generation
    Raj Dabre, Himani Shrotriya, Anoop Kunchukuttan, Ratish Puduppully, Mitesh M Khapra, and Pratyush Kumar
    In Findings of the Association for Computational Linguistics: ACL 2022, 2022
  6. MorisienMT: A dataset for Mauritian Creole machine translation
    Raj Dabre, and Aneerav Sukhoo
    arXiv preprint arXiv:2206.02421, 2022
  7. NICT’s Submission to the WAT 2022 Structured Document Translation Task
    Raj Dabre
    In Proceedings of the 9th Workshop on Asian Translation, 2022
  8. FeatureBART: Feature based sequence-to-sequence pre-training for low-resource NMT
    Abhisek Chakrabarty, Raj Dabre, Chenchen Ding, Hideki Tanaka, Masao Utiyama, and Eiichiro Sumita
    In Proceedings of the 29th International Conference on Computational Linguistics, 2022
  9. BERTSeg: BERT based unsupervised subword segmentation for neural machine translation
    Haiyue Song, Raj Dabre, Zhuoyuan Mao, Chenhui Chu, and Sadao Kurohashi
    In Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), 2022
  10. KreolMorisienMT: A dataset for Mauritian Creole machine translation
    Raj Dabre, and Aneerav Sukhoo
    In Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022, 2022
  11. A Multilingual Multiway Evaluation Data Set for Structured Document Translation of Asian Languages
    Bianka Buschbeck, Raj Dabre, Miriam Exel, Matthias Huck, Patrick Huy, Raphael Rubino, and 1 more author
    In Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022, 2022
  12. NICT at MixMT 2022: Synthetic Code-Mixed Pre-training and Multi-way Fine-tuning for Hinglish–English Translation
    Raj Dabre
    In Proceedings of the Seventh Conference on Machine Translation (WMT), 2022
  13. IndicNLG benchmark: Multilingual datasets for diverse NLG tasks in Indic languages
    Aman Kumar, Himani Shrotriya, Prachi Sahu, Amogh Mishra, Raj Dabre, Ratish Puduppully, and 3 more authors
    In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
  14. IndicBART: A pre-trained model for Indic languages
    Raj Dabre, and others
    In Proceedings of LREC, 2022
  15. IndicBART: a pre-trained model for Indic natural language generation of Indic languages
    R Dabre, H Shrotriya, A Kunchukuttan, R Puduppully, MM Khapra, and P Kumar
    2022

2021

  1. Overview of the 8th workshop on Asian translation
    Toshiaki Nakazawa, Hideki Nakayama, Chenchen Ding, Raj Dabre, Shohei Higashiyama, Hideya Mino, and 5 more authors
    In Proceedings of the 8th Workshop on Asian Translation (WAT2021), 2021
  2. Simultaneous multi-pivot neural machine translation
    Raj Dabre, Aizhan Imankulova, Masahiro Kaneko, and Abhisek Chakrabarty
    arXiv preprint arXiv:2104.07410, 2021
  3. Recurrent stacking of layers in neural networks: An application to neural machine translation
    Raj Dabre, and Atsushi Fujita
    arXiv preprint arXiv:2106.10002, 2021
  4. Investigating softmax tempering for training neural machine translation models
    Raj Dabre, and Atsushi Fujita
    In Proceedings of Machine Translation Summit XVIII: Research Track, 2021
  5. Studying the impact of document-level context on simultaneous neural machine translation
    Raj Dabre, Aizhan Imankulova, and Masahiro Kaneko
    In Proceedings of Machine Translation Summit XVIII: Research Track, 2021
  6. NICT-5’s submission to WAT 2021: MBART pre-training and in-domain fine tuning for indic languages
    Raj Dabre, and Abhisek Chakrabarty
    In Proceedings of the 8th Workshop on Asian Translation (WAT2021), 2021
  7. Proceedings of the 8th Workshop on Asian Translation (WAT2021)
    Toshiaki Nakazawa, Hideki Nakayama, Isao Goto, Hideya Mino, Chenchen Ding, Raj Dabre, and 5 more authors
    In Proceedings of the 8th Workshop on Asian Translation (WAT2021), 2021

2020

  1. A survey of multilingual neural machine translation
    Raj Dabre, Chenhui Chu, and Anoop Kunchukuttan
    ACM Computing Surveys (CSUR), 2020
  2. Balancing cost and benefit with tied-multi transformers
    Raj Dabre, Raphael Rubino, and Atsushi Fujita
    In Proceedings of the Fourth Workshop on Neural Generation and Translation, 2020
  3. Domain adaptation of neural machine translation through multistage fine-tuning
    Haiyue Song, Raj Dabre, Atsushi Fujita, and Sadao Kurohashi
    In 26th Annual Conference of the Association for Natural Language Processing, 2020
  4. JASS: Japanese-specific sequence to sequence pre-training for neural machine translation
    Zhuoyuan Mao, Fabien Cromieres, Raj Dabre, Haiyue Song, and Sadao Kurohashi
    In Proceedings of the Twelfth Language Resources and Evaluation Conference, 2020
  5. Joint Training End-to-End Speech Recognition Systems with Speaker Attributes
    Sheng Li, Xugang Lu, Raj Dabre, Peng Shen, and Hisashi Kawai
    In Odyssey, 2020
  6. Pre-training via leveraging assisting languages for neural machine translation
    Haiyue Song, Raj Dabre, Zhuoyuan Mao, Fei Cheng, Sadao Kurohashi, and Eiichiro Sumita
    In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, 2020
  7. Combining sequence distillation and transfer learning for efficient low-resource neural machine translation models
    Raj Dabre, and Atsushi Fujita
    In Proceedings of the Fifth Conference on Machine Translation, 2020
  8. Harnessing cross-lingual features to improve cognate detection for low-resource languages
    Diptesh Kanojia, Raj Dabre, Shubham Dewangan, Pushpak Bhattacharyya, Gholamreza Haffari, and Malhar Kulkarni
    In Proceedings of the 28th International Conference on Computational Linguistics, 2020
  9. Improving low-resource NMT through relevance based linguistic features incorporation
    Abhisek Chakrabarty, Raj Dabre, Chenchen Ding, Masao Utiyama, and Eiichiro Sumita
    In Proceedings of the 28th International Conference on Computational Linguistics, 2020
  10. NICT’s Submission To WAT 2020: How Effective Are Simple Many-To-Many Neural Machine Translation Models?
    Raj Dabre, and Abhisek Chakrabarty
    In Proceedings of the 7th Workshop on Asian Translation, 2020
  11. Extremely low-resource neural machine translation for Asian languages
    Raphael Rubino, Benjamin Marie, Raj Dabre, Atsushi Fujita, Masao Utiyama, and Eiichiro Sumita
    Machine Translation, 2020
  12. A comprehensive survey of multilingual neural machine translation
    Raj Dabre, Chenhui Chu, and Anoop Kunchukuttan
    CoRR, 2020
  13. ニューラル機械翻訳のための言語知識に基づくマルチタスク事前学習
    Zhuoyuan Mao, Raj Dabre, Fabien Cromieres, Haiyue Song, Ryota Nakao, and Sadao Kurohashi
    言語処理学会 第 26 回年次大会, 2020
  14. Extremely low-resource neural machine translation for Asian languages
    R Rubino, B Marie, R Dabre, A Fujita, M Utiyama, and E Sumita
    Machine Translation, vol. 34, no. 4, 2020

2019

  1. Recurrent stacking of layers for compact neural machine translation models
    Raj Dabre, and Atsushi Fujita
    In Proceedings of the AAAI Conference on Artificial Intelligence, 2019
  2. Multilingual multi-domain adaptation approaches for neural machine translation
    Chenhui Chu, and Raj Dabre
    arXiv preprint arXiv:1906.07978, 2019
  3. Exploiting out-of-domain parallel data through multilingual transfer learning for low-resource neural machine translation
    Aizhan Imankulova, Raj Dabre, Atsushi Fujita, and Kenji Imamura
    In Proceedings of Machine Translation Summit XVII: Research Track, 2019
  4. NICT’s supervised neural machine translation systems for the WMT19 translation robustness task
    Raj Dabre, and Eiichiro Sumita
    In Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1), 2019
  5. NICT’s supervised neural machine translation systems for the WMT19 news translation task
    Raj Dabre, Kehai Chen, Benjamin Marie, Rui Wang, Atsushi Fujita, Masao Utiyama, and 1 more author
    In Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1), 2019
  6. NICT’s machine translation systems for the WMT19 similar language translation task
    Benjamin Marie, Raj Dabre, and Atsushi Fujita
    In Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2), 2019
  7. Improving Transformer-Based Speech Recognition Systems with Compressed Structure and Speech Attributes Augmentation
    Sheng Li, Raj Dabre, Xugang Lu, Peng Shen, Tatsuya Kawahara, and Hisashi Kawai
    In Interspeech, 2019
  8. Proceedings of the 6th Workshop on Asian Translation
    Toshiaki Nakazawa, Chenchen Ding, Raj Dabre, Anoop Kunchukuttan, Nobushige Doi, Yusuke Oda, and 4 more authors
    In Proceedings of the 6th Workshop on Asian Translation, 2019
  9. Exploiting multilingualism through multistage fine-tuning for low-resource neural machine translation
    Raj Dabre, Atsushi Fujita, and Chenhui Chu
    In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019
  10. NICT’s participation to WAT 2019: Multilingualism and Multi-step Fine-Tuning for Low Resource NMT
    Raj Dabre, and Eiichiro Sumita
    In Proceedings of the 6th Workshop on Asian Translation, 2019
  11. Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation
    Haiyue Song, Raj Dabre, Atsushi Fujita, and Sadao Kurohashi
    arXiv preprint arXiv:1912.11739, 2019
  12. Multilingual Multi-Domain Adaptation Approaches for Neural Machine Translation
    Chenhui Chu, and Raj Dabre
    arXiv preprint arXiv:1906.07978, 2019
  13. Comparison of SMT and RBMT; The Requirement of Hybridization for Marathi–Hindi MT
    S Sreelekha, Raj Dabre, and Pushpak Bhattacharyya
    2019
  14. Improving Transformer-Based Speech Recognition Systems with Compressed Structure and Speech Attributes Augmentation
    S Li, R Dabre, X Lu, P Shen, T Kawahara, and H Kawai
    In Interspeech, 2019

2018

  1. Multilingual and multi-domain adaptation for neural machine translation
    Chenhui Chu, and Raj Dabre
    In Proceedings of the 24th Annual Meeting of the Association for Natural Language Processing (NLP 2018), 2018
  2. Exploiting Multilingual Corpora Simply and Efficiently in Neural Machine Translation
    Raj Dabre, Fabien Cromieres, and Sadao Kurohashi
    Journal of Information Processing, 2018
  3. Exploiting Multilingualism and Transfer Learning for Low Resource Machine Translation
    Raj Noel Dabre Prasanna
    2018
  4. NICT’s Participation in WAT 2018: Approaches Using Multilingualism and Recurrently Stacked Layers
    Raj Dabre, Anoop Kunchukuttan, Atsushi Fujita, and Eiichiro Sumita
    2018
  5. A comprehensive empirical comparison of domain adaptation methods for neural machine translation
    Chenhui Chu, Raj Dabre, and Sadao Kurohashi
    Journal of Information Processing, 2018
  6. Overview of the 5th Workshop on Asian Translation
    Toshiaki Nakazawa, Katsuhito Sudoh, Shohei Higashiyama, Chenchen Ding, Raj Dabre, Hideya Mino, and 4 more authors
    In Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation: 5th Workshop on Asian Translation, 2018

2017

  1. Enabling multi-source neural machine translation by concatenating source sentences in multiple languages
    Raj Dabre, Fabien Cromieres, and Sadao Kurohashi
    In Proceedings of Machine Translation Summit XVI: Research Track, 2017
  2. MMCR4NLP: multilingual multiway corpora repository for natural language processing
    Raj Dabre, and Sadao Kurohashi
    arXiv preprint arXiv:1710.01025, 2017
  3. An empirical study of language relatedness for transfer learning in neural machine translation
    Raj Dabre, Tetsuji Nakagawa, and Hideto Kazawa
    In Proceedings of the 31st Pacific Asia Conference on Language, Information and Computation, 2017
  4. Kyoto University MT System Description for IWSLT 2017
    Raj Dabre, Fabien Cromieres, and Sadao Kurohashi
    In Proceedings of the 14th International Conference on Spoken Language Translation, 2017
  5. An empirical comparison of domain adaptation methods for neural machine translation
    Chenhui Chu, Raj Dabre, and Sadao Kurohashi
    In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2017
  6. An empirical comparison of simple domain adaptation methods for neural machine translation
    Chenhui Chu, Raj Dabre, and Sadao Kurohashi
    arXiv preprint arXiv:1701.03214, 2017
  7. Neural machine translation: Basics, practical aspects and recent trends
    Fabien Cromieres, Toshiaki Nakazawa, and Raj Dabre
    In Proceedings of the IJCNLP 2017, Tutorial Abstracts, 2017

2016

  1. Sophisticated Lexical Databases-Simplified Usage: Mobile Applications and Browser Plugins For Wordnets
    Diptesh Kanojia, Raj Dabre, and Pushpak Bhattacharyya
    In Proceedings of the 8th Global WordNet Conference (GWC), 2016
  2. The Kyoto University cross-lingual pronoun translation system
    Raj Dabre, Yevgeniy Puzikov, Fabien Cromieres, and Sadao Kurohashi
    In Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers, 2016
  3. Parallel sentence extraction from comparable corpora with neural network features
    Chenhui Chu, Raj Dabre, and Sadao Kurohashi
    In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), 2016
  4. Kyoto University participation to WAT 2016
    Fabien Cromieres, Chenhui Chu, Toshiaki Nakazawa, and Sadao Kurohashi
    In Proceedings of the 3rd Workshop on Asian Translation (WAT2016), 2016

2015

  1. Large-scale Japanese-Chinese scientific dictionary construction via pivot-based statistical machine translation
    Chenhui Chu, Raj Dabre, Toshiaki Nakazawa, and Sadao Kurohashi
    In Proceedings of the 21st Annual Meeting of the Association for Natural Language Processing (NLP 2015), 2015
  2. Leveraging small multilingual corpora for SMT using many pivot languages
    Raj Dabre, Fabien Cromieres, Sadao Kurohashi, and Pushpak Bhattacharyya
    In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2015
  3. KyotoEBMT System Description for the 2nd Workshop on Asian Translation
    John Richardson, Raj Dabre, Chenhui Chu, Fabien Cromieres, Toshiaki Nakazawa, and Sadao Kurohashi
    In Proceedings of the 2nd Workshop on Asian Translation (WAT2015), 2015
  4. Large-scale dictionary construction via pivot-based statistical machine translation with significance pruning and neural network features
    Raj Dabre, Chenhui Chu, Fabien Cromieres, Toshiaki Nakazawa, and Sadao Kurohashi
    In Proceedings of the 29th Pacific Asia Conference on Language, Information and Computation, 2015
  5. Augmenting Pivot based SMT with word segmentation
    Rohit More, Anoop Kunchukuttan, Pushpak Bhattacharyya, and Raj Dabre
    In Proceedings of the 12th International Conference on Natural Language Processing, 2015

2014

  1. Do not do processing, when you can look up: Towards a discrimination net for WSD
    Diptesh Kanojia, Pushpak Bhattacharyya, Raj Dabre, Siddhartha Gunti, and Manish Shrivastava
    In Proceedings of the Seventh Global Wordnet Conference, 2014
  2. PaCMan: Parallel corpus management workbench
    Diptesh Kanojia, Manish Shrivastava, Raj Dabre, and Pushpak Bhattacharyya
    In Proceedings of the 11th International Conference on Natural Language Processing, 2014
  3. Tackling Close Cousins: Experiences In Developing Statistical Machine Translation Systems For Marathi And Hindi
    Raj Dabre, Jyotesh Choudhari, and Pushpak Bhattacharyya
    In Proceedings of the 11th International Conference on Natural Language Processing, 2014
  4. Anou tradir: Experiences in building statistical machine translation systems for Mauritian languages–Creole, English, French
    Raj Dabre, Aneerav Sukhoo, and Pushpak Bhattacharyya
    In Proceedings of the 11th International Conference on Natural Language Processing, 2014

2013

  1. A way to break them all: A compound word analyzer for Marathi
    Raj Dabre, Archana Amberkar, and Pushpak Bhattacharyya
    ICON, 2013
  2. Comparison of SMT and RBMT: The requirement of Hybridization for Marathi–Hindi MT
    Sreelekha S, Raj Dabre, and Pushpak Bhattacharyya
    ICON, 2013

2012

  1. Morphological Analyzer for Affix Stacking Languages: A Case Study of Marathi
    Raj Dabre, Archana Amberkar, and Pushpak Bhattacharyya
    In COLING (Posters), 2012
  2. Morphology Analyser for Affix Stacking Languages: A Case Study in Marathi
    Raj Dabre, and Archana Amberkar
    2012