publications
* denotes equal contribution
An up-to-date list is available on Google Scholar.
2024
- IndicLLMSuite: A Blueprint for Creating Pre-training and Fine-Tuning Datasets for Indian Languages. arXiv preprint, 2024
- Do Not Worry if You Do Not Have Data: Building Pretrained Language Models Using Translationese. arXiv preprint, 2024
- RomanSetu: Efficiently unlocking multilingual capabilities of Large Language Models via Romanization. arXiv preprint, 2024
- MOS-FAD: Improving Fake Audio Detection Via Automatic Mean Opinion Score Prediction. arXiv preprint, 2024
- PUB: A Pragmatics Understanding Benchmark for Assessing LLMs’ Pragmatics Capabilities. arXiv preprint, 2024
2023
- SuperShaper: A Pre-Training Approach for Discovering Efficient Transformer Shapes. In Workshop on Efficient Systems for Foundation Models @ ICML2023, Aug 2023
- SciCap+: A Knowledge Augmented Dataset to Study the Challenges of Scientific Figure Captioning. arXiv preprint, Aug 2023
- Variable-length Neural Interlingua Representations for Zero-shot Neural Machine Translation. In Proceedings of the 1st International Workshop on Multilingual, Multimodal and Multitask Language Generation, Jun 2023
2022
- BERTSeg: BERT Based Unsupervised Subword Segmentation for Neural Machine Translation. In Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), Nov 2022
2021
2020
2019
- Exploiting Out-of-Domain Parallel Data through Multilingual Transfer Learning for Low-Resource Neural Machine Translation. In Proceedings of Machine Translation Summit XVII: Research Track, Aug 2019
2018
- Overview of the 5th Workshop on Asian Translation. In Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation: 5th Workshop on Asian Translation, Dec 1–3, 2018
- NICT’s Participation in WAT 2018: Approaches Using Multilingualism and Recurrently Stacked Layers. In Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation: 5th Workshop on Asian Translation, Dec 1–3, 2018