Publications

All papers have been subject to peer review unless indicated otherwise. * indicates equal contribution.

Refining Low-Resource Unsupervised Translation by Language Disentanglement of Multilingual Model

36th Conference on Neural Information Processing Systems (NeurIPS 2022), New Orleans, USA, 2022

Citation: Xuan-Phi Nguyen, Shafiq Joty, Wu Kui & Aw Ai Ti (2022). Refining Low-Resource Unsupervised Translation by Language Disentanglement of Multilingual Model. 36th Conference on Neural Information Processing Systems (NeurIPS 2022).
Paper Link: https://arxiv.org/abs/2205.15544

Contrastive Clustering to Mine Pseudo Parallel Data for Unsupervised Translation

International Conference on Learning Representations (ICLR 2022), 2022

Fully unsupervised mining method that can build synthetic parallel data for unsupervised machine translation.

Citation: Xuan-Phi Nguyen, Hongyu Gong, Yun Tang, Changhan Wang, Philipp Koehn, and Shafiq Joty (2022). Contrastive Clustering to Mine Pseudo Parallel Data for Unsupervised Translation. In International Conference on Learning Representations (ICLR) 2022.
Paper Link: https://openreview.net/pdf?id=pN1JOdrSY9

Cross-model Back-translated Distillation for Unsupervised Machine Translation

38th International Conference on Machine Learning (ICML), 2021

A novel strategy to improve unsupervised MT by using back-translation with multiple models.

Citation: Xuan-Phi Nguyen, Shafiq Joty, Thanh-Tung Nguyen, Wu Kui, & Ai Ti Aw (2021). Cross-model Back-translated Distillation for Unsupervised Machine Translation. In Proceedings of the 38th International Conference on Machine Learning (ICML 2021).
Paper Link: https://arxiv.org/abs/2006.02163

A Conditional Splitting Framework for Efficient Constituency Parsing

ACL 2021 - The 59th Annual Meeting of the Association for Computational Linguistics, 2021

A Seq2Seq parsing framework that casts constituency parsing problems into a series of conditional splitting decisions.

Citation: Thanh-Tung Nguyen, Xuan-Phi Nguyen, Shafiq Joty & Xiaoli Li (2021). A Conditional Splitting Framework for Efficient Constituency Parsing. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics.
Paper Link: not-ready-yet

RST Parsing from Scratch

Proceedings of the North American Chapter of the Association for Computational Linguistics (NAACL), 2021

A novel top-down end-to-end formulation of document level discourse parsing in the Rhetorical Structure Theory (RST) framework.

Citation: Thanh-Tung Nguyen, Xuan-Phi Nguyen, Shafiq Joty & Xiaoli Li (2021). RST Parsing from Scratch. In Proceedings of the North American Chapter of the Association for Computational Linguistics (NAACL 2021).
Paper Link: https://www.aclweb.org/anthology/2020.acl-main.589/

Data Diversification: An Elegant Strategy For Neural Machine Translation

34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada, 2020

A simple way to boost many NMT tasks by using multiple backward and forward models.

Citation: Xuan-Phi Nguyen, Shafiq Joty, Wu Kui, & Ai Ti Aw (2020). Data Diversification: An Elegant Strategy For Neural Machine Translation. In the 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada.
Paper Link: https://arxiv.org/abs/1911.01986

Tree-structured Attention with Hierarchical Accumulation

International Conference on Learning Representations (ICLR), 2020

A novel attention mechanism that aggregates hierarchical structures to encode constituency trees for downstream tasks.

Citation: Xuan-Phi Nguyen, Shafiq Joty, Steven Hoi, & Richard Socher (2020). Tree-Structured Attention with Hierarchical Accumulation. In International Conference on Learning Representations.
Paper Link: https://arxiv.org/abs/2002.08046

Efficient Constituency Parsing by Pointing

ACL 2020 - The 58th Annual Meeting of the Association for Computational Linguistics, 2020

A new parsing method that employs a pointing mechanism to perform top-down decoding. The method is competitive with the state of the art while being faster.

Citation: Thanh-Tung Nguyen, Xuan-Phi Nguyen, Shafiq Joty & Xiaoli Li (2020). Efficient Constituency Parsing by Pointing. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics.
Paper Link: https://www.aclweb.org/anthology/2020.acl-main.301/

Differentiable Window for Dynamic Local Attention

ACL 2020 - The 58th Annual Meeting of the Association for Computational Linguistics, 2020

Using differentiable windows to perform local attention greatly improves the performance of machine translation and language modeling.

Citation: Xuan-Phi Nguyen*, Thanh-Tung Nguyen*, Shafiq Joty & Xiaoli Li (2020). Differentiable Window for Dynamic Local Attention. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics.
Paper Link: https://www.aclweb.org/anthology/2020.acl-main.589/

Medical Image Segmentation with Stochastic Aggregated Loss in a Unified U-Net

2019 IEEE EMBS International Conference on Biomedical & Health Informatics (BHI), 2019

Traditional U-Net models suffer from vanishing gradients under certain circumstances, such as detecting the existence of tumors in the brain. We introduce a novel Stochastic Aggregated Loss that improves both the gradient flow and the performance of U-Net.

Citation: P. X. Nguyen, Z. Lu, W. Huang, S. Huang, A. Katsuki & Z. Lin (2019). Medical Image Segmentation with Stochastic Aggregated Loss in a Unified U-Net. In 2019 IEEE EMBS International Conference on Biomedical & Health Informatics (IEEE BHI 2019), Chicago, USA.
Paper Link: https://ieeexplore.ieee.org/document/8834667