About

Hi! I'm Phi. I am a Research Scientist in Artificial Intelligence (AI) at Salesforce Inc., working to build efficient, state-of-the-art LLMs.

My research interests include natural language processing (NLP), multilingual large language models (LLMs), and machine translation.

I received my PhD in Computer Science and AI from Nanyang Technological University (NTU), Singapore, under Prof. Shafiq Joty. My thesis focused on unsupervised, semi-supervised, and multilingual neural machine translation.

Find my CV here.

Work Experience

May 2023 - Present
United States

Research Scientist

Salesforce AI Research

Lead efforts on SFR-RAG: smaller but powerful language models specialized in retrieval-augmented generation (RAG). SFR-RAG was praised by CEO Marc Benioff and received wide media coverage.

Conduct research on improving the reliability, faithfulness, and reasoning abilities of large language models.
April 2023 - May 2023
Singapore

Senior Algorithm Engineer (Research)

Damo Academy, Alibaba

Led research and technical efforts on SeaLLMs, the first open-source, state-of-the-art large language models for Southeast Asian languages, which earned significant attention from the media and research communities.
May 2022 - Sep 2022
United States

Research Intern

Meta AI (FAIR)

Researched ways to improve speech-to-speech translation by introducing new methods to generate and augment synthetic data. Our paper was submitted to ICASSP 2023.
May 2021 - Sep 2021
United States (Remote)

Research Intern

Facebook AI Research (FAIR)

Researched and developed a novel pseudo-parallel data mining technique and integration strategy that achieved the state of the art in unsupervised machine translation. Our paper was published at ICLR 2022.
May 2019 - Aug 2019
Singapore

NLP Research Intern

Salesforce AI Research

Researched how different aspects of the linguistic structure of languages affect the performance of neural architectures on various natural language processing tasks. Proposed new state-of-the-art methods and wrote a paper published at ICLR 2020.
Mar 2018 - May 2019
Singapore

Research Assistant

Natural Language Processing Group, NTU

Researched limitations of and improvements to neural machine translation, including document-level machine translation, discourse phenomena, and phrase-based, parse-tree-based, and unsupervised neural machine translation. Wrote papers published at various machine learning and NLP conferences, e.g., ICLR, ACL, and EMNLP.
May 2017 - Jul 2017 & Jan 2018 - Jul 2018
Singapore

Software Engineer Intern

Visa Inc.

Developed a novel lightweight character-level convolutional neural network for scripted text classification tasks, reaching up to 98% accuracy while consuming 1,000 times fewer resources and training 10 times faster than standard deep models. Developed production-level code to deploy the models.
May 2016 - Jul 2016
Singapore

Software Engineer Intern

Panasonic R&D Center Singapore

Collaborated on the research and development of a new machine learning algorithm based on support vector machines (SVMs) to classify electrical signals, achieving 94% experimental accuracy. Helped design a Raspberry Pi robot that collects sensor signals and communicates with a server to control a real car's systems. Real-time accuracy reached 83.8%.

Education

2019-2023
Singapore

Doctor of Philosophy in Computer Science & Artificial Intelligence

Nanyang Technological University

Researched neural approaches to supervised, unsupervised, semi-supervised, and multilingual machine translation.
Thesis: Improving neural machine translation: data centric approaches
2015-2019
Singapore

Bachelor in Electrical & Electronics Engineering

Nanyang Technological University

Worked on various machine learning projects, ranging from classical methods such as SVMs to deep learning methods for computer vision and natural language processing.