Vol. 4 No. 1 (2025): New Era of AI

Efficient Vietnamese Name Retrieval using Highly Discriminative N-Grams

Submitted
February 8, 2025
Published
May 31, 2025

Abstract

Retrieving Vietnamese names from large global databases is crucial for fostering connections within professional Vietnamese communities. However, because Vietnamese names are written in Latin characters, they are often ambiguous and can resemble names from other countries. This paper introduces Highly Discriminative N-grams (HDNs), a novel query method designed to retrieve Vietnamese names efficiently from diverse datasets. Experimental results show that HDN queries significantly outperform traditional unigram queries in precision, recall, and cost-effectiveness. The approach improves the accuracy and efficiency of Vietnamese name retrieval, supporting efforts to connect the Vietnamese diaspora with global opportunities.