Publications | Duc Tai Phan

2025

DAAL: Dual Ambiguity in Active Learning for Object Detection with YOLOE

Duc Tai Phan, Nhut Minh Nguyen, and Duc Ngoc Minh Dang

In 17th International Conference on Management of Digital Ecosystems, Sep 2025

The high cost of annotating vast datasets poses a significant challenge in object detection, hindering the development of robust models. By selectively annotating the most instructive samples, active learning has become a viable strategy for addressing this problem and improving model performance and labeling efficiency. However, traditional active learning techniques frequently fail to balance the two main types of model uncertainty: epistemic, which captures the model’s uncertainty in classifying an object, and aleatoric, which addresses the inherent ambiguity in an object’s presence and location. In this paper, we introduce DAAL (Dual Ambiguity Active Learning), a novel framework that quantifies and combines both epistemic and aleatoric ambiguity into a single, weighted score. Epistemic ambiguity measures the model’s indecision in assigning semantic labels, while aleatoric ambiguity assesses the model’s confidence in object presence and localization. By combining these, DAAL selects the most informative images for annotation, optimizing model performance under limited labeling budgets. Extensive experiments on popular benchmarks demonstrate that DAAL consistently outperforms traditional methods, achieving superior accuracy under the same limited labeling budget. This affirms its effectiveness in creating more efficient annotation workflows for object detection.
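As a rough illustration of the weighted-score idea in this abstract, the sketch below ranks unlabeled images by a combined ambiguity score. The abstract does not give the formulas, so everything here is an assumption: normalized class entropy stands in for epistemic ambiguity, distance of the detection confidence from 0.5 stands in for aleatoric ambiguity, and the `weight` parameter and function names are hypothetical. It is not the DAAL implementation.

```python
import math

def epistemic_ambiguity(class_probs):
    """Normalized Shannon entropy of one detection's class distribution (assumed proxy)."""
    k = len(class_probs)
    if k < 2:
        return 0.0
    entropy = -sum(p * math.log(p) for p in class_probs if p > 0.0)
    return entropy / math.log(k)

def aleatoric_ambiguity(confidence):
    """1.0 when the detection confidence is 0.5, falling to 0.0 at 0 or 1 (assumed proxy)."""
    return 1.0 - abs(2.0 * confidence - 1.0)

def image_score(detections, weight=0.5):
    """Weighted mean of both ambiguities over an image's detections.

    detections: list of (class_probs, confidence) pairs.
    weight: hypothetical trade-off between the two terms.
    """
    if not detections:
        return 0.0
    n = len(detections)
    epi = sum(epistemic_ambiguity(p) for p, _ in detections) / n
    ale = sum(aleatoric_ambiguity(c) for _, c in detections) / n
    return weight * epi + (1.0 - weight) * ale

def select_for_annotation(pool, budget, weight=0.5):
    """Return the ids of the `budget` highest-scoring images in the pool."""
    ranked = sorted(pool, key=lambda image_id: image_score(pool[image_id], weight),
                    reverse=True)
    return ranked[:budget]
```

Calling `select_for_annotation(pool, budget)` on a dict that maps image ids to detection lists returns the images the current model is least sure about under these assumed proxies.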

ALMUS: Enhancing Active Learning for Object Detection with Metric-Based Uncertainty Sampling

Duc Tai Phan, Nhut Minh Nguyen, Khang Phuc Nguyen, Tri Pham, and Duc Ngoc Minh Dang

In 25th Asia-Pacific Network Operations and Management Symposium 2025 (APNOMS’2025), Sep 2025

Object detection is critical in computer vision but often requires large amounts of labeled data for effective training. Active learning (AL) has emerged as a promising solution to reduce the annotation burden by selecting the most informative samples for labeling. However, existing AL methods for object detection primarily focus on uncertainty sampling, which may not effectively balance the dual challenges of classification and localization. In this study, we explore active learning for object detection, with the objective of optimizing model performance while substantially reducing the demand for annotated data. We propose Active Learning with Metric-based Uncertainty Sampling (ALMUS), a novel method designed for the object detection task. This approach prioritizes selecting images containing objects from categories where the model exhibits suboptimal performance, as determined by category-specific evaluation metrics. To balance the annotation budget across different object classes, we propose a dynamic allocation strategy that considers the difficulty of each class and the distribution of object instances within the dataset. This combination of strategies enables our method to address both classification and localization while still focusing on the rarest and most challenging classes. We conduct extensive experiments on the PASCAL VOC 2007 and 2012 datasets, demonstrating that our method outperforms several active learning baselines. Our results indicate that the proposed approach enhances model performance and accelerates convergence.
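The dynamic budget allocation mentioned in this abstract could look roughly like the sketch below. The weighting rule is an assumption made for illustration: each class is weighted by `(1 - AP)` divided by its smoothed instance count, so weak and rare classes receive more of the budget. The paper's actual formula is not stated in the abstract, and the function name `allocate_budget` is hypothetical.

```python
def allocate_budget(class_ap, class_counts, total_budget):
    """Split an integer annotation budget across classes.

    class_ap: per-class average precision in [0, 1] (the category-specific metric).
    class_counts: number of known object instances per class.
    Assumed rule: weight = (1 - AP) / (count + 1), then normalize.
    """
    weights = {
        name: (1.0 - ap) / (class_counts.get(name, 0) + 1)
        for name, ap in class_ap.items()
    }
    norm = sum(weights.values())
    if norm == 0:
        return {name: 0 for name in class_ap}

    exact = {name: total_budget * w / norm for name, w in weights.items()}
    budget = {name: int(x) for name, x in exact.items()}

    # Hand out the units lost to rounding, largest fractional part first.
    leftover = total_budget - sum(budget.values())
    by_fraction = sorted(exact, key=lambda n: exact[n] - budget[n], reverse=True)
    for name in by_fraction[:leftover]:
        budget[name] += 1
    return budget
```

For example, with a well-detected, frequent class and a poorly detected, rare one, most of the budget goes to the rare class, and the per-class budgets always sum to `total_budget`.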

2024

Improving Face Attendance Checking System with Ensemble Learning

Duc Tai Phan, Nam Phuong Tran, and Duc Ngoc Minh Dang

In 18th IEEE-RIVF International Conference on Computing and Communication Technologies, Oct 2024

In many industrialized countries, an efficient face attendance checking system is crucial for effective staff management. Current face attendance systems, however, still face several problems. The biggest challenge is variation in lighting and viewing angle, which can degrade the performance of these systems. In this paper, we present an innovative approach that combines the results of GoogLeNet, VGGFace, and FaceNet through ensemble learning to achieve higher accuracy. With this solution, newer systems can achieve better results. Our system encompasses image capture, face detection, database management, and feature extraction for face recognition. All related code is available at https://anonymous.4open.science/r/Face-Attendance-Checking-2640/
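One common way to combine the outputs of several recognition models is soft voting, sketched below. The abstract does not say which ensemble scheme the paper uses, so soft voting is an assumption here, and the function name, the optional per-model weights, and the dictionary layout are all hypothetical.

```python
def soft_vote(model_probs, weights=None):
    """Average per-identity probabilities from several models and pick the best.

    model_probs: dict mapping model name -> {identity: probability}.
    weights: optional dict of per-model weights (defaults to equal weights).
    Returns (best_identity, combined_probability).
    """
    names = list(model_probs)
    if weights is None:
        weights = {name: 1.0 for name in names}
    total_weight = sum(weights[name] for name in names)

    combined = {}
    for name in names:
        for identity, prob in model_probs[name].items():
            share = weights[name] * prob / total_weight
            combined[identity] = combined.get(identity, 0.0) + share

    best = max(combined, key=combined.get)
    return best, combined[best]
```

With equal weights, a single model that disagrees with the other two is outvoted whenever the averaged probability still favors the majority identity, which is one reason an ensemble can be more robust to lighting or angle changes that confuse one model.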