Harnessing unlabeled data to improve generalization of biometric gender and age classifiers

Nadimpalli, Aakash Varma
Reddy, Narsi
Ramachandran, Sreeraj
Rattani, Ajita
Conference paper
Deep learning, Training, Parameter estimation, Biometrics (access control), Computational modeling, Unlabeled data, Semi-supervised learning, Soft Biometrics
A. V. Nadimpalli, N. Reddy, S. Ramachandran and A. Rattani, "Harnessing Unlabeled Data to Improve Generalization of Biometric Gender and Age Classifiers," 2021 IEEE Symposium Series on Computational Intelligence (SSCI), 2021, pp. 1-7, doi: 10.1109/SSCI50451.2021.9660182.

With significant advances in deep learning, many computer vision applications have reached an inflection point. However, these deep learning models need large amounts of labeled data for model training and optimum parameter estimation. Limited labeled data results in overfitting and degrades generalization performance. Collecting and annotating large amounts of data is, moreover, very time-consuming and expensive. Further, due to privacy and security concerns, large amounts of labeled data cannot be collected for certain applications, such as those in the medical field. Self-training, co-training, and self-ensemble methods are three types of semi-supervised learning methods that can be used to exploit unlabeled data. In this paper, we propose a self-ensemble based deep learning model that, along with limited labeled data, harnesses unlabeled data to improve generalization performance. We evaluated the proposed self-ensemble based deep learning model for soft-biometric gender and age classification. Experimental evaluation on the CelebA and VISOB datasets suggests gender classification accuracies of 94.46% and 81.00%, respectively, using only 1000 labeled samples with the remaining 199k samples unlabeled for CelebA and, similarly, 1000 labeled samples with the remaining 107k samples unlabeled for VISOB. Comparative evaluation suggests a 5.74% and 8.47% improvement in the accuracy of the self-ensemble model over a supervised model trained on the entire CelebA and VISOB datasets, respectively. We also evaluated the proposed learning method for age-group prediction on the Adience dataset, where it outperformed the baseline supervised deep learning model with an exact accuracy of 55.55 ± 4.28, which is 3.92% more than the baseline.
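The abstract does not say which self-ensemble variant the paper uses, so the following is only a minimal sketch of one common form of the idea: a student model is trained with a supervised loss on the few labeled samples plus a consistency loss that pulls its predictions toward those of a teacher whose weights are an exponential moving average of the student's. Every name here (`self_ensemble_loss`, `ema_update`, `consistency_weight`, `decay`) is hypothetical and chosen for illustration; none is taken from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Supervised loss for one labeled sample."""
    return -math.log(max(probs[label], 1e-12))

def consistency(student_probs, teacher_probs):
    """Mean squared difference between student and teacher predictions."""
    diffs = [(s - t) ** 2 for s, t in zip(student_probs, teacher_probs)]
    return sum(diffs) / len(diffs)

def ema_update(teacher_weights, student_weights, decay=0.99):
    """Assumed teacher rule: exponential moving average of student weights."""
    return [decay * t + (1.0 - decay) * s
            for t, s in zip(teacher_weights, student_weights)]

def self_ensemble_loss(student_logits, teacher_logits, labels,
                       consistency_weight=1.0):
    """Combine both terms; labels[i] is a class index, or None if unlabeled."""
    if not labels:
        return 0.0
    sup, n_labeled, cons = 0.0, 0, 0.0
    for s_log, t_log, y in zip(student_logits, teacher_logits, labels):
        s, t = softmax(s_log), softmax(t_log)
        cons += consistency(s, t)       # uses every sample, labeled or not
        if y is not None:
            sup += cross_entropy(s, y)  # uses labeled samples only
            n_labeled += 1
    sup = sup / n_labeled if n_labeled else 0.0
    return sup + consistency_weight * cons / len(labels)
```

In a setting like the one described, a batch would mix a handful of labeled faces with many unlabeled ones; the unlabeled samples contribute only through the consistency term, which is how they can influence training without annotations.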

Preprint version available. Also available from DOI. Presented at 2021 IEEE Symposium Series on Computational Intelligence.
Book Title
2021 IEEE Symposium Series on Computational Intelligence (SSCI), 2021