Towards Dynamic Self-Training for Scalable Semi-Supervised Learning on Graphs

Fadi Dornaika, Zoulfikar Ibrahim, Jinan Charafeddine, Alireza Bosaghzadeh

January 2025

Abstract

In graph-based semi-supervised learning (GSSL), traditional methods often struggle to handle labeled samples effectively and to scale to large datasets. Self-training is commonly used to increase the supervision information in semi-supervised learning, but mainly on datasets of moderate size, whereas anchors have been adopted for large datasets. We propose a GSSL framework that combines a new self-training principle tailored to very large datasets with an advanced method for automatic graph construction using anchors. Our approach generates labels for random batches of unlabeled samples and then incorporates these predictions into the training set to improve the model’s accuracy. Pseudo-labeling, a specific instance of self-training, assigns pseudo-labels to the most confidently predicted unlabeled examples and treats them as ground truth during training. By constructing anchor-to-anchor affinity graphs that incorporate both feature and label information, our method enables robust learning on large-scale datasets. Comprehensive experiments on diverse large datasets demonstrate that the approach achieves scalable and reliable semi-supervised learning. These findings advance the field of GSSL and have implications for applications across different domains. The method not only addresses the scalability issue but also effectively integrates both labeled and pseudo-labeled data, thereby enhancing the overall learning process.
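The batch-wise pseudo-labeling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the anchor-to-anchor graph model is replaced here by a simple nearest-centroid classifier, and the names `centroid_proba` and `self_train` and the parameters `batch_size`, `threshold`, and `rounds` are all hypothetical choices made for illustration only.

```python
import numpy as np

def centroid_proba(X_train, y_train, X, n_classes):
    """Stand-in classifier (an assumption, not the paper's graph model):
    softmax over negative squared distances to the class centroids."""
    centroids = np.stack(
        [X_train[y_train == c].mean(axis=0) for c in range(n_classes)]
    )
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    logits = -d2
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def self_train(X_lab, y_lab, X_unl, n_classes,
               batch_size=50, threshold=0.9, rounds=5, seed=0):
    """Grow the training set with confident pseudo-labels, one random
    batch of unlabeled samples per round (all defaults are illustrative)."""
    rng = np.random.default_rng(seed)
    X_train, y_train = X_lab.copy(), y_lab.copy()
    remaining = np.arange(len(X_unl))  # indices not yet pseudo-labeled
    for _ in range(rounds):
        if remaining.size == 0:
            break
        # Draw a random batch of still-unlabeled samples.
        batch = rng.choice(remaining,
                           size=min(batch_size, remaining.size),
                           replace=False)
        proba = centroid_proba(X_train, y_train, X_unl[batch], n_classes)
        # Keep only predictions whose top class probability is high enough.
        keep = proba.max(axis=1) >= threshold
        accepted = batch[keep]
        # Treat accepted predictions as ground truth from now on.
        X_train = np.vstack([X_train, X_unl[accepted]])
        y_train = np.concatenate([y_train, proba[keep].argmax(axis=1)])
        remaining = np.setdiff1d(remaining, accepted)
    return X_train, y_train
```

Samples that fall below the confidence threshold stay in the unlabeled pool and may be drawn again in a later round, once the stand-in classifier has been updated with more pseudo-labeled data.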

Type

Journal article

Publication

Neurocomputing

Fadi Dornaika

Ikerbasque Research Professor

Zoulfikar Ibrahim

Professor and Software Engineer