Learning the truth by weakly connected agents in social networks using multi-armed bandits
Abstract
This article presents a study of a social network in which influential personalities collaborate positively among themselves to learn an underlying truth over time, but may have misled their followers into believing false information. Most existing works that study leader-follower relationships in a social network model the network as a graph and apply non-Bayesian learning to train the weakly connected agents to learn the truth. Although this approach is popular, it has the limitation of assuming that the truth, otherwise called the true state, is time-invariant. This is impractical in social networks, where streams of information are released and updated every second, making the true state arbitrarily time-varying. Thus, this article improves on existing work by introducing online reinforcement learning into the graph-theoretic framework. Specifically, a multi-armed bandit technique is applied: a multi-armed bandit algorithm is proposed and used to train the weakly connected agents to converge to the most stable state over time. The speed of convergence for weakly connected agents trained with the proposed algorithm is slower by 66% on average than that of strongly connected agents trained with the state-of-the-art algorithm, because weakly connected agents are difficult to train. However, the convergence speed of these weakly connected agents can be improved by approximately 50% on average by fine-tuning the learning rate of the proposed algorithm. Finally, the sublinearity of the regret bound for the proposed algorithm is compared to that of the state-of-the-art algorithm for strongly connected networks.
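The article's algorithm itself is not reproduced in this record. To illustrate the general idea it describes, the following is a minimal, hypothetical Python sketch that combines a standard epsilon-greedy multi-armed bandit with neighbor averaging on a toy weakly connected graph; the graph, the epsilon-greedy rule, and every name and parameter (N_AGENTS, EPSILON, the adjacency lists, the reward means) are illustrative assumptions, not the paper's method.

# Illustrative sketch only: a generic epsilon-greedy multi-armed bandit run by
# agents on a toy weakly connected graph. This is NOT the article's proposed
# algorithm; all names and values below are hypothetical choices.
import numpy as np

rng = np.random.default_rng(0)

N_AGENTS, N_ARMS, T = 6, 3, 2000  # agents, candidate states ("arms"), rounds
EPSILON = 0.1                     # exploration rate (a tunable analogue of the learning rate)

# Weakly connected toy graph: influencers {0,1,2} exchange estimates among
# themselves; followers {3,4,5} only listen to one influencer (one-way edges),
# so the graph is weakly but not strongly connected.
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1],
             3: [0], 4: [1], 5: [2]}

# Arm 1 plays the role of the "most stable state": it has the highest mean reward.
true_means = np.array([0.3, 0.8, 0.5])

Q = np.zeros((N_AGENTS, N_ARMS))       # per-agent value estimates
counts = np.zeros((N_AGENTS, N_ARMS))  # per-agent pull counts

for t in range(T):
    for i in range(N_AGENTS):
        # Epsilon-greedy action selection: explore with probability EPSILON.
        if rng.random() < EPSILON:
            a = int(rng.integers(N_ARMS))
        else:
            a = int(np.argmax(Q[i]))
        r = rng.normal(true_means[a], 0.1)       # noisy reward from the chosen arm
        counts[i, a] += 1
        Q[i, a] += (r - Q[i, a]) / counts[i, a]  # incremental sample-mean update
    # Social step: each agent averages its estimates with its in-neighbors'.
    Q_new = Q.copy()
    for i, nbrs in neighbors.items():
        Q_new[i] = np.mean([Q[j] for j in [i] + nbrs], axis=0)
    Q = Q_new

print("Arm chosen by each agent:", np.argmax(Q, axis=1))  # all agents converge to arm 1

Under these toy assumptions, the bandit updates pull each agent's estimates toward the true reward means while the averaging step propagates the influencers' estimates to their followers, so even the follower agents, which never hear from one another, settle on the most stable state; raising EPSILON speeds exploration at the cost of noisier final estimates, which mirrors the learning-rate trade-off the abstract mentions.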