The Heterophilic Snowflake Hypothesis: Training and Empowering GNNs for Heterophilic Graphs

Kun Wang, Guibin Zhang, Xinnan Zhang, Junfeng Fang, Xun Wu, Guohao Li, Shirui Pan, Wei Huang, Yuxuan Liang

2024-06-18 | Node Classification

Abstract

Graph Neural Networks (GNNs) have become pivotal tools for a range of graph-based learning tasks. Most current GNN architectures assume homophily, whether explicitly or implicitly. Although this assumption is widely adopted, it does not hold universally, and learning effectiveness can suffer where it fails. In this paper, for the first time, we transfer the prevailing concept of "one node one receptive field" to the heterophilic graph. By constructing a proxy label predictor, we give each node a latent prediction distribution, which helps connected nodes decide whether they should aggregate their associated neighbors. Ultimately, every node can have its own unique aggregation hop and pattern, much like each snowflake is unique and possesses its own characteristics. Based on these observations, we introduce the Heterophily Snowflake Hypothesis and provide an effective solution to guide and facilitate research on heterophilic graphs and beyond. We conduct comprehensive experiments including (1) main results on 10 graphs with varying heterophily ratios across 10 backbones; (2) scalability on various deep GNN backbones (SGC, JKNet, etc.) across a wide range of depths (2, 4, 6, 8, 16, and 32 layers); (3) comparison with the conventional snowflake hypothesis; (4) efficiency comparison with existing graph pruning algorithms. Our observations show that our framework acts as a versatile operator for diverse tasks. It can be integrated into various GNN frameworks, boosting performance in deep networks and offering an explainable approach to choosing the optimal network depth. The source code is available at https://github.com/bingreeky/HeteroSnoH.
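
The idea of letting a proxy prediction distribution decide whether a node aggregates a neighbor can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that); it is a toy example under stated assumptions: the proxy distributions are given as plain probability lists, agreement between two nodes is measured by a dot product, and an edge is kept when the agreement reaches a fixed threshold. The function names `edge_homophily` and `filter_edges_by_proxy`, the agreement score, and the threshold value are all hypothetical choices made for illustration. The edge homophily ratio used here is the common definition: the fraction of edges whose endpoints share a label.

```python
from typing import List, Sequence, Tuple

Edge = Tuple[int, int]


def edge_homophily(edges: Sequence[Edge], labels: Sequence[int]) -> float:
    """Fraction of edges whose two endpoints carry the same label."""
    if not edges:
        return 0.0
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


def filter_edges_by_proxy(
    edges: Sequence[Edge],
    proxy: Sequence[Sequence[float]],
    threshold: float = 0.5,
) -> List[Edge]:
    """Keep an edge only if the endpoints' proxy distributions agree.

    Assumption: agreement is the dot product of the two class
    distributions; the paper may use a different criterion.
    """
    kept = []
    for u, v in edges:
        agreement = sum(p * q for p, q in zip(proxy[u], proxy[v]))
        if agreement >= threshold:
            kept.append((u, v))
    return kept


# Toy graph: 4 nodes, 2 classes, half of the edges are heterophilic.
labels = [0, 0, 1, 1]
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]

# Hypothetical output of a proxy label predictor (one distribution per node).
proxy = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]

kept = filter_edges_by_proxy(edges, proxy, threshold=0.5)
print(edge_homophily(edges, labels))  # homophily of the original graph
print(kept)                           # edges retained for aggregation
print(edge_homophily(kept, labels))   # homophily after filtering
```

In this toy case the filter drops the two edges that connect nodes with disagreeing proxy predictions. The sketch gates edges for a single aggregation step only; the per-node aggregation hop described in the abstract would require applying a decision like this at every layer.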

Results

Task	Dataset	Metric	Value	Model
Node Classification	Wisconsin	Accuracy	88.77	MGNN + Hetero-S (6 layers)
Node Classification	Squirrel	Accuracy	57.83	JKNet + Hetero-S (8 layers)
Node Classification	Texas	Accuracy	93.09	MGNN + Hetero-S (8 layers)
Node Classification	Cornell	Accuracy	68.18	MGNN + Hetero-S (4 layers)
Node Classification	Chameleon	Accuracy	70.18	JKNet + Hetero-S (8 layers)
Node Classification	Actor	Accuracy	35.99	MGNN + Hetero-S (4 layers)
