Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


HiTIN: Hierarchy-aware Tree Isomorphism Network for Hierarchical Text Classification

He Zhu, Chong Zhang, JunJie Huang, Junran Wu, Ke Xu

2023-05-24 · Text Classification · Multi-Label Classification · Multi-Label Text Classification · Hierarchical Multi-label Classification

Paper · PDF · Code (official)

Abstract

Hierarchical text classification (HTC) is a challenging subtask of multi-label classification, as its labels form a complex hierarchical structure. Existing dual-encoder methods for HTC achieve only modest performance gains at the cost of large memory overheads, and their structure encoders rely heavily on domain knowledge. Motivated by these observations, we investigate the feasibility of a memory-friendly model with strong generalization capability that can boost HTC performance without prior statistics or label semantics. In this paper, we propose the Hierarchy-aware Tree Isomorphism Network (HiTIN), which enhances text representations using only the syntactic information of the label hierarchy. Specifically, we convert the label hierarchy into an unweighted tree structure, termed a coding tree, under the guidance of structural entropy. We then design a structure encoder that incorporates hierarchy-aware information from the coding tree into the text representations. Apart from the text encoder, HiTIN contains only a few multi-layer perceptrons and linear transformations, which greatly reduces memory usage. We conduct experiments on three commonly used datasets, and the results demonstrate that HiTIN achieves better test performance with less memory consumption than state-of-the-art (SOTA) methods.
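To give a feel for the kind of hierarchy-aware computation the abstract describes, here is a minimal, illustrative sketch of bottom-up aggregation over a tree of label nodes. The tree layout, feature vectors, and the element-wise-sum update rule are assumptions made purely for illustration; this is not HiTIN's actual structure encoder, which operates on a coding tree derived via structural entropy and uses learned MLP transforms.

```python
# Illustrative sketch only: propagate child features up a label tree and
# combine them at each internal node. The combine step here is a plain
# element-wise sum; in a learned model it would be an MLP or similar.

def aggregate_bottom_up(children, features, node):
    """children: dict node -> list of child nodes;
    features: dict leaf node -> feature vector (list of floats).
    Returns a feature vector for `node`."""
    kids = children.get(node, [])
    if not kids:                       # leaf: return its own feature
        return features[node]
    combined = None
    for child in kids:                 # internal node: merge child vectors
        vec = aggregate_bottom_up(children, features, child)
        if combined is None:
            combined = list(vec)
        else:
            combined = [a + b for a, b in zip(combined, vec)]
    return combined

# Hypothetical two-level label hierarchy with 2-dimensional features:
children = {"root": ["a", "b"], "a": ["a1", "a2"]}
features = {"a1": [1.0, 0.0], "a2": [0.0, 2.0], "b": [3.0, 1.0]}
root_repr = aggregate_bottom_up(children, features, "root")  # [4.0, 3.0]
```

The point of the sketch is the control flow: information flows from leaves toward the root level by level, so each internal node's representation summarizes its entire subtree.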

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Multi-Label Classification | RCV1-v2 | Macro F1 | 69.95 | HiTIN+BERT |
| Multi-Label Classification | RCV1-v2 | Micro F1 | 86.71 | HiTIN+BERT |
| Multi-Label Classification | RCV1-v2 | Macro F1 | 64.37 | HiTIN |
| Multi-Label Classification | RCV1-v2 | Micro F1 | 84.81 | HiTIN |
| Multi-Label Classification | New York Times Annotated Corpus | Macro F1 | 69.31 | HiTIN+BERT |
| Multi-Label Classification | New York Times Annotated Corpus | Micro F1 | 79.65 | HiTIN+BERT |
| Multi-Label Classification | New York Times Annotated Corpus | Macro F1 | 61.09 | HiTIN |
| Multi-Label Classification | New York Times Annotated Corpus | Micro F1 | 75.13 | HiTIN |
| Multi-Label Classification | WOS | Macro F1 | 81.57 | HiTIN+BERT |
| Multi-Label Classification | WOS | Micro F1 | 87.19 | HiTIN+BERT |
| Multi-Label Classification | WOS | Macro F1 | 81.11 | HiTIN |
| Multi-Label Classification | WOS | Micro F1 | 86.66 | HiTIN |
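The table reports both micro- and macro-averaged F1, the standard pair of metrics for multi-label classification: macro F1 averages the per-label F1 scores (weighting rare labels equally), while micro F1 pools true positives, false positives, and false negatives across all labels. The toy data below is hypothetical and unrelated to the paper's datasets.

```python
# Compute micro- and macro-averaged F1 for multi-label predictions,
# where each sample's labels are given as a set of label indices.

def multilabel_f1(y_true, y_pred, num_labels):
    tp = [0] * num_labels  # per-label true positives
    fp = [0] * num_labels  # per-label false positives
    fn = [0] * num_labels  # per-label false negatives
    for truth, pred in zip(y_true, y_pred):
        for label in range(num_labels):
            if label in pred and label in truth:
                tp[label] += 1
            elif label in pred:
                fp[label] += 1
            elif label in truth:
                fn[label] += 1

    def f1(t, p, n):
        return 2 * t / (2 * t + p + n) if (2 * t + p + n) else 0.0

    # Macro: mean of per-label F1; micro: F1 over pooled counts.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in range(num_labels)) / num_labels
    micro = f1(sum(tp), sum(fp), sum(fn))
    return micro, macro

# Two samples, three labels (toy example):
y_true = [{0, 1}, {2}]
y_pred = [{0}, {1, 2}]
micro, macro = multilabel_f1(y_true, y_pred, 3)  # both equal 2/3 here
```

The gap between the two averages in the table (e.g. 86.71 micro vs 69.95 macro on RCV1-v2) is typical of hierarchies with many rare leaf labels, which drag the macro average down.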

Related Papers

- Making Language Model a Hierarchical Classifier and Generator (2025-07-17)
- GNN-CNN: An Efficient Hybrid Model of Convolutional and Graph Neural Networks for Text Representation (2025-07-10)
- The Trilemma of Truth in Large Language Models (2025-06-30)
- Robustness of Misinformation Classification Systems to Adversarial Examples Through BeamAttack (2025-06-30)
- Perspectives in Play: A Multi-Perspective Approach for More Inclusive NLP Systems (2025-06-25)
- Can Generated Images Serve as a Viable Modality for Text-Centric Multimodal Learning? (2025-06-21)
- SHREC and PHEONA: Using Large Language Models to Advance Next-Generation Computational Phenotyping (2025-06-19)
- Privacy-Preserving Chest X-ray Classification in Latent Space with Homomorphically Encrypted Neural Inference (2025-06-18)