Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Methods

100 machine learning methods and techniques


MNMF

Modularity preserving NMF

Graphs · Introduced 2000 · 3 papers

S-GCN

Spherical Graph Convolutional Network

Graphs · Introduced 2000 · 3 papers

InfoGraph

Graphs · Introduced 2000 · 3 papers

BiGG

BiGG is an autoregressive model for generative modeling of sparse graphs. It exploits sparsity to avoid generating the full adjacency matrix, reducing the graph generation time complexity to O((n + m) log n). Furthermore, during training this autoregressive model can be parallelized with O(log n) synchronization stages, which makes it much more efficient than other autoregressive models that require Ω(n). The approach is based on three key elements: (1) an O(log n) process for generating each edge using a binary tree data structure, inspired by R-MAT; (2) a tree-structured autoregressive model for generating the set of edges associated with each node; and (3) an autoregressive model defined over the sequence of nodes.

Graphs · Introduced 2000 · 3 papers

Cluster-GCN

Cluster-GCN is a novel GCN algorithm suitable for SGD-based training that exploits the graph clustering structure. It works as follows: at each step, it samples a block of nodes associated with a dense subgraph identified by a graph clustering algorithm, and restricts the neighborhood search to within this subgraph. This simple but effective strategy significantly improves memory and computational efficiency while achieving test accuracy comparable to previous algorithms. Description from: Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks
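The sampling strategy can be sketched in a few lines of NumPy. This is an illustrative simplification, not the paper's implementation: dense arrays, a single unweighted propagation in place of a trained GCN, and a hypothetical `cluster_gcn_step` helper.

```python
import numpy as np

def cluster_gcn_step(adj, feats, clusters, rng):
    """One Cluster-GCN training step (sketch): sample one cluster and
    restrict propagation to its induced subgraph."""
    # Pick one precomputed cluster (block of nodes) at random.
    nodes = clusters[rng.integers(len(clusters))]
    sub_adj = adj[np.ix_(nodes, nodes)]   # induced subgraph: neighborhood search stays inside
    sub_x = feats[nodes]
    # Simplified GCN propagation on the subgraph: X' = D^-1 (A + I) X
    a_hat = sub_adj + np.eye(len(nodes))
    deg = a_hat.sum(axis=1, keepdims=True)
    return nodes, (a_hat / deg) @ sub_x

rng = np.random.default_rng(0)
adj = np.array([[0,1,0,0],[1,0,0,0],[0,0,0,1],[0,0,1,0]], dtype=float)
feats = np.arange(8, dtype=float).reshape(4, 2)
clusters = [np.array([0, 1]), np.array([2, 3])]   # e.g. from METIS
nodes, out = cluster_gcn_step(adj, feats, clusters, rng)
print(nodes, out.shape)
```

The memory win comes from `sub_adj`: only a cluster-sized block of the adjacency is ever materialized per step.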

Graphs · Introduced 2000 · 3 papers

SDNE

Structural Deep Network Embedding

Graphs · Introduced 2000 · 3 papers

NetMF

Network Embedding as Matrix Factorization

Graphs · Introduced 2000 · 3 papers

CP-N3

Canonical Tensor Decomposition with N3 Regularizer

Canonical Tensor Decomposition, trained with N3 regularizer

Graphs · Introduced 2000 · 3 papers

RREA

Relational Reflection Entity Alignment

Graphs · Introduced 2000 · 2 papers

GCNFN

Graph Convolutional Networks for Fake News Detection

Social media are nowadays one of the main news sources for millions of people around the globe due to their low cost, easy access and rapid dissemination. This however comes at the cost of dubious trustworthiness and significant risk of exposure to 'fake news', intentionally written to mislead the readers. Automatically detecting fake news poses challenges that defy existing content-based analysis approaches. One of the main reasons is that often the interpretation of the news requires the knowledge of political or social context or 'common sense', which current NLP algorithms are still missing. Recent studies have shown that fake and real news spread differently on social media, forming propagation patterns that could be harnessed for the automatic fake news detection. Propagation-based approaches have multiple advantages compared to their content-based counterparts, among which is language independence and better resilience to adversarial attacks. In this paper we show a novel automatic fake news detection model based on geometric deep learning. The underlying core algorithms are a generalization of classical CNNs to graphs, allowing the fusion of heterogeneous data such as content, user profile and activity, social graph, and news propagation. Our model was trained and tested on news stories, verified by professional fact-checking organizations, that were spread on Twitter. Our experiments indicate that social network structure and propagation are important features allowing highly accurate (92.7% ROC AUC) fake news detection. Second, we observe that fake news can be reliably detected at an early stage, after just a few hours of propagation. Third, we test the aging of our model on training and testing data separated in time. Our results point to the promise of propagation-based approaches for fake news detection as an alternative or complementary strategy to content-based approaches.

Graphs · Introduced 2000 · 2 papers

MTransE

Graphs · Introduced 2000 · 2 papers

MinCutPool

MinCut Pooling

MinCutPool is a trainable pooling operator for graphs that learns to map nodes into clusters. The method is trained to approximate the minimum K-cut of the graph to ensure that the clusters are balanced, while also jointly optimizing the objective of the task at hand.
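The two auxiliary losses (a relaxed min-cut term plus an orthogonality term that keeps clusters balanced) can be sketched with NumPy. The formulas follow the MinCutPool paper, but the helper name and toy graph are illustrative, and the raw adjacency is used for simplicity where the paper uses the symmetrically normalized adjacency.

```python
import numpy as np

def mincut_losses(adj, S):
    """MinCutPool auxiliary losses (sketch).
    S: soft cluster assignment matrix (n x K), rows sum to 1 (softmax output)."""
    D = np.diag(adj.sum(axis=1))
    # Relaxed min-cut: -Tr(S^T A S) / Tr(S^T D S); minimized by balanced, well-separated clusters.
    cut = -np.trace(S.T @ adj @ S) / np.trace(S.T @ D @ S)
    # Orthogonality: push S^T S toward a scaled identity so clusters stay balanced.
    StS = S.T @ S
    K = S.shape[1]
    ortho = np.linalg.norm(StS / np.linalg.norm(StS) - np.eye(K) / np.sqrt(K))
    return cut, ortho

# Two disconnected 2-cliques, perfectly assigned to 2 balanced clusters:
adj = np.array([[0,1,0,0],[1,0,0,0],[0,0,0,1],[0,0,1,0]], float)
S = np.array([[1,0],[1,0],[0,1],[0,1]], float)
cut, ortho = mincut_losses(adj, S)
print(cut, ortho)   # cut reaches its minimum of -1; orthogonality penalty is 0
```

In the full model these two terms are added to the downstream task loss, so the pooling is jointly optimized with the task.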

Graphs · Introduced 2000 · 2 papers

MultiKE

Multi-view Knowledge Graph Embedding

Graphs · Introduced 2000 · 2 papers

RE-NET

Recurrent Event Network

Recurrent Event Network (RE-NET) is an autoregressive architecture for predicting future interactions. The occurrence of a fact (event) is modeled as a probability distribution conditioned on temporal sequences of past knowledge graphs. RE-NET employs a recurrent event encoder to encode past facts and uses a neighborhood aggregator to model the connection of facts at the same timestamp. Future facts can then be inferred in a sequential manner based on the two modules.

Graphs · Introduced 2000 · 2 papers

GaAN

Gated Attention Networks

Gated Attention Networks (GaAN) is a new architecture for learning on graphs. Unlike the traditional multi-head attention mechanism, which equally consumes all attention heads, GaAN uses a convolutional sub-network to control each attention head's importance. Description from: GaAN: Gated Attention Networks for Learning on Large and Spatiotemporal Graphs
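The gating idea reduces to one line once the per-head gates are given: instead of concatenating the heads with equal weight, each head is scaled by a learned scalar in [0, 1] (computed in GaAN by a convolutional sub-network over the neighborhood, omitted here). A minimal sketch with hypothetical names:

```python
import numpy as np

def gated_heads(head_outputs, gates):
    """Combine multi-head attention outputs with per-head gates (sketch).
    head_outputs: (K, d) per-node head results; gates: (K,) values in [0, 1]."""
    return (gates[:, None] * head_outputs).reshape(-1)   # gated concatenation

heads = np.ones((3, 2))                # 3 heads, 2-dim output each
gates = np.array([1.0, 0.5, 0.0])      # head 2 is fully suppressed
print(gated_heads(heads, gates))       # [1. 1. 0.5 0.5 0. 0.]
```

A gate of zero switches a head off entirely, which standard multi-head attention cannot do.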

Graphs · Introduced 2000 · 2 papers

GNNCL

Graph Neural Networks with Continual Learning

Although significant effort has been applied to fact-checking, the prevalence of fake news over social media, which has profound impact on justice, public trust and our society, remains a serious problem. In this work, we focus on propagation-based fake news detection, as recent studies have demonstrated that fake news and real news spread differently online. Specifically, considering the capability of graph neural networks (GNNs) in dealing with non-Euclidean data, we use GNNs to differentiate between the propagation patterns of fake and real news on social media. In particular, we concentrate on two questions: (1) Without relying on any text information, e.g., tweet content, replies and user descriptions, how accurately can GNNs identify fake news? Machine learning models are known to be vulnerable to adversarial attacks, and avoiding the dependence on text-based features can make the model less susceptible to the manipulation of advanced fake news fabricators. (2) How to deal with new, unseen data? In other words, how does a GNN trained on a given dataset perform on a new and potentially vastly different dataset? If it achieves unsatisfactory performance, how do we solve the problem without re-training the model on the entire data from scratch? We study the above questions on two datasets with thousands of labelled news items, and our results show that: (1) GNNs can achieve comparable or superior performance without any text information to state-of-the-art methods. (2) GNNs trained on a given dataset may perform poorly on new, unseen data, and direct incremental training cannot solve the problem---this issue has not been addressed in the previous work that applies GNNs for fake news detection. In order to solve the problem, we propose a method that achieves balanced performance on both existing and new datasets, by using techniques from continual learning to train GNNs incrementally.

Graphs · Introduced 2000 · 2 papers

AD-GCL

Adversarial Graph Contrastive Learning

Graphs · Introduced 2000 · 2 papers

WYS

Watch Your Step

Graphs · Introduced 2000 · 2 papers

GFSA

Graph Finite-State Automaton

Graph Finite-State Automaton, or GFSA, is a differentiable layer for learning graph structure that adds a new edge type (expressed as a weighted adjacency matrix) to a base graph. This layer can be trained end-to-end to add derived relationships (edges) to arbitrary graph-structured data based on performance on a downstream task.

Graphs · Introduced 2000 · 2 papers

ED-GNN

Medical Entity Disambiguation using Graph Neural Networks

Graphs · Introduced 2000 · 1 paper

CP-N3-RP

CP with N3 Regularizer and Relation Prediction

CP with N3 Regularizer and Relation Prediction

Graphs · Introduced 2000 · 1 paper

CT-Layer

Commute Times Layer

TL;DR: CT-Layer is a GNN layer that rewires a graph in an inductive, parameter-free way according to the commute-time distance (equivalently, the effective resistance), by learning a differentiable way to compute the commute-time (CT) embedding of the graph.

Summary: CT-Layer learns the commute-time distance between nodes (i.e., the effective-resistance distance) differentiably, instead of via the usual spectral computation, and without hyperparameters, which is not the case for the heat-kernel approximation. Framing the computation as an optimization problem inside a GNN yields a new layer that learns how to rewire a given graph in an optimal, inductive way. CT-Layer also learns commute-time embeddings and can then compute them for any graph inductively. Because the CT embedding is just the Laplacian eigenvectors rescaled by their eigenvalues, CT-Layer also learns how to compute the spectrum of the graph Laplacian differentiably; the learned embedding must therefore satisfy orthogonality and normality. Finally, recent work has established equivalences between commute-time distance and curvature (which is non-differentiable), so CT-Layer can also be seen as a differentiable version of curvature-based rewiring. This is a quick overview of the layer; see the paper for a detailed explanation.

Downsides of the spectral CT embedding: In the literature (prior to this method), the CT embedding is computed spectrally, or approximated with the heat kernel, which depends strongly on its time hyperparameter; neither route admits differentiable methods. The CT distance is the Euclidean distance between the embedding vectors, and the spectral form of the embedding is built from the eigenvectors of the graph Laplacian. This embedding and its distances capture desirable properties of the graph, such as an understanding of its structure and a spectrum-based embedding that minimizes Dirichlet energies. However, the spectral computation is not differentiable.

CT-Layer as an optimization problem (differentiable, learnable, and inductive): Since the CT embedding minimizes Dirichlet energies subject to the embedding being orthogonal and normalized, the problem can be formulated as constraining neighboring nodes to have similar embeddings under those constraints. Given the matrix encoding the node features after any message-passing (MP) layer, CT-Layer learns the CT embedding while being optimized according to this loss. The result is a resistance diffusion (the Hadamard product of the pairwise resistance distances with the adjacency matrix) that provides the subsequent MP layer with a learned convolution matrix. The pairwise Euclidean distances of the learned embeddings are the commute-time (resistance) distances, so once trained, the layer can compute the commute-time embedding of a new, unseen graph and rewire it in a principled way based on the commute-time distance.

Preservation of structure: Does this rewiring preserve the original structure? Consider a sampling algorithm that draws the edges of a graph G with probability proportional to their effective resistance (i.e., commute times). Sampling O(n log n / ε²) edges in this way yields a graph G' whose Laplacian spectrally approximates that of G within a factor of (1 ± ε).
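For intuition, the quantity CT-Layer learns to approximate differentiably can be computed non-differentiably with NumPy via the Laplacian pseudoinverse (the function name is illustrative):

```python
import numpy as np

def commute_times(adj):
    """Commute-time distances (scaled effective resistances) of a graph,
    computed the classical, non-differentiable way via the Laplacian pseudoinverse."""
    deg = adj.sum(axis=1)
    L = np.diag(deg) - adj
    Lp = np.linalg.pinv(L)                  # Moore-Penrose pseudoinverse of the Laplacian
    vol = deg.sum()                         # vol(G) = sum of degrees
    d = np.diag(Lp)
    R = d[:, None] + d[None, :] - 2 * Lp    # effective resistance R(u, v)
    return vol * R                          # CT(u, v) = vol(G) * R(u, v)

# Path graph 0-1-2: the resistance between the endpoints is 2
# (two unit resistors in series), and vol(G) = 4, so CT(0, 2) = 8.
adj = np.array([[0,1,0],[1,0,1],[0,1,0]], float)
CT = commute_times(adj)
print(CT[0, 2])
```

The pseudoinverse (or an eigendecomposition) is what makes this route non-differentiable and expensive; CT-Layer replaces it with a learned embedding whose pairwise distances approximate `CT`.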

Graphs · Introduced 2000 · 1 paper

WEGL

Wasserstein Embedding for Graph Learning


Graphs · Introduced 2000 · 1 paper

DyGED

Dynamic Graph Event Detection

Graphs · Introduced 2000 · 1 paper

KGRefiner

Knowledge Graph Refiner

Graphs · Introduced 2000 · 1 paper

PinvGCN

Pseudoinverse Graph Convolutional Network

A GCN method targeted at the unique spectral properties of dense graphs and hypergraphs, enabled by efficient numerical linear algebra.

Graphs · Introduced 2000 · 1 paper

CP N3

CP with N3 Regularizer

CP with N3 Regularizer

Graphs · Introduced 2000 · 1 paper

Symbolic Deep Learning

This is a general approach for converting a neural network into an analytic equation. The technique works as follows: 1. Encourage sparse latent representations. 2. Apply symbolic regression to approximate the transformations between the input, latent, and output layers. 3. Compose the symbolic expressions. In the paper, we show that correct known equations, including force laws and Hamiltonians, can be extracted from the neural network. We then apply our method to a non-trivial cosmology example (a detailed dark matter simulation) and discover a new analytic formula which can predict the concentration of dark matter from the mass distribution of nearby cosmic structures. The symbolic expressions extracted from the GNN using our technique also generalized to out-of-distribution data better than the GNN itself. Our approach offers alternative directions for interpreting neural networks and discovering novel physical principles from the representations they learn.

Graphs · Introduced 2000 · 1 paper

SETSe

Strain Elevation Tension Spring embedding

SETSe is a deterministic, physics-based graph embedding algorithm for weighted, feature-rich networks. It treats each edge as a spring and each node as a bead whose movement is constrained by the graph adjacency matrix, so that the nodes move in parallel planes enforcing a minimum distance between neighboring nodes. The node features act as forces moving the nodes up and down. The network converges to the embedded state when the force produced by each node is equal and opposite to the sum of the forces exerted by its edges, creating a net force of zero. SETSe has no conventional loss function and does not attempt to place similar nodes close to each other.

Graphs · Introduced 2000 · 1 paper

Canonical Partition

Canonical partition crops the index-restricted d-hop neighborhood around the center node v_c from the target graph G_t, where D(G, v_i, v_j) denotes the shortest-path distance between v_i and v_j on G:

P(G_t, v_c, d) = G_c,  s.t.  G_c ⊆ G_t,  V_c = { v_i ∈ V_t | D(G_t, v_i, v_c) ≤ d, v_i ≤ v_c }

The graph obtained by canonical partition is called the canonical neighborhood. Canonical neighborhoods can correctly substitute the target graph in canonical count: the subgraph count of the query in the target equals the sum of the canonical counts of the query in the canonical neighborhoods over all target nodes, for any d at least the diameter of the query:

C(G_q, G_t) = Σ_{v_c ∈ V_t} C_c(G_q, P(G_t, v_c, d), v_c),  where d ≥ max_{v_i, v_j ∈ V_q} D(G_q, v_i, v_j)
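The partition itself is a breadth-first search with an index restriction. A minimal sketch (the adjacency-dict representation and the function name are illustrative):

```python
from collections import deque

def canonical_partition(adj, vc, d):
    """Crop the index-restricted d-hop neighborhood around center node vc (sketch):
    keep nodes within shortest-path distance d of vc whose index is <= vc."""
    dist = {vc: 0}
    q = deque([vc])
    while q:                        # plain BFS up to depth d
        u = q.popleft()
        if dist[u] == d:
            continue
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                q.append(w)
    return sorted(v for v in dist if v <= vc)   # index restriction v_i <= v_c

# Path graph 0-1-2-3:
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(canonical_partition(adj, 2, 1))   # 1-hop ball around 2 is {1, 2, 3}; index <= 2 keeps [1, 2]
```

The index restriction is what makes the per-node counts sum up to the global count without double-counting.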

Graphs · Introduced 2000 · 1 paper

AdaGPR

AdaGPR is an adaptive, layer-wise graph convolution model. It applies adaptive generalized PageRanks at each layer of a GCNII model, learning to predict the generalized PageRank coefficients using sparse solvers.

Graphs · Introduced 2000 · 1 paper

MXMNet

Multiplex Molecular Graph Neural Network

The Multiplex Molecular Graph Neural Network (MXMNet) is an approach to representation learning for molecules. The molecular interactions are divided into two categories, local and global, and a two-layer multiplex graph is constructed for each molecule: the local layer contains only the local connections, which mainly capture covalent interactions, while the global layer contains the global connections, which cover non-covalent interactions. MXMNet uses the Multiplex Molecular (MXM) module, which consists of a novel angle-aware message passing operated on the local layer and an efficient message passing operated on the global layer.

Graphs · Introduced 2000 · 1 paper

HMGNN

Heterogeneous Molecular Graph Neural Network

As they carry great potential for modeling complex interactions, graph neural network (GNN)-based methods have been widely used to predict quantum mechanical properties of molecules. Most of the existing methods treat molecules as molecular graphs in which atoms are modeled as nodes. They characterize each atom's chemical environment by modeling its pairwise interactions with other atoms in the molecule. Although these methods achieve a great success, limited amount of works explicitly take many-body interactions, i.e., interactions between three and more atoms, into consideration. In this paper, we introduce a novel graph representation of molecules, heterogeneous molecular graph (HMG) in which nodes and edges are of various types, to model many-body interactions. HMGs have the potential to carry complex geometric information. To leverage the rich information stored in HMGs for chemical prediction problems, we build heterogeneous molecular graph neural networks (HMGNN) on the basis of a neural message passing scheme. HMGNN incorporates global molecule representations and an attention mechanism into the prediction process. The predictions of HMGNN are invariant to translation and rotation of atom coordinates, and permutation of atom indices. Our model achieves state-of-the-art performance in 9 out of 12 tasks on the QM9 dataset.

Graphs · Introduced 2000 · 1 paper

ComplEx-N3-RP

ComplEx with N3 Regularizer and Relation Prediction Objective

ComplEx model trained with a nuclear-norm (N3) regularizer; a relation prediction objective is added on top of the commonly used 1vsAll objective.
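The two static ingredients, the ComplEx scoring function and the N3 regularizer, can be sketched with NumPy (the relation prediction objective itself is a training-time loss term and is omitted; function names and toy embeddings are illustrative):

```python
import numpy as np

def complex_score(es, r, eo):
    """ComplEx scoring function: Re(<e_s, r, conj(e_o)>),
    a trilinear product over complex-valued embeddings."""
    return np.real(np.sum(es * r * np.conj(eo)))

def n3_reg(*embeddings):
    """N3 regularizer: sum of cubed moduli of the embedding entries
    (a tractable surrogate for the tensor nuclear 3-norm)."""
    return sum(np.sum(np.abs(e) ** 3) for e in embeddings)

es = np.array([1 + 0j, 0 + 1j])   # subject entity embedding
r  = np.array([1 + 0j, 1 + 0j])   # relation embedding
eo = np.array([1 + 0j, 0 + 1j])   # object entity embedding
print(complex_score(es, r, eo), n3_reg(es, r, eo))   # 2.0 6.0
```

In training, the score feeds a 1vsAll cross-entropy over all candidate objects, with the weighted N3 term added to the loss.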

Graphs · Introduced 2000 · 1 paper

DualGCN

Dual Graph Convolutional Networks

A dual graph convolutional neural network jointly considers the two essential assumptions of semi-supervised learning: (1) local consistency and (2) global consistency. Accordingly, two convolutional neural networks are devised to embed the local-consistency-based and global-consistency-based knowledge, respectively. Description from: Dual Graph Convolutional Networks for Graph-Based Semi-Supervised Classification

Graphs · Introduced 2018 · 1 paper

ISNE

Inductive Shallow Node Embedding

Inductive Shallow Node Embedding extends shallow embeddings to the realm of inductive learning. It has a novel encoder architecture that captures the local neighborhood structure of each node, enabling effective generalization to unseen nodes. In this generalization, robustness is essential to avoid degradation of performance arising from noise in the dataset. It has been theoretically proven that the covariance of the additive noise term in the proposed model is inversely proportional to the cardinality of a node's neighbors. Another contribution is a mathematical lower bound quantifying the robustness of node embeddings, confirming the method's advantage over traditional shallow embedding methods, particularly in the presence of parameter noise. The method demonstrably excels in dynamic networks, consistently achieving over 90% performance on previously unseen nodes compared to nodes encountered during training on various benchmarks. The empirical evaluation concludes that it outperforms competing methods on the vast majority of datasets in both transductive and inductive tasks.

Graphs · Introduced 2000 · 1 paper

IPA-GNN

Instruction Pointer Attention Graph Neural Network

Instruction Pointer Attention Graph Neural Network, or IPA-GNN, is a learned-interpreter neural network based on GNNs for learning to execute programs. It achieves improved systematic generalization on the task of learning to execute programs using control flow graphs. The model arises by considering RNNs operating on program traces with branch decisions as latent variables. The IPA-GNN can be seen either as a continuous relaxation of the RNN model or as a GNN variant more tailored to execution.

Graphs · Introduced 2000 · 1 paper

GraphESN

Graph Echo State Network

The Graph Echo State Network (GraphESN) model is a generalization of the Echo State Network (ESN) approach to graph domains. GraphESNs provide an efficient approach to Recursive Neural Network (RecNN) modeling, extended to deal with cyclic/acyclic, directed/undirected, labeled graphs. The recurrent reservoir of the network computes a fixed contractive encoding function over graphs and is left untrained after initialization, while a feed-forward readout implements an adaptive linear output function. Contractivity of the state transition function implies a Markovian characterization of state dynamics and stability of the state computation in the presence of cycles. Due to the use of a fixed (untrained) encoding, the model represents both an extremely efficient version of, and a baseline for the performance of, recursive models with trained connections. Description from: Graph Echo State Networks
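The fixed-point reservoir encoding can be sketched with NumPy. This is an illustrative sketch: the weight scales, reservoir size, and function name are arbitrary choices meant to keep the state transition contractive, not settings from the paper.

```python
import numpy as np

def graph_esn_encode(adj, U, w_in_scale=0.5, w_hat_scale=0.1,
                     units=8, iters=50, seed=0):
    """Fixed (untrained) contractive reservoir encoding over a graph (sketch):
    iterate x(v) = tanh(W_in u(v) + sum_{w in N(v)} W_hat x(w)) to its fixed point."""
    rng = np.random.default_rng(seed)
    n, f = U.shape
    W_in = rng.uniform(-1, 1, (units, f)) * w_in_scale
    # Keeping W_hat small (relative to the max degree) makes the map contractive,
    # so the iteration converges even on cyclic graphs.
    W_hat = rng.uniform(-1, 1, (units, units)) * w_hat_scale
    X = np.zeros((n, units))
    for _ in range(iters):
        X = np.tanh(U @ W_in.T + adj @ X @ W_hat.T)
    return X   # a trained linear readout would then map X to the output

adj = np.array([[0, 1], [1, 0]], float)   # a single (cyclic) edge
U = np.array([[1.0], [0.5]])              # one input feature per node
X = graph_esn_encode(adj, U)
print(X.shape)
```

Only the readout on top of `X` would be trained, which is what makes the model so cheap.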

Graphs · Introduced 2010 · 1 paper

GeniePath

GeniePath is a scalable approach for learning adaptive receptive fields of neural networks defined on permutation-invariant graph data. It proposes an adaptive path layer consisting of two complementary functions designed for breadth and depth exploration respectively: the former learns the importance of different-sized neighborhoods, while the latter extracts and filters signals aggregated from neighbors different hops away. Description from: GeniePath: Graph Neural Networks with Adaptive Receptive Paths

Graphs · Introduced 2000 · 1 paper

RESCAL RP

RESCAL

Graphs · Introduced 2000 · 1 paper

StoGCN

StoGCN is a control-variate-based algorithm that allows sampling an arbitrarily small neighbor size. It comes with a new theoretical guarantee that the algorithm converges to a local optimum of GCN.

Graphs · Introduced 2000 · 1 paper

RelDiff

RelDiff generates entity-relation-entity embeddings in a single embedding space. RelDiff adopts two fundamental vector algebraic operators to transform entity and relation embeddings from knowledge graphs into entity-relation-entity embeddings. In particular, RelDiff can encode finer-grained information about the relations than is captured when separate embeddings are learned for the entities and the relations.

Graphs · Introduced 2000 · 1 paper

DeepDrug

DeepDrug is a deep learning framework that uses graph convolutional networks to learn graph representations of drugs and proteins, such as molecular fingerprints and residual structures, in order to boost prediction accuracy.

Graphs · Introduced 2000 · 1 paper

GAP-Layer

Spectral Gap Rewiring Layer

TL;DR: GAP-Layer is a GNN layer that rewires a graph in an inductive way by optimizing the spectral gap (minimizing or maximizing the bottleneck size), learning a differentiable way to compute the Fiedler vector and the Fiedler value of the graph.

Summary: GAP-Layer is a rewiring layer based on minimizing or maximizing the spectral gap (the graph bottleneck size) in an inductive way. Depending on the mining task we want to perform on our graph, we may want to maximize or minimize the size of the bottleneck, aiming for more connected or more separated communities.

Loss and derivatives: For this explanation, suppose we want to minimize the spectral gap, i.e., make the graph bottleneck smaller, by minimizing a loss on the Fiedler value λ₂. The gradients of this cost function with respect to each entry of the adjacency matrix are not trivial, and they differ depending on whether we use the Laplacian L or the normalized Laplacian. In the former case, we use the derivatives presented in Kang et al. (2019); in the latter, the paper presents Spectral Gradients, derivatives of the spectral gap with respect to the normalized Laplacian. Whichever option we choose, λ₂ can be seen as a function of the adjacency matrix, so the gradient of the loss with respect to each adjacency entry (how does the bottleneck change with each change in our graph?) follows from the chain rule of the matrix derivative. Both derivatives rely on the Fiedler vector, the eigenvector associated with λ₂. For the details of those derivatives, see the original paper.

Differentiable approximation of the Fiedler value and vector: Even with those derivatives, the problem is still not trivial: the cost function relies on an eigenvalue, and its derivatives depend on the Fiedler vector, the eigenvector corresponding to that eigenvalue. However, we do not compute it spectrally, as that computation has O(n³) complexity and would be needed at every learning iteration. Instead, we learn an approximation of the Fiedler vector and use its Dirichlet energy to approximate λ₂ (when using the normalized Laplacian, we first approximate the Fiedler vector of L and then derive the normalized version from it). With this approximation, the cluster each node belongs to can be computed with a simple MLP; and since the Fiedler vector must satisfy orthogonality and normality, those restrictions are added to the MLP clustering.

GAP-Layer: To sum up, GAP-Layer can be defined as follows. Given the matrix encoding the node features after any message-passing (MP) layer, it learns the cluster association while being optimized according to the loss above; λ₂ is then approximated from the learned Fiedler vector, and the GAP diffusion that rewires the graph results from minimizing this loss.

References: (Kang et al. 2019) Kang, J., & Tong, H. (2019, November). N2N: Network derivative mining. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management (pp. 861-870).
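For reference, the Fiedler quantities that GAP-Layer learns to approximate can be computed non-differentiably with a full eigendecomposition; that eigendecomposition is exactly the expensive step the layer avoids at training time. The function name and toy graph below are illustrative.

```python
import numpy as np

def fiedler(adj):
    """Fiedler value (2nd-smallest Laplacian eigenvalue, the spectral gap)
    and Fiedler vector, computed spectrally (non-differentiable)."""
    L = np.diag(adj.sum(axis=1)) - adj
    vals, vecs = np.linalg.eigh(L)          # full eigendecomposition: O(n^3)
    lam2, f = vals[1], vecs[:, 1]
    # Sanity check: the Dirichlet energy of the (unit-norm) Fiedler vector
    # recovers lambda_2, which is the approximation GAP-Layer exploits.
    assert abs(f @ L @ f - lam2) < 1e-9
    return lam2, f

# Two triangles joined by a single bridge edge: a clear bottleneck.
adj = np.zeros((6, 6))
for u, v in [(0,1), (0,2), (1,2), (3,4), (3,5), (4,5), (2,3)]:
    adj[u, v] = adj[v, u] = 1.0
lam2, f = fiedler(adj)
print(lam2, np.sign(f[0]) != np.sign(f[5]))   # small gap; the vector's sign splits the communities
```

Shrinking `lam2` by editing `adj` (what the minimization objective does) separates the two communities further; growing it tightens the bottleneck.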

Graphs · Introduced 2000 · 1 paper

TuckER-RP

TuckER with Relation Prediction

TuckER model trained with a relation prediction objective on top of the 1vsAll loss

Graphs · Introduced 2000 · 1 paper

RESCAL-RP

RESCAL with Relation Prediction

RESCAL model trained with a relation prediction objective on top of the 1vsAll loss

Graphs · Introduced 2000 · 1 paper

AutoGL

Automated Graph Learning

Automated graph learning is a method that aims at discovering the best hyper-parameter and neural architecture configuration for different graph tasks/data without manual design.

Graphs · Introduced 2000 · 1 paper

CayleyNet

The core ingredient of CayleyNet is a new class of parametric rational complex functions (Cayley polynomials) allowing to efficiently compute spectral filters on graphs that specialize on frequency bands of interest. The model generates rich spectral filters that are localized in space, scales linearly with the size of the input data for sparsely-connected graphs, and can handle different constructions of Laplacian operators. Description adapted from: CayleyNets: Graph Convolutional Neural Networks with Complex Rational Spectral Filters
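The filter's frequency response can be sketched directly from the Cayley-polynomial definition, g(λ) = c₀ + 2·Re(Σⱼ cⱼ·((hλ − i)/(hλ + i))ʲ), where h is the learnable spectral zoom. The coefficients and zoom below are arbitrary illustrative values; the actual model applies this response to the graph Laplacian and, per the paper, uses iterative (Jacobi-type) approximations rather than explicit spectral computations.

```python
import numpy as np

def cayley_filter_response(lam, c0, c, h):
    """Cayley filter frequency response (sketch):
    g(lam) = c0 + 2 * Re( sum_j c_j * ((h*lam - i)/(h*lam + i))**(j+1) ),
    with real c0, complex coefficients c, and spectral zoom h."""
    ratio = (h * lam - 1j) / (h * lam + 1j)   # Cayley transform maps h*lam onto the unit circle
    powers = np.array([ratio ** (j + 1) for j in range(len(c))])
    return c0 + 2 * np.real(np.sum(np.array(c)[:, None] * powers, axis=0))

lam = np.linspace(0, 2, 5)   # sampled graph frequencies (Laplacian eigenvalues)
resp = cayley_filter_response(lam, c0=1.0, c=[0.5 + 0.0j], h=1.0)
print(resp.round(4))
```

Because the Cayley transform maps the real line to the unit circle, the response is a smooth rational function of λ, and tuning h "zooms" the filter onto the frequency band of interest.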

Graphs · Introduced 2000 · 1 paper

NN4G

Neural network for graphs

NN4G is based on a constructive feedforward architecture with state variables that uses neurons with no feedback connections. The neurons are applied to the input graphs by a general traversal process that relaxes the constraints of previous approaches derived by the causality assumption over hierarchical input data. Description from: Neural network for graphs: a contextual constructive approach

Graphs · Introduced 2009

Local Augmentation

Local Augmentation for Graph Neural Networks, or LA-GNN, is a data augmentation technique that enhances a node's features using its local subgraph structure. Specifically, it learns the conditional distribution of the connected neighbors' representations given the representation of the central node, analogous to the skip-gram of the word2vec model, which predicts the probability of the context given the central word. After augmenting the neighborhood, the initial and generated feature matrices are concatenated as input for GNNs.

Graphs · Introduced 2000