Generative Models for Graphs

David White & Richard Wilson

The goal of this research is to develop methods of constructing generative models over sets of relational graphs. In other words, given a set of graphs how can we generate new graphs that are part of the distribution of the original set?

To give a simple example, suppose we have a set of graphs describing various facial expressions of a person. We would then like to be able to generate new graphs that describe similar but previously unseen facial expressions. We apply statistical techniques to ensure that the new graphs generated are part of the distribution of the original set. In terms of the example above, if the majority of the facial expressions described in the set were happy then we would expect to generate mostly happy facial expressions. To accomplish this there are two steps involved 1) constructing a statistical model of the set of graphs that captures the underlying structural variations present and 2) sampling from this model to generate new examples that are part of the distribution of the original set.

The remainder of this page describes two methods of accomplishing this.

Vectorial Generative Models for Graphs

This work is based on transforming graphs into vectorial representations and then using statistical methods to construct distributions over the resulting vector space. New graphs may then be generated by sampling from this space. However, transforming a graph into a representation where statistical techniques can be applied is a hard problem for the following reasons: 1) Arbitrary vertex ordering: Graphs have no natural ordering over the vertices so if we want to compare two graphs we must first find a mapping between the two sets of vertices. This is known as graph alignment. 2) Arbitrary graph size: The size of graphs (the number of vertices) can vary dramatically and therefore we require a representation that can accommodate graphs of all sizes. This problem is generally solved by padding graphs with dummy vertices thus making them equal size. 3) Graph structure. The representation should fully describe the structure of the graph. If these problems are overcome then a vectorial representation of each input graph can be produced and a distribution describing the variations in the resulting vector space can be constructed.

At this stage we may sample from the distribution to generate new vectorial representations of graphs. However, there is no guarantee that a generated vector will describe a graph correctly. Therefore a final step must be performed where the generated vector is projected onto the nearest correct graph.

In the papers below we compare a number of different approaches to the construction of a vectorial generative model over a graph set. The naive approach is to vectorise the adjacency matrix or a similar representation and then construct a standard distribution over the vector space. We propose a more sophisticated approach where the spectral decomposition of a graph is modeled by two distributions over vector spaces. Finally, we examine the use of distributions on manifolds using the exponential map. These approaches are validated on both synthetic data and graphs from the COIL database.

Parts-based Generative Models for Graphs

In this work we examine the problem of creating parts-based generative models over sets of graphs. We model the variation in a set of graphs by observing which subgraphs are present in each graph and how these subgraphs are connected. By performing clustering on the subgraphs we can group those with similar structure. Distributions are then defined on the clusters present in each graph, which subgraphs are present in each cluster and the way subgraphs are connected. A new graph can then be generated by sampling from each of the three distributions. We show the utility of our approach on synthetically generated point sets and point sets derived from real-world imagery of articulated objects. Due to the decomposition based approach this method is particularly effective on large graphs.