Assessment of model fit via network comparison methods based on subgraph counts

Luis Ospina-Forero, Charlotte M. Deane, Gesine Reinert, Tiago Peixoto

Research output: Contribution to journal › Article › peer-review

Abstract

While the number of network comparison methods is increasing, benchmarking of these methods is still in its infancy. The lack of understanding of complex dependencies among network characteristics makes it difficult to fully understand the meaning of the different network comparison methodologies and the relations between them. In this article, we use a Monte Carlo framework to address three general questions about network comparison methods based on subgraph counts: (1) Can the methods differentiate between networks generated from different network generation mechanisms? (2) Are the number of nodes or the average degree confounding factors for the comparison of networks? (3) Do all methods reach the same conclusions? We further use the Monte Carlo framework to test the fit of the Erdős-Rényi (ER), Chung-Lu and duplication-divergence models to the protein-protein interaction (PPI) networks of Yeast, Fly, Worm, Human and Escherichia coli, five herpes virus networks and five social networks. In contrast to previous claims in the literature, we show that the large PPI networks are not well modelled by the Chung-Lu model according to any of our tested methods. We find that network comparison statistics are not completely invariant to changes in the number of nodes and edges. Some methods, such as graphlet correlation distance, focus on fine-grained similarities, while others, such as Netdis, can capture the similarities of networks despite their having different numbers of nodes and edges.
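As a rough illustration of what a Monte Carlo test of model fit based on subgraph counts can look like, the sketch below tests whether an observed network is compatible with an ER model. It is not the procedure used in the article: it swaps in the raw triangle count as the comparison statistic, whereas the article evaluates statistics such as Netdis and graphlet correlation distance. All function names (`er_graph`, `count_triangles`, `er_fit_p_value`, `disjoint_cliques`) and the choices of 199 simulations and a two-sided p-value are assumptions made for this example.

```python
import random
from itertools import combinations


def er_graph(n, p, rng):
    """Sample an Erdos-Renyi G(n, p) graph as a dict of neighbour sets."""
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def count_triangles(adj):
    """Count each triangle {u < v < w} once, via the edge (u, v)."""
    total = 0
    for u in adj:
        for v in adj[u]:
            if u < v:
                total += sum(1 for w in adj[u] & adj[v] if w > v)
    return total


def density(adj):
    """Fraction of possible edges that are present."""
    n = len(adj)
    m = sum(len(nb) for nb in adj.values()) // 2
    return m / (n * (n - 1) / 2)


def er_fit_p_value(observed, n_sims=199, seed=0):
    """Two-sided Monte Carlo p-value for the triangle count under ER.

    The ER model is fitted by matching the number of nodes and the
    edge density of the observed graph (an illustrative choice).
    """
    rng = random.Random(seed)
    n, p = len(observed), density(observed)
    obs = count_triangles(observed)
    sims = [count_triangles(er_graph(n, p, rng)) for _ in range(n_sims)]
    centre = sum(sims) / n_sims
    # Count simulations at least as far from the simulated mean as the data.
    extreme = sum(1 for s in sims if abs(s - centre) >= abs(obs - centre))
    return (1 + extreme) / (n_sims + 1)


def disjoint_cliques(k, size):
    """Toy 'observed' network: k disjoint cliques, rich in triangles."""
    adj = {v: set() for v in range(k * size)}
    for c in range(k):
        for u, v in combinations(range(c * size, (c + 1) * size), 2):
            adj[u].add(v)
            adj[v].add(u)
    return adj


if __name__ == "__main__":
    toy = disjoint_cliques(6, 5)
    print("triangles:", count_triangles(toy))
    print("Monte Carlo p-value under ER:", er_fit_p_value(toy))
```

A small p-value here indicates that the observed triangle count is unusual among ER graphs with the same size and density, so the ER model would be judged a poor fit for that statistic. Replacing `er_graph` with a sampler for another model, or `count_triangles` with a different comparison statistic, gives the same test for other model and statistic pairs.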

Original language: English
Article number: cny017
Pages (from-to): 226-253
Number of pages: 28
Journal: Journal of Complex Networks
Volume: 7
Issue number: 2
DOIs
State: Published - 1 Apr 2019

Keywords

  • model fit
  • network comparison
  • subgraph counts
