Constructing networks from correlation matrices

S. Kojaku and N. Masuda. Proceedings of the Royal Society A (2019) [in press]
Naoki Masuda, Sadamori Kojaku, Yukie Sano. Physical Review E, 98, 012312 (2018)

Many networks of scientific interest have been constructed from correlation matrices. Examples include protein interaction networks, gene co-expression networks, functional brain networks, climate networks and asset graphs. In correlation networks, the edges indicate nontrivial correlations between the features of the nodes. A widely accepted approach for constructing networks from correlation matrices is the thresholding; one places an edge between two nodes if the correlation is larger than a threshold. However, the thresholding may induce spurious edges in the generated networks because a large correlation may be induced by a factor irrelevant to the node pair such as a global trend and pseudo correlations. An example is a large correlation between ice cream sale and murder rate; people buy more ice cream and interact with more people when it is hotter.

We developed an algorithm for constructing networks from correlation matrices, which we refer to as Scola. A key feature of the algorithm is to use the null models for correlation matrices, based on which we decide whether to place edges between nodes or not. In other words, we place edges between nodes whose correlations are unexpected for the null models. Users can run the algorithm with a null model of interest or let the algorithm to choose statistically the best null model among candidate null models. For three economic data, the algorithm produces statistically better networks and more accurately predicts country-level product export volumes.

Figure: Sample correlation matrices and networks for the product space data. The solid lines inside the matrices indicate the boundary between year t (first half) and year t+10 (second half). The node colour indicates the product type. The value of the PRODY index averaged over the nodes in the same type is shown in the parentheses.

Core-periphery structure of networks

Sadamori Kojaku, Naoki Masuda. New Journal of Physics, 20, 043012 (2018)
Sadamori Kojaku, Naoki Masuda. Physical Revew E, 96, 052313 (2017)

Core-periphery structure is a mesoscale structure of networks, where a core is a group of densely interconnected nodes, and a periphery is a group of sparsely interconnected nodes. Core-periphery structure has been found in various empirical networks such as social networks, biological networks and transportation networks [1, 2]. Many studies have assumed that networks consist of one core and one periphery.
However, networks may contain multiple core-periphery pairs.

We developed an algorithm to find multiple core-periphery pairs in networks [3]. In the political blog network [4] shown in Fig. 1, the algorithm found two core-periphery pairs, each of which consists of the blogs sharing the same political agenda. We applied this algorithm or its variant to an inter-bank trade network [5], a maritime transportation network [6] and a short-text classification task [7].

We also showed that heterogeneous degree distributions alone explain single core-periphery structure [8]. In other words, when one says that a network is composed of a single core and a single periphery, hubs (i.e. nodes with large degrees, or the number of edges) are core nodes and nodes with small degrees are peripheral nodes. We proved that a core-periphery structure that is not merely explained by heterogeneous degree distributions necessarily involves at least three blocks of nodes. An example is a combination of a single core, a single periphery and a community, which yields three blocks. Another example is two core-periphery pairs, which yields four blocks in total. Based on this result, we extended our first algorithm [7] to find multiple core-periphery pairs in networks that are not merely explained by heterogeneous degree distributions [8]. The extended algorithm allows one to find small-degree core nodes and large-degree peripheral nodes.

Figure: Two core-periphery pairs in the political blog network detected by our algorithm. The filled or open circles indicate core nodes or peripheral nodes, respectively.
The colour of each circle indicates the core-periphery pair to which the node belongs. The open circles with a grey border indicate nodes that do not belong to any core-periphery pair.

  1. P. Csermely, A. London, L.-Y. Wu, and B. Uzzi, Journal of Complex Networks, 1, 93 (2013) [link]
  2. M. P. Rombach, M. A. Porter, J. H. Fowler, P. J. Mucha, `SIAM Review, 59, 619-646 (2017) [link]
  3. S. Kojaku and N. Masuda, Physical Review E, 96, 052313 (2017) [link]
  4. L. A. Adamic and N. Glance, in Proceedings of the 3rd International Workshop on Link Discovery, 36–43 (ACM, New York, USA, 2005) [link]
  5. S. Kojaku, G. Cimini, G. Caldarelli, N. Masuda, Journal of Network Theory in Finance, 4, 33-51 (2018) [link]
  6. X. Cui, S. Kojaku, N. Masuda, and D. Bollegala, in Proceedings of the 7th Joint Conference on Lexical and Computational Semantics, 225-264 (ACL, New Orleans, USA, 2018) [link]
  7. S. Kojaku, M. Xu, H. Xia, N. Masuda, `Preprint arXiv:1808.04549 (2018) [link]
  8. S. Kojaku and N. Masuda, `New Journal of Physics 20, 043012 (2018) [link]