September 11, 2025
Advanced Topics in Network Science
Sadamori Kojaku
skojaku@binghamton.edu
Quiz
Let us denote the adjacency list of a directed graph as follows:
where \(i, j, k, \ldots\) are the nodes in a graph, and there are directed edges from \(i\) to \(j\) and from \(i\) to \(k\).
Identify the number of strongly connected components in the following graph.
Stanley Milgram (1933-1984)
Despite hundreds of millions of people in the US, their social network was remarkably compact!
Modern Confirmations:
Play the game: WikiRace
Question 🤔:
Why are people in the world connected by a small number of steps?
Think about your family tree:
How many ancestors do you have in each generation?
Wait, think about 100 generations back
Then, what’s wrong with the estimate 🤔?
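A quick numerical check of the family-tree estimate may help here. This is only a sketch of the naive assumption that every ancestor is a distinct person with two distinct parents; the function name `naive_ancestors` is made up for illustration.

```python
def naive_ancestors(generations: int) -> int:
    """Naive count: each person has 2 parents, so g generations back gives 2**g ancestors."""
    return 2 ** generations


for g in (1, 2, 3, 10, 100):
    print(f"{g:>3} generations back: {naive_ancestors(g):,} ancestors")

# Rough present-day world population, used only as a point of comparison.
WORLD_POPULATION = 8_000_000_000
print(naive_ancestors(100) > WORLD_POPULATION)
```

At 100 generations the naive count far exceeds the number of people alive today, so the ancestors cannot all be distinct: family trees must overlap.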
Local clustering asks: of all the triangles that you and pairs of your friends could form, what fraction actually exist?
\[ C_i = \dfrac{\text{\# of triangles involving } i \text{ and its neighbors}}{\text{\# of edges that could exist among the neighbors of } i} \]
What are the local clustering coefficients of A, B and C?
\[ C_i = \dfrac{\text{\# of triangles involving } i \text{ and its neighbors}}{\text{\# of edges that could exist among the neighbors of } i} \]
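The definition above can be sketched in plain Python. The graph below is a hypothetical example (a triangle 0-1-2 with an extra node 3 attached to node 0), not the graph from the slide, and returning 0 for nodes with fewer than two neighbors is an assumption, since the ratio is undefined there.

```python
from itertools import combinations


def local_clustering(adj, node):
    """Fraction of neighbor pairs of `node` that are themselves connected."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # assumption: treat the undefined case as 0
    # Each connected pair of neighbors closes one triangle through `node`.
    triangles = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    possible = k * (k - 1) / 2
    return triangles / possible


# Hypothetical undirected graph stored as node -> set of neighbors.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}

for node in adj:
    print(node, local_clustering(adj, node))
```

Node 0 has three neighbors but only one connected pair among them, so its coefficient is 1/3, while nodes 1 and 2 sit in a fully closed neighborhood.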
Average clustering coefficient is the average of the local clustering coefficients of all nodes.
\[ \overline{C} = \dfrac{1}{N} \sum_{i} C_i \]
\[ \overline{C} = \frac{1}{6}\left( \underbrace{\frac{1}{3}}_{A} + \underbrace{\frac{1}{3}}_{B} + \underbrace{0}_{C} + \underbrace{\frac{1}{3}}_{D} + \underbrace{0}_{E} + \underbrace{\frac{2}{3}}_{F} \right) = \frac{5}{18} \]
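The arithmetic in the worked example can be double-checked with exact fractions. The local values below are copied from the equation above; nothing else is assumed.

```python
from fractions import Fraction

# Local clustering coefficients of nodes A-F, taken from the worked example.
local = {
    "A": Fraction(1, 3),
    "B": Fraction(1, 3),
    "C": Fraction(0),
    "D": Fraction(1, 3),
    "E": Fraction(0),
    "F": Fraction(2, 3),
}

average_clustering = sum(local.values()) / len(local)
print(average_clustering)  # 5/18
```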
Global clustering coefficient focuses on the total number of triangles in the network.
\[ C = \frac{3 \times \text{number of triangles}}{\text{number of connected triplets}} = \frac{3 \times \text{number of triangles}}{\sum_{i} k_i(k_i-1)/2} \]
where \(k_i\) is the degree of node \(i\).
Connected triplets = Three nodes joined by at least two edges. When counting, we distinguish triplets by their center node, so a triangle counts as three triplets. A node with degree \(k\) is the center of \(k(k-1)/2\) triplets.
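A minimal sketch of the triplet counting, in plain Python. The graph is a hypothetical example (a triangle 0-1-2 plus a node 3 attached to node 0); the function name is illustrative only.

```python
from itertools import combinations


def global_clustering(adj):
    """Closed triplets divided by all connected triplets."""
    # A node with degree k is the center of k(k-1)/2 connected triplets.
    triplets = sum(len(nb) * (len(nb) - 1) // 2 for nb in adj.values())
    # A triplet is closed when its two end nodes are also connected.
    # Each triangle is counted three times, once per center node.
    closed = sum(
        1
        for nb in adj.values()
        for u, v in combinations(nb, 2)
        if v in adj[u]
    )
    return closed / triplets if triplets else 0.0


# Hypothetical undirected graph stored as node -> set of neighbors.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(global_clustering(adj))
```

Here there is one triangle (three closed triplets) and 3 + 1 + 1 + 0 = 5 connected triplets, giving 3/5.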
Three types of clustering coefficients:
Local clustering coefficient \(\rightarrow\) Density of triangles in a node’s neighborhood
Average clustering coefficient \(\rightarrow\) Average of the local clustering
Global clustering coefficient \(\rightarrow\) Density of triangles in the entire network
Question:
If a network has a high global clustering coefficient, does it necessarily have a high average local clustering coefficient?
If not, can you draw a network with high global clustering but low average local clustering coefficient?
Now, let’s quantify the global connectivity via the average path length.
The distance between two nodes \(i\) and \(j\) is the minimum number of edges you need to traverse to get from one node to the other.
Let’s find the distance between A and D:
Even though there are multiple paths, the distance from A to D is 2 edges.
Average path length \(\overline{L}\) is the average distance between any two nodes:
\[ \overline{L} = \frac{2}{N(N-1)} \sum_{i<j} d_{ij} \]
where \(d_{ij}\) is the shortest path length between node \(i\) and node \(j\), and \(N\) is the number of nodes in the network.
\[ \begin{aligned} \overline{L} &= \frac{1}{6} \left( \underbrace{1}_{A-B} + \underbrace{1}_{A-C} + \underbrace{2}_{A-D} + \underbrace{1}_{B-C} + \underbrace{1}_{B-D} + \underbrace{1}_{C-D} \right) \\ &= \frac{7}{6} \simeq 1.17 \end{aligned} \]
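The same number can be reproduced with a breadth-first search. This is a sketch in plain Python: the edge set is inferred from the distance-1 pairs in the worked example (A-B, A-C, B-C, B-D, C-D), and the code assumes the graph is connected.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path distance (in edges) from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def average_path_length(adj):
    """Mean distance over all unordered node pairs (assumes a connected graph)."""
    nodes = list(adj)
    n = len(nodes)
    total = 0
    for idx, i in enumerate(nodes):
        dist = bfs_distances(adj, i)
        for j in nodes[idx + 1:]:
            total += dist[j]
    return 2 * total / (n * (n - 1))


# Edges inferred from the worked example: every pair at distance 1 is an edge.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"B", "C"},
}
print(bfs_distances(adj, "A")["D"])  # 2
print(average_path_length(adj))
```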
Small-world networks are networks that have both a high clustering coefficient and a short average path length.
And we can quantify the “small-worldness” of a network by, for example,
\[ \sigma_{\text{naive}} = \dfrac{\text{average clustering coefficient}}{\text{average path length}} \]
\[ \sigma = \dfrac{\sigma_{\text{naive}}}{\sigma_{\text{random}}} \]
For Erdős-Rényi Random Graphs, we have:
\[ \sigma_{\text{random}} = \dfrac{\overline{C}_{\text{random}}}{\overline{L}_{\text{random}}},\quad \overline{C}_{\text{random}} \approx \dfrac{\langle k \rangle}{N-1},\quad \overline{L}_{\text{random}} \approx \dfrac{\ln N}{\ln \langle k \rangle} \]
where \(\langle k \rangle\) is the average degree. See the lecture note for the derivation.
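Putting the formulas together, the small-worldness could be computed along these lines. The function name, argument names, and the input numbers are all hypothetical, chosen only to illustrate the arithmetic.

```python
import math


def small_worldness(avg_clustering, avg_path_length, n_nodes, avg_degree):
    """sigma = (C / L) / (C_random / L_random), using the Erdős-Rényi approximations."""
    sigma_naive = avg_clustering / avg_path_length
    c_random = avg_degree / (n_nodes - 1)
    l_random = math.log(n_nodes) / math.log(avg_degree)
    sigma_random = c_random / l_random
    return sigma_naive / sigma_random


# Hypothetical measurements for a network with 1000 nodes and average degree 10.
sigma = small_worldness(
    avg_clustering=0.5, avg_path_length=3.0, n_nodes=1000, avg_degree=10
)
print(f"sigma = {sigma:.2f}")
```

A value well above 1 indicates more clustering per unit of path length than a random graph of the same size and density would have.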
Now, we have a way to quantify the small-worldness of a network.
But we still don’t know why small-world networks emerge.
“What I cannot create, I do not understand”
—Richard Feynman
What is the mechanism behind the small-world phenomenon 🤔?
Step 1: Create a ring of \(N\) nodes, each connected to its \(k\) nearest neighbors
Step 2: Randomly rewire each edge with probability \(p\)
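The two steps can be sketched in plain Python as follows. This is one possible variant, not necessarily the exact procedure used in the lecture: it assumes \(k\) is even and smaller than \(N\), and rewiring moves the second endpoint of an edge to a uniformly random node while avoiding self-loops and duplicate edges. The function name `watts_strogatz` and the `seed` parameter are illustrative.

```python
import random


def watts_strogatz(n, k, p, seed=None):
    """Return a set of undirected edges (u, v) with u < v."""
    rng = random.Random(seed)

    # Step 1: ring lattice, each node linked to its k nearest neighbors.
    edges = set()
    for i in range(n):
        for offset in range(1, k // 2 + 1):
            j = (i + offset) % n
            edges.add((min(i, j), max(i, j)))

    # Step 2: rewire each original edge with probability p.
    for u, v in sorted(edges):
        if rng.random() < p:
            # Candidate new endpoints: not u itself, and not already linked to u.
            candidates = [
                w for w in range(n)
                if w != u and (min(u, w), max(u, w)) not in edges
            ]
            if candidates:
                w = rng.choice(candidates)
                edges.remove((u, v))
                edges.add((min(u, w), max(u, w)))
    return edges


lattice = watts_strogatz(n=30, k=6, p=0.0, seed=0)   # no rewiring
rewired = watts_strogatz(n=30, k=6, p=0.1, seed=0)   # a few shortcuts
print(len(lattice), len(rewired))
```

Rewiring keeps the number of edges fixed at \(Nk/2\); only where the edges point changes.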
The Mechanism:
Examples:
Next: We’ll learn to compute shortest paths and connected components using igraph
Throughout this course, we’ll primarily use igraph, a mature and robust library originally developed for R and later ported to Python.
Marimo:
https://github.com/skojaku/adv-net-sci/notebooks/m02-small-world/starter.py
Open the terminal and run:
or via uv
Jupyter Notebook:
https://github.com/skojaku/adv-net-sci/notebooks/m02-small-world/starter.ipynb
import igraph
edge_list = [(0, 1), (1, 2), (0, 2), (0, 3)]
g = igraph.Graph() # Create an empty graph
g.add_vertices(4) # Add 4 vertices
g.add_edges(edge_list) # Add edges to the graph
Plot the graph.
Simple paths:
Shortest path:
Distance:
Find connected components:
Membership:
Size:
The largest connected component:
Local clustering coefficient:
Average clustering coefficient:
Global clustering coefficient:
n_ws = 30 # Number of nodes
k_ws = 6 # Number of nearest neighbors in the ring lattice
p_rewire = 0.1 # Probability of rewiring each edge
g_smallworld = igraph.Graph.Watts_Strogatz(
    dim=1,
    size=n_ws,
    nei=k_ws // 2,
    p=p_rewire,
)
Compute the small-worldness \(\sigma\) using the formula below:
\[ \sigma = \dfrac{\sigma_{\text{naive}}}{\sigma_{\text{random}}}, \; \text{where}\; \sigma_{\text{naive}} = \dfrac{\text{average clustering coefficient}}{\text{average path length}} \]
\[ \sigma_{\text{random}} = \dfrac{\overline{C}_{\text{random}}}{\overline{L}_{\text{random}}},\; \overline{C}_{\text{random}} \approx \dfrac{\langle k \rangle}{N-1},\; \overline{L}_{\text{random}} \approx \dfrac{\ln N}{\ln \langle k \rangle} \]
Module 03: Network Robustness
Does a network remain connected when one or more nodes fail?