Skip to content

algorithms #

Algorithmic functions that can be run on Raphtory graphs

Classes#

Matching#

A Matching (i.e., a set of edges that do not share any nodes)

Infected#

Functions#

dijkstra_single_source_shortest_paths(graph, source, targets, direction='both', weight='weight') #

Finds the shortest paths from a single source to multiple targets in a graph.

Parameters:

Name Type Description Default
graph GraphView

The graph to search in.

required
source NodeInput

The source node.

required
targets list[NodeInput]

A list of target nodes.

required
direction Direction

The direction of the edges to be considered for the shortest path. Defaults to "both".

'both'
weight str

The name of the weight property for the edges. Defaults to "weight".

'weight'

Returns:

Type Description
NodeStateWeightedSP

Mapping from nodes to a tuple containing the total cost and the nodes representing the shortest path.

global_reciprocity(graph) #

Reciprocity - measure of the symmetry of relationships in a graph, the global reciprocity of the entire graph. This calculates the number of reciprocal connections (edges that go in both directions) in a graph and normalizes it by the total number of directed edges.

Parameters:

Name Type Description Default
graph GraphView

a directed Raphtory graph

required

Returns:

Type Description
float

reciprocity of the graph between 0 and 1.

betweenness_centrality(graph, k=None, normalized=True) #

Computes the betweenness centrality for nodes in a given graph.

Parameters:

Name Type Description Default
graph GraphView

A reference to the graph.

required
k int

Specifies the number of nodes to consider for the centrality computation. All nodes are considered by default.

None
normalized bool

Indicates whether to normalize the centrality values. Defaults to True.

True

Returns:

Type Description
NodeStateF64

Mapping from nodes to their betweenness centrality.

all_local_reciprocity(graph) #

Local reciprocity - measure of the symmetry of relationships associated with a node

This measures the proportion of a node's outgoing edges which are reciprocated with an incoming edge.

Parameters:

Name Type Description Default
graph GraphView

a directed Raphtory graph

required

Returns:

Type Description
NodeStateF64

Mapping of nodes to their reciprocity value.

triplet_count(graph) #

Computes the number of connected triplets within a graph

A connected triplet (also known as a wedge, 2-hop path) is a pair of edges with one node in common. For example, the triangle made up of edges A-B, B-C, C-A is formed of three connected triplets.

Parameters:

Name Type Description Default
graph GraphView

a Raphtory graph, treated as undirected

required

Returns:

Type Description
int

the number of triplets in the graph

local_triangle_count(graph, v) #

Implementations of various graph algorithms that can be run on a graph.

To run an algorithm simply import the module and call the function with the graph as the argument

Local triangle count - calculates the number of triangles (a cycle of length 3) a node participates in.

This function returns the number of pairs of neighbours of a given node which are themselves connected.

Parameters:

Name Type Description Default
graph GraphView

Raphtory graph, this can be directed or undirected but will be treated as undirected

required
v NodeInput

node id or name

required

Returns:

Type Description
int

number of triangles associated with node v

average_degree(graph) #

The average (undirected) degree of all nodes in the graph.

Note that this treats the graph as simple and undirected and is equal to twice the number of undirected edges divided by the number of nodes.

Parameters:

Name Type Description Default
graph GraphView

a Raphtory graph

required

Returns:

Type Description
float

the average degree of the nodes in the graph

directed_graph_density(graph) #

Graph density - measures how dense or sparse a graph is.

The ratio of the number of directed edges in the graph to the total number of possible directed edges (given by N * (N-1) where N is the number of nodes).

Parameters:

Name Type Description Default
graph GraphView

a directed Raphtory graph

required

Returns:

Type Description
float

Directed graph density of graph.

degree_centrality(graph) #

Computes the degree centrality of all nodes in the graph. The values are normalized by dividing each result with the maximum possible degree. Graphs with self-loops can have values of centrality greater than 1.

Parameters:

Name Type Description Default
graph GraphView

The graph view on which the operation is to be performed.

required

Returns:

Type Description
NodeStateF64

Mapping of nodes to their associated degree centrality.

max_degree(graph) #

Returns the largest degree found in the graph

Parameters:

Name Type Description Default
graph GraphView

The graph view on which the operation is to be performed.

required

Returns:

Type Description
int

The largest degree

min_degree(graph) #

Returns the smallest degree found in the graph

Parameters:

Name Type Description Default
graph GraphView

The graph view on which the operation is to be performed.

required

Returns:

Type Description
int

The smallest degree found

max_out_degree(graph) #

The maximum out degree of any node in the graph.

Parameters:

Name Type Description Default
graph GraphView

a directed Raphtory graph

required

Returns:

Type Description
int

value of the largest outdegree

max_in_degree(graph) #

The maximum in degree of any node in the graph.

Parameters:

Name Type Description Default
graph GraphView

a directed Raphtory graph

required

Returns:

Type Description
int

value of the largest indegree

min_out_degree(graph) #

The minimum out degree of any node in the graph.

Parameters:

Name Type Description Default
graph GraphView

a directed Raphtory graph

required

Returns:

Type Description
int

value of the smallest outdegree

min_in_degree(graph) #

The minimum in degree of any node in the graph.

Parameters:

Name Type Description Default
graph GraphView

a directed Raphtory graph

required

Returns:

Type Description
int

value of the smallest indegree

pagerank(graph, iter_count=20, max_diff=None, use_l2_norm=True, damping_factor=0.85) #

Pagerank -- pagerank centrality value of the nodes in a graph

This function calculates the Pagerank value of each node in a graph. See https://en.wikipedia.org/wiki/PageRank for more information on PageRank centrality. A default damping factor of 0.85 is used. This is an iterative algorithm which terminates if the sum of the absolute difference in pagerank values between iterations is less than the max diff value given.

Parameters:

Name Type Description Default
graph GraphView

Raphtory graph

required
iter_count int

Maximum number of iterations to run. Note that this will terminate early if convergence is reached. Defaults to 20.

20
max_diff Optional[float]

Optional parameter providing an alternative stopping condition. The algorithm will terminate if the sum of the absolute difference in pagerank values between iterations is less than the max diff value given.

None
use_l2_norm bool

Flag for choosing the norm to use for convergence checks, True for l2 norm, False for l1 norm. Defaults to True.

True
damping_factor float

The damping factor for the PageRank calculation. Defaults to 0.85.

0.85

Returns:

Type Description
NodeStateF64

Mapping of nodes to their pagerank value.

single_source_shortest_path(graph, source, cutoff=None) #

Calculates the single source shortest paths from a given source node.

Parameters:

Name Type Description Default
graph GraphView

A reference to the graph. Must implement GraphViewOps.

required
source NodeInput

The source node.

required
cutoff int

An optional cutoff level. The algorithm will stop if this level is reached.

None

Returns:

Type Description
NodeStateNodes

Mapping from end node to shortest path from the source node.

global_clustering_coefficient(graph) #

Computes the global clustering coefficient of a graph. The global clustering coefficient is defined as the number of triangles in the graph divided by the number of triplets in the graph.

Note that this is also known as transitivity and is different to the average clustering coefficient.

Parameters:

Name Type Description Default
graph GraphView

a Raphtory graph, treated as undirected

required

Returns:

Type Description
float

the global clustering coefficient of the graph

See also

Triplet Count

temporally_reachable_nodes(graph, max_hops, start_time, seed_nodes, stop_nodes=None) #

Temporally reachable nodes -- the nodes that are reachable by a time respecting path followed out from a set of seed nodes at a starting time.

This function starts at a set of seed nodes and follows all time respecting paths until either a) a maximum number of hops is reached, b) one of a set of stop nodes is reached, or c) no further time respecting edges exist. A time respecting path is a sequence of nodes v_1, v_2, ... , v_k such that there exists a sequence of edges (v_i, v_i+1, t_i) with t_i < t_i+1 for i = 1, ... , k - 1.

Parameters:

Name Type Description Default
graph GraphView

directed Raphtory graph

required
max_hops int

maximum number of hops to propagate out

required
start_time int

time at which to start the path (such that t_1 > start_time for any path starting from these seed nodes)

required
seed_nodes list[NodeInput]

list of node names or ids which should be the starting nodes

required
stop_nodes Optional[list[NodeInput]]

nodes at which a path shouldn't go any further

None

Returns:

Type Description
NodeStateReachability

Mapping of nodes to their reachability history.

temporal_bipartite_graph_projection(graph, delta, pivot_type) #

Projects a temporal bipartite graph into an undirected temporal graph over the pivot node type. Let G be a bipartite graph with node types A and B. Given delta > 0, the projection graph G' pivoting over type B nodes, will make a connection between nodes n1 and n2 (of type A) at time (t1 + t2)/2 if they respectively have an edge at time t1, t2 with the same node of type B in G, and |t2-t1| < delta.

Parameters:

Name Type Description Default
graph GraphView

A directed raphtory graph

required
delta int

Time period

required
pivot_type str

node type to pivot over. If a bipartite graph has types A and B, and B is the pivot type, the new graph will consist of type A nodes.

required

Returns:

Type Description
Graph

Projected (unipartite) temporal graph.

local_clustering_coefficient(graph, v) #

Local clustering coefficient - measures the degree to which nodes in a graph tend to cluster together.

The proportion of pairs of neighbours of a node who are themselves connected.

Parameters:

Name Type Description Default
graph GraphView

Raphtory graph, can be directed or undirected but will be treated as undirected.

required
v NodeInput

node id or name

required

Returns:

Type Description
float

the local clustering coefficient of node v in graph.

local_clustering_coefficient_batch(graph, v) #

Returns the Local clustering coefficient (batch, intersection) for each specified node in a graph. This measures the degree to which one or multiple nodes in a graph tend to cluster together.

Uses path-counting for its triangle-counting step.

weakly_connected_components(graph) #

Weakly connected components -- partitions the graph into node sets which are mutually reachable by an undirected path

This function assigns a component id to each node such that nodes with the same component id are mutually reachable by an undirected path.

Parameters:

Name Type Description Default
graph GraphView

Raphtory graph

required

Returns:

Type Description
NodeStateUsize

Mapping of nodes to their component ids.

strongly_connected_components(graph) #

Strongly connected components

Partitions the graph into node sets which are mutually reachable by an directed path

Parameters:

Name Type Description Default
graph GraphView

Raphtory graph

required

Returns:

Type Description
NodeStateUsize

Mapping of nodes to their component ids

in_components(graph) #

In components -- Finding the "in-component" of a node in a directed graph involves identifying all nodes that can be reached following only incoming edges.

Parameters:

Name Type Description Default
graph GraphView

Raphtory graph

required

Returns:

Type Description
NodeStateNodes

Mapping of nodes to the nodes in their 'in-component'

in_component(node) #

In component -- Finding the "in-component" of a node in a directed graph involves identifying all nodes that can be reached following only incoming edges.

Parameters:

Name Type Description Default
node Node

The node whose in-component we wish to calculate

required

Returns:

Type Description
NodeStateUsize

Mapping of nodes in the in-component to the distance from the starting node.

out_components(graph) #

Out components -- Finding the "out-component" of a node in a directed graph involves identifying all nodes that can be reached following only outgoing edges.

Parameters:

Name Type Description Default
graph GraphView

Raphtory graph

required

Returns:

Type Description
NodeStateNodes

Mapping of nodes to the nodes within their 'out-component'

out_component(node) #

Out component -- Finding the "out-component" of a node in a directed graph involves identifying all nodes that can be reached following only outgoing edges.

Parameters:

Name Type Description Default
node Node

The node whose out-component we wish to calculate

required

Returns:

Type Description
NodeStateUsize

A NodeState mapping the nodes in the out-component to their distance from the starting node.

fast_rp(graph, embedding_dim, normalization_strength, iter_weights, seed=None, threads=None) #

Computes embedding vectors for each vertex of an undirected/bidirectional graph according to the Fast RP algorithm. Original Paper: https://doi.org/10.48550/arXiv.1908.11512 Arguments: graph (GraphView): The graph view on which embeddings are generated. embedding_dim (int): The size (dimension) of the generated embeddings. normalization_strength (float): The extent to which high-degree vertices should be discounted (range: 1-0) iter_weights (list[float]): The scalar weights to apply to the results of each iteration seed (int, optional): The seed for initialisation of random vectors threads (int, optional): The number of threads to be used for parallel execution.

Returns:

Type Description
NodeStateListF64

Mapping from nodes to embedding vectors.

global_temporal_three_node_motif(graph, delta, threads=None) #

Computes the number of three edge, up-to-three node delta-temporal motifs in the graph, using the algorithm of Paranjape et al, Motifs in Temporal Networks (2017). We point the reader to this reference for more information on the algorithm and background, but provide a short summary below.

Motifs included:

Stars

There are three classes (in the order they are outputted) of star motif on three nodes based on the switching behaviour of the edges between the two leaf nodes.

  • PRE: Stars of the form i<->j, i<->j, i<->k (ie two interactions with leaf j followed by one with leaf k)
  • MID: Stars of the form i<->j, i<->k, i<->j (ie switching interactions from leaf j to leaf k, back to j again)
  • POST: Stars of the form i<->j, i<->k, i<->k (ie one interaction with leaf j followed by two with leaf k)

Within each of these classes is 8 motifs depending on the direction of the first to the last edge -- incoming "I" or outgoing "O". These are enumerated in the order III, IIO, IOI, IOO, OII, OIO, OOI, OOO (like binary with "I"-0 and "O"-1).

Two node motifs:

Also included are two node motifs, of which there are 8 when counted from the perspective of each node. These are characterised by the direction of each edge, enumerated in the above order. Note that for the global graph counts, each motif is counted in both directions (a single III motif for one node is an OOO motif for the other node).

Triangles:

There are 8 triangle motifs:

  1. i --> j, k --> j, i --> k
  2. i --> j, k --> j, k --> i
  3. i --> j, j --> k, i --> k
  4. i --> j, j --> k, k --> i
  5. i --> j, k --> i, j --> k
  6. i --> j, k --> i, k --> j
  7. i --> j, i --> k, j --> k
  8. i --> j, i --> k, k --> j

Parameters:

Name Type Description Default
graph GraphView

A directed raphtory graph

required
delta int

Maximum time difference between the first and last edge of the motif. NB if time for edges was given as a UNIX epoch, this should be given in seconds, otherwise milliseconds should be used (if edge times were given as string)

required
threads int

The number of threads to use when running the algorithm.

None

Returns:

Type Description
list[int]

A 40 dimensional array with the counts of each motif, given in the same order as described above. Note that the two-node motif counts are symmetrical so it may be more useful just to consider the first four elements.

Notes

This is achieved by calling the local motif counting algorithm, summing the resulting arrays and dealing with overcounted motifs: the triangles (by dividing each motif count by three) and two-node motifs (dividing by two).

global_temporal_three_node_motif_multi(graph, deltas, threads=None) #

Computes the global counts of three-edge up-to-three node temporal motifs for a range of timescales. See global_temporal_three_node_motif for an interpretation of each row returned.

Parameters:

Name Type Description Default
graph GraphView

A directed raphtory graph

required
deltas list[int]

A list of delta values to use.

required
threads int

The number of threads to use.

None

Returns:

Type Description
list[list[int]]

A list of 40d arrays, each array is the motif count for a particular value of delta, returned in the order that the deltas were given as input.

local_temporal_three_node_motifs(graph, delta, threads=None) #

Computes the number of each type of motif that each node participates in. See global_temporal_three_node_motifs for a summary of the motifs involved.

Parameters:

Name Type Description Default
graph GraphView

A directed raphtory graph

required
delta int

Maximum time difference between the first and last edge of the motif. NB if time for edges was given as a UNIX epoch, this should be given in seconds, otherwise milliseconds should be used (if edge times were given as string)

required

Returns:

Type Description
NodeStateMotifs

A mapping from nodes to lists of motif counts (40 counts in the same order as the global motif counts) with the number of each motif that node participates in.

Notes

For this local count, a node is counted as participating in a motif in the following way. For star motifs, only the centre node counts

the motif. For two node motifs, both constituent nodes count the motif. For triangles, all three constituent nodes count the motif.

hits(graph, iter_count=20, threads=None) #

HITS (Hubs and Authority) Algorithm:

AuthScore of a node (A) = Sum of HubScore of all nodes pointing at node (A) from previous iteration / Sum of HubScore of all nodes in the current iteration

HubScore of a node (A) = Sum of AuthScore of all nodes pointing away from node (A) from previous iteration / Sum of AuthScore of all nodes in the current iteration

Parameters:

Name Type Description Default
graph GraphView

Graph to run the algorithm on

required
iter_count int

How many iterations to run the algorithm. Defaults to 20.

20
threads int

Number of threads to use

None

Returns:

Type Description
NodeStateHits

A mapping from nodes their hub and authority scores

balance(graph, name='weight', direction='both') #

Sums the weights of edges in the graph based on the specified direction.

This function computes the sum of edge weights based on the direction provided, and can be executed in parallel using a given number of threads.

Parameters:

Name Type Description Default
graph GraphView

The graph view on which the operation is to be performed.

required
name str

The name of the edge property used as the weight. Defaults to "weight".

'weight'
direction Direction

Specifies the direction of the edges to be considered for summation. Defaults to "both". * "out": Only consider outgoing edges. * "in": Only consider incoming edges. * "both": Consider both outgoing and incoming edges. This is the default.

'both'

Returns:

Type Description
NodeStateF64

Mapping of nodes to the computed sum of their associated edge weights.

label_propagation(graph, seed=None) #

Computes components using a label propagation algorithm

Parameters:

Name Type Description Default
graph GraphView

A reference to the graph

required
seed bytes

Array of 32 bytes of u8 which is set as the rng seed

None

Returns:

Type Description
list[set[Node]]

A list of sets each containing nodes that have been grouped

k_core(graph, k, iter_count, threads=None) #

Determines which nodes are in the k-core for a given value of k

Parameters:

Name Type Description Default
graph GraphView

A reference to the graph

required
k int

Value of k such that the returned nodes have degree > k (recursively)

required
iter_count int

The number of iterations to run

required
threads int

number of threads to run on

None

Returns:

Type Description
list[Node]

A list of nodes in the k core

temporal_SEIR(graph, seeds, infection_prob, initial_infection, recovery_rate=None, incubation_rate=None, rng_seed=None) #

Simulate an SEIR dynamic on the network

The algorithm uses the event-based sampling strategy from https://doi.org/10.1371/journal.pone.0246961

Parameters:

Name Type Description Default
graph GraphView

the graph view

required
seeds int | float | list[NodeInput]

the seeding strategy to use for the initial infection (if int, choose fixed number of nodes at random, if float infect each node with this probability, if list initially infect the specified nodes

required
infection_prob float

the probability for a contact between infected and susceptible nodes to lead to a transmission

required
initial_infection int | str | datetime

the time of the initial infection

required
recovery_rate float | None

optional recovery rate (if None, simulates SEI dynamic where nodes never recover) the actual recovery time is sampled from an exponential distribution with this rate

None
incubation_rate float | None

optional incubation rate (if None, simulates SI or SIR dynamics where infected nodes are infectious at the next time step) the actual incubation time is sampled from an exponential distribution with this rate

None
rng_seed int | None

optional seed for the random number generator

None

Returns:

Type Description
NodeStateSEIR

Mapping from nodes to Infected objects for each infected node with attributes infected: the time stamp of the infection event active: the time stamp at which the node actively starts spreading the infection (i.e., the end of the incubation period) recovered: the time stamp at which the node recovered (i.e., stopped spreading the infection)

louvain(graph, resolution=1.0, weight_prop=None, tol=None) #

Louvain algorithm for community detection

Parameters:

Name Type Description Default
graph GraphView

the graph view

required
resolution float

the resolution parameter for modularity. Defaults to 1.0.

1.0
weight_prop str | None

the edge property to use for weights (has to be float)

None
tol None | float

the floating point tolerance for deciding if improvements are significant (default: 1e-8)

None

Returns:

Type Description
NodeStateUsize

Mapping of nodes to their community assignment

fruchterman_reingold(graph, iterations=100, scale=1.0, node_start_size=1.0, cooloff_factor=0.95, dt=0.1) #

Fruchterman Reingold layout algorithm

Parameters:

Name Type Description Default
graph GraphView

the graph view

required
iterations int | None

the number of iterations to run. Defaults to 100.

100
scale float | None

the scale to apply. Defaults to 1.0.

1.0
node_start_size float | None

the start node size to assign random positions. Defaults to 1.0.

1.0
cooloff_factor float | None

the cool off factor for the algorithm. Defaults to 0.95.

0.95
dt float | None

the time increment between iterations. Defaults to 0.1.

0.1

Returns:

Type Description
NodeLayout

A mapping from nodes to their [x, y] positions

cohesive_fruchterman_reingold(graph, iter_count=100, scale=1.0, node_start_size=1.0, cooloff_factor=0.95, dt=0.1) #

Cohesive version of fruchterman_reingold that adds virtual edges between isolated nodes Arguments: graph (GraphView): A reference to the graph iter_count (int): The number of iterations to run. Defaults to 100. scale (float): Global scaling factor to control the overall spread of the graph. Defaults to 1.0. node_start_size (float): Initial size or movement range for nodes. Defaults to 1.0. cooloff_factor (float): Factor to reduce node movement in later iterations, helping stabilize the layout. Defaults to 0.95. dt (float): Time step or movement factor in each iteration. Defaults to 0.1.

Returns:

Type Description
NodeLayout

A mapping from nodes to their [x, y] positions

max_weight_matching(graph, weight_prop=None, max_cardinality=True, verify_optimum_flag=False) #

Compute a maximum-weighted matching in the general undirected weighted graph given by "edges". If max_cardinality is true, only maximum-cardinality matchings are considered as solutions.

The algorithm is based on "Efficient Algorithms for Finding Maximum Matching in Graphs" by Zvi Galil, ACM Computing Surveys, 1986.

Based on networkx implementation https://github.com/networkx/networkx/blob/3351206a3ce5b3a39bb2fc451e93ef545b96c95b/networkx/algorithms/matching.py

With reference to the standalone protoype implementation from: http://jorisvr.nl/article/maximum-matching

http://jorisvr.nl/files/graphmatching/20130407/mwmatching.py

The function takes time O(n**3)

Parameters:

Name Type Description Default
graph GraphView

The graph to compute the maximum weight matching for

required
weight_prop str

The property on the edge to use for the weight. If not provided,

None
max_cardinality bool

If set to True, consider only maximum-cardinality matchings. Defaults to True. If True, finds the maximum-cardinality matching with maximum weight among all maximum-cardinality matchings, otherwise, finds the maximum weight matching irrespective of cardinality.

True
verify_optimum_flag bool

Whether the optimum should be verified. Defaults to False. If true prior to returning, an additional routine to verify the optimal solution was found will be run after computing the maximum weight matching. If it's true and the found matching is not an optimal solution this function will panic. This option should normally be only set true during testing.

False

Returns:

Type Description
Matching

The matching