Module Documentation

directedstructure: Infer communities, hierarchies, and their connection in directed graphs

directedstructure.network_properties(G, **kwargs)

Infer properties of the overall network structure. Returns a pandas DataFrame with the inferred parameters and their uncertainties, including those fixed by the model specification.

These are:

num_groups: number of groups in the network
density_in: average density of edges within groups
density_out: average density of edges between groups
variation_in: variation in the internal mixing weights of nodes within groups
variation_out: variation in the external mixing weights of nodes within groups
degree_correction: level of degree correction, or in-group degree inequality
mean_degree_scaling: scaling of mean degree with group size, where mean degree of nodes in a group of size n scales as n^mean_degree_scaling
individual_depth: spread of individual scores within groups
group_depth: spread of group scores
ties_parameter: frequency of ties in the hierarchy model
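To make the mean_degree_scaling entry above concrete, the short sketch below evaluates the stated scaling n^mean_degree_scaling for a few group sizes. The helper name relative_mean_degree is hypothetical and not part of the package, and the result is only meaningful up to a constant factor, since the documentation specifies the scaling rather than the prefactor.

```python
def relative_mean_degree(group_size: int, mean_degree_scaling: float) -> float:
    """Relative mean degree of nodes in a group of size n: n ** gamma.

    Hypothetical helper for illustration; the constant prefactor is omitted.
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return float(group_size) ** mean_degree_scaling


# gamma = 0: mean degree does not depend on group size.
# gamma = -1: mean degree shrinks in inverse proportion to group size.
for gamma in (0.0, -1.0):
    values = [relative_mean_degree(n, gamma) for n in (10, 100, 1000)]
    print(f"gamma={gamma}: {values}")
```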

Parameters:

G (nx.Graph) – NetworkX graph whose group structure is to be inferred.
kwargs (optional) – Additional keyword arguments to specify the model configuration or sampling. See samples() for full options.

Return type:

DataFrame

directedstructure.node_properties(G, consensus_clustering_metric='L1', **kwargs)

Infer properties of individual nodes in the network. Returns a pandas DataFrame with the consensus group of each node, its score within the hierarchy, and the average status of its group.

Parameters:

G (nx.Graph) – NetworkX graph whose group structure is to be inferred.
consensus_clustering_metric (str, optional) – Metric to use for finding consensus clustering. Options: {‘L1’, ‘L2’}. Defaults to ‘L1’.
kwargs (optional) – Additional keyword arguments to specify the model configuration or sampling. See samples() for full options.

Return type:

DataFrame
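A possible way to call the two functions documented so far is sketched below. It assumes the package is importable as directedstructure, that a directed nx.DiGraph is an acceptable input (it is a subclass of nx.Graph), and that seed and num_samples are forwarded to samples() through **kwargs as described. The helper name summarize and the edge list are hypothetical, and the meaning of edge direction (for example, who dominates whom) is not specified here.

```python
def summarize(edges, seed=0):
    """Return (network_table, node_table) for a directed edge list.

    Hypothetical convenience wrapper, not part of the package.
    """
    # Imports are deferred so that defining this helper does not require
    # either package to be installed.
    import networkx as nx
    import directedstructure as ds  # assumed import name

    G = nx.DiGraph()
    G.add_edges_from(edges)

    # Keyword arguments other than consensus_clustering_metric are passed
    # on to samples(); see that function for the full list.
    network_table = ds.network_properties(G, seed=seed, num_samples=500)
    node_table = ds.node_properties(
        G, consensus_clustering_metric="L2", seed=seed, num_samples=500
    )
    return network_table, node_table


# Example call (requires networkx and directedstructure to be installed):
# network_table, node_table = summarize([("a", "b"), ("b", "c"), ("a", "c")])
```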

directedstructure.samples(G, *, groups_model='general_canonical', hierarchy_model=<object object>, interaction='coupled', assortative=<object object>, mixing_variation=<object object>, variation_in=<object object>, variation_out=<object object>, degree_correction=<object object>, mean_degree_scaling=<object object>, individual_depth=<object object>, group_depth=<object object>, ties_parameter=<object object>, num_groups=<object object>, initial_partition=None, initial_scores=None, seed=None, num_samples=1000, sweeps_per_sample=10, merge_split_enabled=True, beta=1.0, num_tempering_chains=1, no_cache=False, timeout=60.0, verbose=False)

Returns a pandas DataFrame of samples from the posterior distribution given the provided network. Cached samples are returned if the function has been called before with the same parameters; provide a new seed to generate independent samples.
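The caching behavior described above can be pictured with a small stand-in. The sketch below is not the package's implementation: the function cached_draws, its cache key, and the use of random numbers in place of MCMC samples are all assumptions made purely to illustrate why repeating a call returns the same samples while changing the seed (or setting no_cache) produces new ones.

```python
import random

_cache = {}  # maps a tuple of call parameters to previously drawn samples


def cached_draws(graph_key, *, seed=None, num_samples=5, no_cache=False):
    """Stand-in sampler: same parameters -> same cached result."""
    key = (graph_key, seed, num_samples)
    if not no_cache and key in _cache:
        return _cache[key]
    rng = random.Random(seed)
    draws = [rng.random() for _ in range(num_samples)]
    _cache[key] = draws
    return draws


first = cached_draws("toy-graph", seed=1)
repeat = cached_draws("toy-graph", seed=1)   # served from the cache
fresh = cached_draws("toy-graph", seed=2)    # new seed, independent draws
print(first is repeat, first == fresh)
```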

Parameters:

G (nx.Graph) – NetworkX graph whose group structure is to be inferred.
groups_model (str, optional) –
Type of model to fit. Each choice fixes parameters to a special case of interest; any of them can be overridden by providing specific parameter values. Options and corresponding parameter values:
- ‘general_canonical’: All parameters inferred except mean_degree_scaling, which is set to 0 (canonical model; default)
- ‘traditional_SBM’: assortative = False, mixing_variation = ‘simple’, degree_correction = 0, mean_degree_scaling = 0
- ‘traditional_DCSBM’: assortative = False, mixing_variation = ‘simple’, degree_correction = 0.5, mean_degree_scaling = 0 (traditional degree-corrected stochastic block model)
- ‘traditional_GDCSBM’: assortative = False, mixing_variation = ‘simple’, degree_correction = None, mean_degree_scaling = 0 (infer degree correction, generalized)
- ‘simple_ASBM’: assortative = True, mixing_variation = ‘simple’, degree_correction = 0, mean_degree_scaling = 0 (assortative with simple mixing variation)
- ‘hybrid_ASBM’: assortative = True, mixing_variation = ‘internal’, degree_correction = 0, mean_degree_scaling = 0 (assortative with internal mixing variation, Zhang and Peixoto 2020)
- ‘planted_partition’: assortative = True, mixing_variation = ‘none’, degree_correction = 0, mean_degree_scaling = 0
- ‘general_ASBM’: assortative = True, mixing_variation = ‘none’, degree_correction = 0, mean_degree_scaling = 0
- ‘microcanonical_SBM’: assortative = False, mixing_variation = ‘simple’, degree_correction = 0, mean_degree_scaling = -1 (microcanonical SBM)
- ‘microcanonical_DCSBM’: assortative = False, mixing_variation = ‘simple’, degree_correction = 0.5, mean_degree_scaling = -1 (microcanonical degree-corrected SBM)
- ‘general_unified’: All parameters inferred including mean_degree_scaling (generalizes all models considered)
Defaults to ‘general_canonical’. If a different groups_model is provided, the corresponding parameters are set unless explicitly overridden.
hierarchy_model (str, optional) –
Type of hierarchy model to fit. Options:
- ‘bradley_terry’: Bradley-Terry model for pairwise comparisons
- ‘bradley_terry_ties’: Bradley-Terry model with ties
Defaults to ‘bradley_terry_ties’ when neutral interactions are present and ‘bradley_terry’ otherwise. If neutral interactions are present and ‘bradley_terry’ is chosen, only dominant interactions are used.
interaction (str, optional) –
Type of coupling between groups and hierarchy to use. Options:
- ‘coupled’: groups and hierarchy are coupled (default)
- ‘independent’: groups and hierarchy are inferred independently
assortative (bool, optional) – Whether to allow the model to consider generically assortative group structure. If assortative = False, the overall within-group and between-group densities are fixed to be equal: rho_in = rho_out. Defaults to True.
mixing_variation (str, optional) –
Type of variation in mixing weights to use. Controls variation_in, variation_out parameters. Options:
- ‘general’: infers variation_in and variation_out
- ‘simple’: fixes variation_in = variation_out = 0.5
- ‘none’: fixes variation_in = variation_out = 0
- ‘internal’: fixes variation_in = 0.5, variation_out = 0
- ‘external’: fixes variation_in = 0, variation_out = 0.5
Defaults to ‘general’.
variation_in (float, optional) – Value of the variation_in (intra-group mixing variation) parameter. Must be from 0 to 1. variation_in = 0 yields no variation in internal mixing weights, as found in the planted partition model, while variation_in = 0.5 corresponds to typical variation as found in the traditional SBM. A value of None allows the parameter to vary freely. Defaults to None.
variation_out (float, optional) – Value of the variation_out (inter-group mixing variation) parameter. Must be from 0 to 1. variation_out = 0 yields no variation in external mixing weights, as found in the planted partition model, while variation_out = 0.5 corresponds to typical variation as found in the traditional SBM. A value of None allows the parameter to vary freely. Defaults to None.
degree_correction (float, optional) – Value of the degree_correction (in-group degree inequality) parameter. Must be from 0 to 1. A value of 0 corresponds to no degree correction and a value of 0.5 corresponds to typical degree correction. A value of None allows the parameter to vary freely. Defaults to None.
mean_degree_scaling (float, optional) – Value of the mean_degree_scaling (gamma) parameter. Mean degree of nodes in a group of size n scales as n^mean_degree_scaling. Traditional canonical models have gamma = 0, while microcanonical models have gamma = -1. A value of None allows the parameter to vary freely. Defaults to 0.
individual_depth (float, optional) – Value of the individual_depth parameter controlling the spread of individual scores within groups. Must be positive. A value of None allows the parameter to vary freely. Defaults to 1.0.
group_depth (float, optional) – Value of the group_depth parameter controlling the spread of group scores. Must be non-negative. A value of None allows the parameter to vary freely. Defaults to 0.0, which corresponds to independent groups and hierarchy.
ties_parameter (float, optional) – Value of the ties_parameter controlling the frequency of ties in the hierarchy model. A value of None allows the parameter to vary freely. Defaults to 0.0 (no ties, Bradley-Terry model).
num_groups (int, optional) – Number of communities (q) in the graph. If None, the number of communities is inferred from the data. Defaults to None.
initial_partition (List[int], optional) – Initial partition of the graph. If None, a modularity maximized partition is used. Defaults to None.
initial_scores (List[float], optional) – Initial scores of the nodes for the hierarchy model. If None, win percentages are used. Defaults to None.
seed (int, optional) – Random seed for reproducibility. Defaults to None.
num_samples (int, optional) – Number of samples to take. Defaults to 1000.
sweeps_per_sample (int, optional) – Number of MCMC sweeps to perform between each sample. Defaults to 10.
merge_split_enabled (bool, optional) – Whether to use merge-split moves in the MCMC sampling. Defaults to True.
beta (float, optional) – Inverse temperature. Defaults to 1.0.
num_tempering_chains (int, optional) – Number of parallel tempering chains to use. Defaults to 1 (no parallel tempering). If greater than 1, each sample records its value of beta, which ranges from 0 to the provided beta.
no_cache (bool, optional) – If True, do not use cached samples. Defaults to False.
timeout (float, optional) – Maximum time in seconds to spend sampling before timing out. Defaults to 60.0.
verbose (bool, optional) – Whether to print progress and debug information. Defaults to False.

Returns:

DataFrame of posterior samples. Each row is a sample, and each column is a variable.

Return type:

pd.DataFrame
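Because each column of the returned DataFrame holds posterior draws of one variable, a common next step is to reduce a column to a point estimate and an interval. The sketch below does this with the standard library on a plain list standing in for one column; the helper summarize_draws is hypothetical, and the column names of the real DataFrame are not specified here, so something like samples_df["density_in"].tolist() is an assumption.

```python
import statistics


def summarize_draws(draws):
    """Posterior mean and central 95% interval of a sequence of draws."""
    # n=40 yields cut points at 2.5%, 5%, ..., 97.5%; keep the outer two.
    cuts = statistics.quantiles(draws, n=40, method="inclusive")
    return {
        "mean": statistics.fmean(draws),
        "lower": cuts[0],
        "upper": cuts[-1],
    }


# Stand-in data: 101 evenly spaced values between 0 and 1.
stand_in = [i / 100 for i in range(101)]
print(summarize_draws(stand_in))
```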