Module Documentation

Copyright (c) 2025 Max Jerdee. All rights reserved.

directedstructure: Infer communities, hierarchies, and their connection in directed graphs

directedstructure.network_properties(G, **kwargs)

Infer properties of the overall network structure. Returns a pandas DataFrame with the inferred parameters and their uncertainties, including those fixed by the model specification.

These are:
  • num_groups: number of groups in the network

  • density_in: average density of edges within groups

  • density_out: average density of edges between groups

  • variation_in: variation in the internal mixing weights of nodes within groups

  • variation_out: variation in the external mixing weights of nodes within groups

  • degree_correction: level of degree correction, or in-group degree inequality

  • mean_degree_scaling: scaling of mean degree with group size, where mean degree of nodes in a group of size n scales as n^mean_degree_scaling

  • individual_depth: spread of individual scores within groups

  • group_depth: spread of group scores

  • ties_parameter: frequency of ties in the hierarchy model
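The mean_degree_scaling entry above is a power law in group size, which a couple of numbers make concrete. This is an illustrative sketch only: the helper name relative_mean_degree and the normalization by a reference group size are assumptions and are not part of directedstructure.

```python
def relative_mean_degree(n, mean_degree_scaling, reference_size=1):
    """Mean degree in a group of size n, relative to a group of reference_size.

    Uses only the stated power law: mean degree ~ n ** mean_degree_scaling.
    The unknown proportionality constant cancels in the ratio.
    """
    return (n / reference_size) ** mean_degree_scaling


# Exponent 0: mean degree does not depend on group size.
flat = relative_mean_degree(100, 0.0)
# Exponent -1: mean degree falls off as 1/n.
inverse = relative_mean_degree(100, -1.0)
print(flat, inverse)
```

With an exponent of 0 a group of 100 nodes has the same mean degree as a group of one; with an exponent of -1 it has one hundredth of it.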

Parameters:
  • G (nx.Graph) – NetworkX graph whose group structure is to be inferred.

  • kwargs (optional) – Additional keyword arguments to specify the model configuration or sampling. See samples() for full options.

Return type:

DataFrame
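A possible call pattern for network_properties is sketched below. It uses only the function name and keyword arguments documented on this page; the toy graph, the chosen argument values, and the assumption that a directed nx.DiGraph is an acceptable input are illustrative and unverified. The call runs only if both packages are installed.

```python
import importlib.util

# Keyword arguments documented for samples(); the values are illustrative.
config = {"groups_model": "planted_partition", "num_samples": 200, "seed": 0}

# Guard the call so the sketch is harmless where the packages are missing.
have_packages = all(
    importlib.util.find_spec(name) is not None
    for name in ("networkx", "directedstructure")
)

if have_packages:
    import networkx as nx
    import directedstructure

    # Toy directed graph (an assumption): two 3-cycles joined by one edge.
    G = nx.DiGraph([(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3), (2, 3)])
    properties = directedstructure.network_properties(G, **config)
    print(properties)
```

The column layout of the returned DataFrame is not specified here, so the sketch prints the table rather than indexing into it.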

directedstructure.node_properties(G, consensus_clustering_metric='L1', **kwargs)

Infer properties of individual nodes in the network. Returns a pandas DataFrame with the consensus group of each node, its score within the hierarchy, and the average status of its group.

Parameters:
  • G (nx.Graph) – NetworkX graph whose group structure is to be inferred.

  • consensus_clustering_metric (str, optional) – Metric to use for finding consensus clustering. Options: {‘L1’, ‘L2’}. Defaults to ‘L1’.

  • kwargs (optional) – Additional keyword arguments to specify the model configuration or sampling. See samples() for full options.

Return type:

DataFrame
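To give the hierarchy scores some intuition, the sketch below evaluates the standard logistic Bradley-Terry win probability for two scores. It is the textbook form of the model, not code taken from directedstructure; the package may scale or parameterize scores differently, and the function name is hypothetical.

```python
import math


def bradley_terry_win_probability(score_i, score_j):
    """Probability that i beats j under the standard logistic Bradley-Terry form."""
    return 1.0 / (1.0 + math.exp(-(score_i - score_j)))


# Equal scores give an even contest.
equal = bradley_terry_win_probability(0.0, 0.0)
# A higher score makes a win more likely than not.
favored = bradley_terry_win_probability(1.0, 0.0)
print(equal, favored)
```

Only the difference between the two scores matters in this form, so adding a constant to every score leaves all win probabilities unchanged.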

directedstructure.samples(G, *, groups_model='general_canonical', hierarchy_model=<object object>, interaction='coupled', assortative=<object object>, mixing_variation=<object object>, variation_in=<object object>, variation_out=<object object>, degree_correction=<object object>, mean_degree_scaling=<object object>, individual_depth=<object object>, group_depth=<object object>, ties_parameter=<object object>, num_groups=<object object>, initial_partition=None, initial_scores=None, seed=None, num_samples=1000, sweeps_per_sample=10, merge_split_enabled=True, beta=1.0, num_tempering_chains=1, no_cache=False, timeout=60.0, verbose=False)

Get pandas DataFrame of samples from the posterior distribution given the provided network. Returns cached samples if the function has been called before with the same parameters. Provide a new seed to generate independent samples.
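The caching behavior described above can be pictured with a small stand-in. This is a sketch under assumptions: the in-memory dictionary, the cache key, and the helper names are hypothetical, and how the package actually stores or keys its cache is not documented here.

```python
import random

_cache = {}  # hypothetical in-memory cache keyed by the call arguments


def draw(seed, num_samples):
    """Stand-in for posterior sampling: seeded pseudo-random numbers."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(num_samples)]


def cached_draw(seed=None, num_samples=1000, no_cache=False):
    """Return cached draws for repeated arguments unless no_cache is set."""
    if no_cache:
        return draw(seed, num_samples)
    key = (seed, num_samples)
    if key not in _cache:
        _cache[key] = draw(seed, num_samples)
    return _cache[key]


first = cached_draw(seed=1, num_samples=5)
repeat = cached_draw(seed=1, num_samples=5)  # same arguments: served from the cache
fresh = cached_draw(seed=2, num_samples=5)   # new seed: independent draws
```

Repeating a call with identical arguments returns the stored result, while changing the seed produces a new set of draws.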

Parameters:
  • G (nx.Graph) – NetworkX graph whose group structure is to be inferred.

  • groups_model (str, optional) –

    Type of model to fit. Choices fix parameters to special cases of interest, but can be overridden by providing specific parameter values. Options and corresponding parameter values:

    • ‘general_canonical’: All parameters inferred except mean_degree_scaling, which is set to 0 (canonical model) [Default]

    • ‘traditional_SBM’: assortative = False, mixing_variation = ‘simple’, degree_correction = 0, mean_degree_scaling = 0

    • ‘traditional_DCSBM’: assortative = False, mixing_variation = ‘simple’, degree_correction = 0.5, mean_degree_scaling = 0 (traditional degree-corrected stochastic block model)

    • ‘traditional_GDCSBM’: assortative = False, mixing_variation = ‘simple’, degree_correction = None, mean_degree_scaling = 0 (infer degree correction, generalized)

    • ‘simple_ASBM’: assortative = True, mixing_variation = ‘simple’, degree_correction = 0, mean_degree_scaling = 0 (assortative with simple mixing variation)

    • ‘hybrid_ASBM’: assortative = True, mixing_variation = ‘internal’, degree_correction = 0, mean_degree_scaling = 0 (assortative with internal mixing variation, Zhang and Peixoto 2020)

    • ‘planted_partition’: assortative = True, mixing_variation = ‘none’, degree_correction = 0, mean_degree_scaling = 0

    • ‘general_ASBM’: assortative = True, mixing_variation = ‘none’, degree_correction = 0, mean_degree_scaling = 0

    • ‘microcanonical_SBM’: assortative = False, mixing_variation = ‘simple’, degree_correction = 0, mean_degree_scaling = -1 (microcanonical SBM)

    • ‘microcanonical_DCSBM’: assortative = False, mixing_variation = ‘simple’, degree_correction = 0.5, mean_degree_scaling = -1 (microcanonical degree-corrected SBM)

    • ‘general_unified’: All parameters inferred including mean_degree_scaling (generalizes all models considered)

    Defaults to ‘general_canonical’. If a different groups_model is provided, the corresponding parameters are set unless explicitly overridden.

  • hierarchy_model (str, optional) –

    Type of hierarchy model to fit. Options:
    • ‘bradley_terry’: Bradley-Terry model for pairwise comparisons

    • ‘bradley_terry_ties’: Bradley-Terry model with ties

    Defaults to ‘bradley_terry_ties’ when neutral interactions are present and ‘bradley_terry’ otherwise. If neutral interactions are present and ‘bradley_terry’ is chosen, only dominant interactions are used.

  • interaction (str, optional) –

    Type of coupling between groups and hierarchy to use. Options:
    • ‘coupled’: groups and hierarchy are coupled (default)

    • ‘independent’: groups and hierarchy are inferred independently

  • assortative (bool, optional) – Whether to allow the model to consider generically assortative group structure. If assortative = False, the overall in-group and out-group densities are fixed to be equal: rho_in = rho_out. Defaults to True.

  • mixing_variation (str, optional) –

    Type of variation in mixing weights to use. Controls variation_in, variation_out parameters. Options:

    • ‘general’: infers variation_in and variation_out

    • ‘simple’: fixes variation_in = variation_out = 0.5

    • ‘none’: fixes variation_in = variation_out = 0

    • ‘internal’: fixes variation_in = 0.5, variation_out = 0

    • ‘external’: fixes variation_in = 0, variation_out = 0.5

    Defaults to ‘general’.

  • variation_in (float, optional) – Value of the variation_in (intra-group mixing variation) parameter. Must be from 0 to 1. variation_in = 0 yields no variation in internal mixing weights, as found in the planted partition model, while variation_in = 0.5 corresponds to typical variation as found in the traditional SBM. A value of None allows the parameter to vary freely. Defaults to None.

  • variation_out (float, optional) – Value of the variation_out (inter-group mixing variation) parameter. Must be from 0 to 1. variation_out = 0 yields no variation in external mixing weights, as found in the planted partition model, while variation_out = 0.5 corresponds to typical variation as found in the traditional SBM. A value of None allows the parameter to vary freely. Defaults to None.

  • degree_correction (float, optional) – Value of the degree_correction (in-group degree inequality) parameter. Must be from 0 to 1. A value of 0 corresponds to no degree correction and a value of 0.5 corresponds to typical degree correction. A value of None allows the parameter to vary freely. Defaults to None.

  • mean_degree_scaling (float, optional) – Value of the mean_degree_scaling (gamma) parameter. Mean degree of nodes in a group of size n scales as n^mean_degree_scaling. Traditional canonical models have gamma = 0, while microcanonical models have gamma = -1. A value of None allows the parameter to vary freely. Defaults to 0.

  • individual_depth (float, optional) – Value of the individual_depth parameter controlling the spread of individual scores within groups. Must be positive. A value of None allows the parameter to vary freely. Defaults to 1.0.

  • group_depth (float, optional) – Value of the group_depth parameter controlling the spread of group scores. Must be non-negative. A value of None allows the parameter to vary freely. Defaults to 0.0, which corresponds to independent groups and hierarchy.

  • ties_parameter (float, optional) – Value of the ties_parameter controlling the frequency of ties in the hierarchy model. A value of None allows the parameter to vary freely. Defaults to 0.0 (no ties, Bradley-Terry model).

  • num_groups (int, optional) – Number of communities (q) in the graph. If None, the number of communities is inferred from the data. Defaults to None.

  • initial_partition (List[int], optional) – Initial partition of the graph. If None, a modularity-maximizing partition is used. Defaults to None.

  • initial_scores (List[float], optional) – Initial scores of the nodes for the hierarchy model. If None, win percentages are used. Defaults to None.

  • seed (int, optional) – Random seed for reproducibility. Defaults to None.

  • num_samples (int, optional) – Number of samples to take. Defaults to 1000.

  • sweeps_per_sample (int, optional) – Number of MCMC sweeps to perform between each sample. Defaults to 10.

  • merge_split_enabled (bool, optional) – Whether to use merge-split moves in the MCMC sampling. Defaults to True.

  • beta (float, optional) – Inverse temperature. Defaults to 1.0.

  • num_tempering_chains (int, optional) – Number of parallel tempering chains to use. Defaults to 1 (no parallel tempering). If greater than 1, each sample records its value of beta, which ranges from 0 to the provided beta.

  • no_cache (bool, optional) – If True, do not use cached samples. Defaults to False.

  • timeout (float, optional) – Maximum time in seconds to spend sampling before timing out. Defaults to 60.

  • verbose (bool, optional) – Whether to print progress and debug information. Defaults to False.

Returns:

DataFrame of posterior samples. Each row is a sample, and each column is a variable.

Return type:

pd.DataFrame
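The interplay of num_samples, sweeps_per_sample, and timeout can be illustrated with a generic thinning loop. With the defaults, 1000 samples at 10 sweeps each amount to roughly 10,000 sweeps in total. The sketch below is a stand-in, not the package's sampler: the function name, the toy random-walk "sweep", and the choice to return the samples collected so far on timeout are all assumptions, since the behavior on timeout is not stated above.

```python
import random
import time


def thinned_samples(step, state, num_samples=1000, sweeps_per_sample=10, timeout=60.0):
    """Record one sample after every sweeps_per_sample sweeps, stopping at timeout."""
    deadline = time.monotonic() + timeout
    samples = []
    while len(samples) < num_samples and time.monotonic() < deadline:
        for _ in range(sweeps_per_sample):
            state = step(state)
        samples.append(state)
    return samples


rng = random.Random(0)
# Toy "sweep": one random-walk step standing in for a full MCMC sweep.
walk = thinned_samples(lambda x: x + rng.choice((-1, 1)), 0,
                       num_samples=50, sweeps_per_sample=10)
print(len(walk))
```

Each recorded value corresponds to one row of the returned DataFrame in the description above; the sweeps in between are discarded to reduce correlation between consecutive samples.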