upxo.analysis.analysis2d module

class upxo.analysis.analysis2d.metaa(pixel_size: float = 1.0, unit: str = 'px', frame_dt: float | None = None, material: str = '', sample_id: str = '', source: str = '', notes: str = '', seed: int = 42)[source]

Bases: object

pixel_size: float = 1.0
unit: str = 'px'
frame_dt: float | None = None
material: str = ''
sample_id: str = ''
source: str = ''
notes: str = ''
seed: int = 42
class upxo.analysis.analysis2d.principle_component_analysis[source]

Bases: object

class upxo.analysis.analysis2d.kmodel(G)[source]

Bases: object

G
gprop
characterize_graph(printout=True, k_char_level=None)[source]

Compute and optionally print summary graph characteristics, with optional distance-based metrics for smaller, connected graphs.

This method populates the self.gprop dictionary with core structural properties of the graph (nodes, edges, density, clustering, assortativity). If k_char_level is set to 'full' or 'advanced', and the graph is both small (< 5000 nodes) and connected, it additionally computes eccentricity-derived metrics (radius, diameter, center, periphery).

Parameters:
  • printout (bool, default True) – If True, prints the computed metrics to stdout.

  • k_char_level (str or None, default None) – Controls whether to compute additional distance-based metrics.

    • None (default): Compute only core properties.

    • 'full' or 'advanced': Also compute eccentricity, radius, diameter, center, and periphery, subject to graph size and connectivity.

Returns:

None. Results are stored in self.gprop and optionally printed.

self.gprop (dict) is populated or updated with the following keys:

  • 'num_nodes' : int

  • 'num_edges' : int

  • 'density' : float

  • 'avg_clustering_coeff' : float

  • 'degree_assortativity' : float

Additionally, when k_char_level is 'full' or 'advanced' and the conditions are met (|V| < 5000 and the graph is connected):

  • 'eccentricity' : dict[node, int]

  • 'radius' : int

  • 'diameter' : int

  • 'center' : list[node]

  • 'periphery' : list[node]

Notes

  • Advanced metrics are skipped for graphs with 5000 or more nodes to avoid excessive runtime.

  • Eccentricity-based metrics require the graph to be connected. If the graph is not connected, those metrics are skipped and a message is printed.

  • The method assumes self.G is a NetworkX graph object.
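The size and connectivity gating described for characterize_graph can be illustrated with a small pure-Python sketch. It is not the library's implementation: the function name characterize, the adjacency-dict input, and the max_nodes argument are assumptions made for illustration, and clustering and assortativity are left out for brevity.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distance from source to every reachable node (unweighted BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def characterize(adj, k_char_level=None, max_nodes=5000):
    """Hypothetical stand-in for characterize_graph on an adjacency dict."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2  # undirected edge count
    gprop = {
        "num_nodes": n,
        "num_edges": m,
        "density": 2 * m / (n * (n - 1)) if n > 1 else 0.0,
    }
    # Distance-based metrics only on request and only for small graphs.
    if k_char_level not in ("full", "advanced") or n == 0 or n >= max_nodes:
        return gprop
    all_dist = {u: bfs_distances(adj, u) for u in adj}
    if any(len(d) < n for d in all_dist.values()):
        return gprop  # disconnected: eccentricity is undefined, so skip
    ecc = {u: max(d.values()) for u, d in all_dist.items()}
    radius, diameter = min(ecc.values()), max(ecc.values())
    gprop.update(
        eccentricity=ecc,
        radius=radius,
        diameter=diameter,
        center=[u for u, e in ecc.items() if e == radius],
        periphery=[u for u, e in ecc.items() if e == diameter],
    )
    return gprop

# Path graph 0-1-2-3-4
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
core = characterize(path_graph)
full = characterize(path_graph, k_char_level="full")
```

On this five-node path, core holds only the three core keys, while full also records the middle node as the center and the two end nodes as the periphery.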

load_mprop(mprop_df)[source]

Load a morphological property DataFrame into the model.

Parameters:

mprop_df (pandas.DataFrame) – DataFrame whose rows correspond to grains and columns to morphological properties (e.g. area, aspect_ratio, eccentricity). Stored as self.mprop for downstream correlation and PCA methods.

summary()[source]

Print a one-line description of the graph topology and return key flags.

Returns:

{'directed': bool, 'multigraph': bool} indicating whether self.G is directed and/or a multigraph.

Return type:

dict

shortest_path_length(see_distribution=True, figsize=(3, 2), kde=True)[source]

Compute all-pairs shortest path lengths and optionally plot their distribution.

Iterates over every source node and records the shortest-path distance to every reachable node. Results are stored in self.pathlengths.

Parameters:
  • see_distribution (bool, default True) – If True, displays a histogram (with optional KDE) of path lengths.

  • figsize (tuple of float, default (3, 2)) – Width and height of the matplotlib figure in inches.

  • kde (bool, default True) – Overlay a kernel density estimate on the histogram when see_distribution is True.

Returns:

Flat list of all pairwise shortest path lengths collected across every source node.

Return type:

list of int
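As a rough illustration of what a flat list of all-pairs path lengths looks like, the sketch below runs one breadth-first search per source node on an adjacency dict. Whether the library stores zero-length self distances, or counts each pair once or twice, is not stated above; this sketch assumes self distances are excluded and each ordered pair is recorded.

```python
from collections import deque
from statistics import mean

def all_pairs_path_lengths(adj):
    """Flat list of hop distances, one entry per ordered (source, target) pair."""
    lengths = []
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        # Assumption: the zero distance from a node to itself is not recorded.
        lengths.extend(d for node, d in dist.items() if node != source)
    return lengths

# Four-node cycle 0-1-2-3-0
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
pathlengths = all_pairs_path_lengths(cycle)
avg_length = mean(pathlengths)
```

The mean of this list is the quantity that an average shortest path length summarises.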

shortest_path_between_two_nodes(source_node, target_node, weight='weight')[source]

Return the shortest path between two nodes (grain IDs) in the graph.

Uses Dijkstra’s algorithm via NetworkX, traversing grain-boundary edges weighted by weight.

Parameters:
  • source_node (int) – Starting grain ID (node) in self.G.

  • target_node (int) – Destination grain ID (node) in self.G.

  • weight (str, default 'weight') – Edge attribute to use as the path cost. Pass None to treat all edges as having unit weight.

Returns:

Ordered list of node IDs forming the shortest path from source_node to target_node, inclusive.

Return type:

list of int

Example

>>> path = kmod.shortest_path_between_two_nodes(source_node=1, target_node=10)
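The effect of the weight argument can be seen in a minimal pure-Python Dijkstra sketch. The function dijkstra_path and its use_weights flag are hypothetical names used here in place of passing weight=None; the library itself delegates to NetworkX.

```python
import heapq

def dijkstra_path(adj, source, target, use_weights=True):
    """Shortest path on a dict-of-dicts graph; unit costs if use_weights is False."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        if u == target:
            break
        for v, w in adj[u].items():
            nd = d + (w if use_weights else 1.0)
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        raise ValueError(f"no path from {source} to {target}")
    # Walk the predecessor chain back from the target.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

# Three grains: the direct 1-3 boundary is "expensive", the detour via 2 is cheap.
graph = {1: {2: 1.0, 3: 5.0}, 2: {1: 1.0, 3: 1.0}, 3: {1: 5.0, 2: 1.0}}
weighted_path = dijkstra_path(graph, 1, 3)
unweighted_path = dijkstra_path(graph, 1, 3, use_weights=False)
```

With weights the path detours through node 2; with unit costs the direct edge wins.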
average_shortest_path_length(recalulate=False)[source]

Return the mean of all pairwise shortest path lengths.

Re-uses self.pathlengths if already computed, unless recalulate is True.

Parameters:

recalulate (bool, default False) – Force recomputation of path lengths even if self.pathlengths already exists.

Returns:

Mean shortest path length across all node pairs.

Return type:

float

see_pathlength_distribution(recalulate=False, figsize=(3, 2), kde=True, throw_hist=False)[source]

Plot the distribution of all-pairs shortest path lengths.

Parameters:
  • recalulate (bool, default False) – Force recomputation of path lengths even if self.pathlengths already exists.

  • figsize (tuple of float, default (3, 2)) – Width and height of the matplotlib figure in inches.

  • kde (bool, default True) – Overlay a kernel density estimate on the histogram.

  • throw_hist (bool, default False) – If True, return the seaborn AxesSubplot object for further customisation; otherwise return None.

Returns:

The histogram axes object when throw_hist is True, else None.

Return type:

seaborn.axisgrid.FacetGrid or None

extract_subgraphs(method='connected_neighbors', nids=1, radii=5, include_central_node=True, treat_undirected=True, validate=True, see_on_map=False, upxo_gs_object=None, figsize=(6, 6), dpi=100, throw_plt_object=False)[source]

Extract one or more subgraphs from self.G using the chosen strategy.

Parameters:
  • method ({'connected_neighbors', 'largest_connected_component'}, default 'connected_neighbors') –

    Extraction strategy.

    • 'connected_neighbors' — ego-graph extraction: returns a neighbourhood subgraph of radius r centred on each node in nids.

    • 'largest_connected_component' — returns the single largest connected component of self.G.

  • nids (int or list of int, default 1) – Central node ID(s) for 'connected_neighbors'. Scalars are wrapped in a list automatically.

  • radii (int or list of int, default 5) – Neighbourhood radius for each entry in nids. A scalar is broadcast to all nodes.

  • include_central_node (bool, default True) – Whether to include the centre node in each ego subgraph.

  • treat_undirected (bool, default True) – Treat self.G as undirected during ego-graph extraction even if it is directed.

  • validate (bool, default True) – Raise ValueError for unrecognised method strings.

  • see_on_map (bool, default False) – Visualise each subgraph overlaid on the grain structure. Requires upxo_gs_object to be provided.

  • upxo_gs_object (upxo grain-structure object or None) – UPXO grain-structure instance used for map visualisation when see_on_map is True.

  • figsize (tuple of float, default (6, 6)) – Figure size for map visualisation.

  • dpi (int, default 100) – DPI for map visualisation.

  • throw_plt_object (bool, default False) – Return matplotlib figure objects alongside subgraphs.

Returns:

  • sg (dict) – Mapping of subgraph index → networkx.Graph subgraph.

  • plt_objects (dict or None) – Mapping of subgraph index → matplotlib figure when see_on_map and throw_plt_object are both True; else None.
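The 'connected_neighbors' strategy, including the scalar wrapping of nids and the broadcasting of radii, can be sketched without NetworkX as follows. The helper names ego_nodes, induced_subgraph, and extract are assumptions for illustration, and subgraphs are returned as adjacency dicts rather than networkx.Graph objects.

```python
from collections import deque

def ego_nodes(adj, center, radius, include_central_node=True):
    """Nodes within `radius` hops of `center` (depth-limited BFS)."""
    dist = {center: 0}
    queue = deque([center])
    while queue:
        u = queue.popleft()
        if dist[u] == radius:
            continue  # do not expand beyond the requested radius
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    nodes = set(dist)
    if not include_central_node:
        nodes.discard(center)
    return nodes

def induced_subgraph(adj, nodes):
    """Adjacency dict restricted to `nodes`."""
    return {u: [v for v in adj[u] if v in nodes] for u in sorted(nodes)}

def extract(adj, nids=1, radii=5, include_central_node=True):
    if isinstance(nids, int):
        nids = [nids]                 # wrap a scalar node ID in a list
    if isinstance(radii, int):
        radii = [radii] * len(nids)   # broadcast a scalar radius
    return {
        i: induced_subgraph(adj, ego_nodes(adj, n, r, include_central_node))
        for i, (n, r) in enumerate(zip(nids, radii))
    }

# Path graph 1-2-3-4-5-6
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5]}
sg = extract(adj, nids=[1, 4], radii=1)
sg_no_center = extract(adj, nids=4, radii=1, include_central_node=False)
```

Here sg[0] contains nodes 1 and 2, sg[1] contains nodes 3, 4, and 5, and dropping the central node leaves 3 and 5 disconnected from each other.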

extract_subgraph_connected_neighbors(**kwargs)[source]

Build ego-graph subgraphs centred on specified nodes.

Accepts keyword arguments forwarded from extract_subgraphs().

Parameters:
  • nids (list of int) – Centre node IDs.

  • radii (list of int) – Neighbourhood radius for each centre node.

  • include_central_node (bool) – Include the centre node in the returned subgraph.

  • treat_undirected (bool) – Ignore edge direction during traversal.

Returns:

Mapping of sequential index (0-based) → networkx.Graph ego subgraph.

Return type:

dict

extract_largest_connected_component()[source]

Return a copy of the largest connected component of self.G.

Returns:

Subgraph induced by the node set that forms the largest connected component. Isolated nodes and smaller components are excluded.

Return type:

networkx.Graph

GET_connected_components(G)[source]

Return all connected components of G as a list of subgraphs.

Parameters:

G (networkx.Graph) – Graph to decompose.

Returns:

One frozen subgraph per connected component, ordered arbitrarily.

Return type:

list of networkx.Graph

GET_maximal_independent_set(G)[source]

Return a maximal independent set of nodes from G.

A maximal independent set (MIS) is a set of nodes such that no two are adjacent, and no additional node can be added without violating that property. The result is non-deterministic because NetworkX uses a random greedy algorithm.

Parameters:

G (networkx.Graph) – Graph from which to compute the MIS.

Returns:

Node IDs forming a maximal independent set.

Return type:

list of int

PRUNE_connected_component(cc, mis_nodes)[source]

Remove a set of nodes from a connected component and return the result.

Parameters:
  • cc (networkx.Graph) – Connected-component subgraph to prune (copied internally; the original is not modified).

  • mis_nodes (iterable of int) – Node IDs to remove. Typically the MIS returned by GET_maximal_independent_set().

Returns:

Copy of cc with mis_nodes removed.

Return type:

networkx.Graph

partition_into_nonconnected_sets_mis(see_results=True, verbose=False)[source]

Iteratively decompose the graph by:

  1) Splitting it into connected components.

  2) Computing a maximal independent set (MIS) per component.

  3) Removing the MIS nodes.

  4) Repeating until no nodes remain.

Returns:

Mapping of round index → sorted list of MIS nodes removed in that round.

Return type:

dict
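The peeling loop can be sketched in pure Python as below. This is an illustration under stated assumptions, not the library code: the greedy MIS uses a seeded random.Random so the run is repeatable, the graph is an adjacency dict, and the function names are hypothetical.

```python
import random

def connected_components(adj, nodes):
    """Connected components of the subgraph induced by `nodes`."""
    nodes = set(nodes)
    seen, comps = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        comp, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v in nodes and v not in comp:
                    comp.add(v)
                    stack.append(v)
        seen |= comp
        comps.append(comp)
    return comps

def maximal_independent_set(adj, nodes, rng):
    """Random greedy MIS: visit nodes in random order, keep unblocked ones."""
    order = sorted(nodes)
    rng.shuffle(order)
    chosen, blocked = set(), set()
    for u in order:
        if u not in blocked:
            chosen.add(u)
            blocked.add(u)
            blocked.update(adj[u])  # neighbours can no longer be chosen
    return chosen

def peel(adj, seed=0):
    rng = random.Random(seed)
    remaining = set(adj)
    layers, round_index = {}, 1
    while remaining:
        removed = set()
        for comp in connected_components(adj, remaining):
            removed |= maximal_independent_set(adj, comp, rng)
        layers[round_index] = sorted(removed)
        remaining -= removed
        round_index += 1
    return layers

# Five-node cycle plus one isolated node
adj = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0], 9: []}
layers = peel(adj, seed=0)
```

Every node ends up in exactly one layer, and no two nodes within a layer are adjacent, which is why the layers are "non-connected sets".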

see_nnodes_vs_peeldepth(decomposition_layers)[source]

Plot the number of MIS nodes removed at each decomposition round (peel depth).

Parameters:

decomposition_layers (dict) – Mapping of round index (1-based int) → sorted list of node IDs removed in that round, as returned by partition_into_nonconnected_sets_mis().

partition_into_nonconnected_sets_mis_nrealizations(n, throw_pd=False, see_results=True, see_types=['heatmap', 'mean_std'], _disp_n_decimals=1, figsize=(6, 4), dpi=120, save_partitions=False, normalize_ng=False, vmax=0.5)[source]

Run MIS-based graph peeling n times and summarise the statistics.

Because GET_maximal_independent_set() is non-deterministic, each realisation may yield a different peel-depth profile. This method aggregates n independent runs into a DataFrame and optionally visualises the spread.

Parameters:
  • n (int) – Number of independent decomposition realisations to run.

  • throw_pd (bool, default False) – If True, return the pandas DataFrame of per-run node counts.

  • see_results (bool, default True) – Display plots if True.

  • see_types (list of str, default ['heatmap', 'mean_std']) – Which plots to show. Recognised values: 'boxplot', 'violinplot', 'heatmap', 'mean_std'.

  • _disp_n_decimals (int, default 1) – Decimal places used when printing the descriptive statistics table.

  • figsize (tuple of float, default (6, 4)) – Width and height of each figure in inches.

  • dpi (int, default 120) – Resolution of each figure.

  • save_partitions (bool, default False) – Store a deep copy of every realisation’s decomposition dict.

  • normalize_ng (bool, default False) – Divide node counts by the total number of graph nodes so that values are fractions rather than absolute counts.

  • vmax (float, default 0.5) – Colour-scale maximum for the heatmap when normalize_ng is True.

Returns:

  • n_decomposition_layers_np (numpy.ndarray, shape (n, max_depth)) – Node counts (or fractions) per realisation and peel depth. Shorter realisations are zero-padded on the right.

  • n_decomposition_layers_pd (pandas.DataFrame or None) – Same data as a DataFrame with columns PD1, PD2, and so on, one per peel depth. None when throw_pd is False.

  • partitions (list of dict or None) – Deep copies of each realisation’s decomposition dict when save_partitions is True; else None.
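The zero-padding and optional normalisation described in the return values can be sketched with plain lists. The function stack_layer_counts and its n_nodes argument are hypothetical; the library returns a NumPy array and, optionally, a DataFrame.

```python
def stack_layer_counts(realisations, normalize_ng=False, n_nodes=None):
    """Turn per-run decomposition dicts into a zero-padded table of counts."""
    counts = [[len(layers[k]) for k in sorted(layers)] for layers in realisations]
    max_depth = max(len(row) for row in counts)
    # Shorter realisations are padded with zeros on the right.
    table = [row + [0] * (max_depth - len(row)) for row in counts]
    if normalize_ng:
        table = [[c / n_nodes for c in row] for row in table]
    columns = [f"PD{d}" for d in range(1, max_depth + 1)]
    return columns, table

# Two hand-written realisations of a six-node graph
run_a = {1: [0, 2, 9], 2: [1, 3], 3: [4]}
run_b = {1: [1, 3, 9], 2: [0, 2, 4]}
columns, table = stack_layer_counts([run_a, run_b])
_, fractions = stack_layer_counts([run_a, run_b], normalize_ng=True, n_nodes=6)
```

The second run finishes in two rounds, so its third column is padded with zero.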

fit_regr_lin_mis_partitions(n_decomposition_layers_np)[source]

Fit a linear regression to each MIS decomposition realisation.

For each row in n_decomposition_layers_np a degree-1 polynomial is fitted over the peel-depth axis. Trailing zeros (shorter realisations) are stripped before fitting. The 95 % confidence interval on each coefficient is computed from the covariance matrix returned by numpy.polyfit.

Parameters:

n_decomposition_layers_np (numpy.ndarray, shape (n_runs, max_depth)) – Array of node counts per realisation and peel depth, as returned by partition_into_nonconnected_sets_mis_nrealizations().

Returns:

  • regression_coeffs (numpy.ndarray, shape (n_runs, 2)) – [slope, intercept] for each realisation.

  • confidence_bounds (numpy.ndarray, shape (n_runs, 2, 2)) – 95 % CI lower/upper bounds on [slope, intercept] for each run. Rows with fewer than 2 valid data points are filled with NaN.

  • gradients (numpy.ndarray of object) – Array of 1-D arrays, one per realisation, containing the element-wise finite differences (numpy.diff) of the node-count profile.
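A per-row fit of this kind might look like the sketch below. Several details are assumptions rather than documented behaviour: peel depth is taken as x = 1, 2, ..., the interval uses a normal multiplier of 1.96, and rows with fewer than three points return NaN because numpy.polyfit with cov=True needs more points than coefficients.

```python
import numpy as np

def fit_row(row, z=1.96):
    """Degree-1 fit of one realisation after stripping trailing zeros."""
    y = np.trim_zeros(np.asarray(row, dtype=float), trim="b")
    gradients = np.diff(y)  # element-wise finite differences
    if y.size < 3:
        # Too few points for polyfit to estimate a covariance matrix.
        return np.full(2, np.nan), np.full((2, 2), np.nan), gradients
    x = np.arange(1, y.size + 1)  # assumed peel-depth axis: 1, 2, ...
    coeffs, cov = np.polyfit(x, y, 1, cov=True)  # [slope, intercept]
    half_width = z * np.sqrt(np.diag(cov))
    bounds = np.column_stack([coeffs - half_width, coeffs + half_width])
    return coeffs, bounds, gradients

coeffs, bounds, grads = fit_row([10, 7, 6, 3, 2, 0, 0])
short_coeffs, short_bounds, short_grads = fit_row([5, 0, 0])
```

Each row of bounds holds the lower and upper limit for one coefficient, in the same [slope, intercept] order as coeffs.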

community(method='louvain', comprops=['modularity'])[source]

Detect communities in the graph and compute community properties.

Parameters:
  • method ({'louvain', 'girvan_newman', 'label_propagation', 'greedy_modularity'}, default 'louvain') – Community detection algorithm to apply. Note that 'louvain' is currently a placeholder (pass) and does not yet produce a result.

  • comprops (list of str, default ['modularity']) – Community properties to compute after detection. Currently 'modularity' is the only supported value; it measures the strength of the community partition.

Returns:

  • comm (community partition object) – The raw partition returned by the chosen NetworkX algorithm.

  • modularity (pandas.DataFrame) – DataFrame with columns ['k', 'modularity'] giving the modularity of each community k when 'modularity' is in comprops.

Notes

Additional algorithms are listed at https://networkx.org/documentation/stable/reference/algorithms/community.html
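For readers unfamiliar with modularity, the standard definition for an undirected, unweighted graph can be computed by hand as in this sketch. It is an independent illustration of the metric, not the code path used by community(); the function name and adjacency-dict input are assumptions.

```python
def modularity(adj, communities):
    """Newman modularity Q = sum_c [ L_c / m - (d_c / (2 m))^2 ]."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # number of edges
    q = 0.0
    for comm in communities:
        comm = set(comm)
        # Each internal edge is seen from both endpoints, hence the / 2.
        internal = sum(1 for u in comm for v in adj[u] if v in comm) / 2
        degree = sum(len(adj[u]) for u in comm)
        q += internal / m - (degree / (2 * m)) ** 2
    return q

# Two triangles joined by the single edge 2-3
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
q_split = modularity(adj, [{0, 1, 2}, {3, 4, 5}])
q_single = modularity(adj, [{0, 1, 2, 3, 4, 5}])
```

Splitting the graph at the bridging edge gives a clearly positive modularity, whereas a single all-encompassing community scores zero.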

visualize_communities(communities, i)[source]

Plot the graph with nodes coloured according to their community.

Example

>>> G = nx.petersen_graph()
>>> communities = list(nx.community.girvan_newman(G))
>>> fig, ax = plt.subplots(len(communities)+1, figsize=(15, 20))
>>> for comm_count, comm in enumerate(communities):
...     visualize_communities(G, comm, comm_count)
>>> modularity_df.plot.bar(
...     x="k", ax=ax[2], color="#F2D140",
...     title="Modularity Trend for Girvan-Newman Community Detection",
... )
>>> plt.show()

see_graph(plot_type='edges', seed=1)[source]

Draw self.G using a spring layout.

Parameters:
  • plot_type ({'edges', 'nodes', 'numbered nodes'}, default 'edges') –

    What to render.

    • 'edges' — draw edges only (no node markers or labels).

    • 'nodes' — draw nodes only (no edges or labels).

    • 'numbered nodes' — draw the full graph with node-ID labels.

  • seed (int, default 1) – Random seed passed to networkx.spring_layout for reproducible node positioning.

mprop
pathlengths
class upxo.analysis.analysis2d.gsan2d(creation='distr_single', stack={}, pnames=None)[source]

Bases: object

defmp = {'area': True, 'aspect_ratio': True, 'circularity': True, 'compactness': False, 'eccentricity': True, 'eq_diameter': False, 'euler_number': True, 'feret_diameter': False, 'gb_length_px': False, 'major_axis_length': True, 'minor_axis_length': True, 'moments_hu': True, 'morph_ori': False, 'npixels': False, 'npixels_gb': False, 'perimeter': False, 'perimeter_crofton': False, 'solidity': True}
chctrl = {'char_gb': False, 'char_grain_positions': True, 'find_neigh': True, 'find_neigh_include_central_feat': False, 'find_neigh_p': 1.0, 'find_neigh_throw_numba_dict': False, 'get_grain_coords': False, 'make_skim_prop': True}
metaa = metaa(pixel_size=1.0, unit='px', frame_dt=None, material='', sample_id='', source='', notes='', seed=42)
gsstack
pnames
gsid
dfs
stts
classmethod from_mcgs2d_single(gstslice, detect_grains=False, prechar=False, find_neigh=True, find_neigh_p=1.0, find_neigh_include_central_feat=False, find_neigh_throw_numba_dict=False, npixels=False, npixels_gb=False, gb_length_px=False, eq_diameter=False, feret_diameter=False, perimeter=False, perimeter_crofton=False, aspect_ratio=True, compactness=False, solidity=True, morph_ori=False, circularity=False, eccentricity=True, euler_number=True, moments_hu=True, char_grain_positions=False, char_gb=False, get_grain_coords=True, connectivity=2)[source]

Characterise a single 2-D grain-structure slice and wrap it for analysis.

Calls gstslice.char_morph_2d (and optionally find_neigh_v2) with the requested property flags, then returns a gsan2d instance whose gsstack contains this one slice under key 1.

Parameters:
  • gstslice (mcgs2_temporal_slice) – A single grain-structure time-slice object to characterise.

  • detect_grains (bool, default False) – Run grain detection on gstslice before characterisation.

  • prechar (bool, default False) – If True, skip char_morph_2d (assume the slice is already characterised).

  • find_neigh (bool, default True) – Find grain neighbours via find_neigh_v2 after characterisation.

  • find_neigh_p (float, default 1.0) – Sampling probability for neighbour search; must be in [0, 1].

  • find_neigh_include_central_feat (bool, default False) – Include the grain itself in its own neighbour list.

  • find_neigh_throw_numba_dict (bool, default False) – Return a Numba-typed dict from the neighbour search.

  • npixels (bool, default False) – Characterise grain area in pixels.

  • npixels_gb (bool, default False) – Characterise number of grain-boundary pixels per grain.

  • gb_length_px (bool, default False) – Characterise grain-boundary arc length in pixels.

  • eq_diameter (bool, default False) – Characterise equivalent circular diameter.

  • feret_diameter (bool, default False) – Characterise maximum Feret (calliper) diameter.

  • perimeter (bool, default False) – Characterise grain perimeter length.

  • perimeter_crofton (bool, default False) – Characterise perimeter using the Crofton formula.

  • aspect_ratio (bool, default True) – Characterise aspect ratio; also enables major_axis_length and minor_axis_length automatically.

  • compactness (bool, default False) – Characterise compactness (4π·area / perimeter²).

  • solidity (bool, default True) – Characterise solidity (area / convex-hull area).

  • morph_ori (bool, default False) – Characterise morphological orientation in degrees.

  • circularity (bool, default False) – Characterise circularity.

  • eccentricity (bool, default True) – Characterise eccentricity of the best-fit ellipse.

  • euler_number (bool, default True) – Characterise the Euler characteristic.

  • moments_hu (bool, default True) – Characterise the seven Hu invariant moments.

  • char_grain_positions (bool, default False) – Classify each grain as corner, edge, or internal.

  • char_gb (bool, default False) – Characterise grain-boundary pixel locations.

  • get_grain_coords (bool, default True) – Extract physical pixel coordinates for each grain.

  • connectivity (int, default 2) – Connectivity for feature labelling (1 = 4-connected, 2 = 8-connected in 2-D).

Returns:

Instance with creation='pxtal_single', gsstack={1: gstslice}, and pnames set to every property flag that was True.

Return type:

gsan2d

Example

>>> from upxo.ggrowth.mcgs import mcgs
>>> from upxo.analysis.analysis2d import gsan2d
>>> pxt = mcgs(input_dashboard='path/to/input_dashboard.xls')
>>> pxt.simulate()
>>> gsan = gsan2d.from_mcgs2d_single(pxt.gs[10],
...     solidity=True, eccentricity=True, euler_number=True,
...     moments_hu=True, get_grain_coords=False)
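The connectivity parameter changes which pixels count as touching. The sketch below counts features in a small binary mask with a flood fill; it illustrates 4- versus 8-connectivity in general and is not the labelling routine used by the library, which labels grain states rather than a binary mask.

```python
def count_features(mask, connectivity=2):
    """Count connected groups of truthy pixels in a 2-D list of lists."""
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]           # 4-connected
    if connectivity == 2:
        offsets += [(-1, -1), (-1, 1), (1, -1), (1, 1)]    # add diagonals
    rows, cols = len(mask), len(mask[0])
    seen = set()
    n_features = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or (r, c) in seen:
                continue
            n_features += 1
            seen.add((r, c))
            stack = [(r, c)]
            while stack:  # flood fill from the seed pixel
                y, x = stack.pop()
                for dy, dx in offsets:
                    ny, nx_ = y + dy, x + dx
                    inside = 0 <= ny < rows and 0 <= nx_ < cols
                    if inside and mask[ny][nx_] and (ny, nx_) not in seen:
                        seen.add((ny, nx_))
                        stack.append((ny, nx_))
    return n_features

mask = [
    [1, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
]
n_four = count_features(mask, connectivity=1)
n_eight = count_features(mask, connectivity=2)
```

The two pixel groups touch only at a corner, so they are separate features under 4-connectivity and a single feature under 8-connectivity.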
classmethod from_gsstack_varied(gsstack)[source]

Construct a gsan2d instance from a varied grain-structure stack.

classmethod from_gsstack_temporal(gsstack, gsids=[], detect_grains=False, ispxtal=False, prechar=False, find_neigh=False, find_neigh_p=1.0, find_neigh_include_central_feat=False, find_neigh_throw_numba_dict=False, npixels=False, npixels_gb=False, gb_length_px=False, eq_diameter=False, feret_diameter=False, perimeter=False, perimeter_crofton=False, aspect_ratio=True, compactness=False, solidity=True, morph_ori=False, circularity=False, eccentricity=True, euler_number=True, moments_hu=True, char_gb=False, get_grain_coords=False)[source]

Characterise every slice in a temporal grain-structure stack and wrap for analysis.

Iterates over the stack, calls char_morph_2d (and optionally find_neigh_v2) on each slice with the requested property flags, then returns a gsan2d instance whose gsstack maps each grain-structure ID to its characterised slice.

Parameters:
  • gsstack (dict[int, mcgs2_temporal_slice] or pxtal) – Temporal stack of grain-structure slices. If ispxtal is True, this must be a pxtal object exposing a .gs attribute.

  • gsids (list of int, default []) – Subset of grain-structure IDs to include. An empty list uses all IDs present in gsstack.

  • detect_grains (bool, default False) – Run grain detection on each slice before characterisation.

  • ispxtal (bool, default False) – If True, treat gsstack as a pxtal object and extract its .gs dict, filtered to gsids when provided.

  • prechar (bool, default False) – If True, skip char_morph_2d on all slices (assume already characterised).

  • find_neigh (bool, default False) – Find grain neighbours in each slice via find_neigh_v2.

  • find_neigh_p (float, default 1.0) – Sampling probability for neighbour search; must be in [0, 1].

  • find_neigh_include_central_feat (bool, default False) – Include the grain itself in its own neighbour list.

  • find_neigh_throw_numba_dict (bool, default False) – Return a Numba-typed dict from the neighbour search.

  • npixels (bool, default False) – Characterise grain area in pixels.

  • npixels_gb (bool, default False) – Characterise number of grain-boundary pixels per grain.

  • gb_length_px (bool, default False) – Characterise grain-boundary arc length in pixels.

  • eq_diameter (bool, default False) – Characterise equivalent circular diameter.

  • feret_diameter (bool, default False) – Characterise maximum Feret (calliper) diameter.

  • perimeter (bool, default False) – Characterise grain perimeter length.

  • perimeter_crofton (bool, default False) – Characterise perimeter using the Crofton formula.

  • aspect_ratio (bool, default True) – Characterise aspect ratio; also enables major_axis_length and minor_axis_length automatically.

  • compactness (bool, default False) – Characterise compactness (4π·area / perimeter²).

  • solidity (bool, default True) – Characterise solidity (area / convex-hull area).

  • morph_ori (bool, default False) – Characterise morphological orientation in degrees.

  • circularity (bool, default False) – Characterise circularity.

  • eccentricity (bool, default True) – Characterise eccentricity of the best-fit ellipse.

  • euler_number (bool, default True) – Characterise the Euler characteristic.

  • moments_hu (bool, default True) – Characterise the seven Hu invariant moments.

  • char_gb (bool, default False) – Characterise grain-boundary pixel locations.

  • get_grain_coords (bool, default False) – Extract physical pixel coordinates for each grain.

Returns:

Instance with creation='pxtal_tmp', gsstack mapping each included grain-structure ID to its characterised slice, and pnames set to every property flag that was True.

Return type:

gsan2d

Example

>>> from upxo.ggrowth.mcgs import mcgs
>>> from upxo.analysis.analysis2d import gsan2d
>>> pxt = mcgs(input_dashboard='path/to/input_dashboard.xls')
>>> pxt.simulate()
>>> gsan = gsan2d.from_gsstack_temporal(pxt, ispxtal=True,
...     solidity=True, eccentricity=True, euler_number=True)
classmethod from_distr(distributions)[source]

Construct a gsan2d instance from pre-computed property distributions.

Parameters:

distributions (dict[str, array-like]) – Mapping of property name → array of values, one per grain.

Notes

Not yet implemented.

find_neigh(gsids=None, p=1.0, include_central_feat=False, throw_numba_dict=False, verbosity_nfids=1000)[source]

Find grain neighbours.

find_neigh_variable_settings(gsids=None, p=1.0, include_central_feat=[False], throw_numba_dict=False, verbosity_nfids=1000)[source]

Find grain neighbours using variable settings.

extract_props()[source]

Extract grain properties.

extract_props_pxtal_single(gsid=None)[source]

Extract grain properties for a single grain structure.

compute_temporal_dfs()[source]

Compute the temporal DataFrames (dfs).

compute_statistics()[source]

Compute summary statistics of the grain properties.

correlate(gsids=[1], pnames=['area', 'major_axis_length', 'minor_axis_length', 'eccentricity'], saa=True, throw=False)[source]

Compute correlations between the properties listed in pnames.

correlate_temporal(pnames=['area', 'major_axis_length', 'minor_axis_length', 'eccentricity'])[source]

Compute correlations between the properties listed in pnames across the temporal stack.

pca_analysis(gsids=[1], gids=[], pnames=['area', 'major_axis_length', 'minor_axis_length', 'eccentricity'], auto_ncomp=True, ncomp_method='mle', svd_solver='auto', saa=True, throw=False, see_scree=True, annotate=True, see_exvar=True, see_cum_exvar=False, figsize=(8, 3))[source]

Run principal component analysis on grain property data.

Standardises the selected properties for each requested grain-structure ID, fits a PCA model, and optionally plots explained-variance curves. Results are stored in self.pca[gsid] when saa is True.

Parameters:
  • gsids (list of int, default [1]) – Grain-structure IDs to analyse. An empty list selects all integer keys present in self.dfs.

  • gids (list of int, default []) – Specific grain IDs (1-indexed rows) to include. An empty list uses all grains in the dataframe.

  • pnames (list of str, default ['area', 'major_axis_length', 'minor_axis_length', 'eccentricity']) – Property names used as PCA features; must be columns in self.dfs[gsid]. An empty list uses all available columns.

  • auto_ncomp (bool, default True) – If True, fit with n_components=len(pnames) (all components). If False, ncomp_method is passed as n_components.

  • ncomp_method (str or int, default 'mle') – Passed as n_components to sklearn.decomposition.PCA when auto_ncomp is False. Common values: 'mle' or an integer.

  • svd_solver (str, default 'auto') – SVD solver forwarded to PCA; see scikit-learn documentation.

  • saa (bool, default True) – Save-and-apply: store fitted results in self.pca[gsid].

  • throw (bool, default False) – If True, return (pca_, scores_, exvar_); otherwise return (None, None, None).

  • see_scree (bool, default True) – Reserved for a future scree plot; not yet implemented.

  • annotate (bool, default True) – Reserved for annotation of variance plots; not yet implemented.

  • see_exvar (bool, default True) – Plot per-component explained variance (%) for each gsid.

  • see_cum_exvar (bool, default False) – Plot cumulative explained variance (%) for each gsid.

  • figsize (tuple of float, default (8, 3)) – Figure size (width, height) in inches for variance plots.

Returns:

  • pca_ (dict[int, sklearn.decomposition.PCA] or None) – gsid → fitted PCA object. None when throw is False.

  • scores_ (dict[int, numpy.ndarray] or None) – gsid → score array of shape (n_grains, n_components). None when throw is False.

  • exvar_ (dict[int, numpy.ndarray] or None) – gsid → explained-variance-ratio array of length n_components. None when throw is False.

Notes

Rows containing NaN values are dropped before fitting. self.pca is populated with principle_component_analysis objects keyed by gsid.
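The standardise-then-decompose sequence can be reproduced with NumPy alone, as in this sketch. It is an assumption-laden illustration: the method above is documented as using sklearn.decomposition.PCA, whereas this version uses a plain SVD, population standard deviation for scaling, and a hypothetical function name.

```python
import numpy as np

def pca_explained_variance(X):
    """Drop NaN rows, standardise columns, and return scores and variance ratios."""
    X = np.asarray(X, dtype=float)
    X = X[~np.isnan(X).any(axis=1)]            # drop rows containing NaN
    Z = (X - X.mean(axis=0)) / X.std(axis=0)   # zero mean, unit variance
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U * S                             # projections onto the components
    variances = S ** 2 / (Z.shape[0] - 1)
    ratio = variances / variances.sum()        # explained-variance ratio
    return scores, ratio

# Columns: area, major_axis_length, minor_axis_length, eccentricity (made-up values)
X = [
    [10.0, 4.0, 2.0, 0.5],
    [20.0, 6.0, 3.0, 0.7],
    [float("nan"), 5.0, 2.0, 0.6],
    [15.0, 7.0, 2.0, 0.9],
    [30.0, 9.0, 4.0, 0.8],
    [12.0, 3.0, 3.0, 0.4],
]
scores, ratio = pca_explained_variance(X)
```

The ratio array is what an explained-variance plot displays per component, and its cumulative sum gives the cumulative curve.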

initiate_kmodel(gsids=[1], k_char_level='none', recalculate_neighbours=True, include_central_grain=False)[source]

Initiate a kmodel for the grain structures in gsids.

see_stats(gsid=[1], pname='area', metric='mean')[source]

Plot a summary statistic of one property across specified time slices.

Parameters:
  • gsid (int or list of int, default [1]) – Time-slice ID(s) to include. Each must have been characterised via compute_statistics() before calling this method.

  • pname (str, default 'area') – Property name; must be a column in self.dfs[gsid].

  • metric (str, default 'mean') – Row label in the statistics table produced by DataFrame.describe(). Valid values include 'mean', 'std', 'min', '25%', '50%', '75%', 'max', 'skew', 'kurt'.

see_dstr_univariate(gsid=1, pnames=['area'], bw_adjust=[0.75], kde_clr=['blue'], title_fsz=14, xmax_mult=1.1, grid_alpha=0.3, multiple='stack', kind='kde', fill=True)[source]

Plot univariate distributions of the properties in pnames.

see_dstr_bivariate(gsid=1, pnames=['area', 'aspect_ratio'], jointplot=False, levels=5)[source]

Example

>>> gsan.see_dstr_bivariate(gsid=1, pnames=['area', 'aspect_ratio'])

see_pairgrid(gsid=1, pnames=['area', 'aspect_ratio'])[source]

Plot a pair grid of the properties in pnames.

corr
pca
K
see_correlation(gsids=[1], pnames=['area', 'perimeter'], recorrelate=True)[source]

Plot correlations between the properties in pnames.

see_correlation_temporal()[source]

Plot temporal correlations.

see_dstr_stack(pname='area', metric='mean')[source]

Plot property distributions across the stack.

see_stats_stack(pname='', metric='')[source]

Plot property statistics across the stack.

see_evol(pname='area', plottype='basic', metric='mean')[source]

Plot the evolution of a property.