upxo.viz.vizDistr module
vizDistr.py — Distribution visualisation for UPXO grain structure analyses.
Provides the DistrViz class for plotting scalar grain property distributions (area, perimeter, aspect ratio, …) and angular misorientation distributions (MDF). Designed to complement ebsdviz.plot_mdf — use DistrViz.plot_mdf when peaks are not yet computed; use ebsdviz.plot_mdf for fully annotated MDF with peak labels and KDE from the peaks dict.
Typical usage
- Grain size:
    dv = DistrViz(areas, label='Grain area', units='µm²')
    fig, ax = dv.plot_hist(bins=40, show_kde=True, step_size=rdr.step_size)
    plt.show()
    dv.print_stats()
- MDF (lightweight, no peaks dict required):
    dv = DistrViz.from_mdf(mdf)
    fig, ax = dv.plot_mdf(mdf)
    plt.show()
- Multiple properties:
    fig, axes = DistrViz.multi(
        {'Grain area': areas, 'Aspect ratio': ar, 'Perimeter': perim},
        units_dict={'Grain area': 'µm²', 'Aspect ratio': '', 'Perimeter': 'µm'},
        step_size=rdr.step_size,
    )
    plt.show()
- class upxo.viz.vizDistr.DistrViz(data, label='value', units='')[source]
Bases: object
Distribution visualiser for scalar grain properties and MDF data.
- Parameters:
- property stats
Dict of descriptive statistics computed from self.data.
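The keys of the stats dict are not listed here. As a rough illustration of what such a summary might contain, the following standalone sketch computes common descriptive statistics with the standard library. The function name `describe` and every key name are assumptions made for this example, not the documented contents of `DistrViz.stats`.

```python
import statistics


def describe(values):
    """Return common descriptive statistics for a 1-D sequence of numbers.

    Hypothetical helper: the key names are illustrative only.
    """
    data = sorted(float(v) for v in values)
    if not data:
        raise ValueError("describe() needs at least one value")
    return {
        "n": len(data),
        "mean": statistics.fmean(data),
        "median": statistics.median(data),
        # The sample standard deviation is undefined for a single value.
        "std": statistics.stdev(data) if len(data) > 1 else 0.0,
        "min": data[0],
        "max": data[-1],
    }


areas = [2.0, 4.0, 4.0, 6.0, 9.0]  # made-up grain areas in µm²
summary = describe(areas)
```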
- plot(vis='hist', bins=40, show_kde=True, show_stats=True, color='steelblue', figsize=(7, 4), log_scale=False, step_size=None, bw_method='scott', fill=True, ax=None)[source]
Unified plot dispatcher — routes to plot_hist, plot_kde, or plot_hist_kde based on vis.
- Parameters:
vis (str) – 'hist', 'kde', or 'hist_kde'.
bins (int) – Histogram bin count (used by 'hist' and 'hist_kde').
show_kde (bool) – KDE overlay on histogram ('hist' only).
show_stats (bool) – Annotate mean / median lines.
color (str)
figsize (tuple)
log_scale (bool) – Log x-axis ('hist' only).
step_size (float or None) – Appended to x-label when provided.
bw_method (str or float) – KDE bandwidth selector ('kde' only).
fill (bool) – Fill KDE area ('kde' only).
ax (Axes or None)
- Return type:
fig, ax
- plot_hist(bins=40, show_kde=True, show_stats=True, color='steelblue', figsize=(7, 4), log_scale=False, step_size=None, ax=None)[source]
Histogram with optional KDE overlay and mean/median annotations.
- Parameters:
- Return type:
fig, ax
- plot_kde(bw_method='scott', fill=True, color='steelblue', show_stats=True, figsize=(7, 4), step_size=None, ax=None)[source]
Pure KDE plot (probability density).
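To make the default `bw_method='scott'` concrete, here is a dependency-free sketch of a 1-D Gaussian KDE with Scott's rule, where the bandwidth is the sample standard deviation times n^(-1/5). It is intended to mirror the 1-D case of the 'scott' selector in `scipy.stats.gaussian_kde`. Whether `plot_kde` calls SciPy internally is not stated on this page, so treat the helper names `scott_bandwidth` and `gaussian_kde_1d` as hypothetical.

```python
import math
import statistics


def scott_bandwidth(samples):
    """Scott's rule for 1-D data: sample std * n**(-1/5)."""
    n = len(samples)
    return statistics.stdev(samples) * n ** (-1.0 / 5.0)


def gaussian_kde_1d(samples, grid, bandwidth=None):
    """Evaluate a Gaussian KDE of `samples` at each point of `grid`."""
    h = scott_bandwidth(samples) if bandwidth is None else bandwidth
    norm = 1.0 / (len(samples) * h * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)
        for x in grid
    ]


samples = [1.0, 1.5, 2.0, 2.2, 3.0, 3.1, 4.0]  # made-up property values
step = 0.05
grid = [-5.0 + step * i for i in range(301)]  # covers -5 .. 10
density = gaussian_kde_1d(samples, grid)

# Trapezoidal integral of the curve; a probability density integrates to ~1.
area = sum(step * (density[i] + density[i + 1]) / 2.0 for i in range(300))
```

Because the curve is a probability density, its y-values are not counts and can exceed 1 for narrowly spread data.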
- plot_hist_kde(bins=40, color='steelblue', show_stats=True, figsize=(7, 4), step_size=None, ax=None)[source]
Density-normalised histogram with KDE overlay.
- Return type:
fig, ax
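"Density-normalised" means that bar heights are scaled so the total bar area equals 1, which puts the histogram on the same y-scale as a KDE curve. A minimal sketch of that normalisation, assuming equal-width bins between the data minimum and maximum (the helper `density_histogram` is hypothetical and not part of this module):

```python
def density_histogram(values, bins=40):
    """Return (edges, density) such that sum(density) * bin_width == 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; cannot bin")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value is assigned to the last bin (right edge inclusive).
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    n = len(values)
    # Divide by n * width so that the bar areas sum to one.
    density = [c / (n * width) for c in counts]
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, density


edges, density = density_histogram([0.0, 0.1, 0.4, 0.5, 0.9, 1.0], bins=4)
```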
- plot_mdf(mdf, show_csl=True, show_stats=True, angle_max=65.0, figsize=(8, 4), ax=None)[source]
Bar-chart MDF from a pre-computed mdf dict with optional CSL markers.
Lighter alternative to ebsdviz.plot_mdf — does not require the peaks dict. Use ebsdviz.plot_mdf when peak labels and KDE are needed.
- Parameters:
mdf (dict) – Output of compute_mdf_from_quats. Required keys: 'hist_bin_centers', 'hist_density', 'hist_bin_edges', 'n_pairs', 'mean_angle', 'std_angle'.
show_csl (bool) – Draw dashed vertical lines at common cubic CSL angles.
show_stats (bool) – Annotate mean ± std in the legend.
angle_max (float) – X-axis upper limit (degrees).
figsize (tuple)
ax (Axes or None)
- Return type:
fig, ax
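Since plot_mdf expects a dict with a fixed set of keys, it can help to validate the dict before plotting. The sketch below checks the six required keys listed above. The helper names are hypothetical, and the length check (N centres, N densities, N + 1 edges) is an assumption based on the usual histogram convention, not something this page states.

```python
REQUIRED_MDF_KEYS = (
    "hist_bin_centers", "hist_density", "hist_bin_edges",
    "n_pairs", "mean_angle", "std_angle",
)


def missing_mdf_keys(mdf):
    """Return the required keys that are absent from `mdf`, in listed order."""
    return [key for key in REQUIRED_MDF_KEYS if key not in mdf]


def check_mdf(mdf):
    """Raise if `mdf` lacks required keys or its arrays look inconsistent."""
    missing = missing_mdf_keys(mdf)
    if missing:
        raise KeyError(f"mdf dict is missing required keys: {missing}")
    # Assumption: N bins are described by N centres, N densities, N + 1 edges.
    n_bins = len(mdf["hist_bin_centers"])
    if len(mdf["hist_density"]) != n_bins:
        raise ValueError("hist_density length does not match bin centres")
    if len(mdf["hist_bin_edges"]) != n_bins + 1:
        raise ValueError("hist_bin_edges should have one more entry than bins")
    return True


# Made-up three-bin example, for illustration only.
example_mdf = {
    "hist_bin_centers": [10.0, 30.0, 50.0],
    "hist_density": [0.010, 0.025, 0.015],
    "hist_bin_edges": [0.0, 20.0, 40.0, 60.0],
    "n_pairs": 120,
    "mean_angle": 32.0,
    "std_angle": 14.0,
}
```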
- classmethod multi(data_dict, units_dict=None, step_size=None, bins=40, show_kde=True, show_stats=True, ncolumns=2, figsize_per=(5, 3.5), color='steelblue', log_scale=False)[source]
Plot distributions for multiple grain properties in a subplot grid.
- Parameters:
data_dict (dict) – {label: array-like} of grain properties to plot.
units_dict (dict or None) – {label: units_str}. Missing keys default to no units.
step_size (float or None) – Passed to each subplot for x-label annotation.
bins (int)
show_kde (bool)
show_stats (bool)
ncolumns (int)
figsize_per (tuple) – (width, height) per panel in inches.
color (str)
log_scale (bool)
- Return type:
fig, axes (axes is a flat ndarray)
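How multi turns `ncolumns` and `figsize_per` into a figure layout is not spelled out here. One plausible scheme, shown purely as an assumption, is to use ceil(n / ncolumns) rows and scale the per-panel size by the grid shape. Both helper names below are hypothetical.

```python
import math


def grid_shape(n_panels, ncolumns=2):
    """Return (nrows, ncols) for n_panels laid out row by row."""
    # Never use more columns than there are panels, and at least one column.
    ncols = max(1, min(ncolumns, n_panels))
    nrows = math.ceil(n_panels / ncols)
    return nrows, ncols


def figure_size(n_panels, ncolumns=2, figsize_per=(5, 3.5)):
    """Total (width, height) in inches, assuming equal-size panels."""
    nrows, ncols = grid_shape(n_panels, ncolumns)
    return (figsize_per[0] * ncols, figsize_per[1] * nrows)
```

Under this scheme, three properties with the default `ncolumns=2` occupy a 2 x 2 grid with one unused panel.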
- upxo.viz.vizDistr.plot_grouped_distributions(data, prop_labels=None, group_colors=None, group_labels=None, bins=40, bw_method='scott', peak_prominence=0.01, figsize_per=(5, 4), dpi=110, suptitle='Property distributions by group', ncols=None, fontsize=9.0, show_hist=True, show_peaks=True, show_legend=True, x_margin=0.03, do_tight_layout=True)[source]
Overlaid histogram + KDE + peak markers for multiple properties and groups.
Generic plotting function — no knowledge of grain structures or UPXO data formats. Data must be pre-extracted into plain arrays before calling.
- Parameters:
data (dict) – {prop_name: {group_name: array-like}}: one entry per property, each containing one array per group. Arrays may be empty; empty/size-1 groups are silently skipped.
prop_labels (dict or None) – {prop_name: display_label} for axis / title text. Missing keys fall back to the prop_name itself.
group_colors (dict or None) – {group_name: colour_string}. Missing keys cycle through a default palette.
group_labels (dict or None) – {group_name: display_label} for legend entries. Missing keys fall back to the group_name itself.
bins (int) – Number of histogram bins (shared x-range across groups per property).
bw_method (str or float) – Bandwidth selector passed to scipy.stats.gaussian_kde.
peak_prominence (float) – Fraction of KDE maximum used as minimum prominence for find_peaks.
figsize_per (tuple) – (width, height) in inches per subplot panel.
dpi (int) – Figure resolution.
suptitle (str) – Figure-level title.
ncols (int or None) – Subplot grid columns. None places all panels in a single row.
fontsize (float) – Base font size; tick labels use fontsize-2, legend fontsize-2, peak annotations fontsize-3, suptitle fontsize+1.
show_hist (bool) – Draw histogram bars behind the KDE curves. Default True.
show_peaks (bool) – Draw vertical dashed lines and value annotations at KDE peaks. Default True.
show_legend (bool) – Draw a per-group legend on each subplot. Default True.
x_margin (float) – Fractional padding added to both sides of the x-axis so that tick labels are never clipped at the axis boundary. Default 0.03.
do_tight_layout (bool) – Call plt.tight_layout() before returning. Set to False when the caller needs to adjust the figure (e.g. to add a colorbar) before finalising the layout. Default True.
- Returns:
fig, axes
- Return type:
Figure and 2-D axes array (shape (nrows, ncols_used)).
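The nested data layout expected by plot_grouped_distributions, and two of the rules described above (skipping empty or size-1 groups, and deriving the minimum peak prominence as a fraction of the KDE maximum), can be sketched as follows. The helper names and the sample values are made up for illustration; the function's internals are not shown on this page.

```python
def usable_groups(data, min_size=2):
    """Drop groups with fewer than `min_size` values (too few for a KDE)."""
    cleaned = {}
    for prop_name, groups in data.items():
        cleaned[prop_name] = {
            group_name: list(values)
            for group_name, values in groups.items()
            if len(values) >= min_size
        }
    return cleaned


def prominence_threshold(kde_values, peak_prominence=0.01):
    """Minimum peak prominence as a fraction of the KDE maximum."""
    return peak_prominence * max(kde_values)


# {prop_name: {group_name: array-like}}, with made-up numbers.
data = {
    "area": {"parents": [4.0, 5.5, 7.1], "twins": [], "other": [3.0]},
    "aspect_ratio": {"parents": [1.2, 1.4], "twins": [2.8, 3.1, 3.3]},
}
cleaned = usable_groups(data)
```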
- upxo.viz.vizDistr.plot_repr_rank(repr_rank_ng: dict, figsize=None, dpi: int = 100, fontsize_annot: float = 8.0, fontsize_tick: float = 9.0, fontsize_title: float = 9.0, fontsize_suptitle: float = 11.0) None[source]
Five vertically stacked heatmaps showing the per-property rank of every MC time slice under each representativeness metric (ratio, Wasserstein, energy distance, KS statistic, Anderson–Darling statistic).
Colour encodes rank within each column independently: green = best (rank 1), red = worst (rank N). Cell text shows the raw numeric score. Rows are ordered best-to-worst by the aggregate score (inherited from the DataFrame sort order in repr_rank_ng).
Ranking rule per column:
ratio, property columns: rank by |value − 1| ascending (closest to 1.0 = best)
ratio, aggregate column: rank by value ascending (lowest = best)
wasserstein / energy: rank by value ascending (lowest = best)
- Parameters:
repr_rank_ng (dict) – {'ratio': df, 'wasserstein': df, 'energy': df}, as stored in repgen2d.repr_rank_ng after calling find_repr_mcgs_props.
figsize (tuple or None) – Override default figure size. Default auto-computes from data shape.
dpi (int) – Figure resolution.
fontsize_annot (float) – Font size for the numeric value printed in each cell.
fontsize_tick (float) – Font size for axis tick labels (slice keys on y-axis, column names on x-axis).
fontsize_title (float) – Font size for each panel title.
fontsize_suptitle (float) – Font size for the overall figure title.
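The per-column ranking rule described for plot_repr_rank can be sketched in a few lines: ratio property columns are ranked by distance from 1.0, and all other columns by raw value. The function `rank_column` and the slice keys are hypothetical, and tie handling (here: first occurrence wins) is an assumption.

```python
def rank_column(scores, closest_to_one=False):
    """Return {slice_key: rank} for one column, where rank 1 is best.

    closest_to_one=True ranks by |value - 1| (ratio property columns);
    otherwise lower raw values rank better (distances, aggregate column).
    """
    def badness(key):
        value = scores[key]
        return abs(value - 1.0) if closest_to_one else value

    ordered = sorted(scores, key=badness)  # stable sort: ties keep input order
    return {key: rank for rank, key in enumerate(ordered, start=1)}


ratio_area = {"t10": 1.30, "t20": 0.95, "t30": 0.80}   # made-up ratios
wasserstein_area = {"t10": 0.12, "t20": 0.40, "t30": 0.05}

ratio_ranks = rank_column(ratio_area, closest_to_one=True)
distance_ranks = rank_column(wasserstein_area)
```

The same slice can rank best under one metric and worst under another, which is why the colour scale is applied to each column independently.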
- upxo.viz.vizDistr.plot_normalized_prop_distributions(ebsd_data: dict, mc_data: dict, props: list, scores: dict | None = None, prop_labels: dict | None = None, bins: int = 40, bw_method='scott', figsize_per: tuple = (5, 4), dpi: int = 100, ncols: int | None = None, fontsize: float = 9.0, show_hist: bool = True, show_peaks: bool = True, legend_loc: str = 'upper right', legend_ncol: int = 1, legend_fontsize: float | None = None) None[source]
Overlaid normalised property distributions for EBSD (merged) and MC slices.
Each distribution is normalised by its own mean before plotting, matching the normalisation used in find_repr_mcgs_props. All curves are therefore centred near 1.0 on the x-axis and are directly shape-comparable.
Wasserstein and energy distances are annotated in each subplot legend when scores is provided.
- Parameters:
ebsd_data (dict) – {prop: array} of EBSD-merged property values, each already divided by its own mean.
mc_data (dict) – {slice_key: {prop: array}} of MC property values, each already divided by its own mean.
props (list of str) – Ordered list of property names to plot.
scores (dict or None) – {slice_key: {prop: {'wasserstein': v, 'energy': v}}} extracted from repr_rank_ng. When supplied, each MC curve's legend entry is annotated with W=... E=... for the per-property distance.
prop_labels (dict or None) – {prop: display_label}. Defaults to f'{prop} (mean normalized)'.
bins – Forwarded to plot_grouped_distributions().
bw_method – Forwarded to plot_grouped_distributions().
figsize_per – Forwarded to plot_grouped_distributions().
dpi – Forwarded to plot_grouped_distributions().
ncols – Forwarded to plot_grouped_distributions().
fontsize – Forwarded to plot_grouped_distributions().
show_hist – Forwarded to plot_grouped_distributions().
show_peaks – Forwarded to plot_grouped_distributions().
legend_loc (str) – Legend location string passed to ax.legend(loc=...). Examples: 'upper right', 'upper left', 'lower right', 'center left', 'best'. Default 'upper right'.
legend_ncol (int) – Number of columns in the legend. Values > 1 split entries side-by-side, which reduces legend height and, when entries are uniform in width, the overall legend footprint. Default 1 (single column).
legend_fontsize (float or None) – Font size for legend text. Reducing this is the most direct way to shrink the legend box since box width is driven by label text length. Defaults to fontsize - 2 when None.
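Because ebsd_data and mc_data must already be divided by their own means, a small preprocessing step is needed before calling this function. The sketch below shows that normalisation together with a 1-D Wasserstein-1 distance for the special case of two equal-size samples, where the distance reduces to the mean absolute difference between sorted values. Both helpers are hypothetical; whether find_repr_mcgs_props uses this estimator is not stated here, and `scipy.stats.wasserstein_distance` covers the general case of unequal sample sizes.

```python
import statistics


def mean_normalise(values):
    """Divide every value by the sample mean, so the result has mean 1."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("cannot normalise a sample whose mean is zero")
    return [v / mean for v in values]


def wasserstein_equal_n(a, b):
    """1-D Wasserstein-1 distance between two equal-size samples."""
    if len(a) != len(b):
        raise ValueError("this shortcut only holds for equal sample sizes")
    diffs = [abs(x - y) for x, y in zip(sorted(a), sorted(b))]
    return statistics.fmean(diffs)


ebsd_area = mean_normalise([2.0, 4.0, 6.0])     # -> [0.5, 1.0, 1.5]
mc_area = mean_normalise([10.0, 10.0, 40.0])    # -> [0.5, 0.5, 2.0]
distance = wasserstein_equal_n(ebsd_area, mc_area)
```

The two made-up samples differ by a factor of five in raw scale, yet after normalisation only their shapes contribute to the distance.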
- upxo.viz.vizDistr.plot_qq_comparison(ebsd_data: dict, mc_data: dict, props: list, prop_labels: dict | None = None, figsize_per: tuple = (4, 4), dpi: int = 100, ncols: int | None = None, fontsize: float = 9.0) None[source]
Quantile–Quantile (Q-Q) comparison of EBSD vs MC grain property distributions.
A Q-Q plot maps the quantiles of one distribution against the quantiles of another at the same probability levels (0 % to 100 %). Both distributions are normalised by their own mean before comparison, so the x- and y-axes share the same dimensionless scale centred near 1.0.
Interpretation
Points on the diagonal (y = x) — the two distributions have identical shape at that quantile. Perfect agreement.
Points above the diagonal — the MC distribution has larger values than EBSD at that quantile (heavier upper tail or higher spread in MC).
Points below the diagonal — the MC distribution has smaller values than EBSD at that quantile.
Deviations concentrated in the lower-left — fine/small grains differ.
Deviations concentrated in the upper-right — large/coarse grains differ.
One subplot is drawn per property; each MC slice is a separate line. The dashed black diagonal marks perfect distributional agreement.
- Parameters:
ebsd_data (dict) – {prop: array} of EBSD-merged values, each normalised by own mean.
mc_data (dict) – {slice_key: {prop: array}} of MC values, each normalised by own mean.
props (list of str) – Properties to plot.
prop_labels (dict or None) – {prop: display_label}. Defaults to f'{prop} (mean normalized)'.
figsize_per (tuple) – (width, height) per subplot in inches.
dpi (int)
ncols (int or None) – Subplot grid columns. None places all panels in a single row.
fontsize (float)
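The quantile pairing behind a Q-Q plot can be illustrated with the standard library. In this sketch, `qq_points` (a hypothetical helper) evaluates both samples at the same interior probability levels and pairs the results. Note that `statistics.quantiles` returns only the n - 1 interior cut points, so the 0 % and 100 % levels mentioned above are not included. The sample values are made up.

```python
import statistics


def qq_points(reference, candidate, n_quantiles=20):
    """Pair the quantiles of two samples at the same probability levels.

    Returns a list of (x, y) tuples: x from `reference` (e.g. EBSD),
    y from `candidate` (e.g. one MC slice).
    """
    ref_q = statistics.quantiles(reference, n=n_quantiles, method="inclusive")
    cand_q = statistics.quantiles(candidate, n=n_quantiles, method="inclusive")
    return list(zip(ref_q, cand_q))


ebsd = [0.5, 0.75, 1.0, 1.25, 1.5]   # mean-normalised, mean = 1.0
mc = [0.4, 0.6, 0.8, 1.2, 2.0]       # mean-normalised, mean = 1.0

same = qq_points(ebsd, ebsd, n_quantiles=4)    # identical samples
mixed = qq_points(ebsd, mc, n_quantiles=4)     # quartiles of each sample
below = sum(1 for x, y in mixed if y < x)      # points under the diagonal
```

Plotting the y-values against the x-values together with the line y = x gives the kind of comparison described in the interpretation notes.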
- upxo.viz.vizDistr.plot_ebsd_tvf(tvf_result: dict, figsize: tuple = (7, 4), dpi: int = 100, fontsize: float = 9.0, title: str = 'EBSD grain-role area fractions') None[source]
Horizontal bar chart of EBSD twin area fraction broken down by grain role.
Bars are drawn for each of the four grain-role categories:
Pure parents — matrix grains; never a twin of any grain.
Primary twins — first-generation twins whose parent is a pure parent.
Secondary twins — twins whose parent is itself an intermediate (twin-of-a-twin, 2nd generation).
Intermediate twins — grains that are simultaneously a twin of one grain and a parent of another (twin chains).
The overall twin area fraction (primary + secondary + intermediate) is annotated on the figure.
- Parameters:
tvf_result (dict) – Output of repgen2d.compute_ebsd_tvf. Must contain keys 'pure_parent_frac', 'primary_twin_frac', 'secondary_twin_frac', 'intermediate_frac', 'overall_twin_frac'.
figsize (tuple) – Figure size (width, height) in inches.
dpi (int) – Figure resolution.
fontsize (float) – Base font size for labels and tick marks.
title (str) – Figure title.
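A quick consistency check on a tvf_result dict before plotting might look like the sketch below. It uses the relation stated above (overall twin fraction = primary + secondary + intermediate). The additional check that all four roles sum to 1.0 is an assumption, since this page does not say that the four categories cover the whole mapped area. The helper names and numbers are made up.

```python
def overall_twin_fraction(tvf_result):
    """Sum of the three twin-role area fractions."""
    return (
        tvf_result["primary_twin_frac"]
        + tvf_result["secondary_twin_frac"]
        + tvf_result["intermediate_frac"]
    )


def is_consistent(tvf_result, tol=1e-9):
    """Check the stored overall fraction against the per-role fractions."""
    overall = overall_twin_fraction(tvf_result)
    # Assumption: the four grain roles partition the mapped area.
    total = overall + tvf_result["pure_parent_frac"]
    return (
        abs(overall - tvf_result["overall_twin_frac"]) <= tol
        and abs(total - 1.0) <= tol
    )


example_tvf = {
    "pure_parent_frac": 0.70,
    "primary_twin_frac": 0.20,
    "secondary_twin_frac": 0.04,
    "intermediate_frac": 0.06,
    "overall_twin_frac": 0.30,
}
```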