upxo.repqual.mcgs2d_representativeness_assesser module

class upxo.repqual.mcgs2d_representativeness_assesser.mc2repr(target_type=None, target=None, samples=None, par_bounds=None, metrics=None, kde_options=None, stest={'ks_p_threshold': 0.9, 'kw_p_threshold': 0.9, 'mw_p_threshold': 0.9, 'tests': ['correlation', 'kldiv', 'ks', 'jsdiv', 'mannwhitneyu', 'kruskalwallis']}, test_metrics=['mode0_location', 'mode0_count', 'mode1_location', 'mode1_count', 'mean'], parameters=['area'])[source]

Bases: object

Representativeness qualification

target_type: str

Source of target data. Options:

ebsd0 - un-processed 2D EBSD map: DefDAP object.
ebsd1 - processed DefDAP data. Remapped with avg. ori.
umc2 - UPXO Monte-Carlo Grain structure 2D.
umc3 - UPXO Monte-Carlo Grain structure 3D.
uvt2 - UPXO Voronoi-Tessellation Grain Structure 2D.
stats - Data samples across grain morphology par. Needs xori.
Could be a dictionary or a pandas DataFrame. The dictionary keys or the DataFrame column names must be the names of the parameters. Examples of parameter names include:
area, perimeter

aspect ratio, morphological orientation
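As an illustration of the dictionary form of a stats target, the following minimal sketch keys each data sample by its parameter name. The numbers, the `aspect_ratio` spelling and the helper `check_stats_target` are hypothetical and are not part of UPXO.

```python
# Hypothetical 'stats' target: one list of measurements per parameter.
# Keys are parameter names; the values are made-up illustration data.
target_stats = {
    "area": [12.5, 8.1, 15.3, 9.7, 11.2],
    "perimeter": [14.0, 11.3, 16.8, 12.1, 13.5],
}


def check_stats_target(stats, parameters):
    """Return the requested parameter names that are missing from the target."""
    return [name for name in parameters if name not in stats]


missing = check_stats_target(target_stats, ["area", "aspect_ratio"])
print(missing)  # ['aspect_ratio']
```

The same layout carries over to a pandas DataFrame with one column per parameter.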

MCGS.gs[tslice] for umc2 and umc3

VTGS for uvt2

ddap_ebsd - for un-processed or processed DefDAP data

Keys should be sample names. Values should contain either:

grain structure objects, or flag-string, ‘make’

If a value is a grain structure object, it will be used as a sample. It can be of type (a) umc2, (b) umc3 or (c) uvt2. If a value is ‘make’, the following will be performed:

Read the Excel file for grain structure generation parameters

Simulate the grain structure evolution

Pull out specified slices at specified temporal slice intervals

Characterize the temporal slices
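The samples dictionary described above can be sketched as follows. The class `DummyGrainStructure`, the sample names and the helper `split_samples` are hypothetical stand-ins; the sketch only shows how entries holding grain structure objects could be told apart from entries flagged with ‘make’.

```python
class DummyGrainStructure:
    """Stand-in for a umc2, umc3 or uvt2 grain structure object (hypothetical)."""


samples = {
    "sample_a": DummyGrainStructure(),  # used directly as a sample
    "sample_b": "make",                 # flag-string: this sample must be generated
}


def split_samples(samples):
    """Separate ready-made grain structures from entries flagged 'make'."""
    ready, to_make = {}, []
    for name, value in samples.items():
        if isinstance(value, str) and value == "make":
            to_make.append(name)
        else:
            ready[name] = value
    return ready, to_make


ready, to_make = split_samples(samples)
print(list(ready), to_make)
```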

For each parameter key, the value must be a list of:

[match bounds for peak locations in percentage,
match bounds for peak location density in percentage,
J-S test bounds]

KEYS:
area, perimeter, aspect ratio

VALUES:
bounds: [ [5, 5], [5, 5], [0.1, 0.1]]
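A possible par_bounds dictionary and one way of applying a percentage bound are sketched below. Reading each pair as [lower tolerance, upper tolerance] is an assumption made for illustration, and `within_percent` is a hypothetical helper rather than a UPXO function.

```python
par_bounds = {
    "area": [[5, 5], [5, 5], [0.1, 0.1]],
    "perimeter": [[5, 5], [5, 5], [0.1, 0.1]],
}


def within_percent(sample_value, target_value, bounds):
    """Check sample_value against target_value using [lower %, upper %] tolerances.

    Assumes target_value > 0. The [lower, upper] reading of the pair is an
    assumption, not documented UPXO behaviour.
    """
    lower_pct, upper_pct = bounds
    low = target_value * (1 - lower_pct / 100)
    high = target_value * (1 + upper_pct / 100)
    return low <= sample_value <= high


peak_location_bounds = par_bounds["area"][0]
print(within_percent(10.3, 10.0, peak_location_bounds))  # 3 % above target: True
print(within_percent(11.0, 10.0, peak_location_bounds))  # 10 % above target: False
```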

List of metrics to use for representativeness qualification. Examples include:

modes_n

modes_loc

modes_width

distr_type

skewness

kurtosis
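Two of the listed metrics, skewness and kurtosis, can be computed from a data sample as in the sketch below. It uses SciPy on synthetic data; whether UPXO computes them this way is not stated here, so treat it as an illustration only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic right-skewed "grain area" data, for illustration only.
areas = rng.lognormal(mean=2.0, sigma=0.5, size=2000)

metrics = {
    "skewness": float(stats.skew(areas)),
    # scipy.stats.kurtosis returns excess kurtosis (Fisher definition) by default.
    "kurtosis": float(stats.kurtosis(areas)),
}
print(metrics)
```

A positive skewness indicates a longer tail towards large values, which is expected for the log-normal data used here.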

key: bw_method; value: choose from ‘scott’, ‘silverman’ or a scalar value
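The bw_method values listed above match the options accepted by `scipy.stats.gaussian_kde`. The sketch below assumes, without confirmation from this page, that kde_options is forwarded to a kernel density estimate of that kind, and shows how mode locations could then be read off the estimated density. The data are synthetic.

```python
import numpy as np
from scipy.signal import find_peaks
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)
# Synthetic bimodal data: two well-separated grain-size populations.
data = np.concatenate([rng.normal(10, 1, 1000), rng.normal(20, 1, 1000)])

kde_options = {"bw_method": "scott"}  # or "silverman", or a scalar such as 0.2
kde = gaussian_kde(data, bw_method=kde_options["bw_method"])

grid = np.linspace(data.min(), data.max(), 500)
density = kde(grid)

# Local maxima of the estimated density approximate the mode locations.
peak_idx, _ = find_peaks(density, prominence=0.01)
modes_loc = grid[peak_idx]
print(len(modes_loc), modes_loc)
```

A smaller scalar bandwidth resolves narrower peaks but may also report spurious modes, so the prominence threshold and bandwidth should be chosen together.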

target_type

target

samples

par_bounds

metrics

kde_options

stest

test_metrics

parameters

performance

load_target(target=None, target_type=None)[source]: Load or import target.

load_samples(samples=None)[source]: Load or import samples.

add_sample(sample=None)[source]: Add or insert sample.

set_stests(tests)[source]: Set or update stests.

set_cor_thresh(cor_threshold)[source]: Set or update cor thresh.

set_kldiv_thresh(kldiv_thresh)[source]: Set or update kldiv thresh.

set_ks_thresh(ks_thresh_D, ks_thresh_P)[source]: Set or update ks thresh.

set_jsdiv_thresh(jsdiv_thresh)[source]: Set or update jsdiv thresh.

prop_to_excel(filename='pxtal_properties')[source]: Prop to excel.

build_distribution_dataset()[source]: Build and return distribution dataset.

determine_distr_type()[source]: Determine distr type.

test()[source]

TEST 1: correlation: Measures the linear relationship between two datasets. A correlation close to 1 indicates that the distributions are very similar.
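A correlation check between two distributions could look like the sketch below. How UPXO builds the vectors that are correlated is not stated here; the sketch assumes histogram densities evaluated on shared bins and uses synthetic data.

```python
import numpy as np

rng = np.random.default_rng(2)
target = rng.normal(50, 10, 5000)
sample = rng.normal(50, 10, 5000)   # drawn from the same distribution
shifted = rng.normal(80, 10, 5000)  # drawn from a different distribution

bins = np.linspace(0, 130, 41)  # shared bin edges for all datasets


def hist_density(values):
    """Histogram density of values on the shared bins."""
    density, _ = np.histogram(values, bins=bins, density=True)
    return density


# Pearson correlation coefficient between the binned densities.
r_same = np.corrcoef(hist_density(target), hist_density(sample))[0, 1]
r_diff = np.corrcoef(hist_density(target), hist_density(shifted))[0, 1]
print(r_same, r_diff)
```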

TEST 2: kldiv: Kullback-Leibler divergence.
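The kldiv test name presumably refers to the Kullback-Leibler divergence. A minimal sketch on binned synthetic data follows; the binning and the small `eps` offset for empty bins are assumptions made for illustration.

```python
import numpy as np
from scipy.stats import entropy

rng = np.random.default_rng(3)
target = rng.normal(50, 10, 5000)
sample = rng.normal(55, 10, 5000)

bins = np.linspace(0, 110, 31)  # shared bin edges
eps = 1e-12                     # avoids division by zero in empty bins
p = np.histogram(target, bins=bins)[0] + eps
q = np.histogram(sample, bins=bins)[0] + eps

# entropy(p, q) normalises both inputs and returns sum(p * log(p / q)).
kl_pq = entropy(p, q)
kl_pp = entropy(p, p)
print(kl_pq, kl_pp)
```

The divergence is zero for identical distributions and grows as they differ. It is not symmetric, so `entropy(p, q)` and `entropy(q, p)` generally give different values.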

TEST 3: ks: Kolmogorov-Smirnov test: Determines if the two distribution samples differ significantly. It uses the cumulative distributions of the two datasets. Returns D-statistic and P-value.

D-statistic: maximum absolute difference (supremum) between the cumulative distribution functions (CDFs) of the two samples. A smaller D-statistic indicates more similar distributions.

P-value: probability of observing a D-statistic at least this large if the two samples were drawn from the same distribution. If the P-value is low (<= 0.05), the distributions are different. If the P-value is high (> 0.05), we cannot reject the null hypothesis that the two distributions are the same.

Note: if P <= 0.05, the null hypothesis that the two samples are drawn from the same distribution can be rejected, indicating that the samples are not representative of the target.
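The two-sample Kolmogorov-Smirnov test is available as `scipy.stats.ks_2samp`. The sketch below runs it on synthetic data and also recomputes the D-statistic by hand as the largest gap between the two empirical CDFs. Whether UPXO calls SciPy in this way is an assumption.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(4)
target = rng.normal(50, 10, 1000)
shifted = rng.normal(60, 10, 1000)

result = ks_2samp(target, shifted)
d_stat, p_value = result.statistic, result.pvalue

# Manual D-statistic: evaluate both empirical CDFs at every pooled data point
# and take the largest absolute difference.
pooled = np.sort(np.concatenate([target, shifted]))
cdf_target = np.searchsorted(np.sort(target), pooled, side="right") / target.size
cdf_shifted = np.searchsorted(np.sort(shifted), pooled, side="right") / shifted.size
d_manual = np.max(np.abs(cdf_target - cdf_shifted))

print(d_stat, d_manual, p_value)
```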

TEST 4: jsdiv: Jensen-Shannon divergence. The divergence value will always be between 0 and 1. At 0, the distributions are identical. At 1, the distributions are completely different.
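The 0 to 1 range can be checked with a small sketch of the Jensen-Shannon divergence. It assumes a base-2 logarithm, for which the upper limit is exactly 1, and uses `scipy.spatial.distance.jensenshannon`, which returns the square root of the divergence. The helper `js_divergence` is hypothetical.

```python
import numpy as np
from scipy.spatial.distance import jensenshannon


def js_divergence(p, q):
    """Jensen-Shannon divergence with a base-2 logarithm, bounded to [0, 1]."""
    # SciPy returns the JS distance, i.e. the square root of the divergence.
    return jensenshannon(p, q, base=2) ** 2


p = np.array([0.5, 0.5, 0.0, 0.0])
q = np.array([0.0, 0.0, 0.5, 0.5])  # no overlap with p

js_identical = js_divergence(p, p)
js_disjoint = js_divergence(p, q)
print(js_identical, js_disjoint)
```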

TEST 5: mannwhitneyu: Mann-Whitney test: Used to determine if two distribution samples are drawn from populations having the same distribution. If the P-value is less than or equal to 0.05, the distributions are different. If the P-value is > 0.05, the hypothesis that the two distributions are the same cannot be rejected.
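A minimal sketch of this test with `scipy.stats.mannwhitneyu` on synthetic data is given below. The two-sided alternative and the 0.05 cut-off are assumptions taken from the description above, not from the UPXO source.

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(5)
target = rng.normal(50, 10, 500)
shifted = rng.normal(60, 10, 500)

u_stat, p_value = mannwhitneyu(target, shifted, alternative="two-sided")

# A P-value at or below 0.05 is read as "the distributions differ".
is_different = p_value <= 0.05
print(u_stat, p_value, is_different)
```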

TEST 6: kruskalwallis: Kruskal-Wallis test: Used to determine if there are statistically significant differences between two distributions.
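The Kruskal-Wallis H-test is available as `scipy.stats.kruskal`. The sketch below applies it to two synthetic samples; the 0.05 significance level is an assumption made for illustration.

```python
import numpy as np
from scipy.stats import kruskal

rng = np.random.default_rng(6)
target = rng.normal(50, 10, 500)
shifted = rng.normal(60, 10, 500)

h_stat, p_value = kruskal(target, shifted)

# A small P-value indicates a statistically significant difference.
significant = p_value <= 0.05
print(h_stat, p_value, significant)
```

The test also accepts more than two groups, which allows several samples to be compared in one call.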

stat_tests

test_threshold

distr_type