upxo.repqual.mcgs2d_representativeness_assesser module

class upxo.repqual.mcgs2d_representativeness_assesser.mc2repr(target_type=None, target=None, samples=None, par_bounds=None, metrics=None, kde_options=None, stest={'ks_p_threshold': 0.9, 'kw_p_threshold': 0.9, 'mw_p_threshold': 0.9, 'tests': ['correlation', 'kldiv', 'ks', 'jsdiv', 'mannwhitneyu', 'kruskalwallis']}, test_metrics=['mode0_location', 'mode0_count', 'mode1_location', 'mode1_count', 'mean'], parameters=['area'])[source]

Bases: object

Representativeness qualification.

target_type: str
Source of target data. Options:
  1. ebsd0 - un-processed 2D EBSD map: DefDAP object.

  2. ebsd1 - processed DefDAP data. Remapped with average orientation.

  3. umc2 - UPXO Monte-Carlo Grain structure 2D.

  4. umc3 - UPXO Monte-Carlo Grain structure 3D.

  5. uvt2 - UPXO Voronoi-Tessellation Grain Structure 2D.

  6. stats - Data samples across grain morphology parameters. Needs xori.

    Could be in the form of a dictionary or a pandas DataFrame. For a dict or a DataFrame, the key or column name, respectively, must be the name of the parameter. Examples of parameter names include:

    1. area, perimeter

    2. aspect ratio, morphological orientation
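As a rough illustration of the stats form, the sketch below builds a target as a plain dict keyed by parameter name. The numbers and the helper missing_parameters are made up for illustration and are not part of UPXO; the same layout maps to a pandas DataFrame with one column per parameter.

```python
# Hypothetical per-grain values, one list per morphology parameter.
target_stats = {
    "area": [12.0, 15.5, 9.8, 22.1, 18.4],
    "perimeter": [14.2, 16.0, 12.5, 19.3, 17.1],
}


def missing_parameters(stats, parameters):
    """Return the requested parameter names that have no key in stats."""
    return [name for name in parameters if name not in stats]


# "aspect ratio" is requested but was never supplied in target_stats.
print(missing_parameters(target_stats, ["area", "aspect ratio"]))
```

A check of this kind makes the naming rule explicit: a parameter can only be compared if its name appears as a key (or column) in the target data.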

target:
Target data, matching target_type:

  1. MCGS.gs[tslice] for umc2 and umc3

  2. VTGS for uvt2

  3. ddap_ebsd - for un-processed or processed DefDAP data

samples: dict
Keys should be sample names. Values should contain either grain structure objects or the flag string ‘make’.

If a value is a grain structure object, it will be used as a sample. It can be of type (a) umc2, (b) umc3 or (c) uvt2. If a value is ‘make’, the following will be performed:

  1. Read the Excel file for grain structure generation parameters

  2. Simulate the grain structure evolution

  3. Pull out specified slices at specified temporal slice intervals

  4. Characterize the temporal slices
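The shape of such a samples dictionary can be sketched as follows. DummyGrainStructure and split_samples are hypothetical stand-ins used only to show the two kinds of value; they are not UPXO names.

```python
class DummyGrainStructure:
    """Placeholder for a umc2, umc3 or uvt2 grain structure object."""


samples = {
    "sample_a": DummyGrainStructure(),  # ready-made object, used directly
    "sample_b": "make",                 # flag: generate and characterize
}


def split_samples(samples):
    """Separate sample names into ready-made objects and 'make' requests."""
    to_make = [name for name, value in samples.items()
               if isinstance(value, str) and value == "make"]
    ready = [name for name in samples if name not in to_make]
    return ready, to_make


ready, to_make = split_samples(samples)
print("ready:", ready)
print("to make:", to_make)
```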

par_bounds: dict
For each parameter key, the value must be a list of:
[match bounds for peak locations in percentage, match bounds for peak location density in percentage, J-S test bounds]

KEYS:

area, perimeter, aspect ratio

VALUES:

bounds: [ [5, 5], [5, 5], [0.1, 0.1]]
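A minimal sketch of how such bounds might be laid out and applied is given below. It assumes each pair is [lower, upper] tolerance and that peak locations are positive; the helper peak_location_matches is hypothetical and only illustrates a percentage check, not the class's actual logic.

```python
# One entry per parameter: [peak location %, peak density %, J-S bounds].
par_bounds = {
    "area": [[5, 5], [5, 5], [0.1, 0.1]],
    "perimeter": [[5, 5], [5, 5], [0.1, 0.1]],
}


def peak_location_matches(target_loc, sample_loc, bounds_percent):
    """Check a sample peak location against the target within % bounds.

    bounds_percent is assumed to be [lower, upper] tolerances in percent,
    and target_loc is assumed to be positive.
    """
    lower = target_loc * (1 - bounds_percent[0] / 100)
    upper = target_loc * (1 + bounds_percent[1] / 100)
    return lower <= sample_loc <= upper


location_bounds = par_bounds["area"][0]
print(peak_location_matches(100.0, 103.0, location_bounds))  # inside 5 %
print(peak_location_matches(100.0, 108.0, location_bounds))  # outside 5 %
```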

metrics: list
List of metrics to use for representativeness qualification. Examples include:

  1. modes_n

  2. modes_loc

  3. modes_width

  4. distr_type

  5. skewness

  6. kurtosis
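For two of these metrics, skewness and kurtosis, a dependency-free sketch is shown below. It uses the biased, moment-based definitions with excess kurtosis (the same convention as the defaults of scipy.stats.skew and scipy.stats.kurtosis); which convention the class itself uses is not stated here, so treat this as an assumption.

```python
import statistics


def skewness_and_kurtosis(values):
    """Return moment-based (biased) skewness and excess kurtosis."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance (biased)
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_tailed = [1.0, 1.0, 2.0, 2.0, 10.0]
print(skewness_and_kurtosis(symmetric))     # skewness is zero by symmetry
print(skewness_and_kurtosis(right_tailed))  # positive skewness
```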

kde_options: dict
key: bw_method; value: choose from ‘scott’, ‘silverman’ or a scalar value
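The bw_method choices match those accepted by scipy.stats.gaussian_kde. Whether the class uses that estimator internally is an assumption; the sketch below only shows how the three kinds of value change the bandwidth factor on made-up bimodal data.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
# Hypothetical bimodal "grain area" sample: 300 small and 200 large grains.
areas = np.concatenate([rng.normal(10, 1, 300), rng.normal(20, 2, 200)])

grid = np.linspace(areas.min(), areas.max(), 400)
for bw_method in ("scott", "silverman", 0.1):
    kde = gaussian_kde(areas, bw_method=bw_method)
    density = kde(grid)
    peak = float(grid[np.argmax(density)])
    print(bw_method, "factor:", round(float(kde.factor), 3),
          "highest peak near:", round(peak, 2))
```

A smaller factor gives a less smoothed estimate, which tends to resolve modes more sharply but is also more sensitive to sampling noise.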


target_type
target
samples
par_bounds
metrics
kde_options
stest
test_metrics
parameters
performance
load_target(target=None, target_type=None)[source]

Load or import target.

load_samples(samples=None)[source]

Load or import samples.

add_sample(sample=None)[source]

Add or insert sample.

set_stests(tests)[source]

Set or update the statistical tests to run.

set_cor_thresh(cor_threshold)[source]

Set or update the correlation threshold.

set_kldiv_thresh(kldiv_thresh)[source]

Set or update the Kullback-Leibler divergence threshold.

set_ks_thresh(ks_thresh_D, ks_thresh_P)[source]

Set or update the Kolmogorov-Smirnov thresholds for the D-statistic (ks_thresh_D) and the P-value (ks_thresh_P).

set_jsdiv_thresh(jsdiv_thresh)[source]

Set or update the Jensen-Shannon divergence threshold.

prop_to_excel(filename='pxtal_properties')[source]

Export properties to an Excel file named filename.

build_distribution_dataset()[source]

Build and return distribution dataset.

determine_distr_type()[source]

Determine the distribution type.

test()[source]

TEST 1: correlation: For two datasets, this is a measure of the linear relationship between them. If the correlation is close to 1, the distributions are very similar.
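As an illustration, the Pearson correlation between two histograms on shared bins can be computed as below. The bin frequencies are made up, and comparing binned frequencies (rather than, say, KDE curves) is an assumption about how the two datasets are paired.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical relative frequencies evaluated on the same bins.
target_hist = [0.05, 0.20, 0.40, 0.25, 0.10]
similar_hist = [0.06, 0.22, 0.38, 0.24, 0.10]
shifted_hist = [0.40, 0.25, 0.10, 0.05, 0.20]

print(round(pearson(target_hist, similar_hist), 3))  # close to 1
print(round(pearson(target_hist, shifted_hist), 3))  # far from 1
```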

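TEST 2: kldiv: Kullback-Leibler divergence: A non-negative, asymmetric measure of how one distribution differs from another. A value of 0 means the distributions are identical.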
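A minimal sketch of the Kullback-Leibler divergence for discrete (binned) distributions follows. It assumes both inputs are normalised and that the second is non-zero wherever the first is; scipy.stats.entropy(p, q) computes the same quantity. The probabilities are illustrative only.

```python
import math


def kl_divergence(p, q):
    """KL divergence D(p || q) in nats for discrete distributions.

    Assumes p and q are normalised and q > 0 wherever p > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


target = [0.1, 0.4, 0.5]
sample = [0.2, 0.3, 0.5]
print(kl_divergence(target, target))  # zero for identical distributions
# The measure is asymmetric: swapping the arguments changes the value.
print(kl_divergence(target, sample), kl_divergence(sample, target))
```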

TEST 3: ks: Kolmogorov-Smirnov test: Determines whether the two distribution samples differ significantly. It uses the cumulative distributions of the two datasets. Returns the D-statistic and P-value.

  • D-statistic: maximum absolute difference (supremum) between the CDFs of the two samples. A smaller D-statistic value is indicative of similar distributions.

  • P-value: probability of observing a D-statistic at least this large if the two samples were drawn from the same distribution. If the P-value is low (<= 0.05), the distributions are different. If the P-value is high (> 0.05), we cannot reject the null hypothesis that the two distributions are the same.

  • Note: if P <= 0.05, the null hypothesis that the two samples are drawn from the same distribution can be rejected, indicating that the samples are not representative of the target.
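The two-sample test can be tried out with scipy.stats.ks_2samp, as sketched below on synthetic lognormal data. The data and the 0.05 cut-off are illustrative, and whether the class calls this exact function is an assumption.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)
# Hypothetical grain-size samples: two from the same law, one shifted.
target = rng.lognormal(mean=2.0, sigma=0.4, size=500)
similar = rng.lognormal(mean=2.0, sigma=0.4, size=500)
shifted = rng.lognormal(mean=2.4, sigma=0.4, size=500)

for name, sample in (("similar", similar), ("shifted", shifted)):
    result = ks_2samp(target, sample)
    verdict = "cannot reject" if result.pvalue > 0.05 else "reject"
    print(f"{name}: D={result.statistic:.3f}, "
          f"P={result.pvalue:.3g} -> {verdict} H0")
```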

TEST 4: jsdiv: Jensen-Shannon divergence: With base-2 logarithms, the value will always be between 0 and 1. At 0: distributions are identical. At 1: distributions are completely different.
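The two limiting cases can be checked with a small sketch of the Jensen-Shannon divergence for discrete distributions, using base-2 logarithms so that the result lies in [0, 1]. Note that scipy.spatial.distance.jensenshannon returns the square root of this quantity (the J-S distance); which of the two the class reports is not stated here.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence with base-2 logs; result lies in [0, 1]."""
    def kl(a, b):
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    # Mixture distribution; it is non-zero wherever p or q is non-zero.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


print(js_divergence([0.5, 0.5, 0.0], [0.5, 0.5, 0.0]))  # identical -> 0.0
print(js_divergence([1.0, 0.0], [0.0, 1.0]))            # disjoint -> 1.0
```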

TEST 5: mannwhitneyu: Mann-Whitney test: Used to determine whether two distribution samples are drawn from populations having the same distribution. If the P-value is less than or equal to 0.05, the distributions are different. If the P-value is > 0.05, we cannot reject the hypothesis that the two distributions are the same.
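A short sketch using scipy.stats.mannwhitneyu on synthetic data is given below. The samples and the 0.05 cut-off are illustrative, and the two-sided alternative is an assumption about how the test is configured.

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(2)
# Hypothetical samples whose locations differ by one standard deviation.
target = rng.normal(50, 5, 200)
sample = rng.normal(55, 5, 200)

result = mannwhitneyu(target, sample, alternative="two-sided")
print(f"U={result.statistic:.1f}, P={result.pvalue:.3g}")
print("different" if result.pvalue <= 0.05 else "cannot reject H0")
```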

TEST 6: kruskalwallis: Kruskal-Wallis test: Used to determine whether there are statistically significant differences between two distributions.
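The Kruskal-Wallis test is available as scipy.stats.kruskal; a minimal sketch on made-up gamma-distributed data follows. The data and the 0.05 cut-off are illustrative, and the use of this function by the class is an assumption.

```python
import numpy as np
from scipy.stats import kruskal

rng = np.random.default_rng(3)
# Hypothetical samples with the same shape but different scale.
target = rng.gamma(shape=4.0, scale=5.0, size=300)
sample = rng.gamma(shape=4.0, scale=7.0, size=300)

statistic, pvalue = kruskal(target, sample)
print(f"H={statistic:.2f}, P={pvalue:.3g}")
print("significant difference" if pvalue <= 0.05 else "cannot reject H0")
```

The function accepts more than two groups, so several samples could be compared against the target in one call, although a significant result then only says that at least one group differs.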

stat_tests
test_threshold
distr_type