upxo.repqual.mcgs2d_representativeness_assesser module
- class upxo.repqual.mcgs2d_representativeness_assesser.mc2repr(target_type=None, target=None, samples=None, par_bounds=None, metrics=None, kde_options=None, stest={'ks_p_threshold': 0.9, 'kw_p_threshold': 0.9, 'mw_p_threshold': 0.9, 'tests': ['correlation', 'kldiv', 'ks', 'jsdiv', 'mannwhitneyu', 'kruskalwallis']}, test_metrics=['mode0_location', 'mode0_count', 'mode1_location', 'mode1_count', 'mean'], parameters=['area'])[source]
Bases: object
Representativeness qualification
- target_type: str
Source of target data. Options:
ebsd0 - un-processed 2D EBSD map: DefDAP object.
ebsd1 - processed DefDAP data. Remapped with avg. ori.
umc2 - UPXO Monte-Carlo Grain structure 2D.
umc3 - UPXO Monte-Carlo Grain structure 3D.
uvt2 - UPXO Voronoi-Tessellation Grain Structure 2D.
- stats - Data samples across grain morphology par. Needs xori.
Could be in the form of a dictionary or a pandas DataFrame. For a dict or a pandas DataFrame, the key or column name, respectively, must be the name of the parameter. Examples of parameter names include:
area, perimeter
aspect ratio, morphological orientation
MCGS.gs[tslice] for umc2 and umc3
VTGS for uvt2
ddap_ebsd - for un-processed or processed DefDAP data
Keys should be sample_names. Values should contain either:
grain structure objects, or the flag string ‘make’
If a value is a grain structure object, then it will be used as a sample. It can be of type (a) umc2, (b) umc3 or (c) uvt2. If a value is ‘make’, then the following will be performed:
read the excel file for grain structure generation parameters
simulate the grain structure evolution
Pull out specified slices at specified temporal slice intervals
Characterize the temporal slices
- For each parameter in the key, value must be a list of:
- [match bounds for peak locations in percentage,
match bounds for peak location density in percentage, J-S test bounds ]
- KEYS:
area, perimeter, aspect ratio
- VALUES:
bounds: [ [5, 5], [5, 5], [0.1, 0.1]]
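A minimal sketch of how a par_bounds dictionary of this shape could look and be applied. The helper within_percent and the reading of each pair as [lower %, upper %] around the target value are assumptions made for illustration; they are not taken from UPXO.

```python
# Hypothetical par_bounds layout, following the KEYS/VALUES description above:
# [peak-location bounds (%), peak-density bounds (%), J-S test bounds]
par_bounds = {
    "area": [[5, 5], [5, 5], [0.1, 0.1]],
    "perimeter": [[5, 5], [5, 5], [0.1, 0.1]],
    "aspect ratio": [[5, 5], [5, 5], [0.1, 0.1]],
}


def within_percent(sample_value, target_value, bounds):
    """Assumed reading: bounds = [lower %, upper %] around a positive target value."""
    lower = target_value * (1 - bounds[0] / 100)
    upper = target_value * (1 + bounds[1] / 100)
    return lower <= sample_value <= upper


peak_location_bounds = par_bounds["area"][0]
print(within_percent(103.0, 100.0, peak_location_bounds))  # True
print(within_percent(106.0, 100.0, peak_location_bounds))  # False
```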
List of metrics to use for representativeness qualification. Examples include:
modes_n
modes_loc
modes_width
distr_type
skewness
kurtosis
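Two of the metrics listed above, skewness and kurtosis, can be sketched in plain Python as follows. This uses the biased (population) moment definitions and reports excess kurtosis, which is what scipy.stats.skew and scipy.stats.kurtosis return by default; whether mc2repr computes them this way is an assumption. The function names are hypothetical.

```python
def central_moment(data, k):
    """k-th central moment of the data (divides by n, not n - 1)."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** k for x in data) / len(data)


def skewness(data):
    """Third standardized moment; 0 for a symmetric sample."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def excess_kurtosis(data):
    """Fourth standardized moment minus 3 (0 for a normal distribution)."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3.0


symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]       # made-up data
right_tailed = [1.0, 1.0, 2.0, 2.0, 3.0, 10.0]  # made-up data
print(skewness(symmetric), excess_kurtosis(symmetric))
print(skewness(right_tailed))
```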
key: bw_method value: choose from ‘scott’, ‘silverman’ or a scalar value
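The bw_method values above match the ones accepted by scipy.stats.gaussian_kde. As a rough illustration of what they control, here is a plain-Python 1-D Gaussian KDE using the usual Scott and Silverman bandwidth factors. The function names and the sample data are hypothetical, and this is a sketch rather than the code mc2repr runs.

```python
import math
import statistics


def bandwidth_factor(n, bw_method="scott"):
    """Bandwidth factor for 1-D data with n points."""
    if bw_method == "scott":
        return n ** (-1 / 5)
    if bw_method == "silverman":
        return (n * 3 / 4) ** (-1 / 5)
    return float(bw_method)  # a scalar is used directly as the factor


def gaussian_kde_1d(data, bw_method="scott"):
    """Return a function that estimates the density of `data` at a point x."""
    n = len(data)
    h = bandwidth_factor(n, bw_method) * statistics.stdev(data)
    norm = n * h * math.sqrt(2 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data) / norm

    return density


kde_options = {"bw_method": "scott"}  # hypothetical kde_options value
grain_areas = [12.0, 15.5, 14.2, 30.1, 28.7, 13.9, 29.5, 16.0]  # made-up data
density = gaussian_kde_1d(grain_areas, **kde_options)
print(density(15.0), density(22.0))
```

A smaller scalar bw_method gives a narrower kernel and therefore sharper, more numerous modes, which directly affects mode-based metrics.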
- target_type
- target
- samples
- par_bounds
- metrics
- kde_options
- stest
- test_metrics
- parameters
- performance
- test()[source]
TEST 1: correlation: For two datasets, a measure of the linear relationship between them. If the correlation is close to 1, the distributions are very similar.
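A sketch of one way to apply a correlation test to two samples of different sizes: bin both on shared edges and compute the Pearson correlation of the bin counts. The binning step is an assumption; the source does not say how mc2repr aligns the two datasets. All names and data are hypothetical.

```python
import math


def histogram(data, edges):
    """Count how many values fall in each [edges[i], edges[i + 1]) bin."""
    counts = [0] * (len(edges) - 1)
    for x in data:
        for i in range(len(counts)):
            if edges[i] <= x < edges[i + 1]:
                counts[i] += 1
                break
    return counts


def pearson_r(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


edges = [0, 10, 20, 30, 40]
target = [3, 12, 14, 15, 18, 22, 25, 33]      # made-up target areas
sample = [5, 11, 13, 16, 17, 19, 24, 27, 35]  # made-up sample areas
r = pearson_r(histogram(target, edges), histogram(sample, edges))
print(r)
```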
TEST 2: kldiv: Kullback-Leibler divergence.
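For reference, the Kullback-Leibler divergence of two discrete distributions on the same bins is D_KL(P || Q) = sum_i p_i * log(p_i / q_i). The sketch below assumes both inputs are already normalized probabilities and that q_i > 0 wherever p_i > 0; the function name is hypothetical. scipy.stats.entropy(p, q) computes the same quantity.

```python
import math


def kl_divergence(p, q):
    """D_KL(P || Q) in nats for two discrete distributions on the same bins."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


target_probs = [0.5, 0.5]    # made-up bin probabilities
sample_probs = [0.25, 0.75]  # made-up bin probabilities
print(kl_divergence(target_probs, target_probs))  # identical distributions
print(kl_divergence(target_probs, sample_probs))
print(kl_divergence(sample_probs, target_probs))  # not symmetric
```

The divergence is 0 for identical distributions and is not symmetric in its arguments, so the order of target and sample matters.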
TEST 3: ks: Kolmogorov-Smirnov test: Determines if the two distribution samples differ significantly. It uses the cumulative distributions of the two datasets. Returns the D-statistic and the P-value.
D-statistic: maximum absolute difference (supremum) between the CDFs of the two samples. A smaller D-statistic value is indicative of similar distributions.
P-value: probability of observing a D-statistic at least this large if the two samples were drawn from the same distribution. If the P-value is low (<= 0.05), the distributions are different. If the P-value is high (> 0.05), we cannot reject the null hypothesis that the two distributions are the same.
Note: if P <= 0.05, the null hypothesis that the two samples are drawn from the same distribution can be rejected, indicating that the samples are not representative of the target.
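The D-statistic described above can be sketched directly from the two empirical CDFs. This plain-Python version only computes D; in practice scipy.stats.ks_2samp returns both the statistic and the P-value. The function name is hypothetical.

```python
def ks_d_statistic(a, b):
    """Largest absolute gap between the empirical CDFs of samples a and b."""
    d = 0.0
    # The supremum is attained at one of the observed data points.
    for x in list(a) + list(b):
        cdf_a = sum(v <= x for v in a) / len(a)
        cdf_b = sum(v <= x for v in b) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


print(ks_d_statistic([1, 2, 3], [1, 2, 3]))  # 0.0
print(ks_d_statistic([1, 2, 3], [4, 5, 6]))  # 1.0
```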
TEST 4: jsdiv: Jensen-Shannon divergence. The value will always be between 0 and 1. @ 0: Distributions are identical. @ 1: Distributions are completely different.
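A sketch of the Jensen-Shannon divergence for two discrete distributions on the same bins, assuming a base-2 logarithm so that the result spans exactly 0 to 1. Whether mc2repr uses base 2 is an assumption, and the function names are hypothetical. Note that scipy.spatial.distance.jensenshannon returns the square root of this quantity (the J-S distance).

```python
import math


def kl_divergence(p, q, base=2):
    """D_KL(P || Q) for discrete distributions; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi, base) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q, base=2):
    """Symmetric divergence built from the mixture M = (P + Q) / 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m, base) + 0.5 * kl_divergence(q, m, base)


print(js_divergence([0.5, 0.5], [0.5, 0.5]))  # 0.0
print(js_divergence([1.0, 0.0], [0.0, 1.0]))  # 1.0
```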
TEST 5: mannwhitneyu: Mann-Whitney test: Used to determine if two distribution samples are drawn from populations having the same distribution. If the P-value is less than or equal to 0.05, the distributions are different. If the P-value is > 0.05, the two distributions are treated as similar.
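To make the Mann-Whitney decision rule concrete, here is a plain-Python sketch of the U statistic with a two-sided P-value from the normal approximation. It applies no tie or continuity correction, so its P-values will differ somewhat from scipy.stats.mannwhitneyu, which is the usual choice in practice. The function names are hypothetical.

```python
import math


def mann_whitney_u(a, b):
    """U statistic for sample a: each pair with a > b counts 1, each tie 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


def mann_whitney_p(a, b):
    """Two-sided P-value via the normal approximation (no corrections)."""
    n1, n2 = len(a), len(b)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (mann_whitney_u(a, b) - mean_u) / sd_u
    return math.erfc(abs(z) / math.sqrt(2))


print(mann_whitney_u([1, 2, 3], [4, 5, 6]))  # 0.0
print(mann_whitney_p([1, 2, 3], [1, 2, 3]))  # 1.0
print(mann_whitney_p([1, 2, 3], [4, 5, 6]))
```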
TEST 6: kruskalwallis: Kruskal-Wallis test. Used to determine if there are statistically significant differences between two distributions.
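A plain-Python sketch of the Kruskal-Wallis H statistic for the two-sample case described above. Ranks are averaged over ties but no tie correction is applied to H, and the P-value uses the chi-square distribution with one degree of freedom; scipy.stats.kruskal is the usual choice in practice. The function names are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks of the values, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_two_groups(a, b):
    """Return (H, P-value) for two samples, without tie correction."""
    ranks = average_ranks(list(a) + list(b))
    n1, n2 = len(a), len(b)
    n = n1 + n2
    r1, r2 = sum(ranks[:n1]), sum(ranks[n1:])
    h = 12 / (n * (n + 1)) * (r1 ** 2 / n1 + r2 ** 2 / n2) - 3 * (n + 1)
    # Chi-square survival function with 1 degree of freedom.
    p = math.erfc(math.sqrt(max(h, 0.0) / 2))
    return h, p


h, p = kruskal_wallis_two_groups([1, 2, 3], [4, 5, 6])
print(h, p)
```

With only two groups, H equals the square of the uncorrected Mann-Whitney z-score, so the two tests give the same P-value under these approximations.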
- stat_tests
- test_threshold
- distr_type