Data Report — UCI_1150

Source: UCI dataset 1150

SemMap JSON-LD: dataset.semmap.json · RDFa HTML

Overview

Metric Value
Dataset UCI_1150
Source UCI dataset 1150
Rows 319
Columns 39
Discrete 8
Continuous 31
SemMap SemMap JSON-LD, SemMap HTML
Missingness Not modeled

Variables and summary

variable inferred dist
Gallstone Status discrete Gallstones absent [1]: 158 (49.53%)
Age continuous 48.0690 ± 12.1146 [20, 38.5, 49, 56, 96]
Gender discrete Female [1]: 157 (49.22%)
Comorbidity discrete 0: 217 (68.03%); 1: 99 (31.03%); 3: 2 (0.63%); 2: 1 (0.31%)
Coronary Artery Disease (CAD) discrete 1: 12 (3.76%)
Hypothyroidism discrete 1: 9 (2.82%)
Hyperlipidemia discrete 1: 8 (2.51%)
Diabetes Mellitus (DM) discrete 1: 43 (13.48%)
Height continuous 167.1567 ± 10.0530 [145, 159.5, 168, 175, 191]
Weight continuous 80.5649 ± 15.7091 [42.9, 69.6, 78.8, 91.25, 143.5]
Body Mass Index (BMI) continuous 28.8771 ± 5.3137 [17.4, 25.25, 28.3, 31.85, 49.7]
Total Body Water (TBW) continuous 40.5878 ± 7.9302 [13, 34.2, 39.8, 47, 66.2]
Extracellular Water (ECW) continuous 17.0712 ± 3.1619 [9, 14.8, 17.1, 19.4, 27.8]
Intracellular Water (ICW) continuous 23.6345 ± 5.3493 [13.8, 19.3, 23, 27.55, 57.1]
Extracellular Fluid/Total Body Water (ECF/TBW) continuous 42.2120 ± 3.2445 [29.23, 40.075, 42, 44, 52]
Total Body Fat Ratio (TBFR) (%) continuous 28.2750 ± 8.4444 [6.3, 22.025, 27.82, 34.81, 50.92]
Lean Mass (LM) (%) continuous 71.6382 ± 8.4376 [48.99, 65.165, 72.11, 77.85, 93.67]
Body Protein Content (Protein) (%) continuous 15.9388 ± 2.3347 [5.56, 14.465, 15.87, 17.43, 24.81]
Visceral Fat Rating (VFR) continuous 9.0784 ± 4.3325 [1, 6, 9, 12, 31]
Bone Mass (BM) continuous 2.8033 ± 0.5095 [1.4, 2.4, 2.8, 3.2, 4]
Muscle Mass (MM) continuous 54.2730 ± 10.6038 [4.7, 45.8, 53.9, 62.6, 78.8]
Obesity (%) continuous 35.8501 ± 109.7997 [0.4, 13.9, 25.6, 41.75, 1954]
Total Fat Content (TFC) continuous 23.4878 ± 9.6076 [3.1, 17, 22.6, 28.55, 62.5]
Visceral Fat Area (VFA) continuous 12.1716 ± 5.2622 [0.9, 8.57, 11.59, 15.1, 41]
Visceral Muscle Area (VMA) (Kg) continuous 30.4034 ± 4.4605 [18.9, 27.25, 30.4081, 33.8, 41.1]
Hepatic Fat Accumulation (HFA) discrete 0: 129 (40.44%); 2: 122 (38.24%); 1: 41 (12.85%); 3: 26 (8.15%); 4: 1 (0.31%)
Glucose continuous 108.6887 ± 44.8487 [69, 92, 98, 109, 575]
Total Cholesterol (TC) continuous 203.4953 ± 45.7585 [60, 172, 198, 233, 360]
Low Density Lipoprotein (LDL) continuous 126.6524 ± 38.5412 [11, 100.5, 122, 151, 293]
High Density Lipoprotein (HDL) continuous 49.4755 ± 17.7187 [25, 40, 46.5, 56, 273]
Triglyceride continuous 144.5022 ± 97.9045 [1.39, 83, 119, 172, 838]
Aspartat Aminotransferaz (AST) continuous 21.6850 ± 16.6976 [8, 15, 18, 23, 195]
Alanin Aminotransferaz (ALT) continuous 26.8558 ± 27.8844 [3, 14.25, 19, 30, 372]
Alkaline Phosphatase (ALP) continuous 73.1125 ± 24.1811 [7, 58, 71, 86, 197]
Creatinine continuous 0.8006 ± 0.1764 [0.46, 0.65, 0.79, 0.92, 1.46]
Glomerular Filtration Rate (GFR) continuous 100.8189 ± 16.9714 [10.6, 94.17, 104, 110.745, 132]
C-Reactive Protein (CRP) continuous 1.8539 ± 4.9896 [0, 0, 0.215, 1.615, 43.4]
Hemoglobin (HGB) continuous 14.4182 ± 1.7758 [8.5, 13.3, 14.4, 15.7, 18.8]
Vitamin D continuous 21.4014 ± 9.9817 [3.5, 13.25, 22, 28.06, 53.1]
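
The continuous rows above appear to follow the pattern "mean ± standard deviation [min, Q1, median, Q3, max]". The report does not state this, so the reading is an assumption, as are the sample standard deviation and the linearly interpolated quartiles used below. A minimal sketch of producing such a string with the Python standard library (the function name summarize is hypothetical):

```python
import statistics

def summarize(values):
    """Format a numeric column as 'mean ± sd [min, q1, median, q3, max]'."""
    xs = sorted(values)
    mean = statistics.fmean(xs)
    # Assumption: sample standard deviation (n - 1 in the denominator).
    sd = statistics.stdev(xs)
    # Assumption: quartiles by linear interpolation between order statistics.
    q1, med, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    return (
        f"{mean:.4f} ± {sd:.4f} "
        f"[{xs[0]:g}, {q1:g}, {med:g}, {q3:g}, {xs[-1]:g}]"
    )

if __name__ == "__main__":
    # Toy values only; these are not taken from the dataset.
    print(summarize([1, 2, 3, 4, 5]))
```

On the toy input the mean is 3 and the quartiles are 2, 3 and 4. A different quartile convention would shift Q1 and Q3 slightly on real data.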

Fidelity summary

model backend disc_jsd_mean disc_jsd_median cont_ks_mean cont_w1_mean downstream_sign_match
metasyn metasyn 0.0243 0.0107 0.0934 2.3584 1
clg_mi2 pybnesian 0.0238 0.0192 0.1047 4.2669 -
semi_mi5 pybnesian 0.0218 0.0189 0.0985 3.4149 -
ctgan_fast synthcity 0.1235 0.0649 0.8082 34.5586 -
tvae_quick synthcity 0.0792 0.0944 0.2344 5.3212 -
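
The column names suggest that discrete variables are scored with the Jensen-Shannon divergence (JSD) and continuous variables with the two-sample Kolmogorov-Smirnov statistic (KS) and the 1-Wasserstein distance (W1). The report does not give the exact definitions, so the following is only a sketch under stated assumptions: JSD is computed as a divergence in base 2 (not its square root), and W1 uses the equal-sample-size shortcut, which fits here because n_real = n_synth = 319. All function names are hypothetical.

```python
import math
from bisect import bisect_right
from collections import Counter

def jsd(real, synth):
    """Jensen-Shannon divergence (base 2) between two categorical samples."""
    p, q = Counter(real), Counter(synth)
    n_p, n_q = len(real), len(synth)
    total = 0.0
    for cat in set(p) | set(q):
        pi, qi = p[cat] / n_p, q[cat] / n_q
        mi = 0.5 * (pi + qi)  # mixture probability for this category
        if pi > 0:
            total += 0.5 * pi * math.log2(pi / mi)
        if qi > 0:
            total += 0.5 * qi * math.log2(qi / mi)
    return total

def ks_statistic(real, synth):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(real), sorted(synth)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )

def wasserstein1(real, synth):
    """1-Wasserstein distance for equal-size samples (sorted pairing)."""
    a, b = sorted(real), sorted(synth)
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)
```

KS is scale-free and bounded by 1, whereas W1 is expressed in the units of the variable. This is why the W1 means in the table are dominated by wide-ranging variables and are harder to compare across columns than the KS means.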

Privacy summary

model backend n_real n_synth exact_overlap_rate near_duplicate_rate_eps nn_distance_mean k_min k_pct_lt5 k_map rare_qi_reproduction_rate identifiability_score delta_presence
metasyn metasyn 319 319 0 0.9906 0.0128 1 1 1 0 - 3
clg_mi2 pybnesian 319 319 0 0.9969 0.0139 1 1 1 0 - 3.6154
semi_mi5 pybnesian 319 319 0 0.9969 0.0129 1 1 1 0 - 3
ctgan_fast synthcity 319 319 0 0.9749 0.1023 1 1 1 0 - 137
tvae_quick synthcity 319 319 0 0.9906 0.0241 1 1 2 0 - 14.3333
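
The report does not define the distance-based privacy columns. One plausible reading, sketched below, is that exact_overlap_rate counts synthetic rows that are identical to a real row, nn_distance_mean averages the distance from each synthetic row to its nearest real row, and near_duplicate_rate_eps is the share of synthetic rows within a threshold eps of some real row. The Euclidean metric, the absence of feature scaling, and the value of eps are all assumptions, and the function names are hypothetical.

```python
import math

def exact_overlap_rate(real_rows, synth_rows):
    """Share of synthetic rows that also appear verbatim in the real data."""
    real_set = {tuple(r) for r in real_rows}
    return sum(tuple(s) in real_set for s in synth_rows) / len(synth_rows)

def nn_distances(real_rows, synth_rows):
    """Euclidean distance from each synthetic row to its nearest real row."""
    return [min(math.dist(s, r) for r in real_rows) for s in synth_rows]

def nn_distance_mean(real_rows, synth_rows):
    """Mean nearest-neighbour distance over the synthetic rows."""
    d = nn_distances(real_rows, synth_rows)
    return sum(d) / len(d)

def near_duplicate_rate(real_rows, synth_rows, eps):
    """Share of synthetic rows lying within eps of some real row."""
    d = nn_distances(real_rows, synth_rows)
    return sum(x <= eps for x in d) / len(d)

if __name__ == "__main__":
    # Toy rows only; these are not taken from the dataset.
    real = [(0.0, 0.0), (1.0, 1.0)]
    synth = [(0.0, 0.0), (1.0, 2.0)]
    print(exact_overlap_rate(real, synth))
    print(nn_distance_mean(real, synth))
    print(near_duplicate_rate(real, synth, eps=0.5))
```

Under this reading, an exact overlap of 0 and a near-duplicate rate close to 1 are compatible: no synthetic row copies a real one, but almost every synthetic row has a real neighbour inside the threshold.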

Models

Real data

Model: metasyn (metasyn)

Per-variable fidelity
variable type KS W1 JSD
Gallstone Status discrete - - 0.0053
Age continuous 0.069 0.9299 -
Gender discrete - - 0.0186
Comorbidity discrete - - 0.1044
Coronary Artery Disease (CAD) discrete - - 0
Hypothyroidism discrete - - 0
Hyperlipidemia discrete - - 0.0161
Diabetes Mellitus (DM) discrete - - 0.0039
Height continuous 0.1223 1.3322 -
Weight continuous 0.0721 1.5836 -
Downstream metrics
metric value
sign_match_rate 1
formula Gallstone_Status ~ Q('Age') + Q('Gender') + Q('Glucose') + Q('Age'):Q('Gender') + Q('Gender'):Q('Glucose')
skipped_reason
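
The report does not define sign_match_rate. A plausible reading is that the formula above is fitted once on the real data and once on the synthetic data, and the metric is the fraction of terms whose coefficients have the same sign in both fits. The sketch below takes that reading as an assumption and only covers the comparison step, with the fitted coefficients passed in as plain dictionaries. The function names and the coefficient values in the example are hypothetical.

```python
def _sign(value):
    """Return -1, 0 or 1 according to the sign of value."""
    return (value > 0) - (value < 0)

def sign_match_rate(real_coefs, synth_coefs):
    """Fraction of shared model terms whose coefficient signs agree."""
    shared = [term for term in real_coefs if term in synth_coefs]
    if not shared:
        return float("nan")
    matches = sum(
        _sign(real_coefs[term]) == _sign(synth_coefs[term]) for term in shared
    )
    return matches / len(shared)

if __name__ == "__main__":
    # Made-up coefficients for illustration; not results from the report.
    real = {"Age": 0.03, "Gender": -0.50, "Glucose": 0.01}
    synth = {"Age": 0.02, "Gender": -0.10, "Glucose": -0.004}
    print(sign_match_rate(real, synth))
```

A value of 1, as reported for metasyn, would then mean that every term keeps its direction of effect in the model fitted on synthetic data. It says nothing about whether the magnitudes agree.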
Privacy metrics
metric value
n_real 319
n_synth 319
exact_overlap_rate 0
near_duplicate_rate_eps 0.9906
nn_distance_mean 0.0128
k_min 1
k_pct_lt5 1
k_map 1
rare_qi_reproduction_rate 0
delta_presence 3
variable distribution
Gallstone Status core.multinoulli
Age core.normal
Gender core.multinoulli
Comorbidity core.multinoulli
Coronary Artery Disease (CAD) core.multinoulli
Hypothyroidism core.multinoulli
Hyperlipidemia core.multinoulli
Diabetes Mellitus (DM) core.multinoulli
Height core.truncated_normal
Weight core.lognormal
Body Mass Index (BMI) core.lognormal
Total Body Water (TBW) core.normal
Extracellular Water (ECW) core.normal
Intracellular Water (ICW) core.lognormal
Extracellular Fluid/Total Body Water (ECF/TBW) core.normal
Total Body Fat Ratio (TBFR) (%) core.normal
Lean Mass (LM) (%) core.normal
Body Protein Content (Protein) (%) core.normal
Visceral Fat Rating (VFR) core.truncated_normal
Bone Mass (BM) core.lognormal
Muscle Mass (MM) core.normal
Obesity (%) core.lognormal
Total Fat Content (TFC) core.lognormal
Visceral Fat Area (VFA) core.lognormal
Visceral Muscle Area (VMA) (Kg) core.normal
Hepatic Fat Accumulation (HFA) core.multinoulli
Glucose core.truncated_normal
Total Cholesterol (TC) core.normal
Low Density Lipoprotein (LDL) core.normal
High Density Lipoprotein (HDL) core.lognormal
Triglyceride core.lognormal
Aspartat Aminotransferaz (AST) core.lognormal
Alanin Aminotransferaz (ALT) core.lognormal
Alkaline Phosphatase (ALP) core.normal
Creatinine core.lognormal
Glomerular Filtration Rate (GFR) core.truncated_normal
C-Reactive Protein (CRP) core.truncated_normal
Hemoglobin (HGB) core.normal
Vitamin D core.truncated_normal

Model: clg_mi2 (pybnesian)

Per-variable fidelity
variable type KS W1 JSD
Gallstone Status discrete - - 0.016
Age continuous 0.0752 1.0865 -
Gender discrete - - 0.0027
Comorbidity discrete - - 0.049
Coronary Artery Disease (CAD) discrete - - 0.0224
Hypothyroidism discrete - - 0.0264
Hyperlipidemia discrete - - 0.0083
Diabetes Mellitus (DM) discrete - - 0.0115
Height continuous 0.0784 1.4113 -
Weight continuous 0.069 1.729 -
Privacy metrics
metric value
n_real 319
n_synth 319
exact_overlap_rate 0
near_duplicate_rate_eps 0.9969
nn_distance_mean 0.0139
k_min 1
k_pct_lt5 1
k_map 1
rare_qi_reproduction_rate 0
delta_presence 3.6154

Model: semi_mi5 (pybnesian)

Per-variable fidelity
variable type KS W1 JSD
Gallstone Status discrete - - 0.0053
Age continuous 0.0721 0.9809 -
Gender discrete - - 0.0133
Comorbidity discrete - - 0.0335
Coronary Artery Disease (CAD) discrete - - 0.0224
Hypothyroidism discrete - - 0.0153
Hyperlipidemia discrete - - 0.0088
Diabetes Mellitus (DM) discrete - - 0.0227
Height continuous 0.0721 1.2361 -
Weight continuous 0.069 1.232 -
Privacy metrics
metric value
n_real 319
n_synth 319
exact_overlap_rate 0
near_duplicate_rate_eps 0.9969
nn_distance_mean 0.0129
k_min 1
k_pct_lt5 1
k_map 1
rare_qi_reproduction_rate 0
delta_presence 3

Model: ctgan_fast (synthcity)

Per-variable fidelity
variable type KS W1 JSD
Gallstone Status discrete - - 0.0777
Age continuous 0.9875 28.069 -
Gender discrete - - 0.2665
Comorbidity discrete - - 0.4142
Coronary Artery Disease (CAD) discrete - - 0.0069
Hypothyroidism discrete - - 0.0293
Hyperlipidemia discrete - - 0.0521
Diabetes Mellitus (DM) discrete - - 0.019
Height continuous 0.9373 21.4306 -
Weight continuous 0.9875 53.784 -
Privacy metrics
metric value
n_real 319
n_synth 319
exact_overlap_rate 0
near_duplicate_rate_eps 0.9749
nn_distance_mean 0.1023
k_min 1
k_pct_lt5 1
k_map 1
rare_qi_reproduction_rate 0
delta_presence 137

Model: tvae_quick (synthcity)

Per-variable fidelity
variable type KS W1 JSD
Gallstone Status discrete - - 0.0694
Age continuous 0.2696 5.7794 -
Gender discrete - - 0.0107
Comorbidity discrete - - 0.1266
Coronary Artery Disease (CAD) discrete - - 0.0071
Hypothyroidism discrete - - 0.1194
Hyperlipidemia discrete - - 0.0284
Diabetes Mellitus (DM) discrete - - 0.1383
Height continuous 0.1693 2.4626 -
Weight continuous 0.2696 8.1252 -
Privacy metrics
metric value
n_real 319
n_synth 319
exact_overlap_rate 0
near_duplicate_rate_eps 0.9906
nn_distance_mean 0.0241
k_min 1
k_pct_lt5 1
k_map 2
rare_qi_reproduction_rate 0
delta_presence 14.3333