Overview

Dataset statistics

Number of variables: 5
Number of observations: 10
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 528.0 B
Average record size in memory: 52.8 B

Variable types

Numeric (NUM): 3
Categorical (CAT): 2
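
The overview counts can be recomputed from the ten rows listed in the Sample section. The following is a minimal sketch in plain Python, not the profiler's own code. The `ROWS` list is transcribed from that sample; the helper name `summarize` and the rule that a column is numeric when every value is an int or float are assumptions made for illustration.

```python
# Rows transcribed from the report's Sample section, in column order:
# (姓名, 语文, 数学, 英语, 考试类型)
ROWS = [
    ("吕傲文", 96, 74, 100, "期中"),
    ("张香秀", 96, 91, 83, "期中"),
    ("麻寒", 76, 61, 75, "期中"),
    ("廉凡", 68, 86, 100, "期中"),
    ("冯乐萱", 80, 60, 72, "期中"),
    ("吕傲文", 97, 81, 94, "期末"),
    ("张香秀", 67, 89, 72, "期末"),
    ("麻寒", 95, 63, 99, "期末"),
    ("廉凡", 73, 98, 66, "期末"),
    ("冯乐萱", 59, 88, 74, "期末"),
]


def summarize(rows):
    """Return overview counts for a list of equal-length row tuples.

    Hypothetical helper: a column counts as numeric when all of its
    values are ints or floats, otherwise as categorical.
    """
    n_vars = len(rows[0])
    columns = list(zip(*rows))
    numeric = sum(
        all(isinstance(v, (int, float)) for v in col) for col in columns
    )
    missing = sum(v is None for row in rows for v in row)
    # Rows that repeat an earlier row exactly.
    duplicates = len(rows) - len(set(rows))
    return {
        "variables": n_vars,
        "observations": len(rows),
        "missing_cells": missing,
        "duplicate_rows": duplicates,
        "numeric": numeric,
        "categorical": n_vars - numeric,
    }


overview = summarize(ROWS)
print(overview)
```

The memory figures are not reproduced here because they depend on how the profiler stores each column.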

Warnings

姓名 (name) is uniformly distributed [Uniform]
考试类型 (exam type) is uniformly distributed [Uniform]
数学 (math) has unique values [Unique]
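
As a rough illustration of what these warnings describe, the sketch below checks the three flagged columns using values transcribed from the Sample section. The helper names are hypothetical, and `has_equal_counts` is only a simplified stand-in: it is an assumption that equal category counts are what triggers the Uniform warning, and the profiler may apply a statistical test instead.

```python
from collections import Counter

# Column values transcribed from the report's Sample section.
names = ["吕傲文", "张香秀", "麻寒", "廉凡", "冯乐萱"] * 2
exam_types = ["期中"] * 5 + ["期末"] * 5
math_scores = [74, 91, 61, 86, 60, 81, 89, 63, 98, 88]


def is_unique(values):
    """True when no value occurs more than once."""
    return len(set(values)) == len(values)


def has_equal_counts(values):
    """True when every distinct value occurs equally often.

    Assumption: simplified stand-in for the profiler's uniformity check.
    """
    return len(set(Counter(values).values())) == 1


print("数学 unique:", is_unique(math_scores))
print("姓名 equal counts:", has_equal_counts(names))
print("考试类型 equal counts:", has_equal_counts(exam_types))
```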

Reproduction

Analysis started: 2020-10-16 05:51:13.870829
Analysis finished: 2020-10-16 05:51:15.345534
Duration: 1.47 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

姓名 (name)
Categorical

UNIFORM

Distinct: 5
Distinct (%): 50.0%
Missing: 0
Missing (%): 0.0%
Memory size: 80.0 B
Value    Count  Frequency (%)
张香秀   2      20.0%
麻寒     2      20.0%
吕傲文   2      20.0%
冯乐萱   2      20.0%
廉凡     2      20.0%
[Figure: frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%
[Figure: histogram of lengths of the category]

Length

Max length: 3
Median length: 3
Mean length: 2.6
Min length: 2
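
The value counts and length statistics for this column can be recomputed with the standard library. This is a minimal sketch rather than the profiler's implementation; the `names` list is transcribed from the Sample section, and length is taken to be the number of characters in each string.

```python
from collections import Counter
from statistics import mean, median

# 姓名 column, transcribed from the report's Sample section.
names = ["吕傲文", "张香秀", "麻寒", "廉凡", "冯乐萱"] * 2

n = len(names)
counts = Counter(names)

# (value, count, frequency in percent) for each distinct name.
value_table = [(v, c, 100.0 * c / n) for v, c in counts.most_common()]

# Number of characters in each name.
lengths = [len(v) for v in names]
length_stats = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": mean(lengths),
    "min": min(lengths),
}

for value, count, freq in value_table:
    print(value, count, f"{freq:.1f}%")
print(length_stats)
```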

语文 (Chinese)
Real number (ℝ≥0)

Distinct: 9
Distinct (%): 90.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 80.7
Minimum: 59
Maximum: 97
Zeros: 0
Zeros (%): 0.0%
Memory size: 80.0 B

Quantile statistics

Minimum: 59
5th percentile: 62.6
Q1: 69.25
Median: 78
Q3: 95.75
95th percentile: 96.55
Maximum: 97
Range: 38
Interquartile range (IQR): 26.5
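
The quantile figures for 语文 are consistent with linear interpolation between the sorted scores. The sketch below illustrates that method under this assumption; the `percentile` helper is hypothetical and the scores are transcribed from the Sample section.

```python
def percentile(values, q):
    """Return the q-th percentile (0 <= q <= 100) by linear interpolation.

    Hypothetical helper: the position (n - 1) * q / 100 is located in the
    sorted data and interpolated between its two neighbours.
    """
    data = sorted(values)
    pos = (len(data) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(data) - 1)
    frac = pos - lower
    return data[lower] + frac * (data[upper] - data[lower])


# 语文 column, transcribed from the report's Sample section.
chinese_scores = [96, 96, 76, 68, 80, 97, 67, 95, 73, 59]

p5 = percentile(chinese_scores, 5)
q1 = percentile(chinese_scores, 25)
med = percentile(chinese_scores, 50)
q3 = percentile(chinese_scores, 75)
p95 = percentile(chinese_scores, 95)
value_range = max(chinese_scores) - min(chinese_scores)
iqr = q3 - q1

print(p5, q1, med, q3, p95, value_range, iqr)
```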

Descriptive statistics

Standard deviation: 14.2987956
Coefficient of variation (CV): 0.1771845799
Kurtosis: -1.681802207
Mean: 80.7
Median Absolute Deviation (MAD): 14
Skewness: -0.04658850064
Sum: 807
Variance: 204.4555556
Monotonicity: Not monotonic
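
Several of the descriptive statistics for 语文 can be checked with Python's `statistics` module. The sketch assumes the sample (n - 1) definitions of variance and standard deviation, which match the reported values, and an unscaled median absolute deviation. Skewness and kurtosis are left out because their bias-correction conventions vary between libraries.

```python
from statistics import mean, median, stdev, variance

# 语文 column, transcribed from the report's Sample section.
chinese_scores = [96, 96, 76, 68, 80, 97, 67, 95, 73, 59]

total = sum(chinese_scores)
avg = mean(chinese_scores)
var = variance(chinese_scores)   # sample variance, divides by n - 1
std = stdev(chinese_scores)      # square root of the sample variance
cv = std / avg                   # coefficient of variation

# Median absolute deviation around the median, without a scale factor.
med = median(chinese_scores)
mad = median([abs(x - med) for x in chinese_scores])

# A series is monotonic when it never decreases or never increases.
pairs = list(zip(chinese_scores, chinese_scores[1:]))
is_monotonic = all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)

print(total, avg, var, std, cv, mad, is_monotonic)
```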
[Figure: histogram with fixed-size bins (bins=9)]
Common values

Value  Count  Frequency (%)
96     2      20.0%
95     1      10.0%
76     1      10.0%
80     1      10.0%
59     1      10.0%
73     1      10.0%
68     1      10.0%
67     1      10.0%
97     1      10.0%

Minimum 5 values

Value  Count  Frequency (%)
59     1      10.0%
67     1      10.0%
68     1      10.0%
73     1      10.0%
76     1      10.0%

Maximum 5 values

Value  Count  Frequency (%)
97     1      10.0%
96     2      20.0%
95     1      10.0%
80     1      10.0%
76     1      10.0%

数学 (math)
Real number (ℝ≥0)

UNIQUE

Distinct: 10
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 79.1
Minimum: 60
Maximum: 98
Zeros: 0
Zeros (%): 0.0%
Memory size: 80.0 B

Quantile statistics

Minimum: 60
5th percentile: 60.45
Q1: 65.75
Median: 83.5
Q3: 88.75
95th percentile: 94.85
Maximum: 98
Range: 38
Interquartile range (IQR): 23

Descriptive statistics

Standard deviation: 13.76347824
Coefficient of variation (CV): 0.1740009892
Kurtosis: -1.455830827
Mean: 79.1
Median Absolute Deviation (MAD): 8.5
Skewness: -0.3599178593
Sum: 791
Variance: 189.4333333
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=10)]
Common values

Value  Count  Frequency (%)
63     1      10.0%
61     1      10.0%
60     1      10.0%
91     1      10.0%
74     1      10.0%
89     1      10.0%
88     1      10.0%
86     1      10.0%
98     1      10.0%
81     1      10.0%

Minimum 5 values

Value  Count  Frequency (%)
60     1      10.0%
61     1      10.0%
63     1      10.0%
74     1      10.0%
81     1      10.0%

Maximum 5 values

Value  Count  Frequency (%)
98     1      10.0%
91     1      10.0%
89     1      10.0%
88     1      10.0%
86     1      10.0%

英语 (English)
Real number (ℝ≥0)

Distinct: 8
Distinct (%): 80.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 83.5
Minimum: 66
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Memory size: 80.0 B

Quantile statistics

Minimum: 66
5th percentile: 68.7
Q1: 72.5
Median: 79
Q3: 97.75
95th percentile: 100
Maximum: 100
Range: 34
Interquartile range (IQR): 25.25

Descriptive statistics

Standard deviation: 13.45155918
Coefficient of variation (CV): 0.1610965172
Kurtosis: -1.937826699
Mean: 83.5
Median Absolute Deviation (MAD): 10
Skewness: 0.2278499479
Sum: 835
Variance: 180.9444444
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=8)]
Common values

Value  Count  Frequency (%)
72     2      20.0%
100    2      20.0%
94     1      10.0%
99     1      10.0%
75     1      10.0%
74     1      10.0%
83     1      10.0%
66     1      10.0%

Minimum 5 values

Value  Count  Frequency (%)
66     1      10.0%
72     2      20.0%
74     1      10.0%
75     1      10.0%
83     1      10.0%

Maximum 5 values

Value  Count  Frequency (%)
100    2      20.0%
99     1      10.0%
94     1      10.0%
83     1      10.0%
75     1      10.0%

考试类型 (exam type)
Categorical

UNIFORM

Distinct: 2
Distinct (%): 20.0%
Missing: 0
Missing (%): 0.0%
Memory size: 80.0 B
Value            Count  Frequency (%)
期中 (midterm)   5      50.0%
期末 (final)     5      50.0%
[Figure: frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%
[Figure: histogram of lengths of the category]

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Interactions

[Figures omitted: 9 interaction plots]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that, for a linear relationship, the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
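
The calculation described above can be sketched in a few lines of Python. This is an illustrative implementation, not the profiler's code; the function name `pearson_r` is hypothetical and the two score lists are transcribed from the Sample section.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their
    standard deviations (the 1 / (n - 1) factors cancel out)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)


# 语文 and 数学 columns, transcribed from the report's Sample section.
chinese_scores = [96, 96, 76, 68, 80, 97, 67, 95, 73, 59]
math_scores = [74, 91, 61, 86, 60, 81, 89, 63, 98, 88]

r = pearson_r(chinese_scores, math_scores)

# Shifting and rescaling one variable leaves r unchanged.
r_rescaled = pearson_r([10 * x + 5 for x in chinese_scores], math_scores)

print(r, r_rescaled)
```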

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
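
A minimal sketch of this rank-based calculation follows. It assumes that tied values receive the average of the ranks they span, which is a common convention but not confirmed by the report; the helper names are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)


def spearman_rho(xs, ys):
    """Pearson's r applied to the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# A nonlinear but strictly increasing relationship gives rho = 1.
xs = [1, 2, 3, 4, 5]
rho_cubic = spearman_rho(xs, [x ** 3 for x in xs])
print(rho_cubic)
```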

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
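
The pair-counting definition above corresponds to the variant often called tau-a, sketched below. The function name is hypothetical, and the profiler's own value may differ when the data contain ties: it is an assumption that the underlying library uses a tie-adjusted variant such as tau-b.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs.

    A pair is concordant when both variables move in the same direction
    and discordant when they move in opposite directions; tied pairs
    count toward the total only.
    """
    pairs = list(combinations(range(len(xs)), 2))
    concordant = 0
    discordant = 0
    for i, j in pairs:
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


tau_same = kendall_tau_a([1, 2, 3], [1, 2, 3])
tau_reversed = kendall_tau_a([1, 2, 3], [3, 2, 1])
tau_mixed = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau_same, tau_reversed, tau_mixed)
```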

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
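
One way to sketch the bias-corrected estimator in plain Python is shown below. The function name is hypothetical, the chi-squared statistic is computed without a continuity correction (an assumption), and the code is meant as an illustration of the correction rather than a copy of the profiler's implementation.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equal-length categorical lists."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for row, row_total in row_totals.items():
        for col, col_total in col_totals.items():
            expected = row_total * col_total / n
            chi2 += (observed[(row, col)] - expected) ** 2 / expected

    n_rows, n_cols = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Corrections to phi squared and to the table dimensions.
    phi2_corr = max(0.0, phi2 - (n_cols - 1) * (n_rows - 1) / (n - 1))
    rows_corr = n_rows - (n_rows - 1) ** 2 / (n - 1)
    cols_corr = n_cols - (n_cols - 1) ** 2 / (n - 1)
    denom = min(cols_corr - 1, rows_corr - 1)
    if denom <= 0:
        return 0.0
    return sqrt(phi2_corr / denom)


# 姓名 and 考试类型 columns, transcribed from the report's Sample section.
names = ["吕傲文", "张香秀", "麻寒", "廉凡", "冯乐萱"] * 2
exam_types = ["期中"] * 5 + ["期末"] * 5

v_sample = cramers_v_corrected(names, exam_types)
print(v_sample)
```

In this sample every name appears once per exam type, so the observed counts equal the expected counts and the sketch reports no association.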

Missing values

[Figures omitted: 2 missing-value plots]

Sample

First rows

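   姓名     语文  数学  英语  考试类型
0  吕傲文   96    74    100   期中
1  张香秀   96    91    83    期中
2  麻寒     76    61    75    期中
3  廉凡     68    86    100   期中
4  冯乐萱   80    60    72    期中
5  吕傲文   97    81    94    期末
6  张香秀   67    89    72    期末
7  麻寒     95    63    99    期末
8  廉凡     73    98    66    期末
9  冯乐萱   59    88    74    期末

Columns: 姓名 = name, 语文 = Chinese, 数学 = math, 英语 = English, 考试类型 = exam type. Values: 期中 = midterm, 期末 = final.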

Last rows

   姓名     语文  数学  英语  考试类型
0  吕傲文   96    74    100   期中
1  张香秀   96    91    83    期中
2  麻寒     76    61    75    期中
3  廉凡     68    86    100   期中
4  冯乐萱   80    60    72    期中
5  吕傲文   97    81    94    期末
6  张香秀   67    89    72    期末
7  麻寒     95    63    99    期末
8  廉凡     73    98    66    期末
9  冯乐萱   59    88    74    期末