Directory
- Wright criterion
  - Introduction
- Grubbs criterion
  - Introduction
- Arduino code implementation
- References
Wright Criterion
Introduction
Wright’s criterion (the $3\sigma$ rule) is a method for identifying outliers in normally distributed measurement data. It works as follows:
Suppose that, in a series of equal-precision measurements, the $i$-th measured value $x_i$ has the residual

$$v_i = x_i - \bar{x}$$

where $\bar{x}$ is the mean of all measurements. If the absolute value of the residual satisfies

$$|v_i| > 3\sigma$$

then the error is a gross error, and the corresponding measurement $x_i$ is an outlier and should be discarded.
where the standard deviation $\sigma$ is estimated with the Bessel formula:

$$\sigma = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} v_i^2}$$
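As a small worked example of the Bessel formula (the four readings are made up for illustration), take the measurements 2, 4, 4, 6:

```latex
\begin{aligned}
\bar{x} &= \frac{2 + 4 + 4 + 6}{4} = 4 \\
v_i &= \{-2,\ 0,\ 0,\ 2\} \\
\sigma &= \sqrt{\frac{(-2)^2 + 0^2 + 0^2 + 2^2}{4 - 1}} = \sqrt{\frac{8}{3}} \approx 1.633
\end{aligned}
```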
The criterion is suitable when the number of measurements $n$ is large ($n > 10$).
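The check can be sketched in a few lines. This is a minimal sketch, not the original program: `wrightWorstOutlier` is a hypothetical helper name introduced here. It returns the index of the sample with the largest residual if that residual exceeds three times the Bessel standard deviation, and -1 otherwise.

```cpp
#include <assert.h>
#include <math.h>

// Hypothetical helper (not part of the original program).
// Returns the index of the sample with the largest |v_i| if that residual
// exceeds 3 * sigma, or -1 if no sample violates the criterion.
int wrightWorstOutlier(const double x[], int n)
{
  assert(n >= 2); // the Bessel formula divides by n - 1

  double mean = 0;
  for (int i = 0; i < n; i++)
    mean += x[i];
  mean /= n;

  double sumSq = 0;  // sum of squared residuals
  double maxAbs = 0; // largest |v_i| seen so far
  int worst = -1;    // index of that residual
  for (int i = 0; i < n; i++)
  {
    double v = x[i] - mean;
    sumSq += v * v;
    if (fabs(v) > maxAbs)
    {
      maxAbs = fabs(v);
      worst = i;
    }
  }

  double sigma = sqrt(sumSq / (n - 1)); // Bessel formula
  return (maxAbs > 3 * sigma) ? worst : -1;
}
```

The sample-size condition shows up directly in this sketch. For one deviating reading among otherwise identical ones, the ratio of its residual to the standard deviation works out to $(n-1)/\sqrt{n}$, which is about 2.85 for $n = 10$ and about 3.02 for $n = 11$. With ten readings or fewer, such a reading is therefore never flagged, however far off it is.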
Grubbs Criterion
Introduction
The Grubbs criterion is a method for identifying outliers in normal samples or near-normal samples when the population standard deviation is unknown.
If the residual of a measurement satisfies

$$|v_i| = |x_i - \bar{x}| > T_0(n,\alpha)\,\sigma$$

the value is judged to contain a gross error and should be eliminated. Here $\sigma$ is the sample standard deviation given by the Bessel formula.
The value of $T_0$ depends on both the number of repeated measurements $n$ and the confidence probability $\alpha$, which makes the Grubbs criterion the better judgment criterion. The value of $T_0$ is obtained by looking it up in a table.
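A worked example may help here. The ten readings are made up for illustration (nine readings of 10 and one reading of 20). The critical value 2.18 at 95% confidence is the $n = 10$ entry of the `g95` array in the code further down:

```latex
\begin{aligned}
\bar{x} &= \frac{9 \cdot 10 + 20}{10} = 11 \\
\sigma &= \sqrt{\frac{9 \cdot (-1)^2 + 9^2}{10 - 1}} = \sqrt{10} \approx 3.162 \\
T_0\,\sigma &\approx 2.18 \cdot 3.162 \approx 6.89 \\
|v_{10}| &= |20 - 11| = 9 > 6.89
\end{aligned}
```

The reading of 20 is therefore flagged as a gross error. With the fixed coefficient 3 the threshold would be about 9.49, and the same reading would pass.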
The Grubbs criterion is theoretically more rigorous and has a clear probabilistic meaning, so it can be used in situations with strict requirements ($20 < n < 100$).
$T_0(n,\alpha)$ table
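The table lookup can be sketched as a small helper. This is only a sketch under stated assumptions: `grubbsT`, `kT95` and `kT99` are hypothetical names introduced here, the values are copied from the `g95` and `g99` arrays of the program in the next section, and the mapping of entries to $n = 3 \ldots 25, 30, 35, 40, 50, 100$ is inferred from how that program indexes them. For an $n$ between two tabulated sizes, the entry of the nearest smaller tabulated $n$ is used, as that program does.

```cpp
#include <assert.h>

// Critical values T0(n, alpha) for n = 3..25, 30, 35, 40, 50, 100
// (copied from the g95 / g99 arrays of the program in this article).
static const double kT95[] = {1.15, 1.46, 1.67, 1.82, 1.94, 2.03, 2.11, 2.18, 2.23, 2.29,
                              2.33, 2.37, 2.41, 2.44, 2.47, 2.50, 2.53, 2.56, 2.58, 2.60,
                              2.62, 2.64, 2.66, 2.74, 2.81, 2.87, 2.96, 3.17};
static const double kT99[] = {1.16, 1.49, 1.75, 1.94, 2.10, 2.22, 2.32, 2.41, 2.48, 2.55,
                              2.61, 2.66, 2.71, 2.75, 2.79, 2.82, 2.85, 2.88, 2.91, 2.94,
                              2.96, 2.99, 3.01, 3.10, 3.18, 3.24, 3.34, 3.58};

// Hypothetical helper: returns T0 for n samples at 95% (use99 == false)
// or 99% (use99 == true) confidence. Requires n >= 3.
double grubbsT(int n, bool use99)
{
  assert(n >= 3); // the table starts at n = 3
  const double *t = use99 ? kT99 : kT95;
  if (n >= 100) return t[27];
  if (n >= 50)  return t[26];
  if (n >= 40)  return t[25];
  if (n >= 35)  return t[24];
  if (n >= 30)  return t[23];
  if (n >= 25)  return t[22];
  return t[n - 3]; // n = 3..24 are tabulated one by one
}
```

Rounding down to the nearest tabulated $n$ gives a slightly smaller threshold than the exact value would, so the test becomes a little stricter for sample sizes that fall between entries.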
Arduino code implementation
```cpp
#include <math.h> // sqrt(), fabs()

// Gross-error elimination routine; returns the average of the valid data.
// data:    in:  the original measurement data
//          out: the valid data, packed at the front of the array
//               (original datanum minus badnum values)
// baddata: out: the eliminated data (no input)
// datanum: in:  the number of original measurement data
// badnum:  out: the number of eliminated data (no input)
// rule:    criterion selection: 3 = Wright, 4 = Grubbs 95%, 5 = Grubbs 99%,
//          less than 3 = custom coefficient
double Detection(double data[], double baddata[], int datanum, int &badnum, int rule)
{
  // Grubbs critical values for n = 3..25, 30, 35, 40, 50, 100
  static const double g95[] = {1.15, 1.46, 1.67, 1.82, 1.94, 2.03, 2.11, 2.18, 2.23, 2.29,
                               2.33, 2.37, 2.41, 2.44, 2.47, 2.50, 2.53, 2.56, 2.58, 2.60,
                               2.62, 2.64, 2.66, 2.74, 2.81, 2.87, 2.96, 3.17}; // Grubbs 95%
  static const double g99[] = {1.16, 1.49, 1.75, 1.94, 2.10, 2.22, 2.32, 2.41, 2.48, 2.55,
                               2.61, 2.66, 2.71, 2.75, 2.79, 2.82, 2.85, 2.88, 2.91, 2.94,
                               2.96, 2.99, 3.01, 3.10, 3.18, 3.24, 3.34, 3.58}; // Grubbs 99%
  double bsl;       // Bessel formula result (standard deviation)
  double maxdev;    // maximum deviation allowed by the Wright or Grubbs criterion
  double sum;       // accumulator
  double average;   // average
  int badindex;     // number of data eliminated in the current pass
  int validNum = 0; // number of valid data
  int proindex = 0; // number of passes
  double lg = 3;    // coefficient of the Wright or Grubbs criterion
  int i;

  badnum = 0; // initialize the number of bad data

  // Fix: with fewer than 3 samples the Grubbs tables would be indexed out of
  // bounds, so skip the elimination and return the plain average.
  if (datanum < 3)
  {
    sum = 0;
    for (i = 0; i < datanum; i++)
      sum += data[i];
    return (datanum > 0) ? sum / datanum : 0;
  }

  double data_b[datanum]; // temporary storage for retained data
  double v[datanum];      // residuals

  if (rule <= 3)     // use the Wright coefficient 3 or a custom rule value directly
    lg = rule;
  else if (rule > 5) // a rule greater than 5 is forced to Wright's criterion
    lg = 3;

  // Loop until the number of valid data is less than or equal to 5
  // or there is no more bad data
  while (1)
  {
    // Select the Grubbs coefficient for the current number of data
    if (rule == 4 || rule == 5)
    {
      const double *g = (rule == 4) ? g95 : g99; // Grubbs 95% or 99%
      if (datanum >= 100)
        lg = g[27];
      else if (datanum >= 50)
        lg = g[26];
      else if (datanum >= 40)
        lg = g[25];
      else if (datanum >= 35)
        lg = g[24];
      else if (datanum >= 30)
        lg = g[23];
      else if (datanum >= 25) // at least 25 but fewer than 30 data
        lg = g[22];
      else                    // fewer than 25 data
        lg = g[datanum - 3];
    }

    proindex++; // update the pass count

    sum = 0;
    for (i = 0; i < datanum; i++)
      sum += data[i];
    average = sum / datanum; // calculate the average

    sum = 0;
    for (i = 0; i < datanum; i++)
    {
      v[i] = data[i] - average; // calculate the residual
      sum += v[i] * v[i];       // accumulate the sum of squared residuals
    }
    bsl = sqrt(sum / (datanum - 1)); // Bessel formula standard deviation
    maxdev = lg * bsl;               // maximum allowed deviation

    // Eliminate bad values, that is, the gross-error data
    validNum = 0;
    badindex = 0;
    for (i = 0; i < datanum; i++)
      if (fabs(v[i]) >= maxdev && maxdev != 0) // |v_i| reaches the criterion threshold
      {
        baddata[badnum++] = data[i]; // store x_i in the bad data array
        badindex++;
      }
      else
        data_b[validNum++] = data[i]; // otherwise keep it in data_b for now

    for (i = 0; i < validNum; i++) // copy the retained data back into data
      data[i] = data_b[i];
    datanum = validNum; // the retained data become the data set for the next pass

    // Check the stopping condition
    if (datanum > 5) // more than 5 valid data: keep processing
    {
      if (badindex == 0) // nothing was eliminated in this pass
        break;           // all gross-error data have been handled
    }
    else
      break; // 5 or fewer valid data: stop immediately
  }

  // Fix: recompute the average so that it covers only the remaining valid data
  // (the value from the last pass still includes data eliminated in that pass).
  if (datanum > 0)
  {
    sum = 0;
    for (i = 0; i < datanum; i++)
      sum += data[i];
    average = sum / datanum;
  }

  return average; // the average of the valid data
}
```
References
[1] Arduino uses ultrasonic ranging module HC-SR04 to obtain accurate measurement values – elimination of error data. https://blog.csdn.net/m0_61543203/article/details/127185686
[2] Processing of Arduino measurement error data – Wright and Grubbs criteria to eliminate abnormal data. https://blog.csdn.net/m0_61543203/article/details/126780804
[3] Statistics: What is Wright’s criterion? https://zhidao.baidu.com/question/144962833.html
[4] Abnormal data and deviation data processing principles. https://zhuanlan.zhihu.com/p/93855259