Draw X-Bar-S and X-Bar-R diagrams, monitor the process, and calculate the CPK process capability index

X-Bar-S charts and X-Bar-R charts are two control charts commonly used in statistical quality control to monitor the stability and consistency of a process. Both track the sample mean; they differ in how the variation within each sample is measured and plotted: the S chart uses the sample standard deviation, while the R chart uses the sample range.

X-Bar-S chart (mean and standard deviation chart):
- X-Bar represents the sample mean and S represents the sample standard deviation.
- X-Bar-S charts are used to monitor the mean and variability of the process.
- Each sample in the process is measured and the mean (X-Bar) and standard deviation (S) of that sample are calculated.
- The chart pair has two center lines: one for the mean (on the X-Bar chart) and one for the standard deviation (on the S chart).
- The X-Bar-S chart is suitable when both the process mean and the process standard deviation are of concern, for example when the size and quality of a product must stay stable. Because S uses every observation in the sample, it is generally preferred for larger samples (a common rule of thumb is more than about 10 observations per sample).

X-Bar-R chart (mean and range chart):
- X-Bar represents the sample mean and R represents the sample range.
- The X-Bar-R chart is primarily used to monitor the mean and range of a process, rather than the standard deviation.
- Each sample is measured and the mean (X-Bar) and range (R, the difference between the maximum and minimum values) of that sample are calculated.
- The chart pair has two center lines: one for the mean (on the X-Bar chart) and one for the range (on the R chart).
- The X-Bar-R chart is suitable when the spread of the process is adequately described by the range, which is typically the case for small samples (commonly up to about 10 observations per sample). The range is easy to compute, but it uses only the largest and smallest values, so it carries less information than the standard deviation as the sample grows. As with the X-Bar-S chart, the X-Bar part shows whether the process average stays near the target value.

In summary, both the X-Bar-S chart and the X-Bar-R chart monitor the stability of a process. Both track the mean; the X-Bar-S chart measures spread with the standard deviation, while the X-Bar-R chart measures it with the sample range. Which chart to use depends on the sample size and on which measure of spread you want to monitor.
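As a small illustration of the three statistics, the sketch below computes X-Bar, S, and R for a single sample. The five measurements are made-up values chosen for illustration, not data from this article.

```python
from statistics import mean, stdev

# Hypothetical sample of five measurements (illustrative values only)
sample = [5.02, 4.98, 5.01, 4.97, 5.02]

x_bar = mean(sample)           # sample mean, plotted on the X-Bar chart
s = stdev(sample)              # sample standard deviation (n - 1 divisor), plotted on the S chart
r = max(sample) - min(sample)  # sample range (max minus min), plotted on the R chart

print(f"X-Bar = {x_bar:.3f}, S = {s:.4f}, R = {r:.3f}")
```

The same sample therefore contributes one point to the X-Bar chart and one point to either the S chart or the R chart.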

————————–

The following are the respective control limit calculation formulas for the X-Bar-S chart and the X-Bar-R chart:

The control limit calculation formula of the X-Bar-S chart:

Upper control limit (UCL) and lower control limit (LCL) of the X-Bar chart (mean chart):
- UCL(X-Bar) = X-Double Bar + A3 * S-Bar
- LCL(X-Bar) = X-Double Bar - A3 * S-Bar
  where X-Double Bar is the average of all sample means, A3 is a constant that depends on the sample size, and S-Bar is the average of all sample standard deviations.

Upper control limit (UCL) and lower control limit (LCL) of the S chart (standard deviation chart):
- UCL(S) = B4 * S-Bar
- LCL(S) = B3 * S-Bar
  where B3 and B4 are constants that depend on the sample size, and S-Bar is the average of all sample standard deviations.
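
The X-Bar-S formulas can be tried on a small data set as follows. This is a minimal sketch under stated assumptions: the four groups of five measurements are hypothetical, and A3 = 1.427, B3 = 0, B4 = 2.089 are the usual table values for a sample size of 5. Real charts are normally based on many more groups.

```python
from statistics import mean, stdev

# Hypothetical data: 4 sample groups of 5 measurements each (illustrative values only)
groups = [
    [5.02, 4.98, 5.01, 4.97, 5.02],
    [5.00, 5.03, 4.99, 5.01, 4.97],
    [4.96, 5.01, 5.00, 5.02, 5.01],
    [5.03, 4.99, 4.98, 5.00, 5.00],
]

# Table constants for a sample size of n = 5
A3, B3, B4 = 1.427, 0.0, 2.089

x_bars = [mean(g) for g in groups]     # one mean per group
s_values = [stdev(g) for g in groups]  # one standard deviation per group

x_double_bar = mean(x_bars)  # center line of the X-Bar chart
s_bar = mean(s_values)       # center line of the S chart

ucl_x = x_double_bar + A3 * s_bar
lcl_x = x_double_bar - A3 * s_bar
ucl_s = B4 * s_bar
lcl_s = B3 * s_bar           # equals 0 here because B3 is 0 for n = 5

print(f"X-Bar chart: LCL={lcl_x:.4f}  CL={x_double_bar:.4f}  UCL={ucl_x:.4f}")
print(f"S chart:     LCL={lcl_s:.4f}  CL={s_bar:.4f}  UCL={ucl_s:.4f}")
```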

The control limit calculation formula of the X-Bar-R chart:

Upper control limit (UCL) and lower control limit (LCL) of the X-Bar chart (mean chart):
- UCL(X-Bar) = X-Double Bar + A2 * R-Bar
- LCL(X-Bar) = X-Double Bar - A2 * R-Bar
  where X-Double Bar is the average of all sample means, A2 is a constant that depends on the sample size, and R-Bar is the average of all sample ranges.

Upper control limit (UCL) and lower control limit (LCL) of the R chart (range chart):
- UCL(R) = D4 * R-Bar
- LCL(R) = D3 * R-Bar
  where D3 and D4 are constants that depend on the sample size, and R-Bar is the average of all sample ranges.
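
The X-Bar-R formulas can be wrapped in a small helper, sketched below. The function name `xbar_r_limits` and the two groups of numbers are hypothetical; A2 = 0.577, D3 = 0, and D4 = 2.114 are the usual table values for a sample size of 5.

```python
from statistics import mean

def xbar_r_limits(groups, A2, D3, D4):
    """Return center lines and control limits for an X-Bar-R chart.

    groups: list of equally sized lists of measurements.
    A2, D3, D4: table constants matching the group size.
    """
    x_bars = [mean(g) for g in groups]
    ranges = [max(g) - min(g) for g in groups]
    x_double_bar = mean(x_bars)  # center line of the X-Bar chart
    r_bar = mean(ranges)         # center line of the R chart
    return {
        "x_double_bar": x_double_bar,
        "r_bar": r_bar,
        "ucl_x": x_double_bar + A2 * r_bar,
        "lcl_x": x_double_bar - A2 * r_bar,
        "ucl_r": D4 * r_bar,
        "lcl_r": D3 * r_bar,
    }

# Two hypothetical groups of five measurements each
limits = xbar_r_limits([[1, 2, 3, 4, 5], [2, 3, 4, 5, 6]], A2=0.577, D3=0.0, D4=2.114)
for name, value in limits.items():
    print(f"{name}: {value:.3f}")
```

With these inputs the group means are 3 and 4 and both ranges are 4, so the X-Bar limits are 3.5 plus or minus 0.577 * 4.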

These constants (A2, A3, B3, B4, D3, and D4) depend only on the sample size (the number of observations per sample) and on the chart type; their values are listed in standard control chart constant tables. Control limits help detect anomalies or shifts in the process so that corrective action can be taken promptly to keep the process stable.
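
The S-chart constants do not have to be looked up. They follow from the bias-correction factor c4, which is not named in the text above but is the standard quantity relating the expected sample standard deviation to sigma for normally distributed data. The sketch below derives A3, B3, and B4 for a given sample size under the usual three-sigma convention; the helper names `c4` and `s_chart_constants` are hypothetical.

```python
import math

def c4(n):
    """Bias-correction factor: E[S] = c4 * sigma for normal samples of size n."""
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)

def s_chart_constants(n):
    """Return (A3, B3, B4) for sample size n using three-sigma limits."""
    c = c4(n)
    spread = 3.0 * math.sqrt(1.0 - c * c) / c
    A3 = 3.0 / (c * math.sqrt(n))
    B3 = max(0.0, 1.0 - spread)  # a negative lower limit is truncated to 0
    B4 = 1.0 + spread
    return A3, B3, B4

A3, B3, B4 = s_chart_constants(10)
print(round(A3, 3), round(B3, 3), round(B4, 3))
```

For n = 10 this reproduces the table values 0.975, 0.284, and 1.716 after rounding. The range-based constants (A2, D3, D4, d2) have no comparably simple closed form and are normally taken from a table.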

—————

The X-Bar charts of the two chart pairs (X-Bar-S and X-Bar-R) use different control limit formulas because each estimates the spread of the process from a different statistic: the X-Bar-S chart uses S-Bar, and the X-Bar-R chart uses R-Bar. In both cases the limits sit three standard errors of the sample mean away from the center line.

X-Bar chart of the X-Bar-S chart:
- The X-Bar-S chart monitors the mean and standard deviation of the process, that is, the overall average and the variability within each sample.
- The UCL and LCL of the X-Bar chart combine the grand mean (X-Double Bar) with the average standard deviation (S-Bar), because S-Bar is the measure of spread this chart pair uses.
- The control limits show whether the mean and standard deviation of the process stay within the range expected for a stable process.

X-Bar chart of the X-Bar-R chart:
- The X-Bar-R chart monitors the mean and range (the difference between the maximum and minimum values) of the process, that is, the average level and the variation of the range.
- The UCL and LCL of the X-Bar chart combine the grand mean (X-Double Bar) with the average range (R-Bar), because R-Bar is the measure of spread this chart pair uses.
- The control limits show whether the mean and range of the process stay within the range expected for a stable process.

Therefore, each control chart type has its own control limit formulas, reflecting the measure of spread it is built on. Which chart to use depends on whether you monitor variation through the standard deviation or through the range.
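
Once the limits are known, monitoring comes down to checking each plotted point against them. The sketch below shows only the most basic rule (a point outside the limits); the function name `out_of_control` and the numbers are hypothetical, and additional run rules that some practitioners apply are not covered.

```python
def out_of_control(values, lcl, ucl):
    """Return the 1-based positions of points that fall outside [lcl, ucl]."""
    return [i for i, v in enumerate(values, start=1) if v < lcl or v > ucl]

# Hypothetical sample means and control limits
x_bars = [5.00, 5.01, 4.99, 5.06, 5.00, 4.93]
flagged = out_of_control(x_bars, lcl=4.95, ucl=5.05)
print(flagged)  # [4, 6]
```

The same helper works for the S chart or the R chart by passing the corresponding values and limits.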

———-

The X-Bar-R control chart is a quality control tool used to monitor process stability and consistency. First, we need to calculate the average and range of each set of data, then draw a control chart, and calculate the CPK and pass rate.

1. Calculate the mean and range of each group: for each group of data, calculate the mean (X-Bar) and the range (R), the difference between the maximum and minimum values in the group.
2. Build the X-Bar chart: a. Calculate the average of all group means (X-Double Bar, the center line). b. Draw the X-Bar control chart and plot the mean of each group on it.
3. Build the R chart: a. Calculate the average of all ranges (R-Bar, the center line). b. Draw the R control chart and plot the range of each group on it.
4. Calculate the control limits: a. For the X-Bar chart, the limits are X-Double Bar plus or minus A2 times R-Bar. b. For the R chart, the limits are R-Bar multiplied by the constants D4 (upper limit) and D3 (lower limit).
5. Calculate CPK (process capability index): CPK = min[(USL - X-Double Bar) / (3 * sigma), (X-Double Bar - LSL) / (3 * sigma)], where USL is the upper specification limit and LSL is the lower specification limit of the product. sigma is the standard deviation of the process, estimated from the average range: sigma = R-Bar / d2, where d2 is a control chart constant that depends on the group size and converts the range into an estimate of the standard deviation.
6. Calculate the pass rate between the specification limits: use a normal distribution table or statistical software to find the proportion of output between LSL and USL, based on X-Double Bar, sigma, and the specification limits.

These steps require some data processing and graphing and are best performed using statistical software or tools. Ensure accurate calculation of averages, ranges, control limits, CPK, and pass rates.
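
The capability steps can be checked with round numbers before running them on real data. In the sketch below the function name `capability` and all inputs are hypothetical: the average range is chosen so that sigma comes out as 0.03, and d2 = 3.078 is the table value for groups of 10. The pass rate assumes normally distributed output and uses `statistics.NormalDist` from the standard library.

```python
from statistics import NormalDist

def capability(x_double_bar, r_bar, d2, usl, lsl):
    """Estimate sigma, CP, CPK and the in-specification fraction from X-Bar-R statistics."""
    sigma = r_bar / d2  # process standard deviation estimated from the average range
    cp = (usl - lsl) / (6 * sigma)
    cpk = min(usl - x_double_bar, x_double_bar - lsl) / (3 * sigma)
    std_normal = NormalDist()  # standard normal distribution
    pass_rate = (std_normal.cdf((usl - x_double_bar) / sigma)
                 - std_normal.cdf((lsl - x_double_bar) / sigma))
    return {"sigma": sigma, "cp": cp, "cpk": cpk,
            "pass_rate": pass_rate, "ppm": (1 - pass_rate) * 1_000_000}

# Hypothetical, centered process: sigma = 0.09234 / 3.078 = 0.03
result = capability(x_double_bar=5.0, r_bar=0.09234, d2=3.078, usl=5.1, lsl=4.9)
for name, value in result.items():
    print(f"{name}: {value:.5f}")
```

Because the process in this example is centered between the specification limits, CP and CPK coincide (about 1.11); an off-center mean would lower CPK but leave CP unchanged.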

———-

# X-Bar-S chart built from measured data in an Excel sheet
import matplotlib.pyplot as plt
import pandas as pd
plt.rcParams['font.sans-serif'] = ['SimHei'] # Prevent Chinese labels from being garbled
plt.rcParams['axes.unicode_minus'] = False

# Load the measurements; with axis=0 below, each column of the sheet is treated as one sample group
df = pd.read_excel('GuanZi1011.xlsx')

# Calculate the average of each group (X-Bar)
x_bar = df.mean(axis=0) # pandas Series

# Calculate the sample standard deviation (S) of each group
s = df.std(axis=0, ddof=1)

# The sheet is 10 rows x 10 columns: 10 groups of 10 measurements each
# Constants A3, B3, B4 for 10 measurements per group (from the constant table)
A3 = 0.975
B3 = 0.284
B4 = 1.716

# Calculate the control limits of the X-Bar control chart
X_Double_Bar = x_bar.mean()
S_Bar = s.mean()


UCL_X_Bar = X_Double_Bar + A3 * S_Bar
LCL_X_Bar = X_Double_Bar - A3 * S_Bar

print(f"UCL_X_Bar: {UCL_X_Bar}",f"LCL_X_Bar: {LCL_X_Bar}")

# Calculate the control limits of the S control chart
UCL_S = B4 * S_Bar
LCL_S = B3 * S_Bar

#Create an X-Bar control chart and add upper and lower control limits
plt.figure(figsize=(5.3, 6))
plt.subplot(2, 1, 1)
plt.plot(x_bar, marker='o')
plt.axhline(X_Double_Bar, color='r', linestyle='--', label='Overall Mean')
plt.axhline(UCL_X_Bar, color='g', linestyle='--', label='UCL')
plt.axhline(LCL_X_Bar, color='b', linestyle='--', label='LCL')
plt.title('X-Bar Control Chart')
# plt.xlabel('Sample Group')
plt.ylabel('X-Bar')
plt.legend()

#Create an S control chart and add upper and lower control limits
plt.subplot(2, 1, 2)
plt.plot(s, marker='o')
plt.axhline(S_Bar, color='r', linestyle='--', label='Overall Mean')
plt.axhline(UCL_S, color='g', linestyle='--', label='UCL')
plt.axhline(LCL_S, color='b', linestyle='--', label='LCL')
plt.title('S Control Chart')
# plt.xlabel('Sample Group')
plt.ylabel('S')
plt.legend()

plt.tight_layout()
plt.show()


# Control limit formulas of the X-Bar-S chart:
# UCL(X-Bar) = X-Double Bar + A3 * S-Bar
# LCL(X-Bar) = X-Double Bar - A3 * S-Bar
# UCL(S) = B4 * S-Bar
# LCL(S) = B3 * S-Bar

# Control limit formulas of the X-Bar-R chart:
# UCL(X-Bar) = X-Double Bar + A2 * R-Bar
# LCL(X-Bar) = X-Double Bar - A2 * R-Bar
# UCL(R) = D4 * R-Bar
# LCL(R) = D3 * R-Bar

import numpy as np
import matplotlib.pyplot as plt

# Generate 10 groups of sample data with 10 data in each group (100 data in total)
np.random.seed(0)
data = np.random.randn(10, 10)

# Calculate the average of each group (X-Bar)
x_bar = np.mean(data, axis=1)

# Calculate the range (R) of each group
r = np.ptp(data, axis=1)

# Constants A2, D4, D3 for 10 measurements per group (from the constant table)
A2 = 0.308
D4 = 1.777
D3 = 0.223

# Calculate the control limits of the X-Bar control chart
X_Double_Bar = np.mean(x_bar)
R_Bar = np.mean(r)
UCL_X_Bar = X_Double_Bar + A2 * R_Bar
LCL_X_Bar = X_Double_Bar - A2 * R_Bar

# Calculate control limits for R control charts
UCL_R = D4 * R_Bar
LCL_R = D3 * R_Bar

#Create an X-Bar control chart and add upper and lower control limits
plt.figure(figsize=(12, 6))
plt.subplot(2, 1, 1)
plt.plot(x_bar, marker='o')
plt.axhline(X_Double_Bar, color='r', linestyle='--', label='Overall Mean')
plt.axhline(UCL_X_Bar, color='g', linestyle='--', label='UCL')
plt.axhline(LCL_X_Bar, color='b', linestyle='--', label='LCL')
plt.title('X-Bar Control Chart')
plt.xlabel('Sample Group')
plt.ylabel('X-Bar')
plt.legend()

#Create an R control chart and add upper and lower control limits
plt.subplot(2, 1, 2)
plt.plot(r, marker='o')
plt.axhline(R_Bar, color='r', linestyle='--', label='Overall Mean')
plt.axhline(UCL_R, color='g', linestyle='--', label='UCL')
plt.axhline(LCL_R, color='b', linestyle='--', label='LCL')
plt.title('R Control Chart')
plt.xlabel('Sample Group')
plt.ylabel('R')
plt.legend()

plt.tight_layout()
plt.show()

Draw X-Bar-R diagram, monitor the process, and calculate CPK process capability index

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
import pandas as pd

plt.rcParams['font.sans-serif'] = ['SimHei'] # Prevent Chinese labels from being garbled
plt.rcParams['axes.unicode_minus'] = False

# Replace the data below with your actual data
data = pd.read_excel('GuanZi1011.xlsx')

# Calculate the average of each group of samples
x_bar = data.mean(axis=0) # Series

# Calculate the range R of each group of samples
r = data.max(axis=0) - data.min(axis=0)


x_bar_avg = np.mean(x_bar)
r_avg = np.mean(r)

print(f'x_bar_avg: {x_bar_avg}')
print(f'r_avg: {r_avg}')

# Control chart constants for 10 measurements per group (sheet is 10 rows x 10 columns), looked up from the constant table
d2 = 3.078 # Converts the average range into an estimate of sigma (sigma = R-Bar / d2)
A2 = 0.308 # X-Bar chart limits: X-Double Bar +/- A2 * R-Bar
D4 = 1.777 # R chart upper limit: D4 * R-Bar
D3 = 0.223 # R chart lower limit: D3 * R-Bar


USL = 5.1 # Upper specification limit
LSL = 4.9 # Lower specification limit

# sigma represents the standard deviation of the process, which is estimated by the average range R in the control chart.
# In a control chart, R usually represents the range of a set of sample data, and d2 is the control chart constant, which is used to correct the range to estimate the standard deviation.
# sigma is calculated as: sigma = (average range R) / d2
sigma = r_avg/d2

# Z-score (i.e. standard score, Z = (x - μ) / σ) describes a data point in terms of its relationship to the mean and standard deviation of a set of points
Zscore = min((USL - x_bar_avg) / sigma, (x_bar_avg - LSL) / sigma)

# Calculate CPK (Process Capability Index):
# CPK = min[(USL - X-two-point centerline) / (3 * standard deviation), (X-two-point centerline - LSL) / (3 * standard deviation)]
# Among them, USL is the upper limit of product specifications, and LSL is the lower limit of product specifications.
cpk = min((USL - x_bar_avg) / (3 * sigma), (x_bar_avg - LSL) / (3 * sigma))
# Calculate the pass rate: the share of a normal distribution (mean x_bar_avg, standard deviation sigma) between LSL and USL
z_upper = (USL - x_bar_avg) / sigma
z_lower = (LSL - x_bar_avg) / sigma
cpk_area = norm.cdf(z_upper) - norm.cdf(z_lower) # Fraction of output within specification
cp = (USL - LSL) / (6 * sigma) # Width of the specification range relative to the process spread
ppm = (1 - cpk_area) * 1000000 # Expected number of defective parts per million

print(f'sigma: {sigma}')
print(f'Zscore: {Zscore}')
print(f'CPK: {cpk}')
print(f"cpk_area:{cpk_area}")
print("CP:", cp)
print(f'PPM: {ppm}')

#The control limit calculation formula of the X-Bar-R chart:
#UCL(X-Bar) = X-Double Bar + A2 * R-Bar
# LCL(X-Bar) = X-Double Bar - A2 * R-Bar
#UCL(R) = D4 * R-Bar
# LCL(R) = D3 * R-Bar

x_bar_UCL = x_bar_avg + A2 * r_avg
x_bar_LCL = x_bar_avg - A2 * r_avg

r_UCL = D4 * r_avg
r_LCL = D3 * r_avg

# Draw X-Bar control chart
plt.figure(figsize=(7, 6))
plt.subplot(2, 1, 1)
plt.plot(x_bar, marker='o')
plt.axhline(x_bar_avg, color='r', linestyle='--', label='X-Bar Center Line')
plt.axhline(x_bar_UCL, color='g', linestyle='--', label='X-Bar UCL')
plt.axhline(x_bar_LCL, color='g', linestyle='--', label='X-Bar LCL')
plt.title('X-Bar Control Chart')
plt.legend()
plt.grid()

# Draw R control chart
plt.subplot(2, 1, 2)
plt.plot(r, marker='o')
plt.axhline(r_avg, color='r', linestyle='--', label='R Center Line')
plt.axhline(r_UCL, color='g', linestyle='--', label='R UCL')
plt.axhline(r_LCL, color='g', linestyle='--', label='R LCL')
plt.title('R Control Chart')
plt.legend()
plt.grid()
plt.tight_layout() # Keep the two subplots from overlapping
plt.show()
