[Mathematical Modeling] [Modeling and Implementation] Interpolation and Fitting

1. Basic knowledge

1.1 What they are

Interpolation: a data analysis method that estimates values between known data points; it can be used to fill in missing data.

Fitting: linear regression is one kind of fitting.

Fitting refers to building a model from known data points so that the model matches those points as closely as possible. In statistics and machine learning, fitting means adjusting the parameters of a model so that it best approximates the observed data. The goal is a model that describes the underlying phenomenon within the range of the given data points.

A common fitting method is to use regression analysis to determine a best-fit line or curve to describe the relationship between the independent variable and the dependent variable. The quality of the fit is usually measured by the fit error, which is the difference between the model predictions and the actual observed values.

In practical applications, fitting can be used to predict the results of unknown data points, or it can be used for exploratory analysis of data to understand the characteristics and trends of the data. The choice of fitting method depends on the type of data and the required model complexity.
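As a minimal sketch of fitting, the lines below fit a straight line with polyfit and report the fit error as the root-mean-square residual. The data values are hypothetical and chosen only for illustration.

```matlab
% Hypothetical sample data (not from the text): roughly y = 2x + 1 with small deviations
x = [0 1 2 3 4 5];
y = [1.1 2.9 5.2 7.1 8.8 11.0];

p    = polyfit(x, y, 1);              % least-squares line: p(1) is the slope, p(2) the intercept
yfit = polyval(p, x);                 % model predictions at the data points
rmse = sqrt(mean((y - yfit).^2));     % fit error: root-mean-square residual

fprintf('slope = %.3f, intercept = %.3f, RMSE = %.3f\n', p(1), p(2), rmse);
```

Raising the polynomial degree in polyfit gives a more flexible model, which is where the choice of model complexity comes in.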

1.2 Extension: overfitting

Overfitting can occur when the model of a machine learning algorithm is too complex or the training data set is too small.

Overfitting means that the algorithm performs very well on the training data set (high accuracy) but very poorly on new data: it fails to generalize when tested.

A generalization dataset is a new set of data, separate from the data the model was trained on, that is used to evaluate a machine learning algorithm or model. Its purpose is to test how well the model generalizes to unseen data, that is, how it performs in a real environment.

In machine learning tasks, the data set is often divided into a training set and a test set. The training set is used to train the model, while the test set is used to evaluate the model after training. However, if the same test set is also used repeatedly to select or tune the model, the model can end up indirectly fitted to it and the measured performance becomes too optimistic. Therefore, to evaluate a model more accurately, a separate held-out generalization dataset is used to measure performance on new data (a held-out set used during tuning is usually called a validation set).

The model is never trained on the generalization dataset. Using it gives a fuller picture of the model's adaptability and accuracy in real-world situations, and helps determine whether the model is overfitting the training data or has good predictive power on unknown data.
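The effect can be sketched with polynomial fits of two different degrees. The data, the perturbation values, and the choice of degrees 1 and 8 are all assumptions made for illustration. The higher degree always fits the training points at least as closely, while its error on held-out points is typically worse.

```matlab
% Hypothetical data: a linear trend plus fixed, made-up perturbations
xtrain = 0:0.1:1;
noise  = [0.05 -0.08 0.10 -0.04 0.07 -0.09 0.03 0.08 -0.06 0.04 -0.05];
ytrain = 2*xtrain + 1 + noise;

% Held-out points that are never used for fitting (noise-free trend here)
xtest = 0.05:0.1:0.95;
ytest = 2*xtest + 1;

rmse = @(a, b) sqrt(mean((a - b).^2));
degs = [1 8];                          % a simple model and a much more flexible one
errTrain = zeros(size(degs));
errTest  = zeros(size(degs));
for k = 1:numel(degs)
    p = polyfit(xtrain, ytrain, degs(k));
    errTrain(k) = rmse(polyval(p, xtrain), ytrain);
    errTest(k)  = rmse(polyval(p, xtest),  ytest);
    fprintf('degree %d: training RMSE = %.4f, held-out RMSE = %.4f\n', ...
            degs(k), errTrain(k), errTest(k));
end
```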

2. Interpolation problem

2.1 Lagrangian interpolation

A basis function is defined for each node: it equals 1 at that node and 0 at every other node. The interpolation polynomial is the linear combination of these basis functions, with the function values at the nodes as the combination coefficients.

In practice: first compute each basis function separately, then multiply it by its node value and sum the results.
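The procedure can be sketched in a few lines. The nodes below are hypothetical samples of y = x.^2, chosen so that the result is easy to check by hand; this is an illustrative loop, not a library routine.

```matlab
% Nodes and function values (hypothetical: samples of y = x.^2)
x  = [0 1 2 3];
y  = [0 1 4 9];
xi = 1.5;                          % query point

n  = numel(x);
yi = 0;
for k = 1:n
    % k-th basis function: equals 1 at x(k) and 0 at every other node
    others = x([1:k-1, k+1:n]);
    Lk = prod((xi - others) ./ (x(k) - others));
    yi = yi + y(k) * Lk;           % weight by the node value and accumulate
end
disp(yi)                           % 2.25, since a cubic interpolant reproduces x.^2 exactly
```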

2.2 Runge problem of high-order interpolation

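When the degree of the interpolating polynomial exceeds about 7, the interpolant oscillates strongly, especially near the ends of the interval when the nodes are equally spaced (the Runge phenomenon).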
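A small experiment makes the oscillation visible. The sketch below assumes the classic test function 1/(1+25x^2) on [-1, 1] and uses polyfit with degree equal to the number of nodes minus one, which yields the interpolating polynomial. The maximum error grows as the degree increases.

```matlab
f  = @(x) 1 ./ (1 + 25*x.^2);      % Runge's function
xx = linspace(-1, 1, 1001);        % dense grid for measuring the error

degs   = [4 10];
maxErr = zeros(size(degs));
for k = 1:numel(degs)
    n  = degs(k);
    xn = linspace(-1, 1, n + 1);   % n+1 equally spaced nodes
    p  = polyfit(xn, f(xn), n);    % degree-n polynomial through all nodes
    maxErr(k) = max(abs(polyval(p, xx) - f(xx)));
    fprintf('degree %2d: max error = %.3f\n', n, maxErr(k));
end
```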

2.3 Commonly used approach

Use piecewise low-order interpolation, such as spline function interpolation
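As a rough illustration, piecewise interpolation of the function 1/(1+25x^2) at 11 equally spaced nodes stays close to the function over the whole interval. The test function and node count are assumptions chosen for the demonstration.

```matlab
f  = @(x) 1 ./ (1 + 25*x.^2);
xn = linspace(-1, 1, 11);          % 11 equally spaced nodes
xx = linspace(-1, 1, 1001);        % dense grid for measuring the error

ySpline = interp1(xn, f(xn), xx, 'spline');   % piecewise cubic spline
yLinear = interp1(xn, f(xn), xx, 'linear');   % piecewise linear

errSpline = max(abs(ySpline - f(xx)));
errLinear = max(abs(yLinear - f(xx)));
fprintf('max error: spline = %.3f, linear = %.3f\n', errSpline, errLinear);
```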

3. One-dimensional interpolation

In yi = interp1(x, y, xi, 'method'), x and y are the known data points (interpolation nodes), xi holds the points at which to interpolate, and yi holds the interpolation results; x, y and xi, yi are usually vectors.

'method' selects the interpolation method: 'nearest' (nearest neighbor interpolation), 'linear' (linear interpolation), 'spline' (cubic spline interpolation), 'cubic' (cubic interpolation). The default is linear interpolation.
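The four methods can be compared side by side. The node values below are hypothetical; the sketch only shows how the method string changes the curve drawn through the same points.

```matlab
x  = 0:5;                          % hypothetical nodes
y  = [0 0.8 0.9 0.1 -0.8 -1.0];    % hypothetical values
xi = 0:0.25:5;                     % query points

methods = {'nearest', 'linear', 'spline', 'cubic'};
for k = 1:numel(methods)
    yi = interp1(x, y, xi, methods{k});
    subplot(2, 2, k);              % one panel per method
    plot(x, y, 'o', xi, yi, '-');
    title(methods{k});
end
```

Every method passes through the given nodes; they differ only in how they behave between them.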

Example 1

Example 2

Syntax

subplot, lagrange

4. Two-dimensional interpolation

In one dimension, sample pairs (x, y) are known, and interpolation finds the y value for a new x.

The same holds in two dimensions: samples (x, y, z) are known, and given the two coordinates x and y, interpolation finds the third value z.

mesh: draws a wireframe surface plot; contour: draws contour lines

Example 3

x: five values, y: three values

In interp2(x,y,z,...) the columns of z correspond to x and the rows to y, so with five x values and three y values temps must be written as a 3*5 matrix (three rows, five columns).
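The orientation rule can be checked on a small case with five x values and three y values. The matrix below is hypothetical data, not the temps data of the example.

```matlab
x = 1:5;                            % 5 values -> columns of z
y = 1:3;                            % 3 values -> rows of z
z = [10 12 14 12 10;                % hypothetical data, size 3-by-5
     11 15 18 15 11;
     10 12 14 12 10];

[xi, yi] = meshgrid(1:0.5:5, 1:0.5:3);    % finer query grid
zi = interp2(x, y, z, xi, yi, 'linear');

disp(size(z))                       % 3 5: rows = numel(y), columns = numel(x)
disp(size(zi))                      % 5 9: same layout on the finer grid
```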

contour(xi,yi,zi,20,'r');          % draw 20 red contour lines of zi

[i,j]=find(zi==min(min(zi)));      % row i and column j of the smallest value

x=xi(j),y=yi(i),zmin=zi(i,j)       % coordinates and value of the minimum; note that x uses j and y uses i

% x and y are assigned only to report where the minimum occurs.

[i,j]=find(zi==max(max(zi)));      % the same for the largest value

x=xi(j),y=yi(i),zmax=zi(i,j)

Description:

plot3 (space curve), mesh (space surface), surf (space surface), and contour (contour lines) are commonly used commands in three-dimensional drawing. The difference between mesh and surf is:

mesh draws a wireframe mesh of the surface, while surf draws a filled, shaded surface.

meshz: mesh surface plot with a curtain (reference plane) around it

Example 4

figure(1);

meshz(x,y,z);

xlabel('X'),ylabel('Y'),zlabel('Z');

[xi,yi]=meshgrid(0:50:5600,0:50:4800);

`[xi,yi]=meshgrid(0:50:5600,0:50:4800);` generates the grid points (xi, yi). The grid is built from the vectors 0:50:5600 (x direction) and 0:50:4800 (y direction), but the xi and yi returned by meshgrid(x,y) are matrices of the same size: every row of xi is a copy of x, and every column of yi is a copy of y.
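A tiny case shows the layout of the two matrices. The vectors are arbitrary small examples.

```matlab
x = 0:2;            % 3 values
y = 0:10:30;        % 4 values
[X, Y] = meshgrid(x, y);

disp(X)             % every row of X is a copy of x
disp(Y)             % every column of Y is a copy of y
disp(size(X))       % 4 3: numel(y) rows, numel(x) columns
```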

Scattered point interpolation

The data points (x, y) in the interpolation problems discussed so far all lie on a regular grid. When (x, y) are scattered points, use the griddata(x,y,z,xi,yi,'method') command to perform two-dimensional interpolation.
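A minimal sketch of griddata on scattered samples follows. The sample points are hypothetical and taken from the plane z = x + 2*y, so the linear method should reproduce the plane up to rounding at query points inside the data's convex hull.

```matlab
% Hypothetical scattered samples of the plane z = x + 2*y
x = [0 1 0 1 0.5 0.2 0.8];
y = [0 0 1 1 0.5 0.7 0.3];
z = x + 2*y;

[xi, yi] = meshgrid(0.1:0.2:0.9, 0.1:0.2:0.9);   % regular grid strictly inside the data
zi = griddata(x, y, z, xi, yi, 'linear');        % triangulation-based linear interpolation

planeErr = max(abs(zi(:) - (xi(:) + 2*yi(:))));
disp(planeErr)                                   % close to zero for a plane
```

Query points outside the convex hull of the data return NaN with the linear and cubic methods, so the query grid should stay within the range of the samples.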

Exercises (do it yourself)

1

x=[0 3 5 7 9 11 12 13 14 15];
y=[0 1.2 1.7 2.0 2.1 2.0 1.8 1.2 1.0 1.6];
xi=0:0.1:15;
yi=interp1(x,y,xi,'spline');
plot(x,y,'*',xi,yi)

dy=diff(y);
dx=diff(x);
dy_dx=dy./dx;
k0=dy_dx(1);
x1=x(x<=15 & x>=13);
y1=y(x<=15 & x>=13);
Ymin=min(y1);
yindex=find(y1==Ymin);
Xmin=x1(yindex);
[Xmin,Ymin,k0]

First, `diff(y)` computes the difference between each pair of adjacent elements of y. The result, named `dy`, is an array one element shorter than the original.

In the same way, `diff(x)` computes the differences between adjacent elements of x and stores them in `dx`, which is also one element shorter than the original.

`dy./dx` divides `dy` by `dx` element by element to obtain a new array `dy_dx`. Each element is a finite-difference approximation of the slope (derivative) of y over one interval. For example, `dy_dx(1)` approximates the slope at the first point.

Assigning `dy_dx(1)` to the variable `k0` gives the slope at the starting point.

The next part of the code selects the data points with x in the range [13, 15] and finds the one with the smallest y value. `x(x<=15 & x>=13)` selects the x values that satisfy both `x<=15` and `x>=13`, and `y(x<=15 & x>=13)` selects the corresponding y values.

Then `min(y1)` finds the minimum value in `y1` and stores it in the variable `Ymin`.

Next, `find(y1==Ymin)` returns the index of the element of `y1` that equals `Ymin`, stored as `yindex`.

Finally, `x1(yindex)` picks the element of `x1` at index `yindex`. This gives the coordinates of the point with the smallest y value among those with x in [13, 15], stored in `Xmin` and `Ymin`.
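The code above works on the ten original data points. One possible variation, shown here only as a sketch and not as the reference solution, applies the same steps to the interpolated curve (xi, yi), which gives a finer estimate of both the starting slope and the minimum. The names `mask`, `xs`, `ys`, and `idx` are introduced for illustration.

```matlab
x  = [0 3 5 7 9 11 12 13 14 15];
y  = [0 1.2 1.7 2.0 2.1 2.0 1.8 1.2 1.0 1.6];
xi = 0:0.1:15;
yi = interp1(x, y, xi, 'spline');

% Slope at x = 0, approximated by the first finite difference of the spline
k0 = (yi(2) - yi(1)) / (xi(2) - xi(1));

% Minimum of the interpolated curve on 13 <= x <= 15
mask = xi >= 13 & xi <= 15;
xs   = xi(mask);
ys   = yi(mask);
[Ymin, idx] = min(ys);             % min also returns the position of the minimum
Xmin = xs(idx);

[Xmin, Ymin, k0]
```

Using the second output of `min` avoids the separate `find(y1==Ymin)` step.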

2

1. Symbol settings

x0, y0, z0: original data points

x, y: grid points spaced 10 m apart; z: the interpolated elevation at those points

xmax, ymax: location of the highest point

zmax: the corresponding elevation

Question 1: how to input x0, y0, z0? See above.

2. Answer

x0=100:100:400;
y0=100:100:500;
z0=[636 697 624 478 450;698 712 630 478 420; 680 674 598 412 400;662 626 552 334 310]
figure(1);
mesh(x0,y0,z0);
x=100:10:400;
y=100:10:500;
z=interp2(x0,y0,z0,x,y,'cubic');
figure(2);
mesh(x,y,z);

zmax=max(z);
zindex=find(z=zmax);
xmax=x(zindex);
ymax=y(yindex);
[xmax,ymax,zmax]

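There are errors here: x and y are not generated in grid form, z0 is 4-by-5 although interp2 expects one row per y0 value and one column per x0 value (5-by-4), `find(z=zmax)` should use `==`, and `yindex` is never defined.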

After improvement:

x0=100:100:400;
y0=100:100:500;
z0=[636 698 680 662;
697 712 674 626;
624 630 598 552;
478 478 412 334;
450 420 400 310];
[x,y]=meshgrid(100:10:400,100:10:500);
z=interp2(x0,y0,z0,x,y,'cubic');
meshz(x,y,z);
zmax=max(max(z));
[i,j]=find(z==zmax);
[x(j),y(i),zmax]

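Problem: the reported point does not correspond to the highest point. x and y are now matrices, so x(j) is a linear index into the first column of x (always 100) rather than a column subscript.

Fix: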

x0=100:100:400;
y0=100:100:500;
z0=[636 698 680 662;
697 712 674 626;
624 630 598 552;
478 478 412 334;
450 420 400 310];
[x,y]=meshgrid(100:10:400,100:10:500);
z=interp2(x0,y0,z0,x,y,'cubic');
meshz(x,y,z);
zmax=max(max(z));
[i,j]=find(z==zmax);
[x(i,j),y(i,j),zmax]
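The difference between `x(j)` and `x(i,j)` is easy to verify on a small grid. The grid below is a coarse stand-in for the one used in the exercise, and the subscripts are picked by hand.

```matlab
[x, y] = meshgrid(100:100:400, 100:100:500);   % both are 5-by-4 matrices
i = 2;  j = 3;                                  % row 2 (y = 200), column 3 (x = 300)

disp(x(j))        % 100: linear index 3 walks down the first column of x
disp(x(i, j))     % 300: row and column subscripts give the intended value
disp(y(i, j))     % 200
```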

Standard answer

x0=100:100:400;
y0=100:100:500;
z0=[636 697 624 478 450;698 712 630 478 420;
    680 674 598 412 400;662 626 552 334 310];
pp=csape({x0,y0},z0);
x=100:10:400;y=100:10:500;
z=fnval(pp,{x,y});
zmax=max(max(z));
[dx,dy]=find(z==zmax);
[x(dx),y(dy),zmax]
[X,Y]=meshgrid(x,y);
Z=z';
mesh(X,Y,Z);

Here, `csape` is a function in the MATLAB Curve Fitting Toolbox that performs cubic spline interpolation with end conditions. Its syntax is `pp = csape({x, y}, z)`, where `{x, y}` is a cell array containing the two grid vectors `x` and `y`, and `z` is a matrix of values on that grid, with `z(i,j)` the value at `(x(i), y(j))`. The function returns the interpolating spline as a data structure, which can be evaluated with the `fnval` function.

The `fnval` function, also in the Curve Fitting Toolbox, evaluates such a spline. Its syntax is `z = fnval(pp, {x, y})`, where `pp` is the structure returned by `csape` or another spline construction function, and `{x, y}` is a cell array of the vectors `x` and `y` that define the query grid. The function returns the value of the spline at every grid point `(x(i), y(j))`, which is why the standard answer plots `z'` against the meshgrid output.

Due to different interpolation methods, there may be some deviations in the elevation.

3

Type: scattered point interpolation

Symbol settings omitted

My code

x=[129,140,103.5,88,185.5,195,105,157.5,107.5,77,81,162,162,117.5];
y=[7.5,141.5,23,147,22.5,137.5,85.5,-6.5,-81,3,56.5,-66.5,84,-33.5];
z=-[4,8,6,8,6,8,8,9,9,8,8,9,4,9];
[xi,yi]=meshgrid(77:5:195,-81:5:147);   % cover the full data range (the smallest y is -81)
zi=griddata(x,y,z,xi,yi,'cubic');
figure(1);
plot(x,y,'*')
figure(2);
meshz(xi,yi,zi);

Standard answer

x=[129,140,103.5,88,185.5,195,105,157.5,107.5,77,81,162,162,117.5];
y=[7.5,141.5,23,147,22.5,137.5,85.5,-6.5,-81,3,56.5,-66.5,84,-33.5];
z=-[4,8,6,8,6,8,8,9,9,8,8,9,4,9];
xmm=minmax(x);
ymm=minmax(y);
X=xmm(1):5:xmm(2);
Y=ymm(1):5:ymm(2);
Z=griddata(x,y,z,X,Y','cubic');
subplot(1,2,1), plot(x,y,'*')
subplot(1,2,2), mesh(X,Y,Z)

This code computes the minimum and maximum values of `x` and `y` and uses them to create new `X` and `Y` vectors.

`minmax(x)` returns the minimum and maximum of the `x` vector, and `minmax(y)` does the same for `y`. The results are stored in `xmm` and `ymm`.

A new `X` vector is then created, starting at the minimum `xmm(1)` and increasing in steps of 5 up to the maximum `xmm(2)`.

Similarly, a new `Y` vector is created, starting at `ymm(1)` and increasing in steps of 5 up to `ymm(2)`.

The two solutions are essentially the same.

Note that in this method X and Y must have different orientations: X is a row vector and Y' is a column vector (hence the transpose), so that griddata evaluates on the full grid.
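`minmax` is not part of base MATLAB; it ships with the Deep Learning Toolbox (formerly Neural Network Toolbox). If that toolbox is unavailable, the same grid can be built with `min` and `max`. The sketch below is an assumed equivalent rather than the standard answer; it also passes an explicit meshgrid to griddata so that the row and column orientation no longer matters. The names `XI`, `YI`, and `ZI` are introduced for illustration.

```matlab
x = [129,140,103.5,88,185.5,195,105,157.5,107.5,77,81,162,162,117.5];
y = [7.5,141.5,23,147,22.5,137.5,85.5,-6.5,-81,3,56.5,-66.5,84,-33.5];
z = -[4,8,6,8,6,8,8,9,9,8,8,9,4,9];

X = min(x):5:max(x);               % x grid values, from the smallest to the largest x
Y = min(y):5:max(y);               % y grid values, from the smallest to the largest y
[XI, YI] = meshgrid(X, Y);         % explicit grid: numel(Y) rows, numel(X) columns
ZI = griddata(x, y, z, XI, YI, 'cubic');

subplot(1,2,1), plot(x, y, '*')    % the scattered sample locations
subplot(1,2,2), mesh(XI, YI, ZI)   % the interpolated surface
```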