[Yugong Series] August 2023 3D Mathematics: Normalization Function

Article directory

  • Foreword
  • 1. Normalization function
    • 1. Derivation process
    • 2. Application scenarios
    • 3. Case
  • 2. Normalization function encapsulation
    • 1. Normalization method
    • 2. Min-Max normalization

Foreword

A normalization function maps a set of data to a specific interval, most commonly [0,1] or [-1,1]. Common normalization methods include min-max normalization, z-score normalization, and mean normalization. The purpose is to bring data of different scales and units onto a common scale, so that neither the scale nor the unit of the data distorts the analysis results.

1. Normalization function

1. Derivation process

Let’s take min-max normalization as an example and walk through the normalization process:

Min-max normalization maps the sample data to the range of [0,1]. The specific steps are as follows:

  1. Determine the maximum value (max) and minimum value (min) of the sample data;
  2. For each data point x, normalize it to the value of (x-min)/(max-min);
  3. The result obtained is in the range [0,1].

Suppose you have the following sample data:

Data number    Data value
1              3
2              5
3              7
4              9

  1. Determine the maximum and minimum values of the sample data:

    max = 9

    min = 3

  2. For each data point x, normalize it to the value of (x-min)/(max-min):

    The normalized result of data number 1 is (3-3)/(9-3) = 0

    The normalized result of data number 2 is (5-3)/(9-3) ≈ 0.33

    The normalized result of data number 3 is (7-3)/(9-3) ≈ 0.67

    The normalized result of data number 4 is (9-3)/(9-3) = 1

  3. The obtained result is in the range of [0,1], and the final normalized result is as follows:

Data number    Data value    Normalized result
1              3             0
2              5             0.33
3              7             0.67
4              9             1

In this way, we map data of different scales into the same interval to facilitate subsequent data processing and analysis.
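The three steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a library implementation: the function name `min_max_normalize` is introduced here for the example, and mapping constant data (where max equals min) to 0.0 is an assumed design choice that the article does not specify.

```python
def min_max_normalize(values):
    """Map each value to (x - min) / (max - min), giving results in [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: constant data has no spread, so map every value to 0.0
        return [0.0 for _ in values]
    return [(x - lo) / (hi - lo) for x in values]


sample = [3, 5, 7, 9]
print([round(v, 2) for v in min_max_normalize(sample)])  # [0.0, 0.33, 0.67, 1.0]
```

Rounding is applied only for display; the unrounded results are 0, 1/3, 2/3 and 1.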

2. Application scenarios

Normalization functions are widely used in data preprocessing. The following are some application scenarios of normalization functions:

  1. Feature scaling

In machine learning, feature scaling is a preprocessing technique that brings different features into the same range, so that large differences in value ranges between features do not distort model training and prediction. Normalization is one such feature scaling method; it can scale the data to [0,1] or [-1,1].

  2. Image processing

In image processing, normalization is often used to scale the pixel values of an image to [0,1] or [-1,1] before further processing and analysis.

  3. Data visualization

In data visualization, normalization maps data with different attributes or units onto a common scale, making comparisons between different data series more intuitive.

  4. Data mining

In data mining, raw data usually has to be preprocessed before model training and prediction. Normalization scales the data into the same interval so that differing scales do not interfere with the model.

The normalization function is a common technique in data processing. It can make the data easier to process and analyze, and it also plays a great role in model training and prediction.
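To make the feature-scaling scenario concrete, the sketch below normalizes each feature (column) independently, so that features with very different ranges end up on the same [0,1] scale. The helper name `scale_features` and the sample rows are hypothetical, chosen only for illustration, and mapping a constant column to 0.0 is an assumption.

```python
def scale_features(rows):
    """Min-max scale every column of a list of equal-length numeric rows."""
    columns = list(zip(*rows))  # transpose rows into columns
    scaled_columns = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        # Assumption: a constant column has no spread, so map it to 0.0
        scaled_columns.append([(x - lo) / span if span else 0.0 for x in col])
    # Transpose back so each inner list is one sample again
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical samples: (height in cm, pixel intensity 0-255)
rows = [[150, 0], [175, 51], [200, 255]]
print(scale_features(rows))  # [[0.0, 0.0], [0.5, 0.2], [1.0, 1.0]]
```

Scaling per column matters: using one global minimum and maximum for all features would let the feature with the widest range dominate.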

3. Case

Suppose a data set contains four features for each person: age, gender, height, and weight. Age ranges from 18 to 80, gender is either male or female, height ranges from 150 cm to 200 cm, and weight ranges from 50 kg to 200 kg.

If these features are used directly to train a machine learning model, their differing ranges and units may skew training and prediction, because in many models the features with larger numeric ranges tend to dominate. The numeric features therefore need to be normalized so that they share the same unitless range.

A commonly used normalization function is the Min-Max normalization function. Assuming that the value range of a feature is [a,b], then the Min-Max normalization function of the feature is:

$$x_{norm} = \frac{x - a}{b - a}$$

Here, x is the original value of the feature and $x_{norm}$ is the normalized value. This function maps the original value to the range [0,1].

In this case, the three numeric features (age, height, and weight) can be normalized with the Min-Max normalization function. The code is as follows:

import pandas as pd

data = pd.read_csv('data.csv') # read data set

# Define the Min-Max normalization function
def min_max_scaler(x, a, b):
    return (x - a) / (b - a)

# Normalize age, height and weight
data['age_norm'] = data['age'].apply(min_max_scaler, args=(18, 80))
data['height_norm'] = data['height'].apply(min_max_scaler, args=(150, 200))
data['weight_norm'] = data['weight'].apply(min_max_scaler, args=(50, 200))

# View the normalized data set
print(data.head())

Output result:

   age gender  height  weight  age_norm  height_norm  weight_norm
0   23      M     170      60  0.080645          0.4     0.066667
1   28      F     165      65  0.161290          0.3     0.100000
2   33      M     180      75  0.241935          0.6     0.166667
3   45      F     155      40  0.435484          0.1    -0.066667
4   50      M     190     100  0.516129          0.8     0.333333

After normalization, values of age, height, and weight that lie within the assumed ranges are mapped to [0,1], so these features can be used together to train machine learning models. Note that the weight of 40 in row 3 is below the assumed minimum of 50 and therefore maps to a negative value.
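When the bounds a and b are fixed in advance rather than computed from the data, inputs outside [a,b] (such as the weight of 40 in row 3, which is below the assumed minimum of 50) map outside [0,1]. One possible remedy is to clamp the result, as sketched below. The function name `min_max_clamped` is hypothetical, and whether clamping is appropriate, as opposed to widening the bounds or rejecting the value, depends on the application.

```python
def min_max_clamped(x, a, b):
    """Min-max normalize x with fixed bounds [a, b], clamping the result to [0, 1]."""
    if b <= a:
        raise ValueError("upper bound b must be greater than lower bound a")
    scaled = (x - a) / (b - a)
    # Clamp so that out-of-range inputs land on the nearest end of the interval
    return max(0.0, min(1.0, scaled))


print(min_max_clamped(40, 50, 200))   # 0.0 (below the lower bound, clamped)
print(min_max_clamped(125, 50, 200))  # 0.5
print(min_max_clamped(250, 50, 200))  # 1.0 (above the upper bound, clamped)
```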

2. Normalization function encapsulation

1. Normalization method

Common normalization methods include:

  1. Min-Max normalization: Linearly map the data to the [0,1] interval. The formula is:

    $$X_{norm} = \frac{X - X_{min}}{X_{max} - X_{min}}$$

  2. Z-Score normalization: Rescale the data so that it has a mean of 0 and a standard deviation of 1, where μ is the mean and σ is the standard deviation of the data. The shape of the distribution is unchanged. The formula is:

    $$X_{norm} = \frac{X - \mu}{\sigma}$$

  3. Decimal Scaling normalization: Map the data into the (-1,1) interval by moving the position of the decimal point. The formula is:

    $$X_{norm} = \frac{X}{10^j}$$

    where j is the smallest integer such that the absolute value of every normalized data point is less than 1.

  4. Logarithmic normalization: Take the logarithm of the data and then normalize it.
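Methods 2 to 4 in the list above can be sketched as follows. These are illustrative assumptions rather than the article's own implementations: the function names are invented here, the z-score uses the population standard deviation, decimal scaling restricts j to non-negative integers, and the logarithmic variant uses log(1 + x) followed by min-max scaling, which is only one of several possible readings of "take the logarithm and then normalize".

```python
import math
import statistics


def z_score(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    if sigma == 0:
        raise ValueError("z-score is undefined for constant data")
    return [(x - mu) / sigma for x in values]


def decimal_scaling(values):
    """Divide by 10**j, with j the smallest non-negative integer giving max |x| / 10**j < 1."""
    largest = max(abs(x) for x in values)
    j = 0
    while largest / 10 ** j >= 1:
        j += 1
    return [x / 10 ** j for x in values]


def log_then_min_max(values):
    """Take log(1 + x) of non-negative data, then min-max scale the logs to [0, 1]."""
    logs = [math.log1p(x) for x in values]
    lo, hi = min(logs), max(logs)
    if hi == lo:
        raise ValueError("cannot min-max scale constant data")
    return [(v - lo) / (hi - lo) for v in logs]


print(z_score([3, 5, 7, 9]))
print(decimal_scaling([-250, 40, 999]))  # [-0.25, 0.04, 0.999]
print(log_then_min_max([0, 9, 99]))
```

In the decimal scaling call, the largest absolute value is 999, so j = 3 and every value is divided by 1000.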

The appropriate normalization method depends on the data and the task. Categorical (discrete) data, for example, is generally handled with encodings such as one-hot encoding or binary encoding rather than with numeric normalization.
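As an illustration of the one-hot encoding mentioned above, the sketch below encodes a categorical feature such as gender. The function name `one_hot` and the choice to order categories alphabetically are assumptions made for this example.

```python
def one_hot(values):
    """Encode each categorical value as a 0/1 vector with one position per category."""
    categories = sorted(set(values))  # fixed, reproducible column order
    index = {cat: i for i, cat in enumerate(categories)}
    encoded = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1  # mark the position of this value's category
        encoded.append(row)
    return categories, encoded


categories, encoded = one_hot(["M", "F", "M", "F", "M"])
print(categories)  # ['F', 'M']
print(encoded)     # [[0, 1], [1, 0], [0, 1], [1, 0], [0, 1]]
```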

2. Min-Max normalization

Normalization refers to scaling input data (also called features) to the same range or scale so that the machine learning algorithm can better learn the relationship between features. Common normalization methods include Min-Max Normalization and Standardization.

Here is a simple JavaScript function for min-max normalization of data:

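function normalize(data) {
  // Find the smallest and largest values in the input array
  var min = Math.min(...data);
  var max = Math.max(...data);
  var range = max - min;
  // Map each value to (x - min) / (max - min);
  // if all values are equal, return 0 to avoid dividing by zero
  var result = data.map(function(x) {
    return range === 0 ? 0 : (x - min) / range;
  });
  return result;
}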

The function accepts an array as input and returns a min-max normalized array. Specific steps are as follows:

  1. Use the Math.min and Math.max methods to calculate the minimum and maximum values of the input array.
  2. Use the Array.prototype.map method to map each element of the array to its normalized value.
  3. The calculation formula is: normalized value = (raw value - minimum value) / (maximum value - minimum value).

Usage example:

var data = [1, 2, 3, 4, 5];
var normalizedData = normalize(data);
console.log(normalizedData); // [0, 0.25, 0.5, 0.75, 1]

This function is just a basic implementation and can be modified and extended as needed. For example, parameters can be added to control the normalization range.
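One way to add such range parameters is sketched below in Python, matching the pandas example earlier. The function name `normalize_to_range` and the parameters `new_min` and `new_max` are hypothetical. The idea is to min-max normalize first and then stretch the result onto the target interval.

```python
def normalize_to_range(data, new_min=0.0, new_max=1.0):
    """Min-max normalize data, then stretch the result to [new_min, new_max]."""
    lo, hi = min(data), max(data)
    if hi == lo:
        raise ValueError("cannot normalize constant data")
    scale = (new_max - new_min) / (hi - lo)
    return [new_min + (x - lo) * scale for x in data]


print(normalize_to_range([1, 2, 3, 4, 5]))             # [0.0, 0.25, 0.5, 0.75, 1.0]
print(normalize_to_range([1, 2, 3, 4, 5], -1.0, 1.0))  # [-1.0, -0.5, 0.0, 0.5, 1.0]
```

With the default arguments the function reduces to plain min-max normalization.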

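Another normalization that is common in 3D mathematics is vector normalization, which scales a vector to unit length by dividing each component by the vector's Euclidean length: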

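// Vector normalization: scale arr in place to unit (Euclidean) length
function normalized(arr) {
  let sum = 0;

  // Sum of the squared components
  for (let i = 0; i < arr.length; i++) {
    sum += arr[i] * arr[i];
  }

  // Euclidean length (L2 norm) of the vector
  const length = Math.sqrt(sum);

  // A zero-length vector cannot be normalized; leave it unchanged
  if (length === 0) {
    return arr;
  }

  for (let i = 0; i < arr.length; i++) {
    arr[i] = arr[i] / length;
  }

  return arr;
}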
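The same unit-length normalization can be sketched in Python. Unlike the JavaScript function above, this version returns a new list instead of modifying its input, and it raises an error for the zero vector; both are design choices assumed for this example, and the name `unit_vector` is hypothetical.

```python
import math


def unit_vector(v):
    """Return a new vector pointing the same way as v but with Euclidean length 1."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0:
        raise ValueError("the zero vector has no direction and cannot be normalized")
    return [c / length for c in v]


u = unit_vector([3.0, 4.0, 0.0])
print(u)  # [0.6, 0.8, 0.0]
```

Here the input has length 5 (a 3-4-5 triangle), so each component is divided by 5.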