Maximum entropy in Java, trained with the GIS algorithm

Resource download address: https://download.csdn.net/download/sheziqiong/88284338
MaxEnt

This is a concise Java implementation of maximum entropy, providing training and prediction interfaces. Training uses the GIS algorithm, and the project comes with a sample training set and a weather-forecast demo.

MaxEnt training and prediction

Call method

public static void main(String[] args) throws IOException
{
    String path = "data/train.txt";
    MaxEnt maxEnt = new MaxEnt();
    maxEnt.loadData(path);
    maxEnt.train(200);
    List<String> fieldList = new ArrayList<String>();
    fieldList.add("Sunny"); // if the sky is sunny
    fieldList.add("Humid"); // and the air is humid
    Pair<String, Double>[] result = maxEnt.predict(fieldList); // predict the probabilities of going out vs. staying home
    System.out.println(Arrays.toString(result));
}

Java implementation of maximum entropy

Detailed explanation of algorithm
Maximum entropy is a discriminative model: it satisfies all known constraints while assuming nothing extra about what is unknown.

This article introduces the principle, classification, and implementation of maximum entropy, using the concise Java implementation above as a guide. It does not cover formula derivations or alternative training algorithms, so it should be an easy read.

Maximum entropy theory

Introduction

Maximum entropy is a discriminative model: it can satisfy all known constraints without making any unwarranted assumptions about unknown information.

What are known constraints? This article won't bury you in obscure terminology; here's an example instead:

Your friend either "goes out" or "stays at home" every day. These two activities are jointly influenced by the "weather", her "mood", and the "humidity" (she minds her skin). We can call these features.

Next, we collect some "activity <-> features" examples from her Weibo history, such as:

1. "The weather is really nice today and I am very happy, so I went out shopping."
2. “It’s so dry, I need a beauty sleep!”
3. "My spare tire No. 2 came to visit my Weibo homepage again. I'm so angry! The rain in Shanghai is so cold, I might as well watch American TV series!"
4. "My boyfriend invites me to go shopping. Even if the weather is bad, I will go!"
5. …

We can intuitively feel that this is… quite the green tea girl ( ̄_ ̄|||)… but I digress. Intuitively, "good weather" is positively correlated with "going out", and so is "good mood"; a bad mood is negatively correlated. None of this is absolute, though: a correlation may only hold when it is "not dry".

Maximum entropy digitizes our intuitions as features (feature functions) and learns how important each one is. The constraints require that, under the model's predicted distribution, each feature's expectation matches its empirical frequency in the training data; beyond those constraints, the distribution is kept as uniform as possible, which is exactly what maximizing the entropy of the system means.

Maximum entropy neither assumes that "weather" and "mood" are independently distributed nor asserts that "weather" influences "mood". Perhaps it does; maximum entropy only guarantees that the final result satisfies the probability constraints.
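To make "digitizing intuition as a feature function" concrete, here is a minimal sketch of a binary feature function. The class and method names are illustrative, not this project's actual API: a feature fires (returns 1) only for a particular context/outcome combination.

```java
import java.util.List;

public class FeatureDemo {
    // A binary feature function f(context, outcome):
    // fires only when the context contains "Sunny"
    // and the predicted outcome is "Outdoor".
    static int sunnyOutdoor(List<String> context, String outcome) {
        return (context.contains("Sunny") && "Outdoor".equals(outcome)) ? 1 : 0;
    }

    public static void main(String[] args) {
        List<String> context = List.of("Sunny", "Humid");
        System.out.println(sunnyOutdoor(context, "Outdoor")); // prints 1
        System.out.println(sunnyOutdoor(context, "Indoor"));  // prints 0
    }
}
```

Training then assigns each such feature a weight (a Lagrange multiplier) measuring how strongly it should push the prediction.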

If you have deep mathematical knowledge and enough time, you can choose to read the papers and derivation process in the appendix, where you will get rigorous descriptions and formula derivation.

Classification

Depending on how sample information is used for probability estimation, maximum entropy models fall into two types: joint maximum entropy models and conditional maximum entropy models. Suppose a is an event and b is the environment (or context) in which a occurs; the joint probability of a and b is written p(a, b). In general, let A be the set of all possible events and B the set of all environments. To find the probability p(a, b) for any a ∈ A, b ∈ B, a joint maximum entropy model must be built. To compute the probability of event a occurring given b, i.e. the conditional probability p(a | b), a conditional maximum entropy model must be built.

The maximum entropy model implemented in this article belongs to the conditional maximum entropy model.
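For reference, the conditional model has the standard log-linear form, where the f_i are the feature functions, the λ_i their weights, and Z(b) normalizes over all candidate events:

```latex
p(a \mid b) = \frac{1}{Z(b)} \exp\Big(\sum_i \lambda_i f_i(a, b)\Big),
\qquad
Z(b) = \sum_{a' \in A} \exp\Big(\sum_i \lambda_i f_i(a', b)\Big)
```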

Implementation

The code can be downloaded from the address at the top of this article.

Training set

If we make the Weibo collected above into a computer-readable data set data/train.txt (already included in this open source project):

Outdoor Sunny Happy
Outdoor Sunny Happy Dry
Outdoor Sunny Happy Humid
Outdoor Sunny Sad Dry
Outdoor Sunny Sad Humid
Outdoor Cloudy Happy Humid
Outdoor Cloudy Happy Humid
Outdoor Cloudy Sad Humid
Outdoor Cloudy Sad Humid
Indoor Rainy Happy Humid
Indoor Rainy Happy Dry
Indoor Rainy Sad Dry
Indoor Rainy Sad Humid
Indoor Cloudy Sad Humid
Indoor Cloudy Sad Humid

We can see that each row has up to 4 columns: the first column is that day's activity, and the remaining columns describe that day's environment.
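loadData itself is not shown here, but parsing this format is straightforward: split each line on whitespace, treat the first token as the event and the rest as context features. A hypothetical sketch (the class and method names are assumptions, not the project's API):

```java
import java.util.Arrays;
import java.util.List;

public class TrainLineDemo {
    // Parse one training line: the first token is the event (activity),
    // the remaining tokens are the context features of that day.
    static String[] parse(String line) {
        return line.trim().split("\\s+");
    }

    public static void main(String[] args) {
        String[] tokens = parse("Outdoor Sunny Happy Dry");
        String event = tokens[0];
        List<String> context = Arrays.asList(tokens).subList(1, tokens.length);
        System.out.println(event);   // prints Outdoor
        System.out.println(context); // prints [Sunny, Happy, Dry]
    }
}
```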

Training

The purpose of training is actually to calculate a set of optimal Lagrange multipliers, which correspond to how important each feature function is.

GIS algorithm

Define λi as the Lagrange multiplier of feature function i and C as the maximum number of features activated by any event. Each GIS iteration applies the update

λi ← λi + (1/C) · log(empirical expectation of feature i / model's estimated expectation of feature i)

where the numerator and denominator inside the log are the empirical distribution expectation and the model estimation expectation, respectively.

In the Nth iteration, the GIS algorithm uses the current model to estimate the expectation of each feature on the training data. If the model overestimates a feature (the fraction is less than 1, so the log is negative), the corresponding parameter is made smaller (adding a negative number shrinks it); otherwise it is made larger. When the model's feature expectations match those of the training sample, the optimal parameters have been found.

This formula is described in Java as follows:

for (int i = 0; i < maxIt; ++i)
{
    computeModeE(modelE);
    for (int w = 0; w < weight.length; w++)
    {
        lastWeight[w] = weight[w];
        weight[w] += 1.0 / C * Math.log(empiricalE[w] / modelE[w]);
    }
    if (checkConvergence(lastWeight, weight)) break;
}
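checkConvergence is not shown above; a plausible implementation simply tests whether every weight moved less than a small tolerance in the last iteration. A sketch under that assumption (the method name matches the snippet above, but the tolerance value is made up):

```java
public class ConvergenceDemo {
    // Tolerance for declaring convergence; the value is an assumption.
    static final double EPS = 1e-4;

    // Converged when no weight changed by EPS or more in this iteration.
    static boolean checkConvergence(double[] lastWeight, double[] weight) {
        for (int i = 0; i < weight.length; i++) {
            if (Math.abs(lastWeight[i] - weight[i]) >= EPS) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        double[] prev = {0.5, 1.0};
        System.out.println(checkConvergence(prev, new double[]{0.50005, 1.0})); // prints true
        System.out.println(checkConvergence(prev, new double[]{0.6, 1.0}));     // prints false
    }
}
```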

Prediction

The most exciting moment has finally arrived. As spare tire No. 2, you are going to ask her to a movie tomorrow. The weather forecast says tomorrow will be sunny and humid. So what is the probability that she will agree to go out with you?

String path = "data/train.txt";
MaxEnt maxEnt = new MaxEnt();
maxEnt.loadData(path);
maxEnt.train(200);
List<String> fieldList = new ArrayList<String>();
fieldList.add("Sunny"); // if the sky is sunny
fieldList.add("Humid"); // and the air is humid
Pair<String, Double>[] result = maxEnt.predict(fieldList); // predict the probabilities of going out vs. staying home
System.out.println(Arrays.toString(result));

Output

[Outdoor=0.9747657631914007, Indoor=0.025234236808599233]

It seems that the probability of going out is as high as 97%.
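Under the hood, prediction sums the weights of the features that fire for each candidate event, exponentiates, and normalizes so the probabilities add to 1 (the log-linear form from the Classification section). A toy sketch with made-up score values, purely to show the normalization step (the numbers are not the trained model's):

```java
import java.util.Locale;

public class PredictDemo {
    public static void main(String[] args) {
        // Illustrative sums of lambda_i over the features firing for each event.
        double outdoorScore = 2.0;
        double indoorScore = -1.5;

        // Normalize with exp, as in p(a|b) = exp(score) / Z(b).
        double zOutdoor = Math.exp(outdoorScore);
        double zIndoor = Math.exp(indoorScore);
        double z = zOutdoor + zIndoor;

        System.out.printf(Locale.ROOT, "Outdoor=%.4f Indoor=%.4f%n",
                zOutdoor / z, zIndoor / z);
    }
}
```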

Well, I forgot to tell you. You also need to multiply it by the probability that she will agree to do something for you. This probability is probably close to 0. Why? Because you are a coder hahahahahahahahahahahahahahahahahahahaha!

“Single dogs are also dogs, and showing affection is also a form of dog abuse. You don’t have to love them, but please don’t hurt them.”

