Modeling the XOR problem with the Lumos framework

This article builds a model of the XOR function with a feedforward neural network. The [Lumos](LumosNet (github.com)) deep learning framework is used to construct, train, and test the network.

Exclusive OR (XOR)

The exclusive OR function (XOR) operates on two binary numbers a and b. The result is 1 if and only if exactly one of them is 1; otherwise the result is 0.

XOR

label data[a, b]
1 [1, 0] [0, 1]
0 [1, 1] [0, 0]

XOR is a typical nonlinear problem, in contrast to logical AND and logical OR

Logical AND

label data[a, b]
1 [1, 1]
0 [1, 0] [0, 1] [0, 0]

Logical OR

label data[a, b]
1 [1, 0] [0, 1] [1, 1]
0 [0, 0]

The scatter diagram of XOR, logical AND, and logical OR is as follows

The data for logical AND and logical OR can each be separated by a single straight line, while no single straight line separates the XOR data, so XOR is a typical nonlinear function

Model construction

Dataset

Dataset download XOR

The four XOR input pairs serve as both the training and the test data; the goal is for the constructed model to fit all four of them

Network structure

Connect Layer : [output= 4, bias=1, active=relu]
Connect Layer : [output= 2, bias=1, active=relu]
Connect Layer : [output= 1, bias=1, active=relu]
Mse Layer : [output= 1]

The network has three fully connected layers with 4, 2, and 1 neurons respectively

Each fully connected layer has a bias term and uses the ReLU activation function

The loss function is MSE (mean squared error)

$$MSE=\frac{1}{n}SSE=\frac{1}{n}\sum_{i=1}^{n}(\hat{y}_i-y_i)^{2}$$

where $\hat{y}_i$ is the predicted result and $y_i$ is the true label.

Code Construction

Label handler

The [Lumos](LumosNet (github.com)) framework supports custom label preprocessing. Here the handler converts each label to its target value with one-hot encoding

void xor_label2truth(char **label, float *truth)
{
    int x = atoi(label[0]);
    one_hot_encoding(1, x, truth);
}

Network construction

First create a graph and add all layers to the graph

Lumos accepts input only in the form of images, so an im2col layer is added first to convert the image data into a one-dimensional vector

Graph *graph = create_graph("Lumos", 5);
Layer *l1 = make_im2col_layer(1);
Layer *l2 = make_connect_layer(4, 1, "relu");
Layer *l3 = make_connect_layer(2, 1, "relu");
Layer *l4 = make_connect_layer(1, 1, "relu");
Layer *l5 = make_mse_layer(1);
append_layer2grpah(graph, l1);
append_layer2grpah(graph, l2);
append_layer2grpah(graph, l3);
append_layer2grpah(graph, l4);
append_layer2grpah(graph, l5);

Weight Initializer

The weights are initialized with the Kaiming He initializer

Initializer init = he_initializer();

Create session

Create a session and bind the network model

Session *sess = create_session("cpu", init);
bind_graph(sess, graph);

Create a training scene

Specify the training data, set the batch size to 4, train for 500 epochs, and use a learning rate of 0.01

create_train_scene(sess, 1, 2, 1, 1, 1, xor_label2truth, "./xor/data.txt", "./xor/label.txt");
init_train_scene(sess, 500, 4, 2, NULL);
session_train(sess, 0.01, "./xorw.w");

Create the test session and scene

Session *t_sess = create_session("cpu", init);
bind_graph(t_sess, graph);
create_test_scene(t_sess, 1, 2, 1, 1, 1, xor_label2truth, "./xor/test.txt", "./xor/label.txt");
init_test_scene(t_sess, "./xorw.w");
session_test(t_sess, xor_process_test_information);

Test result display

The Lumos framework supports a custom result display. The handler below prints the test data path, the label, the true value, the prediction, and the loss for each test case

void xor_process_test_information(char **label, float *truth, float *predict, float loss, char *data_path)
{
    fprintf(stderr, "Test Data Path: %s\n", data_path);
    fprintf(stderr, "Label: %s\n", label[0]);
    fprintf(stderr, "Truth: %f\n", truth[0]);
    fprintf(stderr, "Predict: %f\n", predict[0]);
    fprintf(stderr, "Loss: %f\n\n", loss);
}

Complete code

#include <stdio.h>
#include <stdlib.h>

#include "lumos.h"


/* Convert the label string to the target value. */
void xor_label2truth(char **label, float *truth)
{
    int x = atoi(label[0]);
    one_hot_encoding(1, x, truth);
}

/* Print the details of one test case. */
void xor_process_test_information(char **label, float *truth, float *predict, float loss, char *data_path)
{
    fprintf(stderr, "Test Data Path: %s\n", data_path);
    fprintf(stderr, "Label: %s\n", label[0]);
    fprintf(stderr, "Truth: %f\n", truth[0]);
    fprintf(stderr, "Predict: %f\n", predict[0]);
    fprintf(stderr, "Loss: %f\n\n", loss);
}

void xor () {
    Graph *graph = create_graph("Lumos", 5);
    Layer *l1 = make_im2col_layer(1);
    Layer *l2 = make_connect_layer(4, 1, "relu");
    Layer *l3 = make_connect_layer(2, 1, "relu");
    Layer *l4 = make_connect_layer(1, 1, "relu");
    Layer *l5 = make_mse_layer(1);
    append_layer2grpah(graph, l1);
    append_layer2grpah(graph, l2);
    append_layer2grpah(graph, l3);
    append_layer2grpah(graph, l4);
    append_layer2grpah(graph, l5);

    Initializer init = he_initializer();
    Session *sess = create_session("cpu", init);
    bind_graph(sess, graph);
    create_train_scene(sess, 1, 2, 1, 1, 1, xor_label2truth, "./xor/data.txt", "./xor/label.txt");
    init_train_scene(sess, 500, 4, 2, NULL);
    session_train(sess, 0.01, "./xorw.w");

    Session *t_sess = create_session("cpu", init);
    bind_graph(t_sess, graph);
    create_test_scene(t_sess, 1, 2, 1, 1, 1, xor_label2truth, "./xor/test.txt", "./xor/label.txt");
    init_test_scene(t_sess, "./xorw.w");
    session_test(t_sess, xor_process_test_information);
}

int main(){
    xor();
    return 0;
}

Training and results

Compile the code using the following command

gcc -fopenmp xor.c -I/usr/local/lumos/include/ -o main -L/usr/local/lumos/lib -llumos

After compiling, run the resulting executable (named main by the -o option in the command above)

The program first prints the network structure

[Lumos] max 5 Layers
Im2col Layer : [flag=1]
Connect Layer : [output= 4, bias=1, active=relu]
Connect Layer : [output= 2, bias=1, active=relu]
Connect Layer : [output= 1, bias=1, active=relu]
Mse Layer : [output= 1]

[Lumos] Inputs Outputs
Im2col Layer 2* 1* 1 ==> 1* 2* 1
Connect Layer 1* 2* 1 ==> 1* 4* 1
Connect Layer 1* 4* 1 ==> 1* 2* 1
Connect Layer 1* 2* 1 ==> 1* 1* 1
Mse Layer 1* 1* 1 ==> 1* 1* 1

The test run then produces the following results

Session Start To Detect Test Cases
Test Data Path: ./xor/data/00.png
Label: 0
Truth: 0.000000
Predict: 0.129547
Loss: 0.016782

Test Data Path: ./xor/data/01.png
Label: 1
Truth: 1.000000
Predict: 0.904523
Loss: 0.009116

Test Data Path: ./xor/data/11.png
Label: 0
Truth: 0.000000
Predict: 0.104283
Loss: 0.010875

Test Data Path: ./xor/data/10.png
Label: 1
Truth: 1.000000
Predict: 0.942409
Loss: 0.003317

Data     Label
[0, 0]   0
[0, 1]   1
[1, 0]   1
[1, 1]   0

Data     Test result   True label   Loss
[0, 0]   0.129547      0.0          0.016782
[0, 1]   0.904523      1.0          0.009116
[1, 1]   0.104283      0.0          0.010875
[1, 0]   0.942409      1.0          0.003317

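The results match expectations: the predictions for [0, 1] and [1, 0] are close to 1 and those for [0, 0] and [1, 1] are close to 0, so the network has learned the XOR function, with a loss of at most about 0.017 per sample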