SVM multi-class classification problem with an L1 regularization term, optimized with MindOpt

A regularization term mitigates ill-posed problems or prevents overfitting by adding information to a model. It has a wide range of applications, including machine learning, statistics, and image processing. Common regularization terms include L0, L1, and L2; this article concerns the L1 regularization term. The L1 regularization term is the sum of the absolute values of the elements of the weight vector x, written $\|x\|_1$.
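Concretely, for a weight vector $x \in \mathbb{R}^d$, the L1 regularization term is

$$\|x\|_1 = \sum_{i=1}^{d} |x_i|$$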

In machine learning, classification refers to the predictive modeling problem of predicting a class label for a given input example. There are four common classification tasks: binary classification, multi-class classification, multi-label classification, and imbalanced classification.

Multi-class classification refers to classification tasks with more than two class labels; multivariate probability distribution models are usually used to model such tasks. Algorithms that can be used for multi-class classification include the k-nearest neighbors algorithm, decision trees, naive Bayes, random forests, and gradient boosting, but binary classification algorithms such as logistic regression and support vector machines are also applicable to multi-class classification problems.

Support Vector Machine (SVM) is a class of linear classifiers that classify data in a supervised learning manner. Its core idea is to find the decision boundary that maximizes the margin between samples of different categories in the feature space. Regularization terms are often introduced into the SVM model to improve robustness or encode prior knowledge. L1-regularized SVM adds the L1 regularization term $\|x\|_1$ to the model; this encodes the prior knowledge that the weight vector should be sparse (some entries of the vector are driven to exactly 0), which improves classification efficiency.

This article describes how to use MindOpt to solve the SVM multi-class classification problem with L1 regularization.

Description of the problem

Classification is one of the most fundamental tasks in the field of machine learning. Its purpose is to establish a mapping relationship between an input vector x and a categorical variable y. For example:

  • In computer vision, we need to classify each picture into categories such as “animal”, “ship”, “plant” according to its content, for use in downstream tasks. In this application, x is a vector of image pixels, and the categorical variable y is a scalar pointing to its specific category.
  • We will classify the received emails into “important emails”, “marketing emails”, “spam emails” etc. according to their title, sender, content, attached pictures or links, etc. In this application, x is the vectorized input of emails, and y is a scalar that points to a specific class.

Problem model

Suppose we have $m$ data pairs $\{x_i, y_i\}_{i=1}^m$, where $x_i \in \mathbb{R}^d$ is the feature vector and $y_i \in \{1, 2, \cdots, K\}$ is the category label corresponding to the feature vector $x_i$. Therefore, all data belong to these $K$ categories.

In multi-class classification problems, we need to find a parameter matrix $W \in \mathbb{R}^{K \times d}$ to classify unseen data samples $x$. For example, after obtaining the parameter matrix $W$, we compute the categorical vector $s = Wx \in \mathbb{R}^K$; by comparing the sizes of the elements of $s$, we can determine the category of the current data sample $x$.

The problem model of $\ell_1$-regularized SVM is as follows:
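The following is a standard statement of the model, reconstructed to be consistent with the definitions below; the slack-penalty weight $C > 0$ is an assumption of this write-up, with $C = 1$ consistent with the objective coefficient range of the sample data used later:

$$
\min_{W,\,\xi}\ \sum_{k=1}^{K}\|w_k\|_1 + C\sum_{i=1}^{m}\xi_i
\quad \text{s.t.}\quad w_{y_i}^T x_i - w_k^T x_i + e_i^k \ge 1 - \xi_i,\quad i=1,\cdots,m,\ \ k=1,\cdots,K
$$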

where $W$ is the parameter matrix to be optimized, $w_k^T$ is the $k$-th row of $W$, and $w_{y_i}^T$ is the $y_i$-th row of $W$, where $y_i \in \{1,\cdots,K\}$ is the category to which the $i$-th sample $x_i$ belongs.

$\{\xi_i\}_{i=1}^m$ are also variables to be optimized, where each $\xi_i$ is a non-negative scalar. If $y_i = k$, we have $e_i^k = 1$; if $y_i \neq k$, we have $e_i^k = 0$.
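Before turning to the linear programming reformulation, here is a minimal sketch of how a learned $W$ would be used for prediction (NumPy; the random $W$ and $x$ are placeholders, not trained values):

import numpy as np

K, d = 10, 780               # e.g. 10 classes, 780 features (MNIST-like)
W = np.random.randn(K, d)    # stand-in for the optimized parameter matrix
x = np.random.randn(d)       # stand-in for one feature vector

s = W @ x                      # categorical score vector s = Wx in R^K
label = int(np.argmax(s)) + 1  # pick the largest score; labels run 1..K here
print(label)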

Modeling as a linear programming problem

To model the above $\ell_1$-regularized SVM as a linear programming problem, we introduce auxiliary variables $Z \in \mathbb{R}^{K \times d}$, where each $z_{ij}$ must satisfy the constraints $|w_{ij}| \le z_{ij}$ and $z_{ij} \ge 0$. With the auxiliary variables $Z$, we can model the above SVM problem as a linear programming problem:
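In a standard form consistent with the constraints just introduced (again with the assumed slack penalty $C$):

$$
\min_{W,\,Z,\,\xi}\ \sum_{k=1}^{K}\mathbf{1}^T z_k + C\sum_{i=1}^{m}\xi_i
\quad \text{s.t.}\quad w_{y_i}^T x_i - w_k^T x_i + e_i^k \ge 1 - \xi_i,\quad -z_k \le w_k \le z_k,\quad \xi_i \ge 0
$$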

where $z_k^T$ is the $k$-th row of the auxiliary variable $Z$. The above problem can be further reduced to an equivalent linear program over non-negative variables only.

If we let $p_k = z_k - w_k$ and $q_k = z_k + w_k$, then $w_k = \frac{q_k - p_k}{2}$ and $z_k = \frac{p_k + q_k}{2}$, and the constraints $|w_{kj}| \le z_{kj}$ become simply $p_k \ge 0$ and $q_k \ge 0$.

Substituting these relations into the linear program above, we obtain:

  • Variables: $[P, Q, \xi] \in \mathbb{R}^{2Kd + m}$, where every variable is non-negative
  • Inequality constraints: there are $Km$ inequality constraints
  • The above problem can be transformed into the following standard form:
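In matrix notation, this is the form that the script below feeds to MindOpt (it reads $c$, $A$, and $b$ from files and declares all variables non-negative):

$$
\min_{\theta}\ c^T \theta \quad \text{s.t.}\quad A\theta \le b,\ \ \theta \ge 0
$$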

where:

$\theta = [p_1^T,\cdots, p_K^T, q_1^T,\cdots, q_K^T, \xi_1, \cdots, \xi_m]^T \in \mathbb{R}^{2Kd + m}$,

$A \in \mathbb{R}^{Km \times (2Kd + m)}$, $b \in \mathbb{R}^{Km}$

Prepare data

Option 1: Use sample data directly

We have saved ready-to-use sample data in the folder src/model/mnist.scale100_tianchi. This sample data can be viewed directly on the cloud platform (Solver on the cloud).

Click to open it to view the data files; the folder contains the files meta, A, b, and c that the code below reads.
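To sanity-check the sample data before solving, you can read the meta file the same way the solver script below does. A minimal sketch follows; the expected numbers 15700 and 1000 come from $2Kd + m$ and $Km$ with $K = 10$, $d = 780$, $m = 100$:

# Read the problem sizes from the sample data's meta file.
data_folder = "./model/mnist.scale100_tianchi/"
with open(data_folder + "meta") as f:
    sizes = [int(line.split(' ')[-1]) for line in f]

n, m = sizes[0], sizes[1]    # n = 2Kd + m variables, m = Km constraints
print(n, m)                  # expected: 15700 1000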

Option 2: Generate data from the mnist.scale dataset with libSVM

Please refer to the original client/server version of the code, "Option 2: Generate data through the mnist.scale dataset and libSVM (including code)", to generate the data yourself. The libraries required for data generation are not currently supported in this Notebook environment, so please generate the data on your personal computer and then upload it (you can drag it directly into the folder).
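For reference, the sketch below shows what such a generation script could look like. It is not the original generation code: it assumes scikit-learn's load_svmlight_file for reading the libSVM-format dataset, a slack penalty C = 1, 0-indexed row/column indices in the A file, and the meta/b/c/A layout that the solver script below parses.

import numpy as np
from sklearn.datasets import load_svmlight_file

# Read the libSVM-format dataset and keep the first 100 samples.
X, y = load_svmlight_file("mnist.scale")
X = X.toarray()[:100]
y = y.astype(int)[:100]
labels = np.unique(y)
K, (m, d) = len(labels), X.shape
label_idx = {lab: k for k, lab in enumerate(labels)}  # map labels to 0..K-1

# Objective over theta = [P, Q, xi]: sum(z) = sum(p + q)/2 gives 0.5 weights.
n = 2 * K * d + m
c = np.concatenate([np.full(2 * K * d, 0.5), np.ones(m)])  # C = 1 on slacks

# Constraints: for each (i, k), w_k^T x_i - w_{y_i}^T x_i - xi_i <= e_i^k - 1,
# rewritten with p = z - w and q = z + w (so w = (q - p)/2).
rows = []                      # nonzero entries (row, col, val) of A
b = np.empty(K * m)
for i in range(m):
    yi = label_idx[y[i]]
    for k in range(K):
        r = i * K + k
        b[r] = (1.0 if k == yi else 0.0) - 1.0     # e_i^k - 1
        rows.append((r, 2 * K * d + i, -1.0))      # coefficient of -xi_i
        if k == yi:
            continue                               # row reduces to -xi_i <= 0
        for j in np.nonzero(X[i])[0]:
            v = X[i, j] / 2.0
            rows.append((r, yi * d + j, v))            # +p_{y_i}/2
            rows.append((r, K * d + yi * d + j, -v))   # -q_{y_i}/2
            rows.append((r, k * d + j, -v))            # -p_k/2
            rows.append((r, K * d + k * d + j, v))     # +q_k/2

# Write the four files in the layout the solver script expects.
with open("meta", "w") as f:
    f.write("num_variables {}\nnum_constraints {}\n".format(n, K * m))
with open("c", "w") as f:
    f.writelines("{}\n".format(v) for v in c)
with open("b", "w") as f:
    f.writelines("{}\n".format(v) for v in b)
with open("A", "w") as f:
    f.writelines("{}\t{}\t{}\n".format(r, col, v) for r, col, v in rows)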

Using the MindOpt Solver API

To use the solver's API directly, you need to consult the API documentation to understand the meaning of each call, which is less readable than a modeling language. Please refer to https://solver.damo.alibaba.com/doc/html/API reference/API-python/index.html for usage instructions for the Python API.

For Python examples, see the "5. Modeling and optimization solution" chapter of the documentation. Since this is an LP problem, we can refer to: https://solver.damo.alibaba.com/doc/html/model/lp/linear optimization-python.html

Below we describe two ways to run this in the platform environment:

Method 1: Enter the code directly in the cell and run it

# LP_6_svm.py

from mindoptpy import *
import time

if __name__ == "__main__":
    
    # parameters
    data_folder = "./model/mnist.scale100_tianchi/"
    soln_file = "./model/LP_6_svm.sol"
    MDO_INFINITY = MdoModel.get_infinity()

    
    # read file meta for size of A
    f = open(data_folder + "meta", "r")

    A_size = []
    for line in f:
        num = int(line.split(' ')[-1][:-1])
        A_size.append(num)
    m, n = A_size[1], A_size[0]

    f.close()
    
    # read file b
    b = []
    f = open(data_folder + "b", "r")

    for line in f:
        b_element = float(line[:-1])
        b.append(b_element)

    f.close()
    
    # read file c
    c = []
    f = open(data_folder + "c", "r")

    for line in f:
        c_element = float(line[:-1])
        c.append(c_element)

    f.close()
    
    # --------Step 1. Establish LP model-----------

    model = MdoModel()
    
    try:
        # -----Step 2. Input model. -----------
        # Set minimize problem
        model.set_int_attr("MinSense", 1)


        # Add empty constraints.
        cons = []
        for j in range(m):
            cons_name = "c" + str(j)
            cons.append(model.add_cons(-MDO_INFINITY, b[j], None, cons_name))

        # Input columns.
        f = open(data_folder + "A", "r")

        col = []
        for j in range(n):
            col.append(MdoCol())

        for idx, line in enumerate(f):
            line_lst = line.split('\t')
            row_id, col_id, val = int(line_lst[0]), int(line_lst[1]), float(line_lst[2][:-1])
            col[col_id].add_term(cons[row_id], val)

        # Add variables.
        # Note that the nonzero elements are inputted in a column-wise order here.
        x = []

        for j in range(n):
            x_name = "x" + str(j)
            x.append(model.add_var(0.0, MDO_INFINITY, c[j], col[j], x_name, False))

        print('LP model is established.\n')


        # ------Step 3. Solve the problem and populate the result.-------
        model.solve_prob()
        model.display_results()
        time.sleep(0.01)  # brief pause so the solver log flushes before printing

        status_code, status_msg = model.get_status()
        if status_msg == "OPTIMAL":
            print("----\\
")
            print("The solver terminated with an OPTIMAL status (code {0}).".format(status_code))

            print("The optimal value of the objective function is: {0}".format(model.get_real_attr("PrimalObjVal")))
            print(" - Primal solution : {}".format(model.get_real_attr("PrimalObjVal")))
            print(" - dual solution: {}".format(model.get_real_attr("DualObjVal")))
            print(" - solution time: {} sec.".format(model.get_real_attr("SolutionTime")))
            print("There are too many variables, please open the result file" + soln_file + " to view the variable values")
                
        else:
            print("Optimizer terminated with a(n) {0} status (code {1}).".format(status_msg, status_code))

    except MdoError as e:
        print("Received Mindopt exception.")
        print(" - Code : {}". format(e. code))
        print(" - Reason : {}". format(e. message))
    except Exception as e:
        print("Received exception.")
        print(" - Reason : {}". format(e))
    finally:
        # Step 4. Free the model.
        model.free_mdl()
    

After running the code, the following results are obtained:

Start license validation (current time : 01-MAR-2023 21:00:37).
License validation terminated. Time : 0.002s

LP model is established.

Model summary.
 - Num. variables : 15700
 - Num. constraints : 1000
 - Num. nonzeros : 556788
 - Bound range : [1.0e+00,1.0e+00]
 - Objective range : [5.0e-01,1.0e+00]
 - Matrix range : [2.0e-03,1.0e+00]

Presolver started.
Presolver terminated. Time : 0.089s

Simplex method started.

    Iteration Objective Dual Inf. Primal Inf. Time
            0 0.00000e+00 2.7794e+11 2.4000e+01 0.11s
            3 1.69224e+00 6.9867e+00 0.0000e+00 0.12s CG pass 1
           11 1.27594e+00 2.7885e+00 0.0000e+00 0.12s CG pass 2
End of column generation.
            0 0.00000e+00 2.5246e+11 3.2000e+01 0.12s optimizing a block (26.5%)
           41 1.94879e+00 0.0000e+00 0.0000e+00 0.12s
            0 0.00000e+00 2.1551e+11 2.2000e+01 0.12s
            4 2.23388e+00 3.8777e+00 0.0000e+00 0.12s CG pass 1
            8 1.98044e+00 2.0357e+01 0.0000e+00 0.12s CG pass 2
End of column generation.
            0 0.00000e+00 2.1533e+11 2.2000e+01 0.12s
            3 2.79271e+00 1.0703e+01 0.0000e+00 0.12s CG pass 1
            7 2.11345e+00 3.4990e+01 0.0000e+00 0.12s CG pass 2
End of column generation.
End of column generation.
Model fingerprint: ==gZ3ZWYhRmZ3Z2bkV2dndmZ
Postsolver started.
Simplex method terminated. Time : 0.018s

Optimizer summary.
 - Optimizer used : Simplex method
 - Optimizer status : OPTIMAL
 - Total time : 0.129s

Solution summary. Primal solution
 - Objective : 6.4074250003e+00
----

The solver terminated with an OPTIMAL status (code 1).
The optimal value of the objective function is: 6.4074250003085975
 - Primal solution : 6.4074250003085975
 - Dual solution : 6.407425000308594
 - Solution time : 0.129049877 sec.
There are too many variables, please open the result file ./model/LP_6_svm.sol to view the variable values

Method 2: Run the .py file directly from the command line

The above runs the whole script directly in a cell. Alternatively, we can create a new file and store the Python code (the source code from Method 1) in LP_6_svm.py under the src/python_src folder; note the change in the file's relative paths. Then open a Terminal from the Launcher and run the file with python xx.py.

You can also download this .py file, install the MindOpt solver on your computer, and then run it in your computer environment.

The Launcher can be opened by clicking the + box in the upper left corner, and the Terminal entry is at the bottom of the Launcher panel. The opened window can be dragged to adjust its position.

Then run the following command on the Terminal command line:

cd src/python_src
python LP_6_svm.py

Running it gives the same result as Method 1.

Solution result

The optimal value of the objective function is 6.4074250003085975. The solution file is saved at src/model/LP_6_svm.sol.
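Once the solution values are loaded into a vector theta (the loading step from LP_6_svm.sol is omitted and left as a placeholder here, since the file format depends on the solver version), the parameter matrix W can be recovered from the relations derived above, as in this sketch:

import numpy as np

K, d, m = 10, 780, 100            # sizes of the sample problem
theta = np.zeros(2 * K * d + m)   # placeholder: load the solution values here

P = theta[:K * d].reshape(K, d)            # p_k = z_k - w_k
Q = theta[K * d:2 * K * d].reshape(K, d)   # q_k = z_k + w_k
W = (Q - P) / 2.0                          # w_k = (q_k - p_k) / 2
print(W.shape)                             # (10, 780)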

Contact us

Dingding Q&A group: 32451444
DingTalk activity group: 18890022111
E-mail address: [email protected]
More update notices: https://solver.damo.alibaba.com