Go language multipart library analysis

Introduction

This article is the practical companion to the previous one. Having covered the basic format of a multipart/form-data request in HTTP, we now use Go's official mime/multipart library to get a deeper understanding of how requests in the multipart/form-data format are sent and handled.

Let's first look at a piece of client code that sends a request and a piece of server code that handles it.

1. Client request

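package main

import (
    "bytes"
    "fmt"
    "io"
    "mime/multipart"
    "net/http"
    "os"
    "time"
)

const (
    // The URL needs a scheme; without one the request fails.
    destURL = "http://localhost:8080/"
)

func main() {
    // bodyBuf collects the encoded multipart body in memory.
    var bodyBuf bytes.Buffer
    mpWriter := multipart.NewWriter(&bodyBuf)

    fw, err := mpWriter.CreateFormFile("upload_file", "a.txt")
    if err != nil {
        fmt.Println("Create form file error: ", err)
        return
    }

    f, err := os.Open("a.txt")
    if err != nil {
        fmt.Println("Open file error: ", err)
        return
    }
    defer f.Close()
    if _, err = io.Copy(fw, f); err != nil {
        fmt.Println("Copy file error: ", err)
        return
    }

    if err = mpWriter.WriteField("name", "Trump"); err != nil {
        fmt.Println("Write field error: ", err)
        return
    }
    // Close writes the closing boundary; call it before sending the request.
    if err = mpWriter.Close(); err != nil {
        fmt.Println("Close writer error: ", err)
        return
    }

    client := &http.Client{
        Timeout: 10 * time.Second,
    }

    // resp, err := http.Post(destURL, mpWriter.FormDataContentType(), &bodyBuf)

    req, err := http.NewRequest("POST", destURL, &bodyBuf)
    if err != nil {
        fmt.Println("Create request error: ", err)
        return
    }
    req.Header.Set("Content-Type", mpWriter.FormDataContentType())
    req.Header.Set("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8")
    req.Header.Set("Accept-Charset", "GBK,utf-8;q=0.7,*;q=0.3")
    // Accept-Encoding is left unset so that net/http negotiates gzip itself
    // and decompresses the response transparently.
    response, err := client.Do(req)
    if err != nil {
        fmt.Println("Send request error: ", err)
        return
    }
    defer response.Body.Close()
    if response.StatusCode == http.StatusOK {
        body, err := io.ReadAll(response.Body)
        if err != nil {
            fmt.Println("Read response error: ", err)
            return
        }
        fmt.Println(string(body))
    }
}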

Analysis

In Go, to send a request body in the multipart/form-data format you can use the Writer type from the standard mime/multipart package.

The Writer is defined as follows:

// A Writer generates multipart messages.
type Writer struct {
    w        io.Writer
    boundary string
    lastpart *part
}

The three fields of the Writer struct are simple and map directly onto the layout of a multipart/form-data body.
w is the underlying io.Writer that receives the encoded request body, boundary is the separator string (randomly generated by default), and lastpart points to the most recently created Part, so that the Writer can finish that Part before it starts the next one or writes the closing --boundary-- line.

Creating a request body in multipart/form-data format is divided into 4 steps:

(1) Create Writer
(2) Write the header of each Part into the Writer
(3) Write the body of each Part into the Writer (the body can be a file, a form field, etc.)
(4) Write the closing boundary (just call Writer.Close())

(1) Create Writer

// NewWriter returns a new multipart Writer with a random boundary,
// writing to w.
func NewWriter(w io.Writer) *Writer {
    return &Writer{
        w:        w,
        boundary: randomBoundary(),
    }
}

(2) Write the header information of each Part into Writer

The header of a Part is created by calling w.CreatePart(mimeHeader).

A typical Part header consists of a boundary line followed by the header fields, for example:

-----------------------------9051914041544843365972754266
Content-Disposition: form-data; name="file"; filename="a.txt"
Content-Type: text/plain

The w.CreatePart() function writes the content shown above. It takes a textproto.MIMEHeader and returns an io.Writer, to which you then write the body of that Part.

The steps to call w.CreatePart() are as follows:

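// escapeQuotes is an unexported helper inside mime/multipart.
h := make(textproto.MIMEHeader)
h.Set("Content-Disposition", fmt.Sprintf(`form-data; name="%s"; filename="%s"`, escapeQuotes(fieldname), escapeQuotes(filename)))
h.Set("Content-Type", "application/octet-stream")
pw, err := w.CreatePart(h) // pw is the io.Writer for the body of this Part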

(3) Write the Body content of each Part into Writer

Calling w.CreatePart() has now written the header of a Part; next we need to write the body of that Part. There are two cases, depending on the content:

<1> Body content is a file
For a file, or any other io.Reader, call io.Copy() to copy its content into the io.Writer that w.CreatePart() just returned (named pw here), as follows:

f, err := os.Open(filename)
if err != nil {
    return nil, fmt.Errorf("opening file error: %v", err)
}
defer f.Close()
// pw is the io.Writer returned by w.CreatePart(h).
if _, err = io.Copy(pw, f); err != nil {
    return nil, fmt.Errorf("copying file to pw error: %v", err)
}

Explanation:
A form is essentially a set of key-value pairs: the key is the name of the control (field) and the value is its content. For a file Part, the key is the name parameter in Content-Disposition, the filename parameter carries the name of the file, and the value is the file content written as the body of the Part.

Tip:
For the header of a file Part, Go's multipart.Writer provides the CreateFormFile() function, which wraps the process of creating the Part header. We can call w.CreateFormFile() directly to write it, as follows:

// CreateFormFile is a convenience wrapper around CreatePart. It creates
// a new form-data header with the provided field name and file name.
func (w *Writer) CreateFormFile(fieldname, filename string) (io.Writer, error) {
    h := make(textproto.MIMEHeader)
    h.Set("Content-Disposition", fmt.Sprintf(`form-data; name="%s"; filename="%s"`, escapeQuotes(fieldname), escapeQuotes(filename)))
    h.Set("Content-Type", "application/octet-stream")
    return w.CreatePart(h)
}

<2> Body content is a form field
Fields are handled much like files: first create the Part header, then write the body. multipart.Writer provides the CreateFormField() function to create the Part header; it also calls CreatePart() internally and returns an io.Writer, to which we then write the field value.

If you already have a multipart.Writer, you can simply call its WriteField() function, which wraps CreateFormField() and writes the value in one call. For example:

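func main() {
    body := &bytes.Buffer{}
    writer := multipart.NewWriter(body)
    filename := "a.txt"

    fields := map[string]string{
        "filename": filename,
        "age":      "88",
        "ip":       "198.162.5.1",
        "city":     "New York",
    }

    // Map iteration order is random, so the fields are written in no fixed order.
    for k, v := range fields {
        if err := writer.WriteField(k, v); err != nil {
            fmt.Println("Write field error: ", err)
            return
        }
    }
}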

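Note: Each Part created this way carries exactly one key-value pair; that is, one Part corresponds to one field.

In the generated request body, every field therefore appears as its own Part: a --boundary line, a Content-Disposition: form-data; name="..." header, an empty line, and then the value.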

(4) Write the closing boundary --boundary--
This step is very important. Without the closing boundary the server cannot tell where the last Part ends; Go's multipart.Reader, for example, reports an unexpected EOF while reading it.
Simply call the Close() method of multipart.Writer to write the closing boundary.

// Close finishes the multipart message and writes the trailing
// boundary end line to the output.
func (w *Writer) Close() error {
    if w.lastpart != nil {
        if err := w.lastpart.close(); err != nil {
            return err
        }
        w.lastpart = nil
    }
    _, err := fmt.Fprintf(w.w, "\r\n--%s--\r\n", w.boundary)
    return err
}

2. Server-side processing

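package main

import (
    "fmt"
    "io"
    "mime"
    "net/http"
)

func SimpleHandler(w http.ResponseWriter, r *http.Request) {
    contentType := r.Header.Get("Content-Type")
    mediatype, _, err := mime.ParseMediaType(contentType)
    if err != nil {
        http.Error(w, "bad Content-Type", http.StatusBadRequest)
        return
    }
    fmt.Println("media type:", mediatype)

    w.Header().Set("Content-Type", "text/plain")
    // w.WriteHeader(http.StatusOK)
    io.WriteString(w, "Hello world!")
    // w.Write([]byte("This is an example.\n"))
}

func main() {
    http.HandleFunc("/", SimpleHandler)
    // Listen on the port the client above sends to.
    if err := http.ListenAndServe("127.0.0.1:8080", nil); err != nil {
        fmt.Println("Server error: ", err)
    }
}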

After we get the Request, we can decide how to process the corresponding data based on the "Content-Type" in the request header.

For example, the header information of a request is as follows:

POST /t2/upload.do HTTP/1.1
User-Agent: SOHUWapRebot
Accept-Language: zh-cn,zh;q=0.5
Accept-Charset: GBK,utf-8;q=0.7,*;q=0.7
Connection: keep-alive
Content-Length: 10780
Content-Type:multipart/form-data; boundary=--------------------------9051914041544843365972754266
Host: w.sohu.com

We parse the header "Content-Type" field as follows. If it is "multipart/form-data", create a multipart.Reader based on the body of the Request.

func ReceiveHandler(w http.ResponseWriter, r *http.Request) {
    contentType := r.Header.Get("Content-Type")
    mediatype, params, err := mime.ParseMediaType(contentType)
    if err != nil {
        http.Error(w, err.Error(), http.StatusBadRequest)
        return
    }
    if mediatype == "multipart/form-data" {
        boundary := params["boundary"]
        reader := multipart.NewReader(r.Body, boundary)
        _ = reader
        // ...
    }
}
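
The header-parsing step can be tried on its own with the Content-Type value from the sample request above. This is a minimal sketch; the helper name boundaryFrom and the constant sampleContentType are assumptions introduced for illustration.

```go
package main

import (
    "fmt"
    "mime"
)

// sampleContentType is the Content-Type value of the sample request.
const sampleContentType = "multipart/form-data; boundary=--------------------------9051914041544843365972754266"

// boundaryFrom (hypothetical helper) returns the boundary parameter of a
// multipart/form-data Content-Type value.
func boundaryFrom(contentType string) (string, error) {
    mediaType, params, err := mime.ParseMediaType(contentType)
    if err != nil {
        return "", err
    }
    if mediaType != "multipart/form-data" {
        return "", fmt.Errorf("not multipart/form-data: %q", mediaType)
    }
    boundary, ok := params["boundary"]
    if !ok {
        return "", fmt.Errorf("missing boundary parameter")
    }
    return boundary, nil
}

func main() {
    boundary, err := boundaryFrom(sampleContentType)
    if err != nil {
        fmt.Println("parse error:", err)
        return
    }
    fmt.Println("boundary:", boundary)
}
```

Note that the boundary lines inside the body carry two more leading dashes than the parameter value, because every delimiter line is written as "--" followed by the boundary.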

In the code above, we end up creating a multipart.Reader through the NewReader() function:

// NewReader creates a new multipart Reader reading from r using the given MIME boundary.
// The boundary is usually obtained from the "boundary" parameter of the message's "Content-Type" header.
// Use mime.ParseMediaType to parse such headers.
func NewReader(r io.Reader, boundary string) *Reader {
    b := []byte("\r\n--" + boundary + "--")
    return &Reader{
        bufReader:        bufio.NewReaderSize(&stickyErrorReader{r: r}, peekBufferSize),
        nl:               b[:2],
        nlDashBoundary:   b[:len(b)-2],
        dashBoundaryDash: b[2:],
        dashBoundary:     b[2 : len(b)-2],
    }
}

In fact, the Request structure provides a MultipartReader() to simplify the above steps. Its source code is as follows:

func (r *Request) MultipartReader() (*multipart.Reader, error) {
    if r.MultipartForm == multipartByReader {
        return nil, errors.New("http: MultipartReader called twice")
    }
    if r.MultipartForm != nil {
        return nil, errors.New("http: multipart handled by ParseMultipartForm")
    }
    r.MultipartForm = multipartByReader
    return r.multipartReader()
}

func (r *Request) multipartReader() (*multipart.Reader, error) {
    v := r.Header.Get("Content-Type")
    if v == "" {
        return nil, ErrNotMultipart
    }
    d, params, err := mime.ParseMediaType(v)
    if err != nil || d != "multipart/form-data" {
        return nil, ErrNotMultipart
    }
    boundary, ok := params["boundary"]
    if !ok {
        return nil, ErrMissingBoundary
    }
    return multipart.NewReader(r.Body, boundary), nil
}

Now look at the definition of multipart.Reader:

// Reader is an iterator over parts in a MIME multipart body.
// Reader's underlying parser consumes its input as needed. Seeking isn't supported.
type Reader struct {
    bufReader *bufio.Reader

    currentPart *Part
    partsRead   int

    nl               []byte // "\r\n" or "\n" (set after seeing first boundary line)
    nlDashBoundary   []byte // nl + "--boundary"
    dashBoundaryDash []byte // "--boundary--"
    dashBoundary     []byte // "--boundary"
}

Analysis:
The fields of the Reader struct give us another view of a request body in the multipart/form-data format.
The struct mainly holds bufReader, currentPart, and the boundary-related fields.

  • bufReader is a buffered reader wrapped around the io.Reader passed to NewReader() (here, the request body). It is the reading counterpart of w in the Writer struct.
  • currentPart is a pointer to the Part type. As the name suggests, a Part represents a single part of the multipart body; currentPart is the one returned most recently.
  • nl, nlDashBoundary, dashBoundaryDash, and dashBoundary hold the line ending, the delimiter between Parts, and the terminator.

Let's look at the definition of the Part structure:

// A Part represents a single part in a multipart body.
type Part struct {
    Header textproto.MIMEHeader

    mr *Reader

    disposition       string
    dispositionParams map[string]string

    // r is either a reader directly reading from mr, or it's a
    // wrapper around such a reader, decoding the
    // Content-Transfer-Encoding
    r io.Reader

    n       int   // known data bytes waiting in mr.bufReader
    total   int64 // total data bytes read already
    err     error // error to return when n == 0
    readErr error // read error observed from mr.bufReader
}

In the following Part of a request body, the filename parameter in Content-Disposition tells us that it is a file Part.
The file content is binary JPEG data, so most of it shows up as unprintable characters:

--49d03132746bfd98bffc0be04783d061e8acaeec7e0054b4bea84fc0ea2c
Content-Disposition: form-data; name="file"; filename="husky.jpg"
Content-Type: application/octet-stream

JFIF ( %!%) + ...383,7(-. +


-- + + + + -- + - + + + + - + + + -- + ------ + --7--- + 77- + -- + + + 7 + + + + 76! 1AQa"qB
BHf[tTN4t'(4"?i\m=,52?1Nf%* OCW6jWIlWZ.P3< + 7V9u?
jeIp=z-v$_e\YZω4 CvXdY(?8wHv%:h?`? 1*6L + X3\9 i)z
?K{j
K{@)9>$#r'gE?-CA1V{qZ?,^SdIdWu;e\1KJJЭ
-G('db}HaHVKmU521XRjc|dFO1fY[ \WYpF9`}e

The Header field of the Part struct holds the Content-Disposition, Content-Type, and other header lines that follow the boundary line. After one empty line comes the file content.

By calling Reader's NextPart() function, we can traverse all Parts in a multipart request body. The implementation is as follows (simplified):

func (r *Reader) NextPart() (*Part, error) {
    if r.currentPart != nil {
        r.currentPart.Close()
    }

    for {
        line, err := r.bufReader.ReadSlice('\n')
        if err == io.EOF && r.isFinalBoundary(line) {
            // "--boundary--" without a trailing newline is still a valid end.
            return nil, io.EOF
        }
        if err != nil {
            return nil, fmt.Errorf("multipart: NextPart: %v", err)
        }

        if r.isBoundaryDelimiterLine(line) {
            r.partsRead++
            bp, err := newPart(r)
            if err != nil {
                return nil, err
            }
            r.currentPart = bp
            return bp, nil
        }

        if r.isFinalBoundary(line) {
            // Expected EOF
            return nil, io.EOF
        }

        return nil, fmt.Errorf("multipart: unexpected line in Next(): %q", line)
    }
}