The cause of Jsoup exception org.jsoup.UnsupportedMimeTypeException: Unhandled content type

Exception that occurred:

Exception in thread "main" org.jsoup.UnsupportedMimeTypeException: Unhandled content type. Must be text/*, application/xml, or application/xhtml+xml. Mimetype=application/json;charset=UTF-8, URL=https://xxx.com/apiname

I read several articles online that say to set ignoreContentType(true) on Jsoup's Connection, but none of them explained why, so I am recording the reason here.

Pseudo-code of the Jsoup request; the breakpoint for debugging is set on the execute() call:

 // JSON is assumed to be fastjson's com.alibaba.fastjson.JSON
 import com.alibaba.fastjson.JSON;
 import org.jsoup.Connection;
 import org.jsoup.Jsoup;

 import java.io.IOException;
 import java.util.HashMap;
 import java.util.Map;

 public class JsoupPostExample {

    public static String searchTest() throws IOException {
        // Map<String, Object> so that the integer value below compiles
        Map<String, Object> params = new HashMap<>();
        params.put("p1", "v1");
        params.put("p2", 2);
        Connection connect = Jsoup.connect("https://xxx.com/apiname");

        // Serialize the parameters into a JSON string
        String p = JSON.toJSONString(params);

        // Without ignoreContentType(true), execute() throws:
        // org.jsoup.UnsupportedMimeTypeException: Unhandled content type. Must be text/*, application/xml, or application/xhtml+xml. Mimetype=application/json;charset=UTF-8, URL=https://xxx.com/apiname
        /*
         * Jsoup sends a POST request with a JSON body to the server.
         *
         * method(POST) and requestBody(p) set the request method and body. When execute() runs, the body is written to the
         * output stream of the connection, and the server receives and parses it.
         * ignoreContentType(true) skips the check that the Content-Type of the server response is text/*, application/xml, or application/xhtml+xml.
         */
        Connection.Response execute = connect
                .method(Connection.Method.POST)
                // Set the request headers
                .header("Content-Type", "application/json; charset=UTF-8")
                .header("Accept", "text/plain, */*; q=0.01")
                .header("Accept-Encoding", "gzip,deflate,sdch")
                .header("Accept-Language", "es-ES,es;q=0.8")
                .header("Connection", "keep-alive")
                .header("X-Requested-With", "XMLHttpRequest")
                .ignoreContentType(true)
                .requestBody(p)
                .execute();

        String body = execute.body();
        System.err.println(body);
        return body;
    }
 }

Debugging process:

  1. Step 1: step into execute() at the breakpoint.

  2. Step 2: after the request completes, the Content-Type in the response (res) headers is checked.

Key points:
By default Jsoup requires the Content-Type of the server response to be text/*, application/xml, or application/xhtml+xml. The server here returns application/json;charset=UTF-8, which matches none of these, so execute() throws UnsupportedMimeTypeException. Setting ignoreContentType(true) skips this check, so the exception is not thrown and the response body can be read as a string.
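
The check can be reproduced without Jsoup on the classpath. The following is a minimal sketch: the class and method names are hypothetical, and the XML pattern is my recollection of the xmlContentTypeRxp field in Jsoup's HttpConnection, so treat it as an assumption and compare it with the Jsoup version you use.

```java
import java.util.regex.Pattern;

// Hypothetical helper that mirrors the Content-Type check in Jsoup's execute();
// it is not part of Jsoup itself.
final class ContentTypeCheck {
    // Assumed to match Jsoup's xmlContentTypeRxp; verify against your Jsoup version.
    private static final Pattern XML_CONTENT_TYPE =
            Pattern.compile("(application|text)/\\w*\\+?xml.*");

    // Returns true when execute() would throw UnsupportedMimeTypeException.
    static boolean wouldThrow(String contentType, boolean ignoreContentType) {
        return contentType != null
                && !ignoreContentType
                && !contentType.startsWith("text/")
                && !XML_CONTENT_TYPE.matcher(contentType).matches();
    }

    public static void main(String[] args) {
        String json = "application/json;charset=UTF-8";
        System.out.println(wouldThrow(json, false)); // true: the case in this article
        System.out.println(wouldThrow(json, true));  // false: the check is skipped
        System.out.println(wouldThrow("text/html; charset=UTF-8", false)); // false
        System.out.println(wouldThrow("application/xhtml+xml", false));    // false
    }
}
```

Note that a response with no Content-Type header at all (null) also passes the check, because the condition requires a non-null value before it rejects anything.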

Source code analysis:
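I added comments to the code. Focus on the four commented spots: serializing the data parameters into the URL, the call to writePost, the Content-Type check on the response, and the request-body branch inside writePost.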

 static HttpConnection.Response execute(org.jsoup.Connection.Request req, HttpConnection.Response previousResponse) throws IOException {
            Validate.notNull(req, "Request must not be null");
            String protocol = req.url().getProtocol();
            if (!protocol.equals("http") && !protocol.equals("https")) {
                throw new MalformedURLException("Only http & https protocols supported");
            } else {
                boolean methodHasBody = req.method().hasBody(); // Whether the HTTP method allows a request body (e.g. POST, PUT); GET does not
                boolean hasRequestBody = req.requestBody() != null; // Whether a raw request body was set via requestBody()
                if (!methodHasBody) {
                    Validate.isFalse(hasRequestBody, "Cannot set a request body for HTTP method " + req.method());
                }

                String mimeBoundary = null;
                if (req.data().size() > 0 && (!methodHasBody || hasRequestBody)) {
                    // There are data() parameters and either the method has no body (e.g. GET) or a raw request body is set: the data() parameters are serialized into the URL
                    serializeRequestUrl(req);
                } else if (methodHasBody) {
                    mimeBoundary = setOutputContentType(req);
                }

                HttpURLConnection conn = createConnection(req);

                HttpConnection.Response res;
                try {
                    conn.connect();
                    if (conn.getDoOutput()) {
                        // Write the parameters or request body to the request; see the writePost method below
                        writePost(req, conn.getOutputStream(), mimeBoundary);
                    }

                    int status = conn.getResponseCode();
                    res = new HttpConnection.Response(previousResponse);
                    res.setupFromConnection(conn, previousResponse);
                    res.req = req;
                    String contentType;
                    if (res.hasHeader("Location") && req.followRedirects()) {
                        if (status != 307) {
                            req.method(Method.GET);
                            req.data().clear();
                        }

                        contentType = res.header("Location");
                        if (contentType != null && contentType.startsWith("http:/") && contentType.charAt(6) != '/') {
                            contentType = contentType.substring(6);
                        }

                        URL redir = StringUtil.resolve(req.url(), contentType);
                        req.url(HttpConnection.encodeUrl(redir));
                        Iterator var11 = res.cookies.entrySet().iterator();

                        while(var11.hasNext()) {
                            Entry<String, String> cookie = (Entry)var11.next();
                            req.cookie((String)cookie.getKey(), (String)cookie.getValue());
                        }

                        HttpConnection.Response var21 = execute(req, res);
                        return var21;
                    }

                    if ((status < 200 || status >= 400) && !req.ignoreHttpErrors()) {
                        throw new HttpStatusException("HTTP error fetching URL", status, req.url().toString());
                    }

                    // Check whether the response Content-Type is one Jsoup accepts; skipped when ignoreContentType(true) is set
                    contentType = res.contentType();
                    if (contentType != null && !req.ignoreContentType() && !contentType.startsWith("text/") && !xmlContentTypeRxp.matcher(contentType).matches()) {
                        throw new UnsupportedMimeTypeException("Unhandled content type. Must be text/*, application/xml, or application/xhtml+xml", contentType, req.url().toString());
                    }

                    if (contentType != null && xmlContentTypeRxp.matcher(contentType).matches() && req instanceof HttpConnection.Request && !((HttpConnection.Request)req).parserDefined) {
                        req.parser(Parser.xmlParser());
                    }

                    res.charset = DataUtil.getCharsetFromContentType(res.contentType);
                    if (conn.getContentLength() != 0 && req.method() != Method.HEAD) {
                        Object bodyStream = null;

                        try {
                            bodyStream = conn.getErrorStream() != null ? conn.getErrorStream() : conn.getInputStream();
                            if (res.hasHeaderWithValue("Content-Encoding", "gzip")) {
                                bodyStream = new GZIPInputStream((InputStream)bodyStream);
                            }

                            res.byteData = DataUtil.readToByteBuffer((InputStream)bodyStream, req.maxBodySize());
                        } finally {
                            if (bodyStream != null) {
                                ((InputStream)bodyStream).close();
                            }

                        }
                    } else {
                        res.byteData = DataUtil.emptyByteBuffer();
                    }
                } finally {
                    conn.disconnect();
                }

                res.executed = true;
                return res;
            }
        }
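
The branch in execute() that decides what happens to the parameters before the connection is opened is easy to misread, so here it is isolated as a small sketch. The class, enum, and method names are hypothetical; only the boolean condition is taken from the source above.

```java
// Hypothetical helper that isolates the branch in execute() which runs
// before createConnection(req); it is not part of Jsoup itself.
final class ParamPlacement {
    enum Action { SERIALIZE_DATA_TO_URL, PREPARE_BODY, NOTHING }

    static Action decide(int dataSize, boolean methodHasBody, boolean hasRequestBody) {
        if (dataSize > 0 && (!methodHasBody || hasRequestBody)) {
            return Action.SERIALIZE_DATA_TO_URL; // corresponds to serializeRequestUrl(req)
        } else if (methodHasBody) {
            return Action.PREPARE_BODY;          // corresponds to setOutputContentType(req)
        }
        return Action.NOTHING;
    }

    public static void main(String[] args) {
        // The request in this article: POST, requestBody(p) set, no data() parameters
        System.out.println(decide(0, true, true));   // PREPARE_BODY
        // GET with data() parameters: they end up in the URL
        System.out.println(decide(2, false, false)); // SERIALIZE_DATA_TO_URL
        // POST with both data() parameters and a raw request body
        System.out.println(decide(2, true, true));   // SERIALIZE_DATA_TO_URL
    }
}
```

In the last case the data() parameters go into the URL while the raw request body is still what writePost sends, because a request cannot carry both in its body.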



private static void writePost(org.jsoup.Connection.Request req, OutputStream outputStream, String bound) throws IOException {
            Collection<org.jsoup.Connection.KeyVal> data = req.data();
            BufferedWriter w = new BufferedWriter(new OutputStreamWriter(outputStream, req.postDataCharset()));
            if (bound != null) {
                for(Iterator var5 = data.iterator(); var5.hasNext(); w.write("\r\n")) {
                    org.jsoup.Connection.KeyVal keyVal = (org.jsoup.Connection.KeyVal)var5.next();
                    w.write("--");
                    w.write(bound);
                    w.write("\r\n");
                    w.write("Content-Disposition: form-data; name=\"");
                    w.write(HttpConnection.encodeMimeName(keyVal.key()));
                    w.write("\"");
                    if (keyVal.hasInputStream()) {
                        w.write("; filename=\"");
                        w.write(HttpConnection.encodeMimeName(keyVal.value()));
                        w.write("\"\r\nContent-Type: application/octet-stream\r\n\r\n");
                        w.flush();
                        DataUtil.crossStreams(keyVal.inputStream(), outputStream);
                        outputStream.flush();
                    } else {
                        w.write("\r\n\r\n");
                        w.write(keyVal.value());
                    }
                }

                w.write("--");
                w.write(bound);
                w.write("--");
            } else if (req.requestBody() != null) {
                // Include the request body, write the request parameters to outputStream, and submit the outputStream to the server for parsing
                w.write(req.requestBody());
            } else {
                boolean first = true;
                Iterator var9 = data.iterator();

                while(var9.hasNext()) {
                    org.jsoup.Connection.KeyVal keyVal = (org.jsoup.Connection.KeyVal)var9.next();
                    if (!first) {
                        w.append('&');
                    } else {
                        first = false;
                    }

                    w.write(URLEncoder.encode(keyVal.key(), req.postDataCharset()));
                    w.write(61);
                    w.write(URLEncoder.encode(keyVal.value(), req.postDataCharset()));
                }
            }

            w.close();
        }
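
For comparison with the request-body branch, the last branch of writePost (no multipart boundary and no raw request body) writes the data parameters as a URL-encoded form. A minimal standalone sketch of that encoding follows, using only the JDK; the class and method names are hypothetical, and a Map stands in for Jsoup's KeyVal collection.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that imitates the form-encoding branch of writePost;
// it is not part of Jsoup itself.
final class FormBodySketch {
    static String encodeForm(Map<String, String> data, String charset) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        try {
            for (Map.Entry<String, String> e : data.entrySet()) {
                if (!first) {
                    sb.append('&');
                } else {
                    first = false;
                }
                sb.append(URLEncoder.encode(e.getKey(), charset));
                sb.append('='); // the decompiled source shows this as w.write(61)
                sb.append(URLEncoder.encode(e.getValue(), charset));
            }
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, ex);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> data = new LinkedHashMap<>();
        data.put("p1", "v1");
        data.put("q", "a b&c");
        System.out.println(encodeForm(data, "UTF-8")); // p1=v1&q=a+b%26c
    }
}
```

This is why a JSON payload has to go through requestBody(): parameters added with data() are sent as key=value pairs, not as a JSON document.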

Reference article:
https://blog.csdn.net/dietime1943/article/details/78974194