Build your own HTTP server in Java in less than an hour, supporting only the GET method

Original address: https://dev.to/mateuszjarzyna/build-your-own-http-server-in-java-in-less-than-one-hour-only-get-method-2k02

Introduction to HTTP protocol

  • The most frequently used protocol on the entire Internet
  • In the OSI model, it belongs to layer 7 (the application layer)
Every time you visit a website, your web browser uses the HTTP protocol to communicate with the web server and obtain the content of the page. And, when you implement a backend application and have to communicate with other backend applications – in 80% (or more) of the cases you will use HTTP.
Long story short: if you want to be a good software developer, you have to understand how the HTTP protocol works. And writing an HTTP server is a pretty good way to understand it, I think.

What does the web browser send to the web server?

Good question. Of course, you can use the browser’s “Developer Tools”. Let’s give it a try.
Hmm... but what now? What does this mean? We can see some URLs, methods, statuses, a version (what?), headers and so on. Is that useful? Yes, but only for analyzing web application problems. We still don’t know how HTTP works.

Wireshark, my old friend

The source of truth. Wireshark is an application used to analyze network traffic. You can use it to view every packet sent by (or to) your PC.
But let’s be honest – if you know how to use Wireshark – you probably know how HTTP and TCP work. This is a fairly advanced program.

You are right – protocol specification

Every good (and I mean – used by more than 5 people) protocol should have specifications. In this case, it’s called an RFC. But don’t lie – you’ll never read it because it’s too long – https://tools.ietf.org/html/rfc2616

Just run the server and test

Are you kidding? No. You probably already have a very powerful tool installed on your PC: netcat.
One of netcat's features is a TCP server. You can run netcat to listen on a specific port and print everything it receives. Netcat is a command line application.
nc -v -l -p 8080
Netcat (nc), please listen (-l) on port 8080 (-p 8080) and print everything (-v). Some netcat variants, such as the BSD one shipped with macOS, may expect nc -v -l 8080 instead.
Now open your web browser and go to http://localhost:8080/. Your browser will send an HTTP request to the server run by netcat. Of course, nc will print the entire request and ignore it, so the browser will wait for a response (and time out soon). Press ctrl + c to terminate nc.
So, finally, we have an HTTP request!
GET / HTTP/1.1
Host: localhost:8080
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:74.0) Gecko/20100101 Firefox/74.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Cookie: JSESSIONID=D3AF43EBFC0C9D92AD9C37823C4BB299
Upgrade-Insecure-Requests: 1
As you can see – it’s entirely a text protocol. There are no bits to parse, just plain text.

HTTP request

This can be a bit confusing. Maybe nc parses the request before printing it? Surely HTTP must be more complex than that; where is the sequence of 0’s and 1’s? There isn’t one. HTTP is really very simple, it is a text protocol. There’s just one little gotcha (which I’ll explain at the end of this section).
We can divide the request into 4 main parts:
GET / HTTP/1.1
This is the request line.
GET: the HTTP method. You probably know there are several of them. GET means “give me”.
/: the resource. A bare / means the default one.
When you open localhost:8080/my_gf_nudes.html, the resource will be /my_gf_nudes.html
HTTP/1.1 – HTTP version. There are several versions, 1.1 is commonly used.
Host: localhost:8080
Host. A server can host multiple domains, using this field the browser tells the server the exact domain it wants.
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:74.0) Gecko/20100101 Firefox/74.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Cookie: JSESSIONID=D3AF43EBFC0C9D92AD9C37823C4BB299
Upgrade-Insecure-Requests: 1
Headers. In short: some extra information. I believe you know what a header is.

Surprise: a blank line. This means: end of request. In general, a blank line in HTTP indicates the end of a section.

Trap

In HTTP, each line separator is the Windows-style newline. Yes, \r\n, not \n. Remember that.
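
To make the trap concrete, here is a minimal Java sketch. The class `CrlfDemo` and its two helpers are made up for this illustration and are not part of the server built later. It checks for the blank line that ends a request and splits the text on `\r\n` only.

```java
// Illustrative sketch; the class and method names are made up for this example.
class CrlfDemo {

    // A request is complete once the blank line (CRLF followed by CRLF) has arrived.
    static boolean isComplete(String raw) {
        return raw.contains("\r\n\r\n");
    }

    // Splits on CRLF only; a lone \n is not treated as a separator here.
    // String.split drops the trailing empty strings produced by the final blank line.
    static String[] lines(String raw) {
        return raw.split("\r\n");
    }

    public static void main(String[] args) {
        String raw = "GET / HTTP/1.1\r\n"
                + "Host: localhost:8080\r\n"
                + "\r\n";
        System.out.println("complete: " + isComplete(raw));
        for (String line : lines(raw)) {
            System.out.println("[" + line + "]");
        }
    }
}
```

The same text with bare `\n` separators would never be reported as complete by this check, which is exactly the kind of bug the gotcha leads to.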

Response

OK, we have a request. What does the response look like? Send a request to any server and see, it doesn’t get any easier than that.
On your laptop you can find another very useful tool: telnet. Using telnet, you can open a TCP connection, write something to the server and print the response.
Try doing it yourself. Run telnet google.com 80 (80 is the default HTTP port) and type the request manually (you know what it should look like). To close the connection, press ctrl + ] and then type quit.
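
If you would rather stay in Java, the telnet session can be approximated with a plain `Socket`. This is only a sketch: `RawHttpClient`, `buildGetRequest` and `fetch` are names invented here, and the `Connection: close` header is my addition so that the server closes the connection and the read loop ends.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the manual telnet session; all names are illustrative.
class RawHttpClient {

    // The request we would otherwise type by hand, with explicit \r\n line endings.
    static String buildGetRequest(String host, String path) {
        return "GET " + path + " HTTP/1.1\r\n"
                + "Host: " + host + "\r\n"
                + "Connection: close\r\n" // ask the server to close when it is done
                + "\r\n";
    }

    // Opens a TCP connection, sends the request and returns everything the server answers.
    static String fetch(String host, int port, String path) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            OutputStream out = socket.getOutputStream();
            out.write(buildGetRequest(host, path).getBytes(StandardCharsets.US_ASCII));
            out.flush();
            // readAllBytes() returns once the server closes its side of the connection.
            return new String(socket.getInputStream().readAllBytes(), StandardCharsets.ISO_8859_1);
        }
    }

    public static void main(String[] args) throws IOException {
        // Same target as the telnet example; needs network access.
        System.out.println(fetch("google.com", 80, "/"));
    }
}
```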

Alright, we got a response:
HTTP/1.1 301 Moved Permanently
Location: http://www.google.com/
Content-Type: text/html; charset=UTF-8
Date: Wed, 25 Mar 2020 18:53:12 GMT
Expires: Fri, 24 Apr 2020 18:53:12 GMT
Cache-Control: public, max-age=2592000
Server: gws
Content-Length: 219
X-XSS-Protection: 0
X-Frame-Options: SAMEORIGIN

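<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="http://www.google.com/">here</A>.
</BODY></HTML>

We can divide it into 4 parts:
HTTP/1.1 301 Moved Permanently
The status line.
HTTP/1.1: the version.
301: the status code. I’m sure you’re familiar with these.
Moved Permanently: the human-readable form of the status code.
Location: http://www.google.com/
Content-Type: text/html; charset=UTF-8
Date: Wed, 25 Mar 2020 18:53:12 GMT
Expires: Fri, 24 Apr 2020 18:53:12 GMT
Cache-Control: public, max-age=2592000
Server: gws
Content-Length: 219
X-XSS-Protection: 0
X-Frame-Options: SAMEORIGIN
Headers.

A blank line, indicating that the content will be sent in the next section.

<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="http://www.google.com/">here</A>.
</BODY></HTML>
Content: HTML, binary data or anything else.

In this simple example an empty line closes the *response*. (Real servers rely on the Content-Length header or on closing the connection to mark the end of the content.)
**Remember: each new line is \r\n.**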
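
The breakdown above can be sketched in Java as follows. `ResponseParts` and its field names are not from the article, they are only an illustration of where the status line, the headers and the content sit in the raw text.

```java
import java.util.Arrays;

// Illustrative sketch; class and field names are assumptions made for this example.
class ResponseParts {
    final String version;
    final int statusCode;
    final String reason;
    final String[] headerLines;
    final String content;

    ResponseParts(String raw) {
        // The first blank line separates the status line and headers from the content.
        int blank = raw.indexOf("\r\n\r\n");
        String head = blank >= 0 ? raw.substring(0, blank) : raw;
        content = blank >= 0 ? raw.substring(blank + 4) : "";

        String[] lines = head.split("\r\n");
        // Status line: version, status code, human-readable reason (which may contain spaces).
        String[] status = lines[0].split(" ", 3);
        version = status[0];
        statusCode = Integer.parseInt(status[1]);
        reason = status.length > 2 ? status[2] : "";
        headerLines = Arrays.copyOfRange(lines, 1, lines.length);
    }

    public static void main(String[] args) {
        ResponseParts r = new ResponseParts(
                "HTTP/1.1 301 Moved Permanently\r\n"
                + "Location: http://www.google.com/\r\n"
                + "Server: gws\r\n"
                + "\r\n"
                + "(content goes here)");
        System.out.println(r.version + " | " + r.statusCode + " | " + r.reason);
        System.out.println(Arrays.toString(r.headerLines));
        System.out.println(r.content);
    }
}
```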

Programming time!

We know what the request looks like, we know what the response looks like, now it’s time to implement our server.

What do we want to achieve

We want to achieve a very simple thing – display an HTML page and an image in the browser.
Let’s prepare two HTML files and an image
$ pwd
/tmp/www
$ ls
gallery.html index.html me.jpg
$ cat index.html

My homepage!

Welcome!

Here you can look at my pictures

$ cat gallery.html
Gallery

My sexi photos

## Plan

The plan is very simple:

  • Open a TCP socket and listen
  • Accept a client and read the request
  • Parse the request
  • Find the requested resource on disk
  • Send the response
  • Test

## Open TCP Socket

In this article, we will use the ServerSocket class to handle TCP connections. As homework, you can reimplement the server using classes from the nio package.
So, open your IDE and let's get started.

```java
public static void main(String[] args) throws Exception {
    try (ServerSocket serverSocket = new ServerSocket(8080)) {
        while (true) {
            // implement client handler here
        }
    }
}
```

I want the code to be concise and clean, which is why I use `throws Exception` instead of implementing proper exception handling.
As I said, we have to open the socket on port 8080 (why not 80? Because using a low port requires root privileges). We also need an infinite loop to keep the server running.
Test the socket using `telnet`:

```
telnet localhost 8080
```

Perfect, it works.

Accept client connection

try (ServerSocket serverSocket = new ServerSocket(8080)) {
    while (true) {
        try (Socket client = serverSocket.accept()) {
            handleClient(client);
        }
    }
}

To accept a connection from a client, we must call the blocking accept() method. The Java program will wait for the client at this line.
It’s time to implement the client handler:

private static void handleClient(Socket client) throws IOException {
    System.out.println("Debug: got new client " + client.toString());
    BufferedReader br = new BufferedReader(new InputStreamReader(client.getInputStream()));

    StringBuilder requestBuilder = new StringBuilder();
    String line;
    // Read until the blank line that ends the request (or until the client disconnects,
    // in which case readLine() returns null).
    while ((line = br.readLine()) != null && !line.isBlank()) {
        requestBuilder.append(line).append("\r\n");
    }

    String request = requestBuilder.toString();
    System.out.println(request);
}

We have to read the request. How? Just read from the input stream of the client's socket. It's not that simple in Java, which is why I wrote this ugly line of code: `new BufferedReader(new InputStreamReader(client.getInputStream()))`.
Okay, Java.
A request ends with a blank line (`\r\n`), remember? The client sends that empty line but keeps the connection open, so we must stop reading as soon as we receive it instead of waiting for the end of the stream.
Run the server, go to http://localhost:8080/ and watch the logs.
It works! We can log the entire request!

## Parse the request
Parsing the request is really simple, and I don't think it needs much explanation:
```java
// Needs java.util.List and java.util.ArrayList.
String request = requestBuilder.toString();
String[] requestsLines = request.split("\r\n");
String[] requestLine = requestsLines[0].split(" ");
String method = requestLine[0];
String path = requestLine[1];
String version = requestLine[2];
String host = requestsLines[1].split(" ")[1];

List<String> headers = new ArrayList<>();
for (int h = 2; h < requestsLines.length; h++) {
    String header = requestsLines[h];
    headers.add(header);
}

String accessLog = String.format("Client %s, method %s, path %s, version %s, host %s, headers %s",
        client.toString(), method, path, version, host, headers.toString());
System.out.println(accessLog);
```

Just some splitting. The only thing that may need explaining is why the loop starts from 2: the first line (index 0) is GET / HTTP/1.1 and the second line is the Host header, so the remaining headers start on the third line of the request. This relies on the browser sending Host right after the request line, which browsers normally do.
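
If you do not want to depend on the position of the Host line, one option is to collect all headers into a map. The sketch below is an alternative to the parsing shown in this section, not the article's code; `HeaderParser` and `parseHeaders` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical alternative to positional parsing; names are illustrative.
class HeaderParser {

    // Turns the "Name: value" lines that follow the request line into a map.
    // Names are lower-cased because HTTP header names are case-insensitive.
    static Map<String, String> parseHeaders(String request) {
        Map<String, String> headers = new LinkedHashMap<>();
        String[] lines = request.split("\r\n");
        for (int i = 1; i < lines.length; i++) { // index 0 is the request line
            int colon = lines[i].indexOf(':');
            if (colon <= 0) {
                continue; // skip malformed lines instead of failing
            }
            String name = lines[i].substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = lines[i].substring(colon + 1).trim();
            headers.put(name, value);
        }
        return headers;
    }

    public static void main(String[] args) {
        String request = "GET / HTTP/1.1\r\n"
                + "User-Agent: demo\r\n"
                + "Host: localhost:8080\r\n";
        // Host is found even though it is not the second line here.
        System.out.println(parseHeaders(request).get("host"));
    }
}
```

Splitting on the first colon only matters for values such as `localhost:8080`, which contain a colon themselves.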

Send response

We will send the response to the client’s output stream.

OutputStream clientOutput = client.getOutputStream();
clientOutput.write("HTTP/1.1 200 OK\r\n".getBytes());
// The header name is Content-Type, with a hyphen.
clientOutput.write("Content-Type: text/html\r\n".getBytes());
clientOutput.write("\r\n".getBytes());
clientOutput.write("<b>It works!</b>".getBytes());
clientOutput.write("\r\n\r\n".getBytes());
clientOutput.flush();
client.close();

Remember what the response should look like?
version status code
headers
(empty line)
content
(empty line)
Don’t forget to close the output stream.
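
The Google response earlier also carried a Content-Length header, which tells the client exactly how many bytes of content follow. Here is a hedged sketch of assembling such a response in memory; `ResponseBuilder` and `build` are hypothetical names and this is an optional refinement, not what the article's server does.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not from the article: assembles a complete response as bytes.
class ResponseBuilder {

    static byte[] build(String status, String contentType, byte[] content) {
        String head = "HTTP/1.1 " + status + "\r\n"
                + "Content-Type: " + contentType + "\r\n"
                + "Content-Length: " + content.length + "\r\n" // size of the content in bytes
                + "\r\n"; // blank line between headers and content
        byte[] headBytes = head.getBytes(StandardCharsets.US_ASCII);

        // Concatenate head and content without converting the content to a String,
        // so binary files such as images survive untouched.
        byte[] response = new byte[headBytes.length + content.length];
        System.arraycopy(headBytes, 0, response, 0, headBytes.length);
        System.arraycopy(content, 0, response, headBytes.length, content.length);
        return response;
    }

    public static void main(String[] args) {
        byte[] body = "<b>It works!</b>".getBytes(StandardCharsets.UTF_8);
        System.out.print(new String(build("200 OK", "text/html", body), StandardCharsets.UTF_8));
    }
}
```

With Content-Length present, the client does not have to wait for the connection to close to know the content is complete.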

Find the requested resource

We first need to implement two methods:

private static String guessContentType(Path filePath) throws IOException {
    return Files.probeContentType(filePath);
}

private static Path getFilePath(String path) {
    if ("/".equals(path)) {
        path = "/index.html";
    }
    return Paths.get("/tmp/www", path);
}

guessContentType – We have to tell the browser what type of content we are sending. This is called content type. Fortunately, there are built-in mechanisms in Java to handle this problem. We don’t need to write a long switch statement block.
getFilePath – Before returning the file, we need to know its location.
This condition is worth noting:

if ("/".equals(path)) {
    path = "/index.html";
}

If the user requests the default resource, then return index.html.
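
One caveat about getFilePath: a request path such as /../../etc/passwd would be resolved outside /tmp/www. A possible hardening, sketched under my own assumptions (`SafePaths`, `ROOT` and `resolve` are hypothetical names, and returning null for rejected paths is just one choice), is to normalize the path and check that it still lies under the root.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical hardened variant of getFilePath; the /tmp/www root comes from the article.
class SafePaths {
    static final Path ROOT = Paths.get("/tmp/www").toAbsolutePath().normalize();

    // Returns the resolved file, or null when the request tries to leave ROOT.
    static Path resolve(String requestPath) {
        if ("/".equals(requestPath)) {
            requestPath = "/index.html";
        }
        // Drop leading slashes so the path is resolved relative to ROOT,
        // then collapse any ".." segments before checking the result.
        String relative = requestPath.replaceFirst("^/+", "");
        Path candidate = ROOT.resolve(relative).normalize();
        return candidate.startsWith(ROOT) ? candidate : null;
    }

    public static void main(String[] args) {
        System.out.println(resolve("/"));
        System.out.println(resolve("/gallery.html"));
        System.out.println(resolve("/../../etc/passwd"));
    }
}
```

A caller could answer a null result with the same 404 response it uses for missing files.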

Send response

Remember the block of code that sends the response to the user (the block of clientOutput.write calls)? We need to move it into a method:

private static void sendResponse(Socket client, String status, String contentType, byte[] content) throws IOException {
    OutputStream clientOutput = client.getOutputStream();
    // Status line, e.g. "HTTP/1.1 200 OK", terminated by \r\n.
    clientOutput.write(("HTTP/1.1 " + status + "\r\n").getBytes());
    clientOutput.write(("Content-Type: " + contentType + "\r\n").getBytes());
    clientOutput.write("\r\n".getBytes());
    clientOutput.write(content);
    clientOutput.write("\r\n\r\n".getBytes());
    clientOutput.flush();
    client.close();
}

Okay, finally we can return the file:

Path filePath = getFilePath(path);
if (Files.exists(filePath)) {
    // file exists
    String contentType = guessContentType(filePath);
    sendResponse(client, "200 OK", contentType, Files.readAllBytes(filePath));
} else {
    // 404
    byte[] notFoundContent = "<h1>Not found :(</h1>".getBytes();
    sendResponse(client, "404 Not Found", "text/html", notFoundContent);
}
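
One detail the snippet above glosses over: Files.probeContentType may return null when it cannot determine the type, and the header would then read "Content-Type: null". A small wrapper, sketched here with a fallback value that is my assumption rather than the article's choice, avoids that.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical wrapper around guessContentType; the fallback value is an assumption.
class ContentTypes {
    static final String FALLBACK = "application/octet-stream";

    // Never returns null: unknown or unreadable files get the generic binary type.
    static String of(Path filePath) {
        try {
            String type = Files.probeContentType(filePath);
            return type != null ? type : FALLBACK;
        } catch (IOException e) {
            return FALLBACK;
        }
    }

    public static void main(String[] args) {
        // The results depend on the platform's file type detectors.
        System.out.println(of(Paths.get("/tmp/www/index.html")));
        System.out.println(of(Paths.get("/tmp/www/file.unknownext")));
    }
}
```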

Done!

Finally, we can view the HTML page served by our web server!

Homework

  • Change it to multithreading.
      • Create a thread pool.
      • Move the handleClient method into a separate class and run it in a new thread.
  • Rewrite it using non-blocking IO.
  • Implement the POST method.
      • Start netcat.
      • Send some HTML form.
      • Analyze the request.
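
As a possible starting point for the first homework item, here is a sketch that hands each accepted connection to a fixed thread pool. Everything in it is an assumption on my part: the names `PooledServer` and `serve`, the pool size of 8, and the placeholder handler that always answers with the same page.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// One possible starting point for the multithreading homework; all names are illustrative.
class PooledServer {

    // Placeholder handler: reads the request up to the blank line, then sends a fixed page.
    // The parsing and file serving from the article would go here instead.
    static void handleClient(Socket client) throws IOException {
        BufferedReader in = new BufferedReader(new InputStreamReader(client.getInputStream()));
        String line;
        while ((line = in.readLine()) != null && !line.isEmpty()) {
            // ignore the request content in this sketch
        }
        byte[] body = "<b>It works!</b>".getBytes(StandardCharsets.UTF_8);
        OutputStream out = client.getOutputStream();
        out.write(("HTTP/1.1 200 OK\r\n"
                + "Content-Type: text/html\r\n"
                + "Content-Length: " + body.length + "\r\n"
                + "\r\n").getBytes(StandardCharsets.US_ASCII));
        out.write(body);
        out.flush();
    }

    // Accepts connections until the server socket is closed;
    // each client is handled by a worker thread from the pool.
    static void serve(ServerSocket serverSocket, ExecutorService pool) throws IOException {
        while (true) {
            Socket client = serverSocket.accept();
            pool.submit(() -> {
                try (client) { // closes the socket when the handler is done
                    handleClient(client);
                } catch (IOException e) {
                    System.err.println("Client failed: " + e.getMessage());
                }
            });
        }
    }

    public static void main(String[] args) throws IOException {
        ExecutorService pool = Executors.newFixedThreadPool(8); // pool size chosen arbitrarily
        try (ServerSocket serverSocket = new ServerSocket(8080)) {
            serve(serverSocket, pool);
        }
    }
}
```

The accept loop stays single-threaded; only the per-client work moves to the pool, so a slow client no longer blocks everyone else.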