Performance comparison of seven WebSocket frameworks

Original article: colobu.com

The previous article, which implemented a server with one million persistent websocket connections using four frameworks, introduced the testing method and baseline data for those four websocket frameworks. Since then I have implemented the websocket push server prototype with several more frameworks and tested all seven implementations. This article records the test results and some analysis of them.
The seven frameworks are:

Netty
Undertow
Jetty
Vert.x
Grizzly
spray-websocket
nodejs-websocket/Node.js

I recently added an eighth implementation written in Go, and it performed quite well.

1. Test environment

Three AWS c3.4xlarge servers were used for testing: one as the server and two as client machines. Each client machine runs 10 clients, for a total of 20 clients.
The c3.4xlarge configuration, alongside the rest of the C3 family, is as follows:

Model	vCPU	Memory (GiB)	SSD Storage (GB)
c3.large	2	3.75	2 x 16
c3.xlarge	4	7.5	2 x 40
c3.2xlarge	8	15	2 x 80
c3.4xlarge	16	30	2 x 160
c3.8xlarge	32	60	2 x 320

The server and client machines were tuned with the basic optimizations described in the previous article.

The test was configured as follows:

20 clients
Each client sets up connections at 500 requests/second, so the total setup rate is 500 * 20 = 10,000 requests/second
Each client is responsible for establishing 50,000 websocket connections
Once all 1,000,000 websockets are established, the server sends one message (a timestamp) to all clients. Each client calculates latency from the timestamp.
If the server's setup rate becomes very slow, the test is stopped manually
Performance indicators are monitored in three stages: during setup, while the application is idle after setup completes, and while messages are being sent
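
The arithmetic behind this configuration, and the way a client could derive latency from the timestamp message, can be sketched roughly as follows. This is an illustration rather than the actual test client: the helper names are hypothetical, and it assumes the timestamp is in epoch milliseconds and that the server and client clocks are synchronized, neither of which is stated above.

```python
import time

CLIENTS = 20                # 10 clients on each of the 2 client machines
RATE_PER_CLIENT = 500       # connection requests per second, per client
CONNS_PER_CLIENT = 50_000   # websocket connections opened by each client

total_rate = CLIENTS * RATE_PER_CLIENT        # 10,000 requests/second
total_conns = CLIENTS * CONNS_PER_CLIENT      # 1,000,000 connections
min_setup_seconds = total_conns / total_rate  # 100 s if the server keeps up


def now_ms() -> int:
    """Current wall-clock time in epoch milliseconds."""
    return int(time.time() * 1000)


def latency_ms(sent_ms: int, received_ms: int) -> int:
    """Hypothetical helper: receive time minus the timestamp carried in the message.

    Assumes both values are epoch milliseconds taken from synchronized clocks.
    """
    return received_ms - sent_ms
```

At the configured rate, setup cannot finish in less than about 100 seconds, so a run that takes far longer than that indicates the server is falling behind.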

2. Test results

2.1, Netty

During setup

CPU idle: 90%
minor GC: few
full GC: none

After setup, application idle

CPU idle: 100%
memory usage: 1.68G
server free memory: 16.3G

While sending messages

CPU idle: 75%
minor GC: few
full GC: none

Message latency (one client)

count = 50000
         min = 0
         max = 18301
        mean = 2446.09
      stddev = 3082.11
      median = 1214.00
        75% <= 3625.00
        95% <= 8855.00
        98% <= 12069.00
        99% <= 13274.00
      99.9% <= 18301.00
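
A summary with the same fields as the latency tables in this section could be produced from one client's raw latency samples along these lines. This is only a sketch: the percentile method (nearest rank) and the use of the population standard deviation are assumptions, since the reporter used in the test may estimate these values differently.

```python
import math
import statistics


def percentile(sorted_values, q):
    """Nearest-rank percentile: the smallest sample with at least q% of samples <= it."""
    rank = max(1, math.ceil(q * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(latencies):
    """Return the fields shown in the latency tables for a non-empty list of samples."""
    values = sorted(latencies)
    summary = {
        "count": len(values),
        "min": values[0],
        "max": values[-1],
        "mean": statistics.mean(values),
        "stddev": statistics.pstdev(values),  # population stddev; an assumption
        "median": percentile(values, 50),
    }
    for q in (75, 95, 98, 99, 99.9):
        summary[f"{q}%"] = percentile(values, q)
    return summary
```

With 50,000 samples per client, the 99.9% line covers all but the slowest 50 messages, which is why it sits so close to the maximum in these tables.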

2.2, Vert.x

During setup

CPU idle: 95%
minor GC: few
full GC: none

After setup, application idle

CPU idle: 100%
memory usage: 6.37G
server free memory: 16.3G

While sending messages

CPU idle: 47% to 76%
minor GC: few
full GC: few

Message latency (one client)

count = 50000
         min = 49
         max = 18949
        mean = 10427.00
      stddev = 5182.72
      median = 10856.00
        75% <= 14934.00
        95% <= 17949.00
        98% <= 18458.00
        99% <= 18658.00
      99.9% <= 18949.00

2.3, Undertow

During setup

CPU idle: 90%
minor GC: few
full GC: none

After setup, application idle

CPU idle: 100%
memory usage: 4.02G
server free memory: 14.2G

While sending messages

CPU idle: 65%
minor GC: few
full GC: none

Message latency (one client)

count = 50000
         min = 1
         max = 11948
        mean = 1366.86
      stddev = 2007.77
      median = 412.00
        75% <= 2021.00
        95% <= 5838.00
        98% <= 7222.00
        99% <= 8051.00
      99.9% <= 11948.00

2.4, Jetty

During setup

CPU idle: 2%
minor GC: many
full GC: none
memory usage: 5G
server free memory: 17.2G

At about 360,000 websockets, setup became very slow and GC ran frequently. New websockets could no longer be established normally, so I stopped the test.

2.5, Grizzly

During setup

CPU idle: 20%
minor GC: some
full GC: some
memory usage: 11.5G
server free memory: 12.3G

At about 500,000 websockets, setup became very slow and GC ran frequently. New websockets could no longer be established normally, so I stopped the test.

2.6, Spray

During setup

CPU idle: 80%
minor GC: many
full GC: none

At about 500,000 websockets, setup became very slow and GC ran frequently. New websockets could no longer be established normally, so I stopped the test.

2.7, Node.js

During setup

CPU idle: 94%

After setup, application idle

CPU idle: 100%
memory usage: 5.0G
server free memory: 16.3G

While sending messages

CPU idle: 94%

Message latency (one client)

count = 50000
         min = 0
         max = 18
        mean = 1.27
      stddev = 3.08
      median = 1.00
        75% <= 1.00
        95% <= 1.00
        98% <= 1.00
        99% <= 1.00
      99.9% <= 15.00

2.8, Go

During setup

CPU idle: 94%

After setup, application idle

CPU idle: 100%
memory usage: 15G
server free memory: 6G

While sending messages

CPU idle: 94%

Message latency (one client)

count = 50000
         min = 0
         max = 35
        mean = 1.89
      stddev = 1.83
      median = 1.00
        75% <= 1.00
        95% <= 2.00
        98% <= 2.00
        99% <= 4.00
      99.9% <= 34.00

3. Test result analysis

Netty, Go, Node.js, Undertow and Vert.x can all establish one million connections normally. Jetty, Grizzly and Spray failed to reach one million connections.
Netty performs best. Its memory usage is very small and its CPU usage is not high. Its memory usage in particular is much lower than that of the other frameworks.
Jetty, Grizzly and Spray generate a large number of intermediate objects, which results in frequent garbage collection. Jetty performs worst.
Node.js performs very well. The test used a single instance with a single thread, yet connections were established quickly and message latency was very low. Memory usage is also good.
Undertow also performs well. Its memory usage is higher than Netty's, but the rest is about the same.
Another weakness of Spray is not captured by this test: with a large number of connections, Spray uses about 40% of CPU time even when no messages are being sent.
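
To put the memory comparison in per-connection terms, the idle memory figures from section 2 can be divided by the one million established connections. The sketch below does this under two assumptions: that "G" means 10^9 bytes, and that the whole reported figure is attributed to connections (it also includes the runtime's baseline footprint, so these are upper bounds).

```python
ESTABLISHED = 1_000_000  # connections held once setup completed

# Idle memory usage after setup, as reported in section 2 (in G).
idle_memory_g = {
    "Netty": 1.68,
    "Undertow": 4.02,
    "Node.js": 5.0,
    "Vert.x": 6.37,
    "Go": 15.0,
}


def kb_per_connection(memory_g, connections=ESTABLISHED):
    """Approximate KB per connection, assuming 1 G = 10**9 bytes and 1 KB = 1000 bytes."""
    return memory_g * 1e9 / connections / 1e3


per_conn = {name: round(kb_per_connection(g), 2) for name, g in idle_memory_g.items()}
ranking = sorted(per_conn, key=per_conn.get)  # lowest memory per connection first
```

Because exactly one million connections were established, the reported figure in G maps directly to roughly the same number of KB per connection, from about 1.68 KB for Netty up to about 15 KB for Go.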