Performance comparison of seven WebSocket frameworks

Original article: colobu.com

The previous article used four frameworks to implement a server holding one million persistent websocket connections, and described the test method and baseline figures for those four frameworks. Since then I have implemented the prototype websocket push server with several more frameworks and tested all seven implementations. This article records the test results and some analysis of them.
The seven frameworks are:

  • Netty
  • Undertow
  • Jetty
  • Vert.x
  • Grizzly
  • spray-websocket
  • nodejs-websocket/Node.js

I later added an eighth implementation written in Go, which performed quite well.

  • Go

1. Test environment

Three AWS c3.4xlarge instances were used for testing: one as the server and two as client machines. Each client machine runs 10 clients, for a total of 20 clients.
The c3 instance family is configured as follows (the tests used c3.4xlarge):

Model        vCPU   Memory (GiB)   SSD Storage (GB)
c3.large     2      3.75           2 x 16
c3.xlarge    4      7.5            2 x 40
c3.2xlarge   8      15             2 x 80
c3.4xlarge   16     30             2 x 160
c3.8xlarge   32     60             2 x 320

The server and client machines were tuned as described in the previous article.

The following is the configuration data for the test:

  • 20 clients
  • Connection setup rate: 500 requests/second per client * 20 clients = 10,000 requests/second in total
  • Each client is responsible for establishing 50,000 websocket connections
  • Once all 1,000,000 websockets are established, the server sends one message (a timestamp) to all clients. Each client calculates latency from the timestamp.
  • If the server's setup rate becomes very slow, the test is stopped manually
  • Performance indicators are monitored in three stages: during setup, while the application is idle after setup completes, and while messages are being sent
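
The configuration above implies a lower bound on how long setup can take. The following is a minimal sketch of that arithmetic, together with one way a client could derive latency from the pushed timestamp. The function name `latency` is hypothetical, and the calculation assumes the server and client clocks are synchronized, which the article does not state.

```python
# Load plan implied by the test configuration.
CLIENTS = 20
RATE_PER_CLIENT = 500        # connection requests per second, per client
CONNS_PER_CLIENT = 50_000    # websockets each client must establish

total_rate = CLIENTS * RATE_PER_CLIENT              # requests per second overall
total_connections = CLIENTS * CONNS_PER_CLIENT      # websockets overall
min_setup_seconds = total_connections / total_rate  # best case, if the server keeps up


def latency(sent_timestamp, received_timestamp):
    """Latency of one pushed message (hypothetical helper).

    sent_timestamp is the value carried in the message; received_timestamp
    is the client's clock reading on arrival. Both must use the same unit,
    and the two clocks are assumed to be synchronized.
    """
    return received_timestamp - sent_timestamp


if __name__ == "__main__":
    print(total_rate, total_connections, min_setup_seconds)
```

At the configured rate, setup cannot finish in less than 100 seconds, so a framework that takes much longer than that is limited by the server rather than by the clients.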

2. Test results

2.1 Netty

During setup

  • CPU idle: 90%
  • minor gc: few
  • full gc: none

After setup, with the application idle

  • CPU idle: 100%
  • memory usage: 1.68G
  • server free memory: 16.3G

While sending messages

  • CPU idle: 75%
  • minor gc: few
  • full gc: none
  • Message latency (one client)

             count = 50000
               min = 0
               max = 18301
              mean = 2446.09
            stddev = 3082.11
            median = 1214.00
              75% <= 3625.00
              95% <= 8855.00
              98% <= 12069.00
              99% <= 13274.00
            99.9% <= 18301.00
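
The latency summary above reports a count, minimum, maximum, mean, standard deviation, and a set of percentiles. The article does not say how the test client computes these, so the following is only a sketch of one way to produce a summary of the same shape from raw latency samples. It assumes nearest-rank percentiles and a population standard deviation; `percentile` and `summarize` are hypothetical names.

```python
import math
import statistics


def percentile(sorted_values, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    rank = max(1, math.ceil(p * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(latencies):
    """Summarize raw latency samples in the same shape as the report above."""
    data = sorted(latencies)
    summary = {
        "count": len(data),
        "min": data[0],
        "max": data[-1],
        "mean": statistics.fmean(data),
        "stddev": statistics.pstdev(data),  # population standard deviation (an assumption)
        "median": percentile(data, 50),
    }
    for p in (75, 95, 98, 99, 99.9):
        summary[f"{p}%"] = percentile(data, p)
    return summary


if __name__ == "__main__":
    for key, value in summarize(range(1, 11)).items():
        print(f"{key:>8} = {value}")
```

With only 50,000 samples per client, the 99.9th percentile is determined by the 50 slowest messages, which is why it sits at or near the maximum in these reports.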
    

2.2 Vert.x

During setup

  • CPU idle: 95%
  • minor gc: few
  • full gc: none

After setup, with the application idle

  • CPU idle: 100%
  • memory usage: 6.37G
  • server free memory: 16.3G

While sending messages

  • CPU idle: 47% to 76%
  • minor gc: few
  • full gc: few
  • Message latency (one client)

             count = 50000
               min = 49
               max = 18949
              mean = 10427.00
            stddev = 5182.72
            median = 10856.00
              75% <= 14934.00
              95% <= 17949.00
              98% <= 18458.00
              99% <= 18658.00
            99.9% <= 18949.00
    

2.3 Undertow

During setup

  • CPU idle: 90%
  • minor gc: few
  • full gc: none

After setup, with the application idle

  • CPU idle: 100%
  • memory usage: 4.02G
  • server free memory: 14.2G

While sending messages

  • CPU idle: 65%
  • minor gc: few
  • full gc: none
  • Message latency (one client)

             count = 50000
               min = 1
               max = 11948
              mean = 1366.86
            stddev = 2007.77
            median = 412.00
              75% <= 2021.00
              95% <= 5838.00
              98% <= 7222.00
              99% <= 8051.00
            99.9% <= 11948.00
    

2.4 Jetty

During setup

  • CPU idle: 2%
  • minor gc: many
  • full gc: none
  • memory usage: 5G
  • server free memory: 17.2G

At about 360,000 websockets, setup became very slow, gc was frequent, and new websockets could no longer be established normally, so the test was stopped manually.

2.5 Grizzly

During setup

  • CPU idle: 20%
  • minor gc: some
  • full gc: some
  • memory usage: 11.5G
  • server free memory: 12.3G

At about 500,000 websockets, setup became very slow, gc was frequent, and new websockets could no longer be established normally, so the test was stopped manually.

2.6 Spray

During setup

  • CPU idle: 80%
  • minor gc: many
  • full gc: none

At about 500,000 websockets, setup became very slow, gc was frequent, and new websockets could no longer be established normally, so the test was stopped manually.

2.7 Node.js

During setup

  • CPU idle: 94%

After setup, with the application idle

  • CPU idle: 100%
  • memory usage: 5.0G
  • server free memory: 16.3G

While sending messages

  • CPU idle: 94%
  • Message latency (one client)

             count = 50000
               min = 0
               max = 18
              mean = 1.27
            stddev = 3.08
            median = 1.00
              75% <= 1.00
              95% <= 1.00
              98% <= 1.00
              99% <= 1.00
            99.9% <= 15.00
    

2.8 Go

During setup

  • CPU idle: 94%

After setup, with the application idle

  • CPU idle: 100%
  • memory usage: 15G
  • server free memory: 6G

While sending messages

  • CPU idle: 94%
  • Message latency (one client)

             count = 50000
               min = 0
               max = 35
              mean = 1.89
            stddev = 1.83
            median = 1.00
              75% <= 1.00
              95% <= 2.00
              98% <= 2.00
              99% <= 4.00
            99.9% <= 34.00
    

3. Test result analysis

  • Netty, Go, Node.js, Undertow and Vert.x all established one million connections without problems. Jetty, Grizzly and Spray failed to reach one million connections.
  • Netty performed best. Its CPU usage is modest, and its memory usage (1.68G when idle) is far lower than that of the other frameworks.
  • Jetty, Grizzly and Spray generate a large number of intermediate objects, which causes frequent garbage collection. Jetty performed worst.
  • Node.js performed very well. Even though the test used a single instance running on a single thread, connections were established quickly and message latency was very low. Memory usage was also reasonable.
  • Undertow also performed well. Its memory usage is higher than Netty's, but otherwise the two are about the same.
  • One more drawback of Spray did not show up in the figures above: with a large number of connections, Spray consumes about 40% of CPU time even when no messages are being sent.
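
To make the comparison easier to check, the figures reported in section 2 can be ranked side by side. The sketch below copies the idle memory usage and mean message latency of the five frameworks that reached one million connections; the dictionary layout and the helper name `ranked` are illustrative choices, not part of the original test tooling.

```python
# Figures copied from section 2 for the frameworks that completed the test.
# Memory is the idle figure in "G"; latency is the reported mean for one client.
COMPLETED = {
    "Netty":    {"idle_memory_g": 1.68, "mean_latency": 2446.09},
    "Vert.x":   {"idle_memory_g": 6.37, "mean_latency": 10427.00},
    "Undertow": {"idle_memory_g": 4.02, "mean_latency": 1366.86},
    "Node.js":  {"idle_memory_g": 5.0,  "mean_latency": 1.27},
    "Go":       {"idle_memory_g": 15.0, "mean_latency": 1.89},
}

# Approximate connection count at which the other three tests were stopped.
STOPPED_AT = {"Jetty": 360_000, "Grizzly": 500_000, "Spray": 500_000}


def ranked(metric):
    """Return framework names ordered from lowest to highest value of metric."""
    return sorted(COMPLETED, key=lambda name: COMPLETED[name][metric])


by_memory = ranked("idle_memory_g")
by_latency = ranked("mean_latency")

if __name__ == "__main__":
    print("lowest idle memory first:", ", ".join(by_memory))
    print("lowest mean latency first:", ", ".join(by_latency))
    for name, count in STOPPED_AT.items():
        print(f"{name}: stopped at about {count:,} connections")
```

Ranked this way, Netty uses the least memory, while Node.js and Go report by far the lowest mean latency.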