Handling Spring Cloud Logs with ELK

When troubleshooting problems in production, querying logs is an indispensable step. In most of today's microservice architectures, however, logs are scattered across different machines, which makes searching them extremely difficult. As the saying goes, to do a good job you must first sharpen your tools: a unified, real-time log analysis platform is exactly the tool we need here, and it will greatly improve the efficiency of troubleshooting online problems. This article walks you through building and using the open source real-time log analysis platform ELK.

ELK Introduction

ELK is an open source real-time log analysis platform that consists of three main components: Elasticsearch, Logstash, and Kibana.

Logstash

Logstash is mainly used to collect server logs. It is an open source data collection engine with real-time pipeline capabilities. Logstash dynamically unifies data from disparate data sources and standardizes the data to a destination of your choice.


Logstash’s data collection process is divided into the following three stages (a minimal pipeline skeleton is sketched after the list):

  • Input: Data (including but not limited to logs) is often stored in different systems in different forms and formats. Logstash supports collecting data from a variety of sources (files, Syslog, MySQL, message middleware, and so on).

  • Filters: Parse and transform data in real time, identifying named fields to build structures and converting them into a common format.

  • Output: Elasticsearch is not the only choice for storage, Logstash offers many output options.
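
To make these three stages concrete, here is a minimal, illustrative pipeline skeleton. The file path, Grok pattern, and Elasticsearch address are placeholders for illustration only; they are not the configuration used later in this article.

# minimal-pipeline.conf (illustrative skeleton only)
input {
    file {
        path => ["/path/to/some.log"]   # where the raw data comes from
    }
}

filter {
    grok {
        # parse unstructured lines into named fields
        match => { "message" => "%{TIMESTAMP_ISO8601:time} %{LOGLEVEL:level} %{GREEDYDATA:msg}" }
    }
}

output {
    elasticsearch {
        hosts => "localhost:9200"       # one of many possible outputs
    }
}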

Elasticsearch

Elasticsearch (ES) is a distributed Restful-style search and data analysis engine with the following features:

  • Query: lets you perform and combine many types of searches (structured, unstructured, geo, and metric) any way you want (see the example after this list).

  • Analysis: Elasticsearch aggregations let you zoom out to see the big picture and explore trends and patterns in your data.

  • Speed: very fast; it can search across billions of records and return results in milliseconds.

  • Scalability: runs equally well on a laptop or on hundreds or thousands of servers hosting petabytes of data.

  • Resilience: designed from the ground up to run in distributed environments and tolerate failures.

  • Flexibility: suitable for many scenarios; numbers, text, geo data, structured and unstructured data are all welcome.
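
As a taste of what querying and aggregating look like, here is a hedged example against a hypothetical index named app-logs; the index and field names are placeholders and are not part of the setup built later in this article.

# Full-text search for documents whose message contains "error" (hypothetical index "app-logs")
curl "http://localhost:9200/app-logs/_search?q=message:error&pretty"

# A simple terms aggregation: count documents per log level
curl -H 'Content-Type: application/json' \
     "http://localhost:9200/app-logs/_search?size=0&pretty" \
     -d '{ "aggs": { "levels": { "terms": { "field": "level.keyword" } } } }'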

Kibana

Kibana makes massive amounts of data understandable. Its simple, browser-based interface lets you quickly create and share dynamic data dashboards to track real-time data changes in Elasticsearch. The setup process is also very simple. You can install Kibana in minutes and start exploring Elasticsearch’s indexed data – no code and no additional infrastructure required.

The above three components are specifically introduced in the article “ELK Protocol Stack Introduction and Architecture” and will not be repeated here.

In ELK, the three components work together roughly as shown in the figure below: Logstash collects logs from the various services and stores them in Elasticsearch, and Kibana then queries the logs from Elasticsearch and presents them to the end user.


Figure 1. General workflow of ELK


ELK Implementation Solution

Usually our services are deployed on different servers, so how to collect log information from multiple servers is a key point. The solution provided in this article is shown below:

Figure 2. ELK implementation provided in this article


As shown in the figure above, the entire ELK operation process is as follows:

1. Deploy a Logstash instance on each microservice host (the machines that generate logs). Acting in the Shipper role, it collects data from the log files produced by the service on that machine and pushes the messages to a Redis message queue;
2. On a separate server, deploy a Logstash instance in the Indexer role. It reads data from the Redis message queue, parses and processes it with the Filter stage of the Logstash pipeline, and outputs it to an Elasticsearch cluster for storage;
3. The Elasticsearch primary and replica nodes keep the data synchronized;
4. Deploy Kibana on a server to read the log data from Elasticsearch and display it on a web page;

Through this picture, I believe you have a rough understanding of the workflow of the ELK platform we are going to build, as well as the required components. Let’s start building it together.


ELK platform construction

This section walks through setting up the ELK log platform, that is, installing the Indexer-role Logstash, Elasticsearch, and Kibana. To complete this section, you need to make the following preparations:

1. An Ubuntu machine or virtual machine. As this is an introductory tutorial, building an Elasticsearch cluster is omitted, and Logstash (Indexer), Elasticsearch, and Kibana are all installed on the same machine;
2. A JDK installed on Ubuntu. Note that Logstash requires Java 8 or above (you can verify this with the command after this list);
3. The Logstash, Elasticsearch, and Kibana installation packages, which can be downloaded on this page;
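
A quick way to confirm the Java version before proceeding; the exact output will differ on your machine, and the version shown is only an example that matches the JDK path used later in the Supervisor configuration:

java -version
# Example output (yours will vary):
# java version "1.8.0_221"
# ...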

Install Logstash

Unzip the compressed package:

tar -xzvf logstash-7.3.0.tar.gz

As the simplest possible test, go to the unzipped directory and start a pipeline that reads from the console and writes back to the console:

cd logstash-7.3.0
elk@elk:~/elk/logstash-7.3.0$ bin/logstash -e 'input { stdin {} } output { stdout {} }'

If you see the following log, Logstash has started successfully.

Figure 3. Logstash startup success log


Enter Hello Logstash on the console. If you see the following effect, Logstash is successfully installed.

Listing 1. Verifying that Logstash started successfully

Hello Logstash

{
    "@timestamp" => 2019-08-10T16:11:10.040Z,
          "host" => "elk",
      "@version" => "1",
       "message" => "Hello Logstash"
}

Install Elasticsearch

Unzip the installation package:

tar -xzvf elasticsearch-7.3.0-linux-x86_64.tar.gz

Start Elasticsearch:

cd elasticsearch-7.3.0/
bin/elasticsearch

While starting Elasticsearch I ran into two problems, which I list here for reference.

Problem 1: not enough memory. If your machine has less memory than the heap size Elasticsearch is configured with, the error shown in the figure below is reported. The solution is to adjust the heap settings in the elasticsearch-7.3.0/config/jvm.options file to fit your machine's memory (see the example after Figure 4). If the error persists after the change, reconnect to the server and try again.


Figure 4. Too small memory causes Elasticsearch startup error

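For reference, the relevant heap settings in jvm.options look like this; the 512m value is only an example for a small machine, so adjust it to your own memory size:

# elasticsearch-7.3.0/config/jvm.options
# Xms and Xmx should be set to the same value
-Xms512m
-Xmx512m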

Problem 2: If you start Elasticsearch as the root user, the error shown in the figure below is reported. The solution is to create a new, non-root user and start Elasticsearch with it (one possible way is sketched below). There are many guides online for adding users, so I won't go into the details here.

Figure 5. Root user starts Elasticsearch and reports an error

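A minimal sketch of one way to do this, assuming a user named elk and the unpacked directory from above; adapt the paths and names to your environment:

# Create a non-root user (here named elk) and give it ownership of the Elasticsearch directory
sudo useradd -m elk
sudo passwd elk
sudo chown -R elk:elk /path/to/elasticsearch-7.3.0
# Switch to the new user and start Elasticsearch
su - elk
/path/to/elasticsearch-7.3.0/bin/elasticsearch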

After the startup is successful, open a new session window and execute the curl http://localhost:9200 command. If the following results appear, it means that Elasticsearch is installed successfully.

Listing 2. Checking whether Elasticsearch starts successfully

elk@elk:~$ curl http://localhost:9200
{
  "name" : "elk",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "hqp4Aad0T2Gcd4QyiHASmA",
  "version" : {
    "number" : "7.3.0",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "de777fa",
    "build_date" : "2019-07-24T18:30:11.767338Z",
    "build_snapshot" : false,
    "lucene_version" : "8.1.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

Install Kibana

Unzip the installation package:

tar -xzvf kibana-7.3.0-linux-x86_64.tar.gz

Modify the configuration file config/kibana.yml, mainly to tell Kibana where Elasticsearch is.

Listing 3. Kibana configuration information

# Elasticsearch host address
elasticsearch.hosts: "http://ip:9200"
# Allow remote access
server.host: "0.0.0.0"
# Elasticsearch username (the user I start Elasticsearch with on the server)
elasticsearch.username: "es"
# Elasticsearch authentication password (the password of that user)
elasticsearch.password: "es"

Start Kibana:

cd kibana-7.3.0-linux-x86_64/bin
./kibana

Access http://ip:5601 in the browser. If the following interface appears, it means that Kibana is installed successfully.

Figure 6. Kibana startup success interface


Now that the ELK log platform is installed, let's look at how to use it through concrete examples. The following sections show how to ship Spring Boot logs and Nginx logs to ELK for analysis.

Using ELK with Spring Boot

First we need a Spring Boot project. I previously wrote an article about using AOP to uniformly handle Spring Boot web logs, and the Spring Boot project used in this article is based on that one.

Modify and deploy Spring Boot project

Create the spring-logback.xml configuration file in the project resources directory.

Listing 4. Configuration of Spring Boot project Logback

<?xml version="1.0" encoding="UTF-8"?>
<configuration debug="false">
    <contextName>Logback For demo Mobile</contextName>
    <property name="LOG_HOME" value="/log" />
    <springProperty scope="context" name="appName" source="spring.application.name"
                    defaultValue="localhost" />
    ...

    <appender name="ROLLING_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
        ...
        <encoder class="ch.qos.logback.classic.encoder.PatternLayoutEncoder">
            <pattern>%d{yyyy-MM-dd HH:mm:ss.SSS} [%thread] %-5level %logger{25} ${appName} -%msg%n</pattern>
        </encoder>
        ...
    </appender>
    ...
</configuration>

The listing above omits a lot of detail, which you can find in the source code. In this configuration we define an appender named ROLLING_FILE that writes logs to a log file in the specified format. The pattern tag holds the actual log format: with the configuration above, each line contains the time, thread, log level, logger (usually the full path of the class that printed the log), service name, and the message.
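
One note on wiring this in: Spring Boot only picks up logback-spring.xml or logback.xml automatically, so with a custom file name such as spring-logback.xml the project typically points at it explicitly. This is an assumption about this project's setup; the source code may already contain it.

# application.properties
logging.config=classpath:spring-logback.xml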

Package the project and deploy it to an Ubuntu server.

Listing 5. Packaging and deploying the Spring Boot project

# Packaging command
mvn package -Dmaven.test.skip=true
# Deployment command
java -jar sb-elk-start-0.0.1-SNAPSHOT.jar

Check the log file. In the Logback configuration I write the log to the /log/sb-log.log file, so execute the more /log/sb-log.log command; if you see output like the following, the deployment was successful.

Figure 7. Spring Boot log file


Configure Shipper role Logstash

After the Spring Boot project is successfully deployed, we also need to install and configure a Shipper-role Logstash on that same machine. The Logstash installation itself was covered in the ELK platform construction section and is not repeated here. After installation, we need to write a Logstash configuration file that collects logs from the log file and outputs them to a Redis message channel. The Shipper configuration is as follows.

Listing 6. Logstash configuration for the Shipper role

input {
    file {
        path => [
            # Fill in the files that need to be monitored here
            "/log/sb-log.log"
        ]
    }
}

output {
    # Output to Redis
    redis {
        host => "10.140.45.190" # Redis host address
        port => 6379 # Redis port number
        db => 8 # Redis database number
        data_type => "channel" # Use publish/subscribe mode
        key => "sb-logback" # Publish channel name (must match the key the Indexer subscribes to)
    }
}

In fact, the configuration of Logstash corresponds to the three parts (input, filter, and output) of the Logstash pipeline mentioned earlier. However, we do not need the filter here, so we have not written it out. The data source used by Input in the above configuration is of file type. You only need to configure the path of the local log file that needs to be collected. Output describes how to output data. The configuration here is to output to Redis.


The Redis data_type setting accepts the values channel and list. channel uses Redis publish/subscribe, while list uses a Redis list as a queue; both can be used for ordered, asynchronous message passing between systems. The advantage of channel over list is that it decouples publishers from subscribers. Suppose one Indexer is continuously reading records from Redis and you now want to add a second Indexer. With list, some records are popped by the first Indexer and the rest by the second, so the two Indexers compete and neither reads the complete log stream. channel avoids this situation, which is why it is used both in the Shipper configuration above and in the Indexer configuration below.

Configure Indexer role Logstash

After configuring Logstash for the Shipper role, we also need to configure Logstash for the Indexer role to support receiving log data from Redis, parsing it through filters and storing it in Elasticsearch. The configuration content is as follows.

Listing 7. Logstash configuration for Indexer role

input {
    redis {
        host => "192.168.142.131" # redis host address
        port => 6379 # redis port number
        db => 8 # redis database number
        data_type => "channel" # Use publish/subscribe mode
        key => "sb-logback" # Subscribed channel name (matches the Shipper's key)
    }
}

filter {
     #Define the format of data
     grok {
       match => { "message" => "%{TIMESTAMP_ISO8601:time} \[%{NOTSPACE:threadName}\] %{LOGLEVEL:level} %{DATA:logger} %{NOTSPACE:applicationName} -(?:.* =%{NUMBER:timetaken}ms|)"}
     }
}

output {
    stdout {}
    elasticsearch {
        hosts => "localhost:9200"
        index => "logback"
   }
}

Unlike the Shipper, the Indexer pipeline defines a filter, and this is where log lines are parsed into structured data. Below is a Logback log line captured from the project:

Listing 8. A log output by the Spring Boot project

2019-08-11 18:01:31.602 [http-nio-8080-exec-2] INFO c.i.s.aop.WebLogAspect sb-elk - interface log
POST request test interface end call: time consumption=11ms, result=BaseResponse{code=10000, message='Operation successful'}

In the filter we use the Grok plugin to parse the time, thread name, logger, service name, and interface response time out of the log above. How does Grok work?

1. The message field is the field where Logstash stores the collected data. match={“message”=>…} means processing the log content;
2. Grok actually parses data through regular expressions. TIMESTAMP_ISO8601, NOTSPACE, etc. that appear above are all Grok’s built-in patterns;
3. The pattern we wrote can be tested with the Grok Debugger to check that it is correct, which saves us from repeatedly verifying parsing rules against the real environment; roughly what it extracts from the log above is sketched below;
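
For the log line in Listing 8, the Grok pattern above yields roughly the following fields. This is an illustrative sketch based on the field names in the pattern; the actual event also contains metadata such as @timestamp and host, and timetaken is only captured when the line contains the time-consumption suffix the pattern expects.

{
          "time" => "2019-08-11 18:01:31.602",
    "threadName" => "http-nio-8080-exec-2",
         "level" => "INFO",
        "logger" => "c.i.s.aop.WebLogAspect",
    "applicationName" => "sb-elk",
     "timetaken" => "11"
}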

View results

After the above steps, we have completed the construction of the entire ELK platform and the access to the Spring Boot project. Let’s follow the steps below to perform some operations to see the effect.

Start Elasticsearch (the startup command was given in the ELK platform construction section and is not repeated here; the same goes for starting Kibana). Then start the Indexer-role Logstash.

# Enter the Logstash decompression directory and execute the following command
bin/logstash -f indexer-logstash.conf

Start Kibana.

Start Logstash for the Shipper role.

# Enter the Logstash decompression directory and execute the following command
bin/logstash -f shipper-logstash.conf

Call a Spring Boot interface; at this point data should have been written to Elasticsearch.
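
For example, you can trigger some log output from the command line. The /weblog/get-test endpoint shown here is the one that appears in the Nginx access log later in this article, and 8080 is the default Spring Boot port; adjust both to your own project:

curl "http://localhost:8080/weblog/get-test?name=elk"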

Visit http://ip:5601 in the browser, open Kibana’s web interface, and add the logback index as shown in the figure below.

Figure 8. Adding Elasticsearch index in Kibana


Enter the Discover interface, select the logback index, and you can see the log data, as shown in the figure below.

Figure 9. ELK log viewing

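If you prefer the command line, you can also confirm that the documents reached Elasticsearch by querying the logback index directly. This is just a sanity check; the field name level comes from the Grok pattern above:

# Count documents in the logback index
curl "http://localhost:9200/logback/_count?pretty"

# Fetch a few parsed log events at INFO level
curl "http://localhost:9200/logback/_search?q=level:INFO&size=3&pretty"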

Using ELK with Nginx

I believe that by following the steps above you have successfully built your own real-time ELK logging platform and connected Logback-style logs to it. In real scenarios, however, there is almost never just one type of log, so let's now add Nginx logs on top of the steps above. The prerequisite, of course, is that Nginx is installed on the server; there are plenty of installation guides online, so I won't repeat them here. The Nginx log looks like this (by default the Nginx access log is in the /var/log/nginx/access.log file).

Listing 9. Nginx access log

192.168.142.1 - - [17/Aug/2019:21:31:43 +0800] "GET /weblog/get-test?name=elk HTTP/1.1"
200 3 "http://192.168.142.131/swagger-ui.html" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36"

Again, we need to write a Grok parsing rule for this log as follows:

Listing 10. Grok parsing rules for Nginx access logs

%{IPV4:ip} \- \- \[%{HTTPDATE:time}\] "%{NOTSPACE:method} %{DATA:requestUrl}
HTTP/%{NUMBER:httpVersion}" %{NUMBER:httpStatus} %{NUMBER:bytes}
"%{DATA:referer}" "%{DATA:agent}"

After doing all of the above, the key point is that the Indexer-role Logstash now needs to support two types of logs in its input, filter, and output. How? First tag each input with a type, then apply different filters and outputs depending on that type, as shown below (for space reasons the configuration file is not displayed in full here; you can click here to obtain it, and a rough sketch of how the Nginx branch could be filled in follows the listing).

Listing 11. Logstash configuration for Indexer role supporting two log inputs

input {
    redis {
        type => "logback"
        ...
    }
    redis {
       type => "nginx"
       ...
    }
}

filter {
     if [type] == "logback" {
         ...
     }
     if [type] == "nginx" {
         ...
     }
}

output {
    if [type] == "logback" {
        ...
    }
    if [type] == "nginx" {
       ...
    }
}
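
As a rough sketch of how the Nginx branch of this file could be filled in, assuming the Grok rule from Listing 10, a Redis channel named nginx-log for the Nginx shipper, and an Elasticsearch index simply named nginx (all three names are my own placeholders; the downloadable configuration may differ):

input {
    redis {
        type => "nginx"
        host => "192.168.142.131"
        port => 6379
        db => 8
        data_type => "channel"
        key => "nginx-log"        # placeholder channel name for Nginx logs
    }
}

filter {
    if [type] == "nginx" {
        grok {
            # the rule from Listing 10
            match => { "message" => "%{IPV4:ip} \- \- \[%{HTTPDATE:time}\] \"%{NOTSPACE:method} %{DATA:requestUrl} HTTP/%{NUMBER:httpVersion}\" %{NUMBER:httpStatus} %{NUMBER:bytes} \"%{DATA:referer}\" \"%{DATA:agent}\"" }
        }
    }
}

output {
    if [type] == "nginx" {
        elasticsearch {
            hosts => "localhost:9200"
            index => "nginx"      # separate index so Nginx logs can be filtered in Kibana
        }
    }
}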

My Nginx and Spring Boot projects are deployed on the same machine, so I also need to modify the Shipper-role Logstash configuration to support both types of log input and output; its configuration file can be obtained by clicking here. Once that is done, start the ELK platform, the Shipper-role Logstash, Nginx, and the Spring Boot project following the steps in the View results section, then add the nginx index in Kibana, and you can view the Spring Boot and Nginx logs side by side, as shown in the figure below.

Figure 10. ELK viewing Nginx logs


Starting ELK in the background

In the steps above, starting ELK required running the startup commands of the three components one by one, and all of them ran in the foreground, meaning that as soon as the session window is closed the component stops and the whole ELK platform becomes unusable. That is not workable in practice, so the remaining problem is how to run ELK in the background. Following the recommendation of the book "Logstash Best Practices", we will use Supervisor to manage starting and stopping ELK. First install Supervisor; on Ubuntu simply execute apt-get install supervisor. After the installation succeeds, configure the three ELK components in the Supervisor configuration file (by default /etc/supervisor/supervisord.conf).


Listing 12. ELK background startup

[program:elasticsearch]
environment=JAVA_HOME="/usr/java/jdk1.8.0_221/"
directory=/home/elk/elk/elasticsearch
user=elk
command=/home/elk/elk/elasticsearch/bin/elasticsearch

[program:logstash]
environment=JAVA_HOME="/usr/java/jdk1.8.0_221/"
directory=/home/elk/elk/logstash
user=elk
command=/home/elk/elk/logstash/bin/logstash -f /home/elk/elk/logstash/indexer-logstash.conf

[program:kibana]
environment=LS_HEAP_SIZE=5000m
directory=/home/elk/elk/kibana
user=elk
command=/home/elk/elk/kibana/bin/kibana

After configuring as above, execute sudo supervisorctl reload to start the whole ELK stack, and by default it will also start automatically at boot. Of course, we can also use sudo supervisorctl start/stop program_name to manage the individual applications.
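
A few typical Supervisor commands for day-to-day management; the program names correspond to the sections defined in Listing 12:

# Reload the configuration and (re)start everything
sudo supervisorctl reload
# Check the status of all managed programs
sudo supervisorctl status
# Stop or start a single component, e.g. Kibana
sudo supervisorctl stop kibana
sudo supervisorctl start kibana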
