1. ELK + Filebeat + kafka + zookeeper architecture
The architecture diagram is published separately; a rough text sketch of the data flow follows the layer descriptions below.
The first layer: data collection layer
- The data collection layer consists of the business server cluster on the far left. Filebeat is installed on each business server to collect logs, and the collected raw logs are then sent to the kafka + zookeeper cluster.
The second layer: message queue layer
- After the raw logs arrive at the kafka + zookeeper cluster, they are stored centrally. Filebeat acts as the message producer, and the stored messages can be consumed at any time.
The third layer: data analysis layer
- Acting as the consumer, logstash pulls the raw logs from the kafka + zookeeper cluster nodes, parses and formats them according to its rules, and then forwards the formatted logs to the Elasticsearch cluster.
The fourth layer: data persistence layer
- After receiving the data sent by logstash, the Elasticsearch cluster writes it to disk, builds indexes, and finally stores the structured data across the cluster.
The fifth layer: data query and display layer
- Kibana is a visual data display platform. When a retrieval request comes in, it reads data from the Elasticsearch cluster and performs visual plotting and multi-dimensional analysis.
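A minimal sketch of the flow, with labels matching the five layers above:

```
[Filebeat on business servers]     (1) data collection
              |
              v
[Kafka + ZooKeeper cluster]        (2) message queue
              |
              v
[Logstash]                         (3) analysis / formatting
              |
              v
[Elasticsearch cluster]            (4) persistent storage
              |
              v
[Kibana]                           (5) query & display
```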
2. Build ELFK + zookeeper + kafka
Host name | IP address | Cluster | Installed software
---|---|---|---
filebeat | 20.0.0.58 | Data collection layer | filebeat + apache
kafka1 | 20.0.0.55 | kafka + zookeeper cluster | kafka + zookeeper
kafka2 | 20.0.0.56 | kafka + zookeeper cluster | kafka + zookeeper
kafka3 | 20.0.0.57 | kafka + zookeeper cluster | kafka + zookeeper
logstash | 20.0.0.59 | Data processing layer | logstash
node1 | 20.0.0.60 | ES cluster | Elasticsearch + node + phantomjs + head
node2 | 20.0.0.61 | ES cluster + kibana display | Elasticsearch + node + phantomjs + head + kibana
1. Install kafka + zookeeper cluster (20.0.0.55, 20.0.0.56, 20.0.0.57)
2. Install zookeeper service
Turn off the firewall and core protection (SELinux), and modify the host names
Install the environment dependencies and unpack the software
Modify the configuration file
Create the data directory and log directory
Set a unique myid on each of the three machines
Set up startup scripts on the three machines
Register the startup scripts of the three machines with system service management
Start zookeeper on each of the three machines (a sketch of the key steps follows this list)
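A minimal sketch of the steps above, assuming an install path of /usr/local/zookeeper (the paths are assumptions; adjust to your environment):

```
# zoo.cfg - identical on all three nodes
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/usr/local/zookeeper/data
dataLogDir=/usr/local/zookeeper/logs
clientPort=2181
server.1=20.0.0.55:2888:3888
server.2=20.0.0.56:2888:3888
server.3=20.0.0.57:2888:3888
```

```bash
# create the data/log directories and give each node a unique myid
mkdir -p /usr/local/zookeeper/data /usr/local/zookeeper/logs
echo 1 > /usr/local/zookeeper/data/myid    # use 2 on 20.0.0.56 and 3 on 20.0.0.57

# start each node, then check which one was elected leader
/usr/local/zookeeper/bin/zkServer.sh start
/usr/local/zookeeper/bin/zkServer.sh status
```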
3. Install kafka service
Upload the installation package to all three machines and extract it to the specified directory
Back up the configuration file
Modify the configuration file (the per-host differences are sketched after this list)
- Configuration file for 20.0.0.55
- Configuration file for 20.0.0.56
- Configuration file for 20.0.0.57
Add kafka to environment variables
Configure kafka startup script
Set automatic startup
Start kafka on each of the three machines
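A minimal server.properties sketch for 20.0.0.55 (the install path and log directory are assumptions; only broker.id and the listener IP differ between the three hosts):

```properties
# /usr/local/kafka/config/server.properties on 20.0.0.55
broker.id=1                                # must be unique: 2 on 20.0.0.56, 3 on 20.0.0.57
listeners=PLAINTEXT://20.0.0.55:9092       # each broker advertises its own IP
log.dirs=/usr/local/kafka/logs             # where kafka stores message data
zookeeper.connect=20.0.0.55:2181,20.0.0.56:2181,20.0.0.57:2181
```

```bash
# put the kafka tools on the PATH, then start the broker in the background
echo 'export PATH=$PATH:/usr/local/kafka/bin' >> /etc/profile
source /etc/profile
kafka-server-start.sh -daemon /usr/local/kafka/config/server.properties
```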
3.1 kafka command-line operations
Create topic
kafka-topics.sh --create --zookeeper 20.0.0.55:2181,20.0.0.56:2181,20.0.0.57:2181 --replication-factor 2 --partitions 3 --topic test
# --zookeeper: zookeeper cluster server addresses; separate multiple IPs with commas
# --replication-factor: number of replicas per partition; 1 means a single copy, 2 is recommended
# --partitions: number of partitions
# --topic: topic name
View all topics in the current server
kafka-topics.sh --list --zookeeper 20.0.0.55:2181,20.0.0.56:2181,20.0.0.57:2181
View details of a topic
kafka-topics.sh --describe --zookeeper 20.0.0.55:2181,20.0.0.56:2181,20.0.0.57:2181 --topic test
Post a message
kafka-console-producer.sh --broker-list 20.0.0.55:9092,20.0.0.56:9092,20.0.0.57:9092 --topic test
Consume messages
kafka-console-consumer.sh --bootstrap-server 20.0.0.55:9092,20.0.0.56:9092,20.0.0.57:9092 --topic test --from-beginning
# --from-beginning: read all of the topic's existing data from the start
Modify the number of partitions
kafka-topics.sh --zookeeper 20.0.0.55:2181,20.0.0.56:2181,20.0.0.57:2181 --alter --topic test --partitions 6
Delete topic
kafka-topics.sh --delete --zookeeper 20.0.0.55:2181,20.0.0.56:2181,20.0.0.57:2181 --topic test
3.2 Create topic for testing (can be operated on any host)
Create topic
Publish messages and consume messages (a worked example follows)
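Using the commands from 3.1, a minimal end-to-end test looks like this: run the producer on one host and the consumer on another, then type lines into the producer terminal and watch them appear on the consumer side.

```bash
# create a test topic with 3 partitions and 2 replicas
kafka-topics.sh --create --zookeeper 20.0.0.55:2181,20.0.0.56:2181,20.0.0.57:2181 \
  --replication-factor 2 --partitions 3 --topic test

# host A: produce - each line typed here is sent as one message
kafka-console-producer.sh --broker-list 20.0.0.55:9092,20.0.0.56:9092,20.0.0.57:9092 --topic test

# host B: consume - messages typed on host A appear here
kafka-console-consumer.sh --bootstrap-server 20.0.0.55:9092,20.0.0.56:9092,20.0.0.57:9092 \
  --topic test --from-beginning
```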
4. Configure the data collection layer: filebeat (20.0.0.58)
Close the firewall and change the host name
Install the httpd service and start it
Install filebeat and move it to the specified directory
Modify the configuration file (a sketch follows)
Start the filebeat service
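A minimal filebeat.yml sketch that ships the Apache logs to the kafka cluster. The log paths follow the httpd defaults, and the topic name httpd is an assumption that must match the logstash input later; on filebeat versions before 6.3 the top-level key is filebeat.prospectors rather than filebeat.inputs:

```yaml
# filebeat.yml - collect Apache logs and send them to kafka
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/httpd/access_log
    - /var/log/httpd/error_log

output.kafka:
  hosts: ["20.0.0.55:9092", "20.0.0.56:9092", "20.0.0.57:9092"]
  topic: "httpd"          # assumed topic name; the logstash input must subscribe to it
```

```bash
# start filebeat with logging to stderr (-e), using this config file (-c)
./filebeat -e -c filebeat.yml &
```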
5. Deploy the ES service (20.0.0.60, 20.0.0.61)
Install JDK
5.1 Install the ES service
Configure local hostname resolution, upload the installation package, then install and start
Modify the configuration file
Check the configuration file and create the data directory (a sketch follows)
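A minimal elasticsearch.yml sketch for node1 (the cluster name and data path are assumptions; the discovery key shown is the 6.x-style setting and relies on the local resolution configured above):

```yaml
# elasticsearch.yml on node1 - node2 differs only in node.name
cluster.name: my-elk-cluster
node.name: node1
path.data: /data/elk_data              # assumed data directory
path.logs: /var/log/elasticsearch
network.host: 0.0.0.0
http.port: 9200
discovery.zen.ping.unicast.hosts: ["node1", "node2"]
```

```bash
# create the data directory, hand it to the elasticsearch user, then start the service
mkdir -p /data/elk_data
chown elasticsearch:elasticsearch /data/elk_data
systemctl start elasticsearch
```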
5.2 Install the node plug-in
Install the build environment
Compile
Install (a typical source build is sketched below)
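node is built from source to provide npm for ES-head; the version number below is an assumption:

```bash
# install the compiler toolchain, then configure, compile, and install node
yum install -y gcc gcc-c++ make
tar xzvf node-v8.2.1.tar.gz
cd node-v8.2.1
./configure
make -j4 && make install
```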
5.3 Install the phantomjs plug-in
Upload and unpack the archive
Add the executable to the PATH (sketched below)
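A sketch assuming the common phantomjs-2.1.1 Linux build (the archive name and paths are assumptions):

```bash
# unpack phantomjs and put its binary on the PATH
tar jxvf phantomjs-2.1.1-linux-x86_64.tar.bz2 -C /usr/local/src
cd /usr/local/src/phantomjs-2.1.1-linux-x86_64/bin
cp phantomjs /usr/local/bin
```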
5.4 Install ES-head
Upload the archive and unpack it
Install (sketched below)
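ES-head is an npm project, so installation is just an npm install inside the unpacked directory (the paths are assumptions):

```bash
# unpack elasticsearch-head and pull in its npm dependencies
tar xzvf elasticsearch-head.tar.gz -C /usr/local/src
cd /usr/local/src/elasticsearch-head
npm install
```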
5.5 Modify the ES configuration file
5.6 Start the ES service
5.7 Start the ES-head service (steps 5.5-5.7 are sketched below)
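For the head UI to talk to ES, cross-origin access must be enabled in elasticsearch.yml before restarting ES; head itself then serves on port 9100 by default:

```yaml
# append to elasticsearch.yml on both nodes
http.cors.enabled: true
http.cors.allow-origin: "*"
```

```bash
# restart ES, then launch the head UI in the background
systemctl restart elasticsearch
cd /usr/local/src/elasticsearch-head
npm run start &        # browse to http://<node-ip>:9100 and connect to http://<node-ip>:9200
```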
6. Deploy logstash (20.0.0.59)
Install the Java environment
Install logstash
Create a soft link
Create the pipeline configuration file that connects kafka to ES (sketched below)
Start the service
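A minimal sketch of the kafka-to-ES pipeline. The file name, topic, and index pattern are assumptions; the topic must match the filebeat output above, and codec json is needed because filebeat ships events as JSON:

```ruby
# kafka_es.conf - consume raw logs from kafka and write them to the ES cluster
input {
  kafka {
    bootstrap_servers => "20.0.0.55:9092,20.0.0.56:9092,20.0.0.57:9092"
    topics => ["httpd"]
    codec => "json"
  }
}
output {
  elasticsearch {
    hosts => ["20.0.0.60:9200", "20.0.0.61:9200"]
    index => "httpd_log-%{+YYYY.MM.dd}"
  }
}
```

Start it with logstash -f kafka_es.conf and check that the index appears in ES-head.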
7. Use the ES-head interface to access the cluster
8. Install kibana for visualization
Not demonstrated here; refer to the previous blog.