Directory of series articles
Chapter 1: ES cluster construction
Chapter 2: Basic operation commands of an ES cluster
Chapter 3: ES encryption and authentication based on the search-guard plug-in
Chapter 4: Commonly used ES plug-ins
Chapter 5: ES data migration with elasticdump
Article directory
- Table of Contents of Series Articles
- Preface
- 1. What is Logstash?
- 2. Full data migration steps
  - 1. Install Logstash
  - 2. Modify the Logstash configuration
  - 3. Create a Logstash file for full migration
  - 4. Execute the migration command and check the results
- 3. Incremental data migration steps
  - 1. Create the incremental migration file
  - 2. Start the incremental migration and check whether it succeeded
- Summary
Preface
Chapter 5 showed that the elasticdump tool is only suitable when the amount of ES data is small and there are few indexes; in most cases it is used to back up a single index. In a real production environment, however, migrating an entire ES cluster involves far more data, so elasticdump is a poor fit: forcing it will drive the server's disk I/O and CPU too high and easily trigger alarms. This article therefore recommends Logstash, a migration tool suitable for production environments.
1. What is Logstash?
Logstash is an open source data collection engine with real-time pipeline capabilities. Logstash can dynamically unify data from disparate sources and normalize it to a target output of your choice. It provides a large number of plug-ins that help parse, enrich, transform, and buffer any type of data.
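As a concrete illustration of the pipeline idea, every Logstash file in this article follows the same input/filter/output structure. A minimal, hypothetical configuration (not part of the migration itself) that reads events from the console, tags them in the filter stage, and prints them back out looks like this:

```conf
input {
  stdin { }                         # read events typed on the console
}
filter {
  mutate {
    # add a field to show the event passed through the filter stage
    add_field => { "note" => "passed through the filter stage" }
  }
}
output {
  stdout { codec => rubydebug }     # print each event for inspection
}
```

The migration configurations below simply swap `stdin`/`stdout` for the `elasticsearch` input and output plug-ins.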
2. Full data migration steps
Note: the prerequisite is that the source ES cluster can reach the target ES cluster over the network.
1. Install Logstash
Download the appropriate Logstash package and unpack it:

```shell
wget https://artifacts.elastic.co/downloads/logstash/logstash-7.10.0-linux-x86_64.tar.gz
tar -zvxf logstash-7.10.0-linux-x86_64.tar.gz
```
2. Modify the Logstash configuration
Increase the Logstash heap size: edit `config/jvm.options` and set `-Xms2g` and `-Xmx2g`. Increasing the number of records written per batch speeds up the cluster data migration: edit `config/pipelines.yml` and change `pipeline.batch.size` from 125 to 5000.

```shell
vi config/jvm.options    # set -Xms2g and -Xmx2g
vi config/pipelines.yml  # set pipeline.batch.size: 5000
```
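The two edits above can also be applied non-interactively. The sketch below assumes a GNU/Linux install (GNU `sed`); the helper name `tune_logstash` is hypothetical:

```shell
# tune_logstash: apply the two performance tweaks from this step
# to a Logstash install directory passed as the first argument.
tune_logstash() {
  local ls_home=$1
  # Raise the JVM heap to 2 GB in config/jvm.options.
  sed -i 's/^-Xms.*/-Xms2g/; s/^-Xmx.*/-Xmx2g/' "$ls_home/config/jvm.options"
  # Raise the batch size from the default 125 to 5000 in config/pipelines.yml.
  sed -i 's/^[# ]*pipeline\.batch\.size:.*/pipeline.batch.size: 5000/' "$ls_home/config/pipelines.yml"
}

# Example: tune_logstash /export/server/logstash
```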
3. Create a logstash file for full migration
Go to the installation directory and create the full-migration pipeline file:

```shell
cd /export/server/logstash/config/
vi es2es_all.conf
```

```conf
input {
  elasticsearch {
    hosts => "http://ip:9200"       # Source ES cluster
    user => "username"              # Authentication information
    password => "password"
    index => "index name"           # Supports wildcards; "*" means all indexes. If an index holds a lot of data, configure it separately.
    query => '{ "sort": [ "_doc" ] }'
    slices => 4                     # Use sliced scroll to speed up the migration; should not exceed the number of shards of a single index
    scroll => "5m"                  # Scroll session retention time
    size => 1000
    docinfo => true
    ssl => false                    # Whether to use SSL
  }
}
filter {
  # Remove some fields added by Logstash itself.
  mutate {
    remove_field => ["@timestamp", "@version"]
  }
}
output {
  elasticsearch {
    hosts => "http://ip:9200"       # Target ES cluster
    user => "username"
    password => "password"
    index => "index name"           # Keep it consistent with the source index
    #index => "%{[@metadata][_index]}"        # Or fill in the target index from the source metadata
    document_type => "%{[@metadata][_type]}"  # Target index type; this keeps it consistent with the source
    document_id => "%{[@metadata][_id]}"      # Target document id; delete this line if you do not need to keep the original ids (better performance)
    ssl => false                    # Disable SSL
    ssl_certificate_verification => false
    ilm_enabled => false
    manage_template => false
  }
}
```
4. Execute the migration command and check the results
Start the full Logstash migration task in the background:

```shell
nohup bin/logstash -f config/es2es_all.conf > es_all.log 2>&1 &
```

Check `es_all.log` for migration errors. If there are none, run the following command against both clusters and check whether the source and target index sizes are consistent:

```shell
curl -X GET http://ip:9200/_cat/indices?v
```
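Comparing raw `_cat/indices` output by eye is error-prone; a scripted check via the `_count` API is easier. The helper below is a sketch (the function name and the host/index placeholders are hypothetical) that extracts the `count` field from the JSON reply without needing `jq`:

```shell
# parse_count: extract the "count" field from an Elasticsearch
# _count API JSON reply read on stdin.
parse_count() {
  sed -n 's/.*"count":\([0-9]*\).*/\1/p'
}

# Usage against live clusters (ip and index names are placeholders;
# add -u user:pass for secured clusters):
#   src=$(curl -s "http://source-ip:9200/my_index/_count" | parse_count)
#   dst=$(curl -s "http://target-ip:9200/my_index/_count" | parse_count)
#   [ "$src" = "$dst" ] && echo "counts match: $src" || echo "MISMATCH: $src vs $dst"
```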
3. Incremental data migration steps
Note: the prerequisite is that the source ES cluster can reach the target ES cluster over the network.
1. Create the incremental migration file
To install Logstash and adjust its configuration, see step 2 above.

Note: the Logstash configuration parameters were adjusted in version 8.5; there, `document_type => "%{[@metadata][_type]}"` must be removed.

Create the incremental-migration pipeline file as follows, then the Logstash scheduled task will trigger the incremental migration:

```shell
vim logstash/config/es_add.conf
```

```conf
input {
  elasticsearch {
    # Source ES address.
    hosts => ["http://localhost:9200"]
    # Login user name and password for a secured cluster.
    user => "xxxxxx"
    password => "xxxxxx"
    # List of indexes to migrate; separate multiple indexes with commas (,).
    index => "kibana_sample_data_logs"
    # Query incremental data by time range; this queries the data of the last 5 minutes.
    query => '{"query":{"range":{"@timestamp":{"gte":"now-5m","lte":"now/m"}}}}'
    # Scheduled task; this runs once every minute.
    schedule => "* * * * *"
    scroll => "5m"
    docinfo => true
    size => 5000
  }
}
filter {
  # Remove some fields added by Logstash itself.
  mutate {
    remove_field => ["@timestamp", "@version"]
  }
}
output {
  elasticsearch {
    # Target ES address (for an Alibaba Cloud Elasticsearch instance, see its basic information page).
    hosts => ["http://ip:9200"]
    # Login user name and password for a secured cluster.
    user => "elastic"
    password => "xxxxxx"
    # Target index name; this keeps it consistent with the source index.
    index => "%{[@metadata][_index]}"
    # Target index type; this keeps it consistent with the source index type.
    document_type => "%{[@metadata][_type]}"
    # Target document id; delete this line if you do not need to keep the original ids (better performance).
    document_id => "%{[@metadata][_id]}"
    ilm_enabled => false
    manage_template => false
  }
}
```
2. Start the incremental migration and check whether it succeeded
Start the incremental migration task in the background:

```shell
nohup bin/logstash -f config/es_add.conf > es_add.log 2>&1 &
```

Then query the target cluster for the data written in the last 5 minutes:

```shell
curl -X GET http://localhost:9200/kibana_sample_data_logs/_search -H 'Content-Type: application/json' -d '
{
  "query": {
    "range": {
      "@timestamp": {
        "gte": "now-5m",
        "lte": "now/m"
      }
    }
  },
  "sort": [
    {
      "@timestamp": {
        "order": "desc"
      }
    }
  ]
}'
```

Check whether the returned result contains the word "success". If it does, the incremental migration was successful.
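Grepping the reply for "success" only confirms the shards responded; a stronger check is to extract how many documents matched the last-5-minutes query. The sketch below assumes the ES 7.x search-response shape (`hits.total.value`); the helper name is hypothetical:

```shell
# recent_hits: extract hits.total.value from an ES 7.x _search
# JSON reply read on stdin. A nonzero value means the scheduled
# incremental task is actually writing recent documents.
recent_hits() {
  sed -n 's/.*"hits":{"total":{"value":\([0-9]*\).*/\1/p'
}

# Usage (host and index are placeholders):
#   curl -s "http://localhost:9200/kibana_sample_data_logs/_search" \
#     -H 'Content-Type: application/json' \
#     -d '{"query":{"range":{"@timestamp":{"gte":"now-5m","lte":"now/m"}}}}' | recent_hits
```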
Summary
The most important point of the Logstash migration method shared here is that the source ES cluster and the target ES cluster must be able to reach each other over the network. If they cannot, the migration with this method will fail.
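Before starting either migration, this network prerequisite can be checked from the Logstash host with a quick probe of both clusters. This is a sketch; the hosts are placeholders, and secured clusters may additionally need `-u user:pass`:

```shell
# es_reachable: succeed if the given host answers the ES root
# endpoint within 5 seconds, fail otherwise.
es_reachable() {
  curl -s -m 5 -o /dev/null "$1"
}

# Usage (placeholder hosts):
#   es_reachable http://source-ip:9200 && es_reachable http://target-ip:9200 \
#     && echo "both clusters reachable" || echo "network prerequisite NOT met"
```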