Task Scheduler – DolphinScheduler3 cluster installation

【Background】

I work in big data development. On some external projects, or at companies without mature infrastructure, no task scheduler is available. In the past I scheduled MySQL stored procedures with MySQL's built-in events, and otherwise mostly used Kettle + crontab. These approaches work, but they have no visual interface, and checking logs, tracking task progress, and backfilling data are all awkward. Later I came across DolphinScheduler. It is open source and free, and it supports relational database tasks, Hive, Spark, Flink, Shell, DataX, ClickHouse, task dependencies, alerting, and more. It covers basically everything I need, and once I started using it I couldn't put it down. I strongly recommend it to everyone~

I wrote this article up long after the installation. I forgot to take screenshots at the time, so some of the pictures come from the official website.

The supported task types are as follows, very comprehensive:

It also supports configuring workflows as DAGs.

【Text】

1. Preparation before installation

1. Dependent component installation

  • JDK 1.8+
  • MySQL 5.7+
  • ZooKeeper 3.4.6+

Anyone with a few years of development and deployment experience will be familiar with these components, and installation guides are easy to find online, so I won't go through them one by one.
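Before moving on, it can help to confirm that the installed versions meet the minimums listed above. The following is only a sketch: `version_ge` is a hypothetical helper (not part of DolphinScheduler) that compares two dotted version strings, and it assumes GNU `sort` with the `-V` option is available.

```shell
# Hypothetical helper: succeeds when version $1 is greater than or equal to $2.
# Relies on GNU sort -V (version sort); this is an assumption about the host.
version_ge() {
    lowest="$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)"
    [ "$lowest" = "$2" ]
}
```

For example, `version_ge 3.4.14 3.4.6 && echo "ZooKeeper is new enough"` prints the message, while `version_ge 3.4.5 3.4.6` fails. How you obtain the installed version string (for instance from `java -version` or `mysql --version`) depends on your environment.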

2. Server allocation

Server         Installed components                                             Note
192.168.0.1    master-server, alert-server, api-server, MySQL 5.7, zookeeper
192.168.0.2    worker-server, zookeeper
192.168.0.3    worker-server, zookeeper

3. Create a new user dolphinscheduler on every server, grant it sudo privileges, and set up passwordless SSH login between the servers, because the installer deploys to the other servers remotely later on
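The user and SSH setup can be sketched roughly as follows. This is an illustration under assumptions, not the only way to do it: the function names `create_deploy_user` and `setup_passwordless_ssh` are hypothetical, the sudo rule is written to /etc/sudoers.d (which assumes your sudoers file includes that directory, as most distributions do by default), and the functions are only defined here, so nothing changes until you call them yourself.

```shell
# Run as root on every node. Creates the deploy user and grants passwordless sudo.
create_deploy_user() {
    user="${1:-dolphinscheduler}"
    useradd "$user"
    passwd "$user"    # prompts for the new password interactively
    # Assumption: /etc/sudoers.d is included by the main sudoers file
    echo "$user ALL=(ALL) NOPASSWD: ALL" > "/etc/sudoers.d/$user"
    chmod 440 "/etc/sudoers.d/$user"
}

# Run as the deploy user on the node you install from.
# Arguments: every host in the cluster, including the local one.
setup_passwordless_ssh() {
    # Generate a key pair only if none exists yet
    [ -f "$HOME/.ssh/id_rsa" ] || ssh-keygen -t rsa -P '' -f "$HOME/.ssh/id_rsa"
    for host in "$@"; do
        ssh-copy-id "$host"    # asks for the password once per host
    done
}
```

With the server allocation above, the calls would be `create_deploy_user dolphinscheduler` on each machine, then `setup_passwordless_ssh 192.168.0.1 192.168.0.2 192.168.0.3` as the dolphinscheduler user. Afterwards `ssh 192.168.0.2 hostname` should return without a password prompt.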

2. Module introduction

  • dolphinscheduler-master: master module, provides workflow management and orchestration services.
  • dolphinscheduler-worker: worker module, provides task execution management services.
  • dolphinscheduler-alert: alert module, provides the AlertServer service.
  • dolphinscheduler-api: web application module, provides the ApiServer service.

3. Download

Address: https://dolphinscheduler.apache.org/zh-cn/download/deployment?t=standalone

Choose the latest version to download. The 3.x line is quite stable now; I started with 3.0.1 and ran into a lot of pitfalls.

4. Modify the configuration file

Upload the downloaded installation package to one of the servers, then extract and configure it there. During installation, the corresponding components are installed on the other servers remotely.

5. Initialize the database

1. Upload the mysql-connector-java jar package

Download the mysql-connector-java driver (8.0.16) and move it to the libs directory of each DolphinScheduler module: api-server/libs, alert-server/libs, master-server/libs, and worker-server/libs.
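Copying the jar into each module by hand is easy to get wrong, so a small loop can help. This is a minimal sketch: `copy_driver` is a hypothetical helper, and the `tools` entry is an assumption on my part (the schema script under tools/bin also needs a JDBC driver on its classpath, so the jar most likely belongs in tools/libs as well). Directories that do not exist are skipped with a message.

```shell
# Hypothetical helper: copy a JDBC driver jar into the libs directory of each module.
# $1 = path to the driver jar, $2 = root of the extracted DolphinScheduler package
copy_driver() {
    jar="$1"
    root="$2"
    # "tools" is an assumption; the other four modules are the ones named in this article
    for module in api-server alert-server master-server worker-server tools; do
        if [ -d "$root/$module/libs" ]; then
            cp "$jar" "$root/$module/libs/"
        else
            echo "skip: $root/$module/libs not found" >&2
        fi
    done
}
```

A call would look like `copy_driver mysql-connector-java-8.0.16.jar /path/to/extracted/package`, with both paths adjusted to wherever you uploaded the files.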

2. Create a database

mysql> CREATE DATABASE dolphinscheduler DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;

# Modify {user} and {password} to your desired username and password
mysql> GRANT ALL PRIVILEGES ON dolphinscheduler.* TO '{user}'@'%' IDENTIFIED BY '{password}';
mysql> GRANT ALL PRIVILEGES ON dolphinscheduler.* TO '{user}'@'localhost' IDENTIFIED BY '{password}';

mysql> flush privileges;

3. Initialize the database

./tools/bin/upgrade-schema.sh
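The schema script has to know which database to connect to, so the connection settings need to be in place before it runs. The sketch below shows the kind of environment variables involved. In the 3.x layout they typically live in bin/env/dolphinscheduler_env.sh, but verify the names against your version. The ports 3306 and 2181 are assumed defaults, and {user} and {password} are the same placeholders used in the GRANT statements above.

```shell
# Sketch of the database and registry settings (e.g. in bin/env/dolphinscheduler_env.sh).
# Hosts follow the server table in this article; ports are assumed defaults.
export DATABASE="mysql"
export SPRING_PROFILES_ACTIVE="${DATABASE}"
export SPRING_DATASOURCE_URL="jdbc:mysql://192.168.0.1:3306/dolphinscheduler?useUnicode=true&characterEncoding=UTF-8&useSSL=false"
export SPRING_DATASOURCE_USERNAME="{user}"        # replace with the user you granted privileges to
export SPRING_DATASOURCE_PASSWORD="{password}"    # replace with that user's password

# ZooKeeper is the registry that masters and workers use to find each other
export REGISTRY_TYPE="zookeeper"
export REGISTRY_ZOOKEEPER_CONNECT_STRING="192.168.0.1:2181,192.168.0.2:2181,192.168.0.3:2181"
```

If the script fails with a driver or connection error, the usual suspects are a missing driver jar, a wrong JDBC URL, or a database user without privileges from the installing host.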

6. Start the installation

Execute the following command and wait for it to finish. It installs all of the servers in one go, which is very convenient

./bin/install.sh

7. Start the cluster

Start all cluster services with one command

./bin/start-all.sh

The command to stop the cluster is as follows

./bin/stop-all.sh
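After starting the cluster, a quick way to see whether the services actually came up is to look at the Java processes on each node. The helper below is a sketch: `missing_processes` is a hypothetical function that reads `jps` output on standard input and prints every expected process name it cannot find. The process names used in the usage example (MasterServer, AlertServer, ApiApplicationServer, WorkerServer) are what I would expect from a 3.x deployment, but treat them as assumptions and compare with what `jps` shows on your machines.

```shell
# Hypothetical helper: print each expected process name missing from jps output.
# stdin = output of `jps`, arguments = expected process names
missing_processes() {
    jps_output="$(cat)"
    for name in "$@"; do
        # -w matches whole words only, so "Server" alone would not match "MasterServer"
        printf '%s\n' "$jps_output" | grep -qw "$name" || echo "$name"
    done
}
```

On 192.168.0.1 the check might be `jps | missing_processes MasterServer AlertServer ApiApplicationServer`, and on the worker nodes `jps | missing_processes WorkerServer`. Empty output means every expected process was found.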

8. Login to DolphinScheduler

Open http://192.168.0.1:12345/dolphinscheduler/ui in a browser to log in to the system UI. The default username and password are admin / dolphinscheduler123
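If the page does not load, it can be useful to test the address from the command line first. The following sketch assumes `curl` is installed; `ui_url` and `check_ui` are hypothetical helper names, and the default port 12345 is taken from the address above.

```shell
# Hypothetical helpers for checking that the web UI answers.
# ui_url HOST [PORT] prints the UI address; the port defaults to 12345.
ui_url() {
    printf 'http://%s:%s/dolphinscheduler/ui\n' "$1" "${2:-12345}"
}

# check_ui HOST [PORT] succeeds when the UI returns a successful HTTP response
# within 5 seconds; curl prints the error otherwise.
check_ui() {
    curl -fsS -o /dev/null --max-time 5 "$(ui_url "$@")"
}
```

For example, `check_ui 192.168.0.1 && echo "UI is up"`. A failure here usually points to the api-server not running or the port being blocked by a firewall.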