Quick Start Guide for DolphinScheduler: Installation and Configuration Using Docker Compose

DATE POSTED: October 17, 2024

DolphinScheduler is an open-source distributed task scheduling system widely used in the big data field for managing complex workflows. This article walks through installing and configuring DolphinScheduler with Docker Compose so that you can set up the system quickly and start using it.

1. Environment Preparation

First, ensure that Docker and Docker Compose are installed on your system. Docker is an open-source containerization platform that allows developers to package applications and their dependencies into containers, providing high portability and consistency. Docker Compose is a tool for defining and managing multi-container applications. It uses a YAML file to configure the services and provides a single command to start or stop those services.

1.1 Verifying Docker and Docker Compose Installation

You can verify if Docker and Docker Compose are installed correctly using the following commands:

docker --version
docker-compose --version

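If both commands print version information, the installation was successful. On installations that provide Compose as a Docker CLI plugin, the equivalent check is docker compose version.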
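
The same check can be scripted so that a setup script stops early when a tool is missing. The following is a minimal sketch; the function name check_prereqs is hypothetical and is not part of Docker or DolphinScheduler.

```shell
# Hypothetical helper: report whether each named tool is on PATH.
# Returns 0 only if every tool was found.
# Example call for this guide: check_prereqs docker docker-compose git
check_prereqs() {
  prereq_missing=0
  for prereq_tool in "$@"; do
    if command -v "$prereq_tool" >/dev/null 2>&1; then
      echo "ok: $prereq_tool"
    else
      echo "missing: $prereq_tool"
      prereq_missing=1
    fi
  done
  return "$prereq_missing"
}
```

The helper only confirms that the executables can be found; it does not check that the Docker daemon is running.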

2. Downloading the DolphinScheduler Docker Compose Configuration File

Before installing and running DolphinScheduler, you need to obtain its Docker Compose configuration file. This file defines the runtime environment for DolphinScheduler and its dependent services. Follow these steps to get the configuration file:

2.1 Clone the DolphinScheduler Project

First, use Git to clone the official DolphinScheduler repository:

git clone https://github.com/apache/dolphinscheduler.git

This will download the DolphinScheduler project to your local machine. Next, navigate to the project directory:

cd dolphinscheduler/docker

In this directory, you will find a file named docker-compose.yml, which is the core configuration file for Docker Compose.
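
The repository layout has changed between releases, so the Compose file may not sit in the directory named above (some releases appear to keep it under deploy/docker; treat that path as an assumption to verify against your checkout). A small sketch for locating it, using a hypothetical helper name:

```shell
# Hypothetical helper: list Compose files under a directory (default: the
# current one), skipping anything inside .git.
# Example call from the repository root: find_compose_files .
find_compose_files() {
  find "${1:-.}" -type f \
    \( -name 'docker-compose.yml' -o -name 'docker-compose.yaml' \) \
    ! -path '*/.git/*'
}
```

Change into the directory that contains the file it prints before running the docker-compose commands in the following sections.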

3. Configuring the Docker Compose File

The docker-compose.yml file defines the services needed to run DolphinScheduler: a MySQL database, a ZooKeeper instance, and DolphinScheduler's Master and Worker nodes. You can modify this file to adjust the configuration of each service, for example the image versions, published ports, or JVM memory settings.

3.1 Overview of the Docker Compose File

The docker-compose.yml file has the following basic structure:

version: '3.1'
services:
  zookeeper:
    image: zookeeper:3.5.6
    ports:
      - "2181:2181"
  mysql:
    image: mysql:5.7
    environment:
      MYSQL_ROOT_PASSWORD: root
      MYSQL_DATABASE: dolphinscheduler
    ports:
      - "3306:3306"
  dolphinscheduler-master:
    image: apache/dolphinscheduler:latest
    depends_on:
      - mysql
      - zookeeper
    ports:
      - "12345:12345"
    environment:
      # Mapping form, so the quotes are not passed on as part of the value.
      DOLPHINSCHEDULER_OPTS: "-Xms512m -Xmx512m"
  dolphinscheduler-worker:
    image: apache/dolphinscheduler:latest
    depends_on:
      - dolphinscheduler-master
    environment:
      DOLPHINSCHEDULER_OPTS: "-Xms512m -Xmx512m"

In this configuration file:

  • zookeeper: Responsible for cluster coordination and service discovery.
  • mysql: Stores the metadata for DolphinScheduler.
  • dolphinscheduler-master: The master node responsible for scheduling and managing tasks.
  • dolphinscheduler-worker: The worker node responsible for executing tasks.
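
After editing the file, it can help to confirm that Compose still sees all four services. docker-compose config --services prints the parsed service names, one per line; the sketch below compares that list with the names you expect. The helper name missing_services is hypothetical.

```shell
# Hypothetical helper: read service names on stdin (one per line) and print
# every expected name, given as an argument, that is absent from the input.
# Example call:
#   docker-compose config --services | \
#     missing_services zookeeper mysql dolphinscheduler-master dolphinscheduler-worker
missing_services() {
  actual_services=$(cat)
  for expected_service in "$@"; do
    printf '%s\n' "$actual_services" | grep -qxF -- "$expected_service" \
      || echo "$expected_service"
  done
}
```

Empty output means every expected service is defined. If the YAML itself is malformed, docker-compose config fails first and reports the parse error.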

4. Starting DolphinScheduler

Once the docker-compose.yml file is configured correctly, you can start DolphinScheduler using Docker Compose:

docker-compose up -d

This command will start all the services defined in the docker-compose.yml file in the background. You can check the status of the services with the following command:

docker-compose ps

If all services are listed as Up, DolphinScheduler has been successfully started.
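
For scripted setups, the status check can be reduced to a list of services that are not running. The sketch below assumes the tabular output of docker-compose ps, with a header row, an optional dashed separator, and a state text containing "Up" for running containers; the exact columns differ between Compose versions, so treat the parsing as an assumption. The helper name is hypothetical.

```shell
# Hypothetical helper: read `docker-compose ps` output on stdin and print the
# first column (the container name) of every row that does not contain "Up".
# Example call: docker-compose ps | not_up_services
not_up_services() {
  # Skip the header row, dashed separator lines, and blank lines.
  awk '
    NR == 1 { next }
    /^-+$/  { next }
    NF == 0 { next }
    !/Up/   { print $1 }
  '
}
```

Empty output suggests that every listed container is up. Some Compose versions hide stopped containers unless ps is given the -a flag, so a missing service is also worth checking by name.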

5. Configuring DolphinScheduler

5.1 Initial Setup

Once the services are running, open DolphinScheduler's web UI in a browser. With the port mapping shown earlier, the default address is:

http://localhost:12345

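At the login screen, sign in with the default administrator account. The username is admin; the official documentation lists the default password as dolphinscheduler123, so try that if admin is rejected. If the root URL does not show a login page, the UI may be served under a sub-path such as /dolphinscheduler/ui, depending on the release. After logging in, change the default password.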

5.2 Creating Projects and Tasks

In the web UI, you can create projects and define tasks. DolphinScheduler supports various task types such as Shell, Python, and SQL. You can create workflows by dragging and dropping tasks and setting dependencies between them.

5.3 System Monitoring and Log Management

DolphinScheduler offers rich monitoring and logging features. Users can view task execution statuses, monitor cluster health in real-time, and access detailed execution logs, which help debug and optimize workflows.

6. Common Issues and Solutions

During usage, you may encounter some issues. Below are common problems and their solutions.

6.1 Service Startup Failures

If a service fails to start, you can check the logs to diagnose the issue using the following command:

docker-compose logs

To limit the output to a single service, append its name, for example:

docker-compose logs dolphinscheduler-master

The log information can help identify errors such as database connection failures or port conflicts.
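
Container logs can be long, so filtering them for likely failure messages is often a quicker first pass. The sketch below is one way to do that; the keyword list is an assumption chosen to match the errors mentioned above, and the helper name is hypothetical.

```shell
# Hypothetical helper: keep only log lines that look like failures.
# The keyword list is a guess; adjust it to the messages you actually see.
# Example call:
#   docker-compose logs --no-color --tail=200 dolphinscheduler-master | filter_errors
filter_errors() {
  # `|| true` keeps the exit status at 0 when no line matches.
  grep -Ei 'error|exception|refused|address already in use' || true
}
```

A "connection refused" line typically points to a dependency that was not reachable yet, while "address already in use" points to a host port conflict.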

6.2 Database Connection Issues

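If the logs show database connection failures during startup, MySQL may not have been ready when DolphinScheduler started: depends_on controls only the order in which containers start and does not wait for the database to accept connections. In that case, restart the DolphinScheduler services manually: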

docker-compose restart dolphinscheduler-master dolphinscheduler-worker
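
Instead of guessing when MySQL is ready, a script can poll it and restart the DolphinScheduler services only once it responds. The following is a minimal sketch of a generic retry loop; retry_until is a hypothetical helper, and the mysqladmin call in the comment assumes the root password from the sample configuration above.

```shell
# Hypothetical helper: run a command up to N times, sleeping between tries.
# Usage: retry_until <attempts> <delay-seconds> <command> [args...]
# Example (assumes the sample compose file with root password "root"):
#   retry_until 30 2 docker-compose exec -T mysql \
#     mysqladmin ping -h 127.0.0.1 -uroot -proot --silent \
#     && docker-compose restart dolphinscheduler-master dolphinscheduler-worker
retry_until() {
  retry_attempts=$1
  retry_delay=$2
  shift 2
  retry_i=1
  while [ "$retry_i" -le "$retry_attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep only if another attempt follows.
    if [ "$retry_i" -lt "$retry_attempts" ]; then
      sleep "$retry_delay"
    fi
    retry_i=$((retry_i + 1))
  done
  return 1
}
```

The function returns 0 as soon as the command succeeds and 1 if every attempt fails, so the restart after && runs only when the database has answered.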

7. Advantages and Use Cases of DolphinScheduler

DolphinScheduler excels in big data processing and ETL task scheduling. Some key advantages include:

  • User-Friendly Interface: Its graphical interface makes task management and monitoring easy, lowering the barrier to entry.
  • Flexible Task Dependency Management: It allows defining complex task dependencies, making scheduling more efficient.
  • High Scalability and Availability: Its distributed architecture is suitable for large-scale data processing.

8. Conclusion

By following the above steps, you have successfully installed and configured DolphinScheduler using Docker Compose. Its powerful features and flexible configuration make it an ideal choice for distributed task scheduling. Whether for enterprise-level big data processing or small-to-medium-sized data integration projects, DolphinScheduler is a reliable solution.

If you encounter issues during usage, you can refer to DolphinScheduler's official documentation or community resources for more detailed technical support. With continued learning and exploration, you will be able to fully leverage DolphinScheduler's potential, significantly improving your workflow management.