Docker Swarm is Docker’s native orchestration tool. It makes service management easy. For example, it stops a container that has become unhealthy and starts a new container as a replacement. When we update a service, we can update its containers one by one, and if the new version doesn’t work as expected we can easily roll back the change because swarm stores the previous state. These features make swarm a good fit for a production environment.

This post is part of the Docker learning series.

  1. Start Docker from scratch
  2. Docker volume
  3. Bind host directory to Docker container for dev-env
  4. Communication with other Docker containers
  5. Run multi Docker containers with compose file
  6. Container’s dependency check and health check
  7. Override Docker compose file to have different environments
  8. Creating a cluster with Docker swarm and handling secrets
  9. Update and rollback without downtime in swarm mode
  10. Container optimization
  11. Visualizing log info with Fluentd, Elasticsearch and Kibana

Manager and Worker

There are two roles in swarm mode, Manager and Worker, and every machine in the cluster takes one of them. A Manager controls the cluster: it stores secrets and config files, monitors and schedules containers, and receives the Docker commands we send. Workers simply run containers and report their status back to the Managers. A machine is called a node in swarm mode. A node assigned as a manager also works as a worker by default. All nodes need to be on the same network because they have to communicate with each other. A cluster has a shared endpoint called ingress. A request arriving there is forwarded to one of the containers of the target service, which handles it.

Switch to Swarm mode

First of all, we need to switch to Swarm mode. All we need to do here is initialize a swarm.

docker swarm init
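
Running the init command on a node that is already part of a swarm fails, so a setup script may want to check the current state first. The following is a minimal sketch of that idea; the function names `swarm_state` and `ensure_swarm` are made up for this example.

```shell
# Print the swarm state of the local node, e.g. "active" or "inactive".
swarm_state() {
    docker info --format '{{.Swarm.LocalNodeState}}'
}

# Initialize a swarm only when this node is not already part of one.
ensure_swarm() {
    if [ "$(swarm_state)" = "active" ]; then
        echo "swarm mode is already active"
    else
        docker swarm init
    fi
}
```

After loading the functions, `ensure_swarm` can be called repeatedly without error. On a machine with several network interfaces, `docker swarm init` may additionally need the `--advertise-addr` option.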

How to join a cluster

Every machine that you want to use needs to join the cluster. Docker offers a simple way to do it. The steps are as follows.

  1. Get the join token for a worker or a manager
  2. Execute the printed command on the machine that should join
  3. Check that the machine joined the cluster

docker swarm join-token worker
docker swarm join-token manager
docker node ls

My result is as follows. It shows the commands that need to be executed on a machine to join the cluster. The commands below are for joining my cluster, so they won’t work for yours; use the tokens that your own manager prints.

$ docker swarm join-token worker
To add a worker to this swarm, run the following command:

    docker swarm join --token SWMTKN-1-5wx7bmxhnkcl9i71bcgqtdnomombtztf90nylke5cwsvpudbi1-chw5yv4lhi6hnfx40ate78d2a

$ docker swarm join-token manager
To add a manager to this swarm, run the following command:

    docker swarm join --token SWMTKN-1-5wx7bmxhnkcl9i71bcgqtdnomombtztf90nylke5cwsvpudbi1-0su1h62zv3dfniskexpv24f5e

$ docker node ls
ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
zwfh3t5x51nmlu0vgnyzn2j9q *   docker-desktop      Ready               Active              Leader              19.03.13
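
When the join step is scripted, the token alone is easier to handle than the full help text. The sketch below builds the join command from the token; the function name `print_join_command` and the manager address passed as the second argument are assumptions for illustration (2377 is the default swarm management port).

```shell
# Print the command a new node would run to join the swarm.
# $1: role ("worker" or "manager")
# $2: address of a manager node (hypothetical placeholder, e.g. 192.0.2.10:2377)
print_join_command() {
    role="$1"
    manager_addr="$2"
    # -q prints only the token instead of the full help text
    token="$(docker swarm join-token -q "$role")" || return 1
    echo "docker swarm join --token $token $manager_addr"
}
```

For example, `print_join_command worker 192.0.2.10:2377` run on a manager prints a line that can be pasted into the new machine.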

Create overlay network

Docker containers in the same cluster can communicate with each other even if they run on different machines. However, we need to create a Docker network for that. Docker has several network types, and the one we need for swarm mode is the overlay network. It creates a virtual network on top of the actual network, and all communication goes through the virtual network. Execute the following command to create the overlay network.

docker network create --driver overlay swarm-test-net

Let’s check the existing networks. Besides default networks such as bridge, host, ingress and none, the list shows that swarm-test-net was created correctly.

$ docker network ls
NETWORK ID          NAME                              DRIVER              SCOPE
f80fe395dc37        bridge                            bridge              local
148b9f06e730        host                              host                local
gf9kotnet4yr        ingress                           overlay             swarm
cdff6cc4ab16        log-test-nat                      bridge              local
9bdf43300154        none                              null                local
ejq0ijjq16ms        swarm-test-net                    overlay             swarm
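
Creating a network whose name already exists returns an error, so a repeatable setup script can check for the network first. This is only a sketch; the function name `ensure_overlay_network` is made up for this example.

```shell
# Create an overlay network unless a network with that name already exists.
# $1: network name
ensure_overlay_network() {
    name="$1"
    # `docker network inspect` exits non-zero when the network does not exist
    if docker network inspect "$name" >/dev/null 2>&1; then
        echo "network $name already exists"
    else
        docker network create --driver overlay "$name"
    fi
}
```

With the functions loaded, `ensure_overlay_network swarm-test-net` creates the network from this post on the first call and does nothing on later calls.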

Keep service running with Health Check

We are ready to use swarm! Let’s start containers with a health check and see whether Docker swarm replaces an unhealthy container with a new one to keep the service running. We could specify everything as command arguments, but that is harder than using a compose file. Compose files are available in swarm mode too, so we use one here. The compose file for this sample is as follows.

If you haven’t read the following post yet, read it first before going further. It explains the health check functionality.
Container’s dependency check and health check

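version: "3.7"

services:
  health-check-server:
    image: health-check-server:v2
    ports:
      - "8003:80"
    depends_on:
      - log-server
    healthcheck:
      interval: 10s
      timeout: 10s
      retries: 3
      start_period: 1m30s
    networks:
      - app-net
    deploy:
      replicas: 2
      resources:
        limits:
          cpus: "0.20"
          memory: 100M

  log-server:
    image: log-server
    ports:
      - "8001:80"
    networks:
      - app-net
    deploy:
      replicas: 2
      resources:
        limits:
          cpus: "0.20"
          memory: 100M

networks:
  app-net:
    external: true
    name: swarm-test-net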

The deploy option specifies the number of replicas and the resource limits. If the limits are too small for the container to start, Docker swarm stops it and starts a new container. For example, if we change the memory to 10M for health-check-server, it doesn’t work as expected. It took me an hour or two to notice that. Without the limits option, the containers consume as much CPU and memory as they can.

Let’s deploy the stack in swarm mode.

$ docker stack deploy -c health-check.yml health-check
Creating service health-check_health-check-server
Creating service health-check_log-server

Then, let’s check whether the services are running correctly. The number of replicas of health-check-server can be 0/2 at first because log-server must be up before health-check-server starts. If health-check-server starts up first, it exits immediately because the health check returns exit code 1. Docker swarm doesn’t control the startup order of containers. The following result looks good. The published port for health-check-server is 8003 even though 2 replicas are running, because Docker swarm has a public endpoint called ingress that receives the requests and balances them across the replicas.

$ docker stack ls
NAME                SERVICES            ORCHESTRATOR
health-check        2                   Swarm

$ docker service ls
ID                  NAME                               MODE                REPLICAS            IMAGE                    PORTS
5tzyu5xvjblp        health-check_health-check-server   replicated          2/2                 health-check-server:v2   *:8003->80/tcp
k8w81etyykuo        health-check_log-server            replicated          2/2                 log-server:latest        *:8001->80/tcp
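
Because the replica count can stay at 0/2 for a while, a deployment script may want to wait until a service is fully up. The sketch below polls the REPLICAS column; the function names, the 5 second pause and the attempt counter are assumptions for this example.

```shell
# Succeed when a REPLICAS value such as "2/2" shows running == desired.
replicas_ready() {
    case "$1" in
        */*) ;;
        *) return 1 ;;
    esac
    running="${1%%/*}"
    desired="${1#*/}"
    desired="${desired%% *}"   # drop a suffix such as " (max 1 per node)"
    [ "$desired" -gt 0 ] && [ "$running" -eq "$desired" ]
}

# Poll one service until it is fully up or the attempts run out.
# $1: service name, $2: number of attempts (5 seconds apart)
wait_for_service() {
    attempts="$2"
    while [ "$attempts" -gt 0 ]; do
        # the name filter matches by prefix, so only the first hit is used
        replicas="$(docker service ls --filter "name=$1" --format '{{.Replicas}}' | head -n 1)"
        if replicas_ready "$replicas"; then
            echo "$1 is ready ($replicas)"
            return 0
        fi
        attempts=$((attempts - 1))
        sleep 5
    done
    echo "$1 is not ready ($replicas)" >&2
    return 1
}
```

For the stack above, `wait_for_service health-check_health-check-server 30` would wait up to about two and a half minutes.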

The current status of the containers is as follows.

$ docker ps
CONTAINER ID        IMAGE                    COMMAND                  CREATED             STATUS                   PORTS               NAMES
5088fff5c1b4        health-check-server:v2   "docker-entrypoint.s…"   6 minutes ago       Up 6 minutes (healthy)   80/tcp              health-check_health-check-server.1.mrh8zoa485n8dntgx3vkrjp6h
7845b5c84511        health-check-server:v2   "docker-entrypoint.s…"   6 minutes ago       Up 6 minutes (healthy)   80/tcp              health-check_health-check-server.2.2f7hv1guerjuumq3rey2wxe8n
351e0fc4cff4        log-server:latest        "docker-entrypoint.s…"   6 minutes ago       Up 6 minutes             80/tcp              health-check_log-server.1.mhrtuxcw4c40w4v0zysoowbni
4c8dd806e9c3        log-server:latest        "docker-entrypoint.s…"   6 minutes ago       Up 6 minutes             80/tcp              health-check_log-server.2.onqfz85hj8ar4d8q2y395jdp9

Let’s check whether Docker swarm stops an unhealthy container and starts a new one. health-check-server becomes unhealthy once we browse to http://localhost:8003/hello/boss. Open the URL in a browser and wait for about 30-40 seconds: the health check runs every 10 seconds, and the container turns unhealthy after 3 failures in a row.

$ docker ps
CONTAINER ID        IMAGE                    COMMAND                  CREATED             STATUS                     PORTS               NAMES
5088fff5c1b4        health-check-server:v2   "docker-entrypoint.s…"   9 minutes ago       Up 9 minutes (unhealthy)   80/tcp              health-check_health-check-server.1.mrh8zoa485n8dntgx3vkrjp6h
7845b5c84511        health-check-server:v2   "docker-entrypoint.s…"   9 minutes ago       Up 9 minutes (healthy)     80/tcp              health-check_health-check-server.2.2f7hv1guerjuumq3rey2wxe8n
351e0fc4cff4        log-server:latest        "docker-entrypoint.s…"   9 minutes ago       Up 9 minutes               80/tcp              health-check_log-server.1.mhrtuxcw4c40w4v0zysoowbni
4c8dd806e9c3        log-server:latest        "docker-entrypoint.s…"   9 minutes ago       Up 9 minutes               80/tcp              health-check_log-server.2.onqfz85hj8ar4d8q2y395jdp9
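
The waiting time can be estimated from the health check settings. The sketch below is a rough estimate, assuming that each check starts `interval` seconds after the previous one finished; the function names are made up for this example, and the second function reads the health status that `docker ps` shows in parentheses.

```shell
# Rough time in seconds until a broken container is marked unhealthy.
# $1: interval, $2: how long one failing check takes, $3: retries
seconds_until_unhealthy() {
    interval="$1"
    check_duration="$2"
    retries="$3"
    echo $(( retries * (interval + check_duration) ))
}

# Print "starting", "healthy" or "unhealthy" for one container.
# $1: container ID or name
health_status() {
    docker inspect --format '{{.State.Health.Status}}' "$1"
}
```

With the values from the compose file, `seconds_until_unhealthy 10 0 3` gives 30 for checks that fail immediately, and `seconds_until_unhealthy 10 10 3` gives 60 when every check runs into the 10 second timeout.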

After a while, Docker swarm starts a new container automatically. Great.

$ docker ps
CONTAINER ID        IMAGE                    COMMAND                  CREATED             STATUS                            PORTS               NAMES
356b2b2f1b61        health-check-server:v2   "docker-entrypoint.s…"   11 seconds ago      Up 9 seconds (health: starting)   80/tcp              health-check_health-check-server.1.48pzrgta1h72nt8rmcm80c8kx
7845b5c84511        health-check-server:v2   "docker-entrypoint.s…"   10 minutes ago      Up 10 minutes (healthy)           80/tcp              health-check_health-check-server.2.2f7hv1guerjuumq3rey2wxe8n
351e0fc4cff4        log-server:latest        "docker-entrypoint.s…"   10 minutes ago      Up 10 minutes                     80/tcp              health-check_log-server.1.mhrtuxcw4c40w4v0zysoowbni
4c8dd806e9c3        log-server:latest        "docker-entrypoint.s…"   10 minutes ago      Up 10 minutes                     80/tcp              health-check_log-server.2.onqfz85hj8ar4d8q2y395jdp9

Secrets info and configs

We can copy config files into an image in the Dockerfile, but we may not have the production config file during development because it may be created by another team. In this case we create default config files and use them for development and testing. When our work is finished, we don’t want to build the image again; we want to pass the production config files when the image is deployed on a server. A compose file offers a way to pass config and secrets files. Configs and secrets are basically the same; the difference is whether the contents are encrypted or not. A secret is encrypted, stored in the managers’ database, and sent only to the containers that require it. There, it is decrypted and made available to the container.

How to create secrets

Secrets can be created either in advance or when a swarm service is started. To create a secret in advance, we can use the following command.

$ echo happy-birthday | docker secret create test-secret -

$ docker secret ls
ID                          NAME                        DRIVER              CREATED             UPDATED
ub3h355y2lgzyud57plw8kepj   test-secret                                     5 seconds ago       5 seconds ago

$ docker secret inspect --pretty test-secret
ID:              ub3h355y2lgzyud57plw8kepj
Name:              test-secret
Created at:        2020-11-21 13:21:16.7354076 +0000 utc
Updated at:        2020-11-21 13:21:16.7354076 +0000 utc

This creates the secret from stdin, which is what the - argument at the end is for. If you want to create a secret from a file, you can run the following command.

docker secret create <secret name> <file name>

The original content is “happy-birthday”, and it was encrypted. We can see that the secret was added to the secret list, but the result of the third command doesn’t show the original content because it is decrypted only in a container that requires it.
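
Inside a container of a service that was granted the secret, the decrypted content is available as a file. By default it is mounted under /run/secrets/<secret name>. The sketch below reads such a file; the function name `read_secret` and the optional directory argument are assumptions for this example.

```shell
# Print the content of a secret file.
# $1: secret name
# $2: directory holding the secret files (defaults to /run/secrets)
read_secret() {
    secret_file="${2:-/run/secrets}/$1"
    if [ -r "$secret_file" ]; then
        cat "$secret_file"
    else
        echo "secret $1 is not available" >&2
        return 1
    fi
}
```

In a container that has access to test-secret, `read_secret test-secret` would print happy-birthday. Outside such a container the file does not exist and the function fails.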

How to create config

A config can be created in the same way as a secret. We just replace the keyword secret with config.

$ echo {date: "20201120", time:"15:00:00"} | docker config create test-config -

$ docker config ls
ID                          NAME                CREATED             UPDATED
xsoh92qlelztpqiycetu7ntfs   test-config         7 seconds ago       7 seconds ago

$ docker config inspect --pretty test-config
ID:                     xsoh92qlelztpqiycetu7ntfs
Name:                   test-config
Created at:             2020-11-21 14:08:41.0661721 +0000 utc
Updated at:             2020-11-21 14:08:41.0661721 +0000 utc
{date: 20201120, time:15:00:00}

Unlike the secret, the result of the third command has a Data section at the bottom that shows the content.
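
Since a config is not encrypted, its content can also be extracted in a script. In the JSON output of the inspect command the data is base64 encoded, so the sketch below decodes it. The function name `show_config_data` is made up for this example, and the `-d` option of base64 assumes a GNU or BusyBox environment.

```shell
# Print the decoded content of a config.
# $1: config name
show_config_data() {
    # {{json .Spec.Data}} prints the base64 string wrapped in double quotes
    docker config inspect --format '{{json .Spec.Data}}' "$1" | tr -d '"' | base64 -d
}
```

`show_config_data test-config` should print the same line as the Data section. The same approach does not work for a secret because its data is never returned by the inspect command.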

Run containers with secrets

Let’s try to run multiple containers with secrets. This is the compose file for it.

version: "3.7"

x-networks: &app-net
  networks:
    - app-net

x-deploy: &deploy
  deploy:
    replicas: 2
    resources:
      limits:
        cpus: "0.20"
        memory: 100M

services:
  show-env:
    image: show-env
    secrets:
      - source: test-secrets
        target: /src/config/secrets.json

  health-check-server:
    image: health-check-server:v2
    ports:
      - "8003:80"
    depends_on:
      - log-server
    healthcheck:
      interval: 10s
      timeout: 10s
      retries: 2
      start_period: 1m30s
    <<: [*app-net, *deploy]

  log-server:
    image: log-server
    ports:
      - "8001:80"
    <<: [*app-net, *deploy]

  restify-server:
    image: restify-server
    depends_on:
      - log-server
    <<: [*app-net, *deploy]

networks:
  app-net:
    external: true
    name: swarm-test-net

secrets:
  test-secrets:
    file: ./config/secrets.json

I defined the app-net and deploy anchors, which can be reused like variables, to keep the file DRY. The same contents are applied to health-check-server, log-server and restify-server.

$ docker stack deploy -c ./config-secrets.yml secrets-test
Creating secret secrets-test_test-secrets
Creating service secrets-test_restify-server
Creating service secrets-test_show-env
Creating service secrets-test_health-check-server
Creating service secrets-test_log-server

$ docker stack ls
NAME                SERVICES            ORCHESTRATOR
secrets-test        4                   Swarm

$ docker service ls
ID                  NAME                               MODE                REPLICAS            IMAGE                    PORTS
t9yrqrdl1wmv        secrets-test_health-check-server   replicated          2/2                 health-check-server:v2   *:8003->80/tcp
lg29zfcosrdz        secrets-test_log-server            replicated          2/2                 log-server:latest        *:8001->80/tcp
rlfig4mkw4am        secrets-test_restify-server        replicated          2/2                 restify-server:latest
pzk3noukflmr        secrets-test_show-env              replicated          0/1                 show-env:latest

$ docker secret ls
ID                          NAME                        DRIVER              CREATED             UPDATED
lrl0x8rannot0fw0jlop5nl9r   secrets-test_test-secrets                       2 minutes ago       2 minutes ago
ub3h355y2lgzyud57plw8kepj   test-secret                                     22 minutes ago      22 minutes ago

The replica count for show-env is 0/1 because the container just outputs some info to the console and exits; it has no event loop. Docker swarm therefore starts a new container again and again to keep the service running. As a result, the log output of several containers is interleaved and hard to read.

$ docker service logs secrets-test_show-env
secrets-test_show-env.1.61grjjjdgpdw@docker-desktop    | === START ===
secrets-test_show-env.1.u5r95non0zub@docker-desktop    | === START ===
secrets-test_show-env.1.61grjjjdgpdw@docker-desktop    | Running for undefined
secrets-test_show-env.1.u5r95non0zub@docker-desktop    | Running for undefined
secrets-test_show-env.1.61grjjjdgpdw@docker-desktop    | FOO : undefined
secrets-test_show-env.1.u5r95non0zub@docker-desktop    | FOO : undefined
secrets-test_show-env.1.61grjjjdgpdw@docker-desktop    | HOGE: undefined
secrets-test_show-env.1.u5r95non0zub@docker-desktop    | HOGE: undefined
secrets-test_show-env.1.d1114vxiwvjv@docker-desktop    | === START ===
secrets-test_show-env.1.61grjjjdgpdw@docker-desktop    | ----secrets----
secrets-test_show-env.1.61grjjjdgpdw@docker-desktop    | user: production-swarm-user
secrets-test_show-env.1.u5r95non0zub@docker-desktop    | ----secrets----
secrets-test_show-env.1.d1114vxiwvjv@docker-desktop    | Running for undefined
secrets-test_show-env.1.u5r95non0zub@docker-desktop    | user: production-swarm-user
secrets-test_show-env.1.d1114vxiwvjv@docker-desktop    | FOO : undefined
secrets-test_show-env.1.u5r95non0zub@docker-desktop    | pass: production-swarm-pass
secrets-test_show-env.1.d1114vxiwvjv@docker-desktop    | HOGE: undefined
secrets-test_show-env.1.d1114vxiwvjv@docker-desktop    | ----secrets----
secrets-test_show-env.1.d1114vxiwvjv@docker-desktop    | user: production-swarm-user
secrets-test_show-env.1.d1114vxiwvjv@docker-desktop    | pass: production-swarm-pass
secrets-test_show-env.1.d1114vxiwvjv@docker-desktop    | === END ===
secrets-test_show-env.1.h19bpv94yau5@docker-desktop    | === START ===
secrets-test_show-env.1.61grjjjdgpdw@docker-desktop    | pass: production-swarm-pass
secrets-test_show-env.1.h19bpv94yau5@docker-desktop    | Running for undefined
secrets-test_show-env.1.61grjjjdgpdw@docker-desktop    | === END ===
secrets-test_show-env.1.h19bpv94yau5@docker-desktop    | FOO : undefined
secrets-test_show-env.1.u5r95non0zub@docker-desktop    | === END ===
secrets-test_show-env.1.h19bpv94yau5@docker-desktop    | HOGE: undefined
secrets-test_show-env.1.h19bpv94yau5@docker-desktop    | ----secrets----
secrets-test_show-env.1.h19bpv94yau5@docker-desktop    | user: production-swarm-user
secrets-test_show-env.1.h19bpv94yau5@docker-desktop    | pass: production-swarm-pass
secrets-test_show-env.1.h19bpv94yau5@docker-desktop    | === END ===

Filtering the output with grep, as below, makes the log easier to read. The env variables are undefined because I didn’t define an env_file section, but the user and password from the secret can be read in the container as expected.

$ docker service logs secrets-test_show-env 2>&1 | grep 6
secrets-test_show-env.1.krsps60pezt9@docker-desktop    | === START ===
secrets-test_show-env.1.krsps60pezt9@docker-desktop    | Running for undefined
secrets-test_show-env.1.krsps60pezt9@docker-desktop    | FOO : undefined
secrets-test_show-env.1.krsps60pezt9@docker-desktop    | HOGE: undefined
secrets-test_show-env.1.krsps60pezt9@docker-desktop    | ----secrets----
secrets-test_show-env.1.krsps60pezt9@docker-desktop    | user: production-swarm-user
secrets-test_show-env.1.krsps60pezt9@docker-desktop    | pass: production-swarm-pass
secrets-test_show-env.1.krsps60pezt9@docker-desktop    | === END ===
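
A filter on the task ID is more precise than a filter on a single character. Each log line starts with a prefix of the form <service>.<slot>.<task id>@<node>, so the lines of one task can be selected by that ID. The following is a sketch; both function names are made up for this example and both read the log from stdin.

```shell
# Keep only the log lines of one task.
# $1: task ID as shown in the log prefix
filter_task_logs() {
    grep -F ".$1@"
}

# List the distinct task IDs that appear in the log prefixes.
list_task_ids() {
    sed -n 's/^[^ ]*\.\([a-z0-9]*\)@.*/\1/p' | sort -u
}
```

For example, `docker service logs secrets-test_show-env 2>&1 | list_task_ids` shows which tasks wrote to the log, and piping the same output into `filter_task_logs krsps60pezt9` keeps only the lines of that task.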

When the stack is removed, the secret that was created with it is removed as well. The secret created by hand, test-secret, remains.

$ docker stack rm secrets-test
Removing service secrets-test_health-check-server
Removing service secrets-test_log-server
Removing service secrets-test_restify-server
Removing service secrets-test_show-env
Removing secret secrets-test_test-secrets
Removing network secrets-test_default

$ docker secret ls
ID                          NAME                DRIVER              CREATED             UPDATED
ub3h355y2lgzyud57plw8kepj   test-secret                             44 minutes ago      44 minutes ago


Creating a cluster with Docker swarm was easy. The steps were as follows.

  1. Initialize a swarm
  2. Join a cluster
  3. Create a network/config/secrets
  4. Specify network/config/secrets in compose file
  5. Start services

Docker swarm starts a new container as a replacement when one of them becomes unhealthy or exits. It keeps a service running, which is very nice for a production environment. However, it also means that we may not notice an error because the services keep running. It’s therefore necessary to either

  • have a function to notify the error to an administrator of the system
  • check the logs regularly
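
One possible starting point for such a check is the task history of a service, which lists the tasks that swarm has already shut down. The sketch below is an assumption about how this could be scripted, with made-up function names. Note that the list also contains tasks that were stopped by a regular update or a scale-down, not only failed ones, and that the notification itself is not shown.

```shell
# Print the tasks of a service that swarm has shut down,
# together with their last state and error message.
# $1: service name
replaced_tasks() {
    docker service ps "$1" \
        --filter "desired-state=shutdown" \
        --format '{{.Name}} | {{.CurrentState}} | {{.Error}}'
}

# Return 1 when at least one task was replaced, so that the result
# can drive a cron job or any notification hook.
check_service() {
    replaced="$(replaced_tasks "$1")"
    if [ -n "$replaced" ]; then
        echo "service $1 has replaced tasks:"
        echo "$replaced"
        return 1
    fi
    echo "service $1 has no replaced tasks"
}
```

A scheduled job could call `check_service health-check_health-check-server` and send its output to an administrator whenever the exit status is not 0.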


