Keepalived active/standby

On the master site (IP 10.7.168.221), the configuration file is /etc/keepalived/keepalived.conf. All config files in /etc/keepalived can be found here.

root@n4:/etc/keepalived# pwd
/etc/keepalived
root@n4:/etc/keepalived# more keepalived.conf 
vrrp_script chk_service_health {
    #script /etc/keepalived/check_proc.sh
    #script /etc/keepalived/check_ping.sh
    weight 10
    # How often the script runs (every 5 seconds).
    interval 5
    # How long to wait for the script to return (5 seconds).
    timeout 5
    # How many times the script must return successfully in order for the host to be considered healthy.
    rise 3
    # How many times the script must return unsuccessfully (or time out) in order for the host to be considered unhealthy
    fall 3
}

vrrp_instance VI_1 {
        #state BACKUP
        state MASTER
        interface ens3
        virtual_router_id 51
        priority 201
        advert_int 10
        authentication {
              auth_type PASS
              auth_pass 1234
        }
        unicast_src_ip 10.7.168.221
        unicast_peer {
             10.7.168.58
        }
        virtual_ipaddress {
             10.7.168.200/24
        }
        track_script {
            chk_service_health
        }
        notify /etc/keepalived/keepalived_notify.sh
        notify_master /etc/keepalived/keepalived_notify_master.sh
        notify_backup /etc/keepalived/keepalived_notify_backup.sh
}
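
The interaction between priority, weight, and the rise/fall counters can be worked through with a little shell arithmetic. This is a minimal sketch, assuming keepalived's documented behaviour that a positive weight is added to the priority while the tracked script is succeeding. The function names are hypothetical; only the numbers come from the config above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the values used below are taken from the config above.

# Effective VRRP priority for a tracked script with a positive weight:
# the weight is added only while the script is considered healthy.
effective_priority() {
    local base=$1 weight=$2 healthy=$3   # healthy: 1 = script OK, 0 = failing
    if [ "$healthy" -eq 1 ]; then
        echo $(( base + weight ))
    else
        echo "$base"
    fi
}

# Rough upper bound, in seconds, on how long it takes to flip state:
# `count` is rise or fall, `interval` is the script interval.
approx_seconds_to_flip() {
    local count=$1 interval=$2
    echo $(( count * interval ))
}

effective_priority 201 10 1    # script healthy
effective_priority 201 10 0    # script failing
approx_seconds_to_flip 3 5     # fall 3, interval 5
```

With the values above, a healthy check gives priority 211, a failing check leaves it at 201, and roughly 15 seconds of consecutive failures are needed before the host is marked unhealthy.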

In the user directory /home/ubuntu, see the files check_containner.sh, notifyFile_3.sh, and stop_notifyFile_3.sh, which are used to start and stop Docker. To sync with unison, place the file default.prf in the user's unison folder, /home/ubuntu/.unison (that is, /home/ubuntu/.unison/default.prf). Note that unison authenticates over SSH, so generate a key pair with ssh-keygen and add the public key to the user's /home/ubuntu/.ssh/authorized_keys.

ubuntu@n4:~$ tree -L 2
├── check_containner.sh
├── default.prf
├── notifyFile_3.sh
├── nodedb1_start.sh
├── omd-labs-docker
│   ├── Dockerfile.omd-labs-debian
│   ├── Makefile.omd-labs-debian
│   ├── README.md
│   ├── ansible_dropin
│   ├── hooks
│   ├── run.sh
│   ├── scripts
│   └── site
└── stop_notifyFile_3.sh
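
As an illustration of what a unison profile for this setup might contain, the sketch below writes a minimal default.prf. The synced directory /home/ubuntu/data and the write_unison_profile helper are hypothetical; the real profile is the default.prf listed above. The root, auto, and batch preferences are standard unison options.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a minimal unison profile.
#   $1 = profile directory (e.g. /home/ubuntu/.unison)
#   $2 = local root, $3 = remote root (ssh:// URI)
write_unison_profile() {
    local dir=$1 local_root=$2 remote_root=$3
    mkdir -p "$dir"
    cat > "$dir/default.prf" <<EOF
# Two roots to keep in sync
root = $local_root
root = $remote_root
# Accept non-conflicting changes and never prompt (needed when run from a script)
auto = true
batch = true
EOF
}

# Demo in a temporary directory so nothing under the home directory is touched.
# The data path is an assumption made for this example only.
demo_dir=$(mktemp -d)
write_unison_profile "$demo_dir" /home/ubuntu/data ssh://ubuntu@10.7.168.58//home/ubuntu/data
cat "$demo_dir/default.prf"
```

The double slash after the host in the ssh:// root marks an absolute path on the remote side.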

After the node wakes up, we need to start the database Docker container and rejoin MySQL Group Replication. The docker exec command uses only -i, not -it: -t allocates a TTY, and a script launched by keepalived does not run from a TTY, so only -i (interactive mode) can be used (ref here). These steps are in nodedb1_start.sh, which keepalived_notify_master.sh should call.

#!/usr/bin/bash

logfile=/tmp/nodedb1.log
container_name=nodedb1
/usr/bin/docker start $container_name
echo "docker start $container_name" >> $logfile
sleep 10
if [ "$( docker container inspect -f '{{.State.Status}}' $container_name )" == "running" ]; then 
   echo "$container_name is running" >> $logfile
else
   echo "$container_name is not running, we will start again" >> $logfile
   /usr/bin/docker start $container_name
   echo "start $container_name Done!!" >> $logfile
fi

if [ "$( docker container inspect -f '{{.State.Status}}' $container_name )" == "running" ]; then
   cmd1="/usr/bin/docker exec -i $container_name mysql -uroot -pmypass -e \"STOP GROUP_REPLICATION;\" \
                      -e \"SET @@GLOBAL.group_replication_group_name='aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee';\" \
                      -e \"SET @@GLOBAL.group_replication_local_address='$container_name:33061';\"\
                      -e \"SET @@GLOBAL.group_replication_group_seeds='$container_name:33061,nodedb2:33061,nodedb3:33061';\" \
                      -e \"change master to master_user='repl' for channel 'group_replication_recovery';\" \
                      -e \"START GROUP_REPLICATION;\"  " 
   echo "$cmd1" >> $logfile
   eval "$cmd1" 2> /tmp/err.log 1>> $logfile
   sleep 10
   echo "$container_name join group_replication" >> $logfile
   cmd2="/usr/bin/docker exec -i $container_name mysql -uroot -pmypass -e \"SELECT * FROM performance_schema.replication_group_members;\" "
   echo "$cmd2" >> $logfile
   # "$cmd2" must be double-quoted: unquoted, the shell would glob-expand the * in SELECT * before eval runs.
   # eval $cmd2 does not work; we must use
   # eval "$cmd2"
   eval "$cmd2" 2> /tmp/err.log 1>> $logfile
else
   echo "$container_name still not start" >> $logfile 
fi
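
For the generic notify hook, keepalived passes four arguments: the object type (GROUP or INSTANCE), its name, the target state, and the priority. A minimal sketch of a dispatcher follows. The action_for_state and handle_notify functions, and the mapping of states to nodedb1_start.sh and stop_notifyFile_3.sh, are assumptions for illustration, not the contents of the actual notify scripts.

```shell
#!/usr/bin/env bash
# Hypothetical dispatcher for a keepalived notify script.
# keepalived calls the generic notify script as:
#   <script> <GROUP|INSTANCE> <name> <state> <priority>

# Map a target state to the script that should run (assumed mapping).
action_for_state() {
    case "$1" in
        MASTER)            echo "/home/ubuntu/nodedb1_start.sh" ;;
        BACKUP|FAULT|STOP) echo "/home/ubuntu/stop_notifyFile_3.sh" ;;
        *)                 echo "" ;;
    esac
}

# Log the transition and report which action would be taken.
handle_notify() {
    local type=$1 name=$2 state=$3 priority=$4
    local action
    action=$(action_for_state "$state")
    echo "$(date '+%F %T') $type $name -> $state (priority $priority), action: ${action:-none}"
    # In a real hook the action would be executed here, e.g.:
    #   [ -n "$action" ] && "$action"
}

# Example invocation with the arguments keepalived would pass.
handle_notify INSTANCE VI_1 MASTER 211
```

Keeping the state-to-action mapping in one function makes it easy to log every transition before anything is started or stopped.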

For the standby site (backup site, IP 10.7.168.58), the config can be found here.


Keepalived backup site: checking DB Group Replication

On the backup site, we need to check the master node (IP_MASTER), which holds the virtual IP "192.168.81.7", using its own IP address. We also need to check that DB Group Replication has started and has ONLINE status. If it is not ONLINE, we switch "192.168.81.7" to the backup site, since sometimes the DB Docker container starts but the DB does not join Group Replication.

#!/usr/bin/bash
logfile=/tmp/keepalived.log
echo "Time: $(date). keepalive log -------" >> $logfile 

IP_MASTER="192.168.81.12"
STR_CHECK="ONLINE"
DB_NODE="nodedb1"

#/usr/bin/unison > /dev/null 2>&1
/bin/ping -c 2 -W 1 "$IP_MASTER" > /dev/null 2>&1
status=$?
echo 'ping ' $status
if [ $status -eq 0 ]
then
   all_result=`/usr/bin/docker exec -i $DB_NODE mysql -uroot -pmypass -e 'SELECT * FROM performance_schema.replication_group_members;'`
   online=$(echo $all_result | grep "$IP_MASTER" | awk '{ print $12}' )
   # The * wildcards in *"$STR_CHECK"* match any prefix or suffix around the $STR_CHECK word.
   if [[ "$online" == *"$STR_CHECK"* ]]
   then
       echo "Master node OK" >> $logfile
       exit 1 # All good. DB master reachable
   else
       echo "Master node DB Fail" >> $logfile
       exit 0 # Failover trigger
   fi
else
   echo "Master node Ping Fail" >> $logfile
   exit 0 # Failover trigger
fi
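
Picking the state out by field position ($12) depends on how many columns the query header has. As a hedged alternative, the sketch below selects the row by MEMBER_HOST from tab-separated output, assuming MEMBER_HOST and MEMBER_STATE are the third and fifth columns of performance_schema.replication_group_members. The member_state function and the sample rows are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print MEMBER_STATE for a given MEMBER_HOST.
# Reads tab-separated rows on stdin, as produced by `mysql -N -B -e ...`
# (-N skips the header row, -B selects batch/tab-separated output).
member_state() {
    local host=$1
    awk -F'\t' -v h="$host" '$3 == h { print $5 }'
}

# Made-up sample rows standing in for the real query output.
sample=$(printf '%s\t%s\t%s\t%s\t%s\n' \
    group_replication_applier id-1 192.168.81.12 3306 ONLINE \
    group_replication_applier id-2 192.168.81.13 3306 RECOVERING)

state=$(printf '%s\n' "$sample" | member_state 192.168.81.12)
if [ "$state" = "ONLINE" ]; then
    echo "Master node OK"
else
    echo "Master node DB Fail"
fi
```

In the real check, the input would come from the docker exec -i "$DB_NODE" mysql command with -N -B added. Note also that keepalived treats exit status 0 as success, so the inverted exit codes in the script above appear intended to raise the backup's priority by weight only when the master is unhealthy.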

Last modified on 09/20/22 00:23:16
