= Keepalived active standby =

On the master site (IP 10.7.168.221), the configuration file is '''/etc/keepalived/keepalived.conf'''. The full contents of '''/etc/keepalived''' can be found [attachment:keepalived_master.tar here].

{{{
root@n4:/etc/keepalived# pwd
/etc/keepalived
root@n4:/etc/keepalived# more keepalived.conf
vrrp_script chk_service_health {
    #script /etc/keepalived/check_proc.sh
    #script /etc/keepalived/check_ping.sh
    weight 10
    # How often the script runs (every 5 seconds).
    interval 5
    # How long to wait for the script to return (5 seconds).
    timeout 5
    # How many times the script must return successfully in order for the host to be considered healthy.
    rise 3
    # How many times the script must return unsuccessfully (or time out) in order for the host to be considered unhealthy.
    fall 3
}

vrrp_instance VI_1 {
    #state BACKUP
    state MASTER
    interface ens3
    virtual_router_id 51
    priority 201
    advert_int 10
    authentication {
        auth_type PASS
        auth_pass 1234
    }
    unicast_src_ip 10.7.168.221
    unicast_peer {
        10.7.168.58
    }
    virtual_ipaddress {
        10.7.168.200/24
    }
    track_script {
        chk_service_health
    }
    notify /etc/keepalived/keepalived_notify.sh
    notify_master /etc/keepalived/keepalived_notify_master.sh
    notify_backup /etc/keepalived/keepalived_notify_backup.sh
}
}}}

In the user directory '''/home/ubuntu''', see the files [attachment:check_containner.sh check_containner.sh], [attachment:notifyFile_3.sh notifyFile_3.sh] and [attachment:stop_notifyFile_3.sh stop_notifyFile_3.sh], which are used to start and stop Docker. To sync with unison, place the file [attachment:default.prf default.prf] in the user's unison folder as '''/home/ubuntu/.unison/default.prf'''. Note that unison authenticates over SSH, so a key pair must be generated with ssh-keygen and the public key placed in the user's /home/ubuntu/.ssh/authorized_keys.
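As a rough sketch of the SSH key step for unison, the helper below appends a public key to an authorized_keys file only if it is not already there, and tightens the permissions that sshd normally expects. The function name `install_pubkey`, the ed25519 key type and the file names in the usage comments are assumptions for illustration, not taken from the attached files.

```shell
#!/usr/bin/bash
# Hypothetical helper (not part of the attachments): append a public key to an
# authorized_keys file once, creating the file and directory if needed.
#
# Assumed usage (key type and file names are illustrative):
#   on one site:    ssh-keygen -t ed25519 -N "" -f /home/ubuntu/.ssh/id_ed25519
#   on the peer:    install_pubkey id_ed25519.pub /home/ubuntu/.ssh/authorized_keys
install_pubkey() {
    local pubkey_file=$1 auth_file=$2
    local ssh_dir
    ssh_dir=$(dirname "$auth_file")

    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"
    touch "$auth_file"
    chmod 600 "$auth_file"

    # -F fixed string, -x whole-line match, -f read the pattern from the key file:
    # append only when the exact key line is not present yet.
    if ! grep -qxFf "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
}
```

Running the helper twice with the same key leaves a single entry, so it can be repeated safely when a node is rebuilt.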
{{{
ubuntu@n4:~$ tree -L 2
├── check_containner.sh
├── default.prf
├── notifyFile_3.sh
├── nodedb1_start.sh
├── omd-labs-docker
│   ├── Dockerfile.omd-labs-debian
│   ├── Makefile.omd-labs-debian
│   ├── README.md
│   ├── ansible_dropin
│   ├── hooks
│   ├── run.sh
│   ├── scripts
│   └── site
└── stop_notifyFile_3.sh
}}}

After the node wakes up, we need to start the database container and rejoin MySQL Group Replication. With '''docker exec''' we use only '''-i''', not '''-it''': '''-t''' allocates a TTY, and a script that is not started from a terminal has no TTY, so only '''-i''' (interactive mode) can be used (ref. [https://stackoverflow.com/questions/43099116/error-the-input-device-is-not-a-tty here]). The script is [attachment:nodedb1_start.sh nodedb1_start.sh], and keepalived_notify_master.sh should call it.

{{{
#!/usr/bin/bash
logfile=/tmp/nodedb1.log
container_name=nodedb1

/usr/bin/docker start $container_name
echo "docker start $container_name" >> $logfile
sleep 10

if [ "$( docker container inspect -f '{{.State.Status}}' $container_name )" == "running" ]; then
    echo "$container_name is running" >> $logfile
else
    echo "$container_name is not running, we will start again" >> $logfile
    /usr/bin/docker start $container_name
    echo "start $container_name Done!!" \
        >> $logfile
fi

if [ "$( docker container inspect -f '{{.State.Status}}' $container_name )" == "running" ]; then
    cmd1="/usr/bin/docker exec -i $container_name mysql -uroot -pmypass -e \"STOP GROUP_REPLICATION;\" \
    -e \"SET @@GLOBAL.group_replication_group_name='aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee';\" \
    -e \"SET @@GLOBAL.group_replication_local_address='$container_name:33061';\" \
    -e \"SET @@GLOBAL.group_replication_group_seeds='$container_name:33061,nodedb2:33061,nodedb3:33061';\" \
    -e \"change master to master_user='repl' for channel 'group_replication_recovery';\" \
    -e \"START GROUP_REPLICATION;\" "
    echo "$cmd1" >> $logfile
    eval "$cmd1" 2> /tmp/err.log 1>> $logfile
    sleep 10
    echo "$container_name join group_replication" >> $logfile
    cmd2="/usr/bin/docker exec -i $container_name mysql -uroot -pmypass -e \"SELECT * FROM performance_schema.replication_group_members;\" "
    echo "$cmd2" >> $logfile
    # "$cmd2" must be double-quoted: with an unquoted eval $cmd2 the shell
    # expands the * wildcard as a file glob before the command runs.
    # eval $cmd2 does not work; we must use eval "$cmd2".
    eval "$cmd2" 2> /tmp/err.log 1>> $logfile
else
    echo "$container_name still not start" >> $logfile
fi
}}}

For the standby site (backup site, IP 10.7.168.58), the configuration can be found [attachment:keepalived_backup.tar here].

------------------------------------------

== Keepalived Backup site check DB Group Replication ==

On the backup site, we need to check IP_MASTER, which holds the virtual IP "192.168.81.7" as well as its own IP. We also need to check that DB Group Replication has already started and has '''ONLINE''' status. If it is not '''ONLINE''', we switch "192.168.81.7" to the backup site. This is needed because the DB container sometimes starts while the DB has not joined Group Replication.

{{{#!sh
#!/usr/bin/bash
logfile=/tmp/keepalived.log
echo "Time: $(date). keepalive log -------" >> $logfile
IP_MASTER="192.168.81.12"
STR_CHECK="ONLINE"
DB_NODE="nodedb1"
#/usr/bin/unison > /dev/null 2>&1
/bin/ping -c 2 -W 1 "$IP_MASTER" > /dev/null 2>&1
status=$?
echo 'ping ' $status
if [ $status -eq 0 ]
then
    all_result=`/usr/bin/docker exec -i $DB_NODE mysql -uroot -pmypass -e 'SELECT * FROM performance_schema.replication_group_members;'`
    # The unquoted echo joins the result onto one line; field 12 is read as the member state.
    online=$(echo $all_result | grep "$IP_MASTER" | awk '{ print $12}' )
    # *"$STR_CHECK"* : the * wildcards match any prefix or suffix around $STR_CHECK.
    if [[ "$online" == *"$STR_CHECK"* ]]
    then
        echo "Master node OK" >> $logfile
        exit 1 # All good. DB master reachable
    else
        echo "Master node Fail" >> $logfile
        exit 0 # Failover trigger
    fi
else
    echo "Master node Fail" >> $logfile
    exit 0 # Failover trigger
fi
}}}
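
The check above picks the member state by its field position in the flattened query result, which depends on how many columns the table returns. A possible variant, shown only as a sketch, selects just the MEMBER_HOST and MEMBER_STATE columns and matches on the host. The helper name `member_is_online` is hypothetical, the sample rows are made up, and the sketch assumes that MEMBER_HOST reports the same value as IP_MASTER; `-N` and `-B` are the mysql client options for omitting column names and printing tab-separated output.

```shell
#!/usr/bin/bash
# Hypothetical helper: reads "MEMBER_HOST<TAB>MEMBER_STATE" rows on stdin and
# returns 0 only if the given host is listed with state ONLINE.
member_is_online() {
    local host=$1
    awk -F'\t' -v h="$host" '$1 == h && $2 == "ONLINE" { found = 1 } END { exit !found }'
}

# Assumed usage inside the check script (same container and credentials as above):
#   /usr/bin/docker exec -i "$DB_NODE" mysql -uroot -pmypass -N -B
#       -e 'SELECT MEMBER_HOST, MEMBER_STATE FROM performance_schema.replication_group_members;'
#       | member_is_online "$IP_MASTER"

# Self-contained demonstration with sample rows (made-up data, not real output):
sample=$(printf '192.168.81.12\tONLINE\n192.168.81.13\tRECOVERING\n')
echo "$sample" | member_is_online 192.168.81.12 && echo "192.168.81.12 is ONLINE"
echo "$sample" | member_is_online 192.168.81.13 || echo "192.168.81.13 is not ONLINE"
```

The helper only reports the state; the exit codes that keepalived sees (1 when the master is healthy, 0 to trigger failover) would still be set by the calling script as in the original.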