Re: Master/backup not working. Both as master

Aaron Holtz
 

What does the output of `firewall-cmd --list-all` look like? Can you
see that 224.0.0.0/8 is allowed? tcpdump will show the traffic but that doesn't
mean your firewall is permitting it through to keepalived.
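
For reference, on a firewalld-based system (CentOS 7/8) VRRP is normally let through with a
rich rule along these lines (a sketch - the zone and permanence flags depend on your setup):

  firewall-cmd --permanent --add-rich-rule='rule protocol value="vrrp" accept'
  firewall-cmd --reload
  firewall-cmd --list-rich-rules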

-- Aaron

On Nov 25, Antonio Augusto Nunes Godinho molded the electrons to say....

Date: Mon, 25 Nov 2019 09:30:45
From: Antonio Augusto Nunes Godinho <to@...>
Reply-To: keepalived-users@groups.io
To: "keepalived-users@groups.io" <keepalived-users@groups.io>
Subject: Re: [keepalived-users] Master/backup not working. Both as master


Hi,

 

I can see multicast traffic to the 224.0.0.18 address using tcpdump, so I think that means VRRP is able to communicate. The nodes' IPs are node3 -> 193.137.78.83
and node2 -> 193.137.78.82 (node 1 is down for now).

---------------------------------------------------------------------------------------------------------

NODE 3:

[root@Srv-NginxHaproxy3 ~]# tcpdump -vvv -n -i ens192 host 224.0.0.18

tcpdump: listening on ens192, link-type EN10MB (Ethernet), capture size 262144 bytes

14:17:54.447759 IP (tos 0xc0, ttl 255, id 55038, offset 0, flags [none], proto VRRP (112), length 76)

    193.137.78.81 > 224.0.0.18: vrrp 193.137.78.81 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 200, authtype simple, intvl 1s, length 56, addrs(10):
193.137.78.71,193.137.78.72,193.137.78.73,193.137.78.74,193.137.78.75,193.137.78.76,193.137.78.77,193.137.78.78,193.137.78.79,193.137.78.100 auth "Passw0rd"

14:17:54.562886 IP (tos 0xc0, ttl 255, id 54788, offset 0, flags [none], proto VRRP (112), length 76)

    193.137.78.82 > 224.0.0.18: vrrp 193.137.78.82 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 255, authtype simple, intvl 1s, length 56, addrs(10):
193.137.78.71,193.137.78.72,193.137.78.73,193.137.78.74,193.137.78.75,193.137.78.76,193.137.78.77,193.137.78.78,193.137.78.79,193.137.78.100 auth "Passw0rd"

 

[root@Srv-NginxHaproxy3 ~]# tcpdump -i ens192 -c 2 host vrrp.mcast.net

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode

listening on ens192, link-type EN10MB (Ethernet), capture size 262144 bytes

14:24:07.506258 IP 193.137.78.81 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 51, prio 200, authtype simple, intvl 1s, length 56

14:24:07.595230 IP Srv-NginxHaproxy3 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 51, prio 255, authtype simple, intvl 1s, length 56

2 packets captured

4 packets received by filter

0 packets dropped by kernel

---------------------------------------------------------------------------------------------------------

NODE 2:

[root@Srv-NginxHaproxy2 ~]# tcpdump -vvv -n -i ens192 host 224.0.0.18

tcpdump: listening on ens192, link-type EN10MB (Ethernet), capture size 262144 bytes

14:18:15.450540 IP (tos 0xc0, ttl 255, id 55059, offset 0, flags [none], proto VRRP (112), length 76)

    193.137.78.81 > 224.0.0.18: vrrp 193.137.78.81 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 200, authtype simple, intvl 1s, length 56, addrs(10):
193.137.78.71,193.137.78.72,193.137.78.73,193.137.78.74,193.137.78.75,193.137.78.76,193.137.78.77,193.137.78.78,193.137.78.79,193.137.78.100 auth "Passw0rd"

14:18:15.564497 IP (tos 0xc0, ttl 255, id 54809, offset 0, flags [none], proto VRRP (112), length 76)

    193.137.78.82 > 224.0.0.18: vrrp 193.137.78.82 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 255, authtype simple, intvl 1s, length 56, addrs(10):
193.137.78.71,193.137.78.72,193.137.78.73,193.137.78.74,193.137.78.75,193.137.78.76,193.137.78.77,193.137.78.78,193.137.78.79,193.137.78.100 auth "Passw0rd"

 

[root@Srv-NginxHaproxy2 ~]# tcpdump -i ens192 -c 2 host vrrp.mcast.net

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode

listening on ens192, link-type EN10MB (Ethernet), capture size 262144 bytes

14:24:49.513089 IP Srv-NginxHaproxy2 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 51, prio 200, authtype simple, intvl 1s, length 56

14:24:49.598981 IP 193.137.78.82 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 51, prio 255, authtype simple, intvl 1s, length 56

2 packets captured

3 packets received by filter

0 packets dropped by kernel

---------------------------------------------------------------------------------------------------------

 

 

IGMP snooping is not enabled, and I still can’t understand why this happens… Any other advice?

 

 

Best regards,

 

António Godinho

 

 

 

-----Original Message-----

From: keepalived-users@groups.io <keepalived-users@groups.io> On Behalf Of Graeme Fowler

Sent: Sunday, November 24, 2019 4:49 PM

To: keepalived-users@groups.io

Subject: Re: [keepalived-users] Master/backup not working. Both as master

 

On 23 Nov 2019, at 13:17, Aaron Holtz <aholtz@...> wrote:

Check out that multicast traffic is being permitted via
iptables/firewalld/netfilter.  VRRP may not be able to communicate
between the nodes if you aren't allowing the 224.0.0.0/8 network on
the proper interfaces.
 

Additionally, ensure that IGMP Snooping is not enabled on the device(s) joining the two keepalived instances together.
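
If the boxes happen to be joined by a Linux bridge, the snooping state can be checked and, if
need be, switched off per bridge (the bridge name "br0" here is just an example):

  cat /sys/class/net/br0/bridge/multicast_snooping    # 1 = snooping enabled
  echo 0 > /sys/class/net/br0/bridge/multicast_snooping

On a VMware vSwitch any equivalent multicast filtering is configured in vSphere rather than on the guests.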

 

I almost always use a tun interface for instance-instance comms to get round that, especially if I don’t have control over the network between the two.

 

Graeme


Re: Master/backup not working. Both as master

Aaron Holtz
 

Check that multicast traffic is being permitted via
iptables/firewalld/netfilter.  VRRP may not be able to communicate between
the nodes if you aren't allowing the 224.0.0.0/8 network on the proper
interfaces.
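
With plain iptables that usually comes down to something like this (a sketch - interface and
chain depend on your layout):

  # accept VRRP (IP protocol 112) advertisements sent to the VRRP multicast group
  iptables -A INPUT -i ens192 -d 224.0.0.18/32 -p vrrp -j ACCEPT
  # or, if the protocol name isn't recognised, by number:
  iptables -A INPUT -i ens192 -d 224.0.0.18/32 -p 112 -j ACCEPT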

Also, you appear to have the state set as "MASTER" on the backup.  Though I
think that is just the initial state it comes up in, it will still
honor the priority once it finds the other nodes in the VRRP group and
transition to the proper role.
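
For what it's worth, the conventional declaration on the standby node is simply:

        state                   BACKUP
        priority                200

so it starts out in BACKUP state and only takes over once the higher-priority node stops
advertising.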

-- Aaron

On Nov 22, Antonio Augusto Nunes Godinho molded the electrons to say....

Date: Fri, 22 Nov 2019 10:37:56
From: Antonio Augusto Nunes Godinho <to@...>
Reply-To: keepalived-users@groups.io
To: "keepalived-users@groups.io" <keepalived-users@groups.io>
Subject: [keepalived-users] Master/backup not working. Both as master


Hi,

 

I have a problem with my configuration. I have 2 nodes, one supposed to be the master and the other backup. For some reason it looks like they don’t
communicate.

This was working on CentOS7, maybe it’s a CentOS8 issue?

 

Master:

--------------------------------------------------------------------------------------------

global_defs {

        notification_email {

                sysadmin@...

        }

        notification_email_from Srv-NginxHaproxy3@...

        smtp_server smtpin.isec.pt

        smtp_connect_timeout 30

        router_id VRRP-NginxHaproxy

}

 

vrrp_script check_services {

        script "/etc/keepalived/check_services"

        interval 2

        #weight 2

}

 

vrrp_instance VI_1 {

        interface               ens192

        state                   MASTER

        virtual_router_id       51

        priority                255

        advert_int              1

        smtp_alert

 

        virtual_ipaddress {

                193.137.78.71 label ens192:0    # files.isec.pt

                193.137.78.72 label ens192:1    # www.isec.pt

                193.137.78.73 label ens192:2    # my.isec.pt

                193.137.78.74 label ens192:3    # inqueritos.isec.pt

                193.137.78.75 label ens192:4    # moodle.isec.pt

                193.137.78.76 label ens192:5    # ws.isec.pt

                193.137.78.77 label ens192:6    # cloud.isec.pt

                193.137.78.78 label ens192:7    # lvm.isec.pt

                193.137.78.79 label ens192:8    # biblioteca.isec.pt

                193.137.78.100  #

        }

 

        authentication {

            auth_type PASS

            auth_pass Passw0rd

        }

 

        track_script {

            check_services

        }

}

 

Backup

--------------------------------------------------------------------------------------------

global_defs {

        notification_email {

                sysadmin@...

        }

        notification_email_from Srv-NginxHaproxy2@...

        smtp_server smtpin.isec.pt

        smtp_connect_timeout 30

        router_id VRRP-NginxHaproxy

}

 

vrrp_script check_services {

        script "/etc/keepalived/check_services"

        interval 2

        #weight 2

}

 

vrrp_instance VI_1 {

        interface               ens192

        #state                  BACKUP

        state                   MASTER

        virtual_router_id       51

        priority                200

        advert_int              1

        smtp_alert

 

        virtual_ipaddress {

                193.137.78.71   # files.isec.pt

                193.137.78.72   # www.isec.pt

                193.137.78.73   # my.isec.pt

                193.137.78.74   # inqueritos.isec.pt

                193.137.78.75   # moodle.isec.pt

                193.137.78.76   # ws.isec.pt

                193.137.78.77   # cloud.isec.pt

                193.137.78.78   # lvm.isec.pt

                193.137.78.79   # biblioteca.isec.pt

                193.137.78.100  #

        }

 

        authentication {

            auth_type PASS

            auth_pass Passw0rd

        }

 

        track_script {

            check_services

        }

}

 

I can see the traffic on both machines:

 

[root@Srv-NginxHaproxy3 ~]# tcpdump -vvv -n -i ens192 host 224.0.0.18

tcpdump: listening on ens192, link-type EN10MB (Ethernet), capture size 262144 bytes

15:34:56.557641 IP (tos 0xc0, ttl 255, id 1345, offset 0, flags [none], proto VRRP (112), length 76)

    193.137.78.82 > 224.0.0.18: vrrp 193.137.78.82 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 255, authtype simple, intvl 1s, length 56, addrs(10):
193.137.78.71,193.137.78.72,193.137.78.73,193.137.78.74,193.137.78.75,193.137.78.76,193.137.78.77,193.137.78.78,193.137.78.79,193.137.78.100 auth "Passw0rd"

15:34:56.658658 IP (tos 0xc0, ttl 255, id 756, offset 0, flags [none], proto VRRP (112), length 76)

    193.137.78.81 > 224.0.0.18: vrrp 193.137.78.81 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 200, authtype simple, intvl 1s, length 56, addrs(10):
193.137.78.71,193.137.78.72,193.137.78.73,193.137.78.74,193.137.78.75,193.137.78.76,193.137.78.77,193.137.78.78,193.137.78.79,193.137.78.100 auth "Passw0rd"

 

[root@Srv-NginxHaproxy2 ~]# tcpdump -vvv -n -i ens192 host 224.0.0.18

tcpdump: listening on ens192, link-type EN10MB (Ethernet), capture size 262144 bytes

15:34:35.555473 IP (tos 0xc0, ttl 255, id 1324, offset 0, flags [none], proto VRRP (112), length 76)

    193.137.78.82 > 224.0.0.18: vrrp 193.137.78.82 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 255, authtype simple, intvl 1s, length 56, addrs(10):
193.137.78.71,193.137.78.72,193.137.78.73,193.137.78.74,193.137.78.75,193.137.78.76,193.137.78.77,193.137.78.78,193.137.78.79,193.137.78.100 auth "Passw0rd"

15:34:35.656008 IP (tos 0xc0, ttl 255, id 735, offset 0, flags [none], proto VRRP (112), length 76)

    193.137.78.81 > 224.0.0.18: vrrp 193.137.78.81 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 200, authtype simple, intvl 1s, length 56, addrs(10):
193.137.78.71,193.137.78.72,193.137.78.73,193.137.78.74,193.137.78.75,193.137.78.76,193.137.78.77,193.137.78.78,193.137.78.79,193.137.78.100 auth "Passw0rd"

 

Both want to be master and use all the VIPs:

[root@Srv-NginxHaproxy3 ~]# ip addr | grep "inet" | grep "ens192"

    inet 193.137.78.82/26 brd 193.137.78.127 scope global noprefixroute ens192

    inet 193.137.78.71/32 scope global ens192:0

    inet 193.137.78.72/32 scope global ens192:1

    inet 193.137.78.73/32 scope global ens192:2

    inet 193.137.78.74/32 scope global ens192:3

    inet 193.137.78.75/32 scope global ens192:4

    inet 193.137.78.76/32 scope global ens192:5

    inet 193.137.78.77/32 scope global ens192:6

    inet 193.137.78.78/32 scope global ens192:7

    inet 193.137.78.79/32 scope global ens192:8

    inet 193.137.78.100/32 scope global ens192

 

[root@Srv-NginxHaproxy2 ~]# ip addr | grep "inet" | grep "ens192"

    inet 193.137.78.81/26 brd 193.137.78.127 scope global noprefixroute ens192

    inet 193.137.78.71/32 scope global ens192

    inet 193.137.78.72/32 scope global ens192

    inet 193.137.78.73/32 scope global ens192

    inet 193.137.78.74/32 scope global ens192

    inet 193.137.78.75/32 scope global ens192

    inet 193.137.78.76/32 scope global ens192

    inet 193.137.78.77/32 scope global ens192

    inet 193.137.78.78/32 scope global ens192

    inet 193.137.78.79/32 scope global ens192

    inet 193.137.78.100/32 scope global ens192

 

 

What am I doing wrong?

 

Best regards,

 

António Godinho


Re: Tuning gratuitous arp and failover

Aaron Holtz
 

Correct - VMWare knows the MACs assigned to a machine, and if you introduce
a new one the virtual switch doesn't learn it and won't allow traffic to
head to that VM.  The only "built-in" option is to turn promiscuous mode
on for the entire virtual switch.  There is a plugin module a user created
that allows things to behave better when in promiscuous mode, since
promiscuous mode can be an extreme performance hit on a busy virtual switch
(which ours would be - we did this test on a non-production virtual
switch).

Now that I know the issue isn't related to my keepalived configuration I
can work on the best way to implement in our production environment. We
are leaning towards using that plugin. If you are interested in reading
about the issue and the plugin he created, here is the link to the file
(there is a link to the full blog post about the issue on that page as
well):

https://flings.vmware.com/esxi-mac-learning-dvfilter

-- Aaron


On Nov 21, Quentin Armitage molded the electrons to say....

Date: Thu, 21 Nov 2019 08:01:50
From: Quentin Armitage <quentin@...>
To: keepalived-users@groups.io, Aaron Holtz <aholtz@...>
Subject: Re: [keepalived-users] Tuning gratuitous arp and failover

That sounds as though the VMware VM either tells the virtual switch what
MAC addresses it is using, or else the virtual switch won't allow a MAC
address to move to a different interface. It seems to me it would be
better if the virtual switch behaved like a real switch and dynamically
learned where the MAC addresses are.

Does enabling "MAC address changes" on the virtual switch work instead of
promiscuous mode?

Anyway, it's good to know that you have a solution to the problem.

Quentin

On Thu, 2019-11-21 at 07:39 -0500, Aaron Holtz wrote:
I believe this is resolved - because we are on a VMWare platform the
virtual switch has to be in promiscuous mode for the use_vmac option to
properly work.

-- Aaron

On Nov 20, Aaron Holtz molded the electrons to say....

Date: Wed, 20 Nov 2019 10:55:44
From: Aaron Holtz <aholtz@...>
To: keepalived-users@groups.io
Subject: Re: [keepalived-users] Tuning gratuitous arp and failover

Quentin,

I've got a test keepalived layout and I enabled "use_vmac" and brought
it back up. It did create the vrrp.150 interface with my proper IP.
However, while I can ping it locally no other host on the network can get
to it. The other hosts do have the virtual MAC in their ARP table so
that's something.

I've read multiple pages including
https://www.keepalived.org/doc/software_design.html about possible ARP
issues and making some sysctl changes. I believe everything is properly
set per those recommendations but still not having any luck.

Kernel/OS is

Linux test-keepalived-1 5.0.0-19-generic #20~18.04.1-Ubuntu SMP Thu Jun 20 12:17:14 UTC 2019 x86_64 x86_64 x86_64
GNU/Linux

I was concerned it was firewall related but I have tcpdump'd traffic on
the vrrp interface and I do not even see the packets hitting it.

Anything I can look at to get this interface to answer inbound traffic?
As soon as I comment out use_vmac everything works on the primary
interface so I'm pretty sure my test setup is solid otherwise.

Thanks.

-- Aaron


On Nov 19, Quentin Armitage molded the electrons to say....

Date: Tue, 19 Nov 2019 16:04:58
From: Quentin Armitage <quentin@...>
Reply-To: keepalived-users@groups.io
To: keepalived-users@groups.io, Aaron Holtz <aholtz@...>
Subject: Re: [keepalived-users] Tuning gratuitous arp and failover

Aaron,

You don't say which version of keepalived you are using, so I will assume the current version (v2.0.19), and answer in
that context (the easiest way to see
which version you are using is to run keepalived with the -v option).

My comments are inline below.

Quentin
On Tue, 2019-11-19 at 13:29 -0500, Aaron Holtz wrote:

Hello,

Our setup has a pair of virtual servers each running one instance of
keepalived. Each server has an external IP and an internal IP and there
is a VRRP instance running on each of them.

The "outside" interface has about 25 virtual IPs on it and we slowly add
more every few weeks. The default gateway for the keepalived instance is
a Cisco 9k. When we add in a new virtual_ipaddress and restart keepalived
on the master we're seeing it fail over to the backup (as expected) and
then after a few moments it fails back to the primary (as expected).

You might prefer to use the reload option rather than the restart option (if you are using systemctl then systemctl reload
keepalived should work). reload
means that keepalived reloads its config without stopping and restarting, and it means that the master keepalived instance
does not drop back to backup
state.

There is a slight complication using reload in the scenario you mention of adding virtual IPs. By default, keepalived
checks both the number of VIPs and the
VIPs themselves in each received advert, to ensure they match what the keepalived instance has configured. To skip the
check, set vrrp_skip_check_adv_addr
in the global_defs section or set skip_check_adv_addr in each vrrp_instance block to which you want it to apply. If
skip_check_adv_addr is not set, then if
the VIPs don't match both vrrp instances will become master.
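
In config terms that is just the following in global_defs (a sketch), combined with using
"systemctl reload keepalived" (or a SIGHUP to the parent keepalived process) instead of a
restart when adding VIPs:

global_defs {
        vrrp_skip_check_adv_addr
}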


We've set "advert_int 3" so that under normal conditions it doesn't bounce
to the backup unless the network is really severed. So during this
"controlled" restart the backup picks up right away but the primary
doesn't come back as the primary for about 10 seconds as we'd expect.

When the primary is shutdown as part of the restart (stop/start) sequence, it sends priority 0 adverts which causes a
rapid takeover by the highest priority
backup.

However, on the Cisco we are finding that after the fail back to the
primary some (not all) of the IPs will still have an arp entry of the
backup keepalived instance. We've confirmed that these IPs are not still
"stuck" on the backup and are indeed on the primary (verified via "ip a").
Our only option is to clear arp on those entries in the 9k so that the
primary server's MAC is picked up.

The best way of avoiding this issue is to specify use_vmac on each vrrp instance. This causes the vrrp instances to use a
virtual MAC address for each vrrp
instance, so when the backup takes over as master it uses the same (virtual) MAC address, and so there is no issue of
stale arp entries. This is how VRRP is
supposed to work.

So the questions are:

1. Does this sound like a gratuitous arp issue? In that the Cisco isn't
always honoring the messages when the primary takes back over. Or maybe
we send too many requests since we have 25 IPs?

It is possible that the flooding of gratuitous arp messages is causing a problem, and for this reason keepalived supports
rate limiting of sending gratuitous arps,
but I wouldn't expect it to be a problem with a modern device. By default keepalived will loop sending gratuitous arps for
all VIPs 5 times (resulting in
125 gratuitous arp messages being sent in the case of 25 VIPs), and then it will repeat the same 5 seconds later.

Options for controlling gratuitous arp messages are:
        vrrp_garp_master_delay # delay in seconds before sending second set of gratuitous arp messages (default 5, 0 means no second set)
        vrrp_garp_master_repeat # number of gratuitous arp messages to send for each VIP each time a set of gratuitous arp messages is sent (default 5)
        vrrp_garp_master_refresh # interval between continuing sending sets of gratuitous arp messages in seconds (default disabled)
        vrrp_garp_master_refresh_repeat # number of gratuitous arp messages to send for each VIP in each refresh set (default 1 if vrrp_garp_master_refresh enabled)
        vrrp_garp_interval # interval in seconds between sending gratuitous arp messages, resolution in microseconds (e.g. 0.05)
        vrrp_gna_interval # same as vrrp_garp_interval but for IPv6
        vrrp_min_garp # sets vrrp_garp_master_delay to 0 and vrrp_garp_master_repeat to 1 (this should work with modern switches)

The above is all covered in the keepalived.conf(5) man page. There is also garp_group to rate limit the sending of garp
messages for switches that can't
keep up with the rate of keepalived sending garp messages. This can be limited per single interface, or a group of
interfaces. See keepalived.conf(5) man
page for more details.

2. Can we tune our configuration to either send more garp messages or is
there some other known option we should be using to help ensure the
arp-cache on the Cisco stays correct?

It might work better to reduce the number of garp messages, e.g. set vrrp_garp_master_repeat to 1, to send fewer garp
messages and reduce the load on the
switch/router in terms of processing garp messages.

3. Is there a "delay" option we can use so that in a controlled restart
the backup doesn't take over immediately? If we do a "service keepalived
restart" we know the primary will be right back up so there isn't a need
for the backup to even take over the IPs and send out garp messages.

As mentioned above, use "service keepalived reload" if it is supported. If "service keepalived" doesn't support the reload
option, send SIGHUP to the parent
keepalived process to initiate a reload.

We are not using any of the vrrp_garp_* configurations other than the
defaults. Several of them look useful but I'm unsure how they might
address my problems.


In summary, using the use_vmac option, and also the vrrp_skip_check_adv_addr option may be the easiest solution, since
use_vmac will mean that the MAC
addresses for the VIPs will not change.
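
As a rough sketch of that combination (instance name and values are illustrative - see the
keepalived.conf(5) man page for the details):

global_defs {
        vrrp_skip_check_adv_addr
        vrrp_min_garp                   # equivalent to vrrp_garp_master_delay 0, vrrp_garp_master_repeat 1
}

vrrp_instance VI_OUT {
        use_vmac                        # stable virtual MAC, so the router's ARP cache never goes stale
        ...
}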

Thanks.

-- Aaron





Re: Tuning gratuitous arp and failover

Quentin Armitage
 

That sounds as though the VMware VM either tells the virtual switch what
MAC addresses it is using, or else the virtual switch won't allow a MAC
address to move to a different interface. It seems to me it would be
better if the virtual switch behaved like a real switch and dynamically
learned where the MAC addresses are.

Does enabling "MAC address changes" on the virtual switch work instead of
promiscuous mode?

Anyway, it's good to know that you have a solution to the problem.

Quentin

On Thu, 2019-11-21 at 07:39 -0500, Aaron Holtz wrote:
I believe this is resolved - because we are on a VMWare platform the
virtual switch has to be in promiscuous mode for the use_vmac option to
properly work.

-- Aaron

On Nov 20, Aaron Holtz molded the electrons to say....

Date: Wed, 20 Nov 2019 10:55:44
From: Aaron Holtz <aholtz@...>
To: keepalived-users@groups.io
Subject: Re: [keepalived-users] Tuning gratuitous arp and failover

Quentin,

I've got a test keepalived layout and I enabled "use_vmac" and brought
it back up. It did create the vrrp.150 interface with my proper IP.
However, while I can ping it locally no other host on the network can get
to it. The other hosts do have the virtual MAC in their ARP table so
that's something.

I've read multiple pages including
https://www.keepalived.org/doc/software_design.html about possible ARP
issues and making some sysctl changes. I believe everything is properly
set per those recommendations but still not having any luck.

Kernel/OS is

Linux test-keepalived-1 5.0.0-19-generic #20~18.04.1-Ubuntu SMP Thu Jun 20 12:17:14 UTC 2019 x86_64 x86_64 x86_64
GNU/Linux

I was concerned it was firewall related but I have tcpdump'd traffic on
the vrrp interface and I do not even see the packets hitting it.

Anything I can look at to get this interface to answer inbound traffic?
As soon as I comment out use_vmac everything works on the primary
interface so I'm pretty sure my test setup is solid otherwise.

Thanks.

-- Aaron


On Nov 19, Quentin Armitage molded the electrons to say....

Date: Tue, 19 Nov 2019 16:04:58
From: Quentin Armitage <quentin@...>
Reply-To: keepalived-users@groups.io
To: keepalived-users@groups.io, Aaron Holtz <aholtz@...>
Subject: Re: [keepalived-users] Tuning gratuitous arp and failover

Aaron,

You don't say which version of keepalived you are using, so I will assume the current version (v2.0.19), and answer in
that context (the easiest way to see
which version you are using is to run keepalived with the -v option).

My comments are inline below.

Quentin
On Tue, 2019-11-19 at 13:29 -0500, Aaron Holtz wrote:

Hello,

Our setup has a pair of virtual servers each running one instance of
keepalived. Each server has an external IP and in internal IP and there
is a VRRP instance running on each of them.

The "outside" interface has about 25 virtual IPs on it and we slowly add
more every few weeks. The default gateway for the keepalived instance is
a Cisco 9k. When we add in a new virtual_ipaddress and restart keepalived
on the master we're seeing it fail over to the backup (as expected) and
then after a few moments it fails back to the primary (as expected).

You might prefer to use the reload option rather than the restart option (if you are using systemctl then systemctl reload
keepalived should work). reload
means that keepalived reloads its config without stopping and restarting, and it means that the master keepalived instance
does not drop back to backup
state.

There is a slight complication using reload in the scenario you mention of adding virtual IPs. By default, keepalived
checks both the number of VIPs and the
VIPs themselves in each received advert, to ensure they match what the keepalived instance has configured. To skip the
check, set vrrp_skip_check_adv_addr
in the global_defs section or set skip_check_adv_addr in each vrrp_instance block to which you want it to apply. If
skip_check_adv_addr is not set, then if
the VIPs don't match both vrrp instances will become master.


We've set "advert_int 3" so that under normal conditions it doesn't bounce
to the backup unless the network is really severed. So during this
"controlled" restart the backup picks up right away but the primary
doesn't come back as the primary for about 10 seconds as we'd expect.

When the primary is shutdown as part of the restart (stop/start) sequence, it sends priority 0 adverts which causes a
rapid takeover by the highest priority
backup.

However, on the Cisco we are finding that after the fail back to the
primary some (not all) of the IPs will still have an arp entry of the
backup keepalived instance. We've confirmed that these IPs are not still
"stuck" on the backup and are indeed on the primary (verified via "ip a").
Our only option is to clear arp on those entries in teh 9k so that the
primary server's MAC is picked up.

The best way of avoiding this issue is to specify use_vmac on each vrrp instance. This causes the vrrp instances to use a
virtual MAC address for each vrrp
instance, so when the backup takes over as master it uses the same (virtual) MAC address, and so there is no issue of
stale arp entries. This is how VRRP is
supposed to work.

So the questions are:

1. Does this sound like a gratutious arp issue? In that the Cisco isn't
always honoring the messages when the primary takes back over. Or maybe
we send too many requests since we have 25 IPs?

It is possible that the flooding of gratuitous arp messages is causing a problem, and for this reason keepalived rate
limiting of sending gratuitous arps,
but I wouldn't expect it to be a problem with a modern device. By default keepalived will loop sending gratuitous arps for
all VIPs 5 times (resulting in
125 gratuitous arp messages being sent in the case of 25 VIPs), and then it will repeat the same 5 seconds later.

Options for controlling gratuitous arp messages are:
vrrp_garp_master_delay # delay in seconds before sending second set of gratuitous arp messages (default 5, 0 means
no second set)
vrrp_garp_master_repeat # number of gratuitous arp messages to send for each VIP each time a set of gratuitous arp
messages is sent (default 5)
vrrp_garp_master_refresh # interval between continuing sending sets of gratuitous arp messages in seconds (default
disabled)
vrrp_garp_master_refresh_repeat # number of gratuitous arp messages to send to send for each VIP in each refresh
set ( default 1 if
vrrp_garp_master_refresh enabled)
vrrp_garp_interval # interval in seconds between sending gratuitous arp messages, resolution in microseconds (e.g. 0.05)
vrrp_gna_interval # same as vrrp_garp_interval but for IPv6
vrrp_min_garp # sets vrrp_garp_master_delay to 0 and vrrp_garp_master_repeat to 1 (this should work with modern switches)

The above is all covered in the keepalived.conf(5) man page. There is also garp_group to rate limit the sending of garp
messages for switches that can't
keep up with the rate of keepalived sending garp messages. This can be limited per single interface, or a group of
interfaces. See keepalived.conf(5) man
page for more details.

2. Can we tune our configuration to either send more garp messages or is
there some other known option we should be using to help ensure the
arp-cache on the Cisco stays correct?

It might work better to reduce the number of garp messages, e.g. set vrrp_garp_master_repeat to 1, to send fewer garp
messages and reduce the load on the
switch/router in terms of processing garp messages.

3. Is there a "delay" option we can use so that in a controlled restart
the backup doesn't take over immediately? If we do a "service keepalived
restart" we know the primary will be right back up so there isn't a need
for the backup to even take over the IPs and send out garp messages.

As mentioned above, use "service keepalived reload" if it is supported. If "service keepalived" doesn't support the reload
option, send SIGHUP to the parent
keepalived process to initiate a reload.

We are not using any of the vrrp_garp_* configurations other than the
defaults. Several of them look useful but I'm unsure how they might
address my problems.


In summary, using the use_vmac option, and also the vrrp_skip_check_adv_addr option may be the easiest solution, since
use_vmac will mean that the MAC
addresses for the VIPs will not change.

Thanks.

-- Aaron





Re: Tuning gratuitous arp and failover

Aaron Holtz
 

I believe this is resolved - because we are on a VMWare platform the
virtual switch has to be in promiscuous mode for the use_vmac option to
properly work.

-- Aaron

On Nov 20, Aaron Holtz molded the electrons to say....

Date: Wed, 20 Nov 2019 10:55:44
From: Aaron Holtz <aholtz@...>
To: keepalived-users@groups.io
Subject: Re: [keepalived-users] Tuning gratuitous arp and failover

Quentin,

I've got a test keepalived layout and I enabled "use_vmac" and brought
it back up. It did create the vrrp.150 interface with my proper IP.
However, while I can ping it locally no other host on the network can get
to it. The other hosts do have the virtual MAC in their ARP table so
that's something.

I've read multiple pages including
https://www.keepalived.org/doc/software_design.html about possible ARP
issues and making some sysctl changes. I believe everything is properly
set per those recommendations but still not having any luck.

Kernel/OS is

Linux test-keepalived-1 5.0.0-19-generic #20~18.04.1-Ubuntu SMP Thu Jun 20 12:17:14 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

I was concerned it was firewall related but I have tcpdump'd traffic on
the vrrp interface and I do not even see the packets hitting it.

Anything I can look at to get this interface to answer inbound traffic?
As soon as I comment out use_vmac everything works on the primary
interface so I'm pretty sure my test setup is solid otherwise.

Thanks.

-- Aaron


On Nov 19, Quentin Armitage molded the electrons to say....

Date: Tue, 19 Nov 2019 16:04:58
From: Quentin Armitage <quentin@...>
Reply-To: keepalived-users@groups.io
To: keepalived-users@groups.io, Aaron Holtz <aholtz@...>
Subject: Re: [keepalived-users] Tuning gratuitous arp and failover

Aaron,

You don't say which version of keepalived you are using, so I will assume the current version (v2.0.19), and answer in that context (the easiest way to see
which version you are using is to run keepalived with the -v option).

My comments are inline below.

Quentin

On Tue, 2019-11-19 at 13:29 -0500, Aaron Holtz wrote:

Hello,

  Our setup has a pair of virtual servers each running one instance of
keepalived.  Each server has an external IP and in internal IP and there
is a VRRP instance running on each of them. 

The "outside" interface has about 25 virtual IPs on it and we slowly add
more every few weeks.  The default gateway for the keepalived instance is
a Cisco 9k.  When we add in a new virtual_ipaddress and restart keepalived
on the master we're seeing it fail over to the backup (as expected) and
then after a few moments it fails back to the primary (as expected).

You might prefer to use the reload option rather than the restart option (if you are using systemctl then systemctl reload keepalived should work). reload
means that keepalived reloads its config without stopping and restarting, and it means that the master keepalived instance does not drop back to backup
state.

There is a slight complication using reload in the scenario you mention of adding virtual IPs. By default, keepalived checks both the number of VIPs and the
VIPs themselves in each received advert, to ensure they match what the keepalived instance has configured. To skip the check, set vrrp_skip_check_adv_addr
in the global_defs section or set skip_check_adv_addr in each vrrp_instance block to which you want it to apply. If skip_check_adv_addr is not set, then if
the VIPs don't match both vrrp instances will become master.


We've set "advert_int 3" so that under normal conditions it doesn't bounce
to the backup unless the network is really severed.  So during this
"controlled" restart the backup picks up right away but the primary
doesn't come back as the primary for about 10 seconds as we'd expect.

When the primary is shutdown as part of the restart (stop/start) sequence, it sends priority 0 adverts which causes a rapid takeover by the highest priority
backup.

However, on the Cisco we are finding that after the fail back to the
primary some (not all) of the IPs will still have an arp entry of the
backup keepalived instance.  We've confirmed that these IPs are not still
"stuck" on the backup and are indeed on the primary (verified via "ip a"). 
Our only option is to clear arp on those entries in teh 9k so that the
primary server's MAC is picked up.

The best way of avoiding this issue is to specify use_vmac on each vrrp instance. This causes the vrrp instances to use a virtual MAC address for each vrrp
instance, so when the backup takes over as master it uses the same (virtual) MAC address, and so there is no issue of stale arp entries. This is how VRRP is
supposed to work.

So the questions are:

1. Does this sound like a gratutious arp issue?  In that the Cisco isn't
always honoring the messages when the primary takes back over.  Or maybe
we send too many requests since we have 25 IPs?

It is possible that the flooding of gratuitous arp messages is causing a problem, and for this reason keepalived rate limiting of sending gratuitous arps,
but I wouldn't expect it to be a problem with a modern device. By default keepalived will loop sending gratuitous arps for all VIPs 5 times (resulting in
125 gratuitous arp messages being sent in the case of 25 VIPs), and then it will repeat the same 5 seconds later.

Options for controlling gratuitous arp messages are:
        vrrp_garp_master_delay # delay in seconds before sending second set of gratuitous arp messages (default 5, 0 means no second set)
        vrrp_garp_master_repeat # number of gratuitous arp messages to send for each VIP each time a set of gratuitous arp messages is sent (default 5)
        vrrp_garp_master_refresh # interval between continuing sending sets of gratuitous arp messages in seconds (default disabled)
        vrrp_garp_master_refresh_repeat # number of gratuitous arp messages to send to send for each VIP in each refresh set ( default 1 if
vrrp_garp_master_refresh enabled)
vrrp_garp_interval # interval in seconds between sending gratuitous arp messages, resolution in microseconds (e.g. 0.05)
vrrp_gna_interval # same as vrrp_garp_interval but for IPv6
vrrp_min_garp # sets vrrp_garp_master_delay to 0 and vrrp_garp_master_repeat to 1 (this should work with modern switches)

The above is all covered in the keepalived.conf(5) man page. There is also garp_group to rate limit the sending of garp messages for switches that can't
keep up with the rate of keepalived sending garp messages. This can be limited per single interface, or a group of interfaces. See keepalived.conf(5) man
page for more details.

2. Can we tune our configuration to either send more garp messages or is
there some other known option we should be using to help ensure the
arp-cache on the Cisco stays correct?

It might work better to reduce the number of garp messages, e.g. set vrrp_garp_master_repeat to 1, to send fewer garp messages and reduce the load on the
switch/router in terms of processing garp messages.

3. Is there a "delay" option we can use so that in a controlled restart
the backup doesn't take over immediately?  If we do a "service keepalived
restart" we know the primary will be right back up so there isn't a need
for the backup to even take over the IPs and send out garp messages.

As mentioned above, use "service keepalived reload" if it is supported. If "service keepalived" doesn't support the reload option, send SIGHUP to the parent
keepalived process to initiate a reload.

We are not using any of the vrrp_garp_* configurations other than the
defaults.  Several of them look useful but I'm unsure how they might
address my problems.


In summary, using the use_vmac option, and also the vrrp_skip_check_adv_addr option may be the easiest solution, since use_vmac will mean that the MAC
addresses for the VIPs will not change.

Thanks.

-- Aaron

Re: Tuning gratuitous arp and failover

Aaron Holtz
 

Quentin,

I've got a test keepalived layout and I enabled "use_vmac" and brought
it back up. It did create the vrrp.150 interface with my proper IP.
However, while I can ping it locally no other host on the network can get
to it. The other hosts do have the virtual MAC in their ARP table so
that's something.

I've read multiple pages including
https://www.keepalived.org/doc/software_design.html about possible ARP
issues and making some sysctl changes. I believe everything is properly
set per those recommendations but still not having any luck.

Kernel/OS is

Linux test-keepalived-1 5.0.0-19-generic #20~18.04.1-Ubuntu SMP Thu Jun 20 12:17:14 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

I was concerned it was firewall related but I have tcpdump'd traffic on
the vrrp interface and I do not even see the packets hitting it.

Anything I can look at to get this interface to answer inbound traffic?
As soon as I comment out use_vmac everything works on the primary
interface so I'm pretty sure my test setup is solid otherwise.

Thanks.

-- Aaron


On Nov 19, Quentin Armitage molded the electrons to say....

Date: Tue, 19 Nov 2019 16:04:58
From: Quentin Armitage <quentin@...>
Reply-To: keepalived-users@groups.io
To: keepalived-users@groups.io, Aaron Holtz <aholtz@...>
Subject: Re: [keepalived-users] Tuning gratuitous arp and failover

Aaron,

You don't say which version of keepalived you are using, so I will assume the current version (v2.0.19), and answer in that context (the easiest way to see
which version you are using is to run keepalived with the -v option).

My comments are inline below.

Quentin

On Tue, 2019-11-19 at 13:29 -0500, Aaron Holtz wrote:

Hello,

  Our setup has a pair of virtual servers each running one instance of
keepalived.  Each server has an external IP and in internal IP and there
is a VRRP instance running on each of them. 

The "outside" interface has about 25 virtual IPs on it and we slowly add
more every few weeks.  The default gateway for the keepalived instance is
a Cisco 9k.  When we add in a new virtual_ipaddress and restart keepalived
on the master we're seeing it fail over to the backup (as expected) and
then after a few moments it fails back to the primary (as expected).

You might prefer to use the reload option rather than the restart option (if you are using systemctl then systemctl reload keepalived should work). reload
means that keepalived reloads its config without stopping and restarting, and it means that the master keepalived instance does not drop back to backup
state.

There is a slight complication using reload in the scenario you mention of adding virtual IPs. By default, keepalived checks both the number of VIPs and the
VIPs themselves in each received advert, to ensure they match what the keepalived instance has configured. To skip the check, set vrrp_skip_check_adv_addr
in the global_defs section or set skip_check_adv_addr in each vrrp_instance block to which you want it to apply. If skip_check_adv_addr is not set, then if
the VIPs don't match both vrrp instances will become master.


We've set "advert_int 3" so that under normal conditions it doesn't bounce
to the backup unless the network is really severed.  So during this
"controlled" restart the backup picks up right away but the primary
doesn't come back as the primary for about 10 seconds as we'd expect.

When the primary is shutdown as part of the restart (stop/start) sequence, it sends priority 0 adverts which causes a rapid takeover by the highest priority
backup.

However, on the Cisco we are finding that after the fail back to the
primary some (not all) of the IPs will still have an arp entry of the
backup keepalived instance.  We've confirmed that these IPs are not still
"stuck" on the backup and are indeed on the primary (verified via "ip a"). 
Our only option is to clear arp on those entries in teh 9k so that the
primary server's MAC is picked up.

The best way of avoiding this issue is to specify use_vmac on each vrrp instance. This causes the vrrp instances to use a virtual MAC address for each vrrp
instance, so when the backup takes over as master it uses the same (virtual) MAC address, and so there is no issue of stale arp entries. This is how VRRP is
supposed to work.

So the questions are:

1. Does this sound like a gratutious arp issue?  In that the Cisco isn't
always honoring the messages when the primary takes back over.  Or maybe
we send too many requests since we have 25 IPs?

It is possible that the flooding of gratuitous arp messages is causing a problem, and for this reason keepalived rate limiting of sending gratuitous arps,
but I wouldn't expect it to be a problem with a modern device. By default keepalived will loop sending gratuitous arps for all VIPs 5 times (resulting in
125 gratuitous arp messages being sent in the case of 25 VIPs), and then it will repeat the same 5 seconds later.

Options for controlling gratuitous arp messages are:
        vrrp_garp_master_delay # delay in seconds before sending the second set of gratuitous arp messages (default 5, 0 means no second set)
        vrrp_garp_master_repeat # number of gratuitous arp messages to send for each VIP each time a set of gratuitous arp messages is sent (default 5)
        vrrp_garp_master_refresh # interval in seconds between ongoing refresh sets of gratuitous arp messages while master (default disabled)
        vrrp_garp_master_refresh_repeat # number of gratuitous arp messages to send for each VIP in each refresh set (default 1 if vrrp_garp_master_refresh enabled)
        vrrp_garp_interval # interval in seconds between sending successive gratuitous arp messages (resolution in microseconds, e.g. 0.05)
        vrrp_gna_interval # same as vrrp_garp_interval but for IPv6 unsolicited neighbour advertisements
        vrrp_min_garp # sets vrrp_garp_master_delay to 0 and vrrp_garp_master_repeat to 1 (this should work with modern switches)

The above is all covered in the keepalived.conf(5) man page. There is also garp_group to rate limit the sending of garp messages for switches that can't
keep up with the rate of keepalived sending garp messages. This can be limited per single interface, or a group of interfaces. See keepalived.conf(5) man
page for more details.
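As an illustration of the options above (not a recommendation for your specific switches), the simplest tuning for modern equipment would be something like:

    global_defs {
        vrrp_min_garp              # same effect as vrrp_garp_master_delay 0 + vrrp_garp_master_repeat 1
        # or, for finer control:
        # vrrp_garp_master_repeat 1
        # vrrp_garp_master_delay 0
        # vrrp_garp_interval 0.05  # 50ms between individual garp messages
    }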

2. Can we tune our configuration to either send more garp messages or is
there some other known option we should be using to help ensure the
arp-cache on the Cisco stays correct?

It might work better to reduce the number of garp messages, e.g. set vrrp_garp_master_repeat to 1, to send fewer garp messages and reduce the load on the
switch/router in terms of processing garp messages.

3. Is there a "delay" option we can use so that in a controlled restart
the backup doesn't take over immediately?  If we do a "service keepalived
restart" we know the primary will be right back up so there isn't a need
for the backup to even take over the IPs and send out garp messages.

As mentioned above, use "service keepalived reload" if it is supported. If "service keepalived" doesn't support the reload option, send SIGHUP to the parent
keepalived process to initiate a reload.
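For example (the pid file path shown is keepalived's usual default and may differ on your system):

    # with systemd:
    systemctl reload keepalived

    # without reload support in the service manager, signal the parent process directly:
    kill -HUP "$(cat /var/run/keepalived.pid)"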

We are not using any of the vrrp_garp_* configurations other than the
defaults.  Several of them look useful but I'm unsure how they might
address my problems.


In summary, using the use_vmac option, and also the vrrp_skip_check_adv_addr option may be the easiest solution, since use_vmac will mean that the MAC
addresses for the VIPs will not change.

Thanks.

-- Aaron

Re: Tuning gratuitous arp and failover

Aaron Holtz
 

Yes a reload seems a lot smarter. For some reason I thought a restart was required to do what I wanted. 


On Nov 19, 2019, at 16:05, graeme@... wrote:


CatOS or IOS has a feature called ARP flap suppression - you'll be hitting this. It's there for good reason, and though it's tunable it's a good idea to just let it get on with the job.

Can you do a reload instead of a restart? That's less disruptive.

Graeme

On 19 Nov 2019 18:29, Aaron Holtz <aholtz@...> wrote:

Hello,

  Our setup has a pair of virtual servers each running one instance of
keepalived.  Each server has an external IP and an internal IP and there
is a VRRP instance running on each of them. 

The "outside" interface has about 25 virtual IPs on it and we slowly add
more every few weeks.  The default gateway for the keepalived instance is
a Cisco 9k.  When we add in a new virtual_ipaddress and restart keepalived
on the master we're seeing it fail over to the backup (as expected) and
then after a few moments it fails back to the primary (as expected). 
We've set "advert_int 3" so that under normal conditions it doesn't bounce
to the backup unless the network is really severed.  So during this
"controlled" restart the backup picks up right away but the primary
doesn't come back as the primary for about 10 seconds as we'd expect.

However, on the Cisco we are finding that after the fail back to the
primary some (not all) of the IPs will still have an arp entry of the
backup keepalived instance.  We've confirmed that these IPs are not still
"stuck" on the backup and are indeed on the primary (verified via "ip a"). 
Our only option is to clear arp on those entries in the 9k so that the
primary server's MAC is picked up.

So the questions are:

1. Does this sound like a gratuitous arp issue?  In that the Cisco isn't
always honoring the messages when the primary takes back over.  Or maybe
we send too many requests since we have 25 IPs?

2. Can we tune our configuration to either send more garp messages or is
there some other known option we should be using to help ensure the
arp-cache on the Cisco stays correct?

3. Is there a "delay" option we can use so that in a controlled restart
the backup doesn't take over immediately?  If we do a "service keepalived
restart" we know the primary will be right back up so there isn't a need
for the backup to even take over the IPs and send out garp messages.

We are not using any of the vrrp_garp_* configurations other than the
defaults.  Several of them look useful but I'm unsure how they might
address my problems.

Thanks.

-- Aaron




Re: Tuning gratuitous arp and failover

Aaron Holtz
 

Thanks for the reply. I am using the latest git version so your answers are appropriate. I am using the vrrp_skip option globally already so that’s good. I had looked at the virtual MAC but I swear there was a reason I couldn’t use it. I’ll have to research that again. 

Thanks for the thoughts. They helped quite a bit. 

Aaron

On Nov 19, 2019, at 16:05, Quentin Armitage <quentin@...> wrote:


Aaron,

You don't say which version of keepalived you are using, so I will assume the current version (v2.0.19), and answer in that context (the easiest way to see which version you are using is to run keepalived with the -v option).

My comments are inline below.

Quentin
On Tue, 2019-11-19 at 13:29 -0500, Aaron Holtz wrote:
Hello,

  Our setup has a pair of virtual servers each running one instance of 
keepalived.  Each server has an external IP and an internal IP and there 
is a VRRP instance running on each of them.  

The "outside" interface has about 25 virtual IPs on it and we slowly add 
more every few weeks.  The default gateway for the keepalived instance is 
a Cisco 9k.  When we add in a new virtual_ipaddress and restart keepalived 
on the master we're seeing it fail over to the backup (as expected) and 
then after a few moments it fails back to the primary (as expected).
You might prefer to use the reload option rather than the restart option (if you are using systemctl then systemctl reload keepalived should work). reload means that keepalived reloads its config without stopping and restarting, and it means that the master keepalived instance does not drop back to backup state.

There is a slight complication using reload in the scenario you mention of adding virtual IPs. By default, keepalived checks both the number of VIPs and the VIPs themselves in each received advert, to ensure they match what the keepalived instance has configured. To skip the check, set vrrp_skip_check_adv_addr in the global_defs section or set skip_check_adv_addr in each vrrp_instance block to which you want it to apply. If skip_check_adv_addr is not set, then if the VIPs don't match both vrrp instances will become master.


We've set "advert_int 3" so that under normal conditions it doesn't bounce 
to the backup unless the network is really severed.  So during this 
"controlled" restart the backup picks up right away but the primary 
doesn't come back as the primary for about 10 seconds as we'd expect.
When the primary is shut down as part of the restart (stop/start) sequence, it sends priority 0 adverts, which cause a rapid takeover by the highest priority backup.
However, on the Cisco we are finding that after the fail back to the 
primary some (not all) of the IPs will still have an arp entry of the 
backup keepalived instance.  We've confirmed that these IPs are not still 
"stuck" on the backup and are indeed on the primary (verified via "ip a").  
Our only option is to clear arp on those entries in the 9k so that the 
primary server's MAC is picked up.

The best way of avoiding this issue is to specify use_vmac on each vrrp instance. This causes the vrrp instances to use a virtual MAC address for each vrrp instance, so when the backup takes over as master it uses the same (virtual) MAC address, and so there is no issue of stale arp entries. This is how VRRP is supposed to work.
So the questions are:

1. Does this sound like a gratuitous arp issue?  In that the Cisco isn't 
always honoring the messages when the primary takes back over.  Or maybe 
we send too many requests since we have 25 IPs?
It is possible that the flooding of gratuitous arp messages is causing a problem, and for this reason keepalived supports rate limiting of the gratuitous arps it sends, but I wouldn't expect it to be a problem with a modern device. By default keepalived will loop sending gratuitous arps for all VIPs 5 times (resulting in 125 gratuitous arp messages being sent in the case of 25 VIPs), and then it will repeat the same 5 seconds later.

Options for controlling gratuitous arp messages are:
        vrrp_garp_master_delay # delay in seconds before sending the second set of gratuitous arp messages (default 5, 0 means no second set)
        vrrp_garp_master_repeat # number of gratuitous arp messages to send for each VIP each time a set of gratuitous arp messages is sent (default 5)
        vrrp_garp_master_refresh # interval in seconds between ongoing refresh sets of gratuitous arp messages while master (default disabled)
        vrrp_garp_master_refresh_repeat # number of gratuitous arp messages to send for each VIP in each refresh set (default 1 if vrrp_garp_master_refresh enabled)
        vrrp_garp_interval # interval in seconds between sending successive gratuitous arp messages (resolution in microseconds, e.g. 0.05)
        vrrp_gna_interval # same as vrrp_garp_interval but for IPv6 unsolicited neighbour advertisements
        vrrp_min_garp # sets vrrp_garp_master_delay to 0 and vrrp_garp_master_repeat to 1 (this should work with modern switches)

The above is all covered in the keepalived.conf(5) man page. There is also garp_group to rate limit the sending of garp messages for switches that can't keep up with the rate of keepalived sending garp messages. This can be limited per single interface, or a group of interfaces. See keepalived.conf(5) man page for more details.
2. Can we tune our configuration to either send more garp messages or is 
there some other known option we should be using to help ensure the 
arp-cache on the Cisco stays correct?
It might work better to reduce the number of garp messages, e.g. set vrrp_garp_master_repeat to 1, to send fewer garp messages and reduce the load on the switch/router in terms of processing garp messages.
3. Is there a "delay" option we can use so that in a controlled restart 
the backup doesn't take over immediately?  If we do a "service keepalived 
restart" we know the primary will be right back up so there isn't a need 
for the backup to even take over the IPs and send out garp messages.
As mentioned above, use "service keepalived reload" if it is supported. If "service keepalived" doesn't support the reload option, send SIGHUP to the parent keepalived process to initiate a reload.
We are not using any of the vrrp_garp_* configurations other than the 
defaults.  Several of them look useful but I'm unsure how they might 
address my problems.

In summary, using the use_vmac option, and also the vrrp_skip_check_adv_addr option may be the easiest solution, since use_vmac will mean that the MAC addresses for the VIPs will not change.
Thanks.

-- Aaron


Re: Tuning gratuitous arp and failover

Graeme Fowler
 

CatOS or IOS has a feature called ARP flap suppression - you'll be hitting this. It's there for good reason, and though it's tunable it's a good idea to just let it get on with the job.

Can you do a reload instead of a restart? That's less disruptive.

Graeme

On 19 Nov 2019 18:29, Aaron Holtz <aholtz@...> wrote:

Hello,

  Our setup has a pair of virtual servers each running one instance of
keepalived.  Each server has an external IP and an internal IP and there
is a VRRP instance running on each of them. 

The "outside" interface has about 25 virtual IPs on it and we slowly add
more every few weeks.  The default gateway for the keepalived instance is
a Cisco 9k.  When we add in a new virtual_ipaddress and restart keepalived
on the master we're seeing it fail over to the backup (as expected) and
then after a few moments it fails back to the primary (as expected). 
We've set "advert_int 3" so that under normal conditions it doesn't bounce
to the backup unless the network is really severed.  So during this
"controlled" restart the backup picks up right away but the primary
doesn't come back as the primary for about 10 seconds as we'd expect.

However, on the Cisco we are finding that after the fail back to the
primary some (not all) of the IPs will still have an arp entry of the
backup keepalived instance.  We've confirmed that these IPs are not still
"stuck" on the backup and are indeed on the primary (verified via "ip a"). 
Our only option is to clear arp on those entries in the 9k so that the
primary server's MAC is picked up.

So the questions are:

1. Does this sound like a gratuitous arp issue?  In that the Cisco isn't
always honoring the messages when the primary takes back over.  Or maybe
we send too many requests since we have 25 IPs?

2. Can we tune our configuration to either send more garp messages or is
there some other known option we should be using to help ensure the
arp-cache on the Cisco stays correct?

3. Is there a "delay" option we can use so that in a controlled restart
the backup doesn't take over immediately?  If we do a "service keepalived
restart" we know the primary will be right back up so there isn't a need
for the backup to even take over the IPs and send out garp messages.

We are not using any of the vrrp_garp_* configurations other than the
defaults.  Several of them look useful but I'm unsure how they might
address my problems.

Thanks.

-- Aaron




Re: Tuning gratuitous arp and failover

Quentin Armitage
 

Aaron,

You don't say which version of keepalived you are using, so I will assume the current version (v2.0.19), and answer in that context (the easiest way to see which version you are using is to run keepalived with the -v option).

My comments are inline below.

Quentin
On Tue, 2019-11-19 at 13:29 -0500, Aaron Holtz wrote:
Hello,

  Our setup has a pair of virtual servers each running one instance of 
keepalived.  Each server has an external IP and an internal IP and there 
is a VRRP instance running on each of them.  

The "outside" interface has about 25 virtual IPs on it and we slowly add 
more every few weeks.  The default gateway for the keepalived instance is 
a Cisco 9k.  When we add in a new virtual_ipaddress and restart keepalived 
on the master we're seeing it fail over to the backup (as expected) and 
then after a few moments it fails back to the primary (as expected).
You might prefer to use the reload option rather than the restart option (if you are using systemctl then systemctl reload keepalived should work). reload means that keepalived reloads its config without stopping and restarting, and it means that the master keepalived instance does not drop back to backup state.

There is a slight complication using reload in the scenario you mention of adding virtual IPs. By default, keepalived checks both the number of VIPs and the VIPs themselves in each received advert, to ensure they match what the keepalived instance has configured. To skip the check, set vrrp_skip_check_adv_addr in the global_defs section or set skip_check_adv_addr in each vrrp_instance block to which you want it to apply. If skip_check_adv_addr is not set, then if the VIPs don't match both vrrp instances will become master.


We've set "advert_int 3" so that under normal conditions it doesn't bounce 
to the backup unless the network is really severed.  So during this 
"controlled" restart the backup picks up right away but the primary 
doesn't come back as the primary for about 10 seconds as we'd expect.
When the primary is shut down as part of the restart (stop/start) sequence, it sends priority 0 adverts, which cause a rapid takeover by the highest priority backup.
However, on the Cisco we are finding that after the fail back to the 
primary some (not all) of the IPs will still have an arp entry of the 
backup keepalived instance.  We've confirmed that these IPs are not still 
"stuck" on the backup and are indeed on the primary (verified via "ip a").  
Our only option is to clear arp on those entries in the 9k so that the 
primary server's MAC is picked up.

The best way of avoiding this issue is to specify use_vmac on each vrrp instance. This causes the vrrp instances to use a virtual MAC address for each vrrp instance, so when the backup takes over as master it uses the same (virtual) MAC address, and so there is no issue of stale arp entries. This is how VRRP is supposed to work.
So the questions are:

1. Does this sound like a gratuitous arp issue?  In that the Cisco isn't 
always honoring the messages when the primary takes back over.  Or maybe 
we send too many requests since we have 25 IPs?
It is possible that the flooding of gratuitous arp messages is causing a problem, and for this reason keepalived supports rate limiting of the gratuitous arps it sends, but I wouldn't expect it to be a problem with a modern device. By default keepalived will loop sending gratuitous arps for all VIPs 5 times (resulting in 125 gratuitous arp messages being sent in the case of 25 VIPs), and then it will repeat the same 5 seconds later.

Options for controlling gratuitous arp messages are:
        vrrp_garp_master_delay # delay in seconds before sending the second set of gratuitous arp messages (default 5, 0 means no second set)
        vrrp_garp_master_repeat # number of gratuitous arp messages to send for each VIP each time a set of gratuitous arp messages is sent (default 5)
        vrrp_garp_master_refresh # interval in seconds between ongoing refresh sets of gratuitous arp messages while master (default disabled)
        vrrp_garp_master_refresh_repeat # number of gratuitous arp messages to send for each VIP in each refresh set (default 1 if vrrp_garp_master_refresh enabled)
        vrrp_garp_interval # interval in seconds between sending successive gratuitous arp messages (resolution in microseconds, e.g. 0.05)
        vrrp_gna_interval # same as vrrp_garp_interval but for IPv6 unsolicited neighbour advertisements
        vrrp_min_garp # sets vrrp_garp_master_delay to 0 and vrrp_garp_master_repeat to 1 (this should work with modern switches)

The above is all covered in the keepalived.conf(5) man page. There is also garp_group to rate limit the sending of garp messages for switches that can't keep up with the rate of keepalived sending garp messages. This can be limited per single interface, or a group of interfaces. See keepalived.conf(5) man page for more details.
2. Can we tune our configuration to either send more garp messages or is 
there some other known option we should be using to help ensure the 
arp-cache on the Cisco stays correct?
It might work better to reduce the number of garp messages, e.g. set vrrp_garp_master_repeat to 1, to send fewer garp messages and reduce the load on the switch/router in terms of processing garp messages.
3. Is there a "delay" option we can use so that in a controlled restart 
the backup doesn't take over immediately?  If we do a "service keepalived 
restart" we know the primary will be right back up so there isn't a need 
for the backup to even take over the IPs and send out garp messages.
As mentioned above, use "service keepalived reload" if it is supported. If "service keepalived" doesn't support the reload option, send SIGHUP to the parent keepalived process to initiate a reload.
We are not using any of the vrrp_garp_* configurations other than the 
defaults.  Several of them look useful but I'm unsure how they might 
address my problems.

In summary, using the use_vmac option, and also the vrrp_skip_check_adv_addr option may be the easiest solution, since use_vmac will mean that the MAC addresses for the VIPs will not change.
Thanks.

-- Aaron


Tuning gratuitous arp and failover

Aaron Holtz
 

Hello,

Our setup has a pair of virtual servers each running one instance of
keepalived. Each server has an external IP and an internal IP and there
is a VRRP instance running on each of them.

The "outside" interface has about 25 virtual IPs on it and we slowly add
more every few weeks. The default gateway for the keepalived instance is
a Cisco 9k. When we add in a new virtual_ipaddress and restart keepalived
on the master we're seeing it fail over to the backup (as expected) and
then after a few moments it fails back to the primary (as expected).
We've set "advert_int 3" so that under normal conditions it doesn't bounce
to the backup unless the network is really severed. So during this
"controlled" restart the backup picks up right away but the primary
doesn't come back as the primary for about 10 seconds as we'd expect.

However, on the Cisco we are finding that after the fail back to the
primary some (not all) of the IPs will still have an arp entry of the
backup keepalived instance. We've confirmed that these IPs are not still
"stuck" on the backup and are indeed on the primary (verified via "ip a").
Our only option is to clear arp on those entries in the 9k so that the
primary server's MAC is picked up.

So the questions are:

1. Does this sound like a gratuitous arp issue? In that the Cisco isn't
always honoring the messages when the primary takes back over. Or maybe
we send too many requests since we have 25 IPs?

2. Can we tune our configuration to either send more garp messages or is
there some other known option we should be using to help ensure the
arp-cache on the Cisco stays correct?

3. Is there a "delay" option we can use so that in a controlled restart
the backup doesn't take over immediately? If we do a "service keepalived
restart" we know the primary will be right back up so there isn't a need
for the backup to even take over the IPs and send out garp messages.

We are not using any of the vrrp_garp_* configurations other than the
defaults. Several of them look useful but I'm unsure how they might
address my problems.

Thanks.

-- Aaron

Re: LVS-DR+Keepalived - Do VIP's really need to be assigned to real servers

Dustin Makepeace
 

Thanks Quentin,

 

I hope you didn’t spend too much time on this. I did in fact find another method that was implemented for accepting connections for the VIP on the real servers. There was a PREROUTING NAT rule set up in iptables that was REDIRECTing connections even without the VIP assigned to an interface. I did send a follow-up email the next day advising of this and saying my previous email could be disregarded.
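A rule of roughly this shape would have that effect; this is purely illustrative (the address and port are placeholders, not the actual rule from our routers), and listing the nat table is how to spot such a rule:

    # hypothetical example of the kind of PREROUTING rule that can mask a missing VIP on a real server:
    iptables -t nat -A PREROUTING -d 192.0.2.100 -p tcp --dport 80 -j REDIRECT --to-ports 80

    # show what is actually installed:
    iptables -t nat -L PREROUTING -n -v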

 

I honestly did not set up our LVS routers to begin with, so I had to learn a lot of this on the fly; I didn’t have much prior knowledge of the implementation before being tasked with looking into this. It seems that the person who did set this up initially had some trouble and, in an attempt to get it working, implemented multiple methods without removing the unnecessary ones once he had a working implementation.

 

I got bit by one of the multiple methods that can be used for getting traffic to the real servers with/without the VIP.

 

One follow-up question, if you don’t mind: I’ve seen a number of forums/articles online that talk about assigning the VIP to a non-ARP interface. We do have the VIP assigned to our main “eth0” interface; however, we implement ARPtables to ignore ARP requests for the VIP. Is assigning the VIP to a non-ARP interface just another one of the methods that can be used to avoid the dreaded “ARP problem”? Is one recommended over the other?
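For reference, the "non-ARP interface" approach usually means putting the VIP on the loopback device, combined with the arp_ignore/arp_announce sysctls, roughly like this on each real server (the address is a placeholder):

    # add the VIP on the loopback device (a /32 so no network route is created)
    ip addr add 192.0.2.100/32 dev lo

    # stop the real server answering or using the VIP in ARP on its real interfaces
    sysctl -w net.ipv4.conf.all.arp_ignore=1
    sysctl -w net.ipv4.conf.all.arp_announce=2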

 

Thanks for all your time.

Dustin Makepeace / DevOps Engineer

 

From: Quentin Armitage <quentin@...>
Sent: Wednesday, November 13, 2019 8:29 AM
To: keepalived-users@groups.io; Dustin Makepeace <Dustin.Makepeace@...>
Subject: Re: [keepalived-users] LVS-DR+Keepalived - Do VIP's really need to be assigned to real servers

 



Dustin,

 

Please see answers and comments inline below.

 

Quentin Armitage

 

On Wed, 2019-11-06 at 20:00 +0000, Dustin Makepeace wrote:

Good day,

 

For months I have been scouring the internet looking for details or explanations as to why the VIPs have to be assigned to the real servers. What I am finding is old, outdated, abandoned articles that only say “you have to do this to prevent an ARP issue” but don’t go into much further detail. While it seems that LVS may be an abandoned piece of software, I understand it is still widely used and very stable, and it continues to live in the kernel source code.

LVS is not abandoned. Within the last 6 months support for GUE and GRE tunnels has been added, for example, and although there hasn't yet been a version of ipvsadm released to support them, keepalived does support them.

 

I am now starting to understand the relationship between LVS and Keepalived; whereas I was always searching for LVS, I now find myself here asking the question, “Does Keepalived remove the requirement of the VIP being assigned to the real servers?”

 

The simple answer to the question is NO. keepalived handles the VIPs on the LVS routers, but not on the real servers.
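For context, the keepalived configuration on the LVS routers for a direct-routing service looks roughly like this sketch (addresses, ports, algorithm and checker values are placeholders, not your configuration):

    virtual_server 192.0.2.100 80 {
        delay_loop 6
        lb_algo wrr
        lb_kind DR              # direct routing
        protocol TCP

        real_server 192.0.2.11 80 {
            weight 1
            TCP_CHECK {
                connect_timeout 3
            }
        }
    }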

 

My company has many environments from Dev, to test to Production. All of them use a dual LVS Master-Backup setup with Keepalived. Our configuration is set for direct routing.

 

We use the following (both reside in the same subnet):

 

On LVS routers:

Ubuntu 16.04.6

Keepalived - 1.2.24-1ubuntu0.16.04.2 (Main Ubuntu repo)

ipvsadm - 1.28-3ubuntu0.16.04.1 (Main Ubuntu repo)

 

On Real Servers:

Ubuntu 16.04.6

Apache2 - 2.4.18-2ubuntu3.13 (Main Ubuntu repo)

HAProxy - 1.7.11-1ppa1~xenial (Launchpad PPA)

 

In our initial setup, which existed this way for at least a year or more, we assigned the VIP to the real servers on boot-up using /etc/rc.local. The arptables package that comes from the Ubuntu repos does not contain a service script for loading ARPtables at boot time, so this was done initially using our CM tool Puppet. One day we experienced the dreaded ARP issue on one of our environments and this led me to change how we applied ARPtables. Since Puppet would run some arbitrary time after start-up, the VIP alias was being added first and the ARPtables rules applied a little later. This was an important rule that I discovered we were not following strictly enough. Puppet now just manages our ARPtables config and uses /etc/rc.local to load them before we assign the VIP in the same file during boot-up. This seems to have been working well, but at the time we experienced our first ARP issue, in troubleshooting we found that simply removing the VIP from the real servers resolved the performance issues immediately and did not affect the functionality of our application. The users could still use the application without the VIP on the real servers. This is what led me down a path for months, off and on, looking for explanations as to why the VIP is needed on real servers. I already saw first-hand that removing the VIP from the real servers did not impact our load balancing or ability to do direct routing via Keepalived, and it very much simplifies the complex configuration we are told to use by various online resources.

 

Unless there is some other very strange configuration on your real servers, removing the VIPs from them will stop the real servers responding to the forwarded packets. Because what you described is so unexpected, I have confirmed this with some testing, and I find that it is necessary to have the VIPs configured on the real servers. If I remove the VIPs the real servers don't respond.

 

 

I finally got around to doing some additional experimentation. I removed the VIP alias and removed the ARPtables entries from the real servers. I even rebooted the real servers to be sure nothing was stuck in memory or in kernel memory. I ran packet captures from my local machine, on the LVS director, and on both real servers. What I found was that I was still able to navigate our application with no problem. The LVS director was only sending traffic one way, to the backend real servers. The real servers still sent the responses back directly to the client, and the packet captures on both the real servers and my local machine showed the source IP of those responses as being the VIP… Not sure how the real servers still responded with the VIP, but without confirmation online I’m left to think that it is Keepalived that mangled the packets so the real servers still send the responses back with the VIP. I don’t claim to be a networking expert by any means, so take my guesses with a grain of salt.

 

 

My concern is that the available documentation out there all claims that for LVS direct routing you must use ARPtables (or one of the other options offered) to prevent the real servers from responding to ARP requests, because we are told that the VIP has to be set up on the real servers. This does not seem to be true and can be eliminated altogether as far as I can tell. The real servers will not have any ARP issues if the VIP is never assigned to them in the first place. And not assigning the VIP to the real servers does not appear to break the core functionality of using LVS-DR+Keepalived. Am I missing something here? Did I make a mistake and am I missing the point? Does my particular setup change the picture, so that this applies in my case but not necessarily for everybody else? Is this something that Keepalived fixed and I just can’t find properly documented anywhere online?

 

Not having the VIPs configured on the real servers is certainly not expected to work, and should not be relied upon. You need to have a look at your systems in case you have some other strange configuration enabling them to work.

 

keepalived does not process any packets relating to the LVS setup. keepalived sets up the LVS configuration (as seen with ipvsadm), but that is the limit of what it does in respect of LVS.
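For example, to see the LVS configuration keepalived has installed in the kernel, and whether connections are actually being forwarded:

    ipvsadm -L -n           # list virtual services and their real servers
    ipvsadm -L -n --stats   # per-service and per-real-server packet/byte counters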

 

 

 

Sorry for the long rant. I really just want to explain my situation and get my point across. Please respond with any feedback, advice, or follow up questions at your convenience. I’m likely going to start rolling out this newly discovered setup to our lower environments. Hope to hear all of your thoughts soon.

 

Thank you!

Dustin Makepeace / DevOps Engineer

 

I hope the above helps, and if you have any further questions, please do not hesitate to post them to the list.


Re: LVS-DR+Keepalived - Do VIP's really need to be assigned to real servers

Quentin Armitage
 

Dustin,

Please see answers and comments inline below.

Quentin Armitage

On Wed, 2019-11-06 at 20:00 +0000, Dustin Makepeace wrote:

Good day,

 

For months I have been scouring the internet looking for details or explanations as to why the VIPs have to be assigned to the real servers. What I am finding is old, outdated, abandoned articles that only say “you have to do this to prevent an ARP issue” but don’t go into much further detail. While it seems that LVS may be an abandoned piece of software, I understand it is still widely used and very stable, and it continues to live in the kernel source code.

LVS is not abandoned. Within the last 6 months support for GUE and GRE tunnels has been added, for example, and although there hasn't yet been a version of ipvsadm released to support them, keepalived does support them.

I am now starting to understand the relationship between LVS and Keepalived; whereas I was always searching for LVS, I now find myself here asking the question, “Does Keepalived remove the requirement of the VIP being assigned to the real servers?”


The simple answer to the question is NO. keepalived handles the VIPs on the LVS routers, but not on the real servers.

My company has many environments from Dev, to test to Production. All of them use a dual LVS Master-Backup setup with Keepalived. Our configuration is set for direct routing.

 

We use the following (both reside in the same subnet):

 

On LVS routers:

Ubuntu 16.04.6

Keepalived - 1.2.24-1ubuntu0.16.04.2 (Main Ubuntu repo)

ipvsadm - 1.28-3ubuntu0.16.04.1 (Main Ubuntu repo)

 

On Real Servers:

Ubuntu 16.04.6

Apache2 - 2.4.18-2ubuntu3.13 (Main Ubuntu repo)

HAProxy - 1.7.11-1ppa1~xenial (Launchpad PPA)

 

In our initial setup, which existed this way for at least a year or more, we assigned the VIP to the real servers on boot-up using /etc/rc.local. The arptables package that comes from the Ubuntu repos does not contain a service script for loading ARPtables at boot time, so this was done initially using our CM tool Puppet. One day we experienced the dreaded ARP issue on one of our environments and this led me to change how we applied ARPtables. Since Puppet would run some arbitrary time after start-up, the VIP alias was being added first and the ARPtables rules applied a little later. This was an important rule that I discovered we were not following strictly enough. Puppet now just manages our ARPtables config and uses /etc/rc.local to load them before we assign the VIP in the same file during boot-up. This seems to have been working well, but at the time we experienced our first ARP issue, in troubleshooting we found that simply removing the VIP from the real servers resolved the performance issues immediately and did not affect the functionality of our application. The users could still use the application without the VIP on the real servers. This is what led me down a path for months, off and on, looking for explanations as to why the VIP is needed on real servers. I already saw first-hand that removing the VIP from the real servers did not impact our load balancing or ability to do direct routing via Keepalived, and it very much simplifies the complex configuration we are told to use by various online resources.


Unless there is some other very strange configuration on your real servers, removing the VIPs from them will stop the real servers responding to the forwarded packets. Because what you described is so unexpected, I have confirmed this with some testing, and I find that it is necessary to have the VIPs configured on the real servers. If I remove the VIPs the real servers don't respond.

 

I finally got around to doing some additional experimentation. I removed the VIP alias and removed the ARPtables entries from the real servers. I even rebooted the real servers to be sure nothing was stuck in memory or in kernel memory. I ran packet captures from my local machine, on the LVS director, and on both real servers. What I found was that I was still able to navigate our application with no problem. The LVS director was only sending traffic one way, to the backend real servers. The real servers still sent the responses back directly to the client, and the packet captures on both the real servers and my local machine showed the source IP of those responses as being the VIP… Not sure how the real servers still responded with the VIP, but without confirmation online I’m left to think that it is Keepalived that mangled the packets so the real servers still send the responses back with the VIP. I don’t claim to be a networking expert by any means, so take my guesses with a grain of salt.

 

My concern is that the available documentation out there all claims that for LVS direct routing you must use ARPtables (or one of the other options offered) to prevent the real servers from responding to ARP requests, because we are told that the VIP has to be set up on the real servers. This does not seem to be true and can be eliminated altogether as far as I can tell. The real servers will not have any ARP issues if the VIP is never assigned to them in the first place. And not assigning the VIP to the real servers does not appear to break the core functionality of using LVS-DR+Keepalived. Am I missing something here? Did I make a mistake and am I missing the point? Does my particular setup change the picture, so that this applies in my case but not necessarily for everybody else? Is this something that Keepalived fixed and I just can’t find properly documented anywhere online?


Not having the VIPs configured on the real servers is certainly not expected to work, and should not be relied upon. You need to have a look at your systems in case you have some other strange configuration enabling them to work.

keepalived does not process any packets relating to the LVS setup. keepalived sets up the LVS configuration (as seen with ipvsadm), but that is the limit of what it does in respect of LVS.

 

Sorry for the long rant. I really just want to explain my situation and get my point across. Please respond with any feedback, advice, or follow up questions at your convenience. I’m likely going to start rolling out this newly discovered setup to our lower environments. Hope to hear all of your thoughts soon.

 

Thank you!

Dustin Makepeace / DevOps Engineer


I hope the above helps, and if you have any further questions, please do not hesitate to post them to the list.

Re: LVS-DR+Keepalived - Do VIP's really need to be assigned to real servers

Dustin Makepeace
 

You can disregard this request. The person that originally set up our LVS routers used more than one method for accepting traffic at the real servers. I found an iptables NAT rule that was still accepting traffic even after the VIP and ARPtables entries were removed. This masked what I thought was a conflict between the documentation and what I was observing. I tested and proved that one of these two methods must be implemented. I’m doing away with one and keeping the other.

 

Thanks

Dustin Makepeace / DevOps Engineer

 

From: keepalived-users@groups.io <keepalived-users@groups.io> On Behalf Of Dustin Makepeace via Groups.Io
Sent: Wednesday, November 6, 2019 3:01 PM
To: keepalived-users@groups.io
Subject: [keepalived-users] LVS-DR+Keepalived - Do VIP's really need to be assigned to real servers

 



Good day,

 

For months I have been scouring the internet looking for details or explanations as to why the VIPs have to be assigned to the real servers. What I am finding is old, outdated, abandoned articles that only say “you have to do this to prevent an ARP issue” but don’t go into much further detail. While it seems that LVS may be an abandoned piece of software, I understand it is still widely used and very stable, and it continues to live in the kernel source code. I am now starting to understand the relationship between LVS and Keepalived; whereas I was always searching for LVS, I now find myself here asking the question, “Does Keepalived remove the requirement of the VIP being assigned to the real servers?”

 

My company has many environments from Dev, to test to Production. All of them use a dual LVS Master-Backup setup with Keepalived. Our configuration is set for direct routing.

 

We use the following (both reside in the same subnet):

 

On LVS routers:

Ubuntu 16.04.6

Keepalived - 1.2.24-1ubuntu0.16.04.2 (Main Ubuntu repo)

ipvsadm - 1.28-3ubuntu0.16.04.1 (Main Ubuntu repo)

 

On Real Servers:

Ubuntu 16.04.6

Apache2 - 2.4.18-2ubuntu3.13 (Main Ubuntu repo)

HAProxy - 1.7.11-1ppa1~xenial (Launchpad PPA)

 

In our initial setup, which existed this way for at least a year or more, we assigned the VIP to the real servers on boot-up using /etc/rc.local. The arptables package that comes from the Ubuntu repos does not contain a service script for loading ARPtables at boot time, so this was done initially using our CM tool Puppet. One day we experienced the dreaded ARP issue on one of our environments and this led me to change how we applied ARPtables. Since Puppet would run some arbitrary time after start-up, the VIP alias was being added first and the ARPtables rules applied a little later. This was an important rule that I discovered we were not following strictly enough. Puppet now just manages our ARPtables config and uses /etc/rc.local to load them before we assign the VIP in the same file during boot-up. This seems to have been working well, but at the time we experienced our first ARP issue, in troubleshooting we found that simply removing the VIP from the real servers resolved the performance issues immediately and did not affect the functionality of our application. The users could still use the application without the VIP on the real servers. This is what led me down a path for months, off and on, looking for explanations as to why the VIP is needed on real servers. I already saw first-hand that removing the VIP from the real servers did not impact our load balancing or ability to do direct routing via Keepalived, and it very much simplifies the complex configuration we are told to use by various online resources.

 

I finally got around to doing some additional experimentation. I removed the VIP alias and removed the ARPtables entries from the real servers. I even rebooted the real servers to be sure nothing was stuck in memory or in kernel memory. I ran packet captures from my local machine, on the LVS director, and on both real servers. What I found was that I was still able to navigate our application with no problem. The LVS director was only sending traffic one way, to the backend real servers. The real servers still sent the responses back directly to the client, and the packet captures on both the real servers and my local machine showed the source IP of those responses as being the VIP… Not sure how the real servers still responded with the VIP, but without confirmation online I’m left to think that it is Keepalived that mangled the packets so the real servers still send the responses back with the VIP. I don’t claim to be a networking expert by any means, so take my guesses with a grain of salt.

 

My concern is that the available documentation out there all claims that for LVS direct routing you must use ARPtables (or one of the other options offered) to prevent the real servers from responding to ARP requests, because we are told that the VIP has to be set up on the real servers. This does not seem to be true and can be eliminated altogether as far as I can tell. The real servers will not have any ARP issues if the VIP is never assigned to them in the first place. And not assigning the VIP to the real servers does not appear to break the core functionality of using LVS-DR+Keepalived. Am I missing something here? Did I make a mistake and am I missing the point? Does my particular setup change the picture, so that this applies in my case but not necessarily for everybody else? Is this something that Keepalived fixed and I just can’t find properly documented anywhere online?

 

Sorry for the long rant. I really just want to explain my situation and get my point across. Please respond with any feedback, advice, or follow up questions at your convenience. I’m likely going to start rolling out this newly discovered setup to our lower environments. Hope to hear all of your thoughts soon.

 

Thank you!

Dustin Makepeace / DevOps Engineer

 


LVS-DR+Keepalived - Do VIP's really need to be assigned to real servers

Dustin Makepeace
 

Good day,

 

For months I have been scouring the internet looking for details or explanations as to why the VIPs have to be assigned to the real servers. What I am finding is old, outdated, abandoned articles that only say “you have to do this to prevent an ARP issue” but don’t go into much further detail. While it seems that LVS may be an abandoned piece of software, I understand it is still widely used and very stable, and it continues to live in the kernel source code. I am now starting to understand the relationship between LVS and Keepalived; whereas I was always searching for LVS, I now find myself here asking the question, “Does Keepalived remove the requirement of the VIP being assigned to the real servers?”

 

My company has many environments from Dev, to test to Production. All of them use a dual LVS Master-Backup setup with Keepalived. Our configuration is set for direct routing.

 

We use the following (both reside in the same subnet):

 

On LVS routers:

Ubuntu 16.04.6

Keepalived - 1.2.24-1ubuntu0.16.04.2 (Main Ubuntu repo)

ipvsadm - 1.28-3ubuntu0.16.04.1 (Main Ubuntu repo)

 

On Real Servers:

Ubuntu 16.04.6

Apache2 - 2.4.18-2ubuntu3.13 (Main Ubuntu repo)

HAProxy - 1.7.11-1ppa1~xenial (Launchpad PPA)

 

In our initial setup, which existed this way for at least a year or more, we assigned the VIP to the real servers on boot-up using /etc/rc.local. The arptables package that comes from the Ubuntu repos does not contain a service script for loading ARPtables at boot time, so this was done initially using our CM tool Puppet. One day we experienced the dreaded ARP issue on one of our environments and this led me to change how we applied ARPtables. Since Puppet would run some arbitrary time after start-up, the VIP alias was being added first and the ARPtables rules applied a little later. This was an important rule that I discovered we were not following strictly enough. Puppet now just manages our ARPtables config and uses /etc/rc.local to load them before we assign the VIP in the same file during boot-up. This seems to have been working well, but at the time we experienced our first ARP issue, in troubleshooting we found that simply removing the VIP from the real servers resolved the performance issues immediately and did not affect the functionality of our application. The users could still use the application without the VIP on the real servers. This is what led me down a path for months, off and on, looking for explanations as to why the VIP is needed on real servers. I already saw first-hand that removing the VIP from the real servers did not impact our load balancing or ability to do direct routing via Keepalived, and it very much simplifies the complex configuration we are told to use by various online resources.

 

I finally got around to doing some additional experimentation. I removed the VIP alias and removed the ARPtables entries from the real servers. I even rebooted the real servers to be sure nothing was stuck in memory or in kernel memory. I ran packet captures from my local machine, on the LVS director, and on both real servers. What I found was that I was still able to navigate our application with no problem. The LVS director was only sending traffic one way, to the backend real servers. The real servers still sent the responses back directly to the client, and the packet captures on both the real servers and my local machine showed the source IP of those responses as being the VIP… Not sure how the real servers still responded with the VIP, but without confirmation online I’m left to think that it is Keepalived that mangled the packets so the real servers still send the responses back with the VIP. I don’t claim to be a networking expert by any means, so take my guesses with a grain of salt.

 

My concern is that the available documentation out there all claims that for LVS direct routing you must use ARPtables (or one of the other options offered) to prevent the real servers from responding to ARP requests, because we are told that the VIP has to be set up on the real servers. This does not seem to be true and can be eliminated altogether as far as I can tell. The real servers will not have any ARP issues if the VIP is never assigned to them in the first place. And not assigning the VIP to the real servers does not appear to break the core functionality of using LVS-DR+Keepalived. Am I missing something here? Did I make a mistake and am I missing the point? Does my particular setup change the picture, so that this applies in my case but not necessarily for everybody else? Is this something that Keepalived fixed and I just can’t find properly documented anywhere online?

 

Sorry for the long rant. I really just want to explain my situation and get my point across. Please respond with any feedback, advice, or follow up questions at your convenience. I’m likely going to start rolling out this newly discovered setup to our lower environments. Hope to hear all of your thoughts soon.

 

Thank you!

Dustin Makepeace / DevOps Engineer

 


Re: How do I define multiple IP's under one keepalived configuration WITH the ability to restart each one individually.

Quentin Armitage
 

On Fri, 2019-11-01 at 18:41 -0400, TomK wrote:
On 11/1/2019 9:52 AM, Quentin Armitage wrote:
On Fri, 2019-11-01 at 08:40 -0400, TomK wrote:

There are various other options that can be used with track files - see man page keepalived.conf(5).

If you execute:
echo 1 >PATH_TO_FILE1
any vrrp instance tracking the file will go to fault state.

Writing 0 to the file will allow the vrrp instance to come out of fault state.


This would do it.  By fault state you mean an effectively 'offline' state, correct?


By fault state I mean the vrrp instance would be down, and not attempting to communicate with other VRRP instances, so I suppose that is offline.

Quentin

Re: How do I define multiple IP's under one keepalived configuration WITH the ability to restart each one individually.

TomK
 

On 11/1/2019 9:52 AM, Quentin Armitage wrote:
On Fri, 2019-11-01 at 08:40 -0400, TomK wrote:
Hey All,

I have a need to restart individual VIPs under a single keepalived 
definition.  What I mean by that is that if I have a configuration 
similar to this one here:

https://access.redhat.com/discussions/3007011

How do I bring up individual IPs from the command line without having 
to restart the entire keepalived instance?  In other words, I'm looking 
for the ability to take individual VIPs offline, rather than 
the whole keepalived instance.

Is this possible?

Tom,

When you refer to VIPs, do you mean the VRRP instances? VIP is normally used to refer to virtual ip addresses, but since the example configuration you have linked to only has one VIP per VRRP instance, I take it you don't mean the individual addresses.

I meant the Virtual IP addresses. 



If you want to be able to force a VRRP instance to go to fault state, and then to be able to clear the fault state, the most efficient way is to use track files. Add the following to the configuration:

vrrp_track_file file1 {
    file PATH_TO_FILE1
}

vrrp_track_file file2 {
    file PATH_TO_FILE2
}

In the vrrp_instance blocks, add

track_file {
    file1 weight 0
}

and likewise for the second vrrp instance.
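Putting those pieces together, a minimal sketch for one instance might look like the following (the file path, interface, router id and address are placeholders):

    vrrp_track_file file1 {
        file /var/run/keepalived/instance1.down
    }

    vrrp_instance VI_1 {
        interface eth0
        virtual_router_id 51
        priority 200
        virtual_ipaddress {
            192.0.2.71/24
        }
        track_file {
            file1 weight 0      # any non-zero value in the file forces FAULT state
        }
    }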

There are various other options that can be used with track files - see man page keepalived.conf(5).

If you execute:
echo 1 >PATH_TO_FILE1
any vrrp instance tracking the file will go to fault state.

Writing 0 to the file will allow the vrrp instance to come out of fault state.


This would do it.  By fault state you mean an effectively 'offline' state, correct?



I hope that helps,

Quentin Armitage


--
Thx,
TK.