Question about reload / add or remove VIPS

Quentin Armitage
 

On Thu, 2019-10-17 at 09:40 -0700, Luiz Fernando Marques wrote:

Hello,

 

I am wondering how I can efficiently add or remove IPs in keepalived without causing unexpected outages or failovers between the master and backup servers.

I currently run keepalived 2.0.7 on 2 servers, with a single "vrrp_instance" and about 30 VIPs, 5 of which are IPv6 addresses configured under "virtual_ipaddress_excluded".

At present, when I have to add or remove an IP, I do it on the master, run 'systemctl reload keepalived', and then have to run the same command on the backup immediately. Otherwise the backup concludes that the master has more or fewer characters in its configuration file, takes over as master, and starts logging errors such as "Keepalived_vrrp[20666]: (1) vrrp packet too long, length 204 and expect 200". Is there an alternative to this? What is the recommended way?

 


Fortunately there is a solution to this. It will require several reloads the first time, but subsequent reloads will be easier, provided you are happy to keep the configuration change that makes it work.

You need to add the keyword 'skip_check_adv_addr' to the vrrp instance, or 'vrrp_skip_check_adv_addr' to the global_defs section (this has been available since v1.2.20). With this option set, when a backup instance receives an advert from the same master that sent the previous advert, it does not check the addresses in the advert. It was added as an optimisation to save the cost of checking every advert, on the basis that a master won't change the addresses in its advert packets. We are now going to abuse that assumption!
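Before relying on the option, it may be worth confirming that each node's configuration really contains the keyword, spelled exactly. The following is only a sketch: the function name `has_skip_check`, the instance name, interface, VRID and address are placeholders invented for illustration, and the sample fragment is written to a temporary file rather than to a real keepalived.conf.

```shell
#!/bin/bash
# Return 0 if the file sets skip_check_adv_addr (per instance) or
# vrrp_skip_check_adv_addr (global_defs) at the start of a line.
# has_skip_check is a hypothetical helper, not part of keepalived.
has_skip_check() {
	grep -Eq '^[[:space:]]*(vrrp_)?skip_check_adv_addr([[:space:]]|$)' "$1"
}

# Sample fragment; all names and addresses here are placeholders.
conf=$(mktemp)
cat >"$conf" <<'EOF'
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    skip_check_adv_addr
    virtual_ipaddress {
        192.0.2.10/24
    }
}
EOF

if has_skip_check "$conf"; then
	echo "skip_check_adv_addr is set"
fi
rm -f "$conf"
```

On a real system the function would be pointed at the actual configuration file, for example /etc/keepalived/keepalived.conf if that is where your distribution keeps it.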

Keepalived doesn't remember, across a config reload, who the current master is, so a backup instance will always check the addresses in the first advert it receives after a reload (this is probably a sensible thing to do). It is therefore necessary to enable skip_check_adv_addr on the backup instances first, then reload the master instance with the new VIPs, and only afterwards reload the backup instances with the new VIPs.

You need to do the following in sequence:
1. Add the 'skip_chk_adv_addr' or 'vrrp_skip_check_adv_addr' to each backup instance configuration (do not change the VIPs at this stage), and then execute 'systemctl reload keepalived'. After the reload the backups will check the addresses in the first advert they receive, and they will match. They won't check the addresses in subsequent adverts.

2. Add/remove/change the VIPs in the master instance configuration (and if you want to keep skip_chk_adv_addr in the configurations, add it to the master's configuration now).

3. Execute 'systemctl reload keepalived' on the master. The backups will not see a change of master, and so will not check the addresses in the following adverts.

4. On each of the backup instances, edit the configuration to add/remove/change the VIPs and execute 'systemctl reload keepalived'. They will check the next advert they receive after the reload, but since their configuration matches the master's configuration, there will not be a problem.

5. If you don't want to leave the 'skip_chk_adv_addr' in the configuration, edit the configs to remove it and reload each instance.

If you are happy keeping 'skip_check_adv_addr' in the configurations, then for future reloads you will only need to do steps 2, 3 and 4.
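The ordering in steps 3 and 4 (master first, then each backup) can be sketched as a small helper that only prints the commands. This is an illustration under assumptions: the host names lb1 and lb2 and the function name `plan_reload` are hypothetical, the edited configuration is assumed to be in place on each host already, and ssh access is assumed. Nothing is executed unless you choose to run the printed commands.

```shell
#!/bin/bash
# Hypothetical host names; replace them with your own.
MASTER="lb1"
BACKUPS="lb2"

# Print the reload commands in the required order: the master first,
# then every backup. plan_reload is an illustrative helper only.
plan_reload() {
	local master=$1
	shift
	echo "ssh $master systemctl reload keepalived"
	local backup
	for backup in "$@"; do
		echo "ssh $backup systemctl reload keepalived"
	done
}

# BACKUPS is left unquoted on purpose so a space-separated list is split.
plan_reload "$MASTER" $BACKUPS
```

Printing rather than executing keeps the sketch safe to try; the output can be reviewed and run by hand once the configurations on each host have been checked.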

This approach only works because you do not have a mixture of master and backup instances on the same system at the same time (which is the case for you, since you only have one vrrp instance).

I had been thinking about adding an option to tell keepalived instances to reload at a specific time, so that reloads could all be scheduled to happen simultaneously, and then no backup instance would see a wrong set of addresses in an advert.

I hope this helps,

Quentin Armitage

Luiz Fernando Marques
 


Good afternoon,

I followed the procedure you described, but it did not work.

1 - The option 'skip_chk_adv_addr' does not work in the vrrp instance or in global_defs.

2 - The 'vrrp_skip_check_adv_addr' option works in global_defs. As a test I configured different IPs on the master and the backup, and operation continued normally. However, when adding or removing IPs the same error occurs.

I believe it has to do with the number of entries declared in 'virtual_ipaddress': if the master has more or fewer than the backup, the backup concludes that the master is down and takes over (once the 'advert_int 10' interval expires, of course).

Is there any parameter that disables this check on the number of lines / IPs configured?

Quentin Armitage
 

Hi,

My apologies for not getting it quite right.

1. Unfortunately I mistyped the keyword; it should be 'skip_check_adv_addr' rather than 'skip_chk_adv_addr'.

2. You are quite right. There is a separate check that the advert packet is the correct length based on the number of VIPs configured, which isn't turned off by setting skip_check_adv_addr.

I have had a further look at the RFCs for VRRPv2 and VRRPv3, and they state the following:
 - MUST verify that the received packet contains the complete VRRP packet (including fixed fields, and IPvX address).
and
 - MAY verify that "Count IPvX Addrs" and the list of IPvX address(es) match the IPvX Address(es) configured for the VRID.

At the moment, keepalived is using the number of configured VIPs to determine the expected packet length, whereas I now think it should be using the Count IPvX Addrs field in the VRRP packet header.

I have now pushed commit be47538, which changes keepalived to use the Count IPvX Addrs field. With this commit it is now possible to reload with a different number of VIPs, following the procedure I described in my previous message.
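To illustrate why the address count determines the expected length, here is a sketch based on the packet layouts in the RFCs: a VRRPv2 advert (RFC 3768) has an 8-byte fixed header, 4 bytes per IPv4 address and 8 bytes of authentication data, while a VRRPv3 advert (RFC 5798) has the 8-byte header followed by 4 or 16 bytes per address. The function name `vrrp_adv_len` is hypothetical, and whether the lengths keepalived logs also include other headers is not stated in this thread, so treat the absolute numbers as illustrative.

```shell
#!/bin/bash
# vrrp_adv_len VERSION ADDR_BYTES COUNT
# Length in bytes of the VRRP part of an advert carrying COUNT addresses.
# ADDR_BYTES is 4 for IPv4 and 16 for IPv6. Hypothetical helper.
vrrp_adv_len() {
	local version=$1 addr_bytes=$2 count=$3
	local len=$(( 8 + addr_bytes * count ))   # fixed header + address list
	if (( version == 2 )); then
		len=$(( len + 8 ))                # VRRPv2 authentication data
	fi
	echo "$len"
}

vrrp_adv_len 2 4 25    # prints 116
vrrp_adv_len 2 4 26    # prints 120
vrrp_adv_len 3 16 2    # prints 40
```

Adding one IPv4 VIP grows the advert by 4 bytes, which is the same difference as in the "length 204 and expect 200" message from the original question.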

Unfortunately this won't resolve the issue for v2.0.7 (unless you backport the patch), and I can't see any other way of achieving what you want to do directly with keepalived.

You could try using the following script to reload all vrrp instances simultaneously, which should stop backup instances becoming masters. The script would need to be run on all systems where keepalived is running. To reload at 12:31:30, run it as: timed_reload 12 31 30 . Specifying the seconds is optional.

cat /tmp/timed_reload
#!/bin/bash
# Reload keepalived at the given time of day: timed_reload HH MM [SS]

if [[ $# -lt 2 || $# -gt 3 ]]; then
	echo "Usage: $0 HH MM [SS]"
	exit 1
fi

# Strip leading zeros so that values such as 08 are not read as octal
HOUR=$( <<<"$1" sed -e "s/^0*//")
MIN=$( <<<"$2" sed -e "s/^0*//")
[[ $# -eq 3 ]] && SEC=$( <<<"$3" sed -e "s/^0*//") || SEC=0

# Current hour, minute, second and nanosecond
set -- $(date +"%H %M %S %N")
C_HOUR=$( <<<"$1" sed -e "s/^0*//")
C_MIN=$( <<<"$2" sed -e "s/^0*//")
C_SEC=$( <<<"$3" sed -e "s/^0*//")
C_NSEC=$( <<<"$4" sed -e "s/^0*//")

# Whole seconds from now until the requested time
DELAY=$(( (HOUR - C_HOUR) * 3600 + (MIN - C_MIN) * 60 + SEC - C_SEC ))

# Allow for the part of the current second that has already elapsed
if [[ $C_NSEC -ne 0 ]]; then
	: $((DELAY--))
	NSEC_DELAY=$((1000000000 - C_NSEC))
	NSEC_DELAY=.$(printf "%9.9d" "$NSEC_DELAY")
else
	NSEC_DELAY=
fi

if [[ $DELAY -lt 0 ]]; then
	echo "Time specified is in the past"
	exit 1
fi

sleep "$DELAY$NSEC_DELAY"

systemctl reload keepalived
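The delay arithmetic in the script can be checked on its own with a small sketch. The function name `delay_until` is hypothetical, and it uses bash's `10#` base prefix instead of the sed substitution to cope with leading zeros; it only computes whole seconds and ignores the nanosecond adjustment.

```shell
#!/bin/bash
# delay_until HH MM SS NOW_HH NOW_MM NOW_SS
# Whole seconds from the "now" time to the target time on the same day.
# The 10# prefix forces base 10, so 08 and 09 are not rejected as octal.
delay_until() {
	echo $(( (10#$1 - 10#$4) * 3600 + (10#$2 - 10#$5) * 60 + (10#$3 - 10#$6) ))
}

delay_until 12 31 30 12 30 00    # prints 90
delay_until 09 00 00 08 59 30    # prints 30
delay_until 12 00 00 12 00 01    # prints -1 (target is in the past)
```

A negative result corresponds to the "Time specified is in the past" branch of the script.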

I hope that helps,

Quentin Armitage


Luiz Fernando Marques
 

Thank you!

I have updated to v2.0.19 (10/19/2019) and it worked!

I was depending on this feature to put the environment into production. Thank you again.

