Unexpected reboot: "syslogd 1.4.1: restart"

At a client's location we have installed two KEMP LM-5400G+ appliances in an HA setup. I have noticed that the "Boot Time" and "Active Since" times are frequently updated on both units. Looking at the log (/progs/logging/display/messages), I noticed the following:

Mar  8 21:20:35 KEMP1 l4d: Adding RS 171.23.255.198:443 to VS 192.110.200.102:443(domain.client.com)
Mar  8 21:20:35 KEMP1 l4d: Removing RS 171.23.255.198:443 from VS 192.110.200.102:443(domain.client.com) - EOF or Incorrect data received
Mar  8 21:21:24 KEMP1 syslogd 1.4.1: restart.
Mar  8 21:21:24 KEMP1 /usr/sbin/cron[2349]: (CRON) STARTUP (1.4.7)
Mar  8 21:21:24 KEMP1 /usr/sbin/cron[2349]: (CRON) INFO (Syslog will be used instead of sendmail.): No such file or directory
Mar  8 21:21:24 KEMP1 /usr/sbin/cron[2349]: (CRON) bad minute (/etc/crontab)
Mar  8 21:21:24 KEMP1 /usr/sbin/cron[2349]: (CRON) INFO (running with inotify support)
Mar  8 21:21:24 KEMP1 ucarp[2495]: [INFO] Local advertised ethernet address is [mac address]
Mar  8 21:21:24 KEMP1 ucarp[2495]: [WARNING] Switching to state: BACKUP
Mar  8 21:21:24 KEMP1 ucarp[2495]: [INFO] Local advertised ethernet address is [mac address]
Mar  8 21:21:24 KEMP1 ucarp[2495]: [WARNING] Switching to state: BACKUP
Mar  8 21:21:24 KEMP1 ucarp[2495]: [INFO] Local advertised ethernet address is [mac address]
Mar  8 21:21:24 KEMP1 ucarp[2495]: [WARNING] Switching to state: BACKUP
....
Mar  8 21:21:27 KEMP1 sslproxy_ha: (2654) listening for connections on 172.20.144.1:443 (id:0)
Mar  8 21:21:27 KEMP1 l4d: l4d [2792] started.
Mar  8 21:21:27 KEMP1 ucarp[2495]: [WARNING] Link bnd3 is up
Mar  8 21:21:27 KEMP1 ucarp[2495]: [WARNING] Link bnd0 is up
Mar  8 21:21:27 KEMP1 named[2832]: starting BIND 9.10.2-P3
Mar  8 21:21:27 KEMP1 named[2832]: built with '--host=i386-unknown-linux-gnu' '--with-dlz-stub=yes' '--without-openssl' '--without-gssapi' '--without-python' '--disable-threads' 'host_alias=i386-unknown-linux-gnu'
Mar  8 21:21:27 KEMP1 named[2832]: ----------------------------------------------------
Mar  8 21:21:27 KEMP1 named[2832]: BIND 9 is maintained by Internet Systems Consortium,

(IP, domain and MAC information changed to protect the client's network)

It seems the units are rebooting unexpectedly. What is the best way to troubleshoot this?

Both units were updated to firmware "7.1-32a-90.20160114-1842" last week, but this issue has been occurring since the units were installed.
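
As a first troubleshooting step, it may help to list every "syslogd ... restart" line in an exported copy of the log, since that line appears to mark each boot in the excerpt above. The following is only a minimal sketch: the function name `find_restarts` is hypothetical, the year has to be supplied by hand because the log timestamps do not include one, and it assumes the log lines keep the exact format shown above.

```python
import re
from datetime import datetime

# Matches lines such as: "Mar  8 21:21:24 KEMP1 syslogd 1.4.1: restart."
RESTART_RE = re.compile(
    r"^(?P<stamp>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) syslogd [\d.]+: restart"
)


def find_restarts(lines, year):
    """Return (host, datetime) for every syslogd restart line.

    `year` is an assumption supplied by the caller, because the
    syslog timestamps in the excerpt carry no year.
    """
    events = []
    for line in lines:
        match = RESTART_RE.match(line)
        if match:
            # Collapse the padded day ("Mar  8") to a single space.
            stamp = " ".join(match.group("stamp").split())
            when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
            events.append((match.group("host"), when))
    return events


if __name__ == "__main__":
    sample = [
        "Mar  8 21:20:35 KEMP1 l4d: Adding RS 171.23.255.198:443 to VS 192.110.200.102:443(domain.client.com)",
        "Mar  8 21:21:24 KEMP1 syslogd 1.4.1: restart.",
        "Mar  8 21:21:24 KEMP1 /usr/sbin/cron[2349]: (CRON) STARTUP (1.4.7)",
    ]
    for host, when in find_restarts(sample, 2016):
        print(host, when.isoformat())
```

Running this over the logs of both units and comparing the timestamps should show whether the two units restart at the same moments or independently, which is useful information to attach to a support ticket.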

10 comments

Mark Deegan

Hello 

This looks to be an issue with the HA. I would check whether IGMP snooping is enabled on the switch that the LoadMasters are connected to and, if so, disable it. I would also check that the time server is set correctly on both units, and ensure that the HA virtual ID is anything other than 1. I would use a dedicated interface (usually eth1) as the HA check interface, using a direct connect cable. Here is some information on setting up HA.

https://support.kemptechnologies.com/hc/en-us/articles/203125199-High-Availability-HA-

https://support.kemptechnologies.com/hc/en-us/articles/201837397-How-do-I-create-an-HA-pair-from-two-single-LoadMasters

If this does not help, I would open a support ticket in your region.

regards

Mark

Rinie Huijgen

Hello Mark,

I will ask our networking department to double-check those settings on the HA LAN. However, during installation (performed by an actual KEMP employee, at the client's office) we had multiple days of issues with HA, caused by these switch settings not being what the KEMP unit expects. The config would constantly undo any changes we made.

I do have serious concerns about a load balancer that can crash and reboot every couple of days because some network settings might be off. Is this a known issue? And will KEMP be resolving this soon?

Kind regards

Mark Deegan

Hello,

If the switch is the issue, then the direct connect cable mentioned above would be the solution.

regards

Mark

Rinie Huijgen

Hi Mark,

The distance between the two units is around 1 km. The units are placed in different datacenters, which is the whole reason for the HA setup. A direct cable is not possible.

The solution would be for KEMP to resolve this issue in the firmware and not have both units crash and reboot every couple of days. Wouldn't you agree?

Mark Deegan

Hello,

We usually do not recommend using HA across a link that is 1 km long, as the timeout on the CARP packet may expire before we get a response from the active unit. As per the HA document, the CARP packet is a layer 2 broadcast. If the passive unit does not receive the packet in a timely manner, then the unit fails over. We recommend using 2 units in HA when they are in the same network rack or datacenter. See the link below for the prerequisites for setting up HA. For multiple-site redundancy we use the GEO functionality. Here are some links on both.

GEO

https://support.kemptechnologies.com/hc/en-us/articles/203127879-GEO-Overview

HA

https://support.kemptechnologies.com/hc/en-us/articles/203125199-High-Availability-HA-#_Toc438636279

NOTE: 

  • LoadMasters must be located on the same subnet in order to be in a HA pair
  • LoadMasters must be in the same physical location
  • A layer 2 connection (Ethernet/VLAN) is required
  • The LoadMasters must not be located further than 100 meters from each other
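
For anyone who wants to compare a planned deployment against the list above, here is a rough sketch of such a check. It is not a KEMP tool: the `Unit` class, the `check_ha_prerequisites` function, the site labels and the example addresses (taken from the documentation-only 192.0.2.0/24 range) are all made up for illustration. The layer 2 requirement is not checked, since that cannot be derived from addresses alone.

```python
from dataclasses import dataclass
from ipaddress import ip_interface

MAX_DISTANCE_M = 100  # limit taken from the prerequisite list above


@dataclass
class Unit:
    name: str
    address: str  # interface address in CIDR form, e.g. "192.0.2.11/24"
    site: str     # hypothetical label for the physical location


def check_ha_prerequisites(a, b, distance_m):
    """Return a list of prerequisite violations (empty if none were found)."""
    problems = []
    # Same subnet: both interface addresses must fall in the same network.
    if ip_interface(a.address).network != ip_interface(b.address).network:
        problems.append("units are not on the same subnet")
    # Same physical location.
    if a.site != b.site:
        problems.append("units are not in the same physical location")
    # Maximum distance between the units.
    if distance_m > MAX_DISTANCE_M:
        problems.append(
            f"units are {distance_m} m apart (limit {MAX_DISTANCE_M} m)"
        )
    return problems


if __name__ == "__main__":
    kemp1 = Unit("KEMP1", "192.0.2.11/24", "datacenter-a")
    kemp2 = Unit("KEMP2", "192.0.2.12/24", "datacenter-b")
    # Roughly the situation described in this thread: about 1 km apart.
    for problem in check_ha_prerequisites(kemp1, kemp2, 1000):
        print(problem)
```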

regards

Mark

Rinie Huijgen

Hi Mark,

GEO load balancing is not what the client needs. The KEMP setup is HA for a good reason, as all components that the client uses in production are placed in both datacenters. As they operate multiple NetApp and EMC based sync storage units over fibre, the links between both datacenters are very low latency: around 5 to 10 ms, which is well below the requirement of 100 ms stated in the document you linked:

-Latency on the link between the two LoadMasters must be below 100 milliseconds

Again, a rebooting KEMP HA setup might be something you want to resolve? If you need more information, please let me know.

Mark Deegan

Hello 

I would say that we would need a backup and possibly a netconsole host to resolve your issue. I would recommend opening a support ticket. The link is below.

https://support.kemptechnologies.com/hc/en-us/articles/205120745-How-to-Submit-a-Support-Request

regards

Mark 

me

Hello,

We experienced the same issue with a VLM-5000 on the same software version, 7.1.32.

The unit suddenly reboots without any reason or preceding error in the log file.

This is a software issue and you have to check what is going wrong.

Regards, Martin

Rinie Huijgen

We have had extensive contact with Kemp support regarding this issue. The problem was firmware-based, and an update to 7.1.35 resolved the issue.

me

Ahhhhh! Thanks for this info.
This is very important information for us!

We will upgrade to 7.1.35 now; hopefully that resolves the issue.

Thank you very much.

Regards, Martin