It was noticed that whenever the "Stack sync failed" log message was seen, a unit became isolated from the stack and ran standalone.
According to the log reference guide, the only solution is to reboot the stack.
57:09:49:34 17317 NVR Stack sync failed Exp: 0x4FE3DBDA Act: 0x4F63DBDA Missing: 0x00800000
These logs are generated by the stack manager and state that stack synchronization failed on the unit. The Missing value identifies the application(s) that caused the failure.
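As a rough illustration of how the Missing value relates to the other two fields, the sketch below parses the log line above and compares the logged Missing value with the bits that are set in Exp but not in Act. It assumes the three fields are bitmasks, which is consistent with this sample line but is not confirmed by the log reference guide. The function name is hypothetical, and the mapping from bit position to application is not given here, so it is not attempted.

```python
import re

# Pattern for the Exp/Act/Missing fields as they appear in the log line above.
FIELDS = re.compile(
    r"Exp:\s*(0x[0-9A-Fa-f]+)\s+Act:\s*(0x[0-9A-Fa-f]+)\s+Missing:\s*(0x[0-9A-Fa-f]+)"
)

def missing_bits(line):
    """Return (computed mask, logged Missing mask, set bit positions)."""
    m = FIELDS.search(line)
    if m is None:
        raise ValueError("not a 'Stack sync failed' line")
    exp, act, logged = (int(v, 16) for v in m.groups())
    # Assumption: Missing = bits expected (Exp) but not present (Act).
    computed = exp & ~act
    positions = [b for b in range(32) if (computed >> b) & 1]
    return computed, logged, positions

line = ("57:09:49:34 17317 NVR Stack sync failed "
        "Exp: 0x4FE3DBDA Act: 0x4F63DBDA Missing: 0x00800000")
computed, logged, positions = missing_bits(line)
print(hex(computed), hex(logged), positions)  # 0x800000 0x800000 [23]
```

For this line the computed mask matches the logged Missing value, and a single bit (bit 23) is set, which suggests that one application failed to synchronize.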
These are severity logs, but if the unit does not succeed in joining the stack, the unit needs to be rebooted.
A few other stacks in the network seem to have the same issue.
The common factor across all impacted switches is the NSNA configuration.
We further analyzed all the logs and saw a large number of NSNA logs on all devices.
It seems the sync failed messages are the result of an NSNA failure. There appears to be a software exception on two tasks that are relevant to NSNA traffic (see "Exception tasks" under Logs), and it is not clear why no core dump is being generated.
If you follow the timestamps in the log, the sync messages are preceded by a huge number of NSNA messages.
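To follow the timeline in a saved log, the timestamps can be converted to seconds and compared. The sketch below assumes the leading field is switch uptime in days:hours:minutes:seconds. The helper names are hypothetical, and matching NSNA messages by a keyword such as "NSNA" is an assumption about how those messages appear in the log. Only log lines quoted in this note are used as sample input.

```python
import re

# Assumption: the stamp is uptime as days:hours:minutes:seconds.
STAMP = re.compile(r"(\d+):(\d{2}):(\d{2}):(\d{2})")

def uptime_seconds(line):
    """Return the first d:hh:mm:ss stamp in a log line as seconds, or None."""
    m = STAMP.search(line)
    if m is None:
        return None
    d, h, mi, s = (int(g) for g in m.groups())
    return ((d * 24 + h) * 60 + mi) * 60 + s

def count_preceding(lines, anchor_line, keyword, window_s):
    """Count lines containing keyword logged up to window_s seconds before anchor_line."""
    anchor = uptime_seconds(anchor_line)
    count = 0
    for line in lines:
        t = uptime_seconds(line)
        if t is None or keyword.lower() not in line.lower():
            continue
        if 0 <= anchor - t <= window_s:
            count += 1
    return count

sync = ("57:09:49:34 17317 NVR Stack sync failed "
        "Exp: 0x4FE3DBDA Act: 0x4F63DBDA Missing: 0x00800000")
suspended = "C 57:09:54:39 17334 NVR Task tSSA is suspended"

# Seconds between the sync failure and the tSSA suspension entry.
print(uptime_seconds(suspended) - uptime_seconds(sync))  # 305
```

With the full log loaded as a list of lines, a call such as `count_preceding(lines, sync, "NSNA", 600)` would count the keyword matches in the ten minutes before the sync failure.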
There are known limitations recorded in the 6.3.1 release notes that match the issue seen in the log.
We spoke to our engineering team, and they confirmed that NSNA has been replaced by Identity Engines.
The customer should consider that option and can reach out to the sales team about it.
As far as the NSNA issue is concerned, I am afraid it will not be fixed.
Please see page 64 of the 6.3.1 release notes for the known NSNA issues.
Logs:
Exception tasks:
C 57:09:54:39 17334 NVR Task tSSA is suspended
C 57:09:54:39 17333 NVR Task tLLDP is suspended
Sync failure:
57:09:49:34 17317 NVR Stack sync failed Exp: 0x4FE3DBDA Act: 0x4F63DBDA Missing: 0x00800000