Effects of enabling and disabling parity error detection on an ERS 8600 running software version 7.2.0.2.
The existing SSF failure dynamic detection functionality requires at least one parity error per 500 ms on any I/O module TAP before it declares an SSF problem. For example, if the user sets the threshold to 8 errors (config bootconfig parity-errors set 8), there must be at least one error in every 500 ms period over an interval of 4 seconds (8 x 500 ms).
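The threshold arithmetic above can be illustrated with a minimal sketch. It assumes that the configured value counts consecutive 500 ms intervals that each contain at least one parity error. This is an interpretation of the description, not the switch's actual code, and the function names are hypothetical.

```python
INTERVAL_MS = 500  # polling interval stated in the text


def window_seconds(threshold: int) -> float:
    """Length of the observation window implied by the configured threshold."""
    return threshold * INTERVAL_MS / 1000


def ssf_failure_declared(errors_per_interval, threshold: int) -> bool:
    """Return True if `threshold` consecutive 500 ms intervals each saw >= 1 error.

    errors_per_interval: parity-error counts, one entry per 500 ms interval
    (hypothetical input format, used only for illustration).
    """
    consecutive = 0
    for count in errors_per_interval:
        if count >= 1:
            consecutive += 1
            if consecutive >= threshold:
                return True
        else:
            # An error-free interval breaks the run, so counting starts over.
            consecutive = 0
    return False


if __name__ == "__main__":
    print(window_seconds(8))                 # window for "parity-errors set 8"
    print(ssf_failure_declared([1] * 8, 8))  # one error in each of 8 intervals
```

Under this reading, a burst of many errors inside a single 500 ms interval does not by itself trigger detection, because the following error-free intervals reset the count.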
Certain limitations of this new feature:
1. When there is only one CP in the chassis and an excess parity error is detected, the line card is now powered off before the CP is rebooted to monitor mode.
2. When the parity error occurs on the slave CP, the "autoboot flag" is set to false in the slave boot.cfg. As a result, the slave CP is rebooted and stays in boot monitor mode, which effectively takes the slave/standby SF out of service.
3. When the master detects the excess parity error, the "autoboot flag" is set to false in both the master and slave boot.cfg. As a consequence, if a second error occurs on the new master, it will reboot and stop at the boot monitor. The customer has a script that sets the "autoboot flag" back to true in the master boot.cfg.
4. When the parity error is detected on the master, the former slave that becomes the new master will initialize the failed SF. The failed SF will still be in service.
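The four cases above can be summarized as a small decision table. The sketch below only restates the behavior described in the list. The function name, argument names, and action strings are hypothetical and do not correspond to any switch API.

```python
from typing import List


def actions_on_excess_parity_error(detected_on: str, cp_count: int) -> List[str]:
    """Return the behavior described above for an excess parity error.

    detected_on: "master" or "slave" (the CP on which the error was detected)
    cp_count:    number of CP modules in the chassis (1 or 2)
    """
    if cp_count == 1:
        # Limitation 1: single-CP chassis
        return [
            "power off the line card",
            "reboot the CP to monitor mode",
        ]
    if detected_on == "slave":
        # Limitation 2: slave stays in boot monitor, standby SF out of service
        return [
            "set autoboot flag to false in slave boot.cfg",
            "reboot slave CP; it stays in boot monitor mode",
        ]
    if detected_on == "master":
        # Limitations 3 and 4: both flags cleared, failed SF re-initialized
        return [
            "set autoboot flag to false in master and slave boot.cfg",
            "former slave becomes master and initializes the failed SF",
            "failed SF remains in service",
        ]
    raise ValueError("detected_on must be 'master' or 'slave'")


if __name__ == "__main__":
    for step in actions_on_excess_parity_error("master", 2):
        print(step)
```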
SSF Failure Dynamic Detection Enhanced Functionality:
Enhancement I: Check the parity bit of the SSF/CPU module's TAP:
This enhancement covers packets destined for the CPU. For example, an OSPF packet sent from a neighbor could be corrupted by the SSF and then dropped by the OctaPid before being handed to the CPU. This condition is now detected through the 8895SF (or 8692SF) TAP parity bit.
Enhancement II: Attempt to recover by resetting the SSF:
This enhancement calls "swipLockupReset" to attempt recovery when an SSF failure is detected. Based on experience, about 50% of SSF problems can be recovered in this way.
Enhancement III: Detection for low-rate traffic:
This enhancement covers control packets, such as PIM and OSPF, which are sent at low frequency.
It runs in parallel with the high-rate (one error per 500 ms) detection mechanism.