Communication Manager: UPS,MIN: Uninterruptible Power Supply


Doc ID:    SOLN120642
Version:    12.0
Status:    Published
Published date:    23 Aug 2018
Created Date:    04 Dec 2007
Author:    Dennis Wennerstrom

Details

When a UPS event occurs, an alarm is raised and the server's state of health is degraded. The server will not shut down as long as battery power is available, because the goal of Communication Manager is to provide call processing service for as long as possible. Given the highly reliable file systems and recovery mechanisms in place, no damage to the server is expected. The server is expected to recover normally even if the UPS runs out of battery backup and the server catastrophically loses power. After power has been restored and the system has been up for approximately 1 hour and 15 minutes, the alarm is resolved automatically.
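
As a rough illustration of the automatic resolution described above, the following sketch estimates when the alarm should clear after the server comes back up. The function names are hypothetical, and the 1 hour 15 minute delay is the approximate figure quoted in this article, not an exact timer value.

```python
from datetime import datetime, timedelta

# Approximate delay quoted in this article; the real interval may differ.
ALARM_CLEAR_DELAY = timedelta(hours=1, minutes=15)


def expected_alarm_clear_time(system_up_since: datetime) -> datetime:
    """Earliest time the UPS alarm is expected to resolve on its own."""
    return system_up_since + ALARM_CLEAR_DELAY


def alarm_should_have_cleared(system_up_since: datetime, now: datetime) -> bool:
    """True once the system has been up for the full delay."""
    return now >= expected_alarm_clear_time(system_up_since)
```

If the alarm is still active well after this time, the event tables under Solution are the place to start troubleshooting.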

Problem Clarification

Alarm on MO UPS

Cause

A Linux-based Media Server (internal or external) is configured so that it serves as the trap collector and provides external alarm notification.

A process called the Global Maintenance Manager (GMM) runs on the Media Server and collects events that are logged to the Linux syslog_d process. These events consist primarily of failure notification events logged by Communication Manager (CM) and INTUITY maintenance subsystems. For events that require external notification, the most basic choice is to call the Avaya technical service center's Initialization and Administration System (INADS). However, other possible methods are sending an email and/or page to specified destinations and sending an SNMP trap to a specified network management station.
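
The collect-then-notify flow described above can be pictured with a small sketch. This is not GMM code: the class, the function, and the notifier names are hypothetical, and the notifiers only record a message instead of placing a call, sending mail, or emitting a trap.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class LoggedEvent:
    source: str                  # e.g. "CM" or "INTUITY"
    text: str                    # failure notification text from the log
    needs_external_notice: bool  # whether external notification is required


Notifier = Callable[[LoggedEvent], None]


def dispatch(event: LoggedEvent, notifiers: Dict[str, Notifier]) -> List[str]:
    """Send the event through every configured notifier; return their names."""
    if not event.needs_external_notice:
        return []
    used = []
    for name, send in notifiers.items():
        send(event)
        used.append(name)
    return used


# Stand-in notifiers that only record what would have been sent.
outbox: List[str] = []
notifiers: Dict[str, Notifier] = {
    "inads": lambda e: outbox.append(f"INADS call: {e.text}"),
    "email": lambda e: outbox.append(f"email: {e.text}"),
    "snmp_trap": lambda e: outbox.append(f"SNMP trap: {e.text}"),
}

event = LoggedEvent("CM", "UPS alarm, event 13", True)
used = dispatch(event, notifiers)
```

Events that do not require external notification are simply skipped, which mirrors the distinction the paragraph above draws.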

The Uninterruptible Power Supply (UPS) Maintenance Object (MO) supports the UPS device for each Media Server. This MO’s maintenance software reacts to UPS-generated in-line errors through SNMP traps.

Solution

SNMP Trap from UPS                              Event ID    Definition of Trap

Trap (1)                                        #1–8        Alarm string = #1,ACT,UPS,A,Event ID #,MAJ, Warning, system power failure: Possible UPS exhaustion in 1–8 minutes. The UPS battery’s power is critically low, with an estimated 8 minutes or less of holdover remaining. A warning is sent to every logged-in user of the server. For troubleshooting procedures, see Events #1–8.
Trap (3) – upsAlarmShutdownPending              #11         Alarm string = #1,ACT,UPS,A,11,WRN, Miscellaneous trap, e.g., bad battery. For troubleshooting procedures, see Event #11.
Trap (3) – upsAlarmShutdownPending              #12         Alarm string = #1,ACT,UPS,A,12,WRN, Miscellaneous trap, e.g., bad battery. For troubleshooting procedures, see Event #12.
Trap (3) [NOTE 1] – upsAlarmShutdownImminent    #13         Alarm string = #1,ACT,UPS,A,13,MAJ, Miscellaneous trap, e.g., bad battery. For troubleshooting procedures, see Event #13.
Trap (3) [NOTE 1] – upsAlarmDepletedBattery     #14         Alarm string = #1,ACT,UPS,A,14,MAJ, Miscellaneous trap, e.g., bad battery. For troubleshooting procedures, see Event #14.
Trap (3) [NOTE 1] – upsAlarmBatteryBad          #15         Alarm string = #1,ACT,UPS,A,15,MIN, Miscellaneous trap, e.g., bad battery. For troubleshooting procedures, see Event #15.
Trap (3) – upsAlarmInputBad                     #16         Alarm string = #1,ACT,UPS,A,16,MIN, Miscellaneous trap, e.g., bad battery. For troubleshooting procedures, see Event #16.
Trap (3) – upsAlarmTempBad                      #17         Alarm string = #1,ACT,UPS,A,17,MIN, Miscellaneous trap, e.g., bad battery. For troubleshooting procedures, see Event #17.
Trap (3) – upsAlarmCommunicationsLost           #18         Alarm string = #1,ACT,UPS,A,18,WRN, Miscellaneous trap, e.g., bad battery. For troubleshooting procedures, see Event #18.
Trap (3) – upsAlarmBypassBad                    #19         Alarm string = #1,ACT,UPS,A,19,MIN, Miscellaneous trap, e.g., bad battery. For troubleshooting procedures, see Event #19.
Trap (3) – upsAlarmLowBattery                   #20         Alarm string = #1,ACT,UPS,A,20,WRN, Miscellaneous trap, e.g., bad battery. For troubleshooting procedures, see Event #20.
Trap (3) – upsAlarmUpsOutputOff                 #21         Alarm string = #1,ACT,UPS,A,21,WRN, Miscellaneous trap, e.g., bad battery. For troubleshooting procedures, see Event #21.
Trap (3) – upsAlarmOutputBad                    #22         Alarm string = #1,ACT,UPS,A,22,WRN, Miscellaneous trap, e.g., bad battery. For troubleshooting procedures, see Event #22.
Trap (3) – upsAlarmOutputOverload               #23         Alarm string = #1,ACT,UPS,A,23,WRN, Miscellaneous trap, e.g., bad battery. For troubleshooting procedures, see Event #23.
Trap (3) – upsAlarmChargerFailed                #24         Alarm string = #1,ACT,UPS,A,24,WRN, Miscellaneous trap, e.g., bad battery. For troubleshooting procedures, see Event #24.
Trap (3) – upsAlarmFanFailure                   #25         Alarm string = #1,ACT,UPS,A,25,WRN, Miscellaneous trap, e.g., bad battery. For troubleshooting procedures, see Event #25.
Trap (3) – upsAlarmFuseFailure                  #26         Alarm string = #1,ACT,UPS,A,26,WRN, Miscellaneous trap, e.g., bad battery. For troubleshooting procedures, see Event #26.
Trap (3) – upsAlarmGeneralFault                 #27         Alarm string = #1,ACT,UPS,A,27,WRN, Miscellaneous trap, e.g., bad battery. For troubleshooting procedures, see Event #27.
Trap (4) – upsAlarmOnBattery                    #9          Alarm string = #1,ACT,UPS,A,9,WRN, Miscellaneous trap, e.g., bad battery. This UPS trap [Event #9] is a miscellaneous environmental alarm sent from the UPS that supports server A. For example, the battery may be bad and should be replaced.
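
When reading logs, it can help to split an alarm string such as `#1,ACT,UPS,A,16,MIN` into its parts. The sketch below does that under stated assumptions: the field names (`index`, `state`, `mo`, `server`) are illustrative guesses based on the strings in the table, not documented Avaya field names, and `parse_alarm_string` is a hypothetical helper.

```python
from typing import NamedTuple

# Severity codes that appear in the alarm strings in this article.
VALID_SEVERITIES = {"MAJ", "MIN", "WRN"}


class UpsAlarm(NamedTuple):
    index: str     # leading "#1" field (meaning assumed)
    state: str     # "ACT" in every string shown here
    mo: str        # maintenance object name, "UPS"
    server: str    # "A" (the table mentions the UPS that supports server A)
    event_id: int  # event number, e.g. 16
    severity: str  # MAJ, MIN, or WRN


def parse_alarm_string(raw: str) -> UpsAlarm:
    """Parse '#1,ACT,UPS,A,16,MIN', with or without an 'Alarm string =' prefix."""
    _, sep, rest = raw.partition("=")
    body = rest if sep else raw
    fields = [f.strip() for f in body.split(",")]
    if len(fields) < 6:
        raise ValueError(f"expected at least 6 fields, got {len(fields)}")
    index, state, mo, server, event_id, severity = fields[:6]
    if severity not in VALID_SEVERITIES:
        raise ValueError(f"unknown severity: {severity!r}")
    return UpsAlarm(index, state, mo, server, int(event_id), severity)


alarm = parse_alarm_string("Alarm string = #1,ACT, UPS, A, 13, MAJ,")
```

The parser tolerates the uneven spacing and trailing comma seen in the table, and raises `ValueError` for strings where the event ID is a placeholder instead of a number.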
 
Event     Alarm    Alarm Text, Cause/Description, Recommendation

1–8       MAJ      “upsEstimatedMinutesRemaining”: The UPS is supplying power and only 1–8 minutes of battery life remain. The UPS does not have an AC-power source.
1. Restore AC power to the UPS.

11        WRN      “upsAlarmOnBattery”: The UPS is drawing power from the batteries. This message should be accompanied by an “upsEstimatedMinutesRemaining” message.
1. Restore AC power to the UPS.

12        MAJ      “upsAlarmShutdownPending”: A shutdown-after-delay countdown is underway (i.e., the UPS has been commanded off).
1. Stop the countdown timer. (Can be done via SNMP messages.)

13        MAJ      “upsAlarmShutdownImminent”: The UPS will turn off power to the load in less than 5 seconds.
1. Restore AC power to the UPS.

14        MAJ      “upsAlarmDepletedBattery”: If primary power is lost, the UPS could not sustain the current load.
1. Charge or replace the batteries in the UPS. See the appropriate manual for the UPS model.

15        MAJ      “upsAlarmBatteryBad”: One or more batteries need to be replaced.
1. Replace any defective batteries in the UPS. See the appropriate manual for the UPS model.

16        MIN      “upsAlarmInputBad”: An input condition is out of tolerance.
1. Provide appropriate AC power to the UPS.

17        MIN      “upsAlarmTempBad”: The internal temperature of a UPS is out of tolerance. (On the UPS, the “over temperature” alarm indicator flashes, and the UPS changes to Bypass mode for cooling.)
1. Look for and remove any obstructions to the UPS’s fans.
2. Wait at least 5 minutes, and restart the UPS.
3. Check for and resolve any fan alarms (Event ID 25) against the UPS.
4. Either:
   a. Change (increase or decrease) the environment’s temperature.
   b. Change the alarming thresholds.

18        MIN      “upsAlarmCommunicationsLost”: The SNMP agent and the UPS are having communications problems. (A UPS diagnosis may be required.)
1. Behind the UPS in its upper left-hand corner, verify that an SNMP card (with an RJ45 connector) resides in the UPS, instead of a serial card with DB9 and DB25 connectors.
2. Verify that the server is physically connected to the UPS via the RJ45 connector.
3. Verify that the SNMP card is properly administered according to the procedures in its user’s guide, provided by the vendor.
4. If necessary, replace the SNMP card in the UPS.
5. If the problem persists, replace the UPS, and diagnose it later.

19        WRN      “upsAlarmBypassBad”: The “source” power to the UPS, which (during a UPS overload or failure) also serves as “bypass” power to the load, is out of tolerance: voltage off by more than ±12% or frequency off by more than ±3%. (This on-line UPS normally regenerates its source power into clean AC power for the load. However, the source power’s quality is currently unacceptable as bypass power to the load.)
1. Verify that the UPS expects the correct “nominal input voltage” from its power source.
2. If so, restore acceptable AC power to the UPS. If not, reconfigure the UPS to expect the correct voltage. See the appropriate manual for the UPS model.

20        WRN      “upsAlarmLowBattery”: The battery’s remaining run time is at or below the specified threshold.
1. Restore AC power to the UPS.

21        WRN      “upsAlarmUpsOutputOff”: As requested, the UPS has shut down output power. The UPS is in Standby mode.
1. Turn on output power. (Can be done via SNMP messages.)

22        WRN      “upsAlarmOutputBad”: A receptacle’s output is out of tolerance. (A UPS diagnosis is required.)
1. Replace the UPS, and diagnose it later.

23        WRN      “upsAlarmOutputOverload”: The load on the UPS exceeds its output capacity. The UPS enters Bypass mode.
1. Reduce the load on the UPS.
2. Verify that the UPS returns to Normal mode.

24        WRN      “upsAlarmChargerFailed”: The UPS battery charger has failed. (A UPS diagnosis is required.)
1. Replace the UPS, and diagnose it later.

25        WRN      “upsAlarmFanFailure”: One or more UPS fans have failed. Unless lightly loaded, the UPS enters Bypass mode.
1. Replace the UPS, and diagnose it later.

26        WRN      “upsAlarmFuseFailure”: One or more UPS fuses have failed.
1. Replace the UPS, and diagnose it later.

27        WRN      “upsAlarmGeneralFault”: A general fault occurred in the UPS. (A UPS diagnosis is required.)
1. Replace the UPS, and diagnose it later.
 
NOTE 1: This event degrades the server’s state of health.
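
For quick triage, the Event table can be held as a lookup keyed by event ID. The sketch below copies the alarm level, alarm text, and first recommended step from the Event table (the trap table may show a different level for some events). The names `EVENTS` and `lookup_event` are hypothetical, and only the first step of each procedure is included.

```python
from typing import Dict, Optional, Tuple

REPLACE_UPS = "Replace the UPS, and diagnose it later."
RESTORE_AC = "Restore AC power to the UPS."

# event ID -> (alarm level, alarm text, first recommended step)
EVENTS: Dict[int, Tuple[str, str, str]] = {
    11: ("WRN", "upsAlarmOnBattery", RESTORE_AC),
    12: ("MAJ", "upsAlarmShutdownPending", "Stop the countdown timer."),
    13: ("MAJ", "upsAlarmShutdownImminent", RESTORE_AC),
    14: ("MAJ", "upsAlarmDepletedBattery", "Charge or replace the batteries in the UPS."),
    15: ("MAJ", "upsAlarmBatteryBad", "Replace any defective batteries in the UPS."),
    16: ("MIN", "upsAlarmInputBad", "Provide appropriate AC power to the UPS."),
    17: ("MIN", "upsAlarmTempBad", "Look for and remove any obstructions to the UPS's fans."),
    18: ("MIN", "upsAlarmCommunicationsLost", "Verify that an SNMP card resides in the UPS."),
    19: ("WRN", "upsAlarmBypassBad", "Verify the UPS's nominal input voltage setting."),
    20: ("WRN", "upsAlarmLowBattery", RESTORE_AC),
    21: ("WRN", "upsAlarmUpsOutputOff", "Turn on output power."),
    22: ("WRN", "upsAlarmOutputBad", REPLACE_UPS),
    23: ("WRN", "upsAlarmOutputOverload", "Reduce the load on the UPS."),
    24: ("WRN", "upsAlarmChargerFailed", REPLACE_UPS),
    25: ("WRN", "upsAlarmFanFailure", REPLACE_UPS),
    26: ("WRN", "upsAlarmFuseFailure", REPLACE_UPS),
    27: ("WRN", "upsAlarmGeneralFault", REPLACE_UPS),
}

# Events 1 through 8 share one entry (estimated minutes of battery remaining).
for minutes in range(1, 9):
    EVENTS[minutes] = ("MAJ", "upsEstimatedMinutesRemaining", RESTORE_AC)


def lookup_event(event_id: int) -> Optional[Tuple[str, str, str]]:
    """Return (level, alarm text, first step), or None if the ID is not listed."""
    return EVENTS.get(event_id)
```

An ID that the Event table does not list, such as 9, returns `None`, so the caller can fall back to the trap table entry.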

============================================

 

Legacy ID

KB01029622

Additional Relevant Phrases

CM: UPS beeping, no power UPS system beeping and red alarm CM: UPS alarm

Avaya -- Proprietary. Use pursuant to the terms of your signed agreement or Avaya policy