Application Enablement Services (AES), Virtual System Platform (VSP): AES not working, CDOM is inaccessible


Doc ID    SOLN201928
Version:    7.0
Status:    Published
Published date:    05 Jul 2019
Created Date:    12 May 2012
Author:    Javier Aranguren
 

Details

The AES and CDOM servers were no longer accessible from the command line, could not be pinged, and the web page (OAM) was not accessible.

AES, CDOM, DOM.  This is an HA pair.

System Platform 6.0.1.0.5

Problem Clarification

Multiple scenarios are possible.

1. There are two pairs of VSP servers at two sites. The issue is at one site: the primary AES server is inaccessible, while the secondary is fine.
 (Please note this is not a System Platform with High Availability configured.)

The customer was able to access dom0 via ssh and/or at the console.  (The customer may not know that AES runs on System Platform, so a technician may need to be sent to connect a keyboard and monitor to access the console.)  To determine whether the AES and CDOM (udom) virtual machines are running, execute the following command as root on the secondary System Platform dom0.

[root@AESR1 ~]# xm list
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0   512     4     r-----     258.1
aes                                               2048     6                  71.8
udom                                              1024     1                  60.2

This shows that the AES and CDOM virtual machines are running: both have non-zero values under Time(s), which is why they are accessible.

At the server console of the System Platform server where the AES and CDOM are not accessible, the same command produces different output: Time(s) shows 0.0, which indicates that the AES and CDOM virtual machines are not running at all.

[root@AESR1 ~]# xm list
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0   512     4     r-----     258.1
aes                                               2048     6                   0.0
udom                                              1024     1                   0.0
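
The check described above (looking for 0.0 under Time(s)) can be sketched as a small shell helper. This is only an illustration, not an Avaya-supplied tool: the function name stopped_domains is hypothetical, and it assumes that Time(s) is always the last field on each line of xm list output, as in the samples shown here.

```shell
#!/bin/sh
# Hypothetical helper (not part of System Platform): reads `xm list` output
# on stdin and prints the name of every guest domain whose Time(s) is 0.0.
# Assumption: Time(s) is the last whitespace-separated field on each line.
stopped_domains() {
    awk 'NF >= 2 && $1 != "Name" && $1 != "Domain-0" \
         && $NF ~ /^[0-9.]+$/ && $NF + 0 == 0 { print $1 }'
}

# Usage on dom0 as root:
#   xm list | stopped_domains
```

Taking the last field rather than a fixed column number keeps the helper working when the ID and State columns are blank, as they are for aes and udom in the output above.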
 

2.  The customer powered down the AES and System Platform servers.  After the servers were powered back up, there was no access and AES could not be pinged.  (The customer may not know that AES runs on System Platform, so a technician may need to be sent to connect a keyboard and monitor to access the console.)

At the server console of the System Platform server where the AES and CDOM are not accessible, the xm list command shows 0.0 under Time(s), which indicates that the AES and CDOM virtual machines are not running at all.

[root@AESR1 ~]# xm list
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0   512     4     r-----     258.1
aes                                               2048     6                   0.0
udom                                              1024     1                   0.0
 

If the virtual machines are not running, there are many possible causes.  Collect more information to help troubleshoot by running the commands below on dom0.
1. swversion (to get the System Platform version)
2. uptime (to see whether dom0 recently restarted)
3. df -h (to check disk space; look specifically for a full partition, for example /var at 100%)
4. find / -type f -size +100000000c | xargs ls -lth (to locate very large files on the dom0 server; this lists files larger than 100,000,000 bytes)
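
The disk space check in step 3 can also be scripted. The following is a minimal sketch, assuming the standard df -P (POSIX format) output in which the fifth column is the capacity and the sixth is the mount point; the function name full_partitions and the default threshold of 95 are illustrative choices, not values from this article.

```shell
#!/bin/sh
# Hypothetical helper: reads `df -P` output on stdin and prints every mount
# point whose usage is at or above a threshold (percent, default 95).
# Assumption: mount points contain no spaces.
full_partitions() {
    threshold="${1:-95}"
    awk -v t="$threshold" 'NR > 1 {
        use = $5
        sub(/%/, "", use)          # strip the percent sign from the capacity
        if (use + 0 >= t + 0) print $6, $5
    }'
}

# Usage on dom0:
#   df -P | full_partitions 95
```

In the case described in this article, a helper like this would be expected to report /var at 100%.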

 

Cause

In this case, after running the above checks on dom0, we know that this is VSP 6.0.1.0.5, xm list shows that AES and CDOM are not running, uptime tells us that dom0 recently restarted, and df -h tells us that /var is full at 100% disk utilization.

Using the find command above, we determined that there is a large file at /var/account/pact (usually around 4GB).  The main issue is that the process accounting logs are not rotating and therefore fill up the disk.

Solution

On dom0 as root user:

Execute the following command to clear the large file:  rm -Rf /var/account/pact
Check disk space with the command:  df -h
Use this command to restart the psacct service:
service psacct restart
 

Start the CDOM and AES virtual machines on dom0.
xm start udom
xm start aes
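
The recovery steps above can be collected into a single sequence for review before anything is run. This is a hedged sketch only: recover_vsp and the RUN variable are hypothetical names introduced here, and the commands themselves are exactly those listed in this article. By default the function only prints what it would run.

```shell
#!/bin/sh
# Hypothetical wrapper around the steps in this article. Dry-run by default:
# it prints each command. Set RUN=1 (as root on dom0) to execute them in order.
recover_vsp() {
    for cmd in \
        "rm -Rf /var/account/pact" \
        "df -h" \
        "service psacct restart" \
        "xm start udom" \
        "xm start aes"
    do
        if [ "${RUN:-0}" = "1" ]; then
            # Stop at the first failing step so later steps are not attempted.
            sh -c "$cmd" || { echo "failed: $cmd" >&2; return 1; }
        else
            echo "would run: $cmd"
        fi
    done
}

# Dry run:   recover_vsp
# Execute:   RUN=1 recover_vsp
```

Afterwards, xm list should show non-zero Time(s) values for aes and udom if the virtual machines started.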

This is a known issue on VSP 6.0.1.0.5, and it is strongly recommended to upgrade this version to a minimum of VSP 6.0.3.10.3.  AES 5.2.3/5.2.4 is required for this release of System Platform.

Additional Relevant Phrases

SOLN201928

Avaya -- Proprietary. Use pursuant to the terms of your signed agreement or Avaya policy