Multiple scenarios are possible.
1. There are two pairs of VSP servers at two sites. The issue is at one site. The primary AES server is inaccessible; the secondary is fine.
(Please note that this is not a System Platform with High Availability configured.)
The customer was able to access dom0 via SSH and/or at the console. (The customer may not know that AES runs on System Platform, so a technician may need to be sent to connect a keyboard and monitor to access the console.) To determine whether the AES and CDOM (udom) virtual machines are running, execute the following command as root from dom0 of the secondary System Platform server.
[root@AESR1 ~]# xm list
Name ID Mem VCPUs State Time(s)
Domain-0 0 512 4 r----- 258.1
aes 2048 6 71.8
udom 1024 1 60.2
This tells us the AES and CDOM virtual machines are running: both show a nonzero value under Time(s), which is why they are accessible.
At the server console of the System Platform server where the AES and CDOM are not accessible, the same command provides different output: we see 0.0 under Time(s), which indicates that the AES and CDOM are not running at all.
[root@AESR1 ~]# xm list
Name ID Mem VCPUs State Time(s)
Domain-0 0 512 4 r----- 258.1
aes 2048 6 0.0
udom 1024 1 0.0
2. The customer powered down the AES and System Platform servers. After the servers were powered back up, there is no access and the AES cannot be pinged. (The customer may not know that AES runs on System Platform, so a technician may need to be sent to connect a keyboard and monitor to access the console.)
At the server console of the System Platform server where the AES and CDOM are not accessible, the xm list command shows 0.0 under Time(s), which indicates that the AES and CDOM are not running at all.
[root@AESR1 ~]# xm list
Name ID Mem VCPUs State Time(s)
Domain-0 0 512 4 r----- 258.1
aes 2048 6 0.0
udom 1024 1 0.0
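The Time(s) check used in both scenarios can be scripted. The following is a minimal sketch, assuming the xm list output has the layout shown above, with Time(s) as the last column. The function name stopped_domains is hypothetical and is not part of System Platform.

```shell
# Hypothetical helper: reads `xm list` output on stdin and prints the name of
# every domain whose Time(s) value (assumed to be the last field) is zero.
# Lines whose last field is not a plain number (such as the header) are skipped.
stopped_domains() {
    awk '$NF ~ /^[0-9]+(\.[0-9]+)?$/ && $NF + 0 == 0 { print $1 }'
}

# Example use on dom0, as root:
#   xm list | stopped_domains
```

With the output from the inaccessible server above, this would print aes and udom, one per line. An empty result means every listed domain has accumulated some run time.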
If the virtual machines are not running, there are many possible causes. Collect more information to help troubleshoot, using the commands below.
1. Get the swversion output from dom0
2. uptime
3. df -h (to check disk space; look specifically for a full partition, for example /var at 100%)
4. find / -type f -size +100000000c | xargs ls -lth (use this command to find very large files on the dom0 server; +100000000c matches files larger than 100,000,000 bytes, roughly 100 MB)
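Steps 3 and 4 can be wrapped in small helpers so the results are easier to read. This is a sketch only: the function names full_partitions and large_files, and the 95 percent default threshold, are assumptions made for illustration. It also assumes df is run with the -P option so that each file system is reported on a single line.

```shell
# Hypothetical helper: reads `df -P` (or `df -hP`) output on stdin and prints
# the mount point and usage of every file system at or above the given
# percentage (default 95). Mount points containing spaces are not handled.
full_partitions() {
    awk -v limit="${1:-95}" 'NR > 1 {
        use = $5
        sub(/%/, "", use)              # "100%" -> "100"
        if (use + 0 >= limit + 0) print $6, $5
    }'
}

# Hypothetical helper: lists regular files larger than a byte threshold
# (default 100000000, as in step 4) under a directory (default /).
large_files() {
    find "${1:-/}" -type f -size +"${2:-100000000}"c -exec ls -lth {} +
}

# Example use on dom0, as root:
#   df -hP | full_partitions 95
#   large_files / 100000000
```

Using -exec ... {} + instead of a pipe to xargs is a design choice: it copes with file names that contain spaces, and it does not run ls on the current directory when find matches nothing, which GNU xargs does by default.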