AACC HA MAS error

  • sfitzg
    Guru
    • Jul 2010
    • 190

    AACC HA MAS error

    What is the correct state of the backup MAS in an HA environment (2x MAS on Linux)? The primary MAS changes from reporting HA as unavailable to "No alarm" when the backup comes up, but the backup appears unable to communicate with the primary, as shown in the attached screenshot. The primary MAS is licensed, and its license also covers the backup's MAC address.

    The backup shows "Pending Update" ...

    Any thoughts would be appreciated.
    Attached Files
    Last edited by sfitzg; 06-06-2012, 07:56 PM.
  • stphnwd
    Brainiac
    • Jan 2011
    • 52

    #2
    The AMS servers in an HA pair need to be licensed appropriately. For example:

    The two plicd lines are the important part: they let the LM know that there is a primary and a backup license server.

    # plicd 1.2 00:03:73:f6:5c:cc 00:03:73:F6:5C:18 (1) 360 secs
    # plicd 1.2 00:03:73:f6:5c:18 00:03:73:F6:5C:CC (1) 360 secs
    # __sip-annc::sess 1.0 00:03:73:f6:5c:cc (881) 360 secs
    # __sip-annc::sess 1.0 00:03:73:f6:5c:18 (881) 360 secs
    # __sip-dialog::sess 1.0 00:03:73:f6:5c:cc (48) 360 secs
    # __sip-dialog::sess 1.0 00:03:73:f6:5c:18 (48) 360 secs
    # __sip-conf::sess 1.0 00:03:73:f6:5c:cc (185) 360 secs
    # __sip-conf::sess 1.0 00:03:73:f6:5c:18 (185) 360 secs
    # inst::auth 1.0 00:03:73:f6:5c:cc (6) 360 secs
    # inst::auth 1.0 00:03:73:f6:5c:18 (6) 360 secs
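
A listing like the one above can be sanity-checked with a few lines of Python. This is only a sketch: it assumes the listing format shown in this post, and it assumes that the two plicd lines should name the same two MAC addresses in swapped order (inferred from the example, not from Avaya documentation). The function names are made up for illustration. MAC addresses are compared case-insensitively.

```python
import re

MAC = r"(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}"
# Matches e.g. "# plicd 1.2 00:03:73:f6:5c:cc 00:03:73:F6:5C:18 (1) 360 secs"
PLICD_RE = re.compile(r"plicd\s+\S+\s+(" + MAC + r")\s+(" + MAC + r")")

SAMPLE = """\
# plicd 1.2 00:03:73:f6:5c:cc 00:03:73:F6:5C:18 (1) 360 secs
# plicd 1.2 00:03:73:f6:5c:18 00:03:73:F6:5C:CC (1) 360 secs
# __sip-annc::sess 1.0 00:03:73:f6:5c:cc (881) 360 secs
"""

def plicd_pairs(text):
    """Return (first_mac, second_mac) per plicd line, lower-cased."""
    return [(a.lower(), b.lower()) for a, b in PLICD_RE.findall(text)]

def is_mirrored_pair(text):
    """True if exactly two plicd lines exist and each is the other swapped."""
    pairs = plicd_pairs(text)
    if len(pairs) != 2:
        return False
    (a1, b1), (a2, b2) = pairs
    return a1 == b2 and b1 == a2 and a1 != b1
```

Calling `is_mirrored_pair(SAMPLE)` on the two plicd lines from this post returns `True`; a listing with only one plicd line, or with a mistyped MAC on one side, returns `False`.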
    Stephen Wood


    • sfitzg
      Guru
      • Jul 2010
      • 190

      #3
      Thanks for that,

      Yes, my license file appears to be correct, and you have also answered a question I had about upper and lower case in the MAC addresses.

      However, I still have a problem. I believe that replication is not working or, more likely, not configured correctly, and therefore the license is not getting through to the backup server.


      • sfitzg
        Guru
        • Jul 2010
        • 190

        #4
        Making progress: I am now getting relevant errors on the backup server, namely "Mirror messaging connection unavailable", as shown in the attachment. Now I just have to work out how to fix it.
        Attached Files


        • stphnwd
          Brainiac
          • Jan 2011
          • 52

          #5
          That is a good message in a way. Have you restarted both servers, the primary first and then the backup?
          Stephen Wood


          • sfitzg
            Guru
            • Jul 2010
            • 190

            #6
            I have restarted the servers and even gone as far as reinstalling the backup server.


            Trawling through the log dumps, the only thing I can find wrong is that the backup server tries to open a connection to the primary server on port 4004. The primary server is only listening on this port on localhost, so the connection fails.

            In the CStore debug log, the backup and primary seem to be communicating OK, but then a CS_MIRROR_API_DOWN alarm is raised. Does anybody know whether this is related to the port 4004 issue above?

            (08 13:34:33.550)<I,CStore,NWDBTrigger Thread,00000000-0000-0000-0000-000000000000> (NWIniConfig) Dynamic config change : HA_CFG_SERVICES_RUNNING : [0 -> 1].
            (08 13:34:33.570)<I,CStore,NWDBTrigger Thread,00000000-0000-0000-0000-000000000000> (NWIniConfig) Dynamic config change : HA_CFG_NODE_STATE : [HBM_BACKUP_SHUTDOWN -> HBM_BACKUP_SEARCHING].
            (08 13:34:33.571)<I,CStore,NWDBTrigger Thread,00000000-0000-0000-0000-000000000000> (NWIniConfig) Dynamic config change : HA_LOCAL_STATUS : [Shutdown -> Searching].
            (08 13:34:33.571)<I,CStore,NWDBTrigger Thread,00000000-0000-0000-0000-000000000000> (NWIniConfig) Dynamic config change : HA_CFG_NODE_STATE : [HBM_BACKUP_SEARCHING -> HBM_BACKUP_ACTIVE].
            (08 13:34:33.572)<I,CStore,NWDBTrigger Thread,00000000-0000-0000-0000-000000000000> (NWIniConfig) Dynamic config change : HA_LOCAL_STATUS : [Searching -> Active].
            (08 13:34:33.580)<I,CStore,NWDBTrigger Thread,00000000-0000-0000-0000-000000000000> (NWIniConfig) Dynamic config change : HA_CFG_SERVER_ACTIVE : [0 -> 1].
            (08 13:34:33.589)<I,CStore,NWLoResTimer,00000000-0000-0000-0000-000000000000> NWOmManager::Audit: Delete om array list: 66 -> 70
            (08 13:34:33.589)<I,CStore,NWLoResTimer,00000000-0000-0000-0000-000000000000> NWOmManager::Audit: Create om array list: 70
            (08 13:34:34.433)<I,CStore,MirrorClient Thread,00000000-0000-0000-0000-000000000000> Raising Alarm: CS_MIRROR_API_DOWN
            (08 13:34:34.443)<I,CStore,MirrorClient Thread,00000000-0000-0000-0000-000000000000> NWEventLogger: Logging Event: Alarm Activated: Mirror Messaging Connection Unavailable (Id: 352)
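
For anyone wading through a similar dump, a short Python sketch can pull the state transitions and raised alarms out of log text in the format above. The regular expressions are derived only from the lines quoted in this post, so other CStore log lines may not match; the function name `summarize` is made up for illustration.

```python
import re

# "Dynamic config change : NAME : [old -> new]."
CHANGE_RE = re.compile(r"Dynamic config change : (\w+) : \[(.+?) -> (.+?)\]")
# "Raising Alarm: NAME"
ALARM_RE = re.compile(r"Raising Alarm: (\w+)")

SAMPLE = """\
(08 13:34:33.570)<I,CStore,NWDBTrigger Thread,00000000-0000-0000-0000-000000000000> (NWIniConfig) Dynamic config change : HA_CFG_NODE_STATE : [HBM_BACKUP_SHUTDOWN -> HBM_BACKUP_SEARCHING].
(08 13:34:33.580)<I,CStore,NWDBTrigger Thread,00000000-0000-0000-0000-000000000000> (NWIniConfig) Dynamic config change : HA_CFG_SERVER_ACTIVE : [0 -> 1].
(08 13:34:34.433)<I,CStore,MirrorClient Thread,00000000-0000-0000-0000-000000000000> Raising Alarm: CS_MIRROR_API_DOWN
"""

def summarize(log_text):
    """Return (changes, alarms) found in the log text.

    changes is a list of (setting, old_value, new_value) tuples;
    alarms is a list of alarm names, both in order of appearance.
    """
    changes, alarms = [], []
    for line in log_text.splitlines():
        change = CHANGE_RE.search(line)
        if change:
            changes.append(change.groups())
            continue
        alarm = ALARM_RE.search(line)
        if alarm:
            alarms.append(alarm.group(1))
    return changes, alarms
```

Running `summarize` over the excerpt makes it easier to see that the node reaches an active state about a second before the mirror alarm is raised.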


            • stphnwd
              Brainiac
              • Jan 2011
              • 52

              #7
              Did you ensure that the replication account username and password are the same on both servers? Then reboot the backup.

              When all is good, both servers should show no alarms while both are up.
              Stephen Wood


              • stphnwd
                Brainiac
                • Jan 2011
                • 52

                #8
                Another thing to check is your network configuration in the Element Manager: you want to ensure that all of your services are running on the LAN adapter and not on the localhost loopback.
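
As a rough illustration of that check, the sketch below takes a hand-collected list of (address, port) listeners and returns the ones bound to a loopback address. It does not query the Element Manager or the operating system; the input list and the function name are assumptions for the example, and only port 4004 comes from this thread.

```python
import ipaddress

def loopback_only(listeners):
    """Return the (address, port) pairs that are bound to a loopback address."""
    return [
        (addr, port)
        for addr, port in listeners
        if ipaddress.ip_address(addr).is_loopback
    ]

# Hypothetical listener list, e.g. transcribed from the server's socket listing.
listeners = [("127.0.0.1", 4004), ("192.0.2.10", 5060), ("0.0.0.0", 80)]
flagged = loopback_only(listeners)
```

Here `flagged` contains only the port 4004 entry, which is the kind of binding a remote peer cannot reach.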
                Stephen Wood


                • sfitzg
                  Guru
                  • Jul 2010
                  • 190

                  #9
                  Thanks for your input. I resolved the problem in a few minutes once I was pointed to an April 2012 document (NN44400-802); I had been working from a March installation/commissioning guide.

                  The problem was that, while the two MAS were communicating with each other using FQDNs, the mirroring function requires hostname resolution, not FQDN. Adding the hostnames to both MAS fixed the problem.
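
A quick way to confirm this on each server is to try resolving both the short hostname and the FQDN of the peer. The sketch below is a generic resolver check, not an Avaya tool; the function name and the example names in the note after it are hypothetical. The resolver is a parameter so the check can be exercised without touching DNS or the hosts file.

```python
import socket

def unresolved(names, resolver=socket.gethostbyname):
    """Return the names the resolver cannot turn into an address."""
    failed = []
    for name in names:
        try:
            resolver(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            failed.append(name)
    return failed
```

For example, `unresolved(["mas2", "mas2.example.com"])` run on the primary (with placeholder names replaced by the real ones) should return an empty list once both forms resolve.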

                  Simon


                  • aapawan123
                    Hot Shot
                    • Jan 2011
                    • 18

                    #10
                    Hi Simon,

                    I am in the process of installing AMS HA as well. We ran into an issue where 'maspvicheck' (the PVI checker) fails and returns a hardware error. On reviewing the log, we found that the PVI checker could not obtain the MAC address of the bonded NIC (bond0). To prove that this was the cause, we reverted the system to its pre-bonding configuration and ran maspvicheck again; this time it succeeded. In your AMS HA, did you create NIC bonding? If so, did maspvicheck run successfully, without any negative results?

                    Thanks,

                    Alan


                    • sfitzg
                      Guru
                      • Jul 2010
                      • 190

                      #11
                      Hi Alan

                      Been a long time and different product. I don't have my laptop but one of the documents states that bonding is not supported.

                      Simon


                      • aapawan123
                        Hot Shot
                        • Jan 2011
                        • 18

                        #12
                        Hi Simon,

                        Good to hear from you. Yes, different product, and too much unclear and at times misleading information. Anyway, the Mission Critical High Availability document (http://origin-support.avaya.com/css/...ents/100160292) states on page 38 that NIC teaming is recommended for an HA environment. The version of this document is April 2012. Note that there are some differences between the instructions in the AMS Installation Guide and those in the Mission Critical HA document. When you get a chance, can you send me a link to the document that says NIC bonding is not supported? I would appreciate it.

                        Thanks,

                        Alan


                        • sfitzg
                          Guru
                          • Jul 2010
                          • 190

                          #13
                          Sorry Alan, I can't find it again. All I remember is that it said NIC bonding was not supported on Linux and I thought great, I don't have to work out how to do it. I will keep searching but will also see what sort of response you get here as I may have to update my configuration.

                          Simon


                          • stphnwd
                            Brainiac
                            • Jan 2011
                            • 52

                            #14
                            Page 59 of the SP5 Release Notes:
                            AMS Linux: AMS HA does not operate with Linux NIC bonding enabled
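
To check whether a Linux server currently has any bond interfaces, one option is to read the `bonding_masters` file that the kernel bonding driver exposes under `/sys/class/net`. The sketch below is a generic Linux check and has nothing to do with how maspvicheck works internally; the function name is made up, and the directory is a parameter only so the function can be tried against a test directory.

```python
from pathlib import Path

def bonded_interfaces(sysfs_net="/sys/class/net"):
    """Return the bond interface names listed in bonding_masters, if any."""
    masters = Path(sysfs_net) / "bonding_masters"
    try:
        return masters.read_text().split()
    except OSError:
        # The file is absent when the bonding driver is not loaded.
        return []
```

An empty list means no bond interfaces are defined; a result such as `["bond0"]` means bonding is configured.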
                            Stephen Wood


                            • aapawan123
                              Hot Shot
                              • Jan 2011
                              • 18

                              #15
                              Hi stphnwd and Simon, I certainly appreciate all of your responses. This helps greatly.

                              Regards,

                              Alan

