CMS: CMS link with CM drop


Doc ID    SOLN292199
Version:    14.0
Status:    Published
Published date:    21 Dec 2019
Created Date:    30 Jun 2016
Author:    Ming Jiang
 

Details

Any CMS and CM Release.

Problem Clarification

Mon Jun 20 10:25:52 [51] PBX  WARNING missing required SETUP message

Mon Jun 20 10:25:53 [52] PBX  WARNING missing required SETUP message

Mon Jun 20 10:26:00 [59] PBX  DATAX     10:26:00 06/20/16   01549 calls  ======

Mon Jun 20 10:26:44 [40] SESS ST8camack timer expired, link is not responding...

Mon Jun 20 10:26:54 [50] SESS ST8camack timer expired, link is not responding...

Mon Jun 20 10:26:54 [50] TCP  CLIENT DISCONNECTED

Mon Jun 20 10:26:54 [50] TCP  client connect failed, errno= 146

Mon Jun 20 10:26:54 [50] TCP  OUT OF ORDER

Mon Jun 20 10:26:56 [50] TCP  client connect failed, errno= 146

Mon Jun 20 10:26:58 [50] TCP  client connect failed, errno= 146

Mon Jun 20 10:27:00 [50] TCP  client connect failed, errno= 146

Mon Jun 20 10:27:02 [50] TCP  client connect failed, errno= 146

Mon Jun 20 10:27:04 [50] TCP  client connect failed, errno= 146

Mon Jun 20 10:27:06 [50] TCP  client connect failed, errno= 146

Mon Jun 20 10:27:08 [50] TCP  client connect failed, errno= 146

Mon Jun 20 10:27:10 [50] TCP  client connect failed, errno= 146

Mon Jun 20 10:27:12 [50] TCP  OPERATIONAL

Mon Jun 20 10:27:13 [51] SESS got a message, link back to normal

Mon Jun 20 10:27:13 [52] SESS bad sequence, 147 received, 69 expected

Mon Jun 20 10:27:13 [52] TCP  CLIENT DISCONNECTED

Mon Jun 20 10:27:13 [52] TCP  client connect failed, errno= 146

Mon Jun 20 10:27:13 [52] TCP  OUT OF ORDER

Mon Jun 20 10:27:13 [52] TCP  client connect failed, errno= 146

 

Cause

01400 Mon Jun 20 10:26:54 2016 SRC_ERR_NUM=00000 PROCESS=spi PID=001686 timertask.c:00400 SEVERITY=INFO ACD=01 DUPS=00004 SPI session error: SESS connectivity timeout: warning LINK NOT RESPONDING

01350 Mon Jun 20 10:26:55 2016 SRC_ERR_NUM=00001 PROCESS=chipr PID=025225 main.c:00351 SEVERITY=ERROR ACD=01 DUPS=00001 GENERAL error internal to process: chip exited with error : LOGIN=M232413

01400 Mon Jun 20 10:27:13 2016 SRC_ERR_NUM=00000 PROCESS=spi PID=001686 x25.c:00202 SEVERITY=INFO ACD=01 DUPS=00005 SPI session error: Got a message Link back to normal

01400 Mon Jun 20 10:27:13 2016 SRC_ERR_NUM=-00002 PROCESS=spi PID=001686 msgtask.c:00742 SEVERITY=ERROR ACD=01 DUPS=00006 SPI session error: SESS bad data message sequence number

01401 Mon Jun 20 10:27:13 2016 SRC_ERR_NUM=00146 PROCESS=spi PID=001686 tkdata.c:00472 SEVERITY=INFO ACD=01 DUPS=00001 SPI pbx error: 250 calls ignored

01400 Mon Jun 20 10:27:13 2016 SRC_ERR_NUM=00000 PROCESS=spi PID=001686 sessmsg.c:00082 SEVERITY=ERROR ACD=01 DUPS=00007 SPI session error: data collection session is down

01350 Mon Jun 20 10:27:13 2016 SRC_ERR_NUM=00001 PROCESS=link_watch PID=001627 AcdStatus.c:00157 SEVERITY=INFO ACD=01 DUPS=00001 GENERAL error internal to process: Link is down: acd  1

The CMS spi.err log and the elog show that CMS kept trying to reach CM but received no response from it. The entries "ST8camack timer expired, link is not responding...", "client connect failed, errno= 146", and "SESS connectivity timeout: warning LINK NOT RESPONDING" all point to the CM side or the network as the likely source of the issue.
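When spi.err is long, it can help to count how often each of the indicators quoted above appears and when it first occurred. The following Python is a minimal sketch only, not an Avaya tool: the function name `summarize`, the indicator list, and the assumed line layout (timestamp, bracketed number, layer name, message text) are illustrative and taken from the excerpts in this article.

```python
import re
from collections import Counter

# Substrings quoted in this article as signs of a CM-side or network problem.
# The list is an assumption for illustration; extend it as needed.
INDICATORS = (
    "timer expired, link is not responding",
    "client connect failed",
    "CLIENT DISCONNECTED",
    "OUT OF ORDER",
    "bad sequence",
    "cannot get address information",
)

# Assumed layout, based on the excerpts above:
#   Mon Jun 20 10:26:54 [50] TCP  client connect failed, errno= 146
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3} \w{3} [ \d]\d \d{2}:\d{2}:\d{2}) \[(?P<seq>\d+)\] "
    r"(?P<layer>\S+)\s+(?P<text>.*)$"
)


def summarize(lines):
    """Count indicator hits and record the timestamp of each first hit."""
    counts = Counter()
    first_seen = {}
    for raw in lines:
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # blank line or a layout this sketch does not know
        for indicator in INDICATORS:
            if indicator in match["text"]:
                counts[indicator] += 1
                first_seen.setdefault(indicator, match["stamp"])
    return counts, first_seen


if __name__ == "__main__":
    import sys

    # Usage: pass the path to a copy of spi.err as the only argument.
    if len(sys.argv) == 2:
        with open(sys.argv[1], errors="replace") as handle:
            counts, first_seen = summarize(handle)
        for indicator, hits in counts.most_common():
            print(f"{hits:5d}  first at {first_seen[indicator]}  {indicator}")
```

A high count of "client connect failed" with few or no "OPERATIONAL" lines in between suggests CMS never re-established the TCP session during that window.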

 

In some instances, when there is a CM-to-CMS routing issue, ping, telnet, and SSH between the two systems may all fail. A traceroute from either direction may show traffic stopping at a network switch, which indicates a possible switch issue or, again, a routing problem.
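A quick way to confirm basic reachability from another host on the same network segment is to attempt a plain TCP connection. The sketch below is an illustration only: `tcp_reachable` and the script name `check_link.py` are hypothetical, and the host and port must be supplied by you, since this article does not state which port the CM link uses.

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return (True, None) if a TCP connection opens, else (False, reason)."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True, None
    except OSError as exc:
        # Covers refused connections, timeouts, unreachable routes,
        # and name-resolution failures.
        return False, f"{type(exc).__name__}: {exc}"


if __name__ == "__main__":
    import sys

    # HOST and PORT are placeholders supplied by the operator.
    if len(sys.argv) == 3:
        ok, reason = tcp_reachable(sys.argv[1], int(sys.argv[2]))
        print("reachable" if ok else f"unreachable ({reason})")
    else:
        print("usage: check_link.py HOST PORT")
```

A refused connection usually means the host answered but nothing is listening on that port, whereas a timeout is more consistent with traffic being dropped along the path. Either result is only a hint and should be confirmed with the network team.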

Solution

Ask a CM engineer to check the CM error logs.

Also check with the network team for network errors.

Sometimes you will see this:

Mon Aug  8 04:26:44 [41] TCP  cannot get address information, errno=11

Mon Aug  8 04:26:44 [41] TCP  OUT OF ORDER

 

Mon Aug  8 04:26:45 [41] TCP  cannot get address information, errno=4

This usually means that the network has a problem and CMS cannot resolve the IP address of the far end. It can also be the result of a network change that caused the link to drop.
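To check name resolution independently of CMS, the far-end host name can be looked up with the system resolver. This is a minimal sketch under stated assumptions: `resolve` is a hypothetical helper, and the name you pass in should be whatever host name is configured for the CM link in your environment.

```python
import socket


def resolve(host):
    """Return (sorted unique addresses, None) or ([], error description)."""
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror as exc:
        # Raised when the resolver cannot map the name to an address.
        return [], f"getaddrinfo failed: {exc}"
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos}), None


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        addresses, error = resolve(name)
        print(f"{name}: {', '.join(addresses) if addresses else error}")
```

If the lookup fails here as well, the problem is more likely in DNS or the hosts file than in CMS itself.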

If this continues to occur, ensure that the speed and duplex settings on both the NIC and the other end of the connection (whether a network port or a direct CM connection) are set to auto-negotiate.

 

On an HA system, you may also wish to check the logs on the secondary system.

Additional Relevant Phrases

CM and CMS link dropped. CMS Data link failure. CMS Data Collection down.

Avaya -- Proprietary. Use pursuant to the terms of your signed agreement or Avaya policy