CM 6.3.16: RCA for AES upgrade from 6.3.1 to 7.1.3 that resulted in TSAPI processes restarting repeatedly


Doc ID    SOLN329254
Version:    3.0
Status:    Published
Published date:    26 Aug 2019
Created Date:    07 Sep 2018
Author:   
Arif Neemuchwala
 

Details

At the time of the TSAPI resets, a packet capture taken on the CM server of traffic to the AES server showed TCP zero-window messages being sent from CM to the AES, as seen here:
[root@oakpuhpbxcm-01 ~]# tshark -i eth0 host 10.xxx.xx.xx and port 8765 -t ad -n -d tcp.port==8765,ssl
Running as user "root" and group "root". This could be dangerous.
Capturing on eth0
2018-08-31 04:27:20.631662 10.xxx.xx.xxx -> 10.xxx.xx.xxx TLSv1 [TCP ZeroWindow]
The symptom was continuous TSAPI resets on the CTI links. The process failures indicate that the Communication Manager GIP process was not reading inbound messages out of its TCP receive buffer, so the buffer filled and CM sent a zero window to the AES. The GIP application did not process inbound TCP messages again until a reset system 4 was performed on the CM.
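As a rough illustration of how saved capture output like the line quoted above could be screened for zero-window events, the following Python sketch scans tshark text output. The function name, the regular expression, and the assumed column layout (date, time, source, `->`, destination, protocol, info) are illustrative assumptions based only on the single capture line in this article; other tshark versions or options may print different columns.

```python
import re

# Assumed layout, taken from the one capture line quoted in this article:
# <date> <time> <src> -> <dst> <protocol> <info...>
LINE_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2})\s+(?P<time>[\d:.]+)\s+"
    r"(?P<src>\S+)\s+->\s+(?P<dst>\S+)\s+(?P<proto>\S+)\s+(?P<info>.*)$"
)

def find_zero_window(lines):
    """Return parsed records for lines whose info column mentions a TCP zero window."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and "[TCP ZeroWindow]" in match.group("info"):
            hits.append(match.groupdict())
    return hits

# Sample input copied from the capture output in this article.
sample = [
    "Capturing on eth0",
    "2018-08-31 04:27:20.631662 10.xxx.xx.xxx -> 10.xxx.xx.xxx TLSv1 [TCP ZeroWindow]",
]
for record in find_zero_window(sample):
    print(record["date"], record["time"], record["src"], "->", record["dst"], record["proto"])
```

A sustained run of such records from the CM address toward the AES would be consistent with the receiving application not draining its socket, which is the behavior described here for the GIP process.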
A review of the CM logs showed a procedure error that was, at the time, considered a definitive indication of the software error that caused the CM GIP process to fail to read from the TCP stack.
After a review of the CM code, we can see that the specific procedure error
20180831:042741508:324287511:gip(29543):MED:[ IP_PROC_ERR: pro=7204, err=203, seq=24085, da1=3(0x3), da2=10.x.x.x]
is described in the code as follows:
* Since the client is attempting to connect again,
* it is assumed that the current connection is stale.
* Send an error notification/new connection made error.
* to the old connection and close it (the close will
* clean up the Esai_info[] entry). Then continue
* (which will replace it with the new one).
At the time, no corresponding resets, restarts, or interchanges were seen within Communication Manager.
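For illustration only, the procedure error line quoted above can be split into its parts so that entries with the same `pro`/`err` pair are easy to search for in a log export. This is a minimal sketch: the function name is hypothetical, and the group names `date`, `time`, `counter`, and `pid` are guesses at what the colon-separated prefix holds, since the article does not define them.

```python
import re

# Pattern inferred from the single log line quoted in this article.
# The names given to the prefix fields are assumptions, not documented meanings.
PROC_ERR_RE = re.compile(
    r"^(?P<date>\d{8}):(?P<time>\d{9}):(?P<counter>\d+):"
    r"(?P<process>\w+)\((?P<pid>\d+)\):(?P<level>\w+):"
    r"\[\s*(?P<tag>\w+):\s*(?P<body>.*)\]$"
)

def parse_proc_err(line):
    """Split an IP_PROC_ERR-style line into its prefix and key=value fields."""
    match = PROC_ERR_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    fields = {}
    for item in record.pop("body").split(","):
        key, _, value = item.strip().partition("=")
        fields[key] = value
    record["fields"] = fields
    return record

line = ("20180831:042741508:324287511:gip(29543):MED:"
        "[ IP_PROC_ERR: pro=7204, err=203, seq=24085, da1=3(0x3), da2=10.x.x.x]")
parsed = parse_proc_err(line)
print(parsed["process"], parsed["tag"], parsed["fields"]["pro"], parsed["fields"]["err"])
```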

Problem Clarification

Repeated CTI link resets.

Cause

The possible cause was identified in consultation with AES engineers, who advised that the TCP signaling requires the TLS version to be v1.2. The packet captures taken at the time of the failures show the traffic from the 6.3 CM using TLSv1:
2018-08-31 04:27:20.631662 10.200.46.xxx -> 10.200.13.xxx TLSv1 [TCP ZeroWindow]
It has been determined that the CM's GIP process failed because its TCP buffer had filled up, apparently due to a mismatched TLS version (6.3 CM at TLSv1 and AES 7.1 at TLSv1.2) during the AES upgrade. The only way to restart the CM GIP process and clear out the TCP stack (of all the failed zero-window messages) was to perform a reset system 4.
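The version comparison described above can be sketched as a simple check of the protocol label printed in a capture against the required minimum. This is a minimal sketch under stated assumptions: the list of labels and the function name are illustrative, and the check only compares labels as printed by tshark; it does not inspect the TLS handshake itself.

```python
# Protocol labels in ascending order, as they might appear in a capture's
# protocol column (assumed set, for illustration only).
TLS_ORDER = ["SSLv3", "TLSv1", "TLSv1.1", "TLSv1.2", "TLSv1.3"]

def below_minimum(label, minimum="TLSv1.2"):
    """Return True if the observed label is an older version than the minimum."""
    return TLS_ORDER.index(label) < TLS_ORDER.index(minimum)

# TLSv1 is the label seen on the traffic from the 6.3 CM in the capture above.
print(below_minimum("TLSv1"))    # True
print(below_minimum("TLSv1.2"))  # False
```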

Solution

Reset system 4.

Additional Relevant Phrases

TSAPI restarts, cti link resets, CM 6.3.16, AES 7.1.3, TLSv1 [TCP ZeroWindow]

Avaya -- Proprietary. Use pursuant to the terms of your signed agreement or Avaya policy