IC Dataserver encounters read/write errors. Email Server crashes.


Doc ID    SOLN274757
Version:    1.0
Status:    Published
Published date:    07 Sep 2015
Author:    Ashok Joshi
 

Details

The IC Dataserver goes down frequently, impacting all IC services.
The Dataserver fails to read from and write to the database, as can be seen in the logs. TCP connectivity to the database is nevertheless fine, as can be seen from the network response.

Problem Clarification

The Dataserver has problems reading from and writing to the SQL database.

@20150806 12:33:39.112 #2812   <Info> 

!DbConnectionPool::setDbState - DB Error is State:01000,Code:10054,Error:[DBNETLIB]ConnectionWrite (send()).
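
When scanning a large Dataserver log for this condition, it can help to extract the State, Code and Error fields from each matching line. The following is a minimal sketch in Python, not an Avaya tool. The regular expression is inferred from the single sample line above and may not cover every log variant, and the function name parse_db_error is hypothetical. It also assumes that the Code field carries the Winsock error number, which is the usual case for DBNETLIB ConnectionWrite errors (10054 is WSAECONNRESET, a connection reset by the peer).

```python
import re

# Pattern inferred from the one sample line in this article; other
# Dataserver log variants may need a different expression.
DB_ERROR_RE = re.compile(
    r"DbConnectionPool::setDbState - DB Error is "
    r"State:(?P<state>\w+),Code:(?P<code>\d+),Error:(?P<error>.*)"
)

# Standard Winsock error numbers that indicate a dropped TCP session.
# Assumption: the Code field in the log is a Winsock error number.
WINSOCK_CODES = {
    10053: "WSAECONNABORTED (connection aborted by local software)",
    10054: "WSAECONNRESET (connection reset by peer)",
    10060: "WSAETIMEDOUT (connection timed out)",
}


def parse_db_error(line):
    """Return the fields of a setDbState error line, or None if no match."""
    match = DB_ERROR_RE.search(line)
    if match is None:
        return None
    code = int(match.group("code"))
    return {
        "state": match.group("state"),
        "code": code,
        "error": match.group("error").strip(),
        "hint": WINSOCK_CODES.get(code, "unknown"),
    }


sample = (
    "!DbConnectionPool::setDbState - DB Error is "
    "State:01000,Code:10054,Error:[DBNETLIB]ConnectionWrite (send())."
)
result = parse_db_error(sample)
print(result["code"], result["hint"])
```

A connection reset reported on an established session is consistent with the observation that basic TCP connectivity to the database still looks fine when tested separately.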

 

Email Server crashes.

20150817 12:41:31.444 #2492   <Tools> [.\tools.cpp@96]

ICEmail server [ICEmail_SYD_Email2] Assign with WACD failed, will retry Assign with Functional WACD server.

@20150817 12:41:31.444 #2492   <FatalEXCEPTION> [.\mttoolkit.cpp@8585]

thread=2492

!Server is being forced to exit due to an exception within onEvent

@20150817 12:41:31.475 #2492   <Report QWPROXY-073> [.\errors.cpp@1056] 

               t@2492:               "Attempting Deassign: DataServerMSSQL.Deassign call on session 8c36d30 (Default Connection session: 392f078)" 

Cause

NTFS Error. FileSystem Problems.

The problem was found to coincide with the time at which a VM snapshot was taken. The VM snapshot causes errors on the server's NTFS volume, which in turn cause I/O errors.

Windows events logged on the affected server when these errors occurred:

Error 1: The default transaction resource manager on volume \\?\Volume{ae3a0222-3984-11e5-98de-00505692327c} encountered a non-retryable error and could not start.  The data contains the error code. 

Error 2 :  Volume Shadow Copy Service error: Unexpected error DeviceIoControl(\\?\fdc#generic_floppy_drive#6&2bc13940&0&0#{53f5630d-b6bf-11d0-94f2-00a0c91efb8b} - 0000000000000498,0x00560000,0000000000000000,0,00000000003C9710,4096,[0]).  hr = 0x80070001, Incorrect function.

 

Error 1: This is a file system error that occurs while a snapshot of the server is being taken. The file system error in turn causes I/O problems for AIC (or any other application).

Error 2: This is also a VMware-generated error caused by the snapshot.
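
One way to check for the same correlation on another system is to compare the timestamps of the server log errors with the times at which VM snapshots were taken. The sketch below, in Python, is illustrative only. The timestamp format is taken from the log excerpts in this article, while the snapshot times, the 15-minute window and the function names are hypothetical values chosen for the example, not data from this case.

```python
from datetime import datetime, timedelta

# Timestamp layout used in the log excerpts, e.g. "20150806 12:33:39.112".
LOG_TS_FORMAT = "%Y%m%d %H:%M:%S.%f"


def parse_log_ts(text):
    """Parse a log timestamp, with or without the leading '@'."""
    return datetime.strptime(text.lstrip("@").strip(), LOG_TS_FORMAT)


def errors_near_snapshots(error_times, snapshot_times,
                          window=timedelta(minutes=15)):
    """Return (error, snapshot) pairs that lie within `window` of each other."""
    hits = []
    for err in error_times:
        for snap in snapshot_times:
            if abs(err - snap) <= window:
                hits.append((err, snap))
    return hits


# Error timestamps copied from the log excerpts in this article.
errors = [
    parse_log_ts("@20150806 12:33:39.112"),
    parse_log_ts("20150817 12:41:31.444"),
]

# Hypothetical snapshot times, for illustration only; in practice these
# would come from the virtualization platform's task or event history.
snapshots = [
    datetime(2015, 8, 6, 12, 30),
    datetime(2015, 8, 10, 2, 0),
]

hits = errors_near_snapshots(errors, snapshots)
for err, snap in hits:
    print("error at", err, "is close to snapshot at", snap)
```

Errors that repeatedly fall inside the window around a snapshot point to the snapshot as a likely trigger; errors that do not match suggest looking for another cause.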

Solution

Recommend that the customer not take VM snapshots at peak hours, when volume is high and I/O-intensive operations take place.

VMware has addressed the above issues in patches. Recommend that the customer apply those patches.

The Email Server crash in this scenario was triggered by an exception. This will be addressed in Email Server version 7.3.4 so that the Email Server does not crash in this scenario and only raises alarms.


Avaya -- Proprietary. Use pursuant to the terms of your signed agreement or Avaya policy