IQ: How to troubleshoot time issues in IQ


Doc ID    PRCS100944
Version:    3.0
Status:    Published
Published date:    23 Aug 2018
Created Date:    20 Sep 2017
Author:    Trusfy Pei
 

Abstract

Troubleshooting time issues

Generally, trouble with time shows up as link trouble: either the link with the CM has not come up, or it will not stay up. Critical time difference alarms may also appear in the IQ Alarm Manager viewer for this CM. The steps outlined below may help in diagnosing and fixing the problem. If you change the time on a host where IQ is installed and running, you must reboot the system afterward so that IQ (and Oracle, if it runs on the same host) returns to a consistent state.

Body

1. Is the link up, or does it start to come up?

cd /var/log/Avaya/CCR/DP* # for the correct container

tail hex_dump_all.log

There should be some messages in this file if the link ever came up. If new messages appear here now and then, note what they are. If you see a sequence of RTCS, XSTAT, TIME messages but nothing more, it is possible that the time is at fault. (There can also be other reasons for this condition.) If there are lots of other messages, then the link has come up much farther, and time may be your problem. In hex_dump_all.log you can grep for 'TIME20'; the latest TIME20 message should give you the CM time. The first time in the TIME20 message is the IQ host time and the second is the CM time. The message also shows the exact time/offset difference.
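The TIME20 lookup described above can be sketched as a small shell function. The function name latest_time20 is an assumption made for this illustration; it only wraps grep and tail and does not interpret the message contents.

```shell
# Print the most recent TIME20 entry from a hex dump log.
# latest_time20 is a hypothetical helper name, not part of IQ.
# Usage: latest_time20 /path/to/hex_dump_all.log
latest_time20() {
    # grep keeps the file order, so the last match is the newest entry
    grep 'TIME20' "$1" | tail -n 1
}
```

For example, run latest_time20 hex_dump_all.log from the DP container log directory. No output means that no TIME20 message has been logged in that file.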

2. Look at the DP_avaya.debug.log file. If somewhere after Pump Up you find a line mentioning “TIME_DIFFERENCE_CRITICAL”, then you have a time problem.

2007-11-02 10:10:02,488 com.avaya.ccr.eventmgt.cmadapter.applayer.apptasks.TIME20Task TIME20Task$AlarmCategory [cm02PresAndAppLayer] 553 TIME20Task.java INFO Alarm_Critical:TIME_DIFFERENCE_CRITICAL (cm02)
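As a rough aid for this step, the sketch below lists the CM names that appear in TIME_DIFFERENCE_CRITICAL alarm lines. It assumes that the CM name is always in parentheses right after the alarm text, as in the sample line above; the function name cms_with_time_alarm is hypothetical.

```shell
# List each CM named in a TIME_DIFFERENCE_CRITICAL alarm line, once.
# Assumes the "... TIME_DIFFERENCE_CRITICAL (cmname)" layout of the sample line.
# Usage: cms_with_time_alarm DP_avaya.debug.log [more log files...]
cms_with_time_alarm() {
    # -n with the p flag prints only lines where the substitution matched
    sed -n 's/.*TIME_DIFFERENCE_CRITICAL (\([^)]*\)).*/\1/p' "$@" | sort -u
}
```

For the sample line above this prints cm02. Empty output means that no such alarm was found in the files given.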

3. Turn on more debug logging for the DP container. Find the right place to change the logging properties (shown here as ccr2), and change the lines as shown. The easiest way to find the location of the relevant log4j.xml is to run:

ps -ef | grep <DP UUID>

This prints a long string. Look at the last line of the output and you will see ccr<n> (it will be ccr1, ccr2, or ccr3). Go to that location (below, replace <n> with the number):

cd /opt/coreservices/jboss-4.0.3SP1/server/ccr<n>/conf

vi log4j.xml

Search for the string 'eventmgt'. You will find lines such as those below. Change the priority value to DEBUG or ALL and save the file. If the lines are commented out, un-comment them.

<category name="com.avaya.ccr.eventmgt.cmadapter.applayer.apptasks">
  <priority value="INFO"/>
</category>

IMPORTANT NOTE: Never leave the DP logs in debug mode. Once you are done, set the priority value back to INFO. COMMENTING OUT THE LINES WILL NOT WORK.
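If you prefer not to edit the file by hand, the priority change can be sketched with sed as below. This is an illustration only: the function name set_apptasks_priority is hypothetical, it assumes the category block is not commented out, and it writes the result to standard output so that the original log4j.xml is left untouched until you have reviewed the result.

```shell
# Print a copy of a log4j.xml with the apptasks category priority changed.
# set_apptasks_priority is a hypothetical helper; it does not modify the file.
# Usage: set_apptasks_priority DEBUG log4j.xml > /tmp/log4j.xml.new
set_apptasks_priority() {
    level=$1
    file=$2
    # Limit the substitution to the lines between the category's opening
    # tag and its closing tag, so other categories keep their priority.
    sed '/<category name="com.avaya.ccr.eventmgt.cmadapter.applayer.apptasks">/,/<\/category>/ s/<priority value= *"[A-Za-z]*"/<priority value="'"$level"'"/' "$file"
}
```

Back up log4j.xml before copying the new version into place, and run the same function with INFO once the investigation is finished.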

Wait for about 2 minutes. Then look at the most recent line of output in the current DP log file matched by:

grep "Switch Time SnapShot" DP*.log

which will look like:

2007-11-02 10:15:02,551 com.avaya.ccr.eventmgt.cmadapter.applayer.apptasks.TIME20Task [cm02PresAndAppLayer] 482 TIME20Task.java DEBUG Switch Time= 1194019980000, CCR Time= 1194020102551, CCR Time Zone= Mountain Standard Time, Switch Time SnapShot: Switch:MilliSeconds Since 1970= 1194019980000, Switch:Year= 7, Switch:yday= 306, Switch:hour= 10, Switch:minutes= 13, Switch:tzoffset= 0, Switch:tzoffsetHours= 6, Switch:tzoffsetMinutes= 0, Switch:daylightSavings= 1

This line tells you several things. The raw UTC times of both the IQ and the CM are shown in the milliseconds-since-1970 format used internally by the software. These fields are “Switch Time=” and “CCR Time=”. The difference between the two values is the time difference between the two boxes in milliseconds. It should be within a few seconds (a few thousand milliseconds). The example above shows a difference of 122551 milliseconds, or about 122 seconds, which is just over 2 minutes.
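The subtraction can be sketched in shell as follows. The function name snapshot_diff_ms is hypothetical, and the pattern assumes the "Switch Time= ..., CCR Time= ...," layout of the sample line above.

```shell
# Read "Switch Time SnapShot" log lines on stdin and print, for each one,
# CCR time minus switch time in milliseconds.
# snapshot_diff_ms is a hypothetical helper name.
snapshot_diff_ms() {
    # Reduce each matching line to "<switch_ms> <ccr_ms>", then subtract
    sed -n 's/.*Switch Time= *\([0-9][0-9]*\), CCR Time= *\([0-9][0-9]*\),.*/\1 \2/p' |
        while read -r switch_ms ccr_ms; do
            echo $((ccr_ms - switch_ms))
        done
}
```

For example, grep "Switch Time SnapShot" DP*.log | tail -n 1 | snapshot_diff_ms. For the sample line above this prints 122551. A negative value means that the IQ clock is behind the CM clock.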

The switch time is the CM time, and the CCR time is the IQ time. You can convert the Unix time to UTC with a tool such as the one at http://www.onlineconversion.com/unix_time.htm to see the difference in an easily readable format.
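The conversion can also be done on the IQ host itself. The sketch below assumes GNU date (the -d @seconds form is not available in every date implementation), and the function name ms_to_utc is hypothetical.

```shell
# Convert a milliseconds-since-1970 value to a readable UTC timestamp.
# Assumes GNU date; ms_to_utc is a hypothetical helper name.
# Usage: ms_to_utc 1194019980000
ms_to_utc() {
    # date expects whole seconds, so drop the millisecond part
    date -u -d "@$(( $1 / 1000 ))" '+%Y-%m-%d %H:%M:%S UTC'
}
```

For the switch time in the sample line, ms_to_utc 1194019980000 gives 2007-11-02 16:13:00 UTC, which agrees with Switch:hour= 10 and Switch:minutes= 13 once the 6 hour offset is added.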

4. If the difference found above is around 1 hour (3600000 milliseconds), then you likely have a daylight saving time problem, especially if the current date should be subject to daylight saving time. Fix the time and time zone on the IQ host, and the time, time zone, and daylight saving time rules on the CM/DADS.
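The rules of thumb from steps 3 and 4 can be sketched as a simple classifier. The thresholds used here (5000 ms for "a few seconds" and one minute either side of an hour) are assumptions chosen for illustration, not values taken from IQ, and the function name classify_diff_ms is hypothetical.

```shell
# Classify a time difference given in milliseconds.
# Thresholds are illustrative assumptions, not IQ product limits.
classify_diff_ms() {
    diff=$1
    # Work with the absolute value so the sign of the difference is ignored
    [ "$diff" -lt 0 ] && diff=$(( -diff ))
    if [ "$diff" -le 5000 ]; then
        echo "ok"
    elif [ "$diff" -ge 3540000 ] && [ "$diff" -le 3660000 ]; then
        echo "possible-dst"
    else
        echo "out-of-sync"
    fi
}
```

With these thresholds, the 122551 millisecond difference from the example is reported as out-of-sync, and a difference of 3600000 milliseconds is reported as possible-dst.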

How to verify if NTP sync is working

1. Ensure the IQ host is in sync with an NTP server by running the following commands:

  • ntpstat;echo $?

The echo $? part prints the exit status of ntpstat; a value of 0 indicates that the clock is synchronised. Note that the IP address shown in the ntpstat output should not be the IP address of the IQ host itself (nor a localhost equivalent beginning with 127).

2. If NTP has not been configured, the file /etc/ntp.conf.avaya should be present from the original installation of the IQ software. Back up /etc/ntp.conf and replace it with ntp.conf.avaya:

  • cp /etc/ntp.conf /etc/ntp.conf.backup
  • cp /etc/ntp.conf.avaya /etc/ntp.conf

3. Edit the last two lines of /etc/ntp.conf, replacing the default IP address with the address of the user's NTP server (the user will likely need to provide this).

4. Restart the NTP daemon to reload the configuration file:

  • service ntpd restart

5. If the IQ host is far out of sync with the NTP server, NTP brings the client back in sync very slowly. To quickly set the IQ host time to the correct time, use the command ntpdate:

  • service ntpd stop
  • ntpdate <NTP server address>
  • service ntpd start

These steps may take up to 30 minutes to sync.


6. Similar steps may need to be performed on the switch if the switch is not synced with the NTP server.
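The check in step 1 can be sketched with two small helpers. The exit status meanings follow the ntpstat manual page (0 synchronised, 1 not synchronised, 2 state indeterminate, for example when ntpd cannot be contacted). The function names are hypothetical.

```shell
# Translate an ntpstat exit status into a short description.
# describe_ntpstat_rc is a hypothetical helper name.
describe_ntpstat_rc() {
    case "$1" in
        0) echo "synchronised" ;;
        1) echo "not synchronised" ;;
        2) echo "indeterminate (is ntpd running?)" ;;
        *) echo "unexpected exit status $1" ;;
    esac
}

# Succeed when the given address is a localhost equivalent (starts with 127.),
# which is not a valid time source for the IQ host.
is_loopback_ip() {
    case "$1" in
        127.*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, ntpstat; describe_ntpstat_rc $? prints the ntpstat output followed by the description of its result. is_loopback_ip can be applied to the server address shown in that output.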


 

Link failures in IQ when the DP/DC buffer is full

IQ Data Collection can buffer call data up to a defined limit. If IQ data processing (DP) is down while data collection is still up, IQ stores the data received from the CM in buffer files until DP comes back or the buffer is full. When the buffer is full, IQ drops the ACD link because it has no space left to store the data. This is a probable reason for link failure.

The buffer file location is /opt/Avaya/CCR/data/buffer

When the buffer is full, the root cause has to be identified and fixed as soon as possible. The buffer files can be moved to another location to free up the buffer so that more data can be captured. However, this is not recommended and should preferably be done only after consultation with CPE.
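To see how much data is currently buffered, something like the sketch below can be used. The function name buffer_summary is hypothetical. It only reports the number of files and the disk usage of the directory; it does not know the configured buffer limit.

```shell
# Report the number of files and the disk usage (in KB) of the buffer directory.
# buffer_summary is a hypothetical helper; the default path is the buffer
# location given in this article.
buffer_summary() {
    dir=${1:-/opt/Avaya/CCR/data/buffer}
    files=$(find "$dir" -type f | wc -l | tr -d ' ')
    # du -sk prints "<kilobytes><TAB><path>"; keep only the number
    kb=$(du -sk "$dir" | cut -f1)
    echo "files=$files kb=$kb"
}
```

Running buffer_summary a few minutes apart shows whether the buffer is still growing.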

Link failure when the database is down

As described in the previous section, if the database goes down, IQ starts buffering data and continues to do so until the database is back or the buffer is full, whichever occurs first.

Additional Relevant Phrases

IQ time greater then 300 seconds, ntpd dead but pid file exists, TIME20

Avaya -- Proprietary. Use pursuant to the terms of your signed agreement or Avaya policy