WEA Level 2 ESS Troubleshooting Guide

Turn on ESS Logging
Specific Errors In Logs
Obtaining a Java Dump of ESS
Using Custom Templates


Domino Server showing as stopped
Please perform these initial checks for me:

1. The Notes Client on the WEA server is at least 6.0.3.
2. nslookup for rhnotes03.rh.dk from the WEA server returns only one IP address.
3. A ping of the hostname from the WEA server finds the Domino server.
4. Confirm that the Domino server shows as started if the IP address is used.
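
Check 2 above can also be scripted. The following is only a minimal sketch in Python, not part of WEA: the function names are made up for illustration, and socket.getaddrinfo may report IPv6 addresses that nslookup would list separately, so treat the result as a hint rather than a replacement for nslookup.

```python
import socket


def distinct_ips(addrinfo):
    """Return the sorted distinct IP addresses found in getaddrinfo() results."""
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({entry[4][0] for entry in addrinfo})


def check_single_ip(hostname):
    """Resolve hostname and report whether it maps to exactly one address."""
    ips = distinct_ips(socket.getaddrinfo(hostname, None))
    return len(ips) == 1, ips
```

Run on the WEA server, check_single_ip("rhnotes03.rh.dk") returns a (True/False, address list) pair; False means the name resolves to more than one address.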

another note

In addition to ESS high tracing, have them add the following line in ESSConfig.xml to the com.ibm.caf.exchange.mapi.BackEndManager component.
<PROPERTY NAME="MAPIDebug" VALUE="1"/>

We will need the trace log and esserrors.log for the whole scenario, from the time the contact is first sync'd to the point they see the duplicate contact.

It would be helpful to know what extraneous actions were done as well. For instance, did they have Outlook running on the WEA server or on their desktop while they were sync'ing? Did they create the contact and then go back and add the birthday, or was the birthday added at creation time? What did they look at or open between the time of creation and the sync, and during the sync?



Tim,
      This is the SQL0803 problem: you cannot have 2 different portal IDs pointing to the same backend mail db.

From the release notes...

Not all records are synchronizing to the device
If an end user fails to receive some or all PIM data during initial synchronization, check the <esshome>\logs\ESSErrors.log file for the following error:
--- SQL Exception ---
Message : [IBM][CLI Driver][DB2/NT] SQL0803N One or more values in the INSERT statement, UPDATE statement, or foreign key update caused by a DELETE statement are not valid because the primary key, unique constraint or unique index identified by "3" constrains table "WPSADMIN.TS_GUD_TIMESTAMP" from having duplicate rows for those columns. SQLSTATE=23505
SQLState : 23505
ErrorCode : -803
This error means that there are two portal users trying to synchronize with the same backend userid, which is not supported. This query shows which userids are configured to sync to the same backend user:
select distinct a.USERID, a.EMAIL_SRV_FLDR from CAF.PIM_USRPRF a where a.EMAIL_SRV_FLDR in (select b.EMAIL_SRV_FLDR from CAF.PIM_USRPRF b where b.USERID != a.USERID)
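
To see what this query returns, here is a small sketch that runs it against a throwaway in-memory SQLite database standing in for DB2. Everything except the query text is an assumption made for illustration: the two-column table layout, the helper name, and the sample userids and mail folders. The real CAF.PIM_USRPRF table has more columns.

```python
import sqlite3

# The query from the release notes, unchanged apart from line breaks.
QUERY = """
select distinct a.USERID, a.EMAIL_SRV_FLDR from CAF.PIM_USRPRF a
where a.EMAIL_SRV_FLDR in (
    select b.EMAIL_SRV_FLDR from CAF.PIM_USRPRF b where b.USERID != a.USERID)
"""


def shared_backend_users(rows):
    """Load (USERID, EMAIL_SRV_FLDR) rows into a scratch table and run QUERY."""
    conn = sqlite3.connect(":memory:")
    # Attach a second in-memory database named CAF so the schema-qualified
    # table name in the query resolves without editing the query.
    conn.execute("ATTACH DATABASE ':memory:' AS CAF")
    conn.execute("CREATE TABLE CAF.PIM_USRPRF (USERID TEXT, EMAIL_SRV_FLDR TEXT)")
    conn.executemany("INSERT INTO CAF.PIM_USRPRF VALUES (?, ?)", rows)
    result = sorted(conn.execute(QUERY).fetchall())
    conn.close()
    return result
```

With two hypothetical users sharing one mail folder and a third on its own, only the two sharing users come back. An empty result means no two userids currently in the table share a backend folder.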
To validate that this is the problem you are having, do the following:
1.    Log on to the portal as an administrator.
2.    Go to the Everyplace Synchronization tab and select Server Settings, and set synchronization tracing to High.
3.    Synchronize again as the user who had the error and, for that user, check for the SQL0803 error between the validateLogin and NotifyLogout.
4.    If you see the error, reset both users to an initial synchronization state by running
<esshome>\CAF\bin\esscmd deleteuserdata userid
5.    Reset the data on the devices.
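
Step 3 above can be sketched as a small log scan. This is a rough illustration only, under assumptions that are not confirmed by the release notes: that the validateLogin line for a session contains the userid, that the session ends at the next NotifyLogout line, and that sessions of different users are not interleaved in the log. The function name is hypothetical.

```python
def sql0803_in_sessions(lines, userid):
    """Return one list per sync session of userid, holding that session's SQL0803 lines."""
    sessions = []
    current = None  # None while we are outside a session for this user
    for line in lines:
        if "validateLogin" in line and userid in line:
            current = []  # a new session for this user starts here
        elif current is not None:
            if "SQL0803" in line:
                current.append(line.rstrip("\n"))
            if "NotifyLogout" in line:
                sessions.append(current)  # session finished
                current = None
    return sessions
```

Pass it the lines of the trace or error log (for example from open(path).readlines()). A non-empty inner list means the error showed up inside one of that user's sessions.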




This query will show which userids are configured to sync to the same
backend user:
select distinct a.USERID, a.EMAIL_SRV_FLDR from CAF.PIM_USRPRF a where
a.EMAIL_SRV_FLDR in (select b.EMAIL_SRV_FLDR from CAF.PIM_USRPRF b where
b.USERID != a.USERID)

If they have run this query and done deleteuserdata for the userids it returned, and they still have the problem, then let's try the same steps we used for Leica when they had the problem.

Try this for each of these GUIDs (8071 8072 8073 8074 8075 8076 8077 8078 8079)
select ts_account_id from ts_gud_timestamp where ts_gud_record_id = GUID

If there is no record in account with account_id equal to that ts_account_id, then the delete_user_data missed
that record somehow.  If there is a record in account with that account_id, then there was yet another account
configured to the same backend contact db.
 
In either case, there probably are other records in ts_gud_timestamp with that other ts_account_id.  In the first
case, for a one-time fix, delete all the records with that ts_account_id.  The only way I see to figure out what
the problem is more generally is to dump out all the records in ts_gud_timestamp for a user, then do a
delete_user_data, and see which records aren't getting deleted.
For the account id that is returned from the above query, do this second query before and after running deleteUserData
"select * from ts_gud_timestamp where ts_account_id = xxxxx"
where xxxxx is the value of ts_account_id from the first query above.
 
In the second case, doing a delete_user_data on the other account might fix it.  You might want to dump out all
the records in ts_gud_timestamp for that user before and after doing the delete_user_data also, just in case there
is a problem with delete_user_data, as in the other case.
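
The before/after comparison described above amounts to a simple set difference. A minimal sketch, assuming the two dumps have already been loaded as lists of tuples; the (ts_account_id, ts_gud_record_id) row shape and the function name are placeholders, since the full column list of ts_gud_timestamp is not given here.

```python
def surviving_records(before_rows, after_rows):
    """Return rows that existed before deleteuserdata and are still present afterwards."""
    remaining = set(after_rows)
    # Keep the original order of the "before" dump so the output is easy to read.
    return [row for row in before_rows if row in remaining]
```

Any row this returns is a record that deleteuserdata did not remove for that account id.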

ST with Peter:
There's a technote about problems with 2 portal users using the same backend userid. Unicco says they are seeing the logs reflecting that problem, however when they execute the SQL statement querying the DB for redundant backend users there are no records found. Is the SQL statement in the technote for doing this query to be used verbatim (no changes at all, just type it in)?
Pete Gamble    let me check...
Pete Gamble    Is this what they did?   select distinct a.USERID, a.EMAIL_SRV_FLDR from CAF.PIM_USRPRF a where
a.EMAIL_SRV_FLDR in (select b.EMAIL_SRV_FLDR from CAF.PIM_USRPRF b where
b.USERID != a.USERID)
shayden@us.ib...    Yes, verbatim.
Pete Gamble    is it possible that one of the users has already been deleted?
shayden@us.ib...    At first they didn't bother running the query, they just deleted the user who could not synch (calendar, etc.), and tried to resync. I had them run the query (result was "0 records found") then deleteuserdata, hard reset the device and reload, then attempt the synch. The user is supposed to have lotsa data, but none was synced, and the SQL errors are in the ESS_errors.log. I have asked for the logs to be sent to me.
Pete Gamble    do they remember the name of the other user that they already deleted, they need to do deleteuserdata for that user too
shayden@us.ib...    They did the deleteuserdata before running the query. They probably dont know if there was another user (or especially who it is).
Pete Gamble    i thought you said they deleted the other user
shayden@us.ib...    No, they just deleted the user that was having problems syncing.
shayden@us.ib...    (That's what they said, I can re-confirm if need be).
Pete Gamble    ok, I am off my call, how about I call you
shayden@us.ib...    k, Tie 441-7375
shayden@us.ib...    Are you still there?
Pete Gamble    yep
shayden@us.ib...    The note you sent,  these operations are to be performed on the TSS1 DB, right?
Pete Gamble    sorry about that, yes
shayden@us.ib...    and there is an "account" table in that DB?
Pete Gamble    yes
shayden@us.ib...    I'm on the phone with the customer now, and is having trouble understanding these instructions. Do you have a few minutes to teleconf in to answer?
Pete Gamble    sure
shayden@us.ib...    Your ext.
Pete Gamble    3-2018
Pete Gamble    validatel|end session for|ger getupd|ger putupd|long guid :



When we saw the problem before it was when customers were trying to use Domino LDAP instead of LDAP.

What are they using for a directory?

If they are using something like Domino LDAP, then you can check this...
Have them check in <was_home>/lib/app/config/um.properties. The setting for user.fbadefault.filter identifies the attribute that we use to look up the user. We have seen a problem when it is set to cn. You can correct this by changing it to uid (user.fbadefault.filter=uid). Portal will need to be restarted after making the change.
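
If it is easier than opening the file by hand, the current value can be read with a few lines of Python. This is a sketch only: the function name is made up, and it handles plain key=value lines, not the full Java properties syntax (no ":" separators or line continuations).

```python
def read_property(text, name):
    """Return the value of name from Java-style properties text, or None if absent."""
    value = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comments
        key, sep, val = line.partition("=")
        if sep and key.strip() == name:
            value = val.strip()  # a later definition overrides an earlier one
    return value
```

Call it with the contents of um.properties and the name user.fbadefault.filter. A result of cn is the case described above; uid is the corrected setting.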


If they are not using Domino LDAP, then we need to get a UserGroupInfoWebService servlet trace to determine why FAILED is being returned. Here are the instructions to get the trace.

1. Open the WebSphere Administrator's Console
2. In the left hand pane, navigate to and select the WebSphere Portal application server
3. In the right hand pane, click on the Services tab, select Trace Services, then click Edit Properties
4. Click the ... beside Trace Specifications
5. Expand Components->com->ibm->pvc->ent->us and right click on UserGroupInfoWebServiceImplServer and select All.
6. Make sure Trace Output is ring buffer.
7. Click OK
8. In the right hand pane, click Apply

9. Recreate the problem.

10. Go back to Trace Services under WebSphere Portal and click Edit Properties
11. Click the ... beside Trace Specifications
12. Beside Dump File Name, type a filename (e.g. C:\temp\US.log) to contain the trace information, then click Dump.

Note: To have the trace output written directly to a file, in step 6 change the Trace Output to Specify and type in the name of the trace file. You will need to restart WAS for this to take effect before you recreate the problem.




Turn on ESS Logging

Logging At ESS Server

1) To turn ESS tracing on, log into Portal with an admin id and go to the Everyplace Synchronization tab, then the Server Settings tab. Change the Trace Log level to High and click Save. Then recreate the problem.

Provide the following logs and let us know the userid having the problem and the approximate time of the recreate:

     \IBMSyncServer\logs
        ESS_Messages
        ESS_Traces
        ESSErrors
        appserver-stderr.log
        appserver-stdout.log

HTTP Server Logs

In the *HTTPServer/logs directory, provide the "errors.log" and "access.log".

To Enable the servlet log for 1 user on post FP5 drivers

1) On the device, browse to the URL "http://hostname/ess/SyncML?trace=3".

2) On the server, in "config_tss.txt" set the value "INTERNAL_DEBUG_LEVEL=-1".

3) You will then get just Starfish tracing (if you are using the debug file), for just that user. (peterG (32018): restart of servlet)


Domino Server Logs ("dominoca.trace")

Add this line to the Lotus\Notes\notes.ini file: DCALOG=1

This will produce a file in the root directory of C: called "dominoca.trace".  This file contains information about the Notes API communication used by ESS to talk to Domino.

WEA Client Logs (XML logs)

On the PPC device, bring up the syncML client, which is in My Device -> Program Files -> Starfish -> syncML -> smlCEClient. Then go to Tools, Setup, then the Tracing tab. Select 'Set trace on' and then OK. Recreate the problem. The logs on the device can be found in the XMLlogs directory. Send all of them.

There is a technote describing how, starting in fp4, ESS tracing can be changed without stopping and restarting the server. The following info was obtained from IDD:

It's an internal technote:

Defect 67547

#1152894
http://www-1test.ibm.com/support/dcf/preview.wss?host=d02dbs88.southbury.ibm.com&db=support/swg/swgtech.nsf&unid=7EB6A7F6462C437485256DE3006B8C40&taxOC=SSCVS62&MD=2003/11/19%2014:11:36&sid=



 
Specific Errors In Logs

Log:    "(4756)Session::impersonate() - LogonUser(SRVWEA,UFP,********) failed: 1326"

This means your admin password is incorrect (Windows error 1326 is ERROR_LOGON_FAILURE: unknown user name or bad password). Fix it by logging in to WEA as admin and going to the settings for the Exchange 5.5 adapter in ESS. Please note that the password is case sensitive!


Log:  "Internal Error: got javax.mail.internet.ParseException"

This is probably related to an email problem: for some reason the email is being quoted incorrectly, either by the client or the server.



Obtaining a Java Dump of ESS

1) Download/detach the file "jvmdump.exe" and place it in the directory "~\WebSphere\IBMSyncServer\java\jre\bin".
 (Also available on ftp://fatfish.raleigh.ibm.com/castanef   (user:xdisk   pass:xdisk))

(Note:  If you are starting ESS using the Windows Services task, you MUST configure and start the Everyplace Synchronization server to run under a userid, NOT as LocalSystem.  Otherwise the jvmdump command will not create a core file.)

2) After you start the ESS server (from the console window), look in the file c:\websphere\ibmsyncserver\logs\ESSErrors.log.  Go to the end of the file and look for the last occurrence of this line:

ESS PID=3692 ProductSuite=Unknown Type ProductType=ServerNT

The Process ID that the core sync server is using is listed in that line (it will be different each time you start the server; 3692 is just an example).

3) When you see the server hang, run the following command from the c:\WebSphere\IBMSyncServer\java\jre\bin directory:

jvmdump PID

where PID is the process id for ESS (in the above example it would be jvmdump 3692).  This command will create a file called core.<some date stamp>.dmp.  This will be a BIG file, around 300MB.

4) Run jextract on the same machine that the core file was created on.
jextract -o mydumpfile corefilename


5) Using the JDK, on a development machine:
jformat -f mydumpfile


Please upload this file to fatfish when you have it, along with all of the other traces from the recreate.
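
Finding the PID in step 2 can be scripted instead of scrolling to the end of ESSErrors.log. A minimal sketch: the pattern is based only on the single sample line shown in step 2, and the function name is made up.

```python
import re

# Matches lines such as "ESS PID=3692 ProductSuite=Unknown Type ProductType=ServerNT".
PID_LINE = re.compile(r"ESS PID=(\d+)")


def last_ess_pid(lines):
    """Return the PID from the last 'ESS PID=' line, or None if there is none."""
    pid = None
    for line in lines:
        match = PID_LINE.search(line)
        if match:
            pid = int(match.group(1))  # keep overwriting so the last match wins
    return pid
```

Feed it the lines of ESSErrors.log and pass the returned number to jvmdump in step 3.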




Resetting An ESS User

1) From the command line, go to "~\WebSphere\IBMSyncServer\caf\bin" and execute the command "esscmd deleteuserdata <Portal username>", where "<Portal username>" is the user's ID (e.g. "esscmd deleteuserdata bubba").

gewhiteh@us.ibm.c...    also, you may want to see how large the user's mail file is that says it is "hanging"

For the deleteuserdata command, here are the instructions:

All commands are run from the \WebSphere\IBMSyncServer\caf\bin directory
(Windows)

Here is a command example from esscmd:

DELETEUSERDATA
This command is used to clear all User Data, meaning any information
relating to synched data for the user.  This command is run when you want
to clear the WEA server side of any record of prior synchs for a user.

command:  esscmd deleteuserdata <username>, where <username> is the
shortname of user.

expected results from command



C:\WebSphere\IBMSyncServer\caf\bin>esscmd deleteuserdata wuser1

C:\WebSphere\IBMSyncServer\caf\bin>REM This script runs the Everyplace Synchronization Server Command / Console
ESMCommandLine will connect with = ESMAdminService
ESMCommandLine will connect with = ESMAdminService
ESM CommandLine instance created on a free port.
ESMCommandLine bound to RMI Registry with URL "//localhost:20000/ESMCommandLine1".
Trying to locate ESMAdmin with URL "//localhost:20000/ESMAdminService".
ESMAdmin located with URL "//localhost:20000/ESMAdminService".
ESMCommandLine: direct-command-mode.
ESMCommandLine: deleteUserData.
UserID: wuser1 was successfully deleted.
ESMCommandLine: exit.

After this command is run, the user's next synchronization is treated as an
initial synch.
NOTE:  Make sure that you remove all data from the device as well, or you
will cause any data on the device to be synched to the server.  This could
cause many duplicates.



Using Custom Templates

There is a tech document available on the WEA support site that discusses additional steps required for custom templates:

http://www-1.ibm.com/support/docview.wss?rs=754&context=SSNM6Y&q1=templates&uid=swg21111408&loc=en_US&cs=utf-8&lang=en+en