Fixes are available
7.0.0.27: WebSphere Application Server V7.0 Fix Pack 27
8.5.0.2: WebSphere Application Server V8.5 Fix Pack 2
8.0.0.6: WebSphere Application Server V8.0 Fix Pack 6
7.0.0.29: WebSphere Application Server V7.0 Fix Pack 29
8.0.0.7: WebSphere Application Server V8.0 Fix Pack 7
8.0.0.8: WebSphere Application Server V8.0 Fix Pack 8
7.0.0.31: WebSphere Application Server V7.0 Fix Pack 31
7.0.0.27: Java SDK 1.6 SR13 FP2 Cumulative Fix for WebSphere Application Server
7.0.0.33: WebSphere Application Server V7.0 Fix Pack 33
8.0.0.9: WebSphere Application Server V8.0 Fix Pack 9
7.0.0.35: WebSphere Application Server V7.0 Fix Pack 35
8.0.0.10: WebSphere Application Server V8.0 Fix Pack 10
7.0.0.37: WebSphere Application Server V7.0 Fix Pack 37
8.0.0.11: WebSphere Application Server V8.0 Fix Pack 11
7.0.0.39: WebSphere Application Server V7.0 Fix Pack 39
8.0.0.12: WebSphere Application Server V8.0 Fix Pack 12
7.0.0.41: WebSphere Application Server V7.0 Fix Pack 41
8.0.0.13: WebSphere Application Server V8.0 Fix Pack 13
7.0.0.43: WebSphere Application Server V7.0 Fix Pack 43
8.0.0.14: WebSphere Application Server V8.0 Fix Pack 14
7.0.0.45: WebSphere Application Server V7.0 Fix Pack 45
8.0.0.15: WebSphere Application Server V8.0 Fix Pack 15
7.0.0.27: Java SDK 1.6 SR12 Cumulative Fix for WebSphere Application Server
7.0.0.29: Java SDK 1.6 SR13 FP2 Cumulative Fix for WebSphere Application Server
7.0.0.45: Java SDK 1.6 SR16 FP60 Cumulative Fix for WebSphere Application Server
7.0.0.31: Java SDK 1.6 SR15 Cumulative Fix for WebSphere Application Server
7.0.0.35: Java SDK 1.6 SR16 FP1 Cumulative Fix for WebSphere Application Server
7.0.0.37: Java SDK 1.6 SR16 FP3 Cumulative Fix for WebSphere Application Server
7.0.0.39: Java SDK 1.6 SR16 FP7 Cumulative Fix for WebSphere Application Server
7.0.0.41: Java SDK 1.6 SR16 FP20 Cumulative Fix for WebSphere Application Server
7.0.0.43: Java SDK 1.6 SR16 FP41 Cumulative Fix for WebSphere Application Server
APAR status
Closed as program error.
Error description
The WebSphere Application Server that was running the Messaging Engine (ME) was being brought down. That caused the ME to fail over to another cluster member on a different LPAR, which is expected. However, the adjunct on the second LPAR received the errors below and was terminated. To recover, the application server had to be restarted manually.

J2CA0206W: A connection error occurred. To help determine the problem, enable the Diagnose Connection Usage option on the Connection Factory or Data Source.

J2CA0056I: The Connection Manager received a fatal connection error from the Resource Adapter for resource jdbc/<<<resourceName>>>. The exception is: com.ibm.db2.jcc.am.ClientRerouteException: [jcc][t4][2027][11212][3.59.83] A connection failed but has been re-established. The host name or IP address is "abc.ibm.com" and the service name or port number is 1,234. Special registers may or may not be re-attempted (Reason code = 1). ERRORCODE=-30108, SQLSTATE=08506

Followed by FFDC error:

[jcc][t4][2027][11212][3.59.83] A connection failed but has been re-established. The host name or IP address is "abc.ibm.com" and the service name or port number is 1,234. Special registers may or may not be re-attempted (Reason code = 1). ERRORCODE=-30108, SQLSTATE=08506
    at com.ibm.db2.jcc.am.dd.a(dd.java:304)
    at com.ibm.db2.jcc.am.dd.a(dd.java:356)
    at com.ibm.db2.jcc.t4.a.a(a.java:473)
    at com.ibm.db2.jcc.t4.a.L(a.java:1024)
    at com.ibm.db2.jcc.t4.b.a(b.java:4885)
    at com.ibm.db2.jcc.t4.l.bc(l.java:124)
    at com.ibm.db2.jcc.am.cn.executeQuery(cn.java:652)
    at com.ibm.ws.rsadapter.jdbc.WSJdbcPreparedStatement.pmiExecuteQuery
    at com.ibm.ws.rsadapter.jdbc.WSJdbcPreparedStatement.executeQuery
    at com.ibm.ws.sib.msgstore.persistence.impl.MEInnerOwnerTable.readOwningME
    at com.ibm.ws.sib.msgstore.persistence.lock.DBLockingThread.waitAndRefreshLock
    at com.ibm.ws.sib.msgstore.persistence.lock.DBLockingThread.run

The error above is a result of this query:

SELECT ME_UUID,INC_UUID,VERSION,MIGRATION_VERSION FROM SIBSYS01.SIBOWNER
1003 1007 0 0 2

Finally, HA Manager killed the JVM, bringing the adjunct down:

HMGR0130I: The local member of group <<< group name>>> has indicated that is it not alive. The JVM will be terminated.
    at java.lang.Thread.dumpStack(Thread.java:417)
    at com.ibm.ws.hamanager.proxy.DispatchHAGroupCallbackImpl.isAlive(DispatchHAGroupCallbackImpl.java:193)
    ...
Panic:component requested panic from isAlive

In this case, the problem was that the second WebSphere Application Server created a connection to a DB2 data sharing member that was also being brought down. That caused DB2 to return the ClientRerouteException shown above, which says that the connection was lost but was successfully re-established to a different data sharing member. However, with this property defined:

sib.msgstore.jdbcFailoverOnDBConnectionLoss=true

the ME does not retry after a single failure connecting to DB2, and the ME is brought down. This APAR will provide a property that makes it configurable whether to retry the connection (and how many times) before the ME is brought down.
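When triaging a log like the one above, the two values that matter most are the ERRORCODE and SQLSTATE at the end of the JCC message. The following is a minimal sketch, not part of the fix, of how they might be extracted from a log line. The class name JccMessageParser and the regular expression are illustrative assumptions based only on the message format quoted in this APAR.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for log triage; not an IBM-provided class.
class JccMessageParser {
    // Matches the trailing "ERRORCODE=<n>, SQLSTATE=<5 chars>" pair
    // as it appears in the message quoted in this APAR.
    private static final Pattern CODES =
        Pattern.compile("ERRORCODE=(-?\\d+),\\s*SQLSTATE=(\\w{5})");

    // Returns {errorCode, sqlState}, or null if the text has no such pair.
    static String[] parse(String message) {
        Matcher m = CODES.matcher(message);
        return m.find() ? new String[] { m.group(1), m.group(2) } : null;
    }

    public static void main(String[] args) {
        String line = "Special registers may or may not be re-attempted "
            + "(Reason code = 1). ERRORCODE=-30108, SQLSTATE=08506";
        String[] parsed = parse(line);
        // prints: errorCode=-30108 sqlState=08506
        System.out.println("errorCode=" + parsed[0] + " sqlState=" + parsed[1]);
    }
}
```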
Local fix
Configure sib.msgstore.jdbcFailoverOnDBConnectionLoss=false

See details on this property here:
http://pic.dhe.ibm.com/infocenter/wasinfo/v7r0/topic/com.ibm.websphere.zseries.doc/info/zseries/ae/tjm_dsconnloss.html
Problem summary
****************************************************************
* USERS AFFECTED:  Users of the default messaging provider for *
*                  IBM WebSphere Application Server versions   *
*                  7.0, 8.0, and 8.5                           *
****************************************************************
* PROBLEM DESCRIPTION: On z/OS LPARs, if the Messaging Engine  *
*                      is configured to run in high            *
*                      availability mode and the DB2 used as   *
*                      the data store is also clustered, then  *
*                      when one LPAR is brought down the       *
*                      Messaging Engine on that LPAR fails     *
*                      over to the other LPAR. But if the      *
*                      connection pool returns a connection    *
*                      that points to a DB2 instance running   *
*                      on the LPAR that was brought down, the  *
*                      Messaging Engine initiates a local      *
*                      error. If there are only two LPARs,     *
*                      the system is left without any          *
*                      Messaging Engine.                       *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************

Consider a setup where WebSphere Application Server runs on z/OS LPARs in an active/passive topology and is configured for high availability. A bus has a Messaging Engine configured to run in high availability mode on the LPARs, and the database (DB2) is configured for high availability on the same LPARs. If one of the active LPARs is brought down, the Messaging Engine on that LPAR fails over to the other LPAR. When the Messaging Engine comes up there for the first time, it attempts to obtain a connection from the connection pool. The pool can return a connection that points to the DB2 instance on the LPAR that was brought down; when the Messaging Engine attempts to use it, the DB2 driver issues a "ClientRerouteException". By default, "ClientRerouteException" is mapped to a StaleConnectionException.

In the case of a StaleConnectionException, the Messaging Engine does not re-attempt the previous operation, because the connection is not guaranteed, and initiates a failover. Since the active LPAR was already brought down, the system is left without any Messaging Engine.
Problem conclusion
After collaborating with the DB2 team, we understand that some of the error codes in the "ClientRerouteException" mean that an instance of DB2 is already up and running elsewhere, and that a retry is guaranteed to connect to a running database. The Messaging Engine therefore looks for the error codes -30108, -4499, and -4498, and when it finds one of them it retries instead of causing a failover.

The fix for this APAR is currently targeted for inclusion in fix packs 7.0.0.27, 8.0.0.6, and 8.5.0.2. Please refer to the Recommended Updates page for delivery information:
http://www.ibm.com/support/docview.wss?rs=180&uid=swg27004980
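The retry decision described above can be illustrated with a short sketch. This is not the Messaging Engine's actual code: the class name RerouteRetrySketch, the method names, and the maxRetries parameter are hypothetical, and the sketch assumes only that the driver surfaces the error code through the standard java.sql.SQLException.getErrorCode() method.

```java
import java.sql.SQLException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.Callable;

// Illustrative only; names and structure are assumptions, not IBM code.
class RerouteRetrySketch {
    // Error codes named in this APAR as safe to retry.
    static final Set<Integer> RETRYABLE_CODES =
        new HashSet<>(Arrays.asList(-30108, -4499, -4498));

    static boolean isRetryable(SQLException e) {
        return RETRYABLE_CODES.contains(e.getErrorCode());
    }

    // Runs the operation, retrying up to maxRetries times when the failure
    // carries one of the retryable codes. Any other failure, or running out
    // of retries, is rethrown so the caller can fail over.
    static <T> T runWithRetry(Callable<T> operation, int maxRetries) throws Exception {
        int retries = 0;
        while (true) {
            try {
                return operation.call();
            } catch (SQLException e) {
                if (!isRetryable(e) || retries >= maxRetries) {
                    throw e;
                }
                retries++;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Simulated query: the first call fails with a reroute code, the second succeeds.
        String result = runWithRetry(() -> {
            if (calls[0]++ == 0) {
                throw new SQLException("connection re-established", "08506", -30108);
            }
            return "owner row read";
        }, 3);
        System.out.println(result + " after " + calls[0] + " calls");
    }
}
```

In this sketch, an exception with any other error code is rethrown immediately, which corresponds to the pre-fix behaviour of treating the connection as stale and failing over.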
Temporary fix
Comments
APAR Information
APAR number
PM64875
Reported component name
WAS SIB & SIBWS
Reported component ID
620800101
Reported release
300
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt
Submitted date
2012-05-17
Closed date
2012-10-04
Last modified date
2012-10-04
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
PM93758
Fix information
Fixed component name
WAS SIB & SIBWS
Fixed component ID
620800101
Applicable component levels
R300 PSY
UP
R800 PSY
UP
Document Information
Modified date:
28 October 2021