IBM Support

PK19028: CICS-DB2 STATE IS DISCONNECTING AFTER DB2 ABEND. CDBF TASK IS SUSPENDED ON DB2CDISC.

A fix is available


APAR status

  • Closed as program error.

Error description

  • CICS-DB2 connection status shows DISCONNECTING after DB2 abends.
    DB2CONN is unable to reconnect to DB2. The CDBF task is
    suspended on resource DB2CDISC which indicates that a
    SET DB2CONN NOTCONNECTED was issued. DFHD2TM is going to wait
    for the count of tasks using DB2 to reach zero. The DB2 summary
    shows that no tasks are active in DB2. All CSUBs have been
    freed and the CEX2 task has gone away. The CDBF task is
    suspended in DFHD2TM, waiting for the D2S_DISCONNECT_ECB to be
    posted by DFHD2STP.
    .
    The CICS-DB2 connection never finishes disconnecting because a
    steady stream of tasks keeps trying to access DB2 and failing
    with abend AEY9.
    1- DB2 abending causes the CEX2 task to do a START for
       transaction CDBF.
    2- This CDBF task does the EXEC CICS SET DB2CONN NOTCONNECTED.
       This causes control to go to DFHD2TM where, in the
       stop_the_db2_connection iproc, the presence of LOTs on the
       GWA_LOT chain (meaning there are tasks currently using the
       DB2 connection) causes process_db2conn_busy_option to be set
       and control to return. Eventually, the
       process_db2conn_busy_option causes control to go to iproc
       db2conn_busy where all the tasks using the DB2 connection are
       forcepurged. The CDBF task then waits in a DB2CDISC wait,
       waiting for the D2S_DISCONNECT_ECB to be posted. The CDBF
       task never wakes up from this wait.
    3- The forcepurging of the tasks using DB2 is supposed to cause
       them to abend and stop using DB2. The last one to stop using
       DB2 is supposed to then initiate another call to stop the DB2
       connection. This happens in the process_task_manager_call
       iproc in DFHD2EX1. The last task in there finds GWA_LOT =
       nulls and so then calls DFHD2TM for set_db2conn notconnected.
       At this point, under this last task, control would go to
       DFHD2TM and DFHD2CC and eventually to DFHD2STP.
    4- DFHD2STP works with CEX2 to terminate the thread subtasks.
       Then, since STANDBYMODE(RECONNECT) is not set on the DB2CONN,
       DFHD2STP tries to DISABLE the TRUE. If that call to
       DISABLE_TRUE is successful, then DFHD2STP would post the
       D2S_DISCONNECT_ECB that the CDBF task is waiting on.
       But that call fails, DFHD2STP does not post the ECB, and so
       the CDBF task never wakes up.
    5- The DISABLE_TRUE iproc first does an EXEC CICS DISABLE STOP.
       This causes all tasks that attempt to use the DB2 connection
       to abend with AEY9. But then the EXEC CICS DISABLE EXITALL
       keeps failing with INVEXITREQ and RESP2 800080000000 meaning
       there are still active tasks. A loop here retries the
       DISABLE EXITALL every second for 10 minutes. When the 10
       minutes are up, it gives up, issues DFHAP0002 with code
       (X'31B4'), and fails the request. Because of this failure,
       d2s_disconnect_ecb is never posted, the CDBF task hangs
       forever, and the DB2 connection remains in the
       DISCONNECTING state.
    6- DFHUEM processes the EXEC CICS DISABLE commands.  It returns
       the INVEXITREQ with reason 800080000000 when the EPBICNT
       (invocation count) is non-zero.  At time of dump, this
       EPBICNT is 00000003. And currently at time of dump, there are
       3 tasks that are in DFHERM for a DB2 call. These tasks are in
       the process of abending with abend AEY9.  But when DFHERM
       figures out that a request is for DB2, it bumps the EPBICNT.
       And, in the case where the request is going to fail with
       AEY9, EPBICNT isn't decremented until DFHERM's recovery
       routine ERMREC gets control from Kernel because of the AEY9
       that DFHERM initiated. And between the point where EPBICNT is
       incremented and the point where it is decremented, a
       transaction dump is taken, which lengthens the time that
       EPBICNT stays incremented.
       The trace shows a steady stream of AEY9 abends all the
       way to dump time. So during the entire 10 minutes where
       DFHD2STP was trying the DISABLE EXITALL, there was always a
       task in DFHERM for a DB2 request.
       .
       One thing that elongates the time in DFHERM is the fact that
       Fault Analyzer is being used.  It hooks in at the XDUREQ exit
       and participates in lengthening the time.  The three tasks
       that are in DFHERM abending with AEY9 are all currently
       suspended by Fault Analyzer.
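
The retry behaviour described in steps 5 and 6 can be sketched as a small Python model. This is only an illustration, not CICS code: the function names and the `count_at_second` callback are hypothetical, and the only facts taken from the description above are the one-second retry interval, the 10-minute window, and the rule that the DISABLE fails while the invocation count (EPBICNT) is non-zero.

```python
from typing import Callable, Tuple

RETRY_INTERVAL_S = 1      # from the description: retried every second
RETRY_WINDOW_S = 10 * 60  # from the description: for 10 minutes


def disable_exitall(invocation_count: int) -> str:
    """Hypothetical stand-in for the DFHUEM check on the invocation count."""
    return "NORMAL" if invocation_count == 0 else "INVEXITREQ"


def try_disable_true(count_at_second: Callable[[int], int]) -> Tuple[bool, int]:
    """Retry the DISABLE once per interval; return (ecb_posted, attempts)."""
    attempts = 0
    for second in range(0, RETRY_WINDOW_S, RETRY_INTERVAL_S):
        attempts += 1
        if disable_exitall(count_at_second(second)) == "NORMAL":
            return True, attempts   # DISABLE worked, so the ECB is posted
    return False, attempts          # gave up: ECB is never posted


# Count stuck at 3 for the whole window, as in the dump: all retries fail.
print(try_disable_true(lambda s: 3))
# Count drains to zero after 5 seconds: the sixth attempt succeeds.
print(try_disable_true(lambda s: 3 if s < 5 else 0))
```

In this model the first call returns `(False, 600)` and the second returns `(True, 6)`, which mirrors why the CDBF task only wakes up if some retry happens to observe a zero count.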
    

Local fix

  • The problem can be circumvented by specifying
    STANDBYMODE(RECONNECT) on the DB2CONN.  That would cause CICS not
    to try to DISABLE the TRUE.
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All.                                         *
    ****************************************************************
    * PROBLEM DESCRIPTION: DFHD2STP suffers severe error (X'31B4') *
    *                      and CDBF is in DB2CDISC wait following  *
    *                      a DB2 crash.                            *
    ****************************************************************
    * RECOMMENDATION:                                              *
    ****************************************************************
    Following a DB2 crash, DFHD2STP can suffer severe error code
    (X'31B4'). This will occur if the STANDBYMODE in the DB2CONN is
    set to CONNECT. The CICS task CDBF is also left in a DB2CDISC
    wait.
    DFHD2STP attempts to DISABLE the TRUE, but because a steady
    stream of new tasks is trying to use the TRUE, the DISABLE
    command always gets a response of TASKACTIVE.
    The DISABLE command is retried for 10 minutes and then ends
    with disable_failed, which raises the severe error.
    Each new task that tries to use the TRUE enters DFHERM and
    obtains a TIE and EPB. The EPBICNT count is incremented and
    then, because the TRUE is not available, the task is abended
    with AEY9.
    The dump domain exit XDUREQ is active and does extra processing
    for each AEY9 abend. As a result, the task takes longer to
    return to DFHERM, and its recovery routine takes longer to
    decrement the EPBICNT count.
    The DISABLE command therefore receives the TASKACTIVE response
    even when no CICS task is actually using the TRUE.
    The CDBF task stays in the DB2CDISC wait because it is waiting
    on an ECB that is never posted because the DISABLE fails.
    
    Keywords: msgDFHAP0002 DFHAP0002 AP0002 AbendAEY9 AbendsAEY9
    INVEXITREQ 800080000000 d2s_disconnect_ecb
    

Problem conclusion

  • DFHERM has been changed to decrement the EPBICNT count before
    the AEY9 abend is processed.
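
The effect of moving the decrement can be illustrated with a toy Python model. It is a sketch under stated assumptions, not the DFHERM change itself: the function name, the arrival gap, and the hold times are all hypothetical values chosen for illustration. Each arriving task bumps a counter and holds it for `hold_seconds`, which stands in for the time spent in abend and dump processing before the old code decremented EPBICNT. A hold of zero models decrementing before the AEY9 abend is processed.

```python
def seconds_with_zero_count(arrival_gap: int, hold_seconds: int,
                            horizon: int = 60) -> int:
    """Count the seconds in [0, horizon) where no task holds the counter.

    A task arrives every `arrival_gap` seconds and keeps the counter
    incremented for `hold_seconds` (hypothetical illustration values).
    """
    zero_seconds = 0
    for t in range(horizon):
        in_flight = sum(
            1 for arrival in range(0, horizon, arrival_gap)
            if arrival <= t < arrival + hold_seconds
        )
        if in_flight == 0:
            zero_seconds += 1
    return zero_seconds


# Old ordering: a task every 2s, each holding the count for 5s of abend work.
print(seconds_with_zero_count(arrival_gap=2, hold_seconds=5))   # 0
# New ordering: count released before abend processing, so it is never held.
print(seconds_with_zero_count(arrival_gap=2, hold_seconds=0))   # 60
```

With the hold longer than the arrival gap, the counter is never observed at zero, so a once-per-second DISABLE retry can never succeed. With the decrement moved ahead of the abend processing, every retry sees a zero count in this model.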
    

Temporary fix

  • FIX AVAILABLE BY PTF ONLY
    

Comments

APAR Information

  • APAR number

    PK19028

  • Reported component name

    CICSTS 3.1 Z/OS

  • Reported component ID

    5655M1500

  • Reported release

    400

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2006-02-03

  • Closed date

    2006-02-20

  • Last modified date

    2006-03-02

  • APAR is sysrouted FROM one or more of the following:

    PK15787

  • APAR is sysrouted TO one or more of the following:

    UK11861

Modules/Macros

  •    DFHERM
    

Fix information

  • Fixed component name

    CICSTS 3.1 Z/OS

  • Fixed component ID

    5655M1500

Applicable component levels

  • R400 PSY UK11861

       UP06/02/22 P F602

Fix is available

  • Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.


Document Information

Modified date:
02 March 2006