IBM Support

PI76327: EYUWG0106E AND DIAGNOSTIC SVC DUMP TAKEN BY CPSM TRANSACTION COWC

A fix is available


APAR status

  • Closed as program error.

Error description

  • A diagnostic SVC dump was taken by CPSM transaction COWC, with
    a title similar to the following:
    .
    EYU0XZSD Dump,masname,applid,sysidnt,LMAS,COWC,00012356,
             TRAC,EYU0WNLM,mm/dd/yyyy,hh:mm:ss
    In the CICS job log, the following message will appear:
    .
    EYUWG0106E applid WLM has encountered an error while
                      attempting to release MAS resources.
    .
    In the SVC dump, you format the CPSM trace entries using
    .
      VERBX EYU9Dxxx 'TRC=A,JOB=cicsname'
    .
    and find repeated entries similar to the following:
    .
    Task  Mtd  Prev Tran Obj Level Pt-ID  Debug   UOW  CMAS/Usr Envr
    12345 XSRA WDTR EZLI SRV Excp      1 INVRESPT CPSM CICSNAME LMAS
    12345 WDTR XLOP EZLI WLM Excp      3 WDTRXSRA CPSM CICSNAME LMAS
    12345 XSRA WDTR EZLI SRV Excp      1 INVRESPT CPSM CICSNAME LMAS
    12345 WDTR XLOP EZLI WLM Excp      3 WDTRXSRA CPSM CICSNAME LMAS
    12345 XSRA WDTR EZLI SRV Excp      1 INVRESPT CPSM CICSNAME LMAS
    12345 WDTR XLOP EZLI WLM Excp      3 WDTRXSRA CPSM CICSNAME LMAS
    12345 XSRA WDTR EZLI SRV Excp      1 INVRESPT CPSM CICSNAME LMAS
    12345 WDTR XLOP EZLI WLM Excp      3 WDTRXSRA CPSM CICSNAME LMAS
    12345 XSRA WDTR EZLI SRV Excp      1 INVRESPT CPSM CICSNAME LMAS
    12345 WDTR XLOP EZLI WLM Excp      3 WDTRXSRA CPSM CICSNAME LMAS
    12345 XSRA WDTR EZLI SRV Excp      1 INVRESPT CPSM CICSNAME LMAS
    12345 WDTR XLOP EZLI WLM Excp      3 WDTRXSRA CPSM CICSNAME LMAS
    .
    The entries from XSRA are due to a failed attempt to get a
    shared lock on a workload descriptor. When the same trace is
    formatted with 'TRC' instead of 'TRC=A', the full MAL for the
    call to XSRA is shown, including the response and reason:
    .
          Keyword        Data Queue Req Data     Data
          Value          Type Dir   Opt Address  Value
     In: *FUNCTION       FUN         .  0001E3C0 RESACQ
         *DEBUG          CHR         .  0001E3C4 WDTRXSRA
          EXCLUSIVE      SDT         .
         *RESOURCE_PTR   EPT         .  0001E3CC A=01FF002A
                                                 O=00058B50
          CONDITIONAL    SDT         .
          EXCLUSIVE_MODE ENM         .
          DOWNGRADE      SDT         .
    Out: *RESPONSE       RSP         .  0001E3C2 INVALID
         *REASON         RSN         .  0001E3C3 INVALID_RESOURCE_PTR
         *STATUS         STA         .  0001E3D4 OK
    .
    .
    The INVALID_RESOURCE_PTR reason in this instance refers to the
    workload descriptor lock, field WRKD_LOCK in the workload
    descriptor. However, when we go to the workload descriptor in
    the WLM data spaces, we find the following in the eye-catcher:
    .
    | ..>eYUWMEYURWRKD |
    .
    Notice the lower case 'e'. This indicates that CPSM has
    logically deleted the workload descriptor. As such, the lock
    word is invalid. When this occurs, we return and fail to free
    up a control block called a WNLE.
    .
    When transaction COWC ran to perform cleanup of these WNLEs, it
    found them all in use by an active task, so it produced the
    diagnostic dump.
    .
    .
    Additional Symptom(s) Search Keyword(s): KIXREVSVR
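
As a rough aid for spotting the repeated trace pattern shown above, the following Python sketch counts, per task, how often an XSRA exception entry with debug text INVRESPT is immediately followed by a WDTR exception entry with debug text WDTRXSRA. It assumes the column layout of the sample (task, method, previous method, transaction, object, level, point ID, debug text). The function name `count_lock_failures` and the parsing approach are illustrative assumptions and are not part of CPSM or its dump formatter.

```python
from collections import Counter

def count_lock_failures(trace_text):
    """Count XSRA INVRESPT entries immediately followed by a WDTR WDTRXSRA
    entry for the same task. The column layout is assumed from the sample."""
    pairs = Counter()
    prev = None  # (task, method, debug) of the previous parsed entry
    for line in trace_text.splitlines():
        fields = line.split()
        # Assumed layout: Task Mtd Prev Tran Obj Level Pt-ID Debug ...
        if len(fields) < 8 or not fields[0].isdigit():
            prev = None  # header or unrelated line breaks the sequence
            continue
        entry = (fields[0], fields[1], fields[7])
        if (prev is not None
                and prev[0] == entry[0]
                and prev[1] == "XSRA" and prev[2] == "INVRESPT"
                and entry[1] == "WDTR" and entry[2] == "WDTRXSRA"):
            pairs[entry[0]] += 1
        prev = entry
    return pairs

sample = """\
Task  Mtd  Prev Tran Obj Level Pt-ID  Debug   UOW  CMAS/Usr Envr
12345 XSRA WDTR EZLI SRV Excp      1 INVRESPT CPSM CICSNAME LMAS
12345 WDTR XLOP EZLI WLM Excp      3 WDTRXSRA CPSM CICSNAME LMAS
12345 XSRA WDTR EZLI SRV Excp      1 INVRESPT CPSM CICSNAME LMAS
12345 WDTR XLOP EZLI WLM Excp      3 WDTRXSRA CPSM CICSNAME LMAS
"""
print(count_lock_failures(sample))
```

A high count for a single task number is consistent with the symptom described here, but it is only a hint; the MAL response and the eye-catcher still need to be checked in the dump.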
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All CICSPlex SM V5R1M0, V5R2M0 and V5R3M0    *
    *                 Users                                        *
    ****************************************************************
    * PROBLEM DESCRIPTION: MASes active as routing or target       *
    *                      regions for a CPSM WLM workload may be  *
    *                      invalidly removed from the workload, if *
    *                      the CMAS they are connected to is       *
    *                      restarted multiple consecutive times    *
    *                      while the MASes remain active, and the  *
    *                      CMAS restarts are terminated before the *
    *                      CMAS performs Topology Connect with the *
    *                      MASes.                                  *
    *                                                              *
    *                      During the first terminated restart,    *
    *                      messages similar to the following will  *
    *                      be issued in the EYULOG of the CMAS:    *
    *                                                              *
    *                        EYUTI0009I Topology warm start for    *
    *                                   <masname> initiated -      *
    *                                   APPLID(<applid>)           *
    *                                   CICSplex(<plexname>).      *
    *                                                              *
    *                        EYUWT0053W Workload Specifications    *
    *                                   cannot be removed during   *
    *                                   CMAS termination for       *
    *                                   CICSplex(<plexname>)       *
    *                                   because at least one       *
    *                                   <routing|target> region is *
    *                                   connected to the CMAS.     *
    *                                                              *
    *                        EYUWT0054I MAS <masname> is connected *
    *                                   to the CMAS as a           *
    *                                   <routing|target> region in *
    *                                   Workload(<wlmspec>) for    *
    *                                   CICSplex(<plexname>).      *
    *                                                              *
    *                      During the second terminated restart,   *
    *                      messages similar to the following will  *
    *                      be issued in the EYULOG of the CMAS:    *
    *                                                              *
    *                        EYUWM0425I Target region (<aorname>)  *
    *                                   has been terminated for    *
    *                                   Workload (<wlmspec>).      *
    *                                                              *
    *                        EYUWM0421I Routing region (<torname>) *
    *                                   has been removed from      *
    *                                   Workload (<wlmspec>).      *
    *                                                              *
    *                        EYUWM0411I Workload Specification     *
    *                                   (<wlmspec>) has been       *
    *                                   removed from this CMAS for *
    *                                   context (<plexname>).      *
    *                                                              *
    *                      If this occurs, then routes attempted   *
    *                      by any routing region for which message *
    *                      EYUWM0421I was received will fail, and  *
    *                      routes attempted to any target region   *
    *                      for which message EYUWM0425I was        *
    *                      received may fail.                      *
    *                                                              *
    *                      In either case, the failures can result *
    *                      in orphaning of CPSM WLM resources in   *
    *                      the MAS, which can result in message    *
    *                      EYUWG0106E being issued in the MAS,     *
    *                      followed by a dump.                     *
    *                                                              *
    *                      The message text will be similar to the *
    *                      following:                              *
    *                                                              *
    *                        EYUWG0106E WLM has encountered an     *
    *                                   error while attempting to  *
    *                                   release MAS resources.     *
    *                                                              *
    *                      and the dump title will be similar to   *
    *                      the following:                          *
    *                                                              *
    *                        EYU0XZSD Dump,<jobname>,<masname>,    *
    *                        <lparid>,LMAS,COWC,<tasknum>,TRAC,    *
    *                        EYU0WNLM,<date>,<time>                *
    *                                                              *
    *                      The errors with the routing and target  *
    *                      regions will continue until the CMAS is *
    *                      restarted and performs Topology Connect *
    *                      with the routing and target regions.    *
    ****************************************************************
    * RECOMMENDATION: After applying the PTF that resolves this    *
    *                 APAR, all CMASes and MASes must be restarted *
    *                 to pick up the new code.                     *
    *                                                              *
    *                 The restarts need not be performed at the    *
    *                 same time; however, if systems are not       *
    *                 restarted at the same time, the following    *
    *                 rules apply:                                 *
    *                                                              *
    *                 - Maintenance Point (MP) CMASes must be      *
    *                   restarted on the updated code before       *
    *                   non-MP CMASes.                             *
    *                                                              *
    *                 - If you have more than one MP CMAS and any  *
    *                   of those MP CMASes are connected directly  *
    *                   or indirectly, then those MP CMASes must   *
    *                   be restarted at the same time.             *
    *                                                              *
    *                 - Before a MAS is restarted with the updated *
    *                   code, the CMAS to which the MAS connects   *
    *                   must be running with the updated code.     *
    *                                                              *
    *                 - This fix is being provided across all      *
    *                   supported releases of CPSM as follows:     *
    *                                                              *
    *                   -  CPSM V4R1M0 - APAR PI75418              *
    *                   -  CPSM V4R2M0 - APAR PI75418              *
    *                   -  CPSM V5R1M0 - APAR PI76327              *
    *                   -  CPSM V5R2M0 - APAR PI76327              *
    *                   -  CPSM V5R3M0 - APAR PI76327              *
    *                                                              *
    *                   Before a CMAS running with the PTF that    *
    *                   resolves this APAR for its release         *
    *                   connects directly or indirectly to a CMAS  *
    *                   running a higher release of CPSM, the      *
    *                   higher release CMAS must be restarted so   *
    *                   that it is running with the appropriate    *
    *                   PTF for its release.                       *
    ****************************************************************
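
The restart rules in the recommendation can be checked mechanically against a planned sequence. The Python sketch below is one possible way to do that. The `Region` type, its fields, and the plan format (a list of batches, each batch being regions restarted at the same time) are assumptions made for illustration and do not correspond to any CPSM interface. It checks only two of the rules: MP CMASes before non-MP CMASes, and each MAS no earlier than its CMAS. The rule about connected MP CMASes is omitted because it needs connection topology that this sketch does not model. It also assumes that every region appears in the plan.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass(frozen=True)
class Region:
    name: str
    kind: str                   # "MP_CMAS", "CMAS" or "MAS" (hypothetical labels)
    cmas: Optional[str] = None  # for a MAS: the CMAS it connects to

def check_restart_plan(plan: List[List[Region]]) -> List[str]:
    """Return rule violations for a plan given as ordered batches of
    regions restarted together. An empty list means none were found."""
    problems = []
    step_of = {r.name: i for i, batch in enumerate(plan) for r in batch}
    regions = [r for batch in plan for r in batch]
    mp_steps = [step_of[r.name] for r in regions if r.kind == "MP_CMAS"]
    for r in regions:
        # MP CMASes must not be restarted after a non-MP CMAS.
        if r.kind == "CMAS" and mp_steps and step_of[r.name] < max(mp_steps):
            problems.append(f"{r.name}: non-MP CMAS restarted before an MP CMAS")
        # A MAS must not be restarted before the CMAS it connects to.
        if r.kind == "MAS":
            if r.cmas not in step_of or step_of[r.cmas] > step_of[r.name]:
                problems.append(f"{r.name}: restarted before its CMAS {r.cmas}")
    return problems

plan = [
    [Region("CMASMP", "MP_CMAS")],
    [Region("CMAS2", "CMAS"), Region("MAS1", "MAS", cmas="CMASMP")],
    [Region("MAS2", "MAS", cmas="CMAS2")],
]
print(check_restart_plan(plan))
```

Regions placed in the same batch are treated as restarted at the same time, which the recommendation allows without further ordering constraints.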
    When a CMAS that manages a CPSM WLM workload terminates while
    connected to MASes running as routing or target regions for the
    workload, and those MASes remain active, those routing and
    target regions should remain active in the workload.
    
    When the CMAS restarts:
    
    -  Method EYU0TIWS (TIWS) executes part one of Topology warm
       start processing. Since the Topology data spaces are retained
       over the restart due to previously connected MASes still
       being active (CPSM CMAS warm start) TIWS is able to scan the
       Topology CICS system descriptor blocks (CSDBs) in the data
       spaces as they were at CMAS termination, to determine which
       MASes were active.  For MASes that were connected to the
       CMAS, it will call the ESSS to determine if those regions are
       still active, and if so, will call method EYU0CPAM (CPAM) to
       inform the Communication component of that, issue message
       EYUTI0009I, and then build a TOPWCDTR resource record
       indicating that the MASes are in a lost state.
    
    -  These records are processed by method EYU0TIW2 (TIW2) during
       part two of Topology warm start.  It is the job of TIW2 to
       update the CSDBs with information from the TOPWCDTR records,
       including marking the CSDB status as lost connection for any
       lost MASes.  This is done so that when Topology Connect
       occurs for the MASes, they will be processed properly,
       including being marked as active.
    
    -  Additionally, the TOPWCDTR records are passed to method
       EYU0WMWS (WMWS), which performs warm start processing for the
       WLM component.  WMWS will then call either method EYU0WMAT
       (WMAT - target region termination) or EYU0WMTT (WMTT -
       routing region termination) with a status of lost for the
       MAS, which will mark the AOR descriptor (EYURWAOR) or the TOR
       descriptor (EYURWTOR) for the MAS to indicate it is in a lost
       contact state, but still active and available for WLM.
    
    However, if the first CMAS restart is terminated before
    performing Topology Connect for the lost MASes, then when the
    next restart occurs the CSDBs of still active MASes will now be
    marked as lost contact instead of active.
    
    -  TIWS will set a status of gone for the MASes in the TOPWCDTR
       records, the ESSS and CPAM will not be called, and message
       EYUTI0009I will not be issued.
    
    -  When TIW2 is called, it will propagate the gone status in the
       CSDB.
    
    -  When WMWS is called, it will call WMAT or WMTT with a status
       of gone, which will result in the MAS being removed from the
       workload, with either message EYUWM0425I (WMAT) or EYUWM0421I
       (WMTT) being issued.  When this restart of the CMAS
       terminates before performing Topology Connect with the MASes,
       since there are no local routing or target regions active in
       the workload, method EYU0WMWT (WMWT) will be called to
       terminate the workload, issuing message EYUWM0411I.
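
The two-restart sequence described above can be modelled as a small state machine. The following Python sketch is a deliberately simplified model and not CPSM code: the status names, the `warm_start` function, and the way messages are collected are illustrative assumptions. It only captures the point that a CSDB already marked as lost contact is not treated as active on the next warm start.

```python
ACTIVE, LOST, GONE = "active", "lost contact", "gone"

def warm_start(csdb_status, mas_still_running=True):
    """Model one CMAS warm start that ends before Topology Connect.
    Returns (new CSDB status, messages issued). Simplified model only."""
    messages = []
    # Part one (TIWS): only CSDBs marked active are checked for liveness.
    if csdb_status == ACTIVE and mas_still_running:
        messages.append("EYUTI0009I")
        record_status = LOST
    else:
        record_status = GONE
    # Part two (TIW2) propagates the record status to the CSDB.
    # WLM warm start (WMWS) removes a gone MAS from the workload.
    if record_status == GONE:
        messages.append("EYUWM0425I or EYUWM0421I")
    return record_status, messages

status = ACTIVE
for restart in (1, 2):
    status, msgs = warm_start(status)
    print(f"restart {restart}: CSDB={status}, messages={msgs}")
```

In this model the first restart leaves the MAS in a lost contact state, and the second restart marks it as gone even though the MAS is still running, which matches the sequence of messages described above.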
    
    If a terminated routing region is then called for a route
    request, or a terminated target region is called to handle a
    distributed route, before a further CMAS restart takes the
    MASes through Topology Connect, CICS will pass control to
    module EYU9XLOP (XLOP), which is the CPSM DTRPGM and DSRTPGM
    routing exit.  XLOP will call method EYU0WDTR (WDTR) to
    process the request. WDTR first calls method EYU0WDIN (WDIN),
    which will allocate CICS EDSA storage required for its
    processing.  It will then call method EYU0XSRA (XSRA) to acquire
    a shared lock on the workload.  Since the workload has been
    terminated, its lock has been unregistered, and XSRA will fail
    the request.  As such, WDTR will propagate the failure back to
    XLOP, which will return to CICS with a response of abort.  This
    results in CICS terminating processing for the route without
    calling CPSM again, and the storage allocated by the WDTR call
    to WDIN is orphaned.
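
The orphaning described above follows the general shape of a failure-path resource leak. The Python sketch below illustrates that general pattern and one common remedy, releasing the resource in a `finally` block. It is an analogy only: the `Pool` class and the `route_leaky` and `route_safe` functions are hypothetical, and nothing here reflects how WDTR is implemented or how its resource management will be changed.

```python
class Pool:
    """Stand-in for a storage pool; tracks outstanding allocations."""
    def __init__(self):
        self.in_use = 0

    def allocate(self):
        self.in_use += 1
        return object()

    def free(self, block):
        self.in_use -= 1

def acquire_shared_lock(lock_registered):
    # Stand-in for the lock request: fails when the lock is unregistered.
    if not lock_registered:
        raise LookupError("INVALID_RESOURCE_PTR")

def route_leaky(pool, lock_registered):
    block = pool.allocate()
    try:
        acquire_shared_lock(lock_registered)
    except LookupError:
        return "ABORT"      # early return: the block is never freed
    pool.free(block)
    return "OK"

def route_safe(pool, lock_registered):
    block = pool.allocate()
    try:
        acquire_shared_lock(lock_registered)
        return "OK"
    except LookupError:
        return "ABORT"
    finally:
        pool.free(block)    # runs on both the success and the failure path

leaky, safe = Pool(), Pool()
print(route_leaky(leaky, False), leaky.in_use)
print(route_safe(safe, False), safe.in_use)
```

Each failed call through the leaky path leaves one more allocation outstanding, which is the kind of accumulation that a later cleanup task would find still marked as in use.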
    
    Note that if a customer is calling EYU9XLOP directly for routing
    decision processing, the same type of resource orphaning can
    occur.
    
    There are two problems that result in the errors documented
    above:
    
    -  WDTR is not freeing allocated storage when it fails.
    
    -  Topology warm start terminates the workload prematurely,
       which causes subsequent calls to WDTR to fail.
    
    This APAR will address the Topology warm start problem.  This
    should minimize the possibility of WDTR failing.  WDTR resource
    management will be improved either in a subsequent APAR or
    through the development process.
    

Problem conclusion

  • There are two problems with Topology warm start processing that
    need to be addressed to allow for a CMAS restart terminating
    before the CMAS performs Topology Connect:
    
    -  TIWS only checks for active MASes.  It needs to also check
       for lost contact MASes, since that is what TIW2 would set the
       MAS state to.
    
    -  TIWS assumes that the data it needs for the CPAM call will be
       in the CSDB.  That is not the case, since what is in the CSDB
       is what a previous call to TIW2 might have set, and TIWS does
       not set all of that data into the TOPWCDTR record.
    
    To address these problems, the following changes have been made:
    
    -  The TOPWCDTR resource table has been updated to include new
       attributes to hold additional data required from the existing
       CSDBs.
    
    -  TIWS has been updated to collect that additional data and set
       it into the TOPWCDTR record.  Additionally, TIWS has been
       updated to process MASes with a CSDB status of lost contact
       exactly as it processes MASes with a CSDB status of active.
    
    -  TIW2 has been updated to update the CSDB with the additional
       data in the TOPWCDTR record.
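
The effect of these changes can be sketched as a before and after comparison. The Python model below is a simplification written for illustration, not the actual TIWS or TIW2 logic: the dictionaries standing in for the CSDB and the TOPWCDTR record, the `fixed` flag, and the `extra` key (a placeholder for the additional attributes, which are not itemized here) are all assumptions.

```python
ACTIVE, LOST, GONE = "active", "lost contact", "gone"

def tiws_record(csdb, mas_running, fixed):
    """Build a simplified TOPWCDTR-like record from a CSDB-like dict.
    With fixed=False only active CSDBs are checked; with fixed=True a
    lost contact CSDB is handled the same way as an active one."""
    eligible = (ACTIVE, LOST) if fixed else (ACTIVE,)
    if csdb["status"] in eligible and mas_running:
        record = {"status": LOST}
    else:
        record = {"status": GONE}
    if fixed:
        # Carry the additional CSDB data forward in the record.
        record["extra"] = csdb.get("extra")
    return record

def tiw2_update(csdb, record):
    """Apply the record to the CSDB, including any additional data."""
    updated = dict(csdb, status=record["status"])
    if "extra" in record:
        updated["extra"] = record["extra"]
    return updated

results = {}
for fixed in (False, True):
    csdb = {"status": ACTIVE, "extra": "connection data"}
    for _ in range(2):  # two restarts that end before Topology Connect
        csdb = tiw2_update(csdb, tiws_record(csdb, True, fixed))
    results[fixed] = csdb["status"]
    print(f"fixed={fixed}: status after two restarts = {csdb['status']}")
```

In this model, the MAS ends up gone after two terminated restarts without the change and remains in a lost contact state with it, so a later Topology Connect can still mark it as active.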
    

Temporary fix

Comments

APAR Information

  • APAR number

    PI76327

  • Reported component name

    CICS TS Z/OS V5

  • Reported component ID

    5655Y0400

  • Reported release

    80M

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2017-02-09

  • Closed date

    2017-02-14

  • Last modified date

    2017-03-02

  • APAR is sysrouted FROM one or more of the following:

    PI75418

  • APAR is sysrouted TO one or more of the following:

    UI44658 UI44659 UI44660

Modules/Macros

  • EYU0TIW2 EYU0TIWS EYU0WMWS EYUT2542 EYUY2542
    

Fix information

  • Fixed component name

    CICS TS Z/OS V5

  • Fixed component ID

    5655Y0400

Applicable component levels

  • R00M PSY UI44660

       UP17/02/20 P F702

  • R80M PSY UI44658

       UP17/02/20 P F702

  • R90M PSY UI44659

       UP17/02/20 P F702

Fix is available

  • Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.


Document Information

Modified date:
02 March 2017