IBM Support

PK16787: HIGH CPU IN DFHNQIB AND DFHNQNQ DURING TRANSACTION BACKOUT WHEN TASK HAS LOTS OF NQEAS. NO OTHER TASKS CAN RUN ON QR.

A fix is available

APAR status

  • Closed as program error.

Error description

  • A task has updated many thousands of records in a dataset, all
    within the same unit of work.  The task abends and needs to back
    out all of that work.  A problem with the dataset makes backout
    impossible, so CICS needs to retain the record locks on all of
    those records.  Retaining the locks takes a lot of CPU, and no
    other tasks can run while it is in progress.  One hundred
    thousand or so retained locks could cause the task to dominate
    the QR TCB for minutes.
        The high CPU shows up in DFHNQIB and in DFHNQNQ.  The
    code that is driving the loop is the fcca_retain_dataset_locks
    code in DFHFCNQ.
        There needs to be a means of suspending the task
    periodically as it browses through the thousands of NQEA control
    blocks so that other tasks can also run.
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All                                          *
    ****************************************************************
    * PROBLEM DESCRIPTION: Very many NQEAs prevent CICS from       *
    *                      multitasking during backout failure     *
    *                      processing.                             *
    ****************************************************************
    * RECOMMENDATION:                                              *
    ****************************************************************
    The customer had a batch-style transaction performing hundreds
    of thousands of recoverable updates to files. Each update led
    to an NQEA being acquired to represent the record enqueue. The
    transaction then abended and tried to back out the updates.
    Backout of the dataset failed because the dataset could not be
    extended. This meant that DFHFCNQ was invoked to retain the
    non-RLS record locks for the files.
      When control entered fcnq_mark_non_rls_locks_for_retention,
    DFHFCNQ had to invoke the Enqueue Domain to return each enqueue
    for the UOW. This involved a loop of nqib_get_next_enqueue
    operations. For each NQEA returned, an nqnq_deactivate call
    was then issued if it was associated with a record lock or ESDS
    write lock, and its dataset had suffered the backout failure.
    Both of these calls to the Enqueue Domain involved chaining
    through the many hundreds of thousands of NQEAs for the UOW.
    The loop never invoked CICS dispatcher services, so the QR TCB
    was unable to subdispatch other tasks in the system. As a
    result, the operator cancelled the File Owning Region after
    excessive time and CPU usage had elapsed.
    KEYWORDS: CSMI FOR NQ NQs AZI6 CPU
    

Problem conclusion

  • DFHFCNQ has been changed to issue a dsat_change_priority call
    after every hundred iterations through the loop of
    nqib_get_next_enqueue calls. This allows the CICS Dispatcher to
    subdispatch other tasks within the system, as appropriate.
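
The effect of the change can be illustrated with a small cooperative-scheduling sketch. This is not CICS code: Python generators stand in for tasks sharing one TCB, a plain `yield` stands in for the dsat_change_priority call, and every name below (`retain_locks`, `short_task`, `run_round_robin`, `YIELD_INTERVAL`, the dataset names) is hypothetical. Only the interval of one hundred iterations comes from the APAR text.

```python
from collections import deque

YIELD_INTERVAL = 100  # "every hundred iterations", per the APAR text


def retain_locks(enqueues, failed_datasets, retained):
    """Hypothetical long-running task: walk every enqueue for one unit of work."""
    for count, (dataset, record_key) in enumerate(enqueues, start=1):
        if dataset in failed_datasets:
            retained.append((dataset, record_key))  # stand-in for retaining a lock
        if count % YIELD_INTERVAL == 0:
            yield  # stand-in for giving the dispatcher a chance to run other tasks


def short_task(name, log, retained):
    """Hypothetical short task: records how far the long task had got when it ran."""
    log.append((name, len(retained)))
    yield


def run_round_robin(tasks):
    """Toy single-threaded dispatcher: resume each ready task in turn."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished, do not requeue it
        ready.append(task)


# 250 enqueues on a dataset whose backout failed, 50 on a healthy one.
enqueues = [("FILEA", i) for i in range(250)] + [("FILEB", i) for i in range(50)]
retained, log = [], []
run_round_robin([
    retain_locks(enqueues, {"FILEA"}, retained),
    short_task("TASK2", log, retained),
])
print(log, len(retained))
```

In this sketch the short task gets control after the first hundred iterations of the long loop instead of waiting for all of them. Without the periodic `yield`, the toy dispatcher would not resume any other task until the whole walk had completed, which is the behaviour the APAR describes on the QR TCB.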
    

Temporary fix

  • FIX AVAILABLE BY PTF ONLY
    

Comments

APAR Information

  • APAR number

    PK16787

  • Reported component name

    CICSTS 3.1 Z/OS

  • Reported component ID

    5655M1500

  • Reported release

    400

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2005-12-15

  • Closed date

    2006-01-04

  • Last modified date

    2006-02-02

  • APAR is sysrouted FROM one or more of the following:

    PK14578

  • APAR is sysrouted TO one or more of the following:

    UK10481

Modules/Macros

  •    DESFCNQ  DFHFCNQ
    

Fix information

  • Fixed component name

    CICSTS 3.1 Z/OS

  • Fixed component ID

    5655M1500

Applicable component levels

  • R400 PSY UK10481

       UP06/01/06 P F601

Fix is available

  • Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.

Document Information

Modified date:
02 February 2006