IBM Support

PM71339: JOBS REMAIN IN SUBMITTED STATE UNTIL ENDPOINTS ARE RECYCLED.

APAR status

  • Closed as program error.

Error description

  • Jobs in the customer environment sometimes remain in the
    submitted state, even though job logs or other log files
    indicate that they should have reached some other state (ended,
    restartable, etc.).  To clear the issue, the customer has to
    interrupt service by recycling the endpoints, the job scheduler,
    or sometimes even the whole cell.
    

Local fix

  • Until the fix is applied, the customer can recycle the endpoints
    to clear the issue.
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:  All users of WebSphere Extended Deployment  *
    *                  Compute Grid Version 8                      *
    ****************************************************************
    * PROBLEM DESCRIPTION: Problem initializing job log            *
    *                      processing has side effect leading to   *
    *                      jobs stuck in submitted state.          *
    *                      A second problem is that jobs that      *
    *                      failed before their first step ran      *
    *                      were, on restart, going                 *
    *                      right to "ended" state without          *
    *                      executing any of the jobs' steps.       *
    ****************************************************************
    * RECOMMENDATION:                                              *
    ****************************************************************
    A problem initializing job log handling for a given job was
    handled incorrectly by the runtime on the endpoint server.  As
    a result, the job status was not communicated and managed
    correctly downstream.  Rather than appearing to have failed and
    been put into "restartable" state, the job appears to be stuck
    in "submitted" state.  (The job is still in "submitted" state at
    that point because job log initialization happens early in the
    job lifecycle.)

    A second problem, found at the same time, occurs when a job
    fails and is put into "restartable" state before its first step
    is ever executed.  The problem appears when the job is
    restarted.  According to the external job status, as seen for
    example in the job management console, the restarted job
    appears to execute successfully and finish in "ended" state.
    However, none of the steps comprising the job have actually
    executed, because of a flaw in the restart execution logic of
    the batch container.
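
The first problem can be pictured with a small, language-neutral sketch of the status transition involved. It is not the product's code: the function name `start_job`, the `init_job_log` callback, and the "executing" state are assumptions introduced only for illustration. It shows a job log initialization failure being reported as "restartable" rather than leaving the job in "submitted" state.

```python
from enum import Enum


class JobState(Enum):
    SUBMITTED = "submitted"
    EXECUTING = "executing"      # assumed name for the state after startup
    RESTARTABLE = "restartable"


def start_job(init_job_log):
    """Hypothetical startup path: return the state reported for the job.

    init_job_log is a callable standing in for job log initialization,
    which happens while the job is still in the submitted state.
    """
    try:
        init_job_log()
    except Exception:
        # Report the failure instead of swallowing it, so the job does
        # not appear to be stuck in the submitted state.
        return JobState.RESTARTABLE
    return JobState.EXECUTING
```

If the exception were swallowed without updating the status, the externally visible state would stay at `JobState.SUBMITTED`, which matches the symptom described above.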
    

Problem conclusion

  • Job log initialization error handling has been tightened so
    that a failure puts the job into "restartable" state.  The bug
    in restart execution has been fixed so that, upon restart, a
    job that failed before its first step was attempted resumes
    execution at the first step.
    The fix for this APAR is currently targeted for inclusion in
    fixpack 8.0.0.3. Please refer to the Recommended Updates page
    for delivery information:
    http://www.ibm.com/support/docview.wss?uid=swg27022998
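
The restart behavior described in this conclusion can be sketched as follows. This is a minimal illustration under stated assumptions, not the batch container's implementation: `JobRecord`, `last_completed_step`, and `steps_to_run_on_restart` are hypothetical names, and the source does not say how the container records step progress.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class JobRecord:
    steps: List[str]
    # Index of the last step that completed; None means the job failed
    # before its first step was attempted (assumed representation).
    last_completed_step: Optional[int] = None


def steps_to_run_on_restart(job: JobRecord) -> List[str]:
    """Return the steps a restarted job should still execute."""
    if job.last_completed_step is None:
        # Failed before the first step: resume at the first step rather
        # than treating the job as having nothing left to run.
        start = 0
    else:
        start = job.last_completed_step + 1
    return job.steps[start:]
```

For a three-step job that failed before its first step, `steps_to_run_on_restart` returns all three steps, so the job cannot reach "ended" state without executing them.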
    

Temporary fix

  • An interim fix is available upon request.
    

Comments

APAR Information

  • APAR number

    PM71339

  • Reported component name

    WXD COMPUTE GRI

  • Reported component ID

    5725C9301

  • Reported release

    800

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2012-08-22

  • Closed date

    2013-01-02

  • Last modified date

    2013-01-02

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    PM79241

Fix information

  • Fixed component name

    WXD COMPUTE GRI

  • Fixed component ID

    5725C9301

Applicable component levels

  • R800 PSY   UP

Document Information

Modified date:
29 October 2021