PM71339: JOBS REMAIN IN SUBMITTED STATE UNTIL ENDPOINTS ARE RECYCLED.

Fixes are available

8.0.0.3: WebSphere Extended Deployment Compute Grid V8.0 Fix Pack 3
8.0.0.4: WebSphere Extended Deployment Compute Grid V8.0 Fix Pack 4
8.0.0.5: WebSphere Extended Deployment Compute Grid V8.0 Fix Pack 5

APAR status

Closed as program error.

Error description

Jobs in a customer environment sometimes remain in the submitted
state, despite indications from job logs or other log files that
they should have reached some other state (ended, restartable,
etc.).  To clear the issue, the customer has to interrupt service
by recycling the endpoints, the job scheduler, or sometimes even
the cell.

Local fix

As a workaround, the customer can recycle the endpoints to clear
the issue.

Problem summary

****************************************************************
* USERS AFFECTED:  All users of WebSphere Extended Deployment  *
*                  Compute Grid Version 8                      *
****************************************************************
* PROBLEM DESCRIPTION: Problem initializing job log            *
*                      processing has side effect leading to   *
*                      jobs stuck in submitted state.          *
*                      A second problem is that jobs that      *
*                      failed before their first step ran      *
*                      were, on restart, going right to        *
*                      "ended" state without executing any     *
*                      of the jobs' steps.                     *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
A problem initializing job log handling for a given job was
handled incorrectly by the runtime on the endpoint server.  This
led to a downstream problem in which the job status was not
communicated and managed correctly.  Rather than appearing to
have failed and been put into "restartable" state, the job
appears to be stuck in "submitted" state (job log initialization
happens early in the job lifecycle, so the job has not yet left
"submitted" state).

A second problem, found at the same time, occurs when a job fails
and is put into "restartable" state before the first step is ever
executed.  The problem appears when the job is restarted.
According to the external job status, as seen for example in the
job management console, the restarted job appears to execute
successfully and finish in "ended" state.  However, none of the
steps comprising the job will actually have executed, due to a
flaw in the logic used by the batch container in restart
execution.
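
The intended restart behavior described above can be illustrated
with a minimal sketch.  This is not the batch container's code:
the function names, the step names, and the idea of tracking a
"last completed" step are all assumptions made for illustration.
It only shows that a restart with no recorded step should begin
at the first step instead of finishing with nothing executed.

```python
from typing import List, Optional


def first_step_to_run(steps: List[str], last_completed: Optional[str]) -> int:
    """Return the index of the step at which a restarted job should resume.

    Hypothetical model: `last_completed` is None when the job failed
    before any step ran.
    """
    if last_completed is None:
        # No step ever executed, so the restart must begin at the first
        # step rather than treating the job as having nothing left to do.
        return 0
    # Otherwise resume at the step after the last one that completed.
    return steps.index(last_completed) + 1


def run_restart(steps: List[str], last_completed: Optional[str]) -> List[str]:
    """Return the names of the steps a restart would execute, in order."""
    return steps[first_step_to_run(steps, last_completed):]
```

With this model, a job that failed before its first step runs
every step on restart, while a job whose last step completed has
nothing left to run.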

Problem conclusion

Job log initialization error handling has been tightened so that
a failure puts the job into restartable state.  The bug in
restart execution has been fixed so that, upon restart, a job
that failed before its first step was attempted resumes
execution at the first step.
The fix for this APAR is currently targeted for inclusion in
fixpack 8.0.0.3. Please refer to the Recommended Updates page
for delivery information:
http://www.ibm.com/support/docview.wss?uid=swg27022998
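
The tightened error handling can be pictured with the following
sketch.  It is a simplified model and not the product's
implementation: the JobState enum, the start_job function, and
the two callbacks are hypothetical names, and the "executing"
state is assumed here as the state that follows "submitted".
The point is only that a job log initialization failure is
reported as a job failure instead of leaving the job in
"submitted" state.

```python
from enum import Enum
from typing import Callable


class JobState(Enum):
    SUBMITTED = "submitted"
    EXECUTING = "executing"      # assumed name for the running state
    RESTARTABLE = "restartable"


def start_job(init_job_log: Callable[[], None],
              report_state: Callable[[JobState], None]) -> JobState:
    """Initialize job logging, then report the resulting job state."""
    try:
        init_job_log()
    except Exception:
        # A logging setup failure is reported as a job failure so the
        # scheduler does not keep showing the job as "submitted".
        report_state(JobState.RESTARTABLE)
        return JobState.RESTARTABLE
    report_state(JobState.EXECUTING)
    return JobState.EXECUTING
```

In this model the state is always reported to the caller, whether
or not job log initialization succeeds, so the externally visible
status cannot stay at "submitted" after a failure.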

Temporary fix

An interim fix is available upon request.

Comments

APAR Information

APAR number: PM71339
Reported component name: WXD COMPUTE GRI
Reported component ID: 5725C9301
Reported release: 800
Status: CLOSED PER
PE: NoPE
HIPER: NoHIPER
Special Attention: NoSpecatt
Submitted date: 2012-08-22
Closed date: 2013-01-02
Last modified date: 2013-01-02

APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:

PM79241

Fix information

Fixed component name: WXD COMPUTE GRI
Fixed component ID: 5725C9301

Applicable component levels

R800 PSY
UP


Document Information

Modified date:
29 October 2021
