Fixes are available
8.5.5.9: WebSphere Application Server V8.5.5 Fix Pack 9
8.5.5.10: WebSphere Application Server V8.5.5 Fix Pack 10
8.5.5.11: WebSphere Application Server V8.5.5 Fix Pack 11
8.5.5.12: WebSphere Application Server V8.5.5 Fix Pack 12
8.5.5.13: WebSphere Application Server V8.5.5 Fix Pack 13
8.5.5.14: WebSphere Application Server V8.5.5 Fix Pack 14
8.5.5.15: WebSphere Application Server V8.5.5 Fix Pack 15
8.5.5.16: WebSphere Application Server V8.5.5 Fix Pack 16
8.5.5.17: WebSphere Application Server V8.5.5 Fix Pack 17
8.5.5.18: WebSphere Application Server V8.5.5 Fix Pack 18
8.5.5.19: WebSphere Application Server V8.5.5 Fix Pack 19
8.5.5.20: WebSphere Application Server V8.5.5.20
8.5.5.21: WebSphere Application Server V8.5.5.21
APAR status
Closed as program error.
Error description
When a job is stopped in the middle of executing a partitioned step, a remote partition (one running on a different server from the one on which the top-level job is executing) may wrongly end with a BatchStatus of COMPLETED, rather than the correct status of STOPPED. Although the partitioned step as a whole may re-execute on a restart of the job, the individual partition will not re-execute, since it is detected as already complete. The business logic will not execute for this partition and the partition analyzer will not receive a call to analyzeStatus() (which it would have received on completion). Note that, depending on circumstances such as the reason for which the job is being stopped, for a given partitioned step some number, 0..M, of the partitions may hit this problem while some number, 0..N, do not.
Local fix
Problem summary
****************************************************************
* USERS AFFECTED:  All users of IBM WebSphere Application      *
*                  Server Liberty Profile- Batch               *
****************************************************************
* PROBLEM DESCRIPTION: When a job is stopped some partitions   *
*                      are wrongly bypassing execution on      *
*                      restart.                                *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
When a job with a partitioned step is stopped in the middle of running that partitioned step:

Case 1.) A remote partition (running on a separate server than the top-level job is executing on) may wrongly end with a BatchStatus of COMPLETED, rather than the correct status of STOPPED. Although the partitioned step as a whole may re-run on a restart of the job, the individual partition will not re-run, since it is detected as already complete. The business logic will not run for this partition and the partition analyzer will not receive a call to analyzeStatus() (which it would have received on completion). Note that due to various circumstances such as the reason for which the job is being stopped, it is possible that for a given partitioned step, some number, 0..M of the partitions will hit this problem while some number, 0..N do not.

Case 2.) If at least one partition has a BatchStatus and another partition in that step does not (it has not been started yet) prior to the job being stopped, a restart of that job will not perform properly. In this case when a restart of the job is performed only the partitions that have a BatchStatus that is not COMPLETED will be run (the partitions that have no BatchStatus are skipped). This is due to the fact that a partition's information only gets persisted in the database once it reaches a BatchStatus of STARTED. Note that if none of the partitions have a BatchStatus then the job will restart properly and all the partitions will run as desired.
Problem conclusion
Case 1 has been fixed by a change that sets the step-level status to STOPPING (ultimately STOPPED), so that a partition is no longer wrongly left with a BatchStatus of COMPLETED. This ensures that if a STOP command is issued against a top-level job with partitions running remotely, the remote partitions are set to STOPPED as well.

Case 2 has been fixed by querying the database only for partitions that have a BatchStatus of COMPLETED, instead of looking for the partitions that do not. The completed partitions are then removed from the full list of partitions (the list that would be used if none of the partitions had been started yet). The resulting list is used to perform a proper restart and ensures that every partition that should run does run.

The fix for this APAR is currently targeted for inclusion in fix pack 8.5.5.9. Please refer to the Recommended Updates page for delivery information: http://www.ibm.com/support/docview.wss?rs=180&uid=swg27004980
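The Case 2 selection logic can be illustrated with a minimal Java sketch. This is not the Liberty implementation: the class name, the method names, and the use of a plain Map as a stand-in for the persisted partition table are all assumptions made for illustration. The sketch only contrasts selecting "persisted and not COMPLETED" partitions with selecting "the full partition list minus the COMPLETED ones".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; none of these names come from the Liberty code base.
public class PartitionRestartSketch {

    // Pre-fix behaviour as described in the APAR: pick only partitions that have a
    // persisted status other than COMPLETED. Partitions that never reached STARTED
    // have no persisted row, so they are silently skipped.
    static List<Integer> selectByNotCompleted(Map<Integer, String> persisted) {
        List<Integer> toRun = new ArrayList<>();
        // TreeMap is used only to get a stable, ascending iteration order.
        for (Map.Entry<Integer, String> e : new TreeMap<>(persisted).entrySet()) {
            if (!"COMPLETED".equals(e.getValue())) {
                toRun.add(e.getKey());
            }
        }
        return toRun;
    }

    // Post-fix behaviour as described in the APAR: start from the full list of
    // partitions and remove the ones persisted as COMPLETED.
    static List<Integer> selectByRemovingCompleted(int partitionCount,
                                                   Map<Integer, String> persisted) {
        List<Integer> toRun = new ArrayList<>();
        for (int i = 0; i < partitionCount; i++) {
            toRun.add(i);
        }
        List<Integer> completed = new ArrayList<>();
        for (Map.Entry<Integer, String> e : persisted.entrySet()) {
            if ("COMPLETED".equals(e.getValue())) {
                completed.add(e.getKey());
            }
        }
        toRun.removeAll(completed);
        return toRun;
    }

    public static void main(String[] args) {
        // Four partitions (0..3). Partition 0 completed, partition 1 was stopped,
        // partitions 2 and 3 had not started, so nothing was persisted for them.
        Map<Integer, String> persisted = new HashMap<>();
        persisted.put(0, "COMPLETED");
        persisted.put(1, "STOPPED");

        System.out.println(selectByNotCompleted(persisted));         // prints [1]
        System.out.println(selectByRemovingCompleted(4, persisted)); // prints [1, 2, 3]
    }
}
```

In this example the first selection restarts only partition 1 and skips partitions 2 and 3, which matches the symptom described for Case 2. The second selection restarts every partition that has not completed.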
Temporary fix
Comments
APAR Information
APAR number
PI57100
Reported component name
WAS LIBERTY COR
Reported component ID
5725L2900
Reported release
855
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2016-02-11
Closed date
2016-02-12
Last modified date
2016-02-12
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
WAS LIBERTY COR
Fixed component ID
5725L2900
Applicable component levels
R855 PSY
UP
Document Information
Modified date:
28 April 2022