IBM Support

PI16658: COLLECTOR DOESN'T WORK ON RESTART WHEN SUBJOBS HADN'T YET BEEN SUBMITTED ON ORIGINAL EXECUTION.

APAR status

  • Closed as program error.

Error description

  • Collector doesn't work on restart when subjobs hadn't yet been
    submitted on original execution.
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED:  Users of WebSphere Extended Deployment      *
    *                  Compute Grid 8.0 and the batch function of  *
    *                  WebSphere Application Server who use the    *
    *                  parallel job manager function.              *
    ****************************************************************
    * PROBLEM DESCRIPTION: The application's SubJobAnalyzer is     *
    *                      not successfully invoked from the       *
    *                      application's SubJobCollector on a      *
    *                      restart execution for a subjob that     *
    *                      hadn't been successfully submitted      *
    *                      and dispatched.                         *
    ****************************************************************
    * RECOMMENDATION:                                              *
    ****************************************************************
    When a top-level job is restarted, it may be dispatched to a
    different endpoint (cluster member) than on the original
    execution.  If one or more subjobs were not fully submitted and
    dispatched during the original execution, and the restarted
    top-level job is dispatched to a different endpoint, the call
    from the collector to the analyzer fails.
    That is, on a restart execution, the application's
    SubJobCollector does not successfully invoke the application's
    SubJobAnalyzer for a subjob that was not submitted and
    dispatched on the original execution.
    From the perspective of the relevant subjob(s), messages such
    as the following may appear in the subjob endpoint server
    SystemOut logs:
    [1/23/14 7:51:16:731 EDT] 0000003e MBeanCollecto E   Subjob MailerSample:000133:000136 failed to call collector SPI due to null
    [1/23/14 7:51:16:934 EDT] 0000003c MBeanCollecto I   Unable to create JMX AdminClient to WebSphere:*,type=ParallelJobManagerMBean,node=MyNode,process=MyEndpoint
    From the top-level job's perspective, the SubJobAnalyzer is
    simply never called.
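
    To check whether a subjob endpoint server is showing these symptoms, its SystemOut log can be scanned for the two messages quoted above. The following is a minimal sketch, not an IBM-supplied tool: the function name `find_symptoms` and the result labels are hypothetical, and the patterns are derived only from the sample messages in this APAR, so the wording at other service levels may differ.

```python
import re

# Patterns derived from the two sample messages in this APAR.
# Assumption: the message wording is the same in the logs being scanned.
COLLECTOR_FAILURE = re.compile(
    r"Subjob\s+(?P<subjob>\S+)\s+failed to call collector SPI"
)
ADMIN_CLIENT_FAILURE = re.compile(
    r"Unable to create JMX AdminClient to\s+(?P<target>\S+)"
)


def find_symptoms(lines):
    """Return a list of (kind, detail) tuples, one per matching log line.

    kind is a hypothetical label chosen for this sketch; detail is the
    subjob ID or the JMX target taken from the matched message.
    """
    hits = []
    for line in lines:
        match = COLLECTOR_FAILURE.search(line)
        if match:
            hits.append(("collector_spi_failure", match.group("subjob")))
            continue
        match = ADMIN_CLIENT_FAILURE.search(line)
        if match:
            hits.append(("admin_client_failure", match.group("target")))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python scan_systemout.py /path/to/SystemOut.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for kind, detail in find_symptoms(log):
            print(kind, detail)
```

    A match on either pattern is only an indicator. The same messages could appear for unrelated reasons, so the restart and dispatch history of the top-level job should still be checked.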
    

Problem conclusion

  • The code was fixed to correctly identify the top-level job
    endpoint location in this scenario.
    
    APAR PI16658 is currently targeted for inclusion in Service
    Level (Fix Pack) 8.0.0.4 of WebSphere Compute Grid 8.0.
    
    Please refer to the Recommended Updates page for delivery
    information:
    http://www.ibm.com/support/docview.wss?uid=swg27022998
    

Temporary fix

  • An interim fix is available upon request.
    

Comments

APAR Information

  • APAR number

    PI16658

  • Reported component name

    WXD COMPUTE GRI

  • Reported component ID

    5725C9301

  • Reported release

    800

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2014-04-24

  • Closed date

    2014-06-06

  • Last modified date

    2014-06-06

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    WXD COMPUTE GRI

  • Fixed component ID

    5725C9301

Applicable component levels

  • R800 PSY

       UP


Document Information

Modified date:
28 April 2022