IBM Support

PM90474: WSWS3713E: CONNECTION TO THE REMOTE HOST FAILED ERRORS IN SCHEDULER LOGS.


APAR status

  • Closed as program error.

Error description

  • The Scheduler purges the job on the endpoint as part of
    purgeJob processing when it reclaims job numbers. Under certain
    conditions, the job is purged from the job status table
    but not from the global job ID assignment table.
    
    When the scheduler then attempts to reclaim the job number, it
    cannot find any removable jobs in the job status store, which
    results in a NullPointerException.
    
    The following exception is logged in the scheduler job logs:
    
    WebServicesFault
     faultCode: {http://schemas.xmlsoap.org/soap/envelope/}Server.generalException
     faultString: WSWS3713E: Connection to the remote host ibmtest failed. Received the following error: java.net.ConnectException: Connection refused
     faultActor: null
     faultDetail:
    
    WSWS3713E: Connection to the remote host ibmtest failed. Received the following error: java.net.ConnectException: Connection refused
     at com.ibm.ws.webservices.engine.transport.http.HttpOutboundChannelConnection.connect(HttpOutboundChannelConnection.java:965)
    
    ...
    
     at com.ibm.ws.batch.BatchGridDiscriminatorSoapBindingStub.purgeJob(BatchGridDiscriminatorSoapBindingStub.java:428)
     at com.ibm.ws.batch.BatchGridDiscriminatorProxy.purgeJob(BatchGridDiscriminatorProxy.java:137)
     at com.ibm.ws.batch.SchedulerSingleton.invoke61PlusEndpoint(SchedulerSingleton.java:5705)
     at com.ibm.ws.batch.SchedulerSingleton.invokeEndpoint(SchedulerSingleton.java:5592)
     at com.ibm.ws.batch.SchedulerSingleton.endpointPurge(SchedulerSingleton.java:3538)
     at com.ibm.ws.batch.SchedulerSingleton.purgeJob(SchedulerSingleton.java:3483)
     at com.ibm.ws.batch.SchedulerSingleton.getJobNumber(SchedulerSingleton.java:1787)
     at com.ibm.ws.batch.SchedulerSingleton.privateReserveJobNumberBlock(SchedulerSingleton.java:13939)
     at com.ibm.ws.batch.SchedulerSingleton.privateReserveJobNumberStringBlock(SchedulerSingleton.java:13969)
     at com.ibm.ws.batch.JobSchedulerBean.reserveJobNumberBlock(JobSchedulerBean.java:346)
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED:  All users of WebSphere Compute Grid         *
    *                  Version 8.                                  *
    ****************************************************************
    * PROBLEM DESCRIPTION: Jobs remain in executing state          *
    *                      indefinitely when job numbers have      *
    *                      wrapped and need to be reclaimed.       *
    ****************************************************************
    * RECOMMENDATION:                                              *
    ****************************************************************
    The logs show numerous WSWS3713E: Connection refused messages
    from the endpoint purge that runs as part of purgeJob
    processing when job numbers are reclaimed. The code did not
    handle this failure correctly: the job was purged from the
    job status table but not from the global job ID assignment
    table. As a result, when an attempt was made to reclaim the
    job number, no removable jobs could be found in the job
    status store, resulting in a NullPointerException.
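
The failure mode described above can be illustrated with a small, self-contained model. This is only a sketch: the class, method, and field names are hypothetical and are not taken from the Compute Grid code base, and the ordering of the two table updates is an assumption made to reproduce the described symptom.

```java
import java.net.ConnectException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the failure mode; names are illustrative only.
class JobTablesSketch {
    // Stand-in for the job status table: job number -> job status.
    final Map<Integer, String> jobStatusTable = new HashMap<>();
    // Stand-in for the global job ID assignment table: job number -> job ID.
    final Map<Integer, String> jobIdAssignmentTable = new HashMap<>();

    // Records a finished job in both tables.
    void submit(int jobNumber, String jobId) {
        jobStatusTable.put(jobNumber, "ended");
        jobIdAssignmentTable.put(jobNumber, jobId);
    }

    // Simulates an endpoint that refuses the connection.
    void endpointPurge(String jobId) throws ConnectException {
        throw new ConnectException("Connection refused");
    }

    // Assumed pre-fix ordering: the status row is removed first, and the
    // connection error prevents the assignment row from being removed.
    void purgeJob(int jobNumber) {
        String jobId = jobIdAssignmentTable.get(jobNumber);
        jobStatusTable.remove(jobNumber);
        try {
            endpointPurge(jobId);
            jobIdAssignmentTable.remove(jobNumber);
        } catch (ConnectException e) {
            // Error swallowed here: the two tables are now out of sync.
        }
    }

    // Reclaiming looks for a removable job in the status table. For the
    // orphaned job number the lookup returns null and the dereference fails.
    boolean reclaimHitsNullPointer(int jobNumber) {
        try {
            String status = jobStatusTable.get(jobNumber);
            return status.isEmpty();
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        JobTablesSketch tables = new JobTablesSketch();
        tables.submit(1, "sampleJob:00001");
        tables.purgeJob(1);
        System.out.println("status row present:     " + tables.jobStatusTable.containsKey(1));
        System.out.println("assignment row present: " + tables.jobIdAssignmentTable.containsKey(1));
        System.out.println("reclaim hits NPE:       " + tables.reclaimHitsNullPointer(1));
    }
}
```

In this model the job number stays assigned after the failed purge, so a later reclaim attempt finds the number in use but no matching status row.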
    

Problem conclusion

  • For jobs in the ended, cancelled, or execution failed state,
    the information has already been cleaned up on the endpoint,
    so there is no need to issue the purgeEndpoint call, which was
    failing because of the connection error.
    The fix for this APAR is currently targeted for inclusion in
    fix pack 8.0.0.3.
    Refer to the Recommended Updates page for delivery
    information:
    http://www.ibm.com/support/docview.wss?uid=swg27022998
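
The decision described in the conclusion can be sketched as follows. This is a minimal illustration under stated assumptions, not the shipped fix: the enum constants, the `Endpoint` interface, and the table layout are hypothetical, and the handling of jobs in other states (leave both tables untouched when the endpoint is unreachable) is an assumption that the APAR text does not cover.

```java
import java.net.ConnectException;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of skipping the endpoint purge for finished jobs.
class PurgeFixSketch {
    enum JobState { SUBMITTED, EXECUTING, ENDED, CANCELLED, EXECUTION_FAILED }

    // States for which the endpoint has already cleaned up the job.
    static final Set<JobState> CLEANED_ON_ENDPOINT =
            EnumSet.of(JobState.ENDED, JobState.CANCELLED, JobState.EXECUTION_FAILED);

    // Hypothetical stand-in for the remote endpoint purge call.
    interface Endpoint {
        void purge(String jobId) throws ConnectException;
    }

    // Stand-ins for the job status table and the job ID assignment table.
    final Map<String, JobState> jobStatusTable = new HashMap<>();
    final Map<String, Integer> jobIdAssignmentTable = new HashMap<>();

    void register(String jobId, int jobNumber, JobState state) {
        jobStatusTable.put(jobId, state);
        jobIdAssignmentTable.put(jobId, jobNumber);
    }

    // Returns true when the job was removed from both tables.
    boolean purgeJob(String jobId, Endpoint endpoint) {
        JobState state = jobStatusTable.get(jobId);
        if (state == null) {
            return false; // unknown job, nothing to purge
        }
        if (!CLEANED_ON_ENDPOINT.contains(state)) {
            try {
                endpoint.purge(jobId);
            } catch (ConnectException e) {
                // Assumption: keep both tables intact so they stay consistent.
                return false;
            }
        }
        jobStatusTable.remove(jobId);
        jobIdAssignmentTable.remove(jobId);
        return true;
    }

    public static void main(String[] args) {
        Endpoint unreachable = jobId -> {
            throw new ConnectException("Connection refused");
        };
        PurgeFixSketch scheduler = new PurgeFixSketch();
        scheduler.register("jobA", 1, JobState.ENDED);
        scheduler.register("jobB", 2, JobState.EXECUTING);
        System.out.println("jobA purged: " + scheduler.purgeJob("jobA", unreachable));
        System.out.println("jobB purged: " + scheduler.purgeJob("jobB", unreachable));
    }
}
```

Because the ended job never triggers the remote call, an unreachable endpoint no longer leaves its job number stranded in the assignment table.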
    

Temporary fix

Comments

APAR Information

  • APAR number

    PM90474

  • Reported component name

    WXD COMPUTE GRI

  • Reported component ID

    5725C9301

  • Reported release

    800

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2013-06-05

  • Closed date

    2013-08-13

  • Last modified date

    2013-08-13

  • APAR is sysrouted FROM one or more of the following:

    PM78641

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    WXD COMPUTE GRI

  • Fixed component ID

    5725C9301

Applicable component levels

  • R800 PSY

       UP


Document Information

Modified date:
28 April 2022