IBM Support

PI34012: Collective controller unable to establish a TCP connection with its replicas


APAR status

  • Closed as program error.

Error description

  • In some instances, an inter-replica communication failure is
    not handled properly, leaving a replica unable to
    re-establish its connection to another replica.
    
    The messages.log should contain the following messages:
    
    E CWWKX6020E: A collective controller internal error
    occurred: java.nio.channels.CancelledKeyException. The
    replica needs to be restarted.
    E CWWKX6008E: The collective controller is unavailable,
    probably due to a loss of majority of the replica set, or a
    communications failure. Current active replica set is [...].
    The configured replica set is [...]. The connected standby
    replicas are [...].
    E CWWKX6001W: The collective controller is unable to
    establish a TCP connection or communicate with replica ...
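
    To check whether a given messages.log shows this pattern, the three message IDs above can be searched for programmatically. The following is a minimal sketch, not an IBM-supplied tool: the function names are hypothetical, and it assumes each message appears on a single line in the real log (the line wrapping above is only for display).

```python
import re
from collections import Counter

# Message IDs quoted in this APAR's error description.
APAR_MESSAGE_IDS = ("CWWKX6020E", "CWWKX6008E", "CWWKX6001W")
_ID_PATTERN = re.compile("|".join(APAR_MESSAGE_IDS))


def count_apar_messages(lines):
    """Count how often each APAR message ID appears in the given log lines."""
    counts = Counter()
    for line in lines:
        for message_id in _ID_PATTERN.findall(line):
            counts[message_id] += 1
    return counts


def matches_apar_signature(lines):
    """Heuristic (an assumption, not an official test): report True when a
    CWWKX6020E line also names java.nio.channels.CancelledKeyException."""
    return any(
        "CWWKX6020E" in line and "CancelledKeyException" in line
        for line in lines
    )
```

    A typical use would be to read the log into a list first, for example with `lines = open(path, encoding="utf-8", errors="replace").read().splitlines()`, where `path` points at the replica's messages.log, and then pass `lines` to both functions. A match only suggests this APAR; other failures can produce CWWKX6008E and CWWKX6001W as well.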
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED:  All users of IBM WebSphere Application      *
    *                  Server Liberty Profile collectives.         *
    ****************************************************************
    * PROBLEM DESCRIPTION: Collective controllers in a multi-      *
    *                      replica replica set are unable to       *
    *                      establish a TCP connection with each    *
    *                      other                                   *
    ****************************************************************
    * RECOMMENDATION:                                              *
    ****************************************************************
    In some instances, an inter-replica communication failure is not
    handled properly, leaving a replica unable to re-establish its
    connection to another replica.
    
    The messages.log will contain the following messages:
    
    E CWWKX6020E: A collective controller internal error occurred:
    java.nio.channels.CancelledKeyException. The replica needs to be
    restarted.
    E CWWKX6008E: The collective controller is unavailable, probably
    due to a loss of majority of the replica set, or a
    communications failure. Current active replica set is [...]. The
    configured replica set is [...]. The connected standby replicas
    are [...].
    E CWWKX6001W: The collective controller is unable to establish a
    TCP connection or communicate with replica ...
    

Problem conclusion

  • The inter-replica communication code was improved to gracefully
    handle and recover from this failure.
    
    The fix for this APAR is currently targeted for inclusion in fix
    pack 8.5.5.5.  Please refer to the Recommended Updates page for
    delivery information:
    http://www.ibm.com/support/docview.wss?rs=180&uid=swg27004980
    

Temporary fix

  • Stop and then restart the failing replica.
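
    The stop-and-restart step can be scripted around the Liberty `server` command. The sketch below is illustrative only: `wlp_home` (the Liberty installation directory) and `server_name` (the failing replica's server name) are placeholders you must supply, the helper names are hypothetical, and it assumes a UNIX-like layout where the script is `bin/server` (on Windows it is `server.bat`).

```python
import subprocess
from pathlib import Path


def restart_commands(wlp_home, server_name):
    """Build the stop and start command lines for one Liberty server."""
    server_cmd = str(Path(wlp_home) / "bin" / "server")
    return [
        [server_cmd, "stop", server_name],
        [server_cmd, "start", server_name],
    ]


def restart_replica(wlp_home, server_name):
    """Run stop, then start. check=True raises CalledProcessError if either
    command exits with a non-zero return code."""
    for command in restart_commands(wlp_home, server_name):
        subprocess.run(command, check=True)
```

    Building the command lists separately from running them makes it possible to review or log exactly what will be executed before touching a controller replica.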
    

Comments

APAR Information

  • APAR number

    PI34012

  • Reported component name

    WAS LIBERTY COR

  • Reported component ID

    5725L2900

  • Reported release

    855

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2015-02-03

  • Closed date

    2015-02-06

  • Last modified date

    2015-02-06

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    WAS LIBERTY COR

  • Fixed component ID

    5725L2900

Applicable component levels

  • R855 PSY

       UP


Document Information

Modified date:
28 April 2022