What if I want the cluster to be less sensitive to network latency, especially over a WAN?
Tungsten Clustering supports having clusters at multiple sites with active-active replication meshing them together.
This is extraordinarily powerful, yet high network latency can at times delay messages between the sites.
Evidence of this appears in the Manager log file, tmsvc.log:
2018/07/08 16:51:05 | db3 | INFO [Rule_0604$u58$_DETECT_UNREACHABLE_REMOTE_SERVICE1555959201] - CONSEQUENCE: [Sun Jul 08 16:51:04 UTC 2018] CLUSTER global/omega(state=UNREACHABLE)
...
2018/07/08 16:51:42 | db3 | INFO [Rule_2025$u58$_REPORT_COMPONENT_STATE_TRANSITIONS1542395297] - CLUSTER 'omega@global' STATE TRANSITION UNREACHABLE => ONLINE
In the above example, 37 seconds elapse between state=UNREACHABLE and the UNREACHABLE => ONLINE transition.
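As a quick check of that delta, here is a minimal Python sketch that subtracts the two timestamps copied from the log lines above. The helper name `gap_seconds` is made up for illustration, and the timestamp format is assumed from the two sample lines only.

```python
from datetime import datetime

# Timestamp layout assumed from the sample tmsvc.log lines above.
LOG_TS_FORMAT = "%Y/%m/%d %H:%M:%S"


def gap_seconds(start: str, end: str) -> int:
    """Return the whole seconds between two log timestamps."""
    started = datetime.strptime(start, LOG_TS_FORMAT)
    ended = datetime.strptime(end, LOG_TS_FORMAT)
    return int((ended - started).total_seconds())


# Timestamps of the UNREACHABLE line and the UNREACHABLE => ONLINE line.
print(gap_seconds("2018/07/08 16:51:05", "2018/07/08 16:51:42"))  # 37
```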
The default timeout is 60 seconds.
If the delay above were longer than 60 seconds, one site would shun the other, and traffic would be blocked by the Connector proxy to the remote site.
This timeout may be tuned to be longer, however.
It is controlled by the policy.remote.service.shun.threshold setting, and the default value is 6.
Whatever this property is set to is multiplied by 10 seconds to come up with the final interval, so 60 seconds by default.
To pick a new value, find every such gap in the logs and work out how long each one lasted in seconds. Take the peak value, add 10 seconds as a buffer, then divide by 10. Keep the integer part of the result (round down) and you have your new value.
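The arithmetic can be sketched in a few lines of Python. The function name and the sample gap values below are hypothetical, chosen only to illustrate the calculation; substitute the gaps measured from your own logs.

```python
def new_shun_threshold(gaps_seconds, buffer_seconds=10, step_seconds=10):
    """Peak gap plus buffer, divided by the 10-second step, rounded down."""
    peak = max(gaps_seconds)
    return (peak + buffer_seconds) // step_seconds


# Hypothetical gaps, in seconds, measured between UNREACHABLE and ONLINE.
gaps = [37, 72, 58]

threshold = new_shun_threshold(gaps)
print(threshold)       # 8
print(threshold * 10)  # 80, the resulting timeout in seconds
print(f"property=policy.remote.service.shun.threshold={threshold}")
```

With only the 37-second gap from the example log, the same calculation gives 4, which is below the default of 6, so presumably the default would already cover that case.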
Add property=policy.remote.service.shun.threshold={new_value} to your tungsten.ini and update your clusters!
This tuning will provide better cluster stability via insulation from high latency network link timeouts.
Questions? Contact us.