Alex
For the last few days, one of my servers has been marked as "Server connection failed", and I wonder why and how. The server is operating just fine, but for the last 3 days I've received the "Server connection failed" notification at 04:01.
I would like to propose a few improvements (if not implemented already):
- A retry: try again 5 minutes later instead of immediately marking the server as offline
- A recurring retry: try again a few hours later to see whether the problem fixed itself, and send a "Server connection restored" notification (it makes sense to stop retrying after a day or so, of course)
- Update the notification subject to: "Server connection failed:
Hope any of the above sounds like an improvement to the server connection handling and saves me from having to step in manually every few days for some transient error 😉
Dennis
The flow for testing connections has changed a little. When a server is completely unreachable, it looks like this:
- Test connection: detect failure -> wait 30 seconds for next attempt
- Test connection: detect failure -> wait 60 seconds for next attempt
- Test connection: detect failure -> wait 120 seconds for next attempt
- Test connection: detect failure -> wait 300 seconds for next attempt
- Send an email that the connection failed (with the server name in the subject)
When the server becomes reachable again somewhere in between, the flow looks like this:
- Test connection: detect failure -> wait 30 seconds for next attempt
- Test connection: detect failure -> wait 60 seconds for next attempt
- Test connection: detect connection possible
- Send an email that the connection has been restored (with the server name in the subject)
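For anyone who wants to reason about the timing, the two flows above can be sketched roughly as follows. This is only an illustration, not the panel's actual code: `check_server`, `test_connection`, `notify`, and the exact subject strings are hypothetical names, and it assumes the fifth and final attempt runs right after the 300 second wait and that no email is sent when the very first test succeeds.

```python
import time
from typing import Callable, Sequence

# Wait times in seconds between attempts, taken from the flow described above.
BACKOFF_SECONDS = (30, 60, 120, 300)


def check_server(
    test_connection: Callable[[], bool],   # hypothetical: returns True when reachable
    notify: Callable[[str], None],         # hypothetical: sends an email with this subject
    server_name: str,
    waits: Sequence[int] = BACKOFF_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Test the connection up to len(waits) + 1 times; return True if reachable."""
    failed_before = False
    for attempt in range(len(waits) + 1):
        if test_connection():
            # Only announce a restore if an earlier attempt in this run failed.
            if failed_before:
                notify(f"Server connection restored: {server_name}")
            return True
        failed_before = True
        # Wait before the next attempt; there is no wait after the last one.
        if attempt < len(waits):
            sleep(waits[attempt])
    # Every attempt failed: send the failure email (subject wording is assumed).
    notify(f"Server connection failed: {server_name}")
    return False
```

With these waits, a server has to stay unreachable for at least 30 + 60 + 120 + 300 = 510 seconds (8.5 minutes) before the failure email goes out, which should cover most short transient errors.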
It will test the connection at most 5 times. We also show this in the connection card on the settings page.
Server connection failed improvements
- Dennis moved item to board Live (3 days ago)
- Dennis moved item to board In progress (3 days ago)
- Dennis moved item to board Planned (1 year ago)
- Alex moved item to project Panel Requests (1 year ago)
- Alex created the item (1 year ago)