Please be sure you are looking at the correct time period. Your uptime is 84% for the month of November, and since that figure covers only the first 4-5 days, it will average out over the remaining 25 days.
I apologize that the node didn’t recover from the permissioning error and that you had to reboot your machine.
I am sorry, but I cannot just accept your remark about the errors and the apologies. The consequences fall on me. You have to adjust my figures. Why? Because you posted this message:
There is nothing you need to do. The issue is on our side with the permissioning server.
Once it has been resolved, your node will connect.
Sorry for the inconvenience.
Before the permissioning server went offline my downtime was 0%, and your message reassured me. I am the one who reported the error in the first place, and now I am being punished for it? Not acceptable.
If, at the end of November, your node does not meet the uptime requirements, please contact me and I will ensure you are not punished for the downtime you accrued due to my mistake.
What are the requirements for November? I hate having any downtime at all!
What can I do to prevent this from happening? I trusted a reassuring message from xxnetwork, but do I still have to verify it myself?
Is there a smart way to check the node.log automatically for specific terms that indicate something is really wrong, as opposed to the many messages in between that do not mean the node process has stopped but is just waiting for something, so I know when the whole process needs to be restarted?
“What can I do to prevent this from happening? I trusted a reassuring message from xxnetwork, but do I still have to verify it myself?”
We are in BetaNet and things don’t always work as expected. We have anticipated these kinds of issues, which is why we set the requirements where they are. We feel they’re reasonable and shouldn’t be too much of a burden.
That will show you the state of the processes rather than the service. A process will switch to <defunct> if there is a round failure, but the services usually do a good job of restarting the processes. If it stays <defunct> for more than a few minutes, that indicates the process has crashed.
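A minimal way to automate that check could look like the sketch below. This is not official xx network tooling; the process names "cmix" and "xxnetwork" are assumptions, so adjust PROCESS_NAMES to match your own binaries.

```python
#!/usr/bin/env python3
"""Check whether the node processes are in the <defunct> (zombie) state.

Sketch only, based on the discussion above. It assumes the node binary
name contains "cmix" or "xxnetwork"; adjust PROCESS_NAMES for your setup.
"""
import os

PROCESS_NAMES = ("cmix", "xxnetwork")  # assumed names, adjust to your binaries


def node_processes():
    """Yield (pid, name, state) for every matching process found in /proc."""
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            with open(f"/proc/{pid}/status") as f:
                info = dict(line.split(":", 1) for line in f if ":" in line)
        except (FileNotFoundError, PermissionError):
            continue  # process exited or is not readable
        name = info.get("Name", "").strip()
        state = info.get("State", "").strip()  # e.g. "S (sleeping)" or "Z (zombie)"
        if any(n in name for n in PROCESS_NAMES):
            yield int(pid), name, state


if __name__ == "__main__":
    for pid, name, state in node_processes():
        flag = "  <defunct>" if state.startswith("Z") else ""
        print(f"{pid:>7} {name:<20} {state}{flag}")
```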
Thanks for the hints and the help. As I understand it, the options are searching for FATAL, PANIC and ERROR in the logs, in addition to checking whether the process is running. Since the node process can stop and be successfully restarted by the wrapper, it is not that easy to just search for stopped states: a logged ERROR or FATAL does not mean the process could not be restarted, but you cannot be sure either. Those states appear frequently in the log; I have not found PANIC so far.
When checking directly you can just watch, but this needs to happen automatically. The idea is to make a script that checks the last 10 lines or so of the log for ERROR or FATAL, and when such a state occurs, checks again after an interval, repeats that a couple of times, and then checks whether the node process is running, because then you know the wrapper script did not restart it (see the sketch below).
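A rough sketch of that idea, assuming the log lives at ~/node.log and the binary name contains "cmix" (both are assumptions, adjust to your setup):

```python
#!/usr/bin/env python3
"""Sketch of the automatic check described above: look for ERROR/FATAL in the
last lines of node.log, recheck a few times, then verify the node process is
still running. LOG_PATH and PROCESS_NAME are assumptions, not official paths.
"""
import subprocess
import time
from collections import deque
from pathlib import Path

LOG_PATH = Path.home() / "node.log"    # assumed log location
PROCESS_NAME = "cmix"                  # assumed binary name
BAD_STATES = ("FATAL", "PANIC", "ERROR")
TAIL_LINES = 10                        # how many log lines to inspect
RECHECKS = 3                           # how often to re-test before alerting
WAIT_SECONDS = 60                      # interval between rechecks


def tail_has_bad_state() -> bool:
    """True if one of the last TAIL_LINES log lines mentions a bad state."""
    with LOG_PATH.open(errors="replace") as f:
        last_lines = deque(f, maxlen=TAIL_LINES)
    return any(state in line for line in last_lines for state in BAD_STATES)


def node_is_running() -> bool:
    """True if a process matching PROCESS_NAME exists (uses pgrep)."""
    return subprocess.run(["pgrep", "-f", PROCESS_NAME],
                          capture_output=True).returncode == 0


if __name__ == "__main__":
    strikes = 0
    for _ in range(RECHECKS):
        if tail_has_bad_state():
            strikes += 1
        time.sleep(WAIT_SECONDS)
    if strikes == RECHECKS and not node_is_running():
        print("ERROR/FATAL persisted and the node process is gone: restart needed.")
    else:
        print("Node process is running or the wrapper already restarted it.")
```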
Does a PANIC state mean the node process cannot be restarted by the wrapper script, so in that case I should just restart it myself?
What’s the average time it takes to complete the restart process once the wrapper script detects it needs to restart?