Setting up crontab to automatically restart services

If you have problems with your node stopping or going into an error state, you can use crontab to restart your services automatically :slight_smile:

Here’s a guide for setting it up. It automatically restarts the node and gateway services every 15 minutes.

Update system
sudo apt update

Install cron (this provides the crontab command)
sudo apt install cron

Enable Service
sudo systemctl enable cron

Verify if the cron service is running
sudo systemctl status cron

Configure cron jobs (this edits root’s crontab)
sudo crontab -e

Put this in the file and save it

# ┌───────────── minute (0 - 59)
# │ ┌───────────── hour (0 - 23)
# │ │ ┌───────────── day of the month (1 - 31)
# │ │ │ ┌───────────── month (1 - 12)
# │ │ │ │ ┌───────────── day of the week (0 - 6) (Sunday to Saturday;
# │ │ │ │ │                                   7 is also Sunday on some systems)
# │ │ │ │ │
# │ │ │ │ │
# * * * * * <command to execute>

# 15min Node Service Restart
*/15 * * * * systemctl restart xxnetwork-node.service

# 15min Gateway Service Restart
*/15 * * * * systemctl restart xxnetwork-gateway.service

Check that the file is correct
sudo crontab -l

Such frequent restarts may well increase the node and network failure rates (the proportion of “timeout” rounds), as many of them will interrupt normal round processing. Note that the “error” status affects neither your nominal uptime nor your failure rate.
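One way to limit such interruptions, sketched here under assumptions, is to restart a unit only when systemd no longer reports it as active, rather than restarting it unconditionally. The function name `restart_if_inactive` is hypothetical, and the sketch assumes it runs as root, as it would from the root crontab used in the guide. It only catches a stopped or failed unit; a process that is still running but no longer processing rounds would not be detected.

```shell
#!/bin/sh
# Sketch: restart a systemd unit only when systemd reports it as not active.
# restart_if_inactive is a hypothetical helper name, not part of any xx tooling.
restart_if_inactive() {
  unit="$1"
  # `systemctl is-active --quiet` exits 0 only when the unit is active.
  if ! systemctl is-active --quiet "$unit"; then
    systemctl restart "$unit"
  fi
}
```

The same check fits on a single crontab line, for example `*/15 * * * * systemctl is-active --quiet xxnetwork-node.service || systemctl restart xxnetwork-node.service`, so a healthy node is left alone mid-round.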

Hi, good workaround, but I think we have to be patient. We are in the early beta period, and I prefer to wait for updates from the xx team. :face_with_monocle:

What would be the recommended timeframe for restarts?

By the way, I notice that the gateway and node logs seem to have stopped completely, yet the dashboard shows a lot of SUCCESS and a couple of TIMEOUT rounds, so the services must be doing something right. Has only the logging stopped?

No, after the ERROR status no more rounds are processed by the node; that’s exactly what the ERROR status means.

I wouldn’t recommend any timeframe; it’s a way to hide the issue. But if you still want to do it, use a longer interval, like a few hours.
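As an illustration of a longer interval, the entries from the guide could be changed as follows. The 6-hour value is only an assumption standing in for “a few hours”; nobody in this thread has recommended a specific number.

```
# Restart every 6 hours, on the hour (00:00, 06:00, 12:00, 18:00).
# The leading 0 matters: "* */6 * * *" would fire every minute of those hours.
0 */6 * * * systemctl restart xxnetwork-node.service
0 */6 * * * systemctl restart xxnetwork-gateway.service
```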

Right, if we all configure frequent automatic restarts, it’ll become harder for the team to analyze the problem and test potential solutions.

Well, if I see the “updating to ERROR” message in the logs, the node service is restarted by the xxwrapper script and carries on, with more rounds executed after that.

The situation I am describing is one where nothing is written to the logs, but the server seems to work, as the dashboard registers SUCCESS and some TIMEOUT rounds.

I do not restart the services on the node or gateway server with cron.

I guess we have to get the xx team’s opinion on using cron and, if they approve, on what schedule.

The reason I would like to use it is so that I don’t have to reset the service every day, since my node sometimes gets stuck and shows as offline in the dashboard. I talked with Ben and Keith about it. They say it’s not the node operator’s problem, but a fault in the software.

Sure, you want to keep the services running, and if cron is your only friend right now, you have to stick with it. Hopefully the root cause of the problem will be found and solved.

From Keith, aka LordVetinari, on Discord, on automatic service restarts:

“Every time you kill your node mid-round you cause an error in two other nodes. FYI.”
“For testing it really is better to let the node run its course even if it means running aground. We’d much rather see unrecoverable errors than recoverable ones which are handled by the wrapper script.”

So the team discourages automatically restarting services with cron.
