Setting up crontab to automatically restart services

Eyez · July 25, 2020, 10:46am

If your node keeps stopping or going into the ERROR state, you can use crontab to restart your services automatically.

Here’s a guide for setting it up. It restarts the node and gateway services automatically every 15 minutes.

Update system
sudo apt update

Install cron
sudo apt install cron

Enable Service
sudo systemctl enable cron

Verify that the cron service is running
sudo systemctl status cron

Configure the cron jobs (with sudo, this edits root’s crontab)
sudo crontab -e

Put this in the file and save it

# ┌───────────── minute (0 - 59)
# │ ┌───────────── hour (0 - 23)
# │ │ ┌───────────── day of the month (1 - 31)
# │ │ │ ┌───────────── month (1 - 12)
# │ │ │ │ ┌───────────── day of the week (0 - 6) (Sunday to Saturday;
# │ │ │ │ │                                   7 is also Sunday on some systems)
# │ │ │ │ │
# │ │ │ │ │
# * * * * * <command to execute>

# 15min Node Service Restart
*/15 * * * * systemctl restart xxnetwork-node.service

# 15min Gateway Service Restart
*/15 * * * * systemctl restart xxnetwork-gateway.service

Check that the crontab was saved correctly
sudo crontab -l

lonewolf · July 25, 2020, 5:56pm

Such frequent restarts may increase the node and network failure rates (the proportion of “timeout” rounds), since many of them will interrupt normal round processing. Note that the “error” status affects neither your nominal uptime nor your failure rate.

fmlopes · July 25, 2020, 10:17pm

Hi, good workaround, but I think we have to be patient. We are in the early beta days, so I prefer waiting for updates from the xx team.

Mash · July 26, 2020, 6:15am

What would be the recommended timeframe for restarts?

By the way, in the gateway and node logs it looks as if everything has stopped, but the dashboard shows a lot of SUCCESS rounds and a couple of TIMEOUT rounds. So the services must still be doing something; has only the logging stopped?

alexdupre · July 26, 2020, 7:05am

No, after the ERROR status the node processes no more rounds; that’s exactly what the ERROR status means.

I wouldn’t recommend any timeframe; it’s a way to hide the issue. But if you still want to do it, use a longer interval, like a few hours.
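As a rough sketch of what "a few hours" could look like, the helper below prints crontab entries for an N-hour interval. The function name restart_entry and the 6-hour value are assumptions chosen for illustration; nobody in the thread recommended a specific number.

```shell
#!/bin/sh
# Print a crontab entry that restarts a systemd unit every N hours.
# N should divide 24 evenly (1, 2, 3, 4, 6, 8, 12) so the gaps stay uniform.
# restart_entry is a hypothetical helper, not part of any xx network tooling.
restart_entry() {
    hours=$1
    unit=$2
    # Minute 0 of every Nth hour; same step syntax as the */15 in the guide.
    printf '0 */%s * * * systemctl restart %s\n' "$hours" "$unit"
}

# 6 hours is an assumed example value.
restart_entry 6 xxnetwork-node.service
restart_entry 6 xxnetwork-gateway.service
```

The printed lines can be pasted into the file opened by sudo crontab -e in place of the */15 entries.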

lonewolf · July 26, 2020, 7:19am

Right, if we all configure frequent automatic restarts then it’ll become harder for the team to analyze the problem and test potential solutions.

Mash · July 26, 2020, 8:11am

Well, when I see the “updating to ERROR” message in the logs, the node service is restarted by the xxwrapper script and carries on, with more rounds executed after that.

The situation I’m describing is one where nothing is written to the logs, but the server seems to work, as the dashboard registers SUCCESS and some TIMEOUT rounds.

I do not restart the services on the node or gateway server with cron.
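To tell "logging has stopped" apart from "logging is just quiet", one option is to check when the log file was last written. This is only a sketch: log_is_fresh is a made-up helper name, the 30-minute threshold is arbitrary, and -mmin is a GNU/BSD find extension rather than POSIX.

```shell
#!/bin/sh
# Succeed if FILE was modified within the last MINUTES minutes.
# log_is_fresh is a hypothetical helper for illustration only.
log_is_fresh() {
    file=$1
    minutes=$2
    # find prints the path only when the modification time is recent enough;
    # a missing file produces no output, so it counts as not fresh.
    [ -n "$(find "$file" -mmin "-$minutes" 2>/dev/null)" ]
}
```

For example, log_is_fresh /path/to/node.log 30 || echo "no log output for 30 minutes", where /path/to/node.log stands in for wherever your node actually writes its log.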

Eyez · July 26, 2020, 8:42am

I guess we have to get the xx team’s opinion on using cron and, if they approve, on what schedule.

The reason I would like to use it is so that I don’t have to reset the service every day, since my node sometimes gets stuck and shows as offline in the dashboard. I talked with Ben and Keith about it. They say it’s not the node operator’s problem, but a fault in the software.
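A gentler variant of the cron job would restart a service only when systemd reports it as not running, instead of restarting unconditionally. The sketch below assumes that is an acceptable signal. It will not catch a node that is stuck while its process is still alive, which may be exactly the case described here. restart_if_inactive is a hypothetical name.

```shell
#!/bin/sh
# Restart a systemd unit only when systemd does not report it as active.
# restart_if_inactive is a hypothetical helper for illustration only.
restart_if_inactive() {
    unit=$1
    # is-active --quiet exits 0 when the unit is active, non-zero otherwise.
    if systemctl is-active --quiet "$unit"; then
        echo "$unit is active, leaving it alone"
    else
        echo "$unit is not active, restarting"
        systemctl restart "$unit"
    fi
}
```

Saved in a script that ends with calls such as restart_if_inactive xxnetwork-node.service, this could be run from root’s crontab in place of the plain systemctl restart lines.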

Mash · July 26, 2020, 12:25pm

Sure, you want to keep the services running, and if cron is your only friend right now, you have to stick with it. Hopefully the root cause of the problem will be found and fixed.

lonewolf · July 27, 2020, 11:29am

From Keith, aka LordVetinari, on Discord, about automatic service restarts:

“Every time you kill your node mid-round you cause an error in two other nodes. FYI.”
“For testing it really is better to let the node run its course even if it means running aground. We’d much rather see unrecoverable errors than recoverable ones which are handled by the wrapper script.”

So the team discourages automatically restarting services with cron.