Guidelines on how to handle the recent network instability

Today, as those slow nodes were disabled, it seems other changes were made to the network (or did it get overloaded?).

Whatever the reason, now a lot of nodes are in “error” state (what does that even mean?) while some still manage to hang onto their 100% uptime.

A couple of hours ago only 15-16 nodes were online, some for several hours.

What should a node owner do if their node ends up in “error” state? Reboot? Restart service? Wait for it to recover on its own?

You can restart the services to resume processing rounds and get back “online”. But the node may keep reverting to “error” until the team fixes the problems caused by yesterday’s updates.

Today I checked my node and it was in the error state.
I rebooted and it was OK, but a few hours later it is in the error state again.

Right, so rinse & repeat until you see your version updated (if auto-update is ON as per the handbook) as lonewolf suggested.

The yellow ERROR indicates that a node has not participated in a round in the last 5 minutes.
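
To make the 5-minute rule concrete, here is a minimal sketch of the check as described above. It is not the dashboard's actual code: the function name `is_in_error` and the idea of feeding it a "last round" timestamp are assumptions for illustration only.

```python
from datetime import datetime, timedelta, timezone

# Threshold taken from the description above: no round in 5 minutes.
ERROR_THRESHOLD = timedelta(minutes=5)

def is_in_error(last_round_at, now=None):
    """Return True if the last round is older than the threshold.

    `last_round_at` is a hypothetical timezone-aware timestamp of the
    node's most recent round; how you obtain it depends on your setup.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_round_at > ERROR_THRESHOLD
```

For example, a node whose last round was 6 minutes ago would be flagged, while one that took part 4 minutes ago would not.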

I’ve received varying reports from people, but one common report is that the node process doesn’t crash, so it is never restarted automatically.

We are trying to determine why this is happening.

In the meantime, one remedy for not participating in rounds is to restart the node and gateway services. However, as stated earlier, this is not expected to be a solution to the ERROR state.
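
For anyone who wants to script the restart, a rough sketch follows. It assumes the services run under systemd, and the unit names `node.service` and `gateway.service` are placeholders, not the real names; substitute whatever your install uses. By default it only prints the commands (dry run).

```python
import subprocess

# Hypothetical systemd unit names; replace with the ones on your machine.
SERVICES = ["node.service", "gateway.service"]

def restart_services(services=SERVICES, dry_run=True):
    """Restart each service in order; with dry_run=True, only print."""
    commands = [["systemctl", "restart", name] for name in services]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError if a restart fails.
            subprocess.run(cmd, check=True)
    return commands
```

Call `restart_services(dry_run=False)` with sufficient privileges to actually restart. As noted above, this only gets the node back into rounds for a while; it does not fix the underlying ERROR issue.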