Node went offline shortly before the gateway 1.7.0 update today and is still down after it

Since 12:23 UTC my node has been down.

I see that there was an update after that (around 2pm UTC):

[INFO] 15-Jan-21: Executing command: {'command': 'update', 'info': {'path': 'bin/gateway-1.7.0.binary', 

But I don’t know what happened before that. Now my node is down; it seems it cannot connect to permissioning. I restarted the services, with no change. I see the network is doing fine (4K TPS), so it seems like it’s just me. I haven’t touched the server in over a week, since I last updated some minor packages with apt-update.

I don’t think it’s been eliminated due to insufficient h/w specs or latency (also I didn’t get any notices).

Edit: here’s what I’m seeing in node logs:

INFO 2021/01/16 Connecting to Attempt number 0 of 4294967295
DEBUG 2021/01/16 Timing out in: 2s
INFO 2021/01/16 Successfully connected to
INFO 2021/01/16 Attempting to establish authentication with host Uxxxxxxxxxxx
INFO 2021/01/16 Shutting down node server listener: ...
ERROR 2021/01/16 Cannot start, permissioning is unavailable, retrying in 10s...

Any idea on what to do next?

Go to the Discord channel and ask for help. Ask Keith; he’s the guy to help.

It took me 15 min to figure out where the XX Network Discord is and recover my password, and by the time I was logged in and ready to ask a question, I looked at the tail output and saw the node was behaving normally again. Whew!

But it worries me that this can happen to isolated nodes. In the worst case it can make a node drop below the minimum uptime requirement for the month.

I used to think this scenario wasn’t possible because network problems always affect a bunch of nodes at once, but it’d be a nightmare to prove it. For example (I took this screenshot moments ago), whatever this was, it impacted only a handful of nodes at most. If you end up being the only guy who complains that it was the network and not you… That thought makes me nervous! :sweat_smile:
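
One rough way to collect evidence for a case like this is to tally the retry messages in the node log. Below is a minimal sketch, assuming the error text and the "retrying in 10s" wording match the excerpt quoted earlier in the thread. The function name `estimate_downtime_seconds` and the sample lines are made up for illustration, and the result is only an approximation of time spent in the retry loop, not an official uptime figure.

```python
import re

# Assumption: the message format matches the log excerpt quoted in this thread.
RETRY_PATTERN = re.compile(r"permissioning is unavailable, retrying in (\d+)s")


def estimate_downtime_seconds(log_lines):
    """Sum the retry waits announced by each 'permissioning is unavailable' error."""
    total = 0
    for line in log_lines:
        match = RETRY_PATTERN.search(line)
        if match:
            # Each matching line announces how long the node waits before retrying.
            total += int(match.group(1))
    return total


# Hypothetical sample lines modelled on the excerpt above.
sample = [
    "INFO 2021/01/16 Attempting to establish authentication with host Uxxxxxxxxxxx",
    "ERROR 2021/01/16 Cannot start, permissioning is unavailable, retrying in 10s...",
    "ERROR 2021/01/16 Cannot start, permissioning is unavailable, retrying in 10s...",
]
print(estimate_downtime_seconds(sample))
```

To run it against a real log, pass an open file object instead of the sample list. The path to the node log depends on your setup, so it is left out here.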