Node in Error status - restart of node and gateway is not helping

I’ve run into an issue I can’t fix: the node is permanently in Error status on the Dashboard, and restarting the gateway or the node does not help.

The logs are below, with only the node IP replaced.

INFO 2020/07/27 23:45:18 All config params: [user_whitelist_file clean_period port ip_leakybucket_rate listeningaddress user_leakybucket_capacity log idfpath max_duration certpath localaddress servercertpath ip_leakybucket_capacity keypath user_leakybucket_rate permissioningcertpath messagetimeout loglevel nodeaddress ip_whitelist_file]
INFO 2020/07/27 23:45:18 config: /opt/xxnetwork/gateway.yaml
INFO 2020/07/27 23:45:18 Params:
 map[certpath:/opt/xxnetwork/creds/gateway_cert.crt clean_period:30m idfpath:/opt/xxnetwork/gateway-logs/gatewayIDF.json ip_leakybucket_capacity:4000 ip_leakybucket_rate:5e-06 ip_whitelist_file: keypath:/opt/xxnetwork/creds/gateway_key.key listeningaddress:0.0.0.0 localaddress:0.0.0.0 log:/opt/xxnetwork/gateway-logs/gateway.log loglevel:0 max_duration:15m messagetimeout:10m0s nodeaddress:[IP]:11420 permissioningcertpath:/opt/xxnetwork/creds/permissioning_cert.crt port:22840 servercertpath:/opt/xxnetwork/creds/node_cert.crt user_leakybucket_capacity:4000 user_leakybucket_rate:5e-06 user_whitelist_file:]
INFO 2020/07/27 23:45:18 Gateway port: 22840
INFO 2020/07/27 23:45:18 Gateway listen IP address: 0.0.0.0
INFO 2020/07/27 23:45:18 Gateway node: [IP]:11420
WARN 2020/07/27 23:45:18 Could not load whitelist: failed to create whitelist file : open : no such file or directory
ERROR 2020/07/27 23:45:18 Could not load initiate whitelist: Failed to read whitelist file: open : no such file or directory
INFO 2020/07/27 23:45:18 Starting server with TLS...
INFO 2020/07/27 23:45:18 Beginning polling NDF...
INFO 2020/07/27 23:45:21 Host ZHVtbXkAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAC not connected, attempting to connect...
INFO 2020/07/27 23:45:21 Connecting to [IP]:11420. Attempt number 0 of 2147483647
INFO 2020/07/27 23:45:21 Successfully connected to [IP]:11420
INFO 2020/07/27 23:45:21 Attempting to establish authentication with host ZHVtbXkAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAC
INFO 2020/07/27 23:45:21 Successfully obtained NDF!
INFO 2020/07/27 23:45:21 Shutting down gateway server listener: &{%!s(*net.netFD=&{{{1 0 0} -1 {0} <nil> 0 0 true true false} 10 1 false tcp 0xc0001c5140 <nil>}) {%!s(func(string, string, syscall.RawConn) error=<nil>) %!s(time.Duration=0)}}
INFO 2020/07/27 23:45:22 Starting server with TLS...
INFO 2020/07/27 23:45:22 Starting xx network gateway v1.3.2
INFO 2020/07/27 23:45:23 Host C40rOF3ILTN1wzcNsgn4vTgWi+V1vJm3OnzZ7rxRl7cB not connected, attempting to connect...
INFO 2020/07/27 23:45:23 Connecting to [IP]:11420. Attempt number 0 of 2147483647
INFO 2020/07/27 23:45:23 Successfully connected to [IP]:11420
INFO 2020/07/27 23:45:23 Attempting to establish authentication with host C40rOF3ILTN1wzcNsgn4vTgWi+V1vJm3OnzZ7rxRl7cB

This is the gateway log. Nothing looks strange in it, but it is not useful here.
The node log will be more helpful.

The node log is very quiet. It seems the permissioning server is silent. Dev centralization? :slight_smile:

> INFO 2020/07/27 23:44:53 Loading certificates from disk
> INFO 2020/07/27 23:44:53 Host UGVybWlzc2lvbmluZwAAAAAAAAAAAAAAAAAAAAAAAAAA not connected, attempting to connect...
> INFO 2020/07/27 23:44:53 Connecting to permissioning.prod.cmix.rip:11420. Attempt number 0 of 2147483647
> INFO 2020/07/27 23:44:53 Successfully connected to permissioning.prod.cmix.rip:11420
> INFO 2020/07/27 23:44:54 Host UGVybWlzc2lvbmluZwAAAAAAAAAAAAAAAAAAAAAAAAAA not connected, attempting to connect...
> INFO 2020/07/27 23:44:54 Connecting to permissioning.prod.cmix.rip:11420. Attempt number 0 of 2147483647
> INFO 2020/07/27 23:44:54 Successfully connected to permissioning.prod.cmix.rip:11420
> INFO 2020/07/27 23:44:54 Attempting to establish authentication with host UGVybWlzc2lvbmluZwAAAAAAAAAAAAAAAAAAAAAAAAAA
> INFO 2020/07/27 23:44:54 Adding dummy users to registry
> INFO 2020/07/27 23:44:54 Waiting on communication from gateway to continue
> INFO 2020/07/27 23:45:21 Communication from gateway received
> INFO 2020/07/27 23:45:21 Updating to WAITING
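
For anyone reading these logs: the host IDs starting with `ZHVtbXk` and `UGVybWlzc2lvbmluZw` are base64, and they decode to the ASCII labels `dummy` and `Permissioning` followed by zero bytes. Below is a minimal Python sketch for checking this yourself. The function name `describe_host_id` is made up for illustration, and the sketch only prints the final byte without interpreting it, since its meaning is not stated in this thread.

```python
import base64


def describe_host_id(host_id: str) -> str:
    """Return a readable label for a base64 host ID copied from the logs."""
    # Accept both the standard and URL-safe alphabets, and restore padding if missing.
    normalized = host_id.replace("-", "+").replace("_", "/")
    raw = base64.b64decode(normalized + "=" * (-len(normalized) % 4))
    # Split off the final byte; its meaning is not interpreted here.
    body, last = raw[:-1], raw[-1]
    text = body.rstrip(b"\x00")
    if text and all(32 <= b < 127 for b in text):
        label = text.decode("ascii")  # human-readable label such as "dummy"
    else:
        label = body.hex()  # opaque ID, shown as hex
    return f"{label} (last byte: 0x{last:02x})"
```

Passing the first host ID from the gateway log should give a label of `dummy`, and the host ID from the node log should give `Permissioning`. IDs like the one starting with `C40rOF3I` are not text and come back as hex.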

The permissioning server is not scheduling rounds for your node. It happened to my node in the early days, too. I don’t think you did anything wrong, and there is nothing you can do except wait for a fix.

What is your Node ID or application ID? I can look and see.

Sorry, I missed the message. Now, after what I think was a central server upgrade, everything works perfectly fine. My node ID is: https://dashboard.xx.network/nodes/C40rOF3ILTN1wzcNsgn4vTgWi-V1vJm3OnzZ7rxRl7cC
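
A side note on the link format: the ID in the gateway log contains `+`, while the dashboard link uses `-` in the same position, which looks like the URL-safe base64 alphabet (`+` becomes `-`, `/` becomes `_`). The sketch below builds a dashboard link from an ID on that assumption, which is based only on the one link in this thread. The helper name `dashboard_url` is hypothetical. It only swaps the alphabet; it does not explain why the ID in the log ends in `B` while the one in the link ends in `C`.

```python
# URL prefix copied from the link posted in this thread.
DASHBOARD_NODE_URL = "https://dashboard.xx.network/nodes/"


def dashboard_url(node_id: str) -> str:
    """Build a dashboard link from a node ID, assuming URL-safe base64 is expected."""
    # Map the standard base64 alphabet to the URL-safe one (RFC 4648, section 5).
    url_safe = node_id.translate(str.maketrans("+/", "-_"))
    return DASHBOARD_NODE_URL + url_safe
```

An ID that is already URL-safe passes through unchanged, so it is harmless to run the helper on an ID copied straight from a dashboard link.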

Hi,

I have the same issue right now after a reboot.
My Node ID: 9ggq4+IESrPoFhLajFj12AZKxwpGadPkpi9gxfzXCY4C

What should I do?
Thanks.