Gateway Stats for February 2025

Hi All –

Gateway statistics for February 2025 are here:

https://pad.carback.us/drive/#/2/drive/view/-OmT+xmwm78g8P07ss9NLwXe2ZRHjZrxWlu1AvLm0k8/

You only need to look at the latency.grpc and latency.websocket columns. A value of -1 means your gateway failed the test; any positive number means it passed.
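
For anyone who prefers to check the csv with a script, here is a minimal sketch of that rule. The column names latency.grpc and latency.websocket are the ones from the table; the "gateway" column and the sample rows are made up for illustration, so the lookup simply searches every cell for your ip or dns name.

```python
import csv
import io

LATENCY_COLUMNS = ("latency.grpc", "latency.websocket")

def gateway_status(csv_text, needle):
    """Return pass/fail per latency column for every row that mentions `needle`."""
    results = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Search all cells, because the name of the address column is an assumption.
        # Note: this is a substring match, so use the full ip or dns name.
        if not any(needle in (value or "") for value in row.values()):
            continue
        status = {}
        for column in LATENCY_COLUMNS:
            value = float(row[column])
            if value == -1:
                status[column] = "failed"
            elif value > 0:
                status[column] = "passed"
            else:
                status[column] = "unknown"
        results.append(status)
    return results

# Hypothetical sample data, not taken from the real statistics file.
SAMPLE = """gateway,latency.grpc,latency.websocket
gw1.example.org,120,-1
203.0.113.7,95,210
"""

print(gateway_status(SAMPLE, "gw1.example.org"))
```

With the sample above, the first gateway passes the grpc test and fails the websocket test.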

The highest achievable score is 80. Ignore this metric for now: it is used for an internal consistency check and is not a relevant criterion at this time.

Main takeaways:

  1. Some nodes are not exposing the proper ports to the internet, at least as seen from the test machine we used, which is located in the US.
  2. A large number of nodes are accessible but do not have websocket certificates, set up either manually or via the automated foundation-run zerossl system. This means they are inaccessible to haven and other web clients except via the built-in node proxying.
  3. Most nodes have pretty good stats otherwise.

We are looking at a solution to collate this data from different test machines across the internet to eliminate #1 above.
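
As a rough illustration of what collating could look like (this is only a sketch, not the foundation's tool, and the machine names and data layout are hypothetical): a gateway would be flagged only if it fails from every test machine, which separates regional blocking from a genuinely closed port.

```python
def collate(results_by_machine):
    """Combine per-machine results into one verdict per gateway.

    results_by_machine maps a test machine name to {gateway: latency},
    where a latency of -1 means the test failed from that machine.
    """
    summary = {}
    for machine, results in results_by_machine.items():
        for gateway, latency in results.items():
            entry = summary.setdefault(gateway, {"passed": [], "failed": []})
            entry["failed" if latency == -1 else "passed"].append(machine)

    verdicts = {}
    for gateway, entry in summary.items():
        if not entry["passed"]:
            verdicts[gateway] = "unreachable from every test machine"
        elif entry["failed"]:
            verdicts[gateway] = "reachable from some test machines"
        else:
            verdicts[gateway] = "reachable"
    return verdicts

# Hypothetical results from two test machines.
example = {
    "us-test": {"gw-a": 120, "gw-b": -1, "gw-c": -1},
    "eu-test": {"gw-a": 80, "gw-b": 95, "gw-c": -1},
}
print(collate(example))
```

In this made-up example only gw-c fails everywhere; gw-b fails from the US machine alone, which points to blocking rather than a misconfigured gateway.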

If you are in category 1 AND not in a country that blocks all US machines, then you have a problem that needs to be resolved now. Your node’s gateway is not accessible from an open machine on the internet.

You can check your connectivity with the following command:

openssl s_client -alpn h2 -showcerts -connect "[gateway ip address]:22840"

Here [gateway ip address] is the gateway's IP address or DNS hostname, and 22840 is the port number hosting the gateway service.

To run this command, you will need to have openssl installed. If you are running a Mac or a Linux machine, it should be installed already. Otherwise, search online for installation instructions. When you run this command, you should see something like:

% openssl s_client -alpn h2 -showcerts -connect "[gateway ip address]:22840"
Connecting to 194.163.163.77
CONNECTED(00000003)
Can't use SSL_get_servername
depth=0 C=PT, ST=PT, L=Porto, O=xxnetwork, OU=nodes, CN=xx.network, [email protected]
verify error:num=18:self-signed certificate
verify return:1
depth=0 C=PT, ST=PT, L=Porto, O=xxnetwork, OU=nodes, CN=xx.network, [email protected]
verify return:1
---
...

If the command produces no output after 5 seconds, your configuration is broken. Most likely you have installed a firewall or are not forwarding ports properly to and from the internet. Please ask for support in the forum or discord.
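
If you would rather script the check, the following is a minimal sketch in Python that attempts the same kind of TLS handshake with the h2 ALPN protocol and a 5 second timeout. It is an approximation of the openssl command, not the test the statistics tool runs; the function name and return strings are made up for illustration. Certificate verification is turned off because gateways present self-signed certificates, so this only tests reachability.

```python
import socket
import ssl

def probe_gateway(host, port=22840, timeout=5.0):
    """Attempt a TLS handshake with ALPN h2 and report what happened."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Self-signed certificates are expected here, so skip verification.
    # check_hostname must be disabled before verify_mode can be CERT_NONE.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    context.set_alpn_protocols(["h2"])
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw) as tls:
                return "reachable (ALPN: %s)" % tls.selected_alpn_protocol()
    except socket.timeout:
        # No answer within the timeout: typically a firewall dropping packets.
        return "timeout"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except (ssl.SSLError, OSError) as err:
        return "error: %s" % err
```

Call it as probe_gateway("[gateway ip address]") with your own address. A "timeout" result corresponds to the no-output case described above, while "refused" means the machine is reachable but the port is closed or not forwarded.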

If you are in category 1, the foundation will no longer allow your node to use the staking tool starting in Q2. The foundation expects to warn nodes for which we have contact information by the end of Q1 and to start banning nodes in Q2. We will also be raising the issue of low-performing and low-connectivity nodes in council to decide what else can be done (slashing, banning, etc).

If you are in category 2 then you do not need to do anything at this time. We will post more information on how to fix this at a later date once we have analyzed the root cause of why some nodes are not working.

For developers, you can download and run the node stats tool yourself here:

NOTE: Non-developers should not bother attempting to run this tool. Download the csv or preview it directly and search for your gateway’s ip or dns name.

More scrutiny is welcome.

We will also be raising this issue of low-performing and low-connectivity nodes in council to decide what else can be done (slashing, banning, etc).

Years ago there were ideas about penalizing realtime failures on mixing nodes.
It may be more laborious and costly to track gateways, but we could run lambda functions on AWS or CF to probe gateways from time to time. It might cost less to run such probes on some “compute” focused blockchains (IC, for example).


The Gateway statistics table for my node shows latency.websocket = -1, which means the test failed.
But when executing the command openssl s_client -alpn h2 -showcerts -connect "gate_IP:22840" the result is displayed immediately and is the same as in the example:

CONNECTED(00000003)
Can't use SSL_get_servername
depth=0 C = KY, ST = " ", L = George Town, O = xxnetwork, OU = x0d, CN = xx.network, emailAddress = [email protected]
verify error:num=18:self-signed certificate
verify return:1
depth=0 C = KY, ST = " ", L = George Town, O = xxnetwork, OU = x0d, CN = xx.network, emailAddress = [email protected]
verify return:1
---
Certificate chain
 0 s:C = KY, ST = " ", L = George Town, O = xxnetwork, OU = x0d, CN = xx.network, emailAddress = [email protected]
   i:C = KY, ST = " ", L = George Town, O = xxnetwork, OU = x0d, CN = xx.network, emailAddress = [email protected]
   a:PKEY: rsaEncryption, 4096 (bit); sigalg: RSA-SHA256
   v:NotBefore: Oct 26 17:39:54 2023 GMT; NotAfter: Oct 23 17:39:54 2033 GMT

To check, I ran this command against another node that has a positive latency.websocket value, and the output is the same as above.
But if I run the command against [REDACTED]:22840, the result from the different machines available to me is as follows:

40973B309A7F0000:error:8000006F:system library:BIO_connect:Connection refused:../crypto/bio/bio_sock2.c:125:calling connect()
40973B309A7F0000:error:10000067:BIO routines:BIO_connect:connect error:../crypto/bio/bio_sock2.c:127:
connect:errno=111

In the Gateway statistics table, latency.websocket for [REDACTED] is also -1, the same as for my gateway. Yet the openssl -showcerts command displays certificate data for my gateway and fails to connect for [REDACTED].
I also compared two gateways on machines available to me, with the same configuration and settings: one has latency.websocket = -1 while the other has a positive value, and for both the openssl -showcerts command returns certificate data immediately, without an error.
How can I accurately determine whether there is a problem with websocket?

You need to ensure the node/gateway is ONLINE.

As far as I understand, there is no reason to worry. Rare failures do happen, so it may simply be a coincidence that the statistics were collected while the node was temporarily offline.