Latency from where?

Regarding these measurements and the latency figures published in recent weeks, I wonder how objective they really are.

Where is the latency measured from?

Today I ran a ping test (10,000 pings) from the system where my XX Network node is currently running to a hyperscaler DC about 1,000 km away.

Count             : 10000
Average           : 2.30209
Sum               : 23020.9
Maximum           : 37.5
Minimum           : 1.49
StandardDeviation : 1.48813973399234
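
The summary above looks like PowerShell’s Measure-Object output, with all values in milliseconds. For anyone collecting raw RTT samples another way, here is a minimal Python sketch (the sample values are hypothetical; substitute the RTTs from your own ping run) that reproduces the same summary statistics:

```python
import statistics

# Hypothetical round-trip times in milliseconds; replace with your own samples.
rtts_ms = [2.1, 1.9, 2.4, 3.0, 1.5]

summary = {
    "Count": len(rtts_ms),
    "Average": statistics.fmean(rtts_ms),
    "Sum": sum(rtts_ms),
    "Maximum": max(rtts_ms),
    "Minimum": min(rtts_ms),
    # Sample standard deviation (n - 1 divisor).
    "StandardDeviation": statistics.stdev(rtts_ms),
}

for name, value in summary.items():
    print(f"{name:<18}: {value}")
```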

This is much better than the test results published by the project.

I had wondered about this before, but node latency has become a big topic in recent weeks, so perhaps someone can shed some light on how, and from where, the project measures latency.

If what you have done is a ping between your node and the permissioning server, then this has almost nothing to do with the (realtime) duration of rounds, which is mainly a function of the latency and bandwidth of all three nodes in the team (and also of the latency of the permissioning server, but only very marginally).
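
To make that concrete, here is a deliberately simplified model (my own illustration in Python, not the project’s actual round accounting) of why a single slow or distant team member dominates a round’s realtime duration far more than the permissioning server does:

```python
# Illustrative model only: all numbers and the formula are assumptions
# made for this example, not the project's real scheduler math.

def round_realtime_ms(team_latency_ms, team_bandwidth_mbps,
                      payload_mb, perm_latency_ms):
    """Rough lower bound for a round's realtime phase.

    The batch passes through every node in the team, so each node's
    link latency is paid at least once and the transfer time sums over
    nodes, while the permissioning server adds only a small constant.
    """
    propagation = sum(team_latency_ms)
    transfer = sum(payload_mb * 8 / bw * 1000 for bw in team_bandwidth_mbps)
    return propagation + transfer + perm_latency_ms

# Two teams with identical permissioning-server latency, one slow member:
fast_team = round_realtime_ms([5, 8, 10], [1000, 1000, 1000], 10, 2)
slow_team = round_realtime_ms([5, 8, 250], [1000, 1000, 100], 10, 2)
print(f"fast team: {fast_team:.0f} ms, slow team: {slow_team:.0f} ms")
```

With these made-up numbers the slow team’s round takes roughly five times longer, even though its latency to the permissioning server is identical, which is the point being made above.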

I pinged a hyperscaler DNS IP which is 1,000 km away from my node.

(realtime) duration of rounds, which is mainly a function of the latency and bandwidth of all three nodes in the team (and also of the latency of the permissioning server, but only very marginally)

That seems to imply that anyone whose node is “remote” (from most other nodes on the network) but otherwise has no problems (generic latency to the regional Internet, bandwidth, hardware performance, uptime) is screwed and could ultimately be removed unless they relocate their node to the US or EU. I’m talking about this Q&A.

Q: What happens if my failure rate cannot be brought down?
A: Nodes that are unable to take corrective actions and bring their performance up may be cut from the network.

It used to be said that nodes should preferably be spread around the world to maximize resilience and decentralization, but given the focus on minimizing failure rates and on mTPS, I wonder if that’s still as important as it used to be, and whether “corrective actions” will include a suggestion to move the node to a centralized hyperscaler DC.

Apart from making changes to service settings, I don’t see what other major corrective actions I could possibly apply to my node. All specs are the same as or better than recommended, the machine is dedicated to XX Network and barely utilized (2-3%), and ping latency to regional Internet hubs is around 2.3 ms.

Latency/Duration and Failure Rate are loosely coupled… you are right that a remote/isolated node has a higher latency, but that doesn’t necessarily mean it will have a higher failure rate.
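
One way to see how the two can decouple: if we assume, purely for illustration, that a round fails only when latency exceeds some timeout (my assumption, not the project’s documented behaviour), then a high but stable latency slows rounds down without failing them, while a low average latency with heavy jitter fails far more often:

```python
# Sketch of latency vs. failure rate under an assumed timeout rule.
import random

TIMEOUT_MS = 2000  # hypothetical per-round budget

def failure_rate(mean_latency_ms, jitter_ms, rounds=100_000):
    failures = sum(
        1 for _ in range(rounds)
        if random.gauss(mean_latency_ms, jitter_ms) > TIMEOUT_MS
    )
    return failures / rounds

# Remote node: high but stable latency -> slow rounds, almost no failures.
print(failure_rate(mean_latency_ms=400, jitter_ms=30))
# Nearby node on an unstable link -> fast rounds on average, more failures.
print(failure_rate(mean_latency_ms=50, jitter_ms=800))
```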

Have you been asked to make changes to your node?

RE: Corrective Actions. I have been in contact with dozens of node operators whose nodes had some sort of problem affecting their performance, not to threaten them but to understand why their nodes perform the way they do and to suggest how they can improve. I’ve helped correct misconfigured nodes, hardware and/or network connections well below specification, wrong driver versions, faulty hardware, etc.

Of all the node operators I’ve spoken to, very few have required me to ask them to make changes to their system, and that was usually because they had deviated so far from our recommended configuration that the deviation was clearly the cause of their poor node performance.

We are not out to kick anyone off the network; we need to understand how even one node can have an effect on the entire network and whether something can be done. If an operator cannot meet basic specifications and absolutely refuses to cooperate, we must determine the best course of action for the entire project.

The recent test of Realtime latency showed that some of our code was the cause of lower overall network performance. So no one needs to feel threatened, and if we do contact anyone, we’ll cross that bridge when we get there.

Thanks for running a node.

Have you been asked to make changes to your node?

No, Keith, I haven’t (yet), but I have a “remote” node with supposedly “high latency”, so I’m using some downtime during the weekend to plan ahead before that happens. For example, today I started looking into using network QoS to rule out network congestion on my internal network (I don’t think that’s necessary, but I can’t think of any other low-hanging fruit that’s under my control).
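
Before investing in QoS, one quick sanity check is to ping an internal hop repeatedly and look for RTT spikes, since large spikes relative to the average suggest queueing that QoS could address. A minimal sketch, assuming a Unix-like host and a hypothetical gateway address:

```python
# Quick internal-congestion check (a sketch, not an official tool).
import re
import statistics
import subprocess

GATEWAY = "192.168.1.1"  # hypothetical internal hop; adjust for your LAN
SAMPLES = 100

rtts = []
for _ in range(SAMPLES):
    # Unix-style ping; on Windows the "-c" flag would be "-n".
    out = subprocess.run(["ping", "-c", "1", GATEWAY],
                         capture_output=True, text=True).stdout
    match = re.search(r"time=([\d.]+)", out)
    if match:
        rtts.append(float(match.group(1)))

if len(rtts) > 1:
    print(f"samples={len(rtts)} avg={statistics.fmean(rtts):.2f} ms "
          f"max={max(rtts):.2f} ms stdev={statistics.stdev(rtts):.2f} ms")
```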

I’m suggesting you don’t have to. There will be conditions that are out of anyone’s control. On the development side, we have to determine what we can do in the codebase. On your side, you seem to have the specs covered. We as a network must find a balance, but that doesn’t mean simply excluding nodes that are, as you said, remote. As you mentioned, “remote” is relative, and as the network grows and the code improves, your node may not stand out at a later date. But since it does stand out now, it is a great resource for understanding where we must improve. You really don’t need to be concerned; you’ve got the node, so let’s use it to build a better network!
