As we prepare for the xx messenger to transition from the TestNet to the BetaNet, we need to review node metrics across the network and improve performance. From day one, we have said that a time would come when nodes would need to meet the higher expectations of a production communications network. While we are sympathetic to operators who cannot meet these requirements through no fault of their own, we cannot compromise network performance.
Currently, the network has a 99.5% average Realtime Success Rate, and the majority of the failures are caused by roughly 15% of nodes. These nodes are a mixture of nodes with poor configurations, low bandwidth, and/or high latency due to location. We believe that by dealing with the configuration and bandwidth issues, those nodes with higher latency (which also tend to be those nodes in less-represented jurisdictions) can make the cut. Because it is important to support nodes in as many jurisdictions as possible, we will focus on fixing the other issues so nodes in locations such as Asia, Africa, and South America can continue operating.
Starting Monday, May 3rd, we will be disabling nodes based on Realtime Decrypt and Realtime Permute statistics which show poor bandwidth performance. We expect to disable roughly 1/8th of nodes, and we expect this process to reveal what portion of network performance issues is caused by poor bandwidth versus latency.
Disabled nodes will be contacted by the team. We will provide some tools to help diagnose what is wrong with nodes and work with them individually.
We will be working on a policy that gives operators who need to relocate their nodes due to internet issues the time to do so. Many node operators may be unaware of their bandwidth issues and we would rather not hurt dedicated operators. We expect to issue grace periods in a variety of cases.
As always, if your node is disabled, it is not an accusation or punishment; it is merely the consequence of the data we have collected. We will extend support to try to determine the cause of the issue and help you overcome it. If your node is disabled, you should not necessarily take it offline. You should leave it in ERROR, ready and waiting to be re-enabled. If you do take the node offline, there is not much we can do, and you will lose Uptime towards the May 2021 Uptime Policy. We may also treat that as a sign that you no longer wish to participate.
While we expect these growing pains to be difficult for some, we would like to retain all node operators if possible. While we are faced with real technical challenges that drive our decisions and results, we will make allowances where we can.
In short, the May 2021 Uptime Requirements are as follows:
- 80% Uptime
- 90% Success
- 1% Realtime Timeout
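For operators who want to compare their own numbers against the three figures above, here is a minimal sketch. It assumes that Uptime and Success are minimums and that Realtime Timeout is a maximum, which the list does not state explicitly; the May 2021 Uptime Policy itself is the authority. The names NodeStats and unmet_requirements are hypothetical and are not part of any xx network tool.

```python
from dataclasses import dataclass

# Figures taken from the May 2021 Uptime Requirements list.
MIN_UPTIME_PCT = 80.0
MIN_SUCCESS_PCT = 90.0
# Assumption: the Realtime Timeout figure is an upper bound.
MAX_REALTIME_TIMEOUT_PCT = 1.0


@dataclass
class NodeStats:
    """Hypothetical container for one node's monthly percentages."""
    uptime_pct: float
    success_pct: float
    realtime_timeout_pct: float


def unmet_requirements(stats: NodeStats) -> list[str]:
    """Return the names of the requirements the node does not meet."""
    unmet = []
    if stats.uptime_pct < MIN_UPTIME_PCT:
        unmet.append("uptime")
    if stats.success_pct < MIN_SUCCESS_PCT:
        unmet.append("success")
    if stats.realtime_timeout_pct > MAX_REALTIME_TIMEOUT_PCT:
        unmet.append("realtime timeout")
    return unmet


if __name__ == "__main__":
    # Illustrative numbers only, not real node data.
    example = NodeStats(uptime_pct=75.0, success_pct=99.0, realtime_timeout_pct=2.5)
    print(unmet_requirements(example))
```

An empty list means the node meets all three figures under the assumptions stated above.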
All policies can be downloaded from "Get your node running!"
Direct link to the May 2021 Uptime Policy
Thanks for running a node!