xx network Economic Tweaks - Realtime Failure Deductions

At some point you can’t do anything but move away from home.

I don’t have an above-average failure rate, but I have problems keeping my home installation reliable. As a home hoster you have to deal with power outages at home, power outages at your internet provider, hardware getting too hot in summer, visitors unplugging the router to charge their phones instead (no joke, I had this problem more than once), iOS updates clogging the connection, and Netflix 4K streams clogging the connection. On top of that, the consumer-grade routers supplied by your internet provider randomly start dropping packets for whatever reason after running for too long or, e.g., under heavy UDP load (I used to restart my router every night with a cronjob to keep my failure rate low).

You have noise while you sleep, and you lose rewards if you allow yourself to go on holiday for a week and something happens while you are gone and there is no one to call (the last charging incident for me was when I was away for 5 days and my girlfriend had someone visiting her). In theory this can all be solved and accounted for preemptively. But in practice you have a much easier life just NOT hosting at home. I thought about this more than once, especially now that no one penalizes me for using Hetzner anymore.

If home hosting is a design goal of the network, then the rewards mechanism should make sure that I don’t lose money by running on imperfect infrastructure compared to paying for colocation in one of the big cheap datacenters. I’m not able to do the exact math since I’m too stupid and there are no tools available, but I can see that my node receives slightly below average era points. I’m also lacking the knowledge and tools to identify the root cause, but it’s probably my latency, since I’m hosting in a remote country on a residential fiber connection.

It seems clear to me that buying a new machine and sending it to Hetzner or Ikoula would be amortized in 1 month. It doesn’t have to be Hetzner, but it would probably end up in Frankfurt, since anything else would be vastly more costly or further from the other nodes and so not optimal regarding failure rate. For instance, my local DC would charge me $1000 per month for 200 Mbit/s and still have high latency to 90% of the other nodes.

In conclusion, I consider my location a feature and accept a lower payout out of a slightly idealistic motivation, but this proposal scares me. I will watch how my rewards change if it’s implemented and, as others have already implied, make a decision based on the economic incentives presented to me.

Finally, some napkin math that I came up with to reason about the problem while typing this response.

Currently my node has received 17% fewer era points than the best performing node as displayed in the explorer. At a payout of 1100 xx for me during the last 6 days, that amounts to 225 xx less in rewards than the top performer, or 1125 xx less per month. At the community sale price of effectively 80 cents, this is $900 per month in missed rewards. This is my incentive to use a centralized datacenter instead of supporting decentralization.
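The arithmetic above can be reproduced with a few lines of Python. This is only a sketch of the napkin math: the function and parameter names are made up for illustration, and it assumes that payout scales linearly with era points and that a month has 30 days.

```python
def missed_rewards(payout_xx, window_days, points_deficit, price_usd,
                   days_per_month=30):
    """Estimate rewards missed relative to the top node.

    Assumption: payout is proportional to era points, so a node with
    `points_deficit` fewer points earns (1 - points_deficit) of the top payout.
    Returns (missed xx in the window, missed xx per month, missed USD per month).
    """
    top_payout_xx = payout_xx / (1.0 - points_deficit)
    missed_xx = top_payout_xx - payout_xx
    missed_xx_month = missed_xx * days_per_month / window_days
    return missed_xx, missed_xx_month, missed_xx_month * price_usd


# Figures from the post: 1100 xx in 6 days, 17% fewer points, $0.80 per xx.
window, month, usd = missed_rewards(1100.0, 6, 0.17, 0.80)
print(f"missed in 6 days: {window:.0f} xx")
print(f"missed per month: {month:.0f} xx (about ${usd:.0f})")
```

The results land close to the rounded figures quoted above (roughly 225 xx per 6 days, 1125 xx and $900 per month).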

Since the best performing node takes roughly 8% commission, I have to assume that I actually lose something close to 10% of potential income because I’m staking my testnet rewards instead of simply delegating them. Or in other words, my incentive to do nothing and instead use my hardware to play games is an extra $500 per month.
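One way to read that comparison, as a rough sketch: a delegator to the top node keeps the top node's reward minus its commission, while the home node keeps its own (lower) reward. The names below are hypothetical, and the model assumes the same stake earns the same gross reward on either node and that rewards scale linearly with era points, which is a simplification.

```python
def delegation_gap(own_payout_xx, window_days, points_deficit, top_commission,
                   price_usd, days_per_month=30):
    """Compare running your own node with delegating to the top node.

    Returns (gap as a fraction of the top node's reward, gap in USD per month).
    """
    top_payout_xx = own_payout_xx / (1.0 - points_deficit)
    delegated_xx = top_payout_xx * (1.0 - top_commission)  # after commission
    gap_xx = delegated_xx - own_payout_xx
    gap_fraction = gap_xx / top_payout_xx
    gap_usd_month = gap_xx * days_per_month / window_days * price_usd
    return gap_fraction, gap_usd_month


# Figures from the post: 17% fewer points, 8% commission, $0.80 per xx.
fraction, usd = delegation_gap(1100.0, 6, 0.17, 0.08, 0.80)
print(f"gap: {fraction:.0%} of the top reward, about ${usd:.0f} per month")
```

Under these assumptions the gap comes out at about 9% and a bit under $500 per month, in line with the "close to 10%" and "$500" estimates.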

Now come and reduce my rewards.


Power outages, if short but regular, can be solved with a UPS, provided your ISP stays up during the outage.
Plug your router into another outlet or add a power strip…
You need ~100-200 Mbit/s of dedicated bandwidth, so if you need 200 Mbit/s for Netflix, get a second internet connection, or one with 500 Mbit/s, or 1 Gbit/s if you need even more. Some routers also have a bandwidth limiter, so you can restrict some ports to reserve bandwidth for your node.

Ask your ISP to update or replace your router if it can’t run for more than 24h without problems.
You can host a VPN to solve some issues remotely over public wireless, which is available almost everywhere.

183 xx/day is a good result!
And if you don’t want to host this computer at home and/or pay for a DC, you can still nominate other nodes and receive some rewards without paying for or hosting anything. The network should be almost perfect, so everyone needs to work together to reach this goal :slight_smile:

Thank you for your recommendations, but you don’t get my point. By just delegating I wouldn’t have to deal with any of this. If I can make more money by renting a machine in a DC, I could take that and still not have to deal with any of this. And by “me” I mean the silent mass of xx node operators that plan their next move. I think you can agree with me that home hosting is inherently more error-prone than professional hosting (which was my point when I enumerated all the sources of failure), and if decentralization is a design goal, home hosting should be incentivized instead of being disincentivized by lower rewards.

I can’t even buy 200 Mbit/s, btw. I have three 100 Mbit/s connections instead: one for Netflix, one for the gateway, and one for cmix. I bought new routers and disabled the ones shipped by the providers. I got myself some UPSes, but they seem to be broken, so I need new ones. I have a gasoline generator just in case, but I assume that when my home is out of electricity the internet connection would be down anyway. I forwarded ssh from my public IP so I can ssh into the node via 2 different connections.

I assume that only a few operators go as far as I do in that regard. I make good money with this setup and I’m not complaining about it. But I’m pointing at the elephant in the room: by not doing anything I would make even more money.


I think the proposal is fair, as long as the causes of real-time failure are well understood. I don’t really think they are. I suspect they are network related, but I haven’t seen any solid analysis confirming it.

If the team does this, I encourage them to perform some analysis into the causes of higher RTF before punishing people for it. I am speaking mostly of people at the high end but still under 1%; there is a big variance sub-1%.


A lot can be done to improve latency, and I have found the connection between node and gateway to be important. (Network peering and VPN are fun!)
I have never experienced a power outage… so with a synchronous internet connection that is “somewhat” dedicated (set limits in the router), rewarding high performance nodes makes a lot of sense to me.
I actually stopped using the machines for other stuff on the side, like I did for a while… This made performance take a noticeable but not immediately obvious leap. I deactivated syncthing and ensured I had no aggressive anti-virus or rootkit scans running… That shaved several small percentage points off my timeouts.
Sure, this economic tweak can be seen as punishing low or medium performing nodes, and we can all agree that “earning less coin” is not as good as “earning more coin”… But this is a decentralized, high performance project, which needs incentivization towards performance and quality of service.
Please activate this tweak.

This may be a coincidence, but in recent days (in fact it may even be two weeks) we’ve had a small number (maybe 3-4) of home-based nodes visibly impact the entire network. Two of them have been having frequent CUDA failures, impacting realtime. I took this just now: the second worst (12.54%) is now finally offline, but the other three are happily c-mixing.

image

There are also issues with high precomp failure rates among these, but also among different nodes.

image

So maybe it’s one of those days, but you can see that realtime is more impacted by home-based nodes than by cloud-based nodes. (I checked ISP column, none are hyperscaler-based). Maybe there are other instances where hyperscaler nodes impact more - this may be anecdotal.

But in any case, the result today is that everyone is suffering and there’s nothing we can do.

I have a longer post about this elsewhere, though not directly related to realtime failures, so I’ll just say that another suggestion I have for the XX Team in that other place is to consider how to wind down the Multiplier Program, because these problems show it’s not just one unlucky node. Today we lost 20% of throughput. It’s like being attacked, except that it’s probably our own validators.

image

I could make a better argument for this with the help of Excel, but it is my intuition that it’s going to be very hard to fix realtime failures as long as fixed and generous subsidies remain in place. Chilling knocks you out for a short while, but as long as you’ve been around for 6 months or longer, it’s very likely that you can easily get reelected because you’ll again get that 100’000+xx multiplier.

The other scenario is that a validator can have a persistently annoying realtime (or precomp) failure rate of, say, 3.8% and still make a decent ROI using elevated commission rates, thanks to the multiplier and the ease of getting elected.

Today it’s realtime cMix rounds; tomorrow it could be slow gateway nodes, and then we’ll have another discussion about gateway network and database performance. So I’d suggest considering a realtime penalty and an aggregate Multiplier Decay Factor that would impact the node multiplier, so that nodes that perform poorly over time (weeks) lose the Team Multiplier sooner than the rest. For example, the worst 10% of nodes would lose it in 4-5 months, the average node in 8, and the best in 10. Or deduct that realtime penalty from both the multiplier and cMix earnings to make the reelection of bad nodes gradually more difficult.
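To make the decay idea concrete, here is a minimal sketch of how such a schedule could look. Nothing here is an existing xx network mechanism: the anchor values come from the example above, while the linear interpolation between them, the linear decay over time, and all names are assumptions chosen for illustration.

```python
# Anchor points: (performance percentile, months until the multiplier is gone).
# Percentile 0.0 is the worst node and 1.0 the best. The months follow the
# example in the post; interpolating between them is an assumption.
ANCHORS = [(0.0, 4.0), (0.10, 5.0), (0.50, 8.0), (1.0, 10.0)]


def months_until_multiplier_gone(percentile):
    """Linearly interpolate the decay horizon for a performance percentile."""
    if not 0.0 <= percentile <= 1.0:
        raise ValueError("percentile must be between 0 and 1")
    for (p0, m0), (p1, m1) in zip(ANCHORS, ANCHORS[1:]):
        if percentile <= p1:
            return m0 + (m1 - m0) * (percentile - p0) / (p1 - p0)
    return ANCHORS[-1][1]


def multiplier_after(initial_multiplier, percentile, months_elapsed):
    """Multiplier left after decaying linearly over the node's horizon."""
    horizon = months_until_multiplier_gone(percentile)
    remaining = max(0.0, 1.0 - months_elapsed / horizon)
    return initial_multiplier * remaining


# A 100,000 xx multiplier after 4 months, for a worst, median and best node.
for pct in (0.0, 0.5, 1.0):
    print(pct, round(multiplier_after(100_000, pct, 4)))
```

With this shape, a persistently poor node runs out of multiplier while better nodes still hold a good part of theirs, which is the gradual reelection pressure described above.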
