BetaNet software update 8/6/2020

We will release a software update to the network next Thursday (8/6/2020), per the release schedule found in the handbook. The update significantly improves how logs are uploaded, which now happens through CloudWatch, and includes a series of stability improvements. It also restructures the comms code so that it can be shared more easily between Elixxir and Praxxis.

This release also includes an updated node handbook, which can be found here.

This update will go out over Auto Update on 8/6/2020. You only need to update manually if Auto Update is disabled (it is enabled by default).

Wrapper Script

MR: https://gitlab.com/elixxir/wrapper/-/merge_requests/28

  • Overhauled how logs are handled: they are now uploaded piece by piece through CloudWatch.
  • Log backups to AWS S3 have been disabled
  • The wrapper script now sends log messages to a CloudWatch Log Stream. As this does not duplicate any data, bandwidth usage is greatly reduced.
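As a rough illustration of what "piece by piece" without duplication can mean, here is a minimal sketch that reads only the bytes appended to a log file since the last upload. It is not taken from the wrapper script; the function name `read_new_log_data` and the offset-tracking approach are assumptions made for this example.

```python
import os


def read_new_log_data(path, offset):
    """Return (text, new_offset) for data appended to `path` since `offset`.

    Hypothetical helper: the caller remembers `new_offset` between calls,
    so each piece of the log is read (and could be uploaded) only once.
    """
    size = os.path.getsize(path)
    if size < offset:
        # The file shrank, so it was probably truncated or rotated.
        # Start again from the beginning.
        offset = 0
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # errors="replace" keeps the sketch from raising if a read happens to
    # end in the middle of a multi-byte character.
    return data.decode("utf-8", errors="replace"), offset + len(data)
```

Each call returns only the new text, which could then be sent to a log stream, and the offset to pass in on the next call.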

Crypto

MR: https://gitlab.com/elixxir/crypto/-/merge_requests/212

  • Updated internal dependencies

elixxir/comms

MR: https://gitlab.com/elixxir/comms/-/merge_requests/257

  • Removed connect package components of mixmessages and moved components used by both Elixxir and Praxxis into a separate package, xx_network/comms
  • Updated internal dependencies

xx_network/comms

MR: https://gitlab.com/xx_network/comms/-/merge_requests/10

  • Created the repository
  • Moved elixxir/comms/connect package
  • Moved part of elixxir/comms/mixmessages package to xx_network/comms/messages

Gateway

Version: 1.4.0

MR: https://gitlab.com/elixxir/gateway/-/merge_requests/147

  • Support for comms split
  • Updated internal dependencies
  • Removed old code to handle the notification bot to stop erroneous warning logs
  • Spelling fixes

Server

Version: 1.4.0

MR: https://gitlab.com/elixxir/server/-/merge_requests/564

  • Added a Gateway IP configuration option for overriding the gateway IP, plus special support for a gateway running on the same machine as the server
  • Support for comms split
  • Updated internal dependencies
  • Improved testing
  • Hacky fix for a memory leak when the GPU is enabled. The underlying cause has not been determined, but threads that were not closing now have timeouts
  • Increased the size of the outbound data buffer from computation to ensure it is not blocked on transmission

Future Work

We have been working hard on the next steps for the network, which are the re-release of the xx messenger and the launch of xx consensus.

The Elixxir and Praxxis teams worked together on the comms split and have continued to work together on a generic implementation of the Gossip protocol within the shared comms repo. Elixxir will use it to build support for the xx messenger by having gateways gossip various information. We will release a blog post with more details about the gossip protocol next week.

We have also been rewriting the cryptographic storage for ArrowSDK to prepare it for the xx messenger running on the BetaNet.

Over the next few weeks we hope to focus more closely on the network and on improving its stability and performance.

Cool stuff!
Assuming the progress and success/failure of the update will be visible somewhere in the logs, right?

@dainiusk
Yes, you’ll see a message of the update in the node and gateway wrapper log files.

On 06-Aug-20 18:18:17 there was a gateway wrapper log entry:
An error occurred (413) when calling the PutLogEvents operation:
Since then, the log gets a new entry every 1-30 seconds:
[ERROR] 10-Aug-20 18:38:35: Parameter validation failed: Invalid type for parameter sequenceToken, value: None, type: <class 'NoneType'>, valid types: <class 'str'>

Is there something to manually update on my side, or OK to ignore this for now?

You may need to update the wrapper script if it wasn’t done automatically.

According to gateway wrapper log it was:
[INFO] 06-Aug-20 18:17:07: Completed command: {'command': 'stop', 'nodes': None}
[INFO] 06-Aug-20 18:17:07: Executing command: {'command': 'update', 'info': {'path': 'bin/wrapper.py', 'sha256sum': '1f3cde136e86654b48fb314d68375b0aea044b04dac8a511e281ffd4ae39d803', 'install_path': 'wrapper'}, 'nodes': None}
[INFO] 06-Aug-20 18:17:07: Updating file at gateway/bin/wrapper.py to /opt/xxnetwork/xxnetwork-wrapper.py...
[INFO] 06-Aug-20 18:17:08: Wrapper script updated, exiting now...

If there was an update pushed since then, I can try updating manually. On GitLab I only see a version modified a week ago.

No, that’s the update.
What version of boto3 is installed?

Was boto3-1.14.32, now updated to boto3-1.14.39, restarted the gateway, the error persists:
[INFO] 10-Aug-20 19:31:02: Completed command: {'command': 'start', 'nodes': None}
[ERROR] 10-Aug-20 19:31:08: An error occurred (InvalidParameterException) when calling the PutLogEvents operation: Log event too large: 1395334 bytes exceeds limit of 262144
[ERROR] 10-Aug-20 19:31:09: Parameter validation failed: Invalid type for parameter sequenceToken, value: None, type: <class 'NoneType'>, valid types: <class 'str'>
[ERROR] 10-Aug-20 19:31:21: Parameter validation failed: Invalid type for parameter sequenceToken, value: None, type: <class 'NoneType'>, valid types: <class 'str'>
[ERROR] 10-Aug-20 19:31:22: Parameter validation failed: Invalid type for parameter sequenceToken, value: None, type: <class 'NoneType'>, valid types: <class 'str'>
[ERROR] 10-Aug-20 19:31:55: Parameter validation failed: Invalid type for parameter sequenceToken, value: None, type: <class 'NoneType'>, valid types: <class 'str'>
[ERROR] 10-Aug-20 19:31:56: Parameter validation failed: Invalid type for parameter sequenceToken, value: None, type: <class 'NoneType'>, valid types: <class 'str'>

I’ll file a bug report. This looks related to CloudWatch/boto3. Can you DM me your nodeID and I’ll check to see if logs are being uploaded at all?

Can you provide the complete output from the following:
$ pip3 show boto3

Seems I cannot: Sorry, you cannot send a personal message to that user. :slight_smile:

~$ pip3 show boto3
Name: boto3
Version: 1.14.39
Summary: The AWS SDK for Python
Home-page: https://github.com/boto/boto3
Author: Amazon Web Services
Author-email: UNKNOWN
License: Apache License 2.0
Location: /home/ubuntu/.local/lib/python3.6/site-packages
Requires: jmespath, s3transfer, botocore
Required-by:

I was able to reproduce the error
An error occurred (InvalidParameterException) when calling the PutLogEvents operation: Log event too large: 1395334 bytes exceeds limit of 262144

You can reduce the logLevel in the yaml files. If you need a logLevel higher than INFO, you can instead disable log uploading with --disable-cloudwatch in the node and gateway service files. DEBUG and TRACE data does not need to be uploaded to AWS.
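For anyone curious about the two errors in this thread, below is a minimal sketch of how an uploader could avoid them. It is not the wrapper's actual code: `split_message`, `build_put_args`, and the 26-byte headroom are assumptions for illustration. The 262144-byte limit comes from the error message above, and the keyword names are the ones boto3's `put_log_events` accepts.

```python
MAX_EVENT_BYTES = 262144  # per-event limit quoted in the error above
HEADROOM = 26             # assumed safety margin, not a confirmed requirement


def split_message(message, limit=MAX_EVENT_BYTES - HEADROOM):
    """Split `message` into pieces of at most `limit` UTF-8 bytes each."""
    if limit < 4:
        raise ValueError("limit must fit at least one UTF-8 character")
    data = message.encode("utf-8")
    pieces = []
    start = 0
    while start < len(data):
        end = min(start + limit, len(data))
        # Back up so a multi-byte character is not cut in half;
        # UTF-8 continuation bytes have the form 0b10xxxxxx.
        while end < len(data) and (data[end] & 0xC0) == 0x80:
            end -= 1
        pieces.append(data[start:end].decode("utf-8"))
        start = end
    return pieces


def build_put_args(group, stream, events, sequence_token=None):
    """Build keyword arguments for a PutLogEvents call.

    sequenceToken is left out when no token is known, because passing
    None fails boto3 parameter validation (the repeated error above).
    """
    kwargs = {
        "logGroupName": group,
        "logStreamName": stream,
        "logEvents": events,
    }
    if sequence_token is not None:
        kwargs["sequenceToken"] = sequence_token
    return kwargs
```

The result of `build_put_args` would be passed along the lines of `client.put_log_events(**kwargs)`, with each event message first run through `split_message`. Lowering the logLevel remains the simpler fix for node operators.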

Got it, thanks! I had increased the log level in the very beginning and never turned it back down.
