Couple of crashes

Hello,
Two days ago my DAppNode stopped staking. I had to reboot it a couple of times, then it staked fine for one day and then failed again…

Sometimes it comes back, but not all validators are enabled. It looks like I had an issue with Geth, but I'm not sure. I just switched to the light client; I will see how it goes. I upgraded to the latest Geth and Prysm (I don't know why they were not up to date, although auto-update is enabled).

My internet connection seems rock solid.

I access it from Android, so it is not easy to paste logs here.

I can upload a 16 MB log, though I'm not sure how useful it would be. Any input is welcome!

Thanks
NUC i5, 1 TB HDD

Prysm Version: 1.0.10 (v1.3.3 upstream)
Geth Version: 0.1.14 (v1.10.1 upstream)

Before filing a new topic, please provide the following information.

Core DAppNode Packages versions

  • bind.dnp.dappnode.eth: 0.2.6
  • core.dnp.dappnode.eth: 0.2.42
  • dappmanager.dnp.dappnode.eth: 0.2.38, commit: 85e2b4cf
  • ipfs.dnp.dappnode.eth: 0.2.14
  • vpn.dnp.dappnode.eth: 0.2.8, commit: f9a8743e
  • wifi.dnp.dappnode.eth: 0.2.6

System info

  • docker version: Docker version 18.09.8-ce, build 0dd43dd87fd530113bf44c9bba9ad8b20ce4637f
  • docker compose version: docker-compose version 1.25.5, build unknown
  • platform: linux, x64, 5.9.0-4-amd64
  • Disk usage: 56%

Geth is not the problem. I was staking fine with the light eth1.0 client all night, then today it crashed again. A couple of reboots later, only half of my validators are running correctly. The OpenVPN connection doesn't work. I have an HDMI cable arriving tomorrow for troubleshooting. Could it be a bad NVMe?
Is it worth doing a fresh install?
Cheers

Working fine for a couple of hours.
IPFS disabled
Light client
Monitoring package disabled
One issue, not sure if it is related:
The Prysm UI is not fully functional and I get a warning message that I can't fully copy and paste from a phone:
Http failure response from http://prysm.dappnode/api/v2

First, do you have the latest versions of the following?

  • Geth
  • Prysm
  • DAppmanager

Do you have access to the DAppNode UI or the Prysm UI?

If you have access to the DAppNode UI, it would be great to paste your logs or download them and send them to see what is happening. You can do it if you go to Packages > Prysm > logs.

The Prysm UI error seems to be something visual, not related to the validator. If you have access, you can view the logs from this UI in the Analytics section. Check whether you are using an ad blocker; with Brave, for example, you have to disable Shields for the UI to work properly.
Without logs we can't know what is happening.
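If the downloaded log is large, it can help to reduce it to the warning and error lines before posting. The following is only a minimal sketch, assuming the log lines start with `time="..." level=...` as Prysm's text logs do; the function names and the sample lines are made up for illustration and are not part of any DAppNode or Prysm tooling.

```python
import re
from collections import Counter

# Matches the start of a log line such as:
#   time="2021-03-22 16:51:24" level=error msg="..." prefix=powchain
# Only the timestamp and level are parsed; the rest is kept as raw text.
LINE_RE = re.compile(r'^time="(?P<time>[^"]+)"\s+level=(?P<level>\w+)\s+(?P<rest>.*)$')


def filter_levels(lines, levels=("warning", "error")):
    """Yield (time, level, rest) for every line whose level is in `levels`."""
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("level") in levels:
            yield match.group("time"), match.group("level"), match.group("rest")


def summarize(lines):
    """Count how many parseable lines there are per log level."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[match.group("level")] += 1
    return counts


# Illustrative sample lines (hypothetical, not a real log).
SAMPLE = [
    'time="2021-03-22 16:51:24" level=info msg="Starting beacon node" prefix=node',
    'time="2021-03-22 16:51:24" level=error msg="Could not connect to powchain endpoint" prefix=powchain',
    'a line that does not follow the expected format',
]

for time_str, level, rest in filter_levels(SAMPLE):
    print(time_str, level.upper(), rest)
```

To run it on a real file, pass an open file object instead of `SAMPLE`, for example `filter_levels(open(path, encoding="utf-8", errors="replace"))`, where `path` is wherever you saved the download.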

A few versions ago there were some errors like this one (Validators stopped working after a while), but they were fixed in the last 2 versions.

Thanks for your answer.
You're right, I used a different browser and it fixed the Prysm UI bug.

IPFS disabled and light client. No crash so far, not one attestation missed.

I will keep it like that for a week.
If it crashes again I will copy and paste the logs.
Here are some error messages I managed to save, not sure if they are related:

time="2021-03-22 13:07:52" level=warning msg="Subscription next failed" error="context canceled" prefix=sync topic="/eth2/b5303f2a/beacon_attestation_34/ssz_snappy"
time="2021-03-22 13:07:56" level=info msg="Waiting for enough suitable peers before syncing" prefix=initial-sync required=3 suitable=0
time="2021-03-22 16:51:19" level=warning msg="Running on ETH2 Mainnet" prefix=flags
time="2021-03-22 16:51:19" level=info msg="Using "max_cover" strategy on attestation aggregation" prefix=flags
time="2021-03-22 16:51:19" level=info msg="Checking DB" database-path="/data/beaconchaindata" prefix=node
time="2021-03-22 16:51:24" level=info msg="Deposit contract: 0x00000000219ab540356cbb839cbe05303d7705fa" prefix=node
time="2021-03-22 16:51:24" level=info msg="Waiting for state to be initialized" prefix=initial-sync
time="2021-03-22 16:51:24" level=info msg="Starting beacon node" prefix=node version="Prysm/v1.1.0/9b367b36fc12ecf565ad649209aa2b5bba8c7797. Built at: 2021-01-18 19:47:14+00:00"
time="2021-03-22 16:51:24" level=info msg="gRPC server listening on port" address="0.0.0.0:4000" prefix=rpc
time="2021-03-22 16:51:24" level=warning msg="You are using an insecure gRPC server. If you are running your beacon node and validator on the same machines, you can ignore this message. If you want to know how to enable secure connections, see: https://docs.prylabs.network/docs/prysm-usage/secure-grpc" prefix=rpc
time="2021-03-22 16:51:24" level=info msg="Starting JSON-HTTP API" address="0.0.0.0:3500" prefix=gateway
time="2021-03-22 16:51:24" level=error msg="Could not connect to powchain endpoint" error="could not dial eth1 nodes: Post "http://geth.dappnode:8545": dial tcp 172.33.0.8:8545: connect: connection refused" prefix=powchain
time="2021-03-22 16:51:24" level=info msg="Blockchain data already exists in DB, initializing..." prefix=blockchain
time="2021-03-22 16:51:24" level=info msg="Cleaning up dirty states" count=1 prefix=db
time="2021-03-22 16:51:26" level=info msg="New gRPC client connected to beacon node" addr="172.33.0.9:52238" prefix=rpc
time="2021-03-22 16:51:26" level=info msg="Chain genesis time reached" genesisStateRoot=7e76880eb67bbdc86250aa578958e9d0675e64e714337855204fb5abaaf82c2b genesisTime="2020-12-01 12:00:23 +0000 UTC" genesisValidators=21063 prefix=slotutil
time="2021-03-22 16:51:27" level=info msg="Starting initial chain sync..." prefix=initial-sync
time="2021-03-22 16:51:27" level=info msg="Waiting for enough suitable peers before syncing" prefix=initial-sync required=3 suitable=0
time="2021-03-22 16:51:27" level=info msg="Started discovery v5" ENR="enr:-LK4QMqW0Z1GiJzPixcygdNbT49HMyR3nZjVv2VqhA9ZwTfJAEKlOO8tc0gmJ1PUSbvbdb0raYsBrJ97wU5UxLsW798Bh2F0dG5ldHOIAAAAAAAAAACEZXRoMpC1MD8qAAAAAP__________gmlkgnY0gmlwhKwhAAaJc2VjcDI1NmsxoQLYdvmh-dz47_1xc6XhxcfKb9Q_RAkTfdfE2LPFylGC2IN0Y3CCMsiDdWRwgi7g" prefix=p2p
time="2021-03-22 16:51:27" level=info msg="Node started p2p server" multiAddr="/ip4/172.33.0.6/tcp/13000/p2p/16Uiu2HAm9zfGKTTnsGxJuZkzCTXg63uau3LzgBnRMSUVq7GvbQYP" prefix=p2p
time="2021-03-22 16:51:41" level=info msg="Processing block batch of size 31 starting from 0xc3118d7e... 797089/800656 - estimated time remaining 38m21s" blocksPerSecond=1.6 peers=21 prefix=initial-sync
time="2021-03-22 16:51:52" level=info msg="Processing block batch of size 64 starting from 0xbdb2d32a... 797120/800657 - estimated time remaining 12m24s" blocksPerSecond=4.8 peers=32 prefix=initial-sync
time="2021-03-22 16:51:54" level=info msg="Connected to eth1 proof-of-work chain" endpoint="http://geth.dappnode:8545" prefix=powchain
time="2021-03-22 16:51:55" level=info msg="Processing block batch of size 64 starting from 0x316bd697... 797184/800657 - estimated time remaining 7m16s" blocksPerSecond=8.0 peers=32 prefix=initial-sync
time="2021-03-22 16:52:27" level=info msg="Peer summary" activePeers=33 inbound=0 outbound=33 prefix=p2p
time="2021-03-22 16:52:35" level=info msg="Processing block batch of size 63 starting from 0x5d6c0130... 797248/800661 - estimated time remaining 18m3s" blocksPerSecond=3.1 peers=32 prefix=initial-sync
time="2021-03-22 16:52:36" level=error msg="Unable to process past headers provided time is later than the current eth1 head" prefix=powchain
time="2021-03-22 16:52:37" level=info msg="Processing block batch of size 62 starting from 0x86beb7d2... 797312/800661 - estimated time remaining 8m55s" blocksPerSecond=6.2 peers=32 prefix=initial-sync
time="2021-03-22 16:52:39" level=info msg="Processing block batch of size 63 starting from 0x06c430b1... 797376/800661 - estimated time remaining 5m49s" blocksPerSecond=9.4 peers=31 prefix=initial-sync
time="2021-03-22 16:52:42" level=info msg="Processing block batch of size 64 starting from 0x34178546... 797440/800661 - estimated time remaining 4m15s" blocksPerSecond=12.6 peers=30 prefix=initial-sync
time="2021-03-22 16:52:45" level=info msg="Processing block batch of size 64 starting from 0xbe86375b... 797504/800661 - estimated time remaining 3m19s" blocksPerSecond=15.8 peers=30 prefix=initial-sync
time=

No issues so far with IPFS disabled and the light client.
Every 2-3 days I miss a handful of attestations in a row. I checked the Prysm logs (beacon and validator) but I can't see anything wrong. How can I troubleshoot that, please?

Troubleshooting attestation performance issues is hard, so I would recommend installing some tooling to make it easier:

  • Make sure you are running the latest version of Prysm
  • Install the DMS package so you have access to the beacon node + validator Prometheus + Grafana metrics
  • Gather logs around the times you missed attestations (accounting for timezone offsets in the logs)
  • Go to Prysm’s Discord and share your case there. They have the expertise to help in this case.
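For the log-gathering step, the timezone handling can be sketched as follows. This is a rough illustration, not an official tool: it assumes the log timestamps look like `time="2021-03-22 16:51:24"` with no timezone marker, and that you know which UTC offset the logs use (UTC is only the default guess here). The function names, parameters, and sample lines are hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

# Timestamp at the start of a log line, e.g. time="2021-03-22 16:52:36"
TIME_RE = re.compile(r'^time="(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"')


def parse_log_time(line, log_utc_offset_hours=0.0):
    """Return the line's timestamp as an aware datetime, or None if absent.

    The log itself carries no timezone, so the offset is an assumption
    supplied by the caller (default: logs are written in UTC).
    """
    match = TIME_RE.match(line)
    if not match:
        return None
    naive = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
    return naive.replace(tzinfo=timezone(timedelta(hours=log_utc_offset_hours)))


def lines_around(lines, missed_at, margin_minutes=5, log_utc_offset_hours=0.0):
    """Return the lines logged within `margin_minutes` of `missed_at`.

    `missed_at` must be a timezone-aware datetime, so times read in your
    local timezone compare correctly against the log's timezone.
    """
    margin = timedelta(minutes=margin_minutes)
    selected = []
    for line in lines:
        stamp = parse_log_time(line, log_utc_offset_hours)
        if stamp is not None and abs(stamp - missed_at) <= margin:
            selected.append(line.rstrip("\n"))
    return selected


# Illustrative sample lines (hypothetical, not a real log).
SAMPLE = [
    'time="2021-03-22 16:51:19" level=info msg="Checking DB" prefix=node',
    'time="2021-03-22 16:52:36" level=error msg="Unable to process past headers" prefix=powchain',
    'time="2021-03-22 17:30:00" level=info msg="Peer summary" prefix=p2p',
]

# A miss observed at 17:52:30 local time in UTC+1 corresponds to 16:52:30 UTC.
missed = datetime(2021, 3, 22, 17, 52, 30, tzinfo=timezone(timedelta(hours=1)))
for hit in lines_around(SAMPLE, missed, margin_minutes=1):
    print(hit)
```

Running the same window over both the beacon and the validator logs makes it easier to share only the relevant excerpt.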

Small update: I missed a couple of attestations in a row, then it went back to normal. I tried to reach my DAppNode but the OpenVPN connection doesn't work. I will leave it like that, but something is wrong…