NVMe mounting sometimes not working as it should #8

Open · itamararjuan opened this issue May 16, 2019 · 0 comments

I have seen an instance of setting up a couple of machines using this project
which resulted in the wrong EBS disk being mounted for the Ethereum storage.

Looking at the cloud-init output file, you can clearly see that the cloud-init script found one of the NVMe disks to be the 500GB one (a sketch of what such a size-based lookup amounts to follows the log excerpt):

...
...
...
Waiting to see if the 500G disk was mounted..
Found the 500G SSD disk on nvme2n1 , attempting to mount it..
meta-data=/dev/nvme2n1           isize=512    agcount=4, agsize=32768000 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=0, rmapbt=0, reflink=0
data     =                       bsize=4096   blocks=131072000, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=64000, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 42.3M  100 42.3M    0     0  28.9M      0  0:00:01  0:00:01 --:--:-- 28.9M
ethereum: available
healthcheck: available
ethereum: added process group
healthcheck: added process group
Cloud-init v. 18.5-45-g3554ffe8-0ubuntu1~18.04.1 running 'modules:final' at Wed, 15 May 2019 19:28:56 +0000. Up 19.21 seconds.
Cloud-init v. 18.5-45-g3554ffe8-0ubuntu1~18.04.1 finished at Wed, 15 May 2019 19:29:22 +0000. Datasource DataSourceEc2Local.  Up 45.19 seconds
Cloud-init v. 18.5-45-g3554ffe8-0ubuntu1~18.04.1 running 'init-local' at Thu, 16 May 2019 07:39:09 +0000. Up 8.95 seconds.
...
...
...
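For context, a size-based lookup like the one in the log presumably amounts to something along these lines. This is a minimal sketch, not the project's actual cloud-init script; the 500G size string and XFS filesystem come from the log above, while the /mnt/data mount point is a placeholder:

#!/bin/bash
# Sketch: find the NVMe block device whose reported size is 500G and mount it.
# CAVEAT: /dev/nvmeXn1 names are assigned in enumeration order and are NOT
# stable across reboots -- which is exactly the failure mode in this issue.
for dev in /dev/nvme*n1; do
    size=$(lsblk -dn -o SIZE "$dev" | tr -d ' ')
    if [ "$size" = "500G" ]; then
        echo "Found the 500G SSD disk on ${dev#/dev/} , attempting to mount it.."
        mkfs.xfs "$dev"              # destructive: format on first boot only
        mount "$dev" /mnt/data       # placeholder mount point
        break
    fi
done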

But something - TBD (maybe on the AWS side?) - has caused the NVMe device of the 500GB volume to move or change its identification. As a result, when SSHing into the server, the output of lsblk is the following, clearly showing that the 500GB disk is not mounted anywhere:

ubuntu@ip-172-31-100-210:~$ lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
loop0         7:0    0    18M  1 loop /snap/amazon-ssm-agent/1335
loop1         7:1    0  89.4M  1 loop /snap/core/6818
nvme2n1     259:0    0     8G  0 disk
└─nvme2n1p1 259:3    0     8G  0 part /
nvme1n1     259:1    0 139.7G  0 disk
nvme0n1     259:2    0   500G  0 disk

It's important to note this didn't happen on 2 other Parity machines which were provisioned at the exact same time, so this is very likely an Amazon-side issue which we can probably work around if we decide to tackle it.
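One possible workaround (an assumption on my part, not something the project does today): on Nitro-based instances the EBS volume ID is exposed as the NVMe controller serial number, so the script could locate the disk by volume ID instead of by size, and then mount by filesystem UUID so the mapping survives device renames across reboots. Roughly:

#!/bin/bash
# Sketch: locate an EBS volume by its volume ID rather than by size.
# VOLUME_ID and /mnt/data are hypothetical; the ID would come from
# Terraform output or instance user data.
VOLUME_ID="vol-0123456789abcdef0"

# On Nitro instances, lsblk's SERIAL column carries the EBS volume ID
# (without the dash), e.g. vol0123456789abcdef0.
DEV=$(lsblk -dn -o NAME,SERIAL | awk -v id="${VOLUME_ID/-/}" '$2 == id {print "/dev/" $1}')

# Assuming the filesystem already exists, mount by UUID so /etc/fstab
# stays valid even if the /dev/nvmeXn1 name changes on the next boot.
UUID=$(blkid -s UUID -o value "$DEV")
echo "UUID=$UUID /mnt/data xfs defaults,nofail 0 2" >> /etc/fstab
mount /mnt/data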

Another option is to document how operators of this project can verify this isn't happening after terraform apply, so the node doesn't quickly fill up the 8GB root disk and grind to a halt, becoming a useless box from that point on. The check could be as simple as the following (/mnt/data is again a placeholder for whatever mount point the project uses):
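# Confirm the 500G volume is mounted and not sitting around detached:
lsblk -o NAME,SIZE,MOUNTPOINT        # the 500G row must show a mount point
findmnt -n -o SOURCE,SIZE /mnt/data  # exits non-zero if nothing is mounted there
df -h /mnt/data                      # should report ~500G, not the 8G root disk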
