Ancestors

Toot

Written by Lem453@lemmy.ca on 2024-11-07 at 00:58

Help with ZFS Array

https://lemmy.ca/post/32471131

=> More informations about this toot | More toots from Lem453@lemmy.ca

Descendants

Written by just_another_person@lemmy.world on 2024-11-07 at 01:03

When you say “the drives got renamed”, do you mean you renamed them while the array was online? That sounds like what this means.

In that case, you can find out which drive is the problem, clear it, and repair the array. Should be pretty quick.
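A rough sequence for that (a minimal sketch only; poolname and the device path are placeholders, not commands taken from this thread):

# See which device is faulted and why
zpool status -v poolname

# Clear transient error counters on the pool
zpool clear poolname

# If a device is marked offline/unavailable, try bringing it back
zpool online poolname /dev/nvme1n1p1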

=> More informations about this toot | More toots from just_another_person@lemmy.world

Written by Lem453@lemmy.ca on 2024-11-07 at 02:00

I didn’t rename them. I suspect it happened during a reboot or maybe a bios update that I may have done last month.

How do I clear or repair it?

=> More informations about this toot | More toots from Lem453@lemmy.ca

Written by just_another_person@lemmy.world on 2024-11-07 at 02:25

The device names and aliases in /dev don’t just simply change between reboots. Something else happened here.

What are the path or IDs of the drives that are in there now under /dev/nvme*?

=> More informations about this toot | More toots from just_another_person@lemmy.world

Written by Shdwdrgn@mander.xyz on 2024-11-07 at 02:43

Are you sure about that? Ever hear about those supposedly predictable network names in recent Linux versions? Yeah, those can change too. I was trying to set up a new firewall with two internal NICs plus a 4-port card, and they kept moving around. I finally figured out that if I cold-booted, the NICs would come up in one order, and if I warm-booted they would come up in a completely different order (the ports on the card would reverse the order in which they were detected). This was entirely systemd's fault, because when I installed an older Linux and used udev to map the ports, it worked exactly as predicted. These days I trust nothing.

=> More informations about this toot | More toots from Shdwdrgn@mander.xyz

Written by Lem453@lemmy.ca on 2024-11-07 at 21:10

I may have done a BIOS update around the time it went down, I don’t remember for sure, but I haven’t added or physically changed any hardware in any way. It’s working now with the above suggestions, so thanks!

=> More informations about this toot | More toots from Lem453@lemmy.ca

Written by just_another_person@lemmy.world on 2024-11-07 at 21:26

It really shouldn’t have. It doesn’t make sense that all your other drives were still addressed except for this one.

=> More informations about this toot | More toots from just_another_person@lemmy.world

Written by hendrik on 2024-11-07 at 01:13

I don't know anything about ZFS, but in the future you might want to address them by /dev/disk/by-uuid/... and not by /dev/nvme...
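For reference, the stable names live under /dev/disk/ and are just symlinks back to the kernel device names (a sketch; the exact entries differ per system):

# Hardware-based identifiers (serial / EUI) for NVMe devices
ls -l /dev/disk/by-id/ | grep nvme

# Filesystem/partition UUIDs
ls -l /dev/disk/by-uuid/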

=> More informations about this toot | More toots from hendrik@palaver.p3x.de

Written by Lem453@lemmy.ca on 2024-11-07 at 01:58

Is there a way to change this on an existing zpool?

=> More informations about this toot | More toots from Lem453@lemmy.ca

Written by Shdwdrgn@mander.xyz on 2024-11-07 at 02:36

OP – if your array is in good condition (and it looks like it is) you have an option to replace drives one by one, but this will take some time (probably over a period of days). The idea is to remove a disk from the pool by its old name, then re-add the disk under the corrected name, wait for the pool to rebuild, then do the process again with the next drive. Double-check, but I think this is the proper procedure…

zpool offline poolname /dev/nvme1n1p1
zpool replace poolname /dev/nvme1n1p1 /dev/disk/by-id/drivename

Check zpool status to confirm when the drive is done rebuilding under the new name, then move on to the next drive. This is the process I use when replacing a failed drive in a pool, and since that one drive is technically in a failed state right now, this same process should work for you to transfer over to the safe names. Keep in mind that this will probably put a lot of strain on your drives since the contents have to be rebuilt (although there is a small possibility zfs may recognize the drive contents and just start working immediately?), so be prepared in case a drive does actually fail during the process.
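Spelled out per drive, with the same placeholder names as above, the cycle would look roughly like this (a sketch; do one disk at a time and only move on once the resilver completes):

# 1. Take the disk out of the pool under its old kernel name
zpool offline poolname /dev/nvme1n1p1

# 2. Re-add the same physical disk under its stable by-id name
zpool replace poolname /dev/nvme1n1p1 /dev/disk/by-id/drivename

# 3. Wait for the resilver to finish before starting the next disk
zpool status poolname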

=> More informations about this toot | More toots from Shdwdrgn@mander.xyz

Written by Lem453@lemmy.ca on 2024-11-07 at 21:09

Thanks for this! Luckily the above suggestion to export and import worked right away so this was not needed.

=> More informations about this toot | More toots from Lem453@lemmy.ca

Written by Shdwdrgn@mander.xyz on 2024-11-08 at 02:34

Yeah I figured there would be multiple answers for you. Just keep in mind that you DO want to get it fixed at some point to use the disk id instead of the local device name. That will allow you to change hardware or move the whole array to another computer.

=> More informations about this toot | More toots from Shdwdrgn@mander.xyz

Written by qupada on 2024-11-07 at 04:46

Generally, you just need to export the pool with zpool export zfspool1, then import again with zpool import -d /dev/disk/by-id zfspool1.

I believe it should stick after that.

Whether that will apply in its current degraded state I couldn't say.
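In full, that sequence would be something like this (a sketch, using the zfspool1 name from this thread; the pool needs to be idle while exported):

# Cleanly detach the pool
zpool export zfspool1

# Re-import it, scanning /dev/disk/by-id for the member devices
zpool import -d /dev/disk/by-id zfspool1

# Confirm the vdevs now appear under their by-id names
zpool status zfspool1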

=> More informations about this toot | More toots from qupada@fedia.io

Written by Lem453@lemmy.ca on 2024-11-07 at 21:09

Thanks, this worked. I made the ZFS array in the Proxmox GUI and it used the nvmeX names by default. Interestingly, when I did the export, nothing seemed to happen. I then tried zpool import and it said there were no pools available to import, but when I ran zpool status it showed the array up and working, with all 4 drives healthy and now using device IDs. Odd, but it seems to be working correctly now.

  pool: zfspool1
 state: ONLINE
  scan: resilvered 8.15G in 00:00:21 with 0 errors on Thu Nov 7 12:51:45 2024
config:

    NAME                                                                                 STATE     READ WRITE CKSUM
    zfspool1                                                                             ONLINE       0     0     0
      raidz1-0                                                                           ONLINE       0     0     0
        nvme-eui.000000000000000100a07519e22028d6-part1                                  ONLINE       0     0     0
        nvme-nvme.c0a9-313932384532313335343130-435431303030503153534438-00000001-part1  ONLINE       0     0     0
        nvme-eui.000000000000000100a07519e21fffff-part1                                  ONLINE       0     0     0
        nvme-eui.000000000000000100a07519e21e4b6a-part1                                  ONLINE       0     0     0

errors: No known data errors

Any idea how to identify the physical drive from the device ID? The IDs don’t seem to match up with the serial numbers in any way.
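Not answered in the thread, but one possible way to map those IDs back to physical drives (assuming the nvme-cli and smartmontools packages are available) is to follow the by-id symlinks to the kernel devices and read the serials from there:

# Each by-id entry is a symlink to a kernel device (nvme0n1, nvme1n1, ...)
ls -l /dev/disk/by-id/nvme-eui.*

# List all NVMe controllers with their model and serial numbers
nvme list

# Or query a single device
smartctl -i /dev/nvme0n1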

=> More informations about this toot | More toots from Lem453@lemmy.ca

Written by hendrik on 2024-11-08 at 09:45

Strange. Okay, hope that spares you from similar troubles in the future.

=> More informations about this toot | More toots from hendrik@palaver.p3x.de

Written by Shdwdrgn@mander.xyz on 2024-11-07 at 02:25

That is definitely true of ZFS as well. In fact I have never seen a guide which suggests anything other than using the names found under /dev/disk/by-id/ or /dev/disk/by-uuid/, and that is to prevent this very problem. If the proper convention is used then you can plug the drives in through any available interface, in any order, and ZFS will easily re-assemble the pool at boot.

So now this raises the question… is Proxmox using some insane configuration that creates pools using whatever name the drives happen to boot up with???

=> More informations about this toot | More toots from Shdwdrgn@mander.xyz

Written by Lem453@lemmy.ca on 2024-11-07 at 21:11

Thanks! I’ve got it set up by IDs now. I originally set it up via the Proxmox GUI and it defaulted to the nvmeX names.

=> More informations about this toot | More toots from Lem453@lemmy.ca

Written by Possibly linux on 2024-11-08 at 04:47

I don’t believe this is the case

=> More informations about this toot | More toots from possiblylinux127@lemmy.zip

Written by hendrik on 2024-11-08 at 09:45

Care to explain?

=> More informations about this toot | More toots from hendrik@palaver.p3x.de

Written by Possibly linux on 2024-11-08 at 16:01

I believe ZFS is smart enough to automatically find the disk on the system as it looks at all the other information like the disk id. It shouldn’t just lose a drive.

zpool just shows the original path of the disk when it was added. Behind the scenes ZFS knows your drives.

What is the output of lsblk? Any missing drives?
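For example, something like the following would show whether any drive has dropped off the system (a sketch; the column selection is just a convenient choice):

# Block devices with model and serial, so a missing NVMe drive stands out
lsblk -o NAME,MODEL,SERIAL,SIZE,TYPE,MOUNTPOINT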

=> More informations about this toot | More toots from possiblylinux127@lemmy.zip

Written by hendrik on 2024-11-08 at 16:42

Fair enough. Judging by OP's later comments, the pool is online again.

=> More informations about this toot | More toots from hendrik@palaver.p3x.de

Written by yeehaw on 2024-11-07 at 04:12

Weird. I suspect your disk is dead, but in case it’s not, I’d mark that disk for replacement, then “replace” it at the software level and let the resilver run to see what happens.

I have no idea how to do this in Proxmox or what the commands are, but I know how to do it in TrueNAS with the GUI.

=> More informations about this toot | More toots from cyberpunk007@lemmy.ca

Written by Possibly linux on 2024-11-08 at 04:47

ZFS is aware of the physical disks, so it won’t randomly start using a different disk.

The disk is no longer working. There is a hardware fault somewhere.

=> More informations about this toot | More toots from possiblylinux127@lemmy.zip
