#7 Linux LVM, RAID, failed drive recovery 2021

2021-03-05, by René Rebe

In the previous SSD-cached LVM/RAID notes some time ago (in a galaxy far, far away) we set up a modern, mdadm-less Linux software RAID.

However, in the meantime a drive may have failed, and as I found the error recovery a bit tricky, let's document exactly which commands are needed to recover the LVM RAID from such an error:

RAID5 failed / missing drive

pvcreate /dev/sd*new
vgextend vg0 /dev/sd*new
lvconvert --repair vg0/lv
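To watch the rebuild after the repair, the RAID sync progress can be queried like this (a sketch; vg0/lv as above, device names are placeholders):

```shell
# Show the RAID LV, its sync progress and the backing devices;
# the internal rimage/rmeta sub-LVs only show up with -a.
lvs -a -o name,segtype,sync_percent,devices vg0

# The kernel device-mapper view of the same rebuild:
dmsetup status vg0-lv
```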

As this still leaves the old drive's UUID listed as missing, you might want to clean that up with:

vgreduce --removemissing vg0
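Whether a PV is actually missing can be inspected before (and after) this cleanup; a failed drive shows up as [unknown] in the pvs output. A sketch, assuming a reasonably recent lvm2 that supports the vg_missing_pv_count report field:

```shell
# List the PVs; a failed drive appears as [unknown]:
pvs -o pv_name,vg_name,pv_size

# Missing-PV count per VG (should be 0 after the cleanup):
vgs -o vg_name,vg_missing_pv_count
```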

Easier: replacing an online drive

If you notice an aging, degrading, or already partially failed drive, you can replace it a bit more easily while it is still online:

pvcreate /dev/sd*new
vgextend vg0 /dev/sd*new
lvconvert --replace /dev/sd*old vg0/lv /dev/sd*new
vgreduce vg0 /dev/sd*old
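Put together, an online replacement might look like this (a sketch with hypothetical device names: /dev/sdb as the old drive, /dev/sde as the new one):

```shell
pvcreate /dev/sde                              # initialize the new drive as a PV
vgextend vg0 /dev/sde                          # add it to the volume group
lvconvert --replace /dev/sdb vg0/lv /dev/sde   # migrate the RAID leg to the new PV
lvs -a -o name,sync_percent,devices vg0        # wait until the sync reaches 100%
vgreduce vg0 /dev/sdb                          # then drop the old drive from the VG
pvremove /dev/sdb                              # and optionally wipe its PV label
```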

Added bonus: periodically scrub the whole thing to prevent bit rot!!1!

As magnetically or otherwise stored data bits might decay and thus "bit rot" after some time, some consider it a good idea to periodically scrub and refresh them. This can also help to catch drives going bad earlier:

lvchange --syncaction check vg0/lv
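To make this periodic, the check can be run from cron, and the result inspected afterwards. A sketch; the schedule is just an example:

```shell
# e.g. in /etc/cron.d/lvm-scrub: start a scrub every Sunday at 3am
# 0 3 * * 0  root  lvchange --syncaction check vg0/lv

# After the check has run, query its state and result;
# a non-zero mismatch count warrants a closer look at the drives:
lvs -o name,raid_sync_action,raid_mismatch_count vg0/lv
```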

The Author

René Rebe studied computer science and digital media science at the University of Applied Sciences of Berlin, Germany. He is the founder of the T2 Linux SDE (System Development Environment), and a contributor to various projects in the open source landscape for more than 10 years now. He also founded the Berlin-based software company ExactCODE GmbH, a company dedicated to exact software solutions that just work, every day.