Checking Raid
There is an old data server with lots of hard drives in it, and I needed to check on it to see how the raid was doing and how it was configured.
Check the raid daemon to see if it is running
$ sudo /etc/init.d/mdadm status
[ ok ] mdadm is running.
That's good
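This box still uses an init script. On a newer system running systemd, the same check would probably look something like the following, assuming the monitor service is named mdmonitor (as it is on recent Debian):
$ sudo systemctl status mdmonitor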
Check for raid devices
Check how the raid devices are configured
$ cat /etc/mdadm/mdadm.conf
# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#
# by default (built-in), scan all partitions (/proc/partitions) and all
# containers for MD superblocks. alternatively, specify devices to scan, using
# wildcards if desired.
#DEVICE partitions containers
# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes
# automatically tag new arrays as belonging to the local system
HOMEHOST <system>
# instruct the monitoring daemon where to send mail alerts
MAILADDR root
# definitions of existing MD arrays
# This file was auto-generated on Sat, 18 Jan 2014 16:16:30 -0800
# by mkconf 3.2.5-5
ARRAY /dev/md0 metadata=1.2 name=RaidZero:0 UUID=9eabb82e:90fa6184:64166a99:a695ec78
ARRAY /dev/md1 metadata=1.2 name=RaidOne:1 UUID=1d572ab6:81ce3b79:9f4fa4d5:ff1c1452
There are two devices: /dev/md0 and /dev/md1
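To double-check that the config file matches what the kernel has actually assembled, mdadm can print ARRAY lines from the live arrays. I didn't capture the output here, but it should agree with the two ARRAY lines in mdadm.conf above:
$ sudo mdadm --detail --scan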
Query the device information
$ sudo mdadm --query /dev/md0
/dev/md0: 48436.68GiB raid6 15 devices, 0 spares. Use mdadm --detail for more detail.
$ sudo mdadm --query /dev/md1
/dev/md1: 48436.68GiB raid6 15 devices, 0 spares. Use mdadm --detail for more detail.
md0 has 15 drives
md1 has 15 drives
No spares, so that's kinda bad. However, since these are raid6 arrays, each one can lose up to two drives before any data is lost.
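If a free drive were available, adding it as a hot spare is a one-liner. A sketch only, since there is no spare disk in this box and /dev/sdX is a hypothetical device name:
$ sudo mdadm /dev/md0 --add /dev/sdX   # hypothetical /dev/sdX; it becomes a spare because the array is not degraded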
$ sudo mdadm --detail /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Sun Jan 19 00:46:49 2014
Raid Level : raid6
Array Size : 50789535888 (48436.68 GiB 52008.48 GB)
Used Dev Size : 3906887376 (3725.90 GiB 4000.65 GB)
Raid Devices : 15
Total Devices : 15
Persistence : Superblock is persistent
Update Time : Tue Dec 20 23:36:06 2022
State : clean
Active Devices : 15
Working Devices : 15
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 16K
Name : RaidZero:0
UUID : 9eabb82e:90fa6184:64166a99:a695ec78
Events : 615
Number Major Minor RaidDevice State
0 8 0 0 active sync /dev/sda
1 8 16 1 active sync /dev/sdb
2 8 32 2 active sync /dev/sdc
3 8 48 3 active sync /dev/sdd
4 8 64 4 active sync /dev/sde
5 8 80 5 active sync /dev/sdf
6 8 96 6 active sync /dev/sdg
7 8 112 7 active sync /dev/sdh
8 8 128 8 active sync /dev/sdi
9 8 144 9 active sync /dev/sdj
10 8 160 10 active sync /dev/sdk
11 8 176 11 active sync /dev/sdl
12 8 192 12 active sync /dev/sdm
13 8 208 13 active sync /dev/sdn
14 8 224 14 active sync /dev/sdo
Everything looks good!
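As a sanity check, the sizes line up with raid6 reserving two drives' worth of space for parity: 15 drives minus 2 for parity leaves 13 data drives, and 13 times the Used Dev Size (in blocks) matches the Array Size exactly:
$ echo $(( 13 * 3906887376 ))
50789535888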
$ sudo mdadm --detail /dev/md1
/dev/md1:
Version : 1.2
Creation Time : Sat Dec 13 21:47:25 2014
Raid Level : raid6
Array Size : 50789535888 (48436.68 GiB 52008.48 GB)
Used Dev Size : 3906887376 (3725.90 GiB 4000.65 GB)
Raid Devices : 15
Total Devices : 15
Persistence : Superblock is persistent
Update Time : Tue Dec 20 23:36:12 2022
State : clean
Active Devices : 15
Working Devices : 15
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 16K
Name : RaidOne:1
UUID : 1d572ab6:81ce3b79:9f4fa4d5:ff1c1452
Events : 766
Number Major Minor RaidDevice State
0 8 240 0 active sync /dev/sdp
1 65 0 1 active sync /dev/sdq
2 65 16 2 active sync /dev/sdr
3 65 32 3 active sync /dev/sds
4 65 48 4 active sync /dev/sdt
5 65 64 5 active sync /dev/sdu
6 65 80 6 active sync /dev/sdv
7 65 96 7 active sync /dev/sdw
8 65 112 8 active sync /dev/sdx
9 65 128 9 active sync /dev/sdy
10 65 144 10 active sync /dev/sdz
11 65 160 11 active sync /dev/sdaa
12 65 176 12 active sync /dev/sdab
13 65 192 13 active sync /dev/sdac
14 65 208 14 active sync /dev/sdad
Everything looks good!
Check raid health
$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md1 : active raid6 sdp[0] sdad[14] sdac[13] sdab[12] sdaa[11] sdz[10] sdy[9] sdx[8] sdw[7] sdv[6] sdu[5] sdt[4] sds[3] sdr[2] sdq[1]
50789535888 blocks super 1.2 level 6, 16k chunk, algorithm 2 [15/15] [UUUUUUUUUUUUUUU]
md0 : active raid6 sda[0] sdo[14] sdn[13] sdm[12] sdl[11] sdk[10] sdj[9] sdi[8] sdh[7] sdg[6] sdf[5] sde[4] sdd[3] sdc[2] sdb[1]
50789535888 blocks super 1.2 level 6, 16k chunk, algorithm 2 [15/15] [UUUUUUUUUUUUUUU]
unused devices: <none>
md0 has 15 drives, all of them are up
md1 has 15 drives, all of them are up
(The [15/15] [UUUUUUUUUUUUUUU] part means all 15 members are present and in sync.)
That's good
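For an even stronger health check, md can run a full consistency check (scrub) in the background. I didn't run one on this box; this is just a sketch, assuming the usual md sysfs interface:
$ echo check | sudo tee /sys/block/md0/md/sync_action
$ cat /proc/mdstat
$ cat /sys/block/md0/md/mismatch_cnt
Progress shows up in /proc/mdstat while the check runs, and mismatch_cnt should ideally be 0 when it finishes.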
Check what block devices sit on top of the raid devices
$ lsblk /dev/md0
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
md0 9:0 0 47.3T 0 raid6
|-raidlvm-lv0 (dm-0) 254:0 0 47.3T 0 lvm /mnt/raid
`-raidlvm-lv1 (dm-1) 254:1 0 47.3T 0 lvm /mnt/raid1
$ lsblk /dev/md1
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
md1 9:1 0 47.3T 0 raid6
`-raidlvm-lv1 (dm-1) 254:1 0 47.3T 0 lvm /mnt/raid1
So md0 has two logical volumes and md1 has one logical volume, but the same device (dm-1) shows up under both md devices? That's really weird. There is this note in the fstab file:
$ cat /etc/fstab
...
/dev/raidlvm/lv0 /mnt/raid ext4 defaults 1 2
/mnt/raid/imagery /nfs/folder1 none bind 0 0
# moved to raid1 /mnt/raid/folder1 /nfs/folder1 none bind 0 0
# 2nd raid
/dev/raidlvm/lv1 /mnt/folder1 ext4 defaults 1 2
/mnt/raid1/backups /nfs/folder2 none bind 0 0
/mnt/raid1/imagery /nfs/folder3 none bind 0 0
I don't know who moved this to raid1, or why. It doesn't seem possible that there was once a larger array of disks and some were taken out, and it also doesn't make sense that the md0 device shows a logical volume that lives on the md1 device.
My guess is that there was originally one raid and a second was created later. If they moved the data over, they only had to move the mount point rather than keep it in lv0. But perhaps it's just left over and needs a reboot.
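The LVM tools should settle this, since they can show which physical volumes each logical volume actually has extents on. I haven't run these on this box yet; a sketch, assuming the volume group is called raidlvm as lsblk suggests:
$ sudo pvs
$ sudo vgs
$ sudo lvs -o +devices raidlvm
If lvs shows lv1 with extents on both /dev/md0 and /dev/md1, that would explain the lsblk output: the volume group spans both arrays, and lsblk lists the logical volume under every physical volume it touches.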