Reducing raid 5 disks with mdadm

I currently have a 4-disk raid 5 array that’s just over 50% full, so I want to remove one of the drives. There’s a lot of information scattered around on Google relating to this, and some of it is unfortunately outdated. mdadm can now shrink arrays and remove drives; you just need to do the steps in the right order.

First and foremost, unmount the array. Once we’ve got the array shrunk and it’s reshaping we can mount it again, although the reshape will be slower while the array is in use.

I’m wanting to reshape /dev/md1 so it’s time to unmount it.

# umount /dev/md1
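
Before shrinking anything it’s worth confirming the unmount really took effect. The following is a minimal sketch, not part of the original procedure: is_mounted is a hypothetical helper name, and the optional second argument only exists so the function can be tried against a sample file instead of the live /proc/mounts.

```shell
# Return 0 if the given device appears as a mount source in a mounts table.
# The table defaults to /proc/mounts; the second argument is an assumption
# added purely so the check can be exercised without a real array.
is_mounted() {
    dev=$1
    table=${2:-/proc/mounts}
    # Field 1 of each line is the mounted device; compare it exactly.
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$table"
}
```

Running is_mounted /dev/md1 should return non-zero after a successful umount. Note that it compares the device name literally, so an array mounted under another name (a symlink, for example) would not be detected.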

As there’s a filesystem directly on the array, it needs to be shrunk first, before the array itself. This is not LVM, and I wouldn’t recommend following these steps as-is with LVM on raid.

We’re going to resize to the smallest possible size (-M) and show progress (-p).

# resize2fs -p -M /dev/md1 
resize2fs 1.43.3 (04-Sep-2016)
Please run 'e2fsck -f /dev/md1' first.

Oops, it’s a good job resize2fs has my back here. Let’s check the filesystem then.

# fsck -f /dev/md1
fsck from util-linux 2.28.2
e2fsck 1.43.3 (04-Sep-2016)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information

/dev/md1: ***** FILE SYSTEM WAS MODIFIED *****
/dev/md1: 3149/183148544 files (0.1% non-contiguous), 355795893/732571008 blocks

OK let’s try the resize again.

# resize2fs -pM /dev/md1 
resize2fs 1.43.3 (04-Sep-2016)
Resizing the filesystem on /dev/md1 to 351199179 (4k) blocks.
Begin pass 2 (max = 178767390)
Relocating blocks             XXXXXXXXXXXXXXX-------------------------
Begin pass 3 (max = 22357)
Scanning inode table          XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Begin pass 4 (max = 131)
Updating inode references     XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
The filesystem on /dev/md1 is now 351199179 (4k) blocks long.

That took quite a while, about 8 hours for the size of this filesystem.
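
The check-then-shrink sequence above can be wrapped up so the resize only runs when the check came back acceptable. This is a sketch under my own assumptions: run, shrink_to_minimum and the DRY_RUN variable are hypothetical names, and with DRY_RUN=1 the commands are only printed, which is a safe way to see what would happen.

```shell
# Execute a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Force a filesystem check, then shrink to the minimum size with progress.
shrink_to_minimum() {
    dev=$1
    status=0
    run e2fsck -f "$dev" || status=$?
    # e2fsck exits 0 when the filesystem is clean and 1 when it corrected
    # errors; anything higher means it needs attention before resizing.
    if [ "$status" -gt 1 ]; then
        echo "e2fsck exited with $status, not resizing $dev" >&2
        return 1
    fi
    run resize2fs -p -M "$dev"
}
```

Setting DRY_RUN=1 and calling shrink_to_minimum /dev/md1 prints the two commands used in this post without touching the array.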

Let’s check the raid array status before we start on this.

# cat /proc/mdstat 
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] 
md1 : active raid5 sdl1[3] sdc1[0] sdb1[1] sda1[2]
      2930284032 blocks level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 0/8 pages [0KB], 65536KB chunk

unused devices: <none>

As you can see, there are 4 devices in md1 and I want to reduce that to 3.

Fortunately mdadm has my back too and tells me I need to reduce the array size first so that it fits on 3 drives.

# mdadm --grow -n3 /dev/md1
mdadm: this change will reduce the size of the array.
       use --grow --array-size first to truncate array.
       e.g. mdadm --grow /dev/md1 --array-size 1953522688
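
The size mdadm suggests isn’t arbitrary. In raid 5 one device’s worth of space goes to parity, so the usable size is the per-device size multiplied by the number of devices minus one. A small sketch of that arithmetic follows; raid5_array_kib is a hypothetical helper name, and 976761344 is the per-device size in KiB that mdadm reports for this array (it shows up as "Used Dev Size" further down).

```shell
# Usable size of a raid 5 array in KiB: (devices - 1) * per-device size.
raid5_array_kib() {
    devices=$1
    dev_kib=$2
    echo $(( (devices - 1) * dev_kib ))
}

raid5_array_kib 4 976761344   # current 4-disk array: 2930284032
raid5_array_kib 3 976761344   # target 3-disk array:  1953522688
```

Both results match the block counts shown in /proc/mdstat before and after the change.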

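So we reduce the size of the array, using the value mdadm suggested. This truncates the array as the filesystem sees it, so the filesystem must already be smaller than the new size, which is why it was shrunk first.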

# mdadm --grow /dev/md1 --array-size 1953522688
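
The truncation above is only safe if the shrunk filesystem fits inside the new array size. Here’s a minimal sketch of that check, using the numbers from this post: resize2fs reported 351199179 blocks of 4k, and the new array size is 1953522688 KiB. fs_fits_in_array is a hypothetical helper name.

```shell
# Return 0 if a filesystem of fs_blocks blocks (block_bytes each) fits
# inside an array of array_kib KiB. Assumes the block size is a multiple
# of 1024 bytes, which holds for ext2/3/4 block sizes.
fs_fits_in_array() {
    fs_blocks=$1
    block_bytes=$2
    array_kib=$3
    fs_kib=$(( fs_blocks * (block_bytes / 1024) ))
    [ "$fs_kib" -le "$array_kib" ]
}
```

Here fs_fits_in_array 351199179 4096 1953522688 succeeds (1404796716 KiB fits), whereas the original 732571008-block filesystem would not have fitted. The block count and block size can be read from the resize2fs output as above, or from dumpe2fs -h.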

And check the status again

# cat /proc/mdstat 
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] 
md1 : active raid5 sdl1[3] sdc1[0] sdb1[1] sda1[2]
      1953522688 blocks level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 0/8 pages [0KB], 65536KB chunk

unused devices: <none>

All looks good so we’ll continue.

Now mdadm should start doing the reshape

# mdadm --grow -n3 /dev/md1
mdadm: Need to backup 3072K of critical section..
mdadm: /dev/md1: Cannot grow - need backup-file
mdadm:  Please provide one with "--backup=..."

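And again it’s being helpful, so we run it again with the backup option (--backup is accepted as an abbreviation of --backup-file). It only needs ~3M, so I’ll put the file on the root partition; it must live somewhere other than the array being reshaped.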

# mdadm --grow -n3 /dev/md1 --backup=/mdadm_temp
mdadm: Need to backup 3072K of critical section..

And that’s it, the raid array’s reshaping now.

# cat /proc/mdstat 
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] 
md1 : active raid5 sdl1[3] sdc1[0] sdb1[1] sda1[2]
      1953522688 blocks super 0.91 level 5, 512k chunk, algorithm 2 [3/3] [UUU]
      [>....................]  reshape =  0.0% (124856/976761344) finish=651.7min speed=24971K/sec
      bitmap: 0/8 pages [0KB], 65536KB chunk

unused devices: <none>

That’s all the critical stuff done, the reshape will run in the background now.

As we shrank the filesystem to the smallest possible size, it’s going to be 100% full if mounted now. Checking the reshaping array, it’s now 1.8 TiB.

# fdisk -l /dev/md1
Disk /dev/md1: 1.8 TiB, 2000407232512 bytes, 3907045376 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 524288 bytes / 1572864 bytes

Grow the filesystem to fill the array now. With no size argument, resize2fs expands the filesystem to the full size of the device.

# resize2fs /dev/md1

I mounted the filesystem again and restarted the services I’d stopped to prevent reads and writes.

Using iostat, it looks like the data is being read from all 4 drives and written back mostly to the first 3. Bear in mind that iostat run without an interval reports averages since boot, so this is only a rough indication. When complete I expect sda, sdb and sdc to be the data drives and sdl to have become a spare (S).

# iostat | grep -E "(Device|sd[lcba])"
Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda              14.91      2597.39      3573.86  477994476  657692778
sdb              15.17      2609.44      3561.04  480212396  655333618
sdc              15.42      2674.85      3519.70  492249562  647726130
sdl              12.55      2542.55      1495.80  467901643  275270542

The next step is to wait for the reshape to complete.
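
Rather than re-running cat by hand, the progress can be pulled out of /proc/mdstat with a little sed. This is a sketch with hypothetical function names (reshape_percent, wait_for_reshape); the loop looks for a reshape line from any array on the system, not just md1.

```shell
# Print the reshape percentage found in mdstat-style text on stdin,
# or nothing if no reshape line is present.
reshape_percent() {
    sed -n 's/.*reshape = *\([0-9.]*\)%.*/\1/p'
}

# Poll /proc/mdstat once a minute until no array reports a reshape.
wait_for_reshape() {
    while grep -q 'reshape =' /proc/mdstat; do
        echo "reshape at $(reshape_percent < /proc/mdstat)%"
        sleep 60
    done
}
```

mdadm also has a --wait option that blocks until resync, recovery or reshape activity on the given array has finished, which may be all you need if you don’t care about the percentage.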

Now that it’s complete, we’re left with sda, sdb and sdc in the array and sdl as a spare, as expected.

# cat /proc/mdstat 
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] 
md1 : active raid5 sdl1[3](S) sdc1[0] sdb1[1] sda1[2]
      1953522688 blocks level 5, 512k chunk, algorithm 2 [3/3] [UUU]
      bitmap: 0/8 pages [0KB], 65536KB chunk

unused devices: <none>
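
If you’d rather not pick the (S) marker out by eye, the spare members can be listed from the same output. A minimal sketch, with list_spares as a hypothetical helper name, reading mdstat-style text on stdin:

```shell
# Print the members marked "(S)" (spare) for one array.
list_spares() {
    array=$1
    awk -v a="$array" '$1 == a {
        # Members look like "sdl1[3](S)"; strip everything from "[" on.
        for (i = 3; i <= NF; i++)
            if ($i ~ /\(S\)$/) { sub(/\[.*/, "", $i); print $i }
    }'
}
```

Feeding it /proc/mdstat, as in list_spares md1 < /proc/mdstat, prints sdl1 for the array shown above.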

I want to repurpose this drive, so it needs removing from the array.

# mdadm /dev/md1 --remove /dev/sdl1
mdadm: hot removed /dev/sdl1 from /dev/md1

and checking the array again shows

# cat /proc/mdstat 
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty] 
md1 : active raid5 sdc1[0] sdb1[1] sda1[2]
      1953522688 blocks level 5, 512k chunk, algorithm 2 [3/3] [UUU]
      bitmap: 0/8 pages [0KB], 65536KB chunk

unused devices: <none>

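As this drive has been a spare in the above raid array, we need to do a little more clean-up.

Checking what mdadm knows about the drive gives us this. Note that I’m examining the whole disk (/dev/sdl) rather than the member partition (/dev/sdl1); version 0.90 metadata is stored at the end of the device and this partition runs to the end of the disk, so the same superblock is found either way.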

# mdadm -E /dev/sdl
/dev/sdl:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 1552d15b:d0c81835:fe0db9aa:2a156f8c (local to host share)
  Creation Time : Fri May 22 19:57:22 2015
     Raid Level : raid5
  Used Dev Size : 976761344 (931.51 GiB 1000.20 GB)
     Array Size : 1953522688 (1863.02 GiB 2000.41 GB)
   Raid Devices : 3
  Total Devices : 4
Preferred Minor : 1

    Update Time : Sun Mar 12 02:58:58 2017
          State : clean
Internal Bitmap : present
 Active Devices : 3
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 1
       Checksum : 9fcf2ce6 - correct
         Events : 45093

         Layout : left-symmetric
     Chunk Size : 512K

      Number   Major   Minor   RaidDevice State
this     3       8      177        3      spare   /dev/sdl1

   0     0       8       33        0      active sync   /dev/sdc1
   1     1       8       17        1      active sync   /dev/sdb1
   2     2       8        1        2      active sync   /dev/sda1
   3     3       8      177        3      spare   /dev/sdl1

As you can see, it knows this drive was a spare in the set containing sd[a-c]1. Leaving this data intact may confuse things later on if we need to rebuild the array, so we’ll clean it.

# mdadm --zero-superblock /dev/sdl

There’s no output, so let’s run the examine (-E) command again.

# mdadm -E /dev/sdl
/dev/sdl:
   MBR Magic : aa55
Partition[0] :   1953523120 sectors at         2048 (type fd)

So now the only information is that the 1st partition is of type fd (Linux raid autodetect).
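
To check for a leftover superblock without reading the output by eye, you can look for the md magic number (a92b4efc, as shown in the first examine output) in whatever mdadm -E prints. A minimal sketch, with has_md_superblock as a hypothetical helper name:

```shell
# Return 0 if mdadm -E style text on stdin contains the md magic number.
# The "MBR Magic" line printed for a plain partition table does not match.
has_md_superblock() {
    grep -Eq '^[[:space:]]*Magic[[:space:]]*:[[:space:]]*a92b4efc'
}
```

Piping mdadm -E /dev/sdl into has_md_superblock should succeed before the --zero-superblock step and fail after it.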

I’m wanting to use this drive as a single entity in the current machine, so I want to change the partition type to a standard Linux partition.

# fdisk -l /dev/sdl
Disk /dev/sdl: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x00000000

Device     Boot Start        End    Sectors   Size Id Type
/dev/sdl1        2048 1953525167 1953523120 931.5G fd Linux raid autodetect

I could use fdisk to interactively change the type of the 1st partition to Linux (83) but that doesn’t copy/paste too well so I’ll use sfdisk and some sed magic.

# sfdisk -d /dev/sdl | sed "s/type=fd/type=83/g" | sfdisk /dev/sdl
Checking that no-one is using this disk right now ... OK

Disk /dev/sdl: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x00000000

Old situation:

Device     Boot Start        End    Sectors   Size Id Type
/dev/sdl1        2048 1953525167 1953523120 931.5G fd Linux raid autodetect

>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Created a new DOS disklabel with disk identifier 0x00000000.
Created a new partition 1 of type 'Linux' and of size 931.5 GiB.
/dev/sdl2: 
New situation:

Device     Boot Start        End    Sectors   Size Id Type
/dev/sdl1        2048 1953525167 1953523120 931.5G 83 Linux

The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
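
Since the pipeline above rewrites the partition table in one go, it can be worth trying the sed step on its own first. The sketch below wraps it in a hypothetical function, retype_raid_to_linux, and restricts the substitution to lines that describe a partition. The sample line in the comment is my assumption of what the sfdisk -d dump looks like for this disk; check it against your own dump.

```shell
# Rewrite the partition type in sfdisk dump text on stdin from fd
# (Linux raid autodetect) to 83 (Linux). Only lines containing "start="
# are touched, so header lines such as "label: dos" pass through as-is.
# Assumed input line:
#   /dev/sdl1 : start=        2048, size=  1953523120, type=fd
retype_raid_to_linux() {
    sed '/start=/ s/type=fd/type=83/'
}
```

A cautious way to use it is to save the dump to a file with sfdisk -d /dev/sdl, keep that file as a backup, inspect the output of retype_raid_to_linux on it, and only then feed the result to sfdisk /dev/sdl. Recent sfdisk versions also document a --part-type option that changes a single partition’s type directly, which may be simpler if your version has it.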

As you can see this is now a Linux (type 83) partition and I can continue with putting a filesystem on this and mounting it somewhere on the system.

If I were moving this drive to another system, I’d need to be able to identify which physical drive to remove. Fortunately the drive supports SMART, so I can use smartctl to pull its identifying information.

# smartctl -i /dev/sdl
smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.9.6-gentoo-r1] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital RE3 Serial ATA
Device Model:     WDC WD1002FBYS-18A6B0
Serial Number:    WD-WMATV1223999
LU WWN Device Id: 5 0014ee 0567a7731
Add. Product Id:  DELL
Firmware Version: 03.00C06
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 2.5, 3.0 Gb/s
Local Time is:    Sun Mar 12 12:47:53 2017 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

As you can see the drive we’ve been working with is a Dell branded Western Digital drive and we’ve even got the serial number, so once powered down we can easily identify the physical drive for disconnecting and removal.
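
If you’ve several drives to label, the serial number can be extracted from the smartctl output rather than copied by hand. A minimal sketch, with smart_serial as a hypothetical helper name, reading smartctl -i output on stdin:

```shell
# Print the value of the "Serial Number:" field from smartctl -i output.
smart_serial() {
    sed -n 's/^Serial Number:[[:space:]]*//p'
}
```

For the drive above, smartctl -i /dev/sdl piped into smart_serial prints WD-WMATV1223999, which can be matched against the label on the physical drive.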