distraction in action

Like my work? Check out HexaLex, my game for iPhone & iPod Touch. It's a crossword game like Scrabble, but played with hexagonal tiles. http://www.hexalex.com

(Note: I’ve gone back and forth about posting this article. I really am not an expert in this area and I don’t want to advise people to do something stupid. OTOH, this is something that I’ve never seen described before. I’ve decided to post it, but it’s probably best treated as a curiosity, not a data storage strategy.)

Let’s do a little Linux magic. We’re going to create a RAID 1 (mirrored) pair, transform it to a 2-disk RAID 5 array (you know, the kind that “requires” 3 or more disks), then grow it to a 3-disk RAID 5 array.

Just so nobody gets too confused, I’m using a fairly fresh install of Ubuntu 6.06 Dapper Drake, which includes EVMS 2.5.4.

First let’s make some 10MB “disks” to play with.

$ mkdir disks
$ cd disks
$ dd if=/dev/zero of=img0 bs=1M count=10
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.734714 seconds, 14.3 MB/s
$ # These produce similar output:
$ dd if=/dev/zero of=img1 bs=1M count=10    
$ dd if=/dev/zero of=img2 bs=1M count=10
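
If you expect to repeat the experiment, the three dd commands above can be wrapped in a small loop. This is only a sketch: make_images is a made-up helper name, and it assumes GNU dd (for bs=1M) and mktemp are available.

```shell
#!/bin/sh
# make_images DIR COUNT SIZE_MB
# Hypothetical helper (not part of any tool): creates DIR/img0 .. DIR/img(COUNT-1),
# each SIZE_MB megabytes of zeros. Assumes GNU dd, which understands bs=1M.
make_images() {
    dir=$1 count=$2 size_mb=$3
    mkdir -p "$dir" || return 1
    i=0
    while [ "$i" -lt "$count" ]; do
        dd if=/dev/zero of="$dir/img$i" bs=1M count="$size_mb" 2>/dev/null || return 1
        i=$((i + 1))
    done
}

# Demo in a scratch directory so nothing existing is overwritten.
scratch=$(mktemp -d)
make_images "$scratch" 3 10
ls -l "$scratch"
rm -rf "$scratch"
```

To reproduce the layout used in this article you would run make_images disks 3 10 instead of the scratch-directory demo.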

And let’s mount them as loopback devices:

$ sudo su  # From here on out we'll need root privileges
# losetup /dev/loop0 img0
# losetup /dev/loop1 img1
# losetup /dev/loop2 img2

Now we’re ready to work in evms. Launch the NCurses interface to evms with the command evmsn. Our first step is to delete the “compatibility volumes” that evms automatically creates for us. Select Actions > Delete > Volume (by typing a d v) and select all of the /dev/evms/loopX volumes listed (using the space bar and arrow keys). Don’t select anything else that might be listed! Hit enter to go ahead with the deletion. The dialogs that you’ll see next are not important for our purposes; just choose the defaults until you get back to the evms interface and see the message “The Delete command(s) completed successfully.” at the bottom of the screen.

Next, we will build our first RAID, a level 1 mirrored pair. Select Actions > Create > Region, select the MD Raid 1 Region Manager, and hit enter. Now select loop0 and loop1 and hit enter. Leave the configuration options alone, hitting enter once more to create the Raid region. You’ll get a congratulatory message that informs you that the storage object md/md0 has been produced.

Now if this were a real storage array we were building we would probably put an LVM2 container on top of md0 and create some logical volumes, but in this tutorial we’re just going to create a volume from md0 directly.

Select Actions > Create > EVMS Volume, select md/md0 and give the volume the name RaidVol. Hit enter to go through with the creation. Now we should see /dev/evms/RaidVol in the list of logical volumes.

Let’s put a filesystem on it, so we can save some test data to the volume. Select Actions > File System > Make, select Ext2/3 File System Interface Module, and hit enter. Now choose RaidVol and hit enter twice to finish up with the default configuration options.

At this point your logical volumes list (hit zero to show it if it’s not there already) should look something like this:

 Name                          Size Modified Active R/O  Plug-in   Mountpoint
 /dev/evms/RaidVol           9.9 MB    X                 Ext2/3

EVMS is a cautious system and doesn’t actually change anything until you save your changes. At this point we need to save our progress, so select Actions > Save. Once you’ve saved your changes you can mount your shiny new volume! Select Actions > File System > Mount and mount RaidVol at /mnt/raidvol. We’re done with evmsn for the moment, so select Actions > Quit.

It’s easy for me to tell you this works, but if you want to have confidence in this procedure you’ll want to verify it for yourself. Copy some files to /mnt/raidvol — images, text files, whatever. Try to fill up the drive as completely as possible. (Hint: One way to fill it up is dd if=/dev/zero of=/mnt/raidvol/zeros) Once you’ve filled the disk, run md5sum on your files:

# ls -l /mnt/raidvol/zeros
-rw-r--r-- 1 root root 8971264 2006-06-30 00:03 /mnt/raidvol/zeros
# md5sum /mnt/raidvol/zeros
37e6cfefc792d550d54f0422c8521fea  /mnt/raidvol/zeros
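
If you copied more than one file onto the volume, comparing checksums by eye gets tedious. A minimal sketch of one way to automate it, assuming GNU md5sum (for -c and --quiet); record_sums and verify_sums are hypothetical helper names, and the checksum file should live somewhere off the array:

```shell
#!/bin/sh
# record_sums DIR SUMFILE
# Hypothetical helper: writes an md5sum line for every file under DIR into SUMFILE.
record_sums() {
    ( cd "$1" && find . -type f -exec md5sum {} + ) > "$2"
}

# verify_sums DIR SUMFILE
# Re-checks every file listed in SUMFILE; returns non-zero if anything differs.
# SUMFILE must be an absolute path because we cd into DIR first.
verify_sums() {
    ( cd "$1" && md5sum -c --quiet "$2" )
}

# Demo on a scratch directory instead of a real mount point.
demo_dir=$(mktemp -d)
demo_sums=$(mktemp)
printf 'some test data\n' > "$demo_dir/file1"
record_sums "$demo_dir" "$demo_sums"
verify_sums "$demo_dir" "$demo_sums" && echo "all files match"
rm -rf "$demo_dir" "$demo_sums"
```

For the volume in this article the calls would look like record_sums /mnt/raidvol /root/raidvol.md5 before the conversion and verify_sums /mnt/raidvol /root/raidvol.md5 after it (the /root path is just an example).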

You might also want to cat /proc/mdstat to see that the Raid is doing its thing:

$ cat /proc/mdstat
Personalities : [raid1] [raid5]
md0 : active raid1 loop1[1] loop0[0]
      10176 blocks [2/2] [UU]

unused devices: <none>

Now we’re ready to start making magic. (Hint: You might want to back up img0 and img1 now so you can skip many of the previous steps if you want to experiment in the future.) Launch evmsn once more. Unmount the volume with Actions > File System > Unmount.

We’re going to be doing some things behind the back of evms, so let’s convert the volume to a “compatibility volume” with Actions > Convert > EVMS Volume to Compatibility Volume. Save your changes and quit.

The next bit is somewhat voodoo. It’s based on the semi-obscure fact that the RAID 5 algorithm can indeed be applied to two disks, and when you do so the disks end up mirrored! This works because the parity of a single block is equal to the block itself, though the normal way of setting up RAID 5 arrays involves computing the parity of at least two blocks. As a consequence, the only difference between a 2-disk RAID 1 pair and 2-disk RAID 5 array is the metadata! By rewriting the RAID metadata, we can instantly convert our array to RAID 5.
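
To make the parity claim concrete, here is a toy illustration using single integers as stand-ins for disk blocks. RAID 5 parity is the XOR of the data blocks in a stripe; xor_parity is a made-up function name, and real arrays obviously work on whole chunks rather than one byte.

```shell
#!/bin/sh
# xor_parity BLOCK [BLOCK ...]
# Toy model: each "block" is an integer; parity is the XOR of all of them.
xor_parity() {
    p=0
    for b in "$@"; do
        p=$((p ^ b))
    done
    echo "$p"
}

xor_parity 165        # one data block: prints 165, the parity is a copy of the data
xor_parity 165 60     # two data blocks: prints 153
xor_parity 153 60     # parity XOR surviving block recovers the lost one: prints 165
```

With only one data block per stripe, the "parity" block is byte-for-byte identical to the data block, which is exactly what a mirror looks like on disk.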

We’re going to use mdadm to rewrite the RAID metadata. First we’ll need to stop the array:

# mdadm --stop /dev/evms/md/md0

Now we’re going to “create” a RAID 5 array using our existing loopback devices. In effect, this should just change the metadata and give us a functional array. mdadm is going to get suspicious, but it’ll let us proceed:

# mdadm --create /dev/md0 --level=5 -n 2 /dev/loop0 /dev/loop1
mdadm: /dev/loop0 appears to contain an ext2fs file system
    size=10172K  mtime=Fri Jun 30 00:28:04 2006
mdadm: /dev/loop0 appears to be part of a raid array:
    level=1 devices=2 ctime=Thu Jun 29 23:20:15 2006
mdadm: /dev/loop1 appears to contain an ext2fs file system
    size=10172K  mtime=Fri Jun 30 00:28:04 2006
mdadm: /dev/loop1 appears to be part of a raid array:
    level=1 devices=2 ctime=Thu Jun 29 23:20:15 2006
Continue creating array? y
mdadm: array /dev/md0 started.

Now the moment of truth! Let’s try mounting /dev/md0:

# mount /dev/md0 /mnt/raidvol
# ls /mnt/raidvol
lost+found  zeros

Hooray! Let’s make sure nothing got corrupted:

# ls -l /mnt/raidvol/zeros
-rw-r--r-- 1 root root 8971264 2006-06-30 00:03 /mnt/raidvol/zeros
# md5sum /mnt/raidvol/zeros
37e6cfefc792d550d54f0422c8521fea  /mnt/raidvol/zeros

And let’s convince ourselves that we really have a RAID 5 array:

# cat /proc/mdstat
Personalities : [raid1] [raid5]
md0 : active raid5 loop0[0] loop1[1]
      10176 blocks level 5, 64k chunk, algorithm 2 [2/2] [UU]

unused devices: <none>
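
If you would rather check the level from a script than read the output yourself, something like the following could work. It is a rough sketch: md_level is a hypothetical helper, and it assumes the simple "mdX : active raidN ..." line layout shown above (other array states may shift the fields).

```shell
#!/bin/sh
# md_level NAME [FILE]
# Hypothetical helper: prints the personality (raid1, raid5, ...) of array NAME.
# Reads /proc/mdstat by default; assumes the "NAME : active LEVEL ..." layout.
md_level() {
    awk -v name="$1" '$1 == name && $2 == ":" { print $4 }' "${2:-/proc/mdstat}"
}

# Demo against a copy of the output shown in this article.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Personalities : [raid1] [raid5]
md0 : active raid5 loop0[0] loop1[1]
      10176 blocks level 5, 64k chunk, algorithm 2 [2/2] [UU]

unused devices: <none>
EOF
md_level md0 "$sample"    # prints raid5
rm -f "$sample"
```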

Part II: Growing the Array

Hot-damn, it worked! Now let’s add another disk to the RAID 5 array. First we unmount the volume, then go back to evms:

# umount /mnt/raidvol/
# evmsn

Growing a RAID 5 array is actually quite easy in evms, but the documentation gives little to no help in understanding how it’s done. Trial-and-error led me to the following procedure:

Actions > Convert > Compatibility Volume to EVMS Volume
    use the name RaidVol and select /dev/evms/md/md0
Actions > Expand > Volume
    choose RaidVol
    choose md/md0 as the "expand point"
    choose loop2 as the object
Actions > Save

You should get a list of messages that the procedure has produced, and, with luck, none of them should be error messages! Choose cancel to dismiss the dialog, and notice that RaidVol is now 20MB! Mount the volume (Actions > File System > Mount), quit evmsn and let’s make sure everything is ok:

# ls /mnt/raidvol/
lost+found  zeros
# df -h /mnt/raidvol/
Filesystem            Size  Used Avail Use% Mounted on
/dev/evms/RaidVol      20M  9.7M  9.0M  52% /mnt/raidvol
# ls -l /mnt/raidvol/zeros
-rw-r--r-- 1 root root 8971264 2006-06-30 00:03 /mnt/raidvol/zeros
# md5sum /mnt/raidvol/zeros
37e6cfefc792d550d54f0422c8521fea  /mnt/raidvol/zeros
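
The jump from 10MB to 20MB follows from how RAID 5 spends one member's worth of space on parity: usable capacity is (number of disks - 1) times the member size. A tiny sketch of the arithmetic, ignoring metadata and filesystem overhead (raid5_usable_mb is a made-up name):

```shell
#!/bin/sh
# raid5_usable_mb DISKS SIZE_MB
# Hypothetical helper: usable capacity of a RAID 5 array, ignoring all overhead.
raid5_usable_mb() {
    echo $(( ($1 - 1) * $2 ))
}

raid5_usable_mb 2 10    # prints 10: same as the original mirror
raid5_usable_mb 3 10    # prints 20: after adding the third 10MB disk
```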

Nifty, eh? Now before you run down to your data center and try this procedure on your client’s drives, you should be aware that I’m pretty clueless about all this stuff and there may very well be extremely good reasons not to do this. If you do attempt it, make sure you have recent backups, and don’t say I didn’t warn you!


1. Scott Wallace replies:

Well, I tried your RAID1 to RAID5 conversion this weekend on my home workstation and it worked a treat.

I’ve blogged about the experience at http://scott.wallace.sh/node/1521.

I just wanted to drop you a note thanking you for making my life that little bit easier. I didn’t have to reinstall my entire workstation to double my disk space!


2. Marki replies:

Nice… Just one small tip which was handy when I had to reconstruct LVM with a failed PV (which was not mirrored).
It allows you to create loopback devices bigger than your RAM. Suppose you need to simulate a 160 GB HDD. You can create it in RAM even when booted from a Live CD.
# dd if=/dev/zero of=/dev/shm/file1 bs=1M seek=10240 count=1

This command will create a 10 GB file, which will need only 1 MB of RAM. What this command does is create a sparse file: nothing is allocated except the last 1 MB of data, so only 1 MB of RAM is used. You can then vgcfgrestore PV headers on it, and the RAM used will grow only by the amount of data you write to the file.
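
The sparse-file effect is easy to see on a small scale. The following is only an illustration, assuming GNU dd and a filesystem that supports sparse files; make_sparse is a hypothetical helper name.

```shell
#!/bin/sh
# make_sparse PATH SKIP_MB
# Hypothetical helper: seeks SKIP_MB megabytes into PATH, then writes one 1 MB block.
# The skipped region becomes a hole that takes no space until it is written to.
make_sparse() {
    dd if=/dev/zero of="$1" bs=1M seek="$2" count=1 2>/dev/null
}

demo=$(mktemp)
make_sparse "$demo" 100
ls -l "$demo"    # apparent size: 101 MB (100 MB hole plus the 1 MB written)
du -k "$demo"    # space actually allocated, typically about 1 MB
rm -f "$demo"
```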

3. chutz replies:

Dude, that’s some nifty magic and it was certainly insightful. I had no idea that checksum(1 disk) = 1 disk.

Anyway, it is so much easier to just create the raid 5 on two disks instead of all this trickery. The added benefit is that you can grow it online (especially with SATA disks that can also be hotplugged).


[...] Then I stumbled across this blog entry in which a guy creates some experimental loopback devices, creates a RAID1 array and then converts [...]

5. Buddy Butterfly replies:

Great article.
I wonder if it would be possible to reverse the procedure. I have 2 disks running in raid5 as you converted it to. It’s used in a Xen system which has quite slow I/O performance. Now I want to try to convert back to raid1. As I understood it, this should also be possible with a similar procedure? I hope to get some more I/O performance.

Cheers, Buddy

6. n8 replies:

It seems like you ought to be able to reverse the procedure, but I’ve never tried it. I’d recommend trying it on disk images first like I did in the article.

7. Buddy Butterfly replies:

Tested it with the images and the loop devices. The reverse procedure also works! Same complaints from mdadm but no problems when confirming with yes. Before going on to the real system: do you think there will be a noticeable performance gain in I/O when reverting back to raid1? On top of raid I have LVM. I could already gain a heavy performance increase by converting the virtual machine images to “raw” format. This gives a real performance boost. There is a huge performance loss when using the standard virtual image format on top of LVM.

8. n8 replies:

I’m really not an expert at this at all, so I’m not going to guess. :)

9. Buddy Butterfly replies:

Just wanted to test it but… The f.. LVM sits on top of it and I do not have a spare disk at hand. I am not able to stop the raid device because LVM holds it. What I need is the functionality of LVM OLR but that seems not to be possible with Linux LVM. Is there any chance to remove a device from a volume group while keeping all data and LVs intact such that it can be re-added? Maybe someone has a hint.

Cheers, Buddy

[...] real arrays containing my data so I wanted to have a practice first.  Using some info I found here http://www.n8gray.org/blog/2006/09/05/stupid-raid-tricks-with-evms-and-mdadm/, I was able to create some files which I could use as hard [...]
