Socialtools.RaidLvmHowto r1.1 - 02 Jul 2005 - 19:20 - TWikiGuest

The SocialTools Debian Root-on-LVM-on-RAID-on-IDE HOWTO

By BenjaminGeer and ToniPrug

version 1.8.1

ALERT! NOTE: We upgraded to Debian stable, sarge.
These instructions are no longer valid, since the sarge installer has RAID/LVM options.
Use with caution. (2nd Jul 2005).

Copyright and Disclaimer

Copyright © 2003 Benjamin Geer and Toni Prug

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts.

All information herein is presented "as-is", with no warranties expressed nor implied. There is no guarantee whatsoever, that any of the software, or this information, is in any way correct, nor suited for any use whatsoever. Back up all your data before experimenting with this. Better safe than sorry.

Conventions in this Document

Most of the process described here needs to be done with root permissions, so we won't remind you to log in as root, or to type sudo.

Introduction

Background on RAID

Source: Software RAID HOWTO

On a machine with a single hard disk, if the disk crashes and you have backups of your data, you can be glad you haven't lost any data. But you still have to reinstall and reconfigure the operating system. Even if you have a spare hard disk handy, this is time-consuming. During that time, your server is down, and you may find yourself working late into the night to fix it.

RAID (Redundant Array of Independent Disks) allows the operating system to treat an array of disks as a single disk. With RAID-1 (mirroring), the level used in this HOWTO, data written to the array is written to all the disks. If a disk crashes, your system can continue running with the remaining disks. You can remove the damaged disk and install a new one, and RAID will copy the data from the other disks on to the new disk.

RAID can be implemented in software or hardware; here we will be setting up software RAID, using the implementation in the Linux 2.4 kernel. The implementation in kernel 2.6 is different, and is not covered in this HOWTO.

Background on Logical Volume Management

Source: LVM HOWTO.

Logical volume management is a more flexible alternative to disk partitions. It allows you to allocate drive space (perhaps including several drives) to 'logical volumes', which behave like resizable partitions. You can reallocate space to logical volumes as needed, while the system is running. If you run out of space on a volume, you can give it some space from another volume, or from a new disk drive.

This HOWTO uses the logical volume management implementation in the Linux 2.4 kernel. The implementation in kernel 2.6 is different, and is not covered in this HOWTO.

Requirements

  • A machine with two identical blank IDE drives. If at all possible, get two drives of the same brand and the exact same model number. If the model number is different, the drives may have different geometries (even if they have the same total size), which will make it more difficult to set up RAID. It is also a good idea to put the two drives on separate IDE controllers, if possible (see the Software-RAID-HOWTO for the reasons); in this document, we will assume your drives are the primary master and the secondary slave, i.e. /dev/hda and /dev/hdd (assuming that /dev/hdc, the secondary master, is a CD-ROM drive). However, please note that we have not tested this configuration, and have in fact only tested LVM-on-RAID using /dev/hda and /dev/hdb. If you test the two-controller configuration, please let us know.
  • Installation media for Debian Woody.
  • Patience, pizza (or kebabs) and your favourite caffeinated beverage. It took us about 12 hours to do this for the first time; we spent most of that time searching for information on the Internet. This guide is meant to shorten that process.

Note: If you're sure your drives are identical, but they're still being recognised with different geometries, this may be because one of them is on an IDE controller that supports UDMA/66, while the other is on a controller that only supports UDMA/33. A workaround is to put both drives on the same controller.

The Goal

We want to:

  • Use RAID to provide basic redundancy at low cost, in case one disk fails.
  • Allocate the space in the RAID array to logical volumes using LVM.

To accomplish this, we'll use RAID-1, which provides basic redundancy, using two IDE drives of the same size; we'll call these 'disk 1' and 'disk 2'.

More specifically, we want the final result to look like this:

                      RAID Array /dev/md0      RAID Array /dev/md1
Physical Partitions   /dev/hda1, /dev/hdd1     /dev/hda2, /dev/hdd2
Volume Group          (none)                   /dev/vg1
Filesystems           /boot                    /usr, /home, etc.

There are two RAID arrays. Each RAID array is composed of two physical disk partitions, one on each drive. One of them has a logical volume group on it, which is divided into logical volumes such as /usr and /home. The other one has the /boot filesystem on it, containing the files needed to boot up the operating system; it isn't included in the LVM setup, because the LILO boot loader can't handle logical volumes.

RAID arrays are traditionally called md on Linux; it stands for Multiple Devices.

Overview of the Procedure

  1. Partition the two disks identically.
  2. Install a basic system on disk 1.
  3. Create two RAID arrays consisting only of the partitions on disk 2.
  4. Set up logical volumes on top of the larger of those RAID arrays.
  5. Copy everything from disk 1 into the logical volumes on disk 2.
  6. Make disk 2 bootable, and reboot into disk 2.
  7. Add the partitions on disk 1 to the RAID arrays (this copies the contents of disk 2 on to disk 1).
  8. Make disk 1 bootable again, and reboot into disk 1.

The result is that the two disks are identical; you can configure the BIOS to boot into either one.

Preparation

Partitioning the Disks

On each disk, make one small partition (about 16-24 MB), and another partition containing the remaining space. Make sure the corresponding partitions are exactly the same size on each disk. If, by some misfortune, the two disks don't have the same geometry, it may not be possible to make partitions of the same size on both disks. In that case, make slightly smaller partitions on disk 2, to ensure that, when RAIDs are set up on disk 2, they'll fit into the partitions on disk 1.

To partition disk 2 exactly like disk 1, you can type:

sfdisk -d /dev/hda | sfdisk /dev/hdd
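
Before going further, it may be worth confirming that the two partition tables really do match. The following is a minimal sketch, not part of the original procedure: same_layout is a hypothetical helper that compares two saved sfdisk -d dumps after stripping the device names. It assumes the traditional /dev/hdX names appear in the dumps.

```shell
# Hypothetical helper: succeed if two saved `sfdisk -d` dumps describe
# the same partition layout. Device names are stripped first, so the
# entries for hda1 and hdd1 compare equal when start, size and Id match.
same_layout() {
    a=$(grep '^/dev/' "$1" | sed 's#^/dev/[a-z]*##')
    b=$(grep '^/dev/' "$2" | sed 's#^/dev/[a-z]*##')
    # Fail if the first dump contains no partition lines at all.
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Usage (as root):
#   sfdisk -d /dev/hda > /tmp/hda.dump
#   sfdisk -d /dev/hdd > /tmp/hdd.dump
#   same_layout /tmp/hda.dump /tmp/hdd.dump && echo "layouts match"
```

Note that the comparison includes the partition type, so it is only meaningful at this stage, before the types on disk 2 are changed.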

Initial Debian Install

Install Debian on disk 1, using the small partition for /boot, and the larger one as the root partition. Use the bf24 kernel and ext3 partitions, as described in StandardDebianInstall. Using the BIOS, set disk 1 as the first disk in the boot order.

Now is a good time to install devfsd and raidtools2:

apt-get install devfsd raidtools2

Recompiling the Kernel

First install a new procps as described in StandardKernelConfig (the one that comes with Debian Woody has trouble with 2.4 kernels).

Download the latest 2.4 kernel (at least 2.4.26) from your nearest kernel.org mirror. Install Debian's kernel-package:

apt-get install kernel-package

Set the following line in /etc/kernel-pkg.conf, to tell kernel-package to put symbolic links to kernel images in /boot instead of in /:

image_in_boot := True

Unpack and configure the kernel, using the .config from your Debian bf24 kernel as a starting point:

cd /usr/src
tar jxf linux-2.4.26.tar.bz2
cd linux-2.4.26
cp /boot/config-2.4.18-bf2.4 .config
make oldconfig    # hold down the Enter key to accept all the defaults
make menuconfig

Go through the options, removing anything you're sure you don't need. (See StandardKernelConfig). Make sure you include these options:

  • Block Devices
    • Loopback Device Support (CONFIG_BLK_DEV_LOOP): Y
    • RAM Disk Support (CONFIG_BLK_DEV_RAM): Y
    • Initial RAM Disk (initrd) Support (CONFIG_BLK_DEV_INITRD): Y
  • ATA/IDE/MFM/RLL support
    • IDE, ATA and ATAPI Block devices
      • Use PCI DMA by default when available (CONFIG_IDEDMA_PCI_AUTO): Y
  • Multi-device support (RAID and LVM): say yes to everything here.
  • File Systems
    • Device file system (CONFIG_DEVFS_FS): Y
      • Automatically mount at boot (CONFIG_DEVFS_MOUNT): Y

If you neglect CONFIG_DEVFS_MOUNT, devfsd will fail to start, with the error message:

Error opening file: ".devfsd" No such file or directory

(Source: Devfs FAQ.)

We found that we had to leave 'Enable loadable module support' turned on (otherwise the lvmcreate_initrd command, described later, would generate spurious errors), but disable 'Set version information on all module symbols' and 'kernel module loader' (otherwise the modutils that comes with Debian Woody seemed to create spurious modules, once again causing lvmcreate_initrd to report errors). However, another user has reported that he didn't experience any problems with 'kernel module loader' activated.

Compile and install the kernel:

make-kpkg --revision=custom.1.0 kernel_image
dpkg -i ../kernel-image-2.4.26_custom.1.0_i386.deb

Say no to all the questions it asks. Check that it's made correct symbolic links in /boot:

/boot/vmlinuz -> /boot/vmlinuz-2.4.26
/boot/vmlinuz.old -> /boot/vmlinuz-2.4.18-bf2.4

Edit your /etc/lilo.conf and make sure there is an image stanza pointing to an image in /boot for each kernel you now have installed, like this:

default=Linux

image=/boot/vmlinuz
    label=Linux
    read-only

image=/boot/vmlinuz.old
    label=LinuxOLD
    read-only

Make sure that /etc/lilo.conf also contains the following lines:

delay=20
prompt
timeout=100

Write your changes to the disk's master boot record (MBR) by running lilo. Make sure it doesn't report any errors, or say it skipped any images.

Reboot to make sure the new kernel works:

shutdown -r now

Making RAID Arrays

Use cfdisk to change the partition types of both partitions on disk 2 (not on disk 1) to 'Linux RAID Autodetect' (hexadecimal code fd):

cfdisk /dev/hdd

Make sure to write your changes to the partition table before quitting cfdisk. Reboot again:

shutdown -r now

Create /etc/raidtab as follows:

raiddev /dev/md0
        raid-level              1
        nr-raid-disks           2
        nr-spare-disks          0
        chunk-size              32
        persistent-superblock   1
        device                  /dev/hdd1
        raid-disk               0
        device                  /dev/hda1
        failed-disk             1

raiddev /dev/md1
        raid-level              1
        nr-raid-disks           2
        nr-spare-disks          0
        chunk-size              32
        persistent-superblock   1
        device                  /dev/hdd2
        raid-disk               0
        device                  /dev/hda2
        failed-disk             1

Create the RAID arrays:

mkraid --force /dev/md0
mkraid --force /dev/md1

You'll be told to type a different command if you really want to do this; it's OK to go ahead and type it.
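
At this point both arrays should be running in degraded mode, with only the disk 2 partitions present. As a rough check, the sketch below lists the arrays that /proc/mdstat reports with a missing member. degraded_arrays is a hypothetical helper, and it assumes the usual status notation in which [U_] marks an array with one member missing.

```shell
# Hypothetical helper: print the name of every array whose status
# field (e.g. [U_]) contains an underscore, meaning a missing member.
# Reads /proc/mdstat unless another file is given as $1.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }    # remember the current array name
        /\[[U_]*_[U_]*\]/ { if (name != "") print name }
    ' "${1:-/proc/mdstat}"
}

# Usage: degraded_arrays
```

Right after mkraid, both md0 and md1 would be expected in the output, since disk 1 is still marked as failed-disk in /etc/raidtab.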

Setting up LVM

Install the LVM software.

apt-get install lvm-common lvm10

Allocate /dev/md1 as a physical volume managed by LVM:

vgscan
pvcreate /dev/md1
pvscan

The output of pvscan should show that you now have a physical volume. Then create the volume group (under devfs, /dev/md/1 is the name of the same device as /dev/md1):

vgcreate vg1 /dev/md/1

This should produce a lot of output but no errors.

Use lvcreate to create a logical volume for each filesystem you'll want. For example:

lvcreate -L 1.56G -n root vg1
lvcreate -L 7G -n home vg1
lvcreate -L 4G -n usr vg1
lvcreate -L 4G -n local vg1
lvcreate -L 2.93G -n var vg1
lvcreate -L 8.69G -n db vg1
lvcreate -L 7.81G -n mail vg1
lvcreate -L 13.77G -n www vg1
lvcreate -L 8G -n chroot vg1
lvcreate -L 1000M -n tmp vg1
lvcreate -L 512M -n swap vg1

One of your logical volumes should be called root (for the / filesystem), and one should be called swap (for kernel swap space). It's a good idea to have 2-4 times as much swap space as you have memory.
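
To turn the 2-4 times rule of thumb into numbers, something like the following could be used. It is only a sketch: swap_range_mb is a hypothetical helper, and it assumes that /proc/meminfo contains a MemTotal: line expressed in kB.

```shell
# Hypothetical helper: print the lower and upper bound (in MB) of the
# suggested swap size, i.e. 2x and 4x the installed memory.
# Reads /proc/meminfo unless another file is given as $1.
swap_range_mb() {
    awk '/^MemTotal:/ { mb = int($2 / 1024); print 2 * mb, 4 * mb }' \
        "${1:-/proc/meminfo}"
}

# Usage: swap_range_mb
```

For a machine with 256 MB of memory this suggests between 512 MB and 1024 MB, which matches the 512M swap volume in the example above.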

Making Filesystems

Make an ext3 filesystem on /dev/md0 (the RAID array that we'll use for /boot):

mke2fs -j /dev/md0
tune2fs -c 0 -i 0 /dev/md0

Note that we're using tune2fs to disable automatic filesystem checking (fsck). For an explanation of this, see the section 'Filesystem check intervals' in Andrew Morton's document Using the ext3 filesystem in 2.4 kernels.

Make an ext3 filesystem on each of the logical volumes you created (except swap), e.g.:

mke2fs -j /dev/vg1/root
tune2fs -c 0 -i 0 /dev/vg1/root
mke2fs -j /dev/vg1/usr
tune2fs -c 0 -i 0 /dev/vg1/usr
mke2fs -j /dev/vg1/home
tune2fs -c 0 -i 0 /dev/vg1/home
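
With many logical volumes, typing each pair of commands by hand gets tedious. One possible shortcut is to generate them. In this sketch, mkfs_cmds is a hypothetical helper that only prints the commands, so they can be reviewed before anything is formatted.

```shell
# Hypothetical helper: print (do not run) the mke2fs/tune2fs pair for
# each logical volume named after the volume group, skipping swap.
mkfs_cmds() {
    vg=$1
    shift
    for lv in "$@"; do
        [ "$lv" = swap ] && continue
        echo "mke2fs -j /dev/$vg/$lv"
        echo "tune2fs -c 0 -i 0 /dev/$vg/$lv"
    done
}

# Usage: review the output first, then pipe it to sh as root.
#   mkfs_cmds vg1 root usr home
#   mkfs_cmds vg1 root usr home | sh
```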

Set up the swap area on your swap logical volume:

mkswap /dev/vg1/swap

Transferring Filesystems to LVM-on-RAID

Create a mount point called /mnt, where you'll mount your new root filesystem:

mkdir -p /mnt

Add a line to /etc/fstab to specify that /dev/vg1/root should be mounted on /mnt:

/dev/vg1/root   /mnt            ext3    defaults   0       0

Mount that filesystem:

mount /mnt

Make more mount points under /mnt, where you'll mount your new logical volumes (except swap):

mkdir /mnt/boot
mkdir /mnt/usr
mkdir /mnt/home

In /etc/fstab, add a line for /dev/md0 and for each of your logical volumes (except swap), so that they mount under /mnt, like this:

/dev/md0        /mnt/boot       ext3    defaults   0       0
/dev/vg1/usr    /mnt/usr        ext3    defaults   0       0
/dev/vg1/home   /mnt/home       ext3    defaults   0       0

Mount the filesystems:

mount -a

(Note that when we want a filesystem's mount point to be within another filesystem, we have to mount the outer filesystem first, then create the mount point, then mount the inner filesystem. The same is true if, for example, you mount a filesystem for /usr/local inside /usr.)
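
If you have many nested mount points, the ordering rule in the note above can be applied mechanically by sorting the mount points by depth. The following is a sketch only; by_depth is a hypothetical helper that reads one mount point per line.

```shell
# Hypothetical helper: read mount points on standard input and print
# them with the shallowest paths first, so that an outer filesystem
# is always listed before the filesystems mounted inside it.
by_depth() {
    # Prefix each path with its number of slashes, sort numerically
    # on that prefix, then strip the prefix again.
    awk '{ n = gsub("/", "/"); print n, $0 }' | sort -n | cut -d' ' -f2-
}

# Usage: printf '/mnt/usr/local\n/mnt\n/mnt/usr\n' | by_depth
```

Creating each mount point and mounting its filesystem in the order printed respects the rule.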

Copy everything from disk 1 to disk 2 using cp -a. First, copy everything containing regular files (i.e. not /dev, /cdrom, /floppy or /proc) into /mnt:

cp -a /boot /bin /etc /home /lib /opt /root /sbin /usr /var /mnt

then make the rest of the top-level directories under /mnt (including /initrd, which the kernel seems to expect):

mkdir /mnt/cdrom /mnt/floppy /mnt/proc /mnt/dev /mnt/initrd

Making an initrd file for LVM

The kernel will need an initrd file, created specifically for LVM, which it can load into a ramdisk when booting. To create this file, you need to use the lvmcreate_initrd command. Normally, lvmcreate_initrd calculates the size of this file automatically. However, if you have a big hard disk (e.g. 80GB), this automatically-calculated size might not be big enough, and when the system boots, you will get an 'ERROR 28 writing volume group backup file /etc/...' The way to avoid this problem is to calculate the size yourself, as described in this message by the LVM author.

Start with the space reported by the following command.

du -chs /mnt/etc/lvmconf

(If you're reading this because you've already successfully set up RAID and LVM, and you're just recompiling your kernel, use /etc/lvmconf instead.) Add 4MB. Convert that number to kilobytes. (We found that for 80G disks, 6000K was about right.) Run the following command, using the size you calculated as the value of INITRDSIZE, and passing the version number of your kernel as an argument.

INITRDSIZE=6000 lvmcreate_initrd 2.4.26

This creates an initrd file, called something like initrd-lvm-2.4.26.gz, in /boot. Copy that file into /mnt/boot.
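
The size calculation described above (the disk usage of the lvmconf directory plus 4MB, expressed in kilobytes) can be scripted along these lines. This is a sketch, and initrd_size_kb is a hypothetical helper name.

```shell
# Hypothetical helper: print a candidate INITRDSIZE in kilobytes,
# i.e. the disk usage of the given directory plus a 4 MB margin.
initrd_size_kb() {
    used=$(du -sk "$1" | awk '{ print $1 }')
    echo $((used + 4096))
}

# Usage (as root, during the initial setup):
#   INITRDSIZE=$(initrd_size_kb /mnt/etc/lvmconf) lvmcreate_initrd 2.4.26
```

Whatever value you end up with, remember it: the same number is needed later for the ramdisk_size option in lilo.conf.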

Making Disk 2 Bootable

Now that you've copied the data from disk 1 to disk 2, you can fix /mnt/etc/fstab (note: not /etc/fstab) so it reflects your final setup. Comment out the lines that refer to /dev/hda1 and /dev/hda2. Take out the /mnt from the LVM filesystem paths, fix the options to reflect what you'd have in a normal system, and add the swap volume:

/dev/vg1/root   /         ext3    errors=remount-ro   0       1
/dev/md0        /boot     ext3    defaults            0       2
/dev/vg1/swap   none      swap    sw                  0       0
/dev/vg1/usr    /usr      ext3    defaults            0       0
/dev/vg1/home   /home     ext3    defaults            0       0

Add the following to /mnt/etc/lilo.conf (not /etc/lilo.conf):

disk=/dev/hdd
  bios=0x80
disk=/dev/hda
  bios=0x81

If you get this wrong, you'll get a LILO error when you reboot: LILO will notice that the BIOS hasn't given it the right disk to boot from, and will say L 07 07 07...

Add this line to make LILO use the kernels in our new /boot filesystem:

boot=/dev/md0

Add this line to make LILO write its configuration to the master boot record (MBR) of drive 2 (but not drive 1):

raid-extra-boot="/dev/hdd"

In the image stanza for the kernel you just installed, add the following lines, using your calculated initrd size as the value of ramdisk_size, and the filename of the initrd file you created using lvmcreate_initrd.

    initrd=/boot/initrd-lvm-2.4.26.gz
    append="ramdisk_size=6000"
    root=/dev/vg1/root

You can copy these lines into the other stanza, for your old kernel. Delete any other root lines in the file.
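
Since a mistake in this file can leave disk 2 unbootable, a quick sanity check before running LILO may help. The sketch below only looks for the settings described in this section. check_lilo_conf is a hypothetical helper, and the patterns assume the device and volume names used in this HOWTO.

```shell
# Hypothetical helper: report which of the settings described in this
# section are missing from the given lilo.conf. Returns non-zero if
# any of them is absent.
check_lilo_conf() {
    conf=$1
    status=0
    for pattern in '^boot=/dev/md0' '^raid-extra-boot=' \
                   '^[[:space:]]*initrd=' 'ramdisk_size=' \
                   '^[[:space:]]*root=/dev/vg1/root'; do
        if ! grep -q -e "$pattern" "$conf"; then
            echo "missing: $pattern"
            status=1
        fi
    done
    return $status
}

# Usage: check_lilo_conf /mnt/etc/lilo.conf && echo "looks complete"
```

This does not validate the file; it only catches lines that were forgotten altogether.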

Mount the /dev filesystem in /mnt so that LILO can use it:

mount --bind /dev /mnt/dev

Run LILO in a chroot:

chroot /mnt /sbin/lilo

Before you reboot, make sure LILO hasn't overwritten its configuration in the MBR on disk 1. To do this, run lilo again, the normal way:

lilo

Reboot. In the BIOS setup, move disk 2 into first position in the boot order, and make sure disk 1 is in second position. Then let the system boot. If it fails to boot from disk 2, your initrd file might be too small. To fix it, reboot from disk 1, make a new initrd file as described above, mount /dev/md0 temporarily as /mnt/boot, copy the new initrd file into /mnt/boot, and try again to reboot from disk 2.

Including Disk 1 in the LVM-on-RAID System

If you've followed these instructions carefully, the permissions on your mount points should be correct, but if not, you might want to check them. Here are some examples:

chmod a+rwxt /tmp
chgrp staff /usr/local
chmod g+ws /usr/local
chgrp mail /var/mail
chmod g+ws /var/mail

Use cfdisk to change the types of both partitions on /dev/hda to fd ('Linux RAID Autodetect'), just as you did before for /dev/hdd.

In /etc/raidtab, change failed-disk to raid-disk.

Add the partitions on disk 1 to the RAID arrays:

raidhotadd /dev/md0 /dev/hda1
raidhotadd /dev/md1 /dev/hda2 

SARGE NOTE ALERT!: mdadm is now used instead of raidtools on Debian stable. The syntax for adding a partition now looks like:

mdadm /dev/md0 --add  /dev/hda1
mdadm /dev/md1 --add  /dev/hda2

If you make a mistake typing the above, you can accidentally add two partitions on the same disk to the same RAID array. As a result, the second raidhotadd won't work, and you'll get error -17 (invalid argument). Use raidhotremove to remove the extra partition (which should go in the other array), and try again.

RAID will now start recovering disk 1 (i.e. copying data from disk 2 to disk 1). You can monitor its progress by typing:

while true; do clear; cat /proc/mdstat; sleep 5; done

This will take a while, possibly several hours, depending on your hard disks. (On our machines, it has tended to take about 2 hours for an 80GB disk.)
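
If you would rather be told when the rebuild has finished than watch it, a small polling loop is one option. This sketch assumes that /proc/mdstat shows a line containing 'recovery' or 'resync' while a rebuild is running. sync_in_progress is a hypothetical helper.

```shell
# Hypothetical helper: succeed while the given mdstat file (default
# /proc/mdstat) still shows a recovery or resync line.
sync_in_progress() {
    grep -q -E 'recovery|resync' "${1:-/proc/mdstat}"
}

# Usage: poll once a minute until the rebuild is done.
#   while sync_in_progress; do sleep 60; done; echo "rebuild finished"
```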

Once it's finished, edit /etc/lilo.conf to remove the disk and bios lines. Add /dev/hda to raid-extra-boot:

raid-extra-boot="/dev/hda, /dev/hdd"

Run lilo to save the boot information in the MBRs of both disks.

Reboot, and change the BIOS settings to put disk 1 back at the top of the boot order. It should boot normally.

Congratulations! You now have your LVM-on-RAID system running.

Testing

To test your setup, use the procedure described in the Software RAID HOWTO. The idea is to see whether the system will run with one of the drives unplugged, then to see if it will rebuild that drive once the drive is plugged in again. When one drive is unplugged or needs to be rebuilt, you must tell the BIOS to boot from the other drive.

  1. Power down the machine.
  2. Unplug the power cable from the slave disk.
  3. Restart the system, and log in. If everything looks OK, this means that your system can run without that drive.
  4. Type cat /proc/mdstat. It should show that one of the partitions is missing from each array.
  5. Power down, plug the drive back in, and restart. Then tell the kernel to rebuild /dev/hdd1, by typing: raidhotadd /dev/md0 /dev/hdd1
  6. Look at /proc/mdstat again; you should see the kernel rebuilding /dev/hdd1. This should only take a few seconds.
  7. Type: raidhotadd /dev/md1 /dev/hdd2
  8. The kernel should now start rebuilding /dev/hdd2. This will take just as long as it took when you first set it up, in the last section.

Then repeat the process with the other drive.

Monitoring

To get email notification when one of your drives fails, first:

mkdir /etc/raidcheck.d

Install /etc/raidcheck.d/readme.txt and /etc/raidcheck.d/raidcheck. Then:

chown root.root /etc/raidcheck.d/raidcheck
chmod 755 /etc/raidcheck.d/raidcheck
cp /proc/mdstat /etc/raidcheck.d/mdstat.reference
cp /etc/raidcheck.d/raidcheck /etc/cron.daily/raidcheck
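
The attached raidcheck script itself is not reproduced on this page. Judging only from the mdstat.reference file created above, a check of this kind could compare the current state of the arrays against the saved reference. The following sketch shows that idea; it is not the authors' script, and mdstat_changed is a hypothetical helper.

```shell
# Hypothetical helper: succeed if the current mdstat differs from the
# saved reference copy. Both paths can be overridden for testing.
mdstat_changed() {
    ref=${1:-/etc/raidcheck.d/mdstat.reference}
    cur=${2:-/proc/mdstat}
    ! cmp -s "$ref" "$cur"
}

# Usage, e.g. from a daily cron job (assumes a working `mail` command):
#   mdstat_changed && mail -s "RAID status changed" root < /proc/mdstat
```

A comparison like this also fires while an array is rebuilding, because the progress line changes the contents of /proc/mdstat.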

Recovery

If one of your drives should fail, this is the general procedure (assuming /dev/hdd has failed):

  • Replace the drive.
  • Partition /dev/hdd with exactly the same partitions as /dev/hda. You can either use sfdisk as shown above (sfdisk -d /dev/hda | sfdisk /dev/hdd), or if you prefer to see what you're doing:
    1. Type cfdisk /dev/hda to look at the partitions on /dev/hda, copy down their sizes, and quit cfdisk.
    2. Type cfdisk /dev/hdd, and create partitions of the same sizes. Set their type to Linux RAID autodetect.
    3. Tell cfdisk to write your changes to the partition table, and quit cfdisk.
    4. Check the contents of /etc/raidtab, and change any failed-disk to raid-disk.
  • Use raidhotadd to add the new partitions to each array:
    1. raidhotadd /dev/md0 /dev/hdd1
    2. raidhotadd /dev/md1 /dev/hdd2
  • Run lilo again, to write the master boot record on the new drive.
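
The recovery steps above can be summarised as a short command sequence. In this sketch, recovery_cmds is a hypothetical helper that only prints the commands for a given pair of drives. It assumes the two-partition layout used throughout this HOWTO and the sfdisk method of copying the partition table (the dump includes the partition types).

```shell
# Hypothetical helper: print (do not run) the recovery commands.
# $1 is the surviving disk, $2 the replacement, e.g. hda and hdd.
recovery_cmds() {
    good=$1
    new=$2
    echo "sfdisk -d /dev/$good | sfdisk /dev/$new"
    echo "raidhotadd /dev/md0 /dev/${new}1"
    echo "raidhotadd /dev/md1 /dev/${new}2"
    echo "lilo"
}

# Usage: recovery_cmds hda hdd
```

Check /etc/raidtab by hand as described above before running anything. It is probably safest to run lilo only once /proc/mdstat shows that the rebuild has finished.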

Using LVM

Resizing, adding and removing logical volumes is easy, but be sure to read the relevant sections of the LVM HOWTO before you attempt these operations.

The easiest approach is to unmount the filesystem in question, then use the e2fsadm command, as described in the LVM HOWTO.

Example: Creating a New Logical Volume

We have a top-level directory, /chroot, which is currently in the root filesystem. BIND is running chrooted in this directory, and we'd like to put it on its own logical volume, as a security measure; this way, if it fills up, it won't disturb any of the other filesystems. We have another logical volume, /dev/vg1/db, with some extra space on it. We'll reduce the size of /dev/vg1/db by 100 MB, and use that space for the new logical volume.

First, we reduce the size of /dev/vg1/db and the filesystem it contains.

umount /dev/vg1/db
e2fsadm --size -100M /dev/vg1/db

We create a new logical volume, and put an ext3 filesystem on it.

lvcreate --size 100M -n chroot vg1
mke2fs -j /dev/vg1/chroot
tune2fs -c 0 -i 0 /dev/vg1/chroot

We stop BIND, move our current /chroot out of the way, and make a new mount point for the new /chroot.

/etc/init.d/bind9 stop
mv /chroot /chroot-old
mkdir /chroot

We add a line to /etc/fstab to mount the new filesystem:

/dev/vg1/chroot /chroot     ext3     defaults    0       0

We remount our filesystems.

mount -a

We copy everything from /chroot-old into the new /chroot, and delete /chroot-old.

cp -a /chroot-old/. /chroot
rm -rf /chroot-old

We can now restart BIND.

/etc/init.d/bind9 start

Resizing the Root Filesystem

You'll need the ext2resize package:

apt-get install ext2resize

Normally you need to unmount a filesystem before resizing it. In order to unmount the root filesystem, you have to reboot the system into single-user mode (type Linux 1 at the boot: prompt), and log in using the root password. You can then unmount and resize the root filesystem, and resize /dev/vg1/root, using the procedures described in the LVM HOWTO. Note:

  1. As explained in the LVM HOWTO, you have to:
    • Extend the volume before extending the filesystem.
    • Reduce the filesystem before reducing the volume.
  2. It seems that e2fsadm and resize2fs don't work for the root filesystem; you need to use ext2resize, along with lvextend or lvreduce.
  3. You have to unmount the filesystem immediately before resizing it; if you do anything else in between, it seems to remount itself.

Upgrading the Kernel

Once your RAID and LVM configuration is working properly, the next time you want to upgrade the kernel, you must:

  1. Copy your old kernel's .config into the new kernel directory, type make oldconfig, and follow the prompts regarding any new options.
  2. Compile and install the kernel as described above.
  3. Run lvmcreate_initrd again as described above (the initrd file will be created in /boot, where it belongs).
  4. Edit your /etc/lilo.conf so that it contains a stanza for the new kernel, and make that kernel the default. This stanza must contain an initrd line like the one for the old kernel (but pointing to the new initrd file you just made in /boot). Don't forget to include the append="ramdisk_size=size" line.
  5. Double-check that any symbolic links in /boot are correct, and that they correspond to what's in /etc/lilo.conf.
  6. Run lilo to save your changes, and reboot. You should be able to choose either kernel from the boot menu.
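
For step 5, a quick way to spot a broken symbolic link or a forgotten initrd file is to check that every file named in lilo.conf actually exists. This is a sketch only. missing_images is a hypothetical helper, and it looks only at image= and initrd= lines.

```shell
# Hypothetical helper: print every image= or initrd= path in the
# given lilo.conf that does not exist. The optional second argument
# is prepended to each path (useful when checking under /mnt).
missing_images() {
    conf=$1
    root=${2:-}
    sed -n -e 's/^[[:space:]]*image=//p' \
           -e 's/^[[:space:]]*initrd=//p' "$conf" |
    while read -r path; do
        # -e follows symbolic links, so a dangling link is reported too.
        [ -e "$root$path" ] || echo "$path"
    done
}

# Usage: missing_images /etc/lilo.conf
```

No output means that every kernel image and initrd file listed in the configuration was found.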

Attachment   Size    Date                  Who            Comment
readme.txt   3.2 K   05 Jan 2004 - 22:29   BenjaminGeer   /etc/raidcheck.d/readme.txt
raidcheck    0.4 K   05 Jan 2004 - 22:30   BenjaminGeer   /etc/raidcheck.d/raidcheck

Copyright © 1999-2010 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.