Simple Solaris SDS/SVM Boot Disk Mirroring Howto.
Version 0.42 - Barbie LeVile - 29.06.2006

What we want to do:

1) We want to create a mirrored system disk environment that can be booted from and survives a disk failure. In our example case we have 2 drives we want to use for that. A three way mirror can be done in much the same way.

2) Some background on SDS concepts

Metadbs:
A metadb stores state information about the metadevices; metadbs are critical to SDS operation. Metadbs must follow these rules to work: to be able to boot an SDS system, at least 2 metadbs and at least 50%+1 of all existing metadbs need to be in an ok state. Yes, both statements must be true. For that reason metadbs are normally placed on their own small slice on each drive, which leaves less chance of accidentally deleting them. As a rule of thumb, 3 are placed on each drive in a 2 drive setup, and 2 on each drive in a 3 drive setup.

However, an SDS system with only 2 disks can NEVER fulfill the minimum requirements in case of a disk failure, because with the metadbs split evenly the surviving disk holds only half of them. There are two choices for such situations: 1) set the kernel to ignore the problem and keep booting, or 2) fix the problem manually before bringing the system back up. In this example we will set the kernel to ignore it, since in most cases that is the preferred mode of operation. One should however not forget to fix the problem ASAP.
man metadb for details

Metadevices:
Metadevices are the building blocks of SDS. First one creates a metadevice out of a logical disk device like a slice, then assembles those into another metadevice. Example: we have 2 disks, each has 5 slices, and each slice becomes a metadevice. In a mirror those are called submirrors, and one from each disk gets assembled into the actual mirror. Metadevices are created via metainit.
man metainit for details

Metaroot:
The metaroot command a) sets up the kernel to be able to handle an SDS mirror as boot device and b) creates a metadevice entry for the root filesystem in /etc/vfstab.
It can be used to revert the root device back to non-metadevice mode too.
man metaroot for details

Hints:
SDS likes to have drives of the same geometry. If that is not possible, make sure that the drive that holds your data has the smaller slices. You can always mirror a smaller slice to a bigger one, but not the other way around.

3) The example disk layout:

We have 2 drives, c0t0d0 and c1t0d0. Drive 1 is sliced as follows:

    c0t0d0s0    swap
    c0t0d0s1    /
    c0t0d0s2    backup (whole disk)
    c0t0d0s3    /var
    c0t0d0s4    /opt
    c0t0d0s5    /export/home
    c0t0d0s6    unused
    c0t0d0s7    here we place our metadbs

We have no separate /usr in this example; add it yourself if you want it, but generally it is not needed anymore. s7 should be around 50 MB. More is not needed, even with tens of metadbs.

Now we plan how to name our metadevices. I usually name my metadevices using the following scheme:

    d<slice><n>

where <n> is 0 for the mirror, 1 for the submirror on disk 1, 2 for disk 2 and 3 for disk 3. The exception is swap, which lies on s0: there the slice digit is simply omitted.

Which gives us the following:

    Disk 1              Disk 2              Metadevices
    c0t0d0s0 -> d1      c1t0d0s0 -> d2      d1  + d2  -> d0
    c0t0d0s1 -> d11     c1t0d0s1 -> d12     d11 + d12 -> d10
    c0t0d0s3 -> d31     c1t0d0s3 -> d32     d31 + d32 -> d30
    c0t0d0s4 -> d41     c1t0d0s4 -> d42     d41 + d42 -> d40
    c0t0d0s5 -> d51     c1t0d0s5 -> d52     d51 + d52 -> d50

4) Time to get our hands dirty!

The following steps should ideally be done in single user mode.

4.1) Making both drives the same.

We start by slicing the second drive in the same way as our first drive, the master.

    # prtvtoc /dev/rdsk/c0t0d0s2 | fmthard -s - /dev/rdsk/c1t0d0s2

There is no need to newfs the slices of the second drive here; that will be taken care of automatically by the mirror sync later.

Note: If your drives are of a different geometry, you need to create the slices on the second disk by hand and not use the command above. In such a case, make sure the second disk has identical or bigger slice sizes.

4.2) Metadbs

We can now set up our metadbs.
    # metadb -a -f -c3 c0t0d0s7 c1t0d0s7

Since this is the initial creation of the metadbs, we need to force it with -f. -a adds the metadbs, and -c tells it how many to create on each slice. You can see the results with metadb -i, which is a very handy tool for checking the state of your metadbs.

4.3) Initializing the devices

Now we set up the initial metadevices.

    # metainit -f d1 1 1 c0t0d0s0
    # metainit -f d11 1 1 c0t0d0s1
    # metainit -f d31 1 1 c0t0d0s3
    # metainit -f d41 1 1 c0t0d0s4
    # metainit -f d51 1 1 c0t0d0s5

    # metainit -f d2 1 1 c1t0d0s0
    # metainit -f d12 1 1 c1t0d0s1
    # metainit -f d32 1 1 c1t0d0s3
    # metainit -f d42 1 1 c1t0d0s4
    # metainit -f d52 1 1 c1t0d0s5

Like metadb, metainit must be forced with -f, but this time not because it is the initial creation, but because we work on mounted filesystems. Here we create a one way concatenation (1 stripe, 1 slice wide) out of each of our actual slices, which forms the needed submirrors.

4.3) Mirroring fun part 1!

    # metainit d0 -m d1
    # metainit d10 -m d11
    # metainit d30 -m d31
    # metainit d40 -m d41
    # metainit d50 -m d51

Here the actual mirrors are initialized. The -m tells SDS that we want to build a mirror with the name given in the first column, consisting of the submirror given in the third column. We now have a one way mirror of our system drive, but it is not active yet.
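The naming scheme from section 3 and the metainit calls above follow a regular pattern, so they can be generated rather than typed. The following is only a sketch: the function names md_name and print_plan are made up for illustration, the disk names and slice list are the ones from this example, and the script only prints the commands and never runs them.

```shell
#!/bin/sh
# Sketch only: md_name and print_plan are hypothetical helpers, not SDS tools.

# md_name SLICE ROLE
#   ROLE: 0 = mirror, 1 = submirror on disk 1, 2 = submirror on disk 2
md_name() {
    if [ "$1" -eq 0 ]; then
        echo "d$2"        # slice 0 (swap): the slice digit is omitted
    else
        echo "d$1$2"      # e.g. slice 3, role 1 -> d31
    fi
}

# Print (do not execute) the metainit commands for every mirrored slice.
print_plan() {
    for s in 0 1 3 4 5; do
        echo "metainit -f $(md_name "$s" 1) 1 1 c0t0d0s$s"
        echo "metainit -f $(md_name "$s" 2) 1 1 c1t0d0s$s"
        echo "metainit $(md_name "$s" 0) -m $(md_name "$s" 1)"
    done
}

print_plan
```

Review the printed list against the table in section 3 before pasting anything into a root shell.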
4.4) Setting the root device

    # cp /etc/vfstab /etc/vfstab_pre_sds
    # cp /etc/system /etc/system_pre_sds
    # metaroot d10

4.5) Setting up /etc/vfstab

metaroot has already changed the entry for /. The remaining entries are switched to the metadevices by hand, and the old ones are kept as comments:

    #device            device              mount          FS      fsck    mount    mount
    #to mount          to fsck             point          type    pass    at boot  options
    #
    fd                 -                   /dev/fd        fd      -       no       -
    /proc              -                   /proc          proc    -       no       -
    #
    # sds drives
    #
    /dev/md/dsk/d0     -                   -              swap    -       no       -
    /dev/md/dsk/d10    /dev/md/rdsk/d10    /              ufs     1       no       logging
    /dev/md/dsk/d30    /dev/md/rdsk/d30    /var           ufs     1       no       logging,noatime
    /dev/md/dsk/d40    /dev/md/rdsk/d40    /opt           ufs     2       yes      logging
    /dev/md/dsk/d50    /dev/md/rdsk/d50    /export/home   ufs     2       yes      logging
    #
    # non sds drives
    #
    #/dev/dsk/c0t0d0s0 -                   -              swap    -       no       -
    #/dev/dsk/c0t0d0s1 /dev/rdsk/c0t0d0s1  /              ufs     1       no       logging
    #/dev/dsk/c0t0d0s3 /dev/rdsk/c0t0d0s3  /var           ufs     1       no       logging,noatime
    #/dev/dsk/c0t0d0s4 /dev/rdsk/c0t0d0s4  /opt           ufs     2       yes      logging
    #/dev/dsk/c0t0d0s5 /dev/rdsk/c0t0d0s5  /export/home   ufs     2       yes      logging
    #
    # swap slices
    #
    swap               -                   /tmp           tmpfs   -       yes      -

4.5.1) Optional: Making sure we can boot in case of disk failure.

USE ONLY WHERE DATA INTEGRITY IS LESS IMPORTANT THAN SERVICE AVAILABILITY. THIS CAN LEAD TO FILE CORRUPTION.

To make sure we can boot in case a disk fails, we need to tell the kernel to ignore the quorum on metadbs. Otherwise we can't boot in a two disk setup, because we can never fulfill the requirements. For that we add the following line to /etc/system:

    set md:mirrored_root_flag=1

    # echo "set md:mirrored_root_flag=1" >> /etc/system

I suggest creating a copy of the /etc/system file and modifying the copy instead of the original. This lets the system normally run with the data-safe settings, and allows switching to the other /etc/system file at boot via boot -a if needed.

4.6) Our first reboot!

Bring the system down to the OBP; don't reboot fully yet. We need to set up the boot devices now.

    ok setenv boot-device disk0:b disk1:b

Make sure the devaliases disk0 and disk1 are actually pointing to the correct hardware devices.

Now we boot our system up for the first time on the mirror.
If all went well we are up and running in a few seconds.

4.7) Mirroring fun part 2!

Now it is time to hook up the second drive so that we actually have mirrored slices.

    # metattach d0 d2
    # metattach d10 d12
    # metattach d30 d32
    # metattach d40 d42
    # metattach d50 d52

This will take a considerable amount of time. Use metastat to check on the progress of the sync. Example:

    # metastat d30

4.8) Swap

Since swap is now located on a metadevice, we want to tell the system that:

    # dumpadm -d /dev/md/dsk/d0

And since resyncing swap at boot is just wasted time, we disable that:

    # metaparam -p 0 /dev/md/dsk/d0

5) All done, enjoy

6) Performance and other Tips

To speed up the resync process, especially with modern disks, raise the physical I/O buffer size. The default is 32KB and 1MB is what we set here; however, it is a setting that needs experimenting for best effect.

/etc/system - add the following line:

    set maxphys=0x100000

/etc/rc2.d/S95svm.resync - modify the `$METASYNC -r' line to be `$METASYNC -r 2048'

To prevent write-on-write problems, where different programs write to the two sides of the mirror at the same time and the sides end up holding two different versions of a file, set the following.

/etc/system - add the following line:

    set md_mirror:md_mirror_wow_flg=0x20

7) Troubleshooting

7.1) metattach: submirror too small to attach

format shows both sides of the mirror to be the same size, metastat however shows they are of different size. What is going on?

- You have probably put metadbs on one of the data slices on one disk, but not on the other. Since metadbs take disk space as well, this inconsistency happens. Place metadbs either in their own slice, or make sure to have the same number of them on the same slices on both disks.
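As a closing aside, the metadb boot rule from section 2 (at least 2 metadbs ok, and at least 50%+1 of all metadbs ok) can be checked with a little arithmetic. This is a sketch of the rule as this howto states it, not of what the kernel actually does; the function name can_boot is made up, and reading "50%+1" as "half, rounded down, plus one" is an assumption.

```shell
#!/bin/sh
# Sketch only: can_boot is a hypothetical helper that mirrors the rule
# from section 2, it does not query SDS in any way.

# can_boot OK TOTAL
#   OK    = number of metadbs in an ok state
#   TOTAL = number of all existing metadbs
# Succeeds (exit 0) if the rule is met, fails otherwise.
can_boot() {
    ok=$1
    total=$2
    need=$(( total / 2 + 1 ))     # assumed meaning of "50%+1"
    [ "$ok" -ge 2 ] && [ "$ok" -ge "$need" ]
}

# Healthy 2 disk setup, 2 disk setup with one disk lost,
# 3 disk setup (2 metadbs per disk) with one disk lost.
for pair in "6 6" "3 6" "4 6"; do
    set -- $pair
    if can_boot "$1" "$2"; then
        echo "$1 of $2 metadbs ok: rule met"
    else
        echo "$1 of $2 metadbs ok: rule NOT met"
    fi
done
```

The middle case is the two disk problem described in section 2: with 3 metadbs per disk, losing one disk leaves 3 of 6, which is exactly half and therefore not enough. That is why section 4.5.1 exists.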