Showing posts with label udev. Show all posts

Tuesday, May 18, 2010

Dynamic Linux Disk

To start with, I am going to assume udev and multipath are set up as per my last post. Udev isn't required for scanning or device naming, but it is responsible for permissions and device location (directory). Device naming is actually controlled by the multipath driver, which in a modern Linux distribution is conveniently included in the kernel.

The second assumption is that multipath has the basic setup for your particular storage frame. Now, let's ensure multipath is running and set to start on every boot:

# service multipathd status
multipathd is stopped

# chkconfig --list multipathd
multipathd 0:off 1:off 2:off 3:off 4:off 5:off 6:off

# chkconfig multipathd on

# chkconfig --list multipathd
multipathd 0:off 1:off 2:off 3:on 4:off 5:on 6:off

# service multipathd start
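
The runlevel check can also be scripted. The helper below is a minimal sketch rather than part of the procedure itself: runlevels_on is a hypothetical name, and it assumes the N:on / N:off field layout that chkconfig --list prints above.

```shell
#!/bin/sh
# Sketch: report which runlevels are "on" in one line of `chkconfig --list`
# output. runlevels_on is a hypothetical helper name, not a standard tool.
runlevels_on() {
    # $1: a line such as "multipathd 0:off 1:off 2:off 3:on 4:off 5:on 6:off"
    on=""
    for field in $1; do            # unquoted on purpose: split into fields
        case "$field" in
            [0-6]:on) on="$on${on:+ }${field%%:*}" ;;
        esac
    done
    echo "$on"
}

# Example (needs a system that ships chkconfig):
#   runlevels_on "$(chkconfig --list multipathd)"
```

For the listing shown after "chkconfig multipathd on" this prints "3 5". An empty result means the service will not start at boot.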

As in the last post, I am dealing with RedHat 5.4 and an EMC CLARiiON array. Without any LUNs allocated, multipath -ll should look something like this:
# multipath -ll
sdb: checker msg is "emc_clariion_checker: Logical Unit is umbound or LUNZ"
sdc: checker msg is "emc_clariion_checker: Logical Unit is umbound or LUNZ"
sdd: checker msg is "emc_clariion_checker: Logical Unit is umbound or LUNZ"
sde: checker msg is "emc_clariion_checker: Logical Unit is umbound or LUNZ"

These entries are the four paths available to CX controllers.

Adding Devices

No Existing Devices
I have so far been unable to use the simple scan method on an HBA without any devices at all, so this process will unload and reload the adapter driver. It's a disruptive process on the fibre channel bus but there aren't any devices anyway, so it shouldn't matter.

First find the driver you are using. It is likely either Emulex (lpfc) or QLogic (qla2xxx). In this example I am using an Emulex card.
# lsmod | grep lpfc
lpfc 352909 0
scsi_transport_fc 73801 1 lpfc
scsi_mod 196569 10 scsi_dh,sr_mod,sg,usb_storage,lpfc,scsi_transport_fc,mptsas,mptscsih,scsi_transport_sas,sd_mod

Remove the module:
# rmmod lpfc

Insert the module:
# modprobe lpfc

Instead of modprobe, you can also use insmod. The difference is that insmod loads only the module file you point it at, while modprobe loads the driver along with any drivers it depends on.
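
If you are not sure which HBA driver is loaded, the lookup can be sketched as a small filter over lsmod output. This is only an illustration: detect_fc_driver is a hypothetical name, and it knows about just the two module names mentioned here (lpfc and qla2xxx).

```shell
#!/bin/sh
# Sketch: read `lsmod` output on stdin and print the first Fibre Channel HBA
# module found. Only lpfc and qla2xxx are checked; other drivers would need
# to be added. detect_fc_driver is a hypothetical helper name.
detect_fc_driver() {
    awk '$1 == "lpfc" || $1 == "qla2xxx" { print $1; found = 1; exit }
         END { if (!found) exit 1 }'
}

# Example (disruptive, only for an HBA with no devices in use):
#   drv=$(lsmod | detect_fc_driver) && rmmod "$drv" && modprobe "$drv"
```

The function returns non-zero when neither module is loaded, so the reload in the example is skipped rather than run against an empty name.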

This will allow the device(s) to show under /dev/ora_rdsk but won't create any multipath entries. To do that we simply run multipath.
# multipath
reload: 36006016015a01900796464949a36df11 DGC,RAID 5
[size=50G][features=1 queue_if_no_path|features=1
queue_if_no_path][hwhandler=1 emc][n/a]
\_ round-robin 0 [prio=2][undef]
\_ 4:0:1:0 sdc 8:32 [active][ready]
\_ 5:0:1:0 sde 8:64 [undef][ready]
\_ round-robin 0 [prio=0][undef]
\_ 4:0:0:0 sdb 8:16 [undef][ready]
\_ 5:0:0:0 sdd 8:48 [undef][ready]

Existing Devices
If you have at least one Fibre Channel device already present, you can simply rescan the bus. This will not take down the existing devices and can be done one path at a time, ensuring I/O can continue to flow. You will need to know which host devices are your fibre HBAs. To find them, list the known fibre adapters as follows and then issue a scan for each.
# ls -l /sys/class/fc_host
drwxr-xr-x 3 root root 0 Apr 17 08:50 host4
drwxr-xr-x 3 root root 0 Apr 17 08:50 host5

# echo "- - -" > /sys/class/scsi_host/host4/scan
# multipath -ll
36006016015a01900e2acd0d4a549df11 dm-3 DGC,RAID 5
[size=8.0G][features=1 queue_if_no_path|features=1
queue_if_no_path][hwhandler=1 emc][rw]
\_ round-robin 0 [prio=1][active]
\_ 1:0:0:1 sdh 8:112 [active][ready]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:1:1 sdi 8:128 [active][ready]

# echo "- - -" > /sys/class/scsi_host/host5/scan
# multipath -ll
36006016015a01900e2acd0d4a549df11 dm-3 DGC,RAID 5
[size=8.0G][features=1 queue_if_no_path|features=1
queue_if_no_path][hwhandler=1 emc][rw]
\_ round-robin 0 [prio=2][enabled]
\_ 1:0:0:1 sdh 8:112 [active][ready]
\_ 2:0:0:1 sdj 8:144 [active][ready]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:1:1 sdi 8:128 [active][ready]
\_ 2:0:1:1 sdk 8:160 [active][ready]

# ls -lL /dev/ora_rdsk
brw-rw---- 1 root root 253, 3 Apr 17 15:57 36006016015a01900e2acd0d4a549df11
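
The per-host scans above can be wrapped in a loop. The following is a rough sketch, assuming the sysfs layout shown (fc_host entries whose names match the scsi_host entries); rescan_fc_hosts and its optional sysfs-root argument are hypothetical and exist only so the loop can be tried against a scratch directory first.

```shell
#!/bin/sh
# Sketch: issue a wildcard SCSI scan on every Fibre Channel host.
# rescan_fc_hosts is a hypothetical helper; $1 is an optional sysfs root
# (defaults to /sys) so the loop can be exercised on a test directory.
rescan_fc_hosts() {
    sysfs=${1:-/sys}
    for dir in "$sysfs"/class/fc_host/host*; do
        [ -e "$dir" ] || continue          # glob did not match: no FC HBAs
        host=${dir##*/}
        # "- - -" is the wildcard for channel, target and LUN
        echo "- - -" > "$sysfs/class/scsi_host/$host/scan" && echo "scanned $host"
    done
}

# Example (as root):
#   rescan_fc_hosts && multipath -ll
```

This scans every host in one pass. To keep the one-path-at-a-time behaviour described above, run the echo for a single host and check multipath -ll before moving on to the next.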

Renaming Devices
The newly scanned device has a WWID name which isn't terribly useful for something like Oracle as we want udev to apply appropriate permissions. To do this, cut and paste the ID into /etc/multipath.conf so it looks something like this:
multipath {
wwid 36006016015a01900796464949a36df11
alias ora_test
}

And then remove the old device name and re-import it into multipath:
# multipath -f 36006016015a01900796464949a36df11

# multipath
create: ora_test (36006016015a01900796464949a36df11) DGC,RAID 5
[size=50G][features=1 queue_if_no_path|features=1
queue_if_no_path][hwhandler=1 emc][n/a]
\_ round-robin 0 [prio=2][undef]
\_ 1:0:1:0 sdc 8:32 [undef][ready]
\_ 2:0:1:0 sde 8:64 [undef][ready]
\_ round-robin 0 [prio=0][undef]
\_ 1:0:0:0 sdb 8:16 [undef][ready]
\_ 2:0:0:0 sdd 8:48 [undef][ready]

# ls -lL /dev/ora_rdsk
brw-rw---- 1 oracle dba 253, 2 Apr 17 09:44 ora_test

If you get the error "must provide a map name to remove" when running multipath -f, make sure you don't have a shell sitting inside the /dev/ora_rdsk directory. Also, be careful not to use multipath -F, as that flushes all unused multipath devices, which is probably not what you want.
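
To avoid typos when pasting the ID, the stanza can be generated instead of typed. This is a minimal sketch: multipath_stanza is a hypothetical helper, the hex-only check is my own sanity test rather than anything multipath enforces, and the output still has to be placed by hand inside the multipaths { } section of /etc/multipath.conf.

```shell
#!/bin/sh
# Sketch: print a multipath { } stanza for a WWID/alias pair.
# multipath_stanza is a hypothetical helper name.
multipath_stanza() {
    # $1: WWID (hex digits only), $2: alias
    case "$1" in
        ''|*[!0-9a-fA-F]*) echo "invalid WWID: $1" >&2; return 1 ;;
    esac
    printf 'multipath {\n    wwid %s\n    alias %s\n}\n' "$1" "$2"
}

# Example:
#   multipath_stanza 36006016015a01900796464949a36df11 ora_test
```

The example call prints the same four-line entry shown above, indented for readability.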

Removing Devices

Before doing the actual removal you will need to note several pieces of information: the multipath device name and all block devices assigned to it. All of these can be obtained from multipath -ll (marked with asterisks below).

# multipath -ll
*ora_test2* (36006016015a01900e2acd0d4a549df11) dm-3 DGC,RAID 5
[size=8.0G][features=1 queue_if_no_path|features=1
queue_if_no_path][hwhandler=1 emc][rw]
\_ round-robin 0 [prio=2][active]
\_ 1:0:0:1 *sdh* 8:112 [active][ready]
\_ 2:0:0:1 *sdj* 8:144 [active][ready]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:1:1 *sdi* 8:128 [active][ready]
\_ 2:0:1:1 *sdk* 8:160 [active][ready]

To remove the multipath device:

# multipath -f ora_test2

Then remove the appropriate block devices from the system with 'echo 1 > /sys/block/*dev*/device/delete', where *dev* is each path device noted above:

# echo 1 > /sys/block/sdh/device/delete
# echo 1 > /sys/block/sdj/device/delete
# echo 1 > /sys/block/sdi/device/delete
# echo 1 > /sys/block/sdk/device/delete
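
Collecting the path devices by eye gets tedious with many LUNs. The sketch below pulls them out of multipath -ll output and prints the delete commands instead of running them. It assumes the output layout shown above, where each path line carries a host:channel:target:lun field followed by the sd name; map_paths and print_delete_cmds are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: list the sd devices behind a multipath map, then print (not run)
# the matching delete commands. Both function names are hypothetical.

# Read `multipath -ll` output on stdin; print the field that follows each
# host:channel:target:lun field, which is the path's block device name.
map_paths() {
    awk '{ for (i = 1; i < NF; i++)
               if ($i ~ /^[0-9]+:[0-9]+:[0-9]+:[0-9]+$/) print $(i + 1) }'
}

# Read device names on stdin; print one delete command per device.
print_delete_cmds() {
    while read -r dev; do
        echo "echo 1 > /sys/block/$dev/device/delete"
    done
}

# Example: capture the paths BEFORE flushing the map, then review the output.
#   multipath -ll ora_test2 | map_paths | print_delete_cmds
```

Printing the commands first is a deliberate choice: deleting the wrong sd device takes a live path away from another LUN, so it is worth reading the list before pasting it into a root shell.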

Sunday, May 9, 2010

Linux udev and multipath

While technically /dev is always under the control of udev, as system administrators we rarely need to think about it, let alone control it. However, the best use case I have come up with is configuring storage for Oracle RAC. Oracle requires specific permissions on the devices; you can either use ASMlib (which I think a lot of companies do) or you can use udev. Obviously I like the udev approach, as it eliminates yet another vendor-supplied software package that needs to be maintained. Being built into Linux means compatibility and updates are handled just like any other OS patch.

A similar argument can be made for multipath (device mapper), although use cases for this spread far beyond just Oracle. Many vendors tout the use of their own multipath software, and while they can have some extra features, I am of the opinion that 99% of what is required today is available in this free and integrated solution.

Multipath Setup
First off, make sure your storage array is supported by multipath. To my knowledge all of the major vendors are, but you may have one that isn't on the list. My setup is RedHat 5.4 with an EMC CX array, which is convenient as multipath has integrated support for the CLARiiON line. The key thing to get from your vendor is any special device entry they may need in /etc/multipath.conf.

Next is to ensure multipath is installed. For RedHat this is in the package device-mapper-multipath, for SuSE it is multipath-tools. Do a package search for multipath and you should find it.

Configuration File
defaults {
    user_friendly_names no
}

multipaths {
    multipath {
        # wwid takes the WWID of the LUN (example values shown)
        wwid 36006016015a0190038585d690b53df11
        alias ora_data1
    }
    multipath {
        wwid 36006016015a0190038585d690b53df12
        alias ora_data2
    }
}
If you leave user_friendly_names set to yes, Linux will create a lovely mpath device for you. I hate these. They are shorter than a WWID, but they aren't guaranteed to be consistent across reboots and certainly not across multiple servers. By specifying “no” you end up with a device named after the World Wide Identifier (WWID) of the SAN target.

The multipath entries take this WWID and turn it into the name specified by the alias entry. Because we want to change the device permissions, a specific name or prefix is used. In this case I have chosen “ora_” but you could use whatever you like, or multiple prefixes for different functions. For example, you could have a vote_ or an ocr_ prefix using different permissions, although those disks aren't really required anymore with 11gR2.

You can get WWIDs from the storage array or simply let Linux tell you when it scans them. We'll cover that in a later post.
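
As a rough preview of letting Linux tell you: once the LUNs have been scanned, the WWIDs can be pulled out of multipath -ll output. The sketch assumes the output layout of this multipath version, where a map line reads either "WWID dm-N vendor,product" or "alias (WWID) dm-N vendor,product"; list_wwids is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: read `multipath -ll` output on stdin and print one WWID per map.
# Assumes map header lines contain a " dm-N " field; list_wwids is a
# hypothetical helper name.
list_wwids() {
    awk '/ dm-[0-9]+ / {
        if ($2 ~ /^\(.*\)$/)
            print substr($2, 2, length($2) - 2)   # aliased map: strip the parentheses
        else
            print $1                              # unaliased map: the name is the WWID
    }'
}

# Example:
#   multipath -ll | list_wwids
```

Each printed value can be pasted directly after the wwid keyword in a multipath entry.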

Optional Configuration
Now if you have a different storage array you may also have to add the vendor specified devices entry, perhaps something like this:

devices {
    device {
        vendor "NETAPP"
        product "LUN"
        path_group_policy multibus
        getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
        prio_callout "/sbin/mpath_prio_ontap /dev/%n"
        features "1 queue_if_no_path"
        path_checker directio
        failback immediate
        flush_on_last_del yes
    }
}

This is only an example; check with your vendor. If they support multipath, they will have documentation with the correct entry for the current features and models.

You can also include a blacklist entry for your local devices, but I haven't found it to be required. However, if you like, an entry such as this would do:

blacklist
{
devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
devnode "^hd[a-z]"
devnode "^cciss!c[0-9]d[0-9]*[p[0-9]*]"
}

Multipath Daemon
The last step is to get the daemon running. Its job is to reconfigure paths when something breaks or when a link is put back in service.

Under RedHat / SuSE this is relatively simple to do:
chkconfig multipathd on

You can validate this with:
chkconfig --list multipathd

And finally start up the daemon immediately with:
service multipathd start
- or -
/etc/init.d/multipathd start

Udev Configuration
Unfortunately distributions can vary greatly in how their udev rules are laid out, so here is the configuration for RedHat; your mileage may vary. I am going to store my new configuration as part of the multipath rules under /etc/udev/rules.d/40-multipath.rules. The changes from the default include the ora_rdsk symlinks and the update_oracle_devs section:

# multipath wants the devmaps presented as meaninglful device names
# so name them after their devmap name
SUBSYSTEM!="block", GOTO="end_mpath"
KERNEL!="dm-[0-9]*", ACTION=="add", PROGRAM=="/bin/bash -c '/sbin/lsmod | /bin/grep ^dm_multipath'", RUN+="/sbin/multipath -v0 %M:%m"
KERNEL!="dm-[0-9]*", GOTO="end_mpath"
PROGRAM!="/sbin/mpath_wait %M %m", GOTO="end_mpath"
ACTION=="add", RUN+="/sbin/dmsetup ls --target multipath --exec '/sbin/kpartx -a -p p' -j %M -m %m"
PROGRAM=="/sbin/dmsetup ls --target multipath --exec /bin/basename -j %M -m %m", RESULT=="?*", NAME="%k", SYMLINK="ora_rdsk/%c", GOTO="update_oracle_devs"
PROGRAM!="/bin/bash -c '/sbin/dmsetup info -c --noheadings -j %M -m %m | /bin/grep -q .*:.*:.*:.*:.*:.*:.*:part[0-9]*-mpath-'", GOTO="end_mpath"
PROGRAM=="/sbin/dmsetup ls --target linear --exec /bin/basename -j %M -m %m", NAME="%k", RESULT=="?*", SYMLINK="ora_rdsk/%c", GOTO="update_oracle_devs"
GOTO="end_mpath"
LABEL="update_oracle_devs"
RESULT=="vote*",OWNER="oracle",GROUP="oinstall",MODE="644"
RESULT=="ocr*",GROUP="oinstall",MODE="640"
RESULT=="ora*",OWNER="oracle",GROUP="dba",MODE="660"
OPTIONS="last_rule"
LABEL="end_mpath"

The ora_rdsk entry is a way of keeping the Oracle disks (multipath disks) in a unique location. The default is mpath; it's up to you. This means that all of my multipath devices will show up under /dev/ora_rdsk.

GOTO="end_mpath" is an entry which acts as a catch for devices that aren't specifically sent to update_oracle_devs, which of course follows. This is where the magic happens: it matches on a name prefix and assigns an owner, a group, and permissions. You can get quite creative here if you like, with different match patterns or device entries for different functions.
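
To check which settings a given alias will pick up before touching udev, the three RESULT matches can be mirrored in shell, since udev's patterns here are shell-style globs. This is only an illustration of the matching logic in the rules above: oracle_dev_perms is a hypothetical helper, and it reports just what those rules set, not the defaults udev applies otherwise.

```shell
#!/bin/sh
# Sketch: mirror the prefix matching of the update_oracle_devs rules.
# oracle_dev_perms is a hypothetical helper; it prints only the attributes
# the rules assign for a given alias name.
oracle_dev_perms() {
    # $1: multipath alias (the RESULT value in the udev rules)
    case "$1" in
        vote*) echo "owner=oracle group=oinstall mode=644" ;;
        ocr*)  echo "group=oinstall mode=640" ;;
        ora*)  echo "owner=oracle group=dba mode=660" ;;
        *)     echo "unchanged" ;;
    esac
}

# Example:
#   oracle_dev_perms ora_data1
```

udev evaluates the three rules one after another rather than stopping at the first match, but because the prefixes cannot overlap, the first-match case statement gives the same answer.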

Strictly speaking you don't require partitions to use ASM under Oracle RAC, however, if you would like to, they will show up as your designated alias plus a p. For example ora_data1p1, ora_data1p2, etc. I generally don't use them but if you want to change this it is controlled earlier in the file when kpartx is run.

Putting it All Together
Just allocate your disk, put the entries into /etc/multipath.conf and you are off to the races. Your devices should all magically show up in /dev/ora_rdsk/. Next time I'll cover how to add and remove devices dynamically so you don't have to know your WWIDs up front or do any nasty rebooting.