Sunday, May 9, 2010

Linux udev and multipath

While technically /dev is always under the control of udev, as system administrators we rarely need to think about it, let alone control it. The best use case I have come up with is configuring storage for Oracle RAC. Oracle requires specific permissions on the devices; you can either use ASMlib (which I think a lot of companies do) or you can use udev. I prefer the udev approach because it eliminates yet another vendor-supplied software package that needs to be maintained. Being built into Linux means compatibility and updates are handled just like any other OS patch.

A similar argument can be made for multipath (device mapper), although its use cases spread far beyond just Oracle. Many vendors tout their own multipath software, and while it can have some extra features, I am of the opinion that 99% of what is required today is available in this free and integrated solution.

Multipath Setup
First off, make sure your storage array is supported by multipath. To my knowledge all of the major vendors are, but you may have one that isn't on the list. My setup is RedHat 5.4 with an EMC CX array, which is convenient as multipath has integrated support for the CLARiiON line. The key thing to get from your vendor is any special device entry they require in /etc/multipath.conf.

Next is to ensure multipath is installed. For RedHat this is in the package device-mapper-multipath, for SuSE it is multipath-tools. Do a package search for multipath and you should find it.

Configuration File
defaults {
    user_friendly_names no
}

multipaths {
    multipath {
        # replace with the WWID of your own LUN
        wwid  36006016015a0190038585d690b53df11
        alias ora_data1
    }
    multipath {
        # replace with the WWID of your own LUN
        wwid  36006016015a0190038585d690b53df12
        alias ora_data2
    }
}
If you leave user_friendly_names set to yes, Linux will create a lovely mpath device for you. I hate these. They are shorter than a WWID but they aren't guaranteed to be consistent across reboots and certainly not across multiple servers. So by specifying “no” you will end up with a device named after the World Wide Identifier (WWID) of the SAN target.

The multipath entries take this WWID and turn it into the name specified by the alias entry. Because we want to change the device permissions, a specific name or prefix is used. In this case I have chosen “ora_” but you could use whatever you like, or multiple prefixes for different functions. For example, you could have a vote_ or an ocr_ prefix with different permissions, although those disks aren't really required anymore with 11gR2.
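With more than a couple of LUNs, typing these stanzas by hand gets tedious. Here is a minimal sketch of generating them from a list of "WWID alias" pairs. The helper names mp_stanza and mp_section are made up for illustration and are not part of the multipath tools.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with multipath): build a multipaths
# section for /etc/multipath.conf from "wwid alias" pairs.

# Print a single multipath stanza for one WWID and its alias.
mp_stanza() {
    printf '    multipath {\n'
    printf '        wwid  %s\n' "$1"
    printf '        alias %s\n' "$2"
    printf '    }\n'
}

# Read "wwid alias" pairs from stdin, one per line, and wrap the
# resulting stanzas in a multipaths { ... } section.
mp_section() {
    printf 'multipaths {\n'
    while read -r wwid name; do
        if [ -n "$wwid" ] && [ -n "$name" ]; then
            mp_stanza "$wwid" "$name"
        fi
    done
    printf '}\n'
}

# Example with the two WWIDs used above.
printf '%s\n' \
    '36006016015a0190038585d690b53df11 ora_data1' \
    '36006016015a0190038585d690b53df12 ora_data2' | mp_section
```

The output is meant to be reviewed and pasted into /etc/multipath.conf rather than written there blindly.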

You can get WWIDs from the storage array or simply let Linux tell you when it scans them; we'll cover that in a later post.

Optional Configuration
If you have a different storage array, you may also have to add the vendor-specified devices entry, perhaps something like this:

devices {
    device {
        vendor "NETAPP"
        product "LUN"
        path_group_policy multibus
        getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
        prio_callout "/sbin/mpath_prio_ontap /dev/%n"
        features "1 queue_if_no_path"
        path_checker directio
        failback immediate
        flush_on_last_del yes
    }
}

This is only an example; check with your vendor. If they support multipath, they will have documentation with the correct entry for the current features and models.

You can also include a blacklist entry for your local devices, but I haven't found it to be required. If you would like one, an entry such as this would do:

blacklist {
    devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
    devnode "^hd[a-z]"
    devnode "^cciss!c[0-9]d[0-9]*[p[0-9]*]"
}
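To get a feel for what the first two devnode patterns would exclude, here is a rough sketch that runs device names through grep. It assumes the patterns behave like POSIX extended regular expressions, and is_blacklisted is a made-up helper name. The cciss pattern is left out because its nested brackets do not carry over to grep cleanly.

```shell
#!/bin/sh
# Hypothetical helper: approximates the first two devnode patterns from
# the blacklist using grep -E. It does not ask multipath itself.
is_blacklisted() {
    printf '%s\n' "$1" |
        grep -Eq '^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*|^hd[a-z]'
}

# Try a few typical kernel device names.
for dev in loop0 hda sr0 sda sdb; do
    if is_blacklisted "$dev"; then
        echo "$dev: blacklisted"
    else
        echo "$dev: left for multipath"
    fi
done
```

SCSI disks such as sda and sdb fall through both patterns, which is what you want for SAN LUNs.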

Multipath Daemon
The last step is to get the daemon running. Its job is to reconfigure paths when something breaks or when a link is put back in service.

Under RedHat / SuSE this is relatively simple to do:
chkconfig multipathd on

You can validate this with:
chkconfig --list multipathd

And finally, start up the daemon immediately with:
service multipathd start
- or -
/etc/init.d/multipathd start
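If you want to script the validation step, here is a small sketch that checks a line of chkconfig --list output for a given runlevel. The helper name runlevel_on is made up, and the sample line is illustrative rather than captured from a real system.

```shell
#!/bin/sh
# Hypothetical helper: reads one line of `chkconfig --list <service>`
# output on stdin and succeeds if the given runlevel is marked "on".
runlevel_on() {
    # Split the line into one field per line, then look for e.g. "3:on".
    tr -s ' \t' '\n' | grep -qx "$1:on"
}

# Illustrative sample line; on a real system use instead:
#   chkconfig --list multipathd | runlevel_on 3
sample=$(printf 'multipathd\t0:off\t1:off\t2:on\t3:on\t4:on\t5:on\t6:off')

for level in 3 5; do
    if printf '%s\n' "$sample" | runlevel_on "$level"; then
        echo "multipathd is enabled in runlevel $level"
    else
        echo "multipathd is NOT enabled in runlevel $level"
    fi
done
```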

Udev Configuration
Unfortunately, distributions vary greatly in how their udev rules are laid out, so here is the configuration for RedHat; your mileage may vary. I am going to store my new configuration as part of the multipath rules under /etc/udev/rules.d/40-multipath.rules. The changes from the default are the ora_rdsk symlinks and everything that refers to update_oracle_devs:

# multipath wants the devmaps presented as meaningful device names
# so name them after their devmap name
SUBSYSTEM!="block", GOTO="end_mpath"
KERNEL!="dm-[0-9]*", ACTION=="add", PROGRAM=="/bin/bash -c '/sbin/lsmod | /bin/grep ^dm_multipath'", RUN+="/sbin/multipath -v0 %M:%m"
KERNEL!="dm-[0-9]*", GOTO="end_mpath"
PROGRAM!="/sbin/mpath_wait %M %m", GOTO="end_mpath"
ACTION=="add", RUN+="/sbin/dmsetup ls --target multipath --exec '/sbin/kpartx -a -p p' -j %M -m %m"
PROGRAM=="/sbin/dmsetup ls --target multipath --exec /bin/basename -j %M -m %m", RESULT=="?*", NAME="%k", SYMLINK="ora_rdsk/%c", GOTO="update_oracle_devs"
PROGRAM!="/bin/bash -c '/sbin/dmsetup info -c --noheadings -j %M -m %m | /bin/grep -q .*:.*:.*:.*:.*:.*:.*:part[0-9]*-mpath-'", GOTO="end_mpath"
PROGRAM=="/sbin/dmsetup ls --target linear --exec /bin/basename -j %M -m %m", NAME="%k", RESULT=="?*", SYMLINK="ora_rdsk/%c", GOTO="update_oracle_devs"
GOTO="end_mpath"
LABEL="update_oracle_devs"
RESULT=="vote*",OWNER="oracle",GROUP="oinstall",MODE="644"
RESULT=="ocr*",GROUP="oinstall",MODE="640"
RESULT=="ora*",OWNER="oracle",GROUP="dba",MODE="660"
OPTIONS="last_rule"
LABEL="end_mpath"

The ora_rdsk entry is a way of keeping the Oracle disks (the multipath disks) in a unique location. The default is mpath; it's up to you. This means that all of my multipath devices will show up under /dev/ora_rdsk.

GOTO="end_mpath" is a catch-all for devices that aren't specifically sent to update_oracle_devs, which follows it. This is where the magic happens: each rule matches on a certain prefix and assigns an owner, a group, and permissions. You can get quite creative here if you like, with different patterns (udev matches use shell-style globs rather than full regular expressions) or device entries for different functions.
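To sanity-check which rule an alias would hit before touching udev, the three RESULT matches can be mimicked with a shell case statement, since case patterns use the same glob style. This is only a sketch, and perms_for is a made-up helper name. The values are copied from the update_oracle_devs rules above.

```shell
#!/bin/sh
# Hypothetical helper: mirrors the three RESULT== matches in the
# update_oracle_devs section. It never talks to udev.
perms_for() {
    case $1 in
        vote*) echo "owner=oracle group=oinstall mode=644" ;;
        # The ocr rule sets no OWNER, so the owner stays at its default.
        ocr*)  echo "owner=(unchanged) group=oinstall mode=640" ;;
        ora*)  echo "owner=oracle group=dba mode=660" ;;
        *)     echo "no match, defaults kept" ;;
    esac
}

# Try the aliases from multipath.conf plus one that matches nothing.
for name in ora_data1 ora_data2 vote_1 ocr_1 mpath0; do
    printf '%-10s %s\n' "$name" "$(perms_for "$name")"
done
```

An alias that doesn't start with one of the three prefixes falls through and keeps the default ownership and mode.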

Strictly speaking you don't need partitions to use ASM under Oracle RAC. If you would like to use them, they will show up as your designated alias plus a p and the partition number, for example ora_data1p1, ora_data1p2, etc. I generally don't use them, but if you want to change this, it is controlled earlier in the file where kpartx is run (the -p p option).

Putting it All Together
Just allocate your disk, put the entries into /etc/multipath.conf and you are off to the races. Your devices should all magically show up in /dev/ora_rdsk/. Next time I'll cover how to add and remove devices dynamically so you don't have to know your WWIDs up front or do any nasty rebooting.
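Once the devices appear, it is worth confirming that the permissions took effect. Here is a small sketch assuming GNU stat is available. dev_perms and report_dir are made-up helper names, and -L makes stat follow each symlink so the owner and mode reported are those of the underlying device node.

```shell
#!/bin/sh
# Hypothetical helpers: report owner, group and mode for every entry
# under a directory such as /dev/ora_rdsk.

# Print "owner:group mode path" for one entry, following symlinks.
dev_perms() {
    stat -L -c '%U:%G %a %n' "$1"
}

# Walk a directory (default /dev/ora_rdsk) and report each entry.
report_dir() {
    dir=${1:-/dev/ora_rdsk}
    for entry in "$dir"/*; do
        [ -e "$entry" ] || continue   # directory is empty or missing
        dev_perms "$entry"
    done
}

# Usage on the database server:
#   report_dir /dev/ora_rdsk
```

For the ora_ devices you would expect to see oracle:dba with mode 660, matching the rule in the udev file.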

2 comments:

  1. Hi,

    Do you get the right perms for /dev/mapper/ora_data1 too? Or will the symlink be owned by root?

  2. 99-asm.conf

    PROGRAM="/sbin/dmsetup info -c --noheadings -o name casm-09B6-OCR02p1" , RESULT=="casm-09B6-OCR02p1", NAME="test/casm-09B6-test1" , RUN+="/bin/chown oracle:dba /dev/mapper/casm-09B6-OCR02p1"