Saturday, August 18, 2012

VMDK Duplicate UUIDs

I've been working on a program to recover VMware guest data using array-based snapshots of VMDK files. Basically, this means taking a clone of the datastore and presenting it back to the ESX server so that a copy of the active VMDK can be presented back to the guest for file-based recovery. The bulk of the code is in Java using vijava, with the OS-specific parts each guest has to perform written in bash for Linux and PowerShell for Windows. However, under certain combinations, like Linux on VMware 4.0 or Windows on any version of VMware, I get a duplicate UUID message on the console. The message is accurate, since the clone really does carry the same UUID as the original, but the real problem is that it pauses the virtual machine until someone clicks the button. Nasty if no one's watching.
The common solution I've found is to run /usr/sbin/vmkfstools on the ESX server; however, this does me no good from my application. I've finally found a solution. The key lies in connecting to the Virtual Disk Manager, which can be done either directly against an ESX server or against Virtual Center itself. I prefer Virtual Center as I don't have to manage any accounts on the ESX servers, and because of the way my program works, I already have a connection to Virtual Center. The main difference between the two is that you'll also need a Datacenter object when connecting to VC, whereas you can specify null when connecting to an ESX server. In this example, I already have my own virtual machine object which I've wrapped into something called vmData.
private ServiceInstance serviceInstance;
serviceInstance = new ServiceInstance(new URL("https://"+vmwareServer+"/sdk"), userName, password, true);
Datacenter datacenter = (Datacenter)vmData.getHostSystem().getParent().getParent().getParent();
VirtualDiskManager vmd = serviceInstance.getVirtualDiskManager();
The inventory path from a host system to a data center is 'datacenter --> hostFolder --> childEntity (ComputeResource or its subtype) --> host'. So we call getParent() three times to work our way back up the path.
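If the host happens to sit deeper in the inventory (for example, inside a nested folder), counting getParent() calls by hand gets fragile. Here is a minimal sketch of the alternative: climb until the wanted type shows up. The Node class is a made-up stand-in for illustration only; it is not the vijava ManagedEntity class, and the names are my own assumptions.

```java
import java.util.Optional;

final class ParentWalkSketch {
    // Hypothetical stand-in for an inventory object; NOT a vijava class.
    static class Node {
        final String kind;
        final Node parent;

        Node(String kind, Node parent) {
            this.kind = kind;
            this.parent = parent;
        }
    }

    // Climbs the parent chain until a node of the requested kind is found.
    static Optional<Node> findAncestor(Node start, String kind) {
        for (Node n = start.parent; n != null; n = n.parent) {
            if (n.kind.equals(kind)) {
                return Optional.of(n);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // datacenter --> hostFolder --> ComputeResource --> host
        Node dc = new Node("Datacenter", null);
        Node hostFolder = new Node("Folder", dc);
        Node compute = new Node("ComputeResource", hostFolder);
        Node host = new Node("HostSystem", compute);
        System.out.println(findAncestor(host, "Datacenter").isPresent());
    }
}
```

With the real API the same loop would presumably test each parent with instanceof Datacenter, but I haven't shown that here since it depends on the vijava jar.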

Once we have our Virtual Disk Manager, all we have to do is assign a new UUID. Java has a really easy way of generating one (UUID.randomUUID()), but of course it can't be that simple. VMware requires a specific format for this UUID as well as a specific prefix ("60 00 C2 9"), although I can't find any documentation stating the prefix. The format is normally 8-4-4-4-12, but for some reason the VMDK UUID is 16-16: sixteen hex digits on each side of a single hyphen, written as space-separated bytes. This turns out to be a little tricky as I wanted something truly random. My solution comes from the Java UUID object source code and its toString method. To handle the required VMware prefix, I grab the original VMDK's UUID and take the first half, add my generated half to it and, voilà, a cloned VMDK with its own UUID.
String oldUUID = vmd.queryVirtualDiskUuid(fileName, datacenter);
String firstHalf = oldUUID.split("-")[0];
String halfUUID = genHalfVMwareUUID();
vmd.setVirtualDiskUuid(fileName, datacenter, firstHalf+"-"+halfUUID);

private String genHalfVMwareUUID() {
 UUID uuid = UUID.randomUUID();
 Long second = uuid.getLeastSignificantBits();
 String secondHalf = longToVMwareUUID(second);
 return secondHalf;
}

private String longToVMwareUUID(Long val) {
 StringBuffer buffer = new StringBuffer();

 // the entire long is 64 bits, we want the first two so we can add spaces
 // we need to offset the whole thing by 56 to start
 // (8 bits or 2 x 4 bit hex characters from the left)
 // each iteration decrements by two characters, 8 bits
 // e.g.
 // UUID = d645da13-87f6-4e50
 // UUID Binary = 1101011001000101110110100001001110000111111101100100111001010000
 // UUID >> 56 =  11111111111111111111111111111111111111111111111111111111 11010110
 // when we bitwise and (&) (using the digits method)
 // against an 8 bit sequence we get just the last 8 bits
 // 11010110
 // which equals d6, the first two characters
 // the second sequence would be shifted by 48
 // 111111111111111111111111111111111111111111111111 1101011001000101
 // bitwise and against 8 bits and we get 01000101, which in hex is 45
  
 for (int i = 56; i >= 0; i-=8) {
  buffer.append(digits(val >> i, 2));
  buffer.append(" ");
 }

 // remove the last space
 buffer.deleteCharAt(buffer.length()-1);
 return buffer.toString();
}

private String digits(long val, int digits) {
 // each hex character is 4 bits (2 ^ 4 = 16 possibilities)
 // so multiply the number of digits desired by 4 and create a long of 1's that size
 long hi = 1L << (digits * 4);
 return Long.toHexString(hi | (val & (hi - 1))).substring(1);
}
Now that the UUIDs have been properly dealt with, there are no more Virtual Center messages and no more paused virtual machines.
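For anyone who wants to try the formatting step without a vCenter connection, here is a compact, self-contained sketch of the same idea using String.format instead of the digits() helper. The class and method names are my own, and the sample "old" UUID in main is made up for illustration; only the "60 00 C2 9" prefix comes from what I observed above.

```java
import java.util.UUID;

final class VmdkUuidSketch {
    // Formats a 64-bit value as eight space-separated, zero-padded hex bytes.
    static String toSpacedHex(long value) {
        StringBuilder sb = new StringBuilder(23);
        for (int shift = 56; shift >= 0; shift -= 8) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Unsigned shift plus mask isolates one byte at a time.
            sb.append(String.format("%02x", (value >>> shift) & 0xFF));
        }
        return sb.toString();
    }

    // Keeps the first half of an existing disk UUID and swaps in a new second half.
    static String withNewSecondHalf(String oldUuid, long randomBits) {
        String firstHalf = oldUuid.split("-")[0];
        return firstHalf + "-" + toSpacedHex(randomBits);
    }

    public static void main(String[] args) {
        // Made-up sample value, not taken from a real disk.
        String old = "60 00 C2 93 11 22 33 44-55 66 77 88 99 aa bb cc";
        long random = UUID.randomUUID().getLeastSignificantBits();
        System.out.println(withNewSecondHalf(old, random));
    }
}
```

Using the worked example from the comments, toSpacedHex(0xd645da1387f64e50L) gives "d6 45 da 13 87 f6 4e 50".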

Wednesday, April 18, 2012

Reclaiming VMDK Space

Unless you are running Microsoft Cluster services, and with VMware HA I'm not sure why you would, I can't really see a reason not to thin provision. However, as capacity within the guest is consumed, the size of the VMDK files increases. Even once data is cleaned up, that VMDK space is never reclaimed. Virtual Center shows this as "Provisioned Storage" vs "Used Storage" as shown in the screen shot below.
In this case the VMDK is 16GB and I'm currently using 13.18GB on disk. When we take a look inside the guest, we can see it has only a couple of GB consumed.
# df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/rootvg-root   12G  1.9G  9.3G  17% /
tmpfs                    499M     0  499M   0% /dev/shm
/dev/sda1                 95M   43M   48M  47% /boot
/dev/mapper/rootvg-var   2.0G  117M  1.8G   7% /var
The easiest way I know to fix the problem is to Storage vMotion the guest to another datastore and afterwards, if you like, move it back. The trick is that you have to zero out the free space in the file system first, so that VMware can thin provision it.

This is pretty easy, although it can take a few minutes depending on how much space you need to 'fill'. The following will create a 9GB zero filled file, flush changes to disk, and then remove it. You could of course fill the entire file system but this could impact running applications, so I'll leave that up to you to decide.
# dd if=/dev/zero of=/fill_file bs=1024k count=9216; sync; rm /fill_file
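If you'd rather work out the count value from the free space than guess it, the arithmetic is simple enough to sketch. This is only an illustration: the class name, method name and the 512MB headroom in main are my own arbitrary choices, not anything VMware or dd requires.

```java
import java.io.File;

final class FillSizeSketch {
    static final long MIB = 1024L * 1024L;

    // Number of 1 MiB blocks to write while leaving reserveBytes free.
    static long blocksToFill(long availableBytes, long reserveBytes) {
        long fillBytes = availableBytes - reserveBytes;
        return fillBytes <= 0 ? 0 : fillBytes / MIB;
    }

    public static void main(String[] args) {
        // Free space on the root file system as seen by the JVM.
        long available = new File("/").getUsableSpace();
        long reserve = 512 * MIB; // arbitrary headroom for running applications
        System.out.println("dd count with bs=1024k: " + blocksToFill(available, reserve));
    }
}
```

For example, 10GB available with 1GB held back works out to the count=9216 used in the dd command above.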
Now your free space in the virtual disk is filled with zeros. All that's left is to Storage vMotion the VMDK to another datastore. I only have experience with NFS, in which case I can select "Same format as source". If you are using VMFS, you should probably select "Thin provisioned format".
Once completed, you'll see the used capacity back in line with what the guest is actually using.

Sunday, February 26, 2012

Jumbo Frame Test

How can you tell if jumbo frames are working between two devices? Well, the easiest way I know is a simple ping test.

Linux:
Success
# ping -s 8000 -M do -c 1 192.168.0.1
PING 192.168.0.1 (192.168.0.1) 8000(8028) bytes of data.
8008 bytes from 192.168.0.1: icmp_seq=1 ttl=255 time=0.555 ms
Failure
# ping -s 9000 -M do -c 1 192.168.0.1
PING 192.168.0.1 (192.168.0.1) 9000(9028) bytes of data.
From 192.168.0.1 icmp_seq=1 Frag needed and DF set (mtu = 9000)

VMware:
Success
# ping -s 8000 -d 10.1.1.101
PING 10.1.1.101 (10.1.1.101): 8000 data bytes
8008 bytes from 10.1.1.101: icmp_seq=0 ttl=64 time=0.410 ms
Failure
# ping -s 9000 -d 10.1.1.101
PING 10.1.1.101 (10.1.1.101): 9000 data bytes
sendto() failed (Message too long)
Even though we set the MTU to 9000 bytes, not all of it can be used by the ICMP payload, because IP plus ICMP need 28 bytes of overhead (a 20-byte IPv4 header and an 8-byte ICMP header). This means the maximum payload is 9000 - 28 = 8972, but anything larger than 1472 (the maximum for a standard 1500-byte MTU) should prove the point.
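The arithmetic is easy to sketch in a few lines. The class and method names here are my own, and the header sizes assume plain IPv4 with no IP options.

```java
final class JumboPayloadSketch {
    static final int IPV4_HEADER_BYTES = 20; // IPv4 header without options
    static final int ICMP_HEADER_BYTES = 8;  // ICMP echo header

    // Largest ping payload that fits in one unfragmented packet.
    static int maxIcmpPayload(int mtu) {
        return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES;
    }

    // True if a ping of this payload size should pass with "don't fragment" set.
    static boolean fitsWithoutFragmenting(int payload, int mtu) {
        return payload <= maxIcmpPayload(mtu);
    }

    public static void main(String[] args) {
        System.out.println(maxIcmpPayload(9000));               // jumbo frames
        System.out.println(maxIcmpPayload(1500));               // standard MTU
        System.out.println(fitsWithoutFragmenting(8000, 9000)); // the "Success" test
        System.out.println(fitsWithoutFragmenting(9000, 9000)); // the "Failure" test
    }
}
```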

Thursday, February 23, 2012

NetApp SDK for Java Part 2

Last time I gave a very basic example of the NetApp SDK using Java so I thought I would post something a bit more complex today. Most of the API calls use a String or boolean as the required type but some need nested NaElement objects. This isn't terribly difficult in itself but I find sorting through the documentation when nesting can be a bit tricky. To illustrate this I'm going to use the removal of an NFS export for two reasons:
  1. It has very few nested elements
  2. It has a bug in the documentation
Here is the API entry to remove an NFS export:
nfs-exportfs-delete-rules
Input Name (Type): Description

all-pathnames (boolean, optional)
    Default value is false. Set to true to delete all rules. 'pathnames' option must be left empty if this option is true.

pathnames (pathname-info[], optional)
    In the Data ONTAP 7-Mode, these must be the pathnames to be deleted from the exports table. In Data ONTAP Cluster-Mode, the junction paths of the volumes to be unexported must be provided.

persistent (boolean, optional)
    In Data ONTAP 7-Mode, default value is false. Modify the etc/exports file to delete the rules permanently. CAUTION: If 'all-pathnames' and 'persistent' are both true, all exports are removed permanently. In Data ONTAP Cluster-Mode, the export entries are always persistent. Default value is true. If false, an error will be returned.

verbose (boolean, optional)
    Return a verbose output of what occurred. If there is an error after deleting only a few rules, 'deleted-pathnames' will return which rules were deleted. Default value is false.

The first thing to notice is that the pathname-info[] type looks like it's an array. What the API really means to say is that pathname-info is another node in our XML document, of type NaElement. If we click on that link we get another entry, shown here:
Element definition: pathname-info
Name Range Type Description
pathname string
pathname of the file. Pathname must be of the format /vol//
To complete our request, we are going to need multiple NaElements, one for each node in our XML document. I'm only going to reference the request code here; you can look at the previous post for netappServer details.
NaElement request = new NaElement("nfs-exportfs-delete-rules");
request.addNewChild("persistent", "true");
NaElement pathName = new NaElement("name", "/vol/volume");
NaElement pathNameInfo = new NaElement("pathname-info");
pathNameInfo.addChildElem(pathName);
NaElement pathNames = new NaElement("pathnames");
pathNames.addChildElem(pathNameInfo);
request.addChildElem(pathNames);
try {
   response = netappServer.invokeElem(request);
} catch (Exception e) {
   e.printStackTrace();
}
Walking through the code, the first two lines are pretty simple. The main NaElement named "request" is created with our desired action and I've added an option titled "persistent" to save the changes to the exports file. After that, I've kind of worked backwards. The pathnames input requires another element called pathname-info, and that requires an element called "pathname".

The "pathname" element is created with the line NaElement pathName = new NaElement("name", "/vol/volume");. Notice how it doesn't say "pathname" like the documentation states but rather "name". This is the bug I spent several hours of my life on. I found a reference dating back to 2008 on version 3.0R1 of the API. I'm using SDK 4.1 and it's still wrong. If you run the code with "pathname" as the documentation states you'll get an error that looks like this.
netapp.manage.NaAPIFailedException: Element 'name' is NULL (errno=13114)
However, after having worked that out, we create the required "pathname-info" and add the child element we just created as shown in these two lines.
NaElement pathNameInfo = new NaElement("pathname-info");
pathNameInfo.addChildElem(pathName);
We then create the element we originally wanted, called "pathnames", and add "pathname-info" to it.
NaElement pathNames = new NaElement("pathnames");
pathNames.addChildElem(pathNameInfo);
And finally, we add that whole assembly to the original request.
request.addChildElem(pathNames);
Using a graphical representation, we end up with this.
The SDK developers have done a nice job with NaElement's toString() method, which will dump the XML so you can see what's going on. The above code will produce the following when request.toString() is called.
<nfs-exportfs-delete-rules>
  <persistent>true</persistent>
  <pathnames>
    <pathname-info>
      <name>/vol/volume</name>
    </pathname-info>
  </pathnames>
</nfs-exportfs-delete-rules>
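If you don't have the SDK jar handy and just want to play with the nesting, the same tree can be built with the DOM classes that ship with the JDK. To be clear, this swaps NaElement for org.w3c.dom and is only a sketch of the document shape; the class and method names are my own, and nothing here talks to a filer.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class NestedRequestSketch {
    // Creates a child element under parent, optionally with text content.
    static Element addChild(Document doc, Element parent, String name, String text) {
        Element child = doc.createElement(name);
        if (text != null) {
            child.setTextContent(text);
        }
        parent.appendChild(child);
        return child;
    }

    // Builds request > pathnames > pathname-info > name, plus the persistent flag.
    static Document buildDeleteRequest(String volumePath) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element request = doc.createElement("nfs-exportfs-delete-rules");
        doc.appendChild(request);
        addChild(doc, request, "persistent", "true");
        Element pathNames = addChild(doc, request, "pathnames", null);
        Element pathNameInfo = addChild(doc, pathNames, "pathname-info", null);
        addChild(doc, pathNameInfo, "name", volumePath); // "name", not "pathname"
        return doc;
    }

    public static void main(String[] args) {
        Document doc = buildDeleteRequest("/vol/volume");
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```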

Saturday, February 18, 2012

NetApp SDK for Java Basics

I've been working with the NetApp Manageability SDK for a while now and I think I have the basics down. While there are samples in the SDK, I thought I'd make some notes here in case someone else is looking or I just plain forget. First, there are only a handful of objects to deal with. I actually find this a bit annoying, as I'm constantly looking to the API documentation to find out how to create a meaningful query, which then requires me to type error-prone (for me) strings rather than static variables or methods. For the most part this is pretty easy but it can get rough in a couple of spots.

As an example, let's create a snapshot. The API documentation is located in the download pack under doc/ontap/ontapapi_1.15/7-Mode/snapshot/index.html and looks like this:

snapshot-create
Input Name (Type): Description

async (boolean, optional)
    If true, the snapshot is to be created asynchronously. The default value is false.

is-valid-lun-clone-snapshot (boolean, optional)
    If true, the snapshot create has been requested by snapvault hence all backing snapshots for all the lun clones in this snapshot will be locked. This ensures the consistency of this snapshot. The default value is false.

snapshot (string, nonempty)
    Name of the snapshot to be created. The maximum string length is 256 characters.

volume (string)
    Name of the volume on which the snapshot is to be created. The volume name can contain letters, numbers, and the underscore character (_), but the first character must be a letter or an underscore.

I need two objects to get the job done, an NaServer to talk to the array, and an NaElement to do the work as an XML document. According to the documentation, the NaElement I need to create is called "snapshot-create" and has two mandatory fields, "snapshot" and "volume". Since both of these are strings and the options are boolean, it looks something like this.
String arrayName = "netapp1.domain.com";
String userName = "administrator";
String password = "password";
String volumeName = "volume";
String snapName = "snapname.0";
NaServer netappServer = null;
NaElement request, response;
try {
   netappServer = new NaServer(arrayName);
   netappServer.setTransportType(NaServer.TRANSPORT_TYPE_HTTPS);
   netappServer.setPort(443);
   netappServer.setStyle(NaServer.STYLE_LOGIN_PASSWORD);
   netappServer.setAdminUser(userName, password);
} catch (UnknownHostException e) {
   e.printStackTrace();
}
request = new NaElement("snapshot-create");
request.addNewChild("volume", volumeName);
request.addNewChild("snapshot", snapName);
try {
   response = netappServer.invokeElem(request);
} catch (Exception e) {
   e.printStackTrace();
}
The request portion is the interesting part. It's telling the array I want to create a snapshot for volume /vol/volume called snapname.0. I'm not using any of the options but if you do, remember that everything is added as a string. I really want to add request.addNewChild("async", true); but that won't work. You'll have to use true as a string value and end up with request.addNewChild("async", "true");

There are of course a number of things that can go wrong. Errors caused by your request will generally throw an NaAPIFailedException. You can grab the error number with e.getErrno() and then compare it against the static references in the NaErrno class. Something like this:
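try {
    response = netappServer.invokeElem(request);
} catch (NaAPIFailedException e) {
    // the request reached the array but was rejected
    switch (e.getErrno()) {
        case NaErrno.ESNAPSHOTEXISTS:
            System.out.println("snap already exists");
            break;
        default:
            e.printStackTrace();
    }
} catch (Exception e) {
    // anything else, e.g. connection or protocol problems
    e.printStackTrace();
}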