Migrating VMs from a libvirt setup to a ganeti cluster

Until recently, I worked with a rather basic setup of virtual machine hosts: simply a couple of servers that each ran their own set of VMs. If one of the nodes went down, all of that node’s VMs were affected as well. That needed to change, so I looked around and found ganeti to be something that might help to provide redundancy for the VMs without introducing too much voodoo. Here’s how I migrated the 140 VMs from 17 machines into one cluster.

Initial setup

Start with a debian wheezy default installation. Since I plan to put VM storage into DRBD and ganeti places those DRBD devices on top of LVM, use most of the disk you have as a physical volume for LVM. Then install the ganeti packages

apt-get -t wheezy-backports install qemu-kvm ganeti2 ganeti-instance-debootstrap

from wheezy-backports and

apt-get install drbd8-utils

from standard wheezy repositories.

Configure DRBD module parameters

echo 'drbd minor_count=128 usermode_helper=/bin/true' >> /etc/modules

and load the module with specified parameters

modprobe $(grep drbd /etc/modules)
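
To confirm that the module really picked up these options, the values can be read back from sysfs. This is only a minimal sketch: it assumes the usual /sys/module/drbd/parameters layout, and the function name check_drbd_param and its optional directory argument are my own additions so that the check can be tried against any directory.

```shell
#!/bin/bash
# check_drbd_param NAME EXPECTED [DIR]
# Compares a loaded module parameter with the expected value.
# DIR defaults to the sysfs directory of the drbd module (assumed layout).
check_drbd_param() {
	local name=$1 expected=$2 dir=${3:-/sys/module/drbd/parameters}
	local actual
	if [ ! -r "$dir/$name" ]; then
		echo "$name: not readable in $dir (module not loaded?)"
		return 2
	fi
	actual=$(cat "$dir/$name")
	if [ "$actual" = "$expected" ]; then
		echo "$name=$actual ok"
	else
		echo "$name=$actual, expected $expected"
		return 1
	fi
}
```

On a node this would be called as check_drbd_param minor_count 128 and check_drbd_param usermode_helper /bin/true.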

Initialize the cluster by running

gnt-cluster init --vg-name vg01 --master-netdev br0 \
--enabled-hypervisors kvm --nic-parameters mode=bridged,link=br0 \
cluster.mydomain.com

on one node. Then add the other cluster nodes by running

gnt-node add node2.mydomain.com

Ganeti has its own magic to run, install and provision new VMs. I do not wish to use that; I want to migrate my old setups as they are. So I tell ganeti not to use a kernel from the physical hosts:

gnt-cluster modify --hypervisor-parameters kvm:kernel_path=

To be able to view a VM’s console, I like using spice, so enable that

gnt-cluster modify --hypervisor-parameters kvm:spice_bind=127.0.0.1

Now the cluster is set up. Let’s start migrating machines.

Migration process

Simply put, the migration process consists of these 5 steps:

  1. Shut down the VM
  2. Copy VM disk image to the master cluster node
  3. Create an empty VM with DRBD disks
  4. Write virtual disk image to DRBD device
  5. Start the VM
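
How steps one and two look depends on your old hosts. As a rough sketch in the style of the helpers further down, the following prints the commands rather than running them. It assumes the old host is managed with virsh and that the image is copied with scp; the function name, the paths and the target directory are placeholders of mine, not part of the original setup.

```shell
#!/bin/bash
# print_export_commands VMNAME IMAGE MASTER TARGETDIR
# Prints (does not run) commands for shutting down a libvirt VM and
# copying its XML definition and disk image to the cluster master.
print_export_commands() {
	local vm=$1 image=$2 master=$3 target=$4
	echo "virsh shutdown $vm"
	# the guest needs time to power off, so wait for the final state
	echo "until virsh domstate $vm | grep -q 'shut off'; do sleep 5; done"
	echo "virsh dumpxml $vm > /tmp/$vm.xml"
	echo "scp /tmp/$vm.xml $image $master:$target/"
}

print_export_commands myvm /var/lib/libvirt/images/myvm.img node1.mydomain.com /mnt
```

Printing first and running by hand keeps a typo from shutting down the wrong machine.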

I won’t go into steps one and two in detail, since they depend on your old setup. Steps three to five are described below:

Create an empty VM with DRBD disks

Since most VM parameters are available in libvirt’s XML configuration file, I wrote a parser that will prepare a command for the creation:

#!/bin/bash

# createnewvm.sh
# prepares gnt-instance command for creating a new vm
# either reads parameters from stdin or from an xml file
# usage: 
# $0 				# read all vm parameters from stdin
# $0 debug			# read all vm parameters from stdin and print debug output
# $0 file $filename		# read most vm parameters from $filename and some from stdin
# $0 debug file $filename	# read most vm parameters from $filename and some from stdin and print debug output

debug=false
if [ "$1" = "debug" ]; then
	debug=true
	shift
fi

if ! which xmllint &>/dev/null; then
	apt-get install libxml2-utils
fi

getuserinput() {
	read $1
	if [ -z "$(eval echo \${$1})" ]; then 
		echo "Value must not be empty. Exit."
		exit
	fi
	case "$1" in
		vmname)
			if ! eval echo \${$1} | grep -qE "^[a-zA-Z0-9_\.-]+$"; then
				echo $1 failed validation. exit.
				exit
			fi
		;;
		vmmemory|vmdisksize)
			if ! eval echo \${$1} | grep -qE "^[0-9]+[kmg]$"; then
				echo $1 failed validation. exit.
				exit
			fi
		;;
		vmcpus|vmnumnics|vmvlan|vmnumdisks)
			if ! eval echo \${$1} | grep -qE "^[0-9]+$"; then
				echo $1 failed validation. exit.
				exit
			fi
		;;
		vmnicmodel)
			if ! eval echo \${$1} | grep -qE "^(paravirtual|e1000|rtl8139|ne2k_isa|ne2k_pci|i82551|i82557b|i82559er|pcnet)$"; then
				echo $1 failed validation. exit.
				exit
			fi
		;;
		vmmac)
			if ! eval echo \${$1} | grep -qE "^[0-9a-z:]+$"; then
				echo $1 failed validation. exit.
				exit
			fi
		;;
	esac
}

if [ "$1" = "file" ]; then # read most parameters from file
	if [ ! -r "$2" ]; then
		echo "file \"$2\" unreadable. exit."
		exit
	fi
	xmlfile=$2
	# some libvirt VMs have <vcpu current='8'>12</vcpu>, then we want "current"
	vmcpus=$(xmllint --noout --xpath 'string(//vcpu/@current)' $xmlfile)
	if [ -z "$vmcpus" ]; then
		# some have just <vcpu>12</vcpu>, then we want that
		vmcpus=$(xmllint --noout --xpath '//vcpu/text()' $xmlfile)
	fi
	# some libvirt VMs have both  <currentMemory> and <memory> tags, then we want currentMemory
	vmmemory=$(xmllint --noout --xpath '//currentMemory/text()' $xmlfile)
	if [ -z "$vmmemory" ]; then
		# some just have <memory>, then we want that
		vmmemory=$(xmllint --noout --xpath '//memory/text()' $xmlfile)
	fi
	# calc memory size to megabyte only since < 1g libvirt VMs would calc to "0g"
	vmmemory="$((vmmemory/1024))m"
	vmnumnics=$(xmllint --noout --xpath 'count(//interface)' $xmlfile)
	vmnicmodel=$(xmllint --noout --xpath 'string(//interface/model/@type)' $xmlfile)
	if [ "$vmnicmodel" = "virtio" -o -z "$vmnicmodel" ]; then
		vmnicmodel="paravirtual"
	fi
	for num in $(eval echo {1..$vmnumnics}); do
		vmvlan=$(xmllint --noout --xpath "string(//interface[$num]/source/@bridge)" $xmlfile | grep -oE "[0-9]+")
		vmmac=$(xmllint --noout --xpath "string(//interface[$num]/mac/@address)" $xmlfile)
		i=$((num-1))
		netstring="$netstring --net $i:mac=$vmmac,link=br$vmvlan"
	done
	vmnumdisks=$(xmllint --noout --xpath "count(//disk[@device='disk'])" $xmlfile)
else # read from stdin
	echo "Number of CPUs? (eg 2)"
	getuserinput vmcpus
	
	echo "Memory of the VM? (eg 1024m,1g)"
	getuserinput vmmemory
	
	echo "Number of NICs? (eg 1)"
	getuserinput vmnumnics
	
	echo "NIC model for all NICs? (paravirtual e1000 rtl8139 ne2k_isa ne2k_pci i82551 i82557b i82559er pcnet)"
	getuserinput vmnicmodel
	
	for num in $(eval echo {1..$vmnumnics}); do
		echo "VLAN of nic $num? (eg 100,290)"
		getuserinput vmvlan
		echo "Mac of nic $num? (eg 00:16:36:1c:8a:7a)"
		getuserinput vmmac
		i=$((num-1))
		netstring="$netstring --net $i:mac=$vmmac,link=br$vmvlan"
	done
	
	echo "Number of disks? (eg 1)"
	getuserinput vmnumdisks
fi

# we need to ask for the name because we mostly used short names in libvirt
echo "Name of the VM? (any string)"
getuserinput vmname

# we need to ask for disk size as this is not in the xml
diskstring="-t drbd"
for num in $(eval echo {1..$vmnumdisks}); do
	i=$((num-1))
	echo "Size of disk $num? (eg 4096m,4g)"
	getuserinput vmdisksize
	diskstring="$diskstring --disk $i:size=$vmdisksize"
done

# and we need to ask for the primary and secondary node
nodelist=$(gnt-node list -o name --no-headers | tr \\n ' ' | sed 's/ $//')
echo "Primary node for $vmname? ($nodelist)"
getuserinput vmprimarynode
echo "Secondary node for $vmname? ($nodelist)"
getuserinput vmsecondarynode

$debug && echo vmname $vmname
$debug && echo vmvlan $vmvlan
$debug && echo vmnumdisks $vmnumdisks
$debug && echo vmprimarynode $vmprimarynode
$debug && echo vmsecondarynode $vmsecondarynode
$debug && echo diskstring $diskstring
$debug && echo netstring $netstring

echo "To create the VM, run:"
echo "gnt-instance add -B vcpus=$vmcpus -H kvm:root_path=/dev/vda2,kernel_path=,machine_version=pc,nic_type=$vmnicmodel -B memory=$vmmemory $diskstring -n $vmprimarynode:$vmsecondarynode -o debootstrap+default $netstring --no-ip-check --no-name-check --no-install --no-wait-for-sync $vmname"
echo

So by running

createnewvm.sh file myvm.xml

you can create a command like

gnt-instance add -B vcpus=6 -H kvm:root_path=/dev/vda2,kernel_path=\
,machine_version=pc,nic_type=e1000 -B memory=4096m -t drbd --disk \
0:size=60g -n node1.mydomain.com:node2.mydomain.com -o \
debootstrap+default --net 0:mac=00:34:19:11:22:33,link=br0 \
--no-ip-check --no-name-check --no-install --no-wait-for-sync myvmname

The script will ask for the name of the VM, disk sizes and primary/secondary node names.
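
Since the disk size is not in the XML, it has to come from the image itself. The following sketch derives a size argument from the JSON output of qemu-img info, rounding up to whole mebibytes so that the new disk is never smaller than the image. It assumes a qemu-img recent enough to support --output=json; the function name virtual_size_mib is made up for this example.

```shell
#!/bin/bash
# virtual_size_mib: reads `qemu-img info --output=json` text on stdin and
# prints the virtual disk size in the form the script expects (e.g. 4096m).
virtual_size_mib() {
	local bytes
	bytes=$(sed -n 's/.*"virtual-size": *\([0-9][0-9]*\).*/\1/p' | head -n 1)
	if [ -z "$bytes" ]; then
		echo "no virtual-size found" >&2
		return 1
	fi
	# round up to the next full MiB
	echo "$(( (bytes + 1048575) / 1048576 ))m"
}
```

It would be fed like this: qemu-img info --output=json /path/to/image.img | virtual_size_mib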

Running the generated gnt-instance add command creates a new VM without writing anything to the newly created disks.
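
A side note on the script above: getuserinput uses eval to look at the variable it has just read. The same checks can also be written as a plain function that gets the value passed in. This is a sketch of that idea using the same patterns as the script; valid_value is a hypothetical name and it is not a drop-in replacement for the whole function.

```shell
#!/bin/bash
# valid_value NAME VALUE
# Returns 0 if VALUE is acceptable for the parameter NAME, 1 otherwise.
valid_value() {
	local name=$1 value=$2 pattern
	case "$name" in
		vmname)                             pattern='^[a-zA-Z0-9_.-]+$' ;;
		vmmemory|vmdisksize)                pattern='^[0-9]+[kmg]$' ;;
		vmcpus|vmnumnics|vmvlan|vmnumdisks) pattern='^[0-9]+$' ;;
		vmnicmodel) pattern='^(paravirtual|e1000|rtl8139|ne2k_isa|ne2k_pci|i82551|i82557b|i82559er|pcnet)$' ;;
		vmmac)                              pattern='^[0-9a-z:]+$' ;;
		*)                                  pattern='.+' ;;  # anything non-empty
	esac
	printf '%s\n' "$value" | grep -qE -- "$pattern"
}
```

After a read -r vmmemory, the check then becomes valid_value vmmemory "$vmmemory" followed by the usual error message and exit if it fails.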

Write virtual disk image to DRBD device

The next step is to copy the data from the original disk image to the new DRBD device. Once you have more than a handful of VMs, finding the correct DRBD device to write the image to gets rather complex (and you would not want to make a copy/paste error here!), so I also wrote a helper for that.

#!/bin/bash

image=$1
vmname=$2
disknum=$3

usage() {
	echo $0 /path/to/image.img vmname disknumber
	echo example:
	echo $0 /mnt/kwakman.img kwakman 1
	exit
}

if [ -z "$image" -o -z "$vmname" -o -z "$disknum" ]; then
	usage
fi

if [ ! -r "$image" ]; then
	echo image file $image not readable
	exit 
fi

if ! gnt-instance show $vmname &>/dev/null; then
	echo looks like $vmname does not exist
	exit
fi

drbddevice=$(gnt-instance show $vmname | grep -E "primary: \/dev\/drbd" | tail -n +$((disknum)) | head -n 1 | awk '{ print $3; }')
if [ -z "$drbddevice" ]; then
	echo failed to identify drbd device for disk \#$disknum of vm $vmname
	exit
fi

echo qemu-img convert -p $image -O raw $drbddevice

Running

ddimagetoganetivm.sh /path/to/image.img vmname 1

will output the command to run in order to write the data into the DRBD device:

qemu-img convert -p /path/to/image.img -O raw /dev/drbd27
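
After the copy has finished, it can be reassuring to compare the source with the device before starting the VM. The sketch below assumes a raw source image (for qcow2 and other formats the file contents differ from what ends up on the device, so this check does not apply) and GNU stat and cmp. verify_raw_copy is a name I made up.

```shell
#!/bin/bash
# verify_raw_copy IMAGE DEVICE
# Compares the first N bytes of DEVICE with IMAGE, N being the image size.
# The device may be larger than the image, so only that prefix is checked.
verify_raw_copy() {
	local image=$1 device=$2 size
	size=$(stat -c %s "$image") || return 2
	if cmp -s -n "$size" "$image" "$device"; then
		echo "ok: first $size bytes of $device match $image"
	else
		echo "MISMATCH between $image and $device"
		return 1
	fi
}
```

Example call: verify_raw_copy /path/to/image.img /dev/drbd27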

Start the VM

That’s only a matter of

gnt-instance start vmname

Et voilà, you just converted a VM from a standalone libvirt host using a disk image to a VM running on redundant storage in a cluster.

Side notes

Ganeti command performance

I found that, as I added more VMs to the cluster, the gnt-* commands took more and more time to execute. The cause was the “lvs” command, which ganeti runs rather often and which kept getting slower. By default, it scans all block devices in /dev for LVM setups, and since every VM adds at least 2 block devices (logical volumes), each scan takes longer. The solution is to place an appropriate filter line in lvm.conf. My physical volume is always on /dev/sda3, so my filter line looks like this

filter = [ "a|^/dev/sda3$|", "r/.*/" ]

but you will likely have to adjust that to your setup.
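
If your physical volumes differ between nodes, a small helper can build the line for you. This is just a sketch: lvm_filter_line is a hypothetical name, and it simply accepts exactly the devices given as arguments and rejects everything else.

```shell
#!/bin/bash
# lvm_filter_line DEVICE...
# Prints an lvm.conf filter line that accepts exactly the given devices
# and rejects everything else.
lvm_filter_line() {
	local entries="" dev
	for dev in "$@"; do
		entries="$entries\"a|^$dev\$|\", "
	done
	echo "filter = [ $entries\"r|.*|\" ]"
}

lvm_filter_line /dev/sda3 /dev/sdb1
```

The device list could come from pvs --noheadings -o pv_name on the node in question. Check the printed line before putting it into lvm.conf, since a wrong filter hides your volume group from LVM.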

Running debian lenny VMs

I need to run some debian lenny VMs in this cluster. Yeah, I know … If you also need to do that, make sure you install kernel and acpid from backports. Then it works just fine. For now. Sigh.
