Migrating VMs between DRBD backed clusters

DRBD is a kernel module that lets you mirror block devices over the network. Every bit you write is mirrored to a second node, and the write only returns to your writing process once the second node has finished writing¹. So at every point in time, you have an exact copy of whatever it is you’re writing to that block device.

One of many common use cases is to export DRBD devices to virtual machines and have them use those as hard drives. Because the device is mirrored as described above, a VM that uses a DRBD device as its hard disk can effectively run on either of your two machines. So if you have to shut down the currently active machine for maintenance, you can move the VM to the other node, and the service the VM provides does not have to go down with its hardware².

Several years ago, I built such a platform to run a bunch of VMs (about 15) and thereby reduce 15 physical machines to 2. Time has gone by, and that VM cluster hardware is now to be replaced by more powerful hardware in order to run more virtual machines.

The first task is to migrate the currently running VMs to the new hardware, and here’s how I did that using DRBD. Let’s first paint a picture of what I’m talking about.

Right now, all VMs run on Node1 and Node2, and the DRBD replication takes place over their back-to-back connection on network 192.168.0.0/30. The goal is to move all VMs to Node3 and Node4 and replicate data over their back-to-back connection on network 192.168.1.0/30. During the move, replication between the old and the new cluster runs over the common network 10.0.0.0/8.

Steps to move one VM from Node1 to Node3:

  1. Node1:
    1. Disconnect the DRBD device:
      drbdadm disconnect foo
    2. Re-configure drbd.conf to replicate to Node3 instead of Node2 using the common network 10.0.0.0/8:
      resource foo {
              protocol C;
              device          /dev/drbdXX;
              disk            /dev/vg1/foo;
              meta-disk       internal;
              on Node1 {
      #                address 192.168.0.1:7788;
                      address 10.0.0.1:7788;
              }
      #        on Node2 {
              on Node3 {
      #                address 192.168.0.2:7788;
                      address 10.0.0.3:7788;
              }
      }
      
    3. Load this config:
      drbdadm adjust foo
    4. Connect this config:
      drbdadm connect foo
  2. Node3:
    1. Create backing device with the same specs as on Node1
    2. Create drbd.conf that uses this backing device and replicates from Node1 using the common network 10.0.0.0/8:
      resource foo {
              protocol C;
              device          /dev/drbdXX;
              disk            /dev/vg1/foo;
              meta-disk       internal;
              on Node1 {
                      address 10.0.0.1:7788;
              }
              on Node3 {
                      address 10.0.0.3:7788;
              }
      }
      
    3. Create metadata on this new DRBD device:
      drbdadm create-md foo
    4. Bring this device up:
      drbdadm up foo
  3. Watch the device sync:
    drbdadm status
  4. Node1: After the initial sync, shut down the VM and put the device into secondary mode:
    drbdadm secondary foo
  5. Node3: Put device into primary mode:
    drbdadm primary foo
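
The "same specs" in step 2.1 mostly means "same size". Here is a minimal sketch of how to match it, assuming the backing devices are LVM logical volumes (the /dev/vg1/foo path suggests so). The function names are made up for illustration, and the lvcreate command is only printed so you can review it before running it:

```shell
#!/bin/sh
# Hypothetical helper: print the size of a logical volume in bytes,
# e.g. "10737418240". Run this on the node that already has the volume.
lv_size_bytes() {
    lvs --noheadings --nosuffix --units b -o lv_size "$1" | tr -d ' '
}

# Hypothetical helper: print (but do not run) the lvcreate command that
# creates a volume of exactly that size on the new node.
# $1 = size in bytes, $2 = LV name, $3 = volume group
lvcreate_cmd() {
    printf 'lvcreate -L %sb -n %s %s\n' "$1" "$2" "$3"
}

# Usage:
#   on Node1:  lv_size_bytes vg1/foo
#   on Node3:  lvcreate_cmd <bytes printed on Node1> foo vg1
```

Working in bytes avoids rounding surprises that could leave the new device slightly smaller than the old one.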

Once that’s done, copy your VM’s configuration file to Node3, make adjustments as needed (maybe network bridge names changed) and try to boot up the VM. If you’re using the same hypervisor and have a sane configuration, this should just work. The VM’s data is identical to what it was on Node1 before.

So now we need to re-configure Node3 to replicate data to Node4.

  1. Node3
    1. Disconnect the DRBD device:
      drbdadm disconnect foo
    2. Re-configure drbd.conf to replicate to Node4 instead of Node1 using the back to back network 192.168.1.0/30:
      resource foo {
              protocol C;
              device          /dev/drbdXX;
              disk            /dev/vg1/foo;
              meta-disk       internal;
              on Node3 {
                      address 192.168.1.1:7788;
              }
              on Node4 {
                      address 192.168.1.2:7788;
              }
      }
      
    3. Load the new config:
      drbdadm adjust foo
    4. Connect the new config:
      drbdadm connect foo
  2. Node4:
    1. Create backing device with the same specs as on Node3
    2. Create drbd.conf that uses this backing device and replicates from Node3 using the back to back network 192.168.1.0/30:
      resource foo {
              protocol C;
              device          /dev/drbdXX;
              disk            /dev/vg1/foo;
              meta-disk       internal;
              on Node3 {
                      address 192.168.1.1:7788;
              }
              on Node4 {
                      address 192.168.1.2:7788;
              }
      }
      
    3. Create metadata on this new DRBD device:
      drbdadm create-md foo
    4. Bring this device up:
      drbdadm up foo
  3. Watch the device sync:
    drbdadm status
  4. Node3: After the initial sync, shut down the VM and put the device into secondary mode:
    drbdadm secondary foo
  5. Node4: Put the device into primary mode:
    drbdadm primary foo
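
In both walkthroughs, steps 3 to 5 depend on the initial sync being finished before the roles are switched. Here is a minimal sketch of that check, assuming `drbdadm dstate` prints the local and peer disk states as `UpToDate/UpToDate` once both sides are in sync. That is the format I would expect from DRBD 8.x; check the output on your version first. The function names are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if a "local/peer" disk-state string
# reports both sides as UpToDate.
is_synced() {
    [ "$1" = "UpToDate/UpToDate" ]
}

# Hypothetical helper: poll every 10 seconds until the resource is synced.
# Assumes `drbdadm dstate <resource>` prints e.g. "UpToDate/Inconsistent"
# while the sync is still running. Loops forever if that never changes.
wait_for_sync() {
    resource=$1
    until is_synced "$(drbdadm dstate "$resource")"; do
        sleep 10
    done
}

# Usage on the current primary, after the VM has been shut down:
#   wait_for_sync foo && drbdadm secondary foo
```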

Now copy the VM configuration from Node3 and try to start up the VM on Node4. This, too, should just work.
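
The resource definitions used above all have the same shape and differ only in host names and addresses. With a bunch of VMs to move, it may be less error-prone to generate them. A minimal sketch, with a hypothetical function name and argument order, that prints a stanza in the layout shown above:

```shell
#!/bin/sh
# Hypothetical helper: print a two-node DRBD resource stanza.
# $1 = resource name, $2 = DRBD device, $3 = backing disk,
# $4/$5 = first host and its address:port,
# $6/$7 = second host and its address:port
render_resource() {
    cat <<EOF
resource $1 {
        protocol C;
        device          $2;
        disk            $3;
        meta-disk       internal;
        on $4 {
                address $5;
        }
        on $6 {
                address $7;
        }
}
EOF
}

# Usage, for the Node1 -> Node3 step over the common network:
#   render_resource foo /dev/drbdXX /dev/vg1/foo \
#       Node1 10.0.0.1:7788 Node3 10.0.0.3:7788
```

The output goes to stdout, so you can read it before pasting it into drbd.conf on both nodes.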

I thought this was an impressively easy way to migrate things to a new cluster and once again, DRBD “just worked” for me.

Cheers

¹) That’s only one way of using DRBD; have a look at their page if you don’t know DRBD yet.
²) Without any further setup, you’d technically have to shut down the VM on the active node and boot it up on the second node, which gives you the downtime of a reboot, but this can be optimized.
