Creating & Managing EBS Volumes

I started making some notes on how EBS volume creation can be tracked and troubleshot in Eucalyptus (Eucalyptus 3.X + btw).  I figured this would make a good blog post, so I’ve just dumped all the information into here and I’ll let Google do the rest 😉  I’ll update it as I go along and provide more hints in the troubleshooting section, this will also make it into the Eucalyptus Knowledge Base, albeit in different form and smaller chunks.  I hope this is useful to some.

Creating & Managing Elastic Block Storage (EBS) Volumes

This guide shows cloud administrators how the volume creation process works, how volumes can be managed, and how volumes can be troubleshot.
The Volume Creation Process
Volume creation is fairly straightforward, but there are a number of steps in the process between issuing a euca-create-volume command and the configuration of the iSCSI target.
To create a volume, one would issue a euca-create-volume command:

# euca-create-volume -z eucalyptus -s 5

This tells Eucalyptus to create a volume in the specified availability zone (use euca-describe-availability-zones to list zones) with a size of 5 GB.
At this point, Eucalyptus will communicate with the storage controller (SC) component.  The SC process creates a sparse image file in /var/lib/eucalyptus/volumes:

# ls -lsrth /var/lib/eucalyptus/volumes/
total 1.1M
1.1M -rw-r--r-- 1 root root 5.1G Jan 23 08:57 vol-391D44BE
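
The file is sparse: it reports a size of roughly 5 GB but only occupies about 1 MB on disk (the "total 1.1M" above). A quick, purely illustrative way to compare the apparent size against the blocks actually allocated:

# du -h --apparent-size /var/lib/eucalyptus/volumes/vol-391D44BE
# du -h /var/lib/eucalyptus/volumes/vol-391D44BE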

Next, the SC will create a loop device and loopback mount this image file.  Use the following command to view the loop device configuration:

# losetup -a
/dev/loop0: [fd00]:394108 (//var/lib/eucalyptus/volumes/vol-391D44BE)

Then, this device is handed over to the Logical Volume Manager (LVM): it is added as a physical volume, a volume group is created on it, and a logical volume is carved from that. Once again, this can be seen using the LVM commands:

# pvdisplay
--- Physical volume ---
PV Name               /dev/loop0
VG Name               vg-1lS3pg..
PV Size               5.00 GB / not usable 4.00 MB
Allocatable           yes (but full)
PE Size (KByte)       4096
Free PE               0
Allocated PE          1280
PV UUID               u8vTQh-pEhU-9ID2-ewEO-kOIh-iwQx-Kb7BrZ
# vgdisplay
--- Volume group ---
VG Name               vg-1lS3pg..
System ID
Format                lvm2
Metadata Areas        1
Metadata Sequence No  2
VG Access             read/write
VG Status             resizable
MAX LV                0
Cur LV                1
Open LV               1
Max PV                0
Cur PV                1
Act PV                1
VG Size               5.00 GB
PE Size               4.00 MB
Total PE              1280
Alloc PE / Size       1280 / 5.00 GB
Free  PE / Size       0 / 0
VG UUID               eMYWz3-DkhN-IM9I-8T6j-cbnI-vAJe-H9OJgh

and finally:

# lvdisplay
--- Logical volume ---
LV Name                /dev/vg-1lS3pg../lv-28NzTQ..
VG Name                vg-1lS3pg..
LV UUID                Goyv51-3WRJ-GOiq-yrjU-osw8-efqF-UGQt60
LV Write Access        read/write
LV Status              available
# open                 1
LV Size                5.00 GB
Current LE             1280
Segments               1
Allocation             inherit
Read ahead sectors     auto
- currently set to     256
Block device           253:3
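
For reference, the chain the SC has just built could be reproduced by hand with commands along these lines (illustrative only; the volume group and logical volume names below are made up, Eucalyptus generates its own):

# losetup -f --show /var/lib/eucalyptus/volumes/vol-391D44BE   # prints the loop device used, e.g. /dev/loop0
# pvcreate /dev/loop0                                          # mark the loop device as an LVM physical volume
# vgcreate vg-example /dev/loop0                               # create a volume group on it
# lvcreate -l 100%FREE -n lv-example vg-example                # carve a logical volume from the whole group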

With the logical volume created on the loopback-mounted image, it then needs to be made available to the node controller that is running the instance to which we want to attach the EBS volume.  In Eucalyptus 2 and 3, the default storage networking method is iSCSI.  To provide iSCSI storage networking, Eucalyptus uses the Linux SCSI target framework (tgt).  Available in the base repositories of all major distributions, it is installed as a dependency of the Eucalyptus SC.
Eucalyptus configures the SCSI target daemon to run on the SC, creating a target for each EBS volume.
Using the tgt-admin utility it’s possible to view the currently configured iSCSI targets:

# tgt-admin -s
Target 4: iqn.2009-06.com.eucalyptus.eucalyptus:store4
System information:
Driver: iscsi
State: ready
I_T nexus information:
LUN information:
LUN: 0
Type: controller
SCSI ID: IET     00040000
SCSI SN: beaf40
Size: 0 MB, Block size: 1
Online: Yes
Removable media: No
Readonly: No
Backing store type: null
Backing store path: None
Backing store flags:
LUN: 1
Type: disk
SCSI ID: IET     00040001
SCSI SN: beaf41
Size: 5369 MB, Block size: 512
Online: Yes
Removable media: No
Readonly: No
Backing store type: rdwr
Backing store path: /dev/vg-1lS3pg../lv-28NzTQ..
Backing store flags:
Account information:
eucalyptus
ACL information:
ALL

Note the target name, which is a useful identifier.  Also note the backing store path, which happens to be the logical volume the SC has configured.  Visible for each target is the account information; Eucalyptus uses a non-root account for security reasons.
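
For comparison, a target like this could be built by hand with tgtadm, roughly as follows (a sketch only; the target ID and ACL here are illustrative, and Eucalyptus drives tgtd itself):

# tgtadm --lld iscsi --op new --mode target --tid 4 -T iqn.2009-06.com.eucalyptus.eucalyptus:store4
# tgtadm --lld iscsi --op new --mode logicalunit --tid 4 --lun 1 -b /dev/vg-1lS3pg../lv-28NzTQ..
# tgtadm --lld iscsi --op bind --mode target --tid 4 -I ALL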

Attaching Volumes to Instances

At this point, the user may attach the volume to a running instance.  Use the following command to view your available volumes:

# euca-describe-volumes

Then, to attach the desired volume:

# euca-attach-volume -i i-133F3E53 -d /dev/sdb1 vol-391D44BE

This will attach the designated volume (vol-391D44BE) to the instance with ID i-133F3E53, as the device /dev/sdb1.
To see what happened at this point, switch over to the Node Controller (NC) hosting the instance to which the EBS volume was attached.
Use the open-iscsi administration utility (iscsiadm) to query the target daemon on the SC and view the published targets:

# iscsiadm -m discovery -t sendtargets -p 172.22.0.13
172.22.0.13:3260,1 iqn.2009-06.com.eucalyptus.eucalyptus:store4

Note that the target is now visible to the NC.
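
For reference, logging in to that target manually from the NC would look something like the following (shown only to illustrate the mechanism; Eucalyptus performs the login itself):

# iscsiadm -m node -T iqn.2009-06.com.eucalyptus.eucalyptus:store4 -p 172.22.0.13 --login
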
Eucalyptus instructs the NC to connect to the target.  This is visible in /var/log/messages on the NC:

Jan 23 11:03:31 Pod-04 kernel: scsi3 : iSCSI Initiator over TCP/IP
Jan 23 11:03:32 Pod-04 kernel:   Vendor: IET       Model: Controller        Rev: 0001
Jan 23 11:03:32 Pod-04 kernel:   Type:   RAID                               ANSI SCSI revision: 05
Jan 23 11:03:32 Pod-04 kernel: scsi 3:0:0:0: Attached scsi generic sg5 type 12
Jan 23 11:03:32 Pod-04 kernel:   Vendor: IET       Model: VIRTUAL-DISK      Rev: 0001
Jan 23 11:03:32 Pod-04 kernel:   Type:   Direct-Access                      ANSI SCSI revision: 05
Jan 23 11:03:32 Pod-04 kernel: SCSI device sdd: 10485760 512-byte hdwr sectors (5369 MB)
Jan 23 11:03:32 Pod-04 kernel: sdd: Write Protect is off
Jan 23 11:03:32 Pod-04 kernel: SCSI device sdd: drive cache: write back
Jan 23 11:03:32 Pod-04 kernel: SCSI device sdd: 10485760 512-byte hdwr sectors (5369 MB)
Jan 23 11:03:32 Pod-04 kernel: sdd: Write Protect is off
Jan 23 11:03:32 Pod-04 kernel: SCSI device sdd: drive cache: write back
Jan 23 11:03:32 Pod-04 kernel:  sdd: unknown partition table
Jan 23 11:03:32 Pod-04 kernel: sd 3:0:0:1: Attached scsi disk sdd
Jan 23 11:03:32 Pod-04 kernel: sd 3:0:0:1: Attached scsi generic sg6 type 0
Jan 23 11:03:33 Pod-04 kernel: peth0: received packet with  own address as source address
Jan 23 11:03:33 Pod-04 kernel: peth0: received packet with  own address as source address
Jan 23 11:03:33 Pod-04 iscsid: Connection2:0 to [target: iqn.2009-06.com.eucalyptus.eucalyptus:store4, portal: 172.22.0.13,3260] through [iface: default] is operational now

The resultant device is available as /dev/sdd on the NC, visible with “fdisk -l”:

Disk /dev/sdd: 5368 MB, 5368709120 bytes
166 heads, 62 sectors/track, 1018 cylinders
Units = cylinders of 10292 * 512 = 5269504 bytes
Disk /dev/sdd doesn’t contain a valid partition table

Then, Eucalyptus generates an XML file for the volume in the working directory of the instance it will attach to.  Below is the instance's working directory, named after the instance ID:

[root@Pod-04 i-133F3E53]# pwd
/var/lib/eucalyptus/instances/work/CEK7XDHLEBSVR1SATAZMT/i-133F3E53

Below is the XML file for the volume:

[root@Pod-04 i-133F3E53]# ll vol*
-rw-rw---- 1 eucalyptus eucalyptus 535 Jan 23 11:03 vol-391D44BE.xml

This XML generated by Eucalyptus looks like:

<?xml version="1.0" encoding="UTF-8"?>
<volume>
<hypervisor type="xen" capability="xen+hw" bitness="64"/>
<id>vol-391D44BE</id>
<user>CEK7XDHLEBSVR1SATAZMT</user>
<instancePath>/var/lib/eucalyptus/instances/work/CEK7XDHLEBSVR1SATAZMT/i-133F3E53</instancePath>
<os platform="linux" virtioRoot="false" virtioDisk="false" virtioNetwork="false"/>
<backing>
<root type="image"/>
</backing>
<diskPath targetDeviceType="disk" targetDeviceName="sdb2" targetDeviceBus="scsi" sourceType="block">/dev/sdd</diskPath>
</volume>

See the volume ID and target device name as specified with the euca-attach-volume command.
Eucalyptus then adds this device behind the scenes using virsh.  A manual method would be something like:

virsh attach-device <domain> <file>

Running the following command on the node controller will dump the current XML definition for the virtual machine:

virsh dumpxml <domain_id>

Use virsh list to obtain the domain ID for the instance.  The block device should be present in the <devices> section:

<disk type='block' device='disk'>
<driver name='phy'/>
<source dev='/dev/sdd'/>
<target dev='sdb2' bus='scsi'/>
</disk>
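
Putting this together, an equivalent manual attach would be a small libvirt device definition passed to virsh (illustrative only; the file name is hypothetical, the domain name is whatever virsh list reports, and Eucalyptus drives libvirt itself):

# cat > /tmp/ebs-disk.xml <<EOF
<disk type='block' device='disk'>
  <driver name='phy'/>
  <source dev='/dev/sdd'/>
  <target dev='sdb2' bus='scsi'/>
</disk>
EOF
# virsh attach-device i-133F3E53 /tmp/ebs-disk.xml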

Thus, looking on the instance, the new volume is available as /dev/sdb2:

Disk /dev/sdb2: 5368 MB, 5368709120 bytes
255 heads, 63 sectors/track, 652 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000
Disk /dev/sdb2 doesn’t contain a valid partition table

At this point, the volume can be formatted and used by the instance:

# mkfs.ext3 /dev/sdb2
# mkdir /media/data
# mount /dev/sdb2 /media/data

Detaching Volumes

Detaching a volume follows much the same process but in reverse order.
WARNING:  You must unmount the device within the instance before detaching the volume; otherwise you risk data loss and may be unable to properly detach the volume while the instance is running.
A volume can be detached with the following command:

euca-detach-volume -i i-133F3E53 vol-391D44BE

This detaches the previously created volume from the running instance, i-133F3E53.
Much the same as adding a block device to a running guest, this calls virsh commands, as below:

virsh detach-device <domain> <file>

Here, domain is the ID of the running domain (which can be found using 'virsh list') and file is the XML device definition file that Eucalyptus created at attach time; in this continued example it was vol-391D44BE.xml.
This device definition file is then removed from the working directory of the instance on the node controller.
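
A complete detach sequence therefore looks something like this (using the same mount point as earlier; the volume should eventually return to the "available" state):

# umount /media/data                              # run inside the instance first
# euca-detach-volume -i i-133F3E53 vol-391D44BE   # then detach from the client
# euca-describe-volumes vol-391D44BE              # confirm the volume is no longer attached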

Snapshotting Volumes

An EBS volume snapshot acts as a point-in-time backup of your persistent storage into Walrus (S3); note that it is incremental in nature, only backing up changed data.  It can also be a method for duplicating EBS volumes: it is possible to snapshot a volume and create a new volume from the snapshot.
WARNING: As when detaching a volume, it is recommended that you unmount the volume within the instance before taking a snapshot.
To create a snapshot, use the euca-create-snapshot command:

euca-create-snapshot <volume_id>

Provide the volume ID, as shown by euca-describe-volumes, as the argument.  Next, view the status of the snapshot process with:

[root@Pod-03 lester]# euca-describe-snapshots
SNAPSHOT        snap-C2DF3CF2   vol-D02E43C2    pending 2012-01-30T15:44:04.853Z        0%      None

In this example, the snapshot is of a 15G EBS volume.
Eucalyptus EBS snapshots use Logical Volume Manager (LVM) copy-on-write (COW) snapshots to create the snapshot disk image.  First, the storage controller creates an additional image file large enough to extend the volume group containing the logical volume on which the EBS volume resides.  This additional image is created in the volumes directory on the storage controller:

[root@Pod-03 lester]# ll /var/lib/eucalyptus/volumes/
total 17015932
-rw-r--r-- 1 root root  8589934592 Jan 30 07:32 snap-26EE3E3A
-rw-r--r-- 1 root root  3224145920 Jan 30 07:44 snap-C2DF3CF2
-rw-r--r-- 1 root root  5368709120 Jan 30 06:49 snap-D7E53AC0
-rw-r--r-- 1 root root 10741612544 Jan 20 09:56 vol-178C3E9C
-rw-r--r-- 1 root root  5372903424 Jan 23 08:57 vol-391D44BE
-rw-r--r-- 1 root root 16110321664 Jan 30 07:43 vol-D02E43C2
-rw-r--r-- 1 root root  8053063680 Jan 30 07:44 vol-D02E43C2MsmGAIzx

This is then loopback mounted:

[root@Pod-03 volumes]# losetup -a
/dev/loop0: [fd00]:393652 (//var/lib/eucalyptus/volumes/vol-178C3E9C)
/dev/loop1: [fd00]:394108 (//var/lib/eucalyptus/volumes/vol-391D44BE)
/dev/loop2: [fd00]:393631 (//var/lib/eucalyptus/volumes/vol-D02E43C2)
/dev/loop3: [fd00]:394121 (//var/lib/eucalyptus/volumes/vol-D02E43C2MsmGAIzx)

The image is then marked as an LVM physical volume and added to the volume group, which now shows an increased size:

[root@Pod-03 volumes]# vgdisplay
--- Volume group ---
VG Name               vg-WsNskQ..
System ID
Format                lvm2
Metadata Areas        2
Metadata Sequence No  5
VG Access             read/write
VG Status             resizable
MAX LV                0
Cur LV                2
Open LV               2
Max PV                0
Cur PV                2
Act PV                2
VG Size               22.50 GB
PE Size               4.00 MB
Total PE              5759
Alloc PE / Size       5759 / 22.50 GB
Free  PE / Size       0 / 0
VG UUID               Dh0Xs7-Pvlr-vUbm-d1mA-7pCf-I94z-dS3r0l

lvdisplay shows the status of the logical volumes:

[root@Pod-03 volumes]# lvdisplay
--- Logical volume ---
LV Name                /dev/vg-WsNskQ../lv-OWzOIQ..
VG Name                vg-WsNskQ..
LV UUID                JFyFII-F1K8-QGYD-0s77-1dZf-ALQm-33AOr9
LV Write Access        read/write
LV snapshot status     source of
/dev/vg-WsNskQ../lv-snap-CfzmRA.. [active]
LV Status              available
# open                 1
LV Size                15.00 GB
Current LE             3840
Segments               1
Allocation             inherit
Read ahead sectors     auto
- currently set to     256
Block device           253:4

--- Logical volume ---
LV Name                /dev/vg-WsNskQ../lv-snap-CfzmRA..
VG Name                vg-WsNskQ..
LV UUID                0S4AAC-g4ip-0Zjs-0dQV-Yfym-tTNA-tPV1rA
LV Write Access        read/write
LV snapshot status     active destination for /dev/vg-WsNskQ../lv-OWzOIQ..
LV Status              available
# open                 1
LV Size                15.00 GB
Current LE             3840
COW-table size         7.50 GB
COW-table LE           1919
Allocated to snapshot  0.00%
Snapshot chunk size    4.00 KB
Segments               1
Allocation             inherit
Read ahead sectors     auto
- currently set to     256
Block device           253:5

Note the COW-table size: it matches the size of the additional physical volume that was added to the volume group, and it is this extra physical volume that provides the space to hold the snapshot logical volume.
For more information on how LVM snapshotting works, see the TLDP guide: http://tldp.org/HOWTO/LVM-HOWTO/snapshotintro.html
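
For comparison, creating a COW snapshot of a logical volume by hand looks something like this (the size and snapshot name here are illustrative; Eucalyptus sizes the COW table itself):

# lvcreate --snapshot --size 7.5G --name lv-snap-example /dev/vg-WsNskQ../lv-OWzOIQ..
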
From this snapshot, the data is then copied into a snapshot disk image in /var/lib/eucalyptus/volumes/.  The process can be seen with ps aux on the storage controller:

dd if=/dev/vg-WsNskQ../lv-snap-CfzmRA.. of=//var/lib/eucalyptus/volumes/snap-C2DF3CF2 bs=1M
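
The copy can take a while for large volumes; a simple way to watch its progress is to keep an eye on the allocated size of the growing snapshot image:

# watch -n 10 'ls -lsh /var/lib/eucalyptus/volumes/snap-C2DF3CF2'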

Once complete, the storage controller cleans up the snapshot device, removes the physical volume, detaches the loopback device and removes the temporary disk image.  The volumes directory now just shows the snapshot image:

[root@Pod-03 ~]# ll /var/lib/eucalyptus/volumes/
total 29607236
-rw-r--r-- 1 root root  8589934592 Jan 30 07:32 snap-26EE3E3A
-rw-r--r-- 1 root root 16106127360 Jan 30 07:48 snap-C2DF3CF2
-rw-r--r-- 1 root root  5368709120 Jan 30 06:49 snap-D7E53AC0
-rw-r--r-- 1 root root 10741612544 Jan 20 09:56 vol-178C3E9C
-rw-r--r-- 1 root root  5372903424 Jan 23 08:57 vol-391D44BE
-rw-r--r-- 1 root root 16110321664 Jan 30 07:43 vol-D02E43C2

Next, the snapshot is stored in Walrus (S3) as a snapset.  Check /var/log/eucalyptus/cloud-output.log for the transfer messages:

| <euca:StoreSnapshotType xmlns:euca="http://msgs.eucalyptus.com">
|   <euca:WalrusDataRequestType>
|     <euca:WalrusRequestType>
|       <euca:correlationId>9105e2c8-b228-4dfa-b622-e384a232852f</euca:correlationId>
|       <euca:_return>true</euca:_return>
|       <euca:_services/>
|       <euca:_disabledServices/>
|       <euca:_notreadyServices/>
|       <euca:accessKeyID>KGPY0PMLTKX4XUORAC8IK</euca:accessKeyID>
|       <euca:timeStamp>2012-01-30T15:48:34.688Z</euca:timeStamp>
|       <euca:bucket>snapset-ccc92f77-ab62-4554-9151-973e45fcc974</euca:bucket>
|       <euca:key>snap-C2DF3CF2</euca:key>
|     </euca:WalrusRequestType>
|     <euca:randomKey>snapset-ccc92f77-ab62-4554-9151-973e45fcc974.snap-C2DF3CF2.JAoH7pTIt7gbzQ..</euca:randomKey>
|   </euca:WalrusDataRequestType>
|   <euca:snapshotSize>16106127360</euca:snapshotSize>
| </euca:StoreSnapshotType>

Note the snapshot reference in the key field.  This will be followed in the log (if successful) with:

INFO WalrusManager.putObject(WalrusManager.java):1020 | Transfer complete: snapset-ccc92f77-ab62-4554-9151-973e45fcc974.snap-C2DF3CF2

The snapset should appear in the Walrus (S3) bukkits directory:

[root@Pod-03 eucalyptus]# ll /var/lib/eucalyptus/bukkits/
total 24
drwxr-xr-x 2 eucalyptus eucalyptus 4096 Jan 30 05:05 euca-centos
drwxr-xr-x 2 eucalyptus eucalyptus 4096 Jan 16 06:50 euca-ubuntu
drwxr-xr-x 2 eucalyptus eucalyptus 4096 Jan 30 08:35 snapset-80efb05e-f0d7-444d-946d-9cfdad3845ad
drwxr-xr-x 2 eucalyptus eucalyptus 4096 Jan 30 06:51 snapset-ad471308-c3f9-4908-a4f0-1e8790d8b99f
drwxr-xr-x 2 eucalyptus eucalyptus 4096 Jan 30 09:08 snapset-ccc92f77-ab62-4554-9151-973e45fcc974
drwxr-xr-x 2 eucalyptus eucalyptus 4096 Jan 30 07:35 snapset-f394c343-2580-4406-a29b-b5b62b2b9fd1
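
As mentioned earlier, a snapshot can also be used to seed a new volume.  With euca2ools this is done by passing the snapshot ID to euca-create-volume (check euca-create-volume --help for the exact option name in your version):

# euca-create-volume -z eucalyptus --snapshot snap-C2DF3CF2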

WARNING:  Snapshots count towards a user's quota in S3; keep this in mind when configuring the storage system.

Troubleshooting

Q1:  I’ve tried to detach a volume from my instance but it stays in the “detaching” state.
A1:  Ensure you unmount your volume from within the instance before detaching it.  If the volume is stuck in this state, you may need to terminate the instance to free it.
Q2:  My EBS snapshot is created on the storage controller but it’s not uploaded to Walrus.
A2:  Check /var/log/eucalyptus/cloud-error.log and cloud-output.log for errors.  You may be hitting your user's quota; use the GUI or euca-describe-properties to check the quota limits for snapshots.
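
For example, to list snapshot-related properties without guessing the exact key names (which vary between Eucalyptus versions):

# euca-describe-properties | grep -i snapshot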