Thursday, 30 June 2016

Storage Overview

Storage Overview

Storage Repositories (SRs)

XenServer defines a container called a storage repository (SR) to describe a particular storage target, in which Virtual Disk Images (VDIs) are stored. A VDI is a disk abstraction which contains the contents of a virtual disk.

VDIs to be supported on a large number of SR types. The XenServer SR is very flexible, with built-in support for IDE, SATA, SCSI and SAS drives locally connected, and iSCSI, NFS, SAS and Fibre Channel remotely connected. The SR and VDI abstractions allow advanced storage features such as Thin Provisioning, VDI snapshots, and fast cloning to be exposed on storage targets that support them. For storage subsystems that do not inherently support advanced operations directly, a software stack is provided based on Microsoft's Virtual Hard Disk (VHD) specification which implements these features.

Each XenServer host can use multiple SRs and different SR types simultaneously. These SRs can be shared between hosts or dedicated to particular hosts. Shared storage is pooled between multiple hosts within a defined resource pool. A shared SR must be network accessible to each host. All hosts in a single resource pool must have at least one shared SR in common. 
SRs are storage targets containing virtual disk images (VDIs). SR commands provide operations for creating, destroying, resizing, cloning, connecting and discovering the individual VDIs that they contain.

A storage repository is a persistent, on-disk data structure. For SR types that use an underlying block device, the process of creating a new SR involves erasing any existing data on the specified storage target. Other storage types such as NFS, Integrated StorageLink (iSL) SRs, create a new container on the storage array in parallel to existing SRs.

Virtual Disk Images (VDIs)


Virtual Disk Images (VDI) are a storage abstraction that are presented to a VM representing the physical disk. VDIs are the fundamental unit of virtualized storage in XenServer. Similar to SRs, VDIs are persistent, on disk objects that exist independently of XenServer hosts. CLI operations to manage VDIs are described in Section A.4.24, “VDI Commands”. The actual on-disk representation of the data differs by the SR type and is managed by a separate storage plug-in interface for each SR, called the SM API.

Physical Block Devices (PBDs)


Physical Block Devices represent the interface between a physical server and an attached SR. PBDs are connector objects that allow a given SR to be mapped to a XenServer host. PBDs store the device configuration fields that are used to connect to and interact with a given storage target. For example, NFS device configuration includes the IP address of the NFS server and the associated path that the XenServer host mounts. PBD objects manage the run-time attachment of a given SR to a given XenServer host.

Virtual Block Devices (VBDs)


Virtual Block Devices are connector objects (similar to the PBD described above) that allows mappings between VDIs and VMs. In addition to providing a mechanism for attaching (also called plugging) a VDI into a VM, VBDs allow for the fine-tuning of parameters regarding QoS (quality of service), statistics, and the bootability of a given VDI.

Summary of Storage objects


The following image is a summary of how the storage objects presented so far are related:



Virtual Disk Data Formats


In general, there are three types of mapping of physical storage to a VDI:

1. Logical Volume-based VHD on a LUN; The default XenServer blockdevice-based storage inserts a Logical Volume manager on a disk, either a locally attached device (LVM type SR) or a SAN attached LUN over either Fibre Channel (LVMoHBA type SR), iSCSI (LVMoISCSI type SR) or SAS (LVMoHBA type Sr). VDIs are represented as volumes within the Volume manager and stored in VHD format to allow thin provisioning of reference nodes on snapshot and clone.
2. File-based VHD on a filesystem; VM images are stored as thin-provisioned VHD format files on either a local non-shared filesystem (EXT type SR) or a shared NFS target (NFS type SR)
3. LUN per VDI; LUNs are directly mapped to VMs as VDIs by SR types that provide an array specific plug in (NetApp, EqualLogic or StorageLink type SRs). The array storage abstraction therefore matches the VDI storage abstraction for environments that manage storage provisioning at an array level.

VDI Types


In general, VHD format VDIs will be created. You can opt to use raw at the time you create the VDI; this can only be done using the xe CLI. .

To check if a VDI was created with type=raw, check its sm-config map. The sr-param-list and vdi param-list xe commands can be used respectively for this purpose.

Creating a Raw Virtual Disk Using the xe CLI


1. Run the following command to create a VDI given the UUID of the SR you want to place the virtual disk in:
xe vdi-create sr-uuid=<sr-uuid> type=user virtual-size=<virtual-size> \
name-label=<VDI name> sm-config:type=raw
2. Attach the new virtual disk to a VM and use your normal disk tools within the VM to partition and format, or otherwise make use of the new disk. You can use the vbd-create command to create a new VBD to map the virtual disk into your VM.

Converting Between VDI Formats


It is not possible to do a direct conversion between the raw and VHD formats. Instead, you can create a new VDI (either raw, as described above, or VHD) and then copy data into it from an existing volume. Citrix recommends that you use the xe CLI to ensure that the new VDI has a virtual size at least as big as the VDI you are copying from (by checking its virtual-size field, for example by using the vdi-param-list command). You can then attach this new VDI to a VM and use your preferred tool within the VM (standard disk management tools in Windows, or the dd command in Linux) to do a direct block-copy of the data. If the new volume is a VHD volume, it is important to use a tool that can avoid writing empty sectors to the disk so that space is used optimally in the underlying storage repository — in this case a file-based copy approach may be more suitable.

VHD-based VDIs


VHD files may be chained, allowing two VDIs to share common data. In cases where a VHD-backed VM is cloned, the resulting VMs share the common on-disk data at the time of cloning. Each proceeds to make its own changes in an isolated copy-on-write (CoW) version of the VDI. This feature allows VHD-based VMs to be quickly cloned from templates, facilitating very fast provisioning and deployment of new VMs.

This leads to a situation where trees of chained VDIs are created over time as VMs and their associated VDIs get cloned. When one of the VDIs in a chain is deleted, XenServer rationalizes the other VDIs in the chain to remove unnecessary VDIs. This coalescing process runs asynchronously. The amount of disk space reclaimed and the time taken to perform the process depends on the size of the VDI and the amount of shared data. Only one coalescing process will ever be active for an SR. This process thread runs on the SR master host.

If you have critical VMs running on the master server of the pool and experience occasional slow IO due to this process, you can take steps to mitigate against this:

• Migrate the VM to a host other than the SR master
• Set the disk IO priority to a higher level, and adjust the scheduler. See Section 5.5.9, “Virtual Disk QoS Settings” for more information.

The VHD format used by LVM-based and File-based SR types in XenServer uses Thin Provisioning. The image file is automatically extended in 2MB chunks as the VM writes data into the disk. For File-based VHD, this has the considerable benefit that VM image files take up only as much space on the physical storage as required. With LVM-based VHD the underlying logical volume container must be sized to the virtual size of the VDI, however unused space on the underlying CoW instance disk is reclaimed when a snapshot or clone occurs. The difference between the two behaviors can be characterized in the following way:

• For LVM-based VHDs, the difference disk nodes within the chain consume only as much data as has been written to disk but the leaf nodes (VDI clones) remain fully inflated to the virtual size of the disk. Snapshot leaf nodes (VDI snapshots) remain deflated when not in use and can be attached Read-only to preserve the deflated allocation. Snapshot nodes that are attached Read-Write will be fully inflated on attach, and deflated on detach.
• For file-based VHDs, all nodes consume only as much data as has been written, and the leaf node files grow to accommodate data as it is actively written. If a 100GB VDI is allocated for a new VM and an OS is installed, the VDI file will physically be only the size of the OS data that has been written to the disk, plus some minor metadata overhead.

When cloning VMs based on a single VHD template, each child VM forms a chain where new changes are written to the new VM, and old blocks are directly read from the parent template. If the new VM was converted into a further template and more VMs cloned, then the resulting chain will result in degraded performance. XenServer supports a maximum chain length of 30, but it is generally not recommended that you approach this limit without good reason. If in doubt, "copy" the VM using XenCenter or use the vm-copy command, which resets the chain length back to 0.

No comments:

Post a Comment