Category Archives: Storage

VCAP-CID Study Notes: Objective 3.2

Welcome to the VCAP-CID Study Notes. This is Objective 3.2 in the VCAP-CID blueprint Guide 2.8. The rest of the sections/objectives can be found here.

Bold items have higher importance, and copied text is in italics.
Knowledge

  • Identify the storage placement algorithm for vApp creation.
    • As scary as this sounds, it is documented very well in the vCAT Architecting a VMware vCloud document on pages 46-48.
    • This is the link to the online version.
  • Explain considerations related to vApp VM virtual disk placement.
    • The same pages.
  • Explain the impact of datastore threshold settings on shadow VM creation.
    • If a vApp is a linked clone, vCloud Director checks whether the datastore holding the shadow VM is reaching its yellow or red threshold; if so, a new shadow VM is created on a new datastore.
  • Explain the impact of multiple vCenter Servers on shadow VM creation.
    • I explained how Fast Provisioning works in Objective 2.5. In short, linked clones cannot span vCenter Servers, so deploying a vApp from a catalog to a different vCenter Server (or datastore) than the one holding the source triggers the creation of a new shadow VM there.
  • Explain the impact of VMFS and NFS on cluster size.
    • In vSphere 5.1 (which this blueprint is based on) the maximum size of a VMFS datastore is 64 TB. NFS shares can be as large as the manufacturer of the NAS array allows.
    • Resource clusters can have 32 ESXi hosts. Each host can see 256 volumes, but only 64 hosts can access the same volume.
    • So with 64 ESXi hosts (two full clusters) you are limited to presenting them the same 256 volumes.
    • Each datastore cluster can only hold 32 datastores, so that’s 8 full datastore clusters.
    • For NFS you can mount 256 mountpoints per host.

Skills and Abilities

  • Given service level requirements, create a tiered vCloud storage design.
    • Most likely the SLAs around tiered storage are based on performance guarantees.
    • If the use cases and requirements call for faster storage for some high-I/O VMs and slower storage for low-I/O VMs, a tiered storage design helps in those instances.
    • Most likely a three-tier storage solution is used, e.g. Gold, Silver and Bronze.
      • Gold would perhaps use 15K disks with SSDs for caching, or 10K disks with SSDs for automatic caching. RAID levels depend on the storage array manufacturer and the write load.
      • Silver could be a pool of 10K disks. RAID levels depend on the storage array manufacturer and the write load.
      • Bronze could be a pool of 7.2K NL-SAS disks, used mainly for archiving purposes, idle workloads and perhaps test/dev (just remember that test/dev environments are production environments for developers :))
  • Given the I/O demands of various cloud workloads, determine an appropriate vCloud storage configuration.
    • This is also based on the storage tiers and how they are configured.
    • RAID levels, in legacy storage arrays, play a role in calculating I/O performance for disk arrays.
      • R1 or R10: Mirrored or mirrored striped drives. Write penalty is 2, so for each write issued from the server, two writes are performed on the array. That’s because it’s a mirror.
      • R5: Uses parity to allow the failure of one drive. Write penalty is 4. That’s because for each change on disk, both the data and the parity are read, and then the data and the parity are written (two reads, two writes).
      • R6: Uses two sets of parity. Write penalty is 6. Three reads, three writes.
      • RAID-DP: Two sets of parity like RAID 6, but the write penalty is only 2 because of how WAFL writes data to disks.
    • How to calculate IOPS (a worked example is sketched below, after this list)
      • First things first, a list of IOPS per disk speed:
        • 15K disks = 175-210 IOPS
        • 10K disks = 125-150 IOPS
        • 7.2K disks = 75-100 IOPS
        • 5.4K disks = 50 IOPS
      • It also depends on whether it’s a SATA or SAS drive, and whether it’s an enterprise-grade disk or not (higher MTBF).
      • Duncan Epping explains IOPS calculations for us in this post: http://www.yellow-bricks.com/2009/12/23/iops/
  • Given service level, workload and performance requirements, create a private vCloud storage design.
    • Private clouds most likely have workloads that are known, or even planned, in advance, so there is more information about the storage requirements of the workloads.
    • This information can be used with other information (gathered in a current state analysis) to create a storage design that fulfills all the requirements.
    • Data that would be nice to have for a private cloud design:
      • IO profile of the workload
        • Can be used to calculate IO requirements for the environment.
      • Hard disk size and number, and growth
        • To calculate initial capacity and projected growth
      • RTO if a LUN is lost
        • How long it will take to restore all the VMs on the datastore, which depends on the backup solution used
    • As an example you can then create storage tiers that have different sizes to differentiate on RTO, but the same RAID setup.
  • Given service level, workload and performance requirements, create a public vCloud storage design.
    • Public clouds have very varied workloads, and there is no way of knowing beforehand what the average workload looks like in these environments.
    • When using fast provisioning and thin provisioning, larger datastores are better.
    • 2-4 TB datastores are a good choice; use SDRS to balance workloads inside a datastore cluster. Those clusters can grow to 64-128 TB with this configuration.
    • Michael Webster explains this in detail in this blog post: http://longwhiteclouds.com/2013/02/18/considerations-for-designing-a-small-vcloud-director-environment-storage/
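
As a worked example of the write-penalty math above (all numbers are placeholders: 100 spindles of 10K disks at 150 IOPS each, a 70/30 read/write split and RAID 5), here is a quick PowerShell sketch of the front-end IOPS such a pool could deliver:

# Worked example: front-end vs. back-end IOPS with a RAID write penalty
$diskCount    = 100     # placeholder: number of spindles in the pool
$iopsPerDisk  = 150     # placeholder: 10K disks, see the list above
$readPercent  = 0.70    # placeholder workload profile
$writePercent = 0.30
$writePenalty = 4       # RAID 5 (use 2 for R1/R10, 6 for R6)

$rawIops = $diskCount * $iopsPerDisk
$frontEndIops = $rawIops / ($readPercent + $writePercent * $writePenalty)
"{0:N0} front-end IOPS from {1:N0} raw back-end IOPS" -f $frontEndIops, $rawIops

With these numbers that works out to roughly 7,900 front-end IOPS from 15,000 raw IOPS, which is why write-heavy tiers need either more spindles or a lower write penalty.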

 

VM swapfiles: Leftover swapfiles

Every time a VM is started a swapfile is created on the datastore where the VM resides.

The file extension for these files is .vswp.

I was working my way through a customer’s datastores, trying to find orphaned VMs or files that could be deleted, and I noticed that many of the VM folders had the .vswp files and then some extra ones.

The extra .vswp files had a different ending: .vswp.xxxxx, where the x’s represent a number.

These files are left behind when an ESXi host crashes: when the VM is restarted, the old swapfile is still there, so it is renamed and a new .vswp file is created.

In an active VMware environment, you could potentially see a lot of these if you have had some ESXi hosts crash on you in the past. In that environment (around 60 VMs) I found 34 GB of “dead” .vswp files. That’s a lot when your datastores are at full capacity.

But you really can’t manually go searching for these unless you like repetitive and brain-damaging work.

This really isn’t a big piece of code and really doesn’t deserve a script by itself 🙂

# Search every datastore visible to the connected vCenter for leftover swapfiles (*.vswp.<number>)
dir vmstores: -Recurse -Include "*.vswp.*" | Select-Object Name, FolderPath

This will give you an output listing the name and folder path of each leftover swapfile.

Note: This will take a long time to run in bigger environments. I tried this in a big production environment (40+ datastores) and it took 5 hours to run, but it found 24 GB of empty .vswp files.
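
If you only want to check a single datastore at a time (for example the one that is closest to full), something along these lines should be quicker. This is just a sketch: the datastore name is a placeholder, and it assumes the PowerCLI VimDatastore provider’s -Location parameter for New-PSDrive.

# Map a temporary drive to a single datastore and search only that one
$ds = Get-Datastore -Name "Datastore01"    # placeholder datastore name
New-PSDrive -Name ds -PSProvider VimDatastore -Root "\" -Location $ds | Out-Null
dir ds:\ -Recurse -Include "*.vswp.*" | Select-Object Name, FolderPath
Remove-PSDrive -Name ds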

I hope this helps to find some space on your expensive SAN 🙂

Edit: Raphael Shitz at http://www.hypervisor.fr has a great script for searching for orphaned files on datastores, and he added a search for “*.vswp.*”. It’s a lot faster: only a few minutes rather than hours. His site is in French, so here is a link to his post through Google Translate. And if you can read French, allez.

VMware Storage Basics – PSA and NMP

A few days ago I posted a blog post about VMware Storage Basics: the identifiers and how paths work between them.

This post will be post two of three in explaining the VMware storage stack, ending in a post about masking paths.

So, PSA and NMP, what is that? First a short and extremely dry explanation.

PSA (Pluggable Storage Architecture)

  • Is a special VMkernel layer that manages storage multipathing.
  • Coordinates the simultaneous operation of multiple MPPs (Multipathing Plugins) and the default NMP (Native Multipath Plugin).
  • Allows 3rd party vendors to design their own MPP, with their own load balancing techniques and failover mechanisms.

NMP (Native Multipath Plugin)

  • The default MPP that comes with ESX/ESXi is the NMP. It manages sub-plugins.
  • The sub-plugins are SATPs (Storage Array Type Plugins) and PSPs (Path Selection Plugins).
  • Sub-plugins are either VMware default or 3rd party plugins (specific SATPs for specific arrays, e.g. VMW_SATP_SVC for IBM SVC arrays).
  • It associates SATPs with paths, processes I/O requests to logical devices, and performs failovers using the SATP.

MPP (Multipath Plugin)

  • 3rd party NMP+SATP+PSP. All in one stack.

Does that explain anything? No. Well, OK, some. For my part, it’s when I see a picture, and preferably a moving representation, that things start to seep into the grey matter.

First, an overview picture – on the left you have the NMP, with its distinct sub-plugins, the SATP and PSP. Then in the middle there is a 3rd party MPP, and last but not least is the MASK_PATH plugin.

So what do the subplugins do?

SATP

  • Manages failover of paths. Monitors, determines and implements switching between paths in case of a failure.
  • Provided for every type of array that VMware supports, e.g. VMW_SATP_LSI for LSI/NetApp arrays from Dell, IBM, Oracle and SGI, to name a few.

PSP

  • Determines which path will be used for an I/O request. That’s the Fixed, Round-Robin and Most-Recently-Used algorithms. More on that later.
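
If you want to see (or change) which PSP a LUN ended up with, here is a quick PowerCLI sketch, assuming you are already connected to vCenter; the host name and LUN ID are placeholders:

# Show the path selection policy (PSP) in use for every disk LUN on a host
Get-VMHost -Name "esxi01.lab.local" | Get-ScsiLun -LunType disk |
    Select-Object CanonicalName, MultipathPolicy, Vendor, Model

# Change one LUN to Round Robin
Get-VMHost -Name "esxi01.lab.local" |
    Get-ScsiLun -CanonicalName "naa.xxxxxxxxxxxxxxxx" |    # placeholder LUN ID
    Set-ScsiLun -MultipathPolicy RoundRobin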

So now we have these acronyms: PSA, NMP, MPP, SATP, PSP, MRU, RR. How do they work together? Let’s begin with what happens when you boot up an ESXi host.

  1. NMP assigns a SATP to every physical path to the logical device (datastore), e.g. VMW_SATP_LSI if it’s an IBM DS3524.
  2. NMP associates paths to logical devices – see my previous post on paths.
  3. NMP decides which PSP to use with the logical device.
  4. Storage framework (VM) tells NMP an I/O is ready to send.
  5. I/O is issued.
  6. PSP is selected. Load-balances if applicable.
  7. I/O is sent to the device.
  8. Success: The device driver (storage array) indicates the I/O is complete. Failure: NMP calls the appropriate SATP.
  9. Success: NMP tells PSP I/O is complete. Failure: SATP interprets error codes and fails over to inactive paths.
  10. Failure: PSP is called again to select which path to use for I/O – excluding the failed path.
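
To make the failover steps above a bit more concrete, this PowerCLI sketch lists the paths behind each LUN together with their state; the host name is a placeholder and it assumes an existing vCenter connection:

# List every path behind each disk LUN on a host, with its state and preferred flag
Get-VMHost -Name "esxi01.lab.local" | Get-ScsiLun -LunType disk | Get-ScsiLunPath |
    Select-Object Name, State, Preferred

When the SATP marks a path dead, its state changes here and the PSP simply stops selecting it for new I/O.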

That’s pretty simple, right? Well, not exactly, because there is no way to visualize that process unless you have read about the PSA before.

So here is a short video to help you out. I hope this makes the topic easier to understand.