
VCAP-CID Exam Experience

So at VMworld 2014 Europe I sat the VCAP-CID exam, on Monday the 13th of October to be exact. I passed with 30 min left on the clock. Barely.

How hard was it and what resources did I use to pass?

So let's begin with the difficulty level. I had no idea what the exam was going to be like, but I figured it would be similar to the VCAP-DCD exam, which according to most people is very tough. If this is the first VCAP design exam you take, I'm sure the VCAP-CID is actually a bit tougher than its DCD cousin, as it covers the VAF (VMware Architecture Framework) using vCloud Director and also how to operate and manage VMware cloud environments. But there is a reason why it was tough, and perhaps tougher than it should have been for me.

The main reason the exam was tough was that I hadn't covered all the sections in the blueprint. I've been working through the VCAP-CID exam blueprint as part of my studying, and since I was doing that anyway I figured I might as well blog about it, which I'm doing on this page:

VCAP-CID Study Notes

At the time of the exam I had only covered the first three sections, but I had read the vCAT documents at least twice before. Going through the blueprint sections helped me immensely and I recommend going through them as well, or reading my Study Notes, whichever helps 🙂

And of course if you have taken the DCD you know how the exam is set up, with its Visio and drag-and-drop questions. Most of your time will be spent on these.

Other material I gathered, but only covered the BOLD items before the exam:

As you can see there is a lot of good material on vCloud Director, but the main thing to read is the vCAT documents; they are an invaluable resource on vCloud Director designs.

If you have taken the DCD the Conceptual->Logical->Physical aspect of the exam is pretty straightforward, but if not I recommend spending some time reading the material for Section 1 in the blueprint.

Apart from that I really can't tell you more 🙂

But do read other bloggers' experiences as well, as it's always good to get a second (or ninth, in this case) opinion 🙂


VCAP-CID Study Notes: Objective 3.3

Welcome to the VCAP-CID Study Notes. This is Objective 3.3 in the VCAP-CID blueprint Guide 2.8. The rest of the sections/objectives can be found here.

Bold items have higher importance, and copied text is in italics.


  • Identify allocation model characteristics
  • Explain the impact of providing a dedicated cluster to a Provider VDC.
    • It's better to use a dedicated cluster because then you don't have to carve the same DRS cluster up into smaller resource pools, each with a different allocation model and a different way of guaranteeing resources to its workloads.
    • This is explained in detail in this post from Frank Denneman:

Skills and Abilities

  • Determine the impact of a chosen allocation model on cluster scalability.
    • Reservation Pool: not elastic. A Reservation Pool organization vDC can only use the resources available in a single cluster, so it can only scale to 32 ESXi hosts (the cluster maximum).
    • Allocation Pool: here there are two options, as the organization vDC is either elastic or it's not.
      • An elastic organization vDC allows you to scale to multiple clusters within the same provider vDC.
      • A non-elastic organization vDC only allows Allocation Pool workloads to consume resources from a single cluster.
    • Pay-as-you-Go: these are elastic, as they only work on per-VM reservations, so they can also scale across multiple clusters.
  • Given application performance requirements, determine an appropriate CPU and memory over-commitment configuration.
    • CPU Overcommitment
      • CPU overcommitment ratios are mostly expressed as vCPU:pCPU ratios. 6:1 vCPU to pCPU is considered a maximum in most cases, but this is just a recommendation from VMware. CPUs vary greatly in performance, so keep that in mind when calculating these ratios.
      • The number of vCPUs that can run on the available physical cores determines the number of VMs an environment can run. This ratio is determined in the design phase as a technical requirement, with a design decision similar to this one: “The risk associated with an increase in CPU overcommitment is mainly degraded overall performance, which can result in higher than acceptable vCPU ready times.”
      • This ratio can be used as part of a performance service agreement to make sure certain Tier-1 workloads will not suffer potential CPU contention.
    • Memory Overcommitment
      • Memory overcommitment means configuring the VMs running on a host with more memory than the host has to offer. This is commonly used in VMware vSphere environments, as most workloads do not use all their memory at the same time. vSphere ESXi has various features to help with overcommitting memory:
        • TPS – Transparent page sharing
        • Memory ballooning
        • Memory compression
        • Swapping to disk
    • Many of the allocation models have memory reservations built in, but unlike CPU reservations, reserved memory is not available to other VMs.
      • Reservation Pool creates a static memory pool whose resources the workloads fight over, but certain workloads can have individual reservations (and limits and shares).
      • Allocation Pool creates a resource pool with a percentage of memory reserved for the running VMs. The VMs then fight over that pool of reserved resources, as well as the other resources configured for the pool.
      • PAYG creates a per-VM reservation (and a resource pool as well), and the VMs can use the rest of the resources available in the cluster.
  • Given service level requirements, determine appropriate failover capacity.
    • Failover capacity is most likely based on disaster recovery scenarios.
    • SRM is not vCloud aware so storage replication is the only way to “protect” consumer workloads.
    • The steps required could be automated using the storage system's APIs, in addition to the automation features in vSphere (PowerCLI, Orchestrator).
    • Operationally, recovery point objective support must be determined for consumer workloads and included in any consumer Service Level Agreements (SLAs). Along with the distance between the protected and recovery sites, this helps determine the type of storage replication to use for consumer workloads: synchronous or asynchronous.
    • For more information about vCloud management cluster disaster recovery, see http://www.vmware.com/files/pdf/techpaper/vcloud-director-infrastructure-resiliency.pdf.
    • Appendix C in the vCAT Architecting a VMware vCloud document is something that is worth reading.
    • To determine the appropriate failover capacity you will need to have determined the SLA for the DR service, and which service offering it maps to. A Gold provider vDC cluster might have DR as a feature.
    • Then there is the capacity you will need to have available on the recovery site, which will be based on the resources used by the organizations consuming the provider vDC performance tier that has DR features.
  • Given a Provider VDC design, determine the appropriate vCloud compute requirements.
    • A provider vDC design will include allocation models, with requirements for availability SLAs and other related requirements (DR, DRS, HA).
    • This information is used to determine the appropriate host logical design: number of CPUs and cores, memory, HBAs, NICs, local storage and additional details if needed (boot from USB etc.)
    • A great example of the process is on pages 32-34 in the Private VMware vCloud Implementation Example document
  • Given workload requirements, determine appropriate vCloud compute requirements for a vApp
    • Compute requirements for a vApp involve deciding how many vCPUs will be used, how much memory is allocated and which allocation model would fit the application in question. A Tier-1 application should run on a Reservation Pool, Tier-2-3 on Allocation Pool and Dev/Test/Transient workloads on Pay-as-you-Go.
  • Given capacity, workload and failover requirements, create a private vCloud compute design.
    • Let's use an example:
    • Capacity requirements are 700 VMs and 500 vApps.
    • Workloads include Development, Pre-production, Demonstration, Training, Tier-2-3 IT infrastructure applications and Tier-1 database applications.
    • Failover requirements are to support failover of essential Tier-2-3 workloads and all Tier-1 workloads.
    • Create two tiers of compute clusters, Gold and Silver.
      • Gold will include 8 hosts with an N+2 HA configuration at 26% resource reservation. DRS configured at Fully Automated with a Moderate migration threshold.
      • Silver will include 8 hosts with an N+1 HA configuration at 13% resource reservation. DRS configured at Fully Automated with a Moderate migration threshold.
    • Storage will include 3 separate tiers: Tier-1, Tier-2 and Tier-3.
      • Tier-1 is based on 10K SAS disks and SSDs, with an easy-tiering solution to move hot blocks to SSD automatically. Workloads are replicated to a second site for disaster recovery.
      • Tier-2 is based on 10K SAS disks.
      • Tier-3 is based on 7.2K NL-SAS disks and 10K disks, moving hot blocks to the 10K disks automatically.
    • Allocation models used are:
      • Gold compute cluster: Reservation Pool. Runs the Tier-1-2-3 workloads.
      • Silver compute cluster: Allocation Pool. Runs the rest of the workloads, with varying reservation percentages between workloads.
    • The private vCloud compute design includes both Management and Resource clusters, but most Management clusters are similar, whether for private or public clouds, so we will only design the Resource cluster (since that is where the workloads will reside).
    • This is just an example of how you would create a logical compute layout. I added both storage and consumer constructs, but you really need to have all the required information to get the full picture.
    • This information was gathered from the Private VMware vCloud Implementation Example document. Even though it is based on vCloud 1.5, it's a great document to use as a reference for future designs.
  • Given capacity, workload and failover requirements, create a public vCloud compute design.
    • The same applies to public vCloud compute designs, but instead of Reservation or Allocation Pool you use Pay-as-you-Go (if the requirements are to charge for each VM separately, etc.)
    • Capacity requirements in public vCloud instances are really guesswork. You design around predicted capacity based on the number of VMs, their configuration and the size of their storage, as seen in this picture from the Public VMware vCloud Implementation Example document.

VCAP CID 3-3-1

    • Most recommended public vCloud use cases are Test/Dev and other transient workloads, but this really depends on the requirements of the hosting company; it can be a mix of all the allocation models.
    • Failover of public vCloud workloads is most likely part of a more expensive offering for higher-tier workloads, but you can of course fail over any kind of workload, even the “not-so-important” ones. All workloads are someone's production workloads 🙂
    • For creating a public vCloud design I recommend the Public VMware vCloud Implementation Example document as a great reference.
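The sizing arithmetic behind a compute design like this can be sketched in a few lines of Python. This is a rough sketch under assumed inputs (16 cores per host, 2:1 and 6:1 vCPU:pCPU ratios are my illustrative numbers, not values from the implementation example), and it uses a simple ceiling-based HA percentage, so it rounds slightly differently than the 26% figure above:

```python
import math

def ha_reservation_pct(hosts, host_failures_to_tolerate):
    """Percentage-based HA admission control for an N+x design."""
    return math.ceil(host_failures_to_tolerate / hosts * 100)

def usable_vcpus(hosts, cores_per_host, vcpu_pcpu_ratio, reservation_pct):
    """vCPU capacity left once the HA admission control reservation is set aside."""
    usable_cores = hosts * cores_per_host * (1 - reservation_pct / 100)
    return int(usable_cores * vcpu_pcpu_ratio)

# Gold tier: 8 hosts, N+2 -> 25% reservation, conservative 2:1 overcommitment
gold_res = ha_reservation_pct(8, 2)                  # 25
gold_capacity = usable_vcpus(8, 16, 2, gold_res)     # 192 vCPUs

# Silver tier: 8 hosts, N+1 -> 13% reservation, 6:1 overcommitment
silver_res = ha_reservation_pct(8, 1)                # 13
silver_capacity = usable_vcpus(8, 16, 6, silver_res) # 668 vCPUs
```

From there you would compare the vCPU capacity per tier against the 700 VM / 500 vApp capacity requirement to see how many clusters of each tier you need.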

VCAP-CID Study Notes: Objective 3.2

Welcome to the VCAP-CID Study Notes. This is Objective 3.2 in the VCAP-CID blueprint Guide 2.8. The rest of the sections/objectives can be found here.

Bold items have higher importance, and copied text is in italics.

  • Identify the storage placement algorithm for vApp creation.
    • As scary as this sounds this is documented very well in the vCAT Architecting a VMware vCloud document on pages 46-48.
    • This is the link to the online version.
  • Explain considerations related to vApp VM virtual disk placement.
    • The same pages.
  • Explain the impact of datastore threshold settings on shadow VM creation.
    • If a vApp is a linked clone, vCloud Director checks whether the datastore holding the shadow VM is reaching its yellow or red threshold; if so, a new shadow VM is created on a new datastore.
  • Explain the impact of multiple vCenter Servers on shadow VM creation.
    • I explained how Fast Provisioning works in Objective 2.5. The short version: shadow VMs are created per datastore and per vCenter Server, so spanning consumer workloads across multiple vCenter Servers results in additional shadow VM copies (and more consumed storage).
  • Explain the impact of VMFS and NFS on cluster size.
    • In vSphere 5.1 (which this blueprint is based on) the maximum size of a VMFS datastore is 64TB. NFS shares can be as large as the manufacturer of the NAS array allows.
    • Resource clusters can have up to 32 ESXi hosts. Each host can see up to 256 volumes, but at most 64 hosts can access the same volume.
    • So at 64 ESXi hosts you are limited to presenting them the same 256 volumes.
    • Each datastore cluster can only hold 32 datastores, so that's 8 full datastore clusters.
    • For NFS you can mount 256 mount points per host.
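A quick Python sanity check of those maximums, plus the shadow VM rule from above. The 256/32/64 figures come from the bullets; the yellow/red thresholds in the shadow VM check are illustrative assumptions, not vCloud Director defaults:

```python
# vSphere 5.1 figures from the bullets above.
MAX_HOSTS_PER_CLUSTER = 32
MAX_VOLUMES_PER_HOST = 256
MAX_HOSTS_PER_VOLUME = 64
MAX_DATASTORES_PER_SDRS_CLUSTER = 32

# 256 volumes per host fill exactly 8 datastore clusters of 32 datastores each.
full_datastore_clusters = MAX_VOLUMES_PER_HOST // MAX_DATASTORES_PER_SDRS_CLUSTER  # 8

def needs_new_shadow_vm(datastore_used_pct, yellow_pct=75, red_pct=85):
    """Shadow VM placement rule from this objective: once the datastore holding
    the shadow VM reaches its yellow (or red) threshold, the next linked clone
    gets a fresh shadow VM on another datastore. Threshold values are assumed."""
    return datastore_used_pct >= min(yellow_pct, red_pct)
```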

Skills and Abilities

  • Given service level requirements, create a tiered vCloud storage design.
    • Most likely the SLAs around tiered storage are based on performance guarantees.
    • If the use cases and requirements call for faster storage for some high-I/O VMs and slower storage for low-I/O VMs, a tiered storage design helps.
    • Most likely a 3-tier storage solution is used, e.g. Gold, Silver and Bronze.
      • Gold would perhaps use 15K disks with SSDs for caching, or 10K disks with SSDs for automatic tiering. RAID levels depend on the storage array manufacturer and the write load.
      • Silver could be a pool of 10K disks. RAID levels depend on the storage array manufacturer and the write load.
      • Bronze could be a pool of 7.2K NL-SAS disks and be used mainly for archiving purposes, idle workloads and perhaps test/dev (just remember that test/dev environments are production environments for developers :))
  • Given the I/O demands of various cloud workloads, determine an appropriate vCloud storage configuration.
    • This is also based on storage tiers, or how they are configured.
    • RAID levels, in legacy storage arrays, play a role in calculating IO performance for disk arrays.
      • R1 or R10: mirrored or mirrored spanned drives. The write penalty is 2: for each write issued from the server, two need to be performed on the array, because it's a mirror.
      • R5: uses parity to allow the failure of one drive. The write penalty is 4: for each change on disk, both the data and the parity are read, and then the data and the parity are written.
      • R6: uses two sets of parity. The write penalty is 6: three reads, three writes.
      • RAID-DP: two sets of parity like RAID 6, but the write penalty is only 2 because of how WAFL writes data to disk.
    • How to calculate IOPS
      • First things first, a list of IOPS per disk speed
        • 15K Disks = 175-210 IOPS
        • 10K Disks = 125-150 IOPS
        • 7.2K Disks = 75-100 IOPS
        • 5.4K Disks = 50 IOPS
      • Also it depends on whether it's a SATA or SAS drive, and whether it's an enterprise-grade disk (higher MTBF) or not.
      • And then Duncan Epping explains IOPS calculations for us in this post: http://www.yellow-bricks.com/2009/12/23/iops/
  • Given service level, workload and performance requirements, create a private vCloud storage design.
    • Private clouds most likely have workloads that are known, or even planned, in advance, so there is more information available about the storage requirements of the workloads.
    • This information can be used with other information (gathered in a current state analysis) to create a storage design that fulfills all requirements.
    • Data that would be nice to have for a private cloud includes:
      • IO profile of the workload
        • Can be used to calculate IO requirements for the environment.
      • Hard disk size and number, and growth
        • To calculate initial capacity and projected growth
      • RTO if a LUN is lost
        • How long it will take to restore all the VMs on the datastore, which depends on the backup solution used
    • As an example you can then create storage tiers that have different sizes to differentiate on RTO, but the same RAID setup.
  • Given service level, workload and performance requirements, create a public vCloud storage design.
    • Public clouds have very varied workloads, and there is no way of knowing beforehand what a typical workload in these environments will look like.
    • When using fast provisioning and thin provisioning, larger datastores are better.
    • 2-4 TB datastores are a good choice; use Storage DRS to balance workloads inside a datastore cluster. Those clusters can grow to 64-128 TB with this configuration.
    • Michael Webster explains this in detail in this blog post: http://longwhiteclouds.com/2013/02/18/considerations-for-designing-a-small-vcloud-director-environment-storage/
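The write-penalty table and the per-disk IOPS figures above are enough to sketch the standard front-end IOPS formula that Duncan's post walks through. The disk count and the 70/30 read/write mix below are made-up example inputs:

```python
# RAID write penalties from the bullets above.
WRITE_PENALTY = {"RAID10": 2, "RAID5": 4, "RAID6": 6, "RAID-DP": 2}

def backend_iops(n_disks, iops_per_disk):
    """Raw IOPS the spindles can deliver together."""
    return n_disks * iops_per_disk

def frontend_iops(raw_iops, read_pct, raid_level):
    """Usable IOPS for a given read/write mix once the RAID write
    penalty is applied: raw / (read% + write% * penalty)."""
    write_pct = 100 - read_pct
    penalty = WRITE_PENALTY[raid_level]
    return int(raw_iops / (read_pct / 100 + write_pct / 100 * penalty))

# Example: 24 x 10K disks at ~140 IOPS each, 70/30 read/write mix, RAID 5
raw = backend_iops(24, 140)               # 3360 back-end IOPS
usable = frontend_iops(raw, 70, "RAID5")  # ~1768 usable front-end IOPS
```

Notice how the write penalty nearly halves the usable IOPS at a 70/30 mix on RAID 5, which is why the RAID level matters so much when sizing a storage tier.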