SQL Server VM Performance with VMware vSphere 6.5

Achieving optimal SQL Server performance on vSphere has been a constant focus here at VMware; I’ve published past performance studies with vSphere 5.5 and 6.0 which showed excellent performance up to the maximum VM size supported at the time.

Since then, there have been quite a few changes! While this study uses a similar test methodology, it features an updated hypervisor (vSphere 6.5), database engine (SQL Server 2016), OLTP benchmark (DVD Store 3), and CPUs (Intel Xeon v4 processors with 24 cores per socket, codenamed Broadwell-EX).

The new tests show large SQL Server databases continue to run extremely efficiently, achieving great performance on vSphere 6.5. Following our best practices was all that was necessary to achieve this scalability – which reminds me, don’t forget to check out Niran’s new SQL Server on vSphere best practices guide, which was also just updated.

In addition to performance, power consumption was measured on each ESXi host. This allowed for a comparison of Host Power Management (HPM) policies within vSphere, performance per watt of each host, and power draw under stress versus idle:

Generational SQL Server DB Host Power and Performance/watt

Additionally, this new study compares a virtual file-based disk (VMDK) on VMware’s native Virtual Machine File System (VMFS 5) to a physical Raw Device Mapping (RDM). I added this test for two reasons: first, it has been several years since they have been compared; and second, customer feedback from VMworld sessions indicates this is still a debate that comes up in IT shops, particularly with regard to deploying database workloads such as SQL Server and Oracle.

For more details and the test results, download the paper: Performance Characterization of Microsoft SQL Server on VMware vSphere 6.5

The post SQL Server VM Performance with VMware vSphere 6.5 appeared first on VMware VROOM! Blog.

Sizing for large VMDKs on vSAN

I’ve recently been involved in some design and sizing for very large VMDKs on vSAN. There are a couple of things to keep in mind when doing this, not just the overhead when deciding to go with RAID1, RAID5 or RAID6, but also what this means for component counts. In the following post, I have done a few tests with some rather large RAID-5 and RAID-6 VMDKs, just to show you how we deal with it in vSAN. If you are involved in designing and sizing vSANs for large virtual machines, you might find this interesting.

Let’s start with a RAID-5 example: a VM with a significantly large 8TB VMDK, deployed as a RAID-5.

As a RAID-5, that 8TB VMDK carries a 1.33x capacity overhead to cater for the parity, so in essence we are looking at roughly 10.66TB to implement this RAID-5 configuration, which can tolerate one failure in vSAN. Compare that to the default RAID-1 Failures-To-Tolerate setting, which would require a second full copy of the data: 2 x 8TB = 16TB. So RAID-5 gives us a considerable space saving over RAID-1. However, you will need at least 4 hosts in your vSAN cluster to implement a RAID-5 configuration, whereas RAID-1 can be implemented with 2 or 3 nodes.
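To put numbers on that, here is a quick back-of-the-envelope calculation in PowerShell (my own sketch, not an official sizing tool), using the sizes from this example:

# Raw vSAN capacity consumed by an 8TB VMDK under different policies (sketch)
$vmdkTB = 8
$raid1TB = $vmdkTB * 2           # RAID-1, FTT=1: two full copies = 16TB
$raid5TB = $vmdkTB * (4 / 3)     # RAID-5 (3+1 erasure coding): data + parity = ~10.66TB
"RAID-1: {0:N2}TB  RAID-5: {1:N2}TB" -f $raid1TB, $raid5TB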

Now we come to the component count. Readers who are well versed in vSAN will be aware that the maximum component size in vSAN is 255GB. This has been the case since vSAN 5.5, and continues to be true today. So with 10.66TB, we would need somewhere in excess of 40 components of up to 255GB each to accommodate this requirement. I deployed this configuration in my own environment, and this is what was created on vSAN, using a policy of FTT (FailureToTolerate)=1, FTM (FailureToleranceMethod)=Erasure Coding.

I have a total of 44 components in this example, 11 per RAID-5 segment. These components are then concatenated into a RAID-0 in each RAID-5 segment. If you want to see this on your own vSAN, you will have to use the Object Space Reservation setting of 100% to achieve this (along with the necessary disk capacity of course). Since vSAN deploys objects thinly, if you do not use OSR=100%, you will only see the bare minimum 4 components in the RAID-5 object. As you consume capacity in the VMDK, the layout will grow accordingly.
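If you want to estimate the component count yourself before deploying, the little helper below (again, just my own rough sketch based on the 255GB maximum component size and the erasure-coding column counts, not an official tool) reproduces the numbers above:

# Rough component-count estimate for an erasure-coded vSAN object (sketch only)
function Get-VsanComponentEstimate {
    param(
        [double]$VmdkTB,     # usable VMDK size in TB
        [int]$DataColumns,   # 3 for RAID-5, 4 for RAID-6
        [int]$ParityColumns  # 1 for RAID-5, 2 for RAID-6
    )
    $columns   = $DataColumns + $ParityColumns
    $rawGB     = $VmdkTB * 1024 * $columns / $DataColumns          # capacity including parity
    $perColumn = [int][math]::Ceiling(($rawGB / $columns) / 255)   # split each column into <=255GB components
    [pscustomobject]@{
        ComponentsPerColumn = $perColumn
        TotalComponents     = $perColumn * $columns
    }
}

# 8TB VMDK as RAID-5 (3 data + 1 parity) => 11 components per segment, 44 in total
Get-VsanComponentEstimate -VmdkTB 8 -DataColumns 3 -ParityColumns 1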

Now the other thing to keep in mind with component count is snapshots. A snapshot layout will follow the same layout as the VMDK that it is a snapshot of. Therefore a snapshot of the above VMDK will have the same layout, as shown here:

This means that to snapshot this RAID-5 VMDK, I will consume another 44 components (which needs to be factored into the component count).

Let’s take another example that I have been working on: the same VM with an 8TB VMDK, this time deployed as a RAID-6.

As a RAID-6, that 8TB VMDK carries a 1.5x capacity overhead to cater for the double parity required for RAID-6, so in essence we are looking at in the region of 12TB to implement this RAID-6 configuration. Of course, the point to remember is that this configuration can tolerate two failures in vSAN. That is a considerable space saving when compared to using RAID-1 to tolerate two failures, which would mean three copies of the data: 3 x 8TB = 24TB. So RAID-6 consumes half the capacity of the equivalent RAID-1 configuration. You will of course need at least 6 hosts in your vSAN cluster to implement a RAID-6 configuration, so keep that in mind as well.
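Continuing the quick arithmetic from the RAID-5 example above:

# RAID-6 (4+2 erasure coding): data + double parity
$vmdkTB  = 8
$raid6TB = $vmdkTB * (6 / 4)     # 12TB, versus 3 x 8TB = 24TB for RAID-1 tolerating two failures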

Now the next thing is the component count. So with a ~12TB RAID-6 object (8TB data, 4TB parity), this is what was deployed on vSAN, once I set the Object Space Reservation to 100% and chose a RAID-6 policy (FTT=2, FTM=Erasure Coding):

In each of the RAID-6 segments there are 9 components (each segment holds roughly 2TB). With 6 segments, this means we are looking at 54 components to deploy that 8TB VMDK in a RAID-6 configuration. As before, any snapshot of this VM/VMDK will instantiate a snapshot delta object with the same configuration, adding another 54 components.
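The same component estimator sketched in the RAID-5 example arrives at the 54-component figure for this layout, and it is worth remembering that every snapshot of the VMDK adds that number again:

# 8TB VMDK as RAID-6 (4 data + 2 parity) => 9 components per segment, 54 in total
Get-VsanComponentEstimate -VmdkTB 8 -DataColumns 4 -ParityColumns 2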

Hopefully that explains some of the considerations when dealing with some very large VMDKs on vSAN.

 

The post Sizing for large VMDKs on vSAN appeared first on CormacHogan.com.

Video – LogInsight deep dive

This video is aimed at anyone who does a lot of log analysis. In it, I showcase the capabilities of VMware vRealize Log Insight.

It will enable you to confidently use the tool not only to analyse all the diverse logs you can think of, but also to visualise patterns in them, and much more.

So, if you are a hands-on person who loves root cause analysis, or wants to solve that nagging performance issue, then this video is for you.

Learn all about vRealize Log Insight in under 90 minutes

New Release: Learning PowerCLI – Second Edition Book

Recently, the new book Learning PowerCLI – Second Edition was published by Packt Publishing. Learning PowerCLI – Second Edition contains 517 pages of PowerCLI goodness. The book starts with downloading and installing PowerCLI, then continues with basic PowerCLI concepts and working with PowerShell objects. The following chapters cover managing vSphere hosts, virtual machines, virtual networks, storage, high availability, clusters, and vCenter Server. After chapters on patching ESXi hosts and upgrading virtual machines with vSphere Update Manager, managing VMware vCloud Director and vCloud Air, using Site Recovery Manager and vRealize Operations Manager, and using the REST API to manage NSX and vRealize Automation, the book finishes with a chapter about reporting.

If you are new to PowerCLI or have some PowerCLI experience and want to improve your PowerCLI skills, Learning PowerCLI – Second Edition will teach you to use PowerCLI to automate your work!
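If you have never touched PowerCLI before, this is roughly what your first few lines will look like (the vCenter name is obviously just a placeholder):

Connect-VIServer -Server vcenter.example.com
# List powered-on VMs with their host and memory allocation
Get-VM | Where-Object { $_.PowerState -eq 'PoweredOn' } |
    Select-Object Name, VMHost, MemoryGB
Disconnect-VIServer -Confirm:$false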

What’s New

Compared to the first edition, the following new topics are added in Learning PowerCLI – Second Edition:

  • Importing OVF or OVA packages
  • Using Tags
  • Using VMware vSAN
  • Using vSphere storage policy-based management
  • Configuring enhanced vMotion compatibility (EVC) mode
  • Patching ESXi hosts and upgrading virtual machines
  • Managing VMware vCloud Director and vCloud Air
  • Using Site Recovery Manager
  • Using vRealize Operations Manager
  • Using REST API to manage NSX and vRealize Automation

Learning PowerCLI – Second Edition is available exclusively from Packt Publishing: https://www.packtpub.com/virtualization-and-cloud/learning-powercli-second-edition

About the Author

Learning PowerCLI – Second Edition is written by Robert van den Nieuwendijk. Robert is a freelance system engineer living and working in the Netherlands. He has been a VMware vExpert since 2012 and is a moderator of the VMware VMTN Communities. Robert has a blog at http://rvdnieuwendijk.com. You can follow Robert on Twitter as @rvdnieuwendijk.

The post New Release: Learning PowerCLI – Second Edition Book appeared first on VMware PowerCLI Blog.

A closer look at Runecast

Last week, I had the pleasure of catching up with a new startup called Runecast. These guys are doing something that is very close to my heart. As systems become more and more complex, and fewer people take on more responsibility, highlighting potential issues and providing descriptive guidance on resolving them is now critical. This is something that resonates in the world of HCI (hyper-converged infrastructure), where the vSphere administrator may also be the storage administrator, and perhaps the network administrator too. This is where Runecast come in. Using a myriad of resources such as VMware’s Knowledgebase system, the Security Hardening Guide, various best practices, and other assorted information, Runecast can monitor your vSphere infrastructure and bring to your attention anything that needs remediation. This could be because something in the logs matched an issue reported in a KB, because new hosts that have not been security hardened were added to a cluster, or because VMware has released a new patch or update that is relevant to your environment.

Setup

The Runecast product comes as a virtual appliance (OVA). Simply deploy it in your infrastructure and connect it to your vCenter Servers. The appliance needs 2 vCPUs and 6GB RAM. The latest appliance, version 1.5 (which released today, incidentally), has 2 x VMDKs (40GB). This is primarily to facilitate a new feature in v1.5 which allows the appliance to gather logs and provide reporting on multiple vCenter Servers. Once logged in, the appliance can be set up to monitor ESXi hosts as well as VMs. For ESXi hosts, the syslog is redirected to the appliance, and the appropriate firewall rules are configured. Virtual machine log output can also be redirected to the appliance; this is done by adding an entry to the VM’s .vmx file, after which the VM needs to be power cycled or migrated for the update to take effect. The appliance only needs internet access to download new updates; however, updates can be provided in other ways for sites that do not have access to the outside world. This diagram provides a basic overview of the architecture.
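The appliance takes care of the ESXi syslog redirection for you, but for reference, the equivalent manual configuration looks roughly like the following PowerCLI sketch (the host name, appliance IP, and firewall rule name are assumptions on my part; the VM-level .vmx entry is Runecast-specific, so it is not shown here):

# Redirect an ESXi host's syslog to the appliance and open the outbound syslog firewall rule
$esx = Get-VMHost -Name 'esxi01.example.com'
Set-VMHostSysLogServer -VMHost $esx -SysLogServer '192.168.1.50' -SysLogServerPort 514
Get-VMHostFirewallException -VMHost $esx -Name 'syslog' |
    Set-VMHostFirewallException -Enabled $true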

We were told that Runecast are currently making new updates every 2 weeks on average, but if there is a critical update from VMware, they will push this to their users quicker than that.

Demo

I must say that the demo was very intuitive. After about 30 minutes, I felt like I would be able to drive this very easily myself. Stanimir Markov demonstrated the product to us and during the demo, we saw examples of issues related to security hardening being highlighted, missing best practices, as well as alerts being generated because the log analysis caught something  that was highlighted in a VMware Knowledgebase article. I then deployed it myself in my own lab, and had it running in a matter of minutes. Probably the easiest way to get a feel for it is via some of the screenshots that I took. Here is a nice one which highlights whether a bunch of different best practices have passed or failed, and also how many objects in the inventory are impacted by this check.

This is another nice one – KBs discovered. This highlights whether or not the criteria in a given knowledgebase article apply to this environment, and again how many objects are affected. Not only that, but each of those objects can then be queried for even further detail, such as the ESXi host below. You’ll also notice that each of the alerts/warnings comes with a severity level, which the Runecast team determine based on the impact the "known issue" can have on your environment. As you can see below, this one could cause a PSOD (Purple Screen of Death), so it is categorized as "Critical" from an availability perspective.

The other nice feature is that you can read the KB articles via the appliance interface. There is no need to connect to the VMware KB site to review the content. Very useful again for those sites that do not have internet access.

I’ll just add one more screenshot that I thought was interesting. This one is from the security hardening view. I know that security hardening is a very important category for a lot of customers, but it requires a lot of due diligence to make sure it is implemented correctly. This is especially true in HCI, where you might be regularly scaling out the system by adding new hosts to the cluster. Manually making sure all the security hardening is in place can be tedious. With Runecast, you can verify that the security hardening changes have indeed been implemented:

While I haven’t been able to touch on all aspects of the Runecast interface in this post, I was very impressed by its simplicity and ease of use. Compared to a lot of other interfaces, it seemed very intuitive, with no steep learning curve. Other items that impressed me were the ability to get an inventory view and see how many alerts are associated with each host, VM, datastore, network, and so on. I also liked the filtering mechanism, where some alerts can be ignored temporarily or permanently, perhaps during a planned maintenance period.

One limitation is around remote alerting. Right now this is only available via email, but the Runecast team are working on additional notification mechanisms, such as SNMP traps and web hooks for applications such as Slack, etc. This is feedback that they have heard from many of their customers.

About Runecast

Runecast currently have 10 full-time employees, with another 4 part-time. The majority of the development work is carried out in the Czech Republic, and they have a presence in many other countries. Runecast offer a free 30-day trial of their product, and I also believe that VMware vExperts have access to an NFR license. Licensing is based on an annual subscription, which I understood to be $250 per CPU per year. Runecast Analyzer can be downloaded here.

I must say that I was impressed by this product. Like I said at the beginning of this post, as HCI becomes more prevalent, the onus will be on fewer people to manage more of the infrastructure. Those people are invariably the vSphere admins, and tooling will be critical in reducing troubleshooting time. The next step will not just be proactive highlighting of potential issues, but prescriptive guidance and remediation in the event of a failure.

Runecast are participating in a number of VMware User Group meetings globally this year, and they will also be at VMworld. Go check them out if you see them.

The post A closer look at Runecast appeared first on CormacHogan.com.

2-node vSAN topologies review

There has been a lot of discussion in the past around supported topologies for 2-node vSAN, specifically around where we can host the witness. Now, my good pal Duncan has already highlighted some of this in his blog post here, but questions continue to come up about where you can, and where you cannot, place the witness for a 2-node vSAN deployment. I also want to highlight that many of these configuration considerations are covered by our official documentation. For example, there is the very comprehensive VMware Virtual SAN 6.2 for Remote Office and Branch Office Deployment Reference Architecture, which talks about hosting the witness back in a primary data center, as well as another Reference Architecture document which covers Running VMware vSAN Witness Appliance in VMware vCloud Air. So considering all of the above, let’s look at some topologies that are supported with 2-node vSAN deployments, and which ones are not:

Witness running in the main DC

In this first example, we fully support having the witness (W) run remotely on another vSphere site, such as back in your primary datacenter. This is covered in detail in the VMware Virtual SAN 6.2 for Remote Office and Branch Office Deployment Reference Architecture mentioned earlier.

Witness running in vCloud Air

In this next example, we fully support having the witness (W1) run remotely in vCloud Air. This is covered in detail in the Running VMware vSAN Witness Appliance in VMware vCloud Air Reference Architecture mentioned earlier.

Witness running on another standard vSAN deployment

Now this one is interesting. A common question is whether or not one can run the witness (W) on a vSAN deployment back on the main DC. The answer is yes, this is fully supported. The crux of the matter, as stated by the vSAN Lead Engineer Christian Dickmann, is that “We support any vSphere to run the witness that has independent failure properties”. So in other words, any failure on the 2-node vSAN at the remote site will not impact the availability of the standard vSAN environment at the main DC.

Witness running on another 2-node vSAN deployment, and vice-versa

This final configuration is the one which Duncan has described in detail on his post, so I won’t go into it too much. Suffice to say that this configuration breaks the guidance around "We support any vSphere to run the witness that has independent failure properties." In this case there is an inter-dependency between the 2-node vSAN deployments at each of the remote sites, as each site hosts the witness of the other 2-node deployment (W1 is the witness for the 2-node vSAN deployment at remote site 1, and W2 is the witness for the 2-node vSAN deployment at remote site 2). Thus if one site has a failure, it impacts the availability of the other site. [Update] As of March 16th, 2017, VMware has changed its stance on this configuration. We will now support it through our RPQ process. There are several constraints with this deployment, and customers need to fully understand and agree to them for us to approve the RPQ. So this configuration is not recommended, but it is supported via RPQ.

Hope this helps clarify the support around the different 2-node topologies, especially for witness placement.

Licensing

There is one final topic that I wish to bring up with 2-node + witness deployments, and that is licensing. Note that even though the witness is an appliance, it is an ESXi host running in a VM. And although we supply a license with the appliance, it will still consume a license in vCenter when it comes to management. For example, say you deploy a 2-node vSAN. The 2-node vSAN will need 2 ESXi hosts at the remote site, but there may be a 3rd physical server that could be used for hosting vCenter as well as the witness appliance. If you are using a vSphere Essentials license, you will not be able to add the witness appliance, as vSphere Essentials can only manage 3 hosts. There is some discussion about this internally at VMware at the moment, but as of right now, this is a restriction that you may encounter with vSphere Essentials.

The post 2-node vSAN topologies review appeared first on CormacHogan.com.

vSphere 6.5 p01 – Important patch for users of Automated UNMAP

VMware has just announced the release of vSphere 6.5 p01 (Patch ESXi-6.5.0-20170304001-standard). While there are a number of different issues addressed in the patch, there is one in particular that I wanted to bring to your attention. Automated UNMAP is a feature that we introduced in vSphere 6.5, and this patch contains a fix for some odd behaviour seen with it. The issue has only been observed with certain guest operating systems, certain filesystems, and certain filesystem block sizes. KB article 2148987 for the patch describes it as follows:

Tools in guest operating system might send unmap requests that are not aligned to the VMFS unmap granularity. Such requests are not passed to the storage array for space reclamation. In result, you might not be able to free space on the storage array.

It would seem that when a Windows NTFS filesystem is formatted with a 4KB block size, Automated UNMAP does not work; however, if the NTFS volume is formatted with a larger block size, say 32KB or 64KB, then Automated UNMAP works just fine. After investigating this internally, the issue seems to be related to the alignment of the UNMAP requests that the guest OS is sending down: these have start offsets which are not aligned on the 1MB boundary that Automated UNMAP requires. For VMFS to process an UNMAP, the requests have to arrive 1MB aligned and in multiples of 1MB. Even though the NTFS partition in the guest OS is aligned correctly, the UNMAP requests are not aligned, so we cannot do anything with them.

Our engineering team also made the observation that when some of the filesystem internal files grow to a certain size, the starting clusters which are available for allocation are not aligned on 1MB boundaries. When subsequent file truncate/trim requests come in, the corresponding UNMAP requests are not aligned properly.

While investigations continue into why NTFS is behaving this way, we have provided an interim solution in vSphere 6.5 p01. Now, when a guest OS sends an UNMAP request whose starting or ending block offset is unaligned to the configured UNMAP granularity, VMFS will UNMAP as many of the 1MB blocks in the request as possible, and zero out the misaligned remainder (which should only be the misaligned beginning of the request, the misaligned end, or both).
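As an illustration of that behaviour (this is purely my own pseudo-calculation, not the actual VMFS logic), a misaligned request gets carved up like this:

# Carve a misaligned guest UNMAP request into a zeroed head/tail and a 1MB-aligned middle (illustration only)
$granularity  = 1MB
$offset       = 4097KB            # misaligned start offset from the guest
$length       = 40960KB           # request length
$alignedStart = [math]::Ceiling($offset / $granularity) * $granularity
$alignedEnd   = [math]::Floor(($offset + $length) / $granularity) * $granularity
"Zero head : {0}KB" -f (($alignedStart - $offset) / 1KB)
"Unmap     : {0}MB starting at {1}MB" -f (($alignedEnd - $alignedStart) / 1MB), ($alignedStart / 1MB)
"Zero tail : {0}KB" -f ((($offset + $length) - $alignedEnd) / 1KB)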

If testing this for yourself, you can use something like the "optimize drive" utility on Windows to send SCSI UNMAP commands to reclaim storage, e.g.

defrag.exe /O [/G] E:

Note that /G is not supported on some Windows versions. On Linux, tools like fstrim or sg_unmap can be used, e.g.

# sg_unmap -l 4097 -n 40960 -v /dev/sdb
 unmap cdb: 42 00 00 00 00 00 00 00 18 00

The post vSphere 6.5 p01 – Important patch for users of Automated UNMAP appeared first on CormacHogan.com.

Cool Tool – vCheck Daily Report for NSX

vCheck is a PowerShell HTML framework script. It is designed to run as a scheduled task before you get into the office, and to present you with key information via an email sent directly to your inbox in a nice, easily readable format.

The script picks up on key known issues and potential issues, scripted as PowerShell plugins for various technologies, and reports them all in one place, so all you have to do in the morning is check your email.


One of the key things about this report is that if there is no issue in a particular area, that section will not appear in the email. For example, if there are no datastores with less than 5% free space (the threshold is configurable), then the disk space section in the virtual infrastructure version of this script will not show in the email. This ensures that you have only the information you need in front of you when you get into the office.

This script is not to be confused with an audit script, although the reporting framework can also be used for auditing scripts. I don't want to remind you every day that you have 5 hosts, what their names are, and how many CPUs each has – you don't want to read that kind of information unless you need it. This script will only tell you about problem areas in your infrastructure.
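To give you an idea of what one of these plugins looks like, here is a minimal sketch in the style of the public vCheck plugins, using the low-free-space datastore check mentioned above (the variable names and layout follow the pattern of the project's plugin template, so treat this as an approximation and check the current template before writing your own):

# Sketch of a vCheck-style plugin: datastores below a free-space threshold
$Title = "Datastores with low free space"
$Header = "Datastores below 5% free"
$Comments = "Only appears in the report when at least one datastore matches"
$Display = "Table"
$Author = "Example Author"
$PluginVersion = 0.1
$PluginCategory = "vSphere"

Get-Datastore |
    Where-Object { ($_.FreeSpaceGB / $_.CapacityGB) -lt 0.05 } |
    Select-Object Name, @{N = 'FreePercent'; E = { [math]::Round(100 * $_.FreeSpaceGB / $_.CapacityGB, 1) } }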

Intel Optane support for vSAN, first HCI solution to deliver it

I am in Australia this week for the Sydney and Melbourne VMUG UserCons. I had a bunch of meetings yesterday, and this morning the news dropped that Intel Optane support has been released for vSAN. The performance claims look great: 2.5x more IOPS and 2.5x lower latency. (I don’t know the test specifics yet.) On top of that, Optane typically has a higher endurance rating, meaning that the device can incur a lot more writes, which makes it an ideal device for the vSAN caching layer.

While talking to customers over the past couple of days, though, it was clear to me that performance is one thing, but flexibility of configuration is much more important. With vSAN you have the ability to select any server from the vSphere HCL and pick the components you want, as long as they are on the vSAN HCL. Or you can simply pick a ready node and swap components as needed; as long as the controller remains the same for a ready node, you can do that. Either way, you have choice, and now with Optane being certified you can use the latest in flash technology with vSAN!

Oh for those paying attention, the Intel P4800X Optane device isn’t listed on the HCL yet. The database is being updated as we speak, and the device should be included soon!

"Intel Optane support for vSAN, first HCI solution to deliver it" originally appeared on Yellow-Bricks.com. Follow me on twitter - @DuncanYB.
