VMware pushes past 4TB SAP HANA limit
Excuse me while I yawn. Why such a lack of enthusiasm, you might ask? Well, it is hard to know where to start. First, the “new” limit of 6TB applies only to transactional workloads like SoH and S/4HANA. BW workloads can now scale to an amazing size of, wait for it, 3TB. And granularity is still largely at the socket level. Let’s dissect this a bit.
VMware VMs at 6TB for OLTP/3TB for OLAP are just slightly behind IBM Power Systems’ current limits of 24TB OLTP/16TB OLAP … by a factor of 4 or more. (Yes, I can divide: 16/3 = 5.3.)
But kudos to VMware. At least customers with over 4TB but under 6TB requirements (including 3 to 5 years of growth, of course) can now consider VMware, right?
With this latest announcement (not really an announcement, an SAP note, but that is pretty typical for SAP) VMware is now supported as a “real” virtualization solution for HANA. Oh, how much they wish that were true. VMware is now supported at two levels: a “shared” socket option, where both the minimum and maximum VM size is ½ of the socket’s capacity, or dedicated sockets, with a minimum of 1 socket, a maximum of 4 and a granularity of 1 socket. At a ½ socket, the socket may be shared with another HANA production workload, but SAP suggests an overhead of 14% in this scenario, and it is not clear whether they mean 14% total or 14% in addition to the 10% incurred when using VMware in the first place. Even at this level, the theoretical capacity is so small as to be of interest for only the very smallest demands. At the dedicated socket level, VMware has achieved a fundamental breakthrough … physical partitioning. Let me reach back through the way-back machine … way, way back … to the 1990s (oh, you Millennials, you have no idea how primitive life used to be), when some of us used to subdivide systems along infrastructure boundaries and thought we were doing something really special. (HP figured out how to do this about 5 years later, and is still doing it today, but let’s give them credit for trying and not criticize them for being a little slow … after all, not everyone can be a C student.)
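To make that 14% ambiguity concrete, here is a quick back-of-the-envelope sketch in Python. It is my own arithmetic, not SAP’s; the 28-core Skylake socket is an assumption carried over from the appliance specs discussed below, and the two readings of the 14% figure are my interpretations of the note.

```python
# Back-of-the-envelope arithmetic for a half-socket HANA VM on a
# 28-core Skylake socket. The 14% and 10% figures come from the SAP
# note; the two readings below are my assumptions about what "14%" means.
cores_per_socket = 28
half_socket_cores = cores_per_socket / 2                     # 14 cores

# Reading 1: 14% is the total overhead in the shared-socket case
effective_total = half_socket_cores * (1 - 0.14)             # ~12.0 cores

# Reading 2: 14% comes on top of the baseline 10% VMware overhead
effective_additive = half_socket_cores * 0.90 * (1 - 0.14)   # ~10.8 cores

print(f"14% total:          {effective_total:.2f} effective cores")
print(f"14% on top of 10%:  {effective_additive:.2f} effective cores")
```

Either way, you start with 14 cores and end up with roughly 11 to 12 cores of effective capacity, which is why I say the shared-socket option only suits the very smallest demands.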
So, now, almost 30 years later, VMware is able to partition on physical boundaries of a socket for production HANA workloads. That is so cool that if any of you are similarly impressed, I have this incredibly sophisticated device that will calculate a standard deviation with the press of a button (yes, Millennials, we used to have specialized tools to help us add and subtract called calculators which were a massive improvement over slide rules, so back off! What, you have never heard of a slide rule … OMG!)
A little 101 of Power Systems for HANA: PowerVM (IBM’s hardware/firmware virtualization manager) can subdivide a physical system at the logical level of a core, not a physical socket, for HANA production workloads. You can also add (or subtract) capacity in increments of 1 core for HANA production and increments of 1/20th of a core for non-prod and other workloads. You can even add memory without cores, even if that memory is not physically attached to the socket on which a given core resides. But there is more. During this special offer, only if you call in the next 1,000,000 minutes, at no extra charge, PowerVM will throw in the ability to move workloads and/or memory around as needed within a system or to another system in its cluster, share capacity unused by production workloads with a shared pool of VMs for application servers and non-prod, or even host a variety of non-HANA, non-SAP, Linux, AIX or IBM i workloads on the same physical system, with up to 64TB of memory shared amongst all workloads on a logical basis.
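For those keeping score at home, here is an illustrative sketch of the granularity difference just described. This is my own illustration, not an IBM or VMware tool; the 28-core socket and the rounding logic are assumptions based on the rules discussed in this post.

```python
import math

# Contrast the allocation granularity described above: PowerVM rounds a
# request up to whole cores for production HANA and to 1/20th of a core
# for other workloads, while production HANA on VMware is allocated in
# whole sockets (minimum 1, maximum 4).
def powervm_entitlement(cores_needed, production=True):
    steps_per_core = 1 if production else 20   # 1/20th of a core for non-prod
    return math.ceil(cores_needed * steps_per_core) / steps_per_core

def vmware_allocation(cores_needed, cores_per_socket=28):
    sockets = min(max(math.ceil(cores_needed / cores_per_socket), 1), 4)
    return sockets * cores_per_socket

print(powervm_entitlement(9.33))                    # 10.0 cores
print(powervm_entitlement(9.33, production=False))  # 9.35 cores
print(vmware_allocation(9.33))                      # 28 cores (one full socket)
```

The same 9.33-core request rounds up to 10 cores (production) or 9.35 cores (non-prod) on PowerVM, versus an entire 28-core socket on VMware.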
Shall we dive a little deeper into SAP’s support for HANA on VMware … I think we shall! So, we are all giddy about hosting a 6TB S/4 instance on VMware. The HANA appliance specs for a 6TB instance on Skylake are 4 sockets @ 28 cores/socket with hyperthreading enabled. A VMware 6.7 VM is supported with up to 128 virtual processors (vps). 4 x 28 x 2 = 224 vps.
I know, we have a small problem with math here. 128 is less than 224, which means the math must be wrong … or you can’t use all of the cores with VMware. To be precise, with hyperthreading enabled you can only use 16 cores per socket, or 57% of the cores used in the certified appliance test. And that is before you consider the minimum 10% overhead noted in the SAP note. So, we are asked to believe that 128 * 0.9 = 115 vps will perform as well as 224 vps. As a career-long salesman from Louisiana, I have to say, please call me, because I know where I can get ahold of some prime real estate in the Atchafalaya Basin (you can look it up, but it is basically swamp land) to sell to you for a really good price.
Alternately, we can disable hyperthreading and use all 4 sockets for a single HANA DB workload. Once again, the math gets in the way. VMware estimates that hyperthreading increases core throughput by about 15%, so logically, removing hyperthreading has the opposite effect, and that 10% overhead still comes into play, meaning even more performance degradation: 0.85 * 0.9 ≈ 77% of the certified appliance capacity.
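If you want to check my math, the whole comparison fits in a few lines. This sketch uses only the numbers already cited above: the appliance spec, the 128-vCPU VM cap, the 10% overhead from the SAP note, and VMware’s own ~15% hyperthreading estimate.

```python
# The certified appliance baseline vs. the two VMware options above.
sockets, cores_per_socket, threads_per_core = 4, 28, 2
appliance_vps = sockets * cores_per_socket * threads_per_core   # 224 vps

# Option 1: hyperthreading on, but capped at 128 vCPUs, minus 10% overhead
ht_on_vps = 128 * 0.90                                          # 115.2 vps

# Option 2: hyperthreading off, losing ~15% throughput, still 10% overhead
ht_off_fraction = 0.85 * 0.90                                   # 0.765

print(f"appliance:            {appliance_vps} vps")
print(f"HT on, 128-vCPU cap:  {ht_on_vps:.0f} vps "
      f"({ht_on_vps / appliance_vps * 100:.0f}% of appliance)")
print(f"HT off:               {ht_off_fraction * 100:.1f}% of appliance capacity")
```

Run it and the hyperthreading-on option works out to roughly half of the certified appliance’s virtual processors, which puts the “6TB supported” headline in perspective.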
By the way, brand new in this updated SAP note is a security warning when using VMware. I have to admit a certain amount of surprise here, as I believe this is the first time that SAP has recognized a virtualization solution on Intel systems as coming with some degree of risk … despite the National Vulnerability Database showing 1009 hits when searching on VMware. A similar search on PowerVM returns 0 results.
This new announcement does little to change the playing field. Intel based solutions using VMware to host production HANA environments will deliver stunted, physically partitioned systems with almost no sharing of resources other than perhaps some I/O, plus a nice hefty bill for software and maintenance from VMware. In other words, as before, most customers will be confronted with a simple choice: choose bare-metal Intel systems for production HANA and use VMware only for non-prod servers which do not require the same stack as production, such as development or sandbox; or choose IBM Power Systems and fully exploit the server consolidation capabilities it offers, continuing the journey toward improved datacenter efficiency through virtualized infrastructure that most customers have been on since the early 2000s.
Hi Alfred, nice article. I heard that Intel processors have some inherent architectural limitations which prevent supporting scale-up instances bigger than 24TB. I also heard that these limitations will only be removed with Ice Lake. I would appreciate your view on this. Thank you, Pierre
Pierre, thank you for your comments. I am not aware of any fundamental limitation and know that at least one SAP-approved Intel based system can support up to 48TB of physical memory. That is not to suggest that SAP would approve the use of all that memory in a single HANA instance, as the largest single instance size supported on Skylake based systems is 18TB. I will look into this and report back if I can find anything that substantiates the 24TB limit.
HANA on vSphere works nicely on small configurations (under 512GB) within one NUMA-node boundary, and an 8-socket Intel server can host 16 x 512GB production HANA VMs. Why do we still have the 8 production LPAR limit even on 192-core 880/980 monsters? When will uncapped shared-pool LPARs be approved by SAP?
KDS, thanks for your comment. You are indeed correct that if you have a lot of incredibly small HANA systems, you could fit 16 of them on the same system. As each instance would be sharing resources with another one, this might cause some undesirable performance impacts. Also, if an instance grows to even 513GB, it would have to consume an entire socket, reducing the total number of instances by 1 for each instance over 512GB, as the sketch below illustrates. As to Power, you are correct that the 8 LPAR limit does not appear to make much sense, especially as PowerVM offers far greater control over resources than VMware does. All of us eagerly await the lifting of this restriction by SAP. I wish that I could give you some guidance as to when this might happen, but that is way out of my area of responsibility. Likewise, production HANA LPARs in a shared pool are highly desired by IBM customers, and most are looking forward to formal support from SAP. Once again, I am not part of the team responsible for this effort, so I can’t provide any guidance as to when SAP might approve this.
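For anyone who wants to see the packing effect in numbers, here is a rough sketch under my reading of the half-socket/full-socket rules. The 1024GB-per-socket figure and the boundary logic are my assumptions for illustration, not an SAP rule book.

```python
import math

# Rough packing sketch for KDS's scenario: an 8-socket box where a
# production HANA VM at or under 512GB can share a socket (half socket),
# while anything larger must consume whole sockets.
def sockets_consumed(vm_sizes_gb, gb_per_socket=1024):
    used = 0.0
    for size in vm_sizes_gb:
        if size <= gb_per_socket / 2:
            used += 0.5                              # half-socket VM
        else:
            used += math.ceil(size / gb_per_socket)  # whole socket(s)
    return used

print(sockets_consumed([512] * 16))          # 8.0 -> fits an 8-socket server
print(sockets_consumed([513] + [512] * 15))  # 8.5 -> one instance too many
```

One instance crossing the 512GB line is enough to push the configuration past 8 sockets, forcing you to drop an instance.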