SAPonPower

An ongoing discussion about SAP infrastructure

VMware pushes past 4TB SAP HANA limit

Excuse me while I yawn.  Why such a lack of enthusiasm you might ask?  Well, it is hard to know where to start.  First, the “new” limit of 6TB is only applicable to transactional workloads like SoH and S/4HANA.  BW workloads can now scale to an amazing size of, wait for it, 3TB.  And granularity is still largely at the socket level.  Let’s dissect this a bit.

VMware VMs with 6TB for OLTP/3TB for OLAP is just slightly behind IBM Power Systems’ current limitation of 24TB OLTP/16TB OLAP by a factor of 4 or more.  (Yes, I can divide: 16/3 = 5.3.)

But kudos to VMware. At least customers with over 4TB but under 6TB requirements (including 3 to 5 years’ growth, of course) can now consider VMware, right?

With this latest announcement (not really an announcement, a SAP note, but that is pretty typical for SAP) VMware is now supported as a “real” virtualization solution for HANA.  Oh, how much they wish that were true.  VMware is now supported on two levels: a “shared” socket, with both the minimum and maximum size fixed at ½ of the socket’s capacity, or dedicated sockets, with a minimum of 1 socket, a maximum of 4 sockets and a granularity of 1 socket.  At ½ socket, the socket may be shared with another HANA production workload but SAP suggests an overhead of 14% in this scenario and it is not clear if they mean 14% total or 14% in addition to the 10% when using VMware in the first place.  Even at this level, the theoretical capacity is so small as to be of interest for only the very smallest demands.  At the dedicated socket level, VMware has achieved a fundamental breakthrough … physical partitioning.  Let me reach back through the way-back machine … way, way back … to the 1990s (oh, you Millennials, you have no idea how primitive life used to be), when some of us used to subdivide systems along infrastructure boundaries and thought we were doing something really special.  HP figured out how to do this about 5 years later and is still doing it today, but let’s give them credit for trying and not criticize them for being a little slow … after all, not everyone can be a C student.

So, now, almost 30 years later, VMware is able to partition on physical boundaries of a socket for production HANA workloads.  That is so cool that if any of you are similarly impressed, I have this incredibly sophisticated device that will calculate a standard deviation with the press of a button (yes, Millennials, we used to have specialized tools to help us add and subtract called calculators which were a massive improvement over slide rules, so back off!   What, you have never heard of a slide rule … OMG!)

A little 101 of Power Systems for HANA:  PowerVM (IBM’s hardware/firmware virtualization manager) can subdivide a physical system at a logical level of a core, not a physical socket, for HANA production workloads.  You can also add (or subtract) capacity by increments of 1 core for HANA production, and by increments of 1/20th of a core for non-prod and other workloads.  You can even add memory without cores even if that memory is not physically attached to the socket on which that core resides.  But, there is more.  During this special offer, only if you call in the next 1,000,000 minutes, at no extra charge, PowerVM will throw in the ability to move workloads and/or memory around as needed within a system or to another system in its cluster and share capacity unused by production workloads with a shared pool of VMs for application servers, non-prod or even host a variety of non-HANA, non-SAP, Linux, AIX or IBM i workloads on the same physical system with up to 64TB of memory shared amongst all workloads on a logical basis.

Shall we dive a little deeper into SAP’s support for HANA on VMware … I think we shall!  So, we are all giddy about hosting a 6TB S/4 instance on VMware.  The HANA appliance specs for a 6TB instance on Skylake are 4 sockets @ 28 cores/socket with hyperthreading enabled.  A VMware 6.7 VM is supported with up to 128 virtual processors (vps).  4 x 28 x 2 = 224 vps.

I know, we have a small problem with math here.  128 is less than 224 which means the math must be wrong … or you can’t use all of the cores with VMware.  To be precise, with hyperthreading enabled you can only use 16 cores per socket, i.e. 16 of 28 or about 57% of the cores used in the certified appliance test.  And, that is before you consider the minimum 10% overhead noted in the SAP note.  So, we are asked to believe that 128 * .9 = 115 vps will perform as well as 224 vps.  As a career long salesman from Louisiana, I have to say, please call me because I know where I can get ahold of some prime real estate in the Atchafalaya Basin (you can look it up but it is basically swamp land) to sell to you for a really good price.

Alternately, we can disable hyperthreading and use all of 4 sockets for a single HANA DB workload. Once again, the math gets in the way. VMware estimates hyperthreading increases core throughput by about 15%, so logically, removing hyperthreading has the opposite effect, and once again, that 10% overhead still comes into play meaning even more performance degradation:   .85 * .9 = 77% of the certified appliance capacity.
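For anyone who wants to check the arithmetic in the last few paragraphs, here is a small Python sketch.  The constant and function names are my own, and the inputs are simply the figures quoted above (28 cores per socket, a 128 vp limit per VM, a 10% minimum overhead and a 15% hyperthreading benefit), so treat it as a back-of-the-envelope check rather than a sizing tool.

```python
# Figures quoted in the post; the names are illustrative only.
SOCKETS = 4
CORES_PER_SOCKET = 28
THREADS_PER_CORE = 2      # hyperthreading enabled
VM_VCPU_LIMIT = 128       # maximum virtual processors per VM
VMWARE_OVERHEAD = 0.10    # minimum overhead from the SAP note
HT_UPLIFT = 0.15          # estimated hyperthreading benefit


def appliance_vps() -> int:
    """Virtual processors available on the bare-metal appliance."""
    return SOCKETS * CORES_PER_SOCKET * THREADS_PER_CORE


def usable_cores_per_socket() -> int:
    """Cores per socket a single VM can address with hyperthreading on."""
    return VM_VCPU_LIMIT // THREADS_PER_CORE // SOCKETS


def effective_vps_with_ht() -> float:
    """Virtual processors left after the minimum overhead."""
    return VM_VCPU_LIMIT * (1 - VMWARE_OVERHEAD)


def capacity_without_ht() -> float:
    """Fraction of appliance capacity with hyperthreading disabled,
    using the post's simplification (lose 15%, then lose 10%)."""
    return (1 - HT_UPLIFT) * (1 - VMWARE_OVERHEAD)


if __name__ == "__main__":
    print("appliance vps:", appliance_vps())
    print("usable cores per socket:", usable_cores_per_socket())
    print("effective vps with HT:", effective_vps_with_ht())
    print("capacity without HT:", capacity_without_ht())
```

Change the constants to match a different processor or hypervisor limit and the same comparison falls out.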

By the way, brand new in this updated SAP note is a security warning when using VMware.  I have to admit a certain amount of surprise here as I believe this is the first time that using a virtualization solution on Intel systems is recognized by SAP as coming with some degree of risk … despite the National Vulnerability Database showing 1009 hits when searching on VMware.  A similar search on PowerVM returns 0 results.

This new announcement does little to change the playing field.  Intel based solutions using VMware to host production HANA environments will deliver stunted physically partitioned systems with almost no sharing of resources other than perhaps some I/O and a nice hefty bill for software and maintenance from VMware.  In other words, as before, most customers will be confronted with a simple choice.  One option is to choose bare-metal Intel systems for production HANA and use VMware only for non-prod servers which do not require the same stack as production, such as development or sandbox.  The other is to choose IBM Power Systems and fully exploit the server consolidation capabilities it offers, continuing the journey toward improved datacenter efficiency through virtualized infrastructure that most customers have been on since the early 2000s.

September 26, 2018 | Uncategorized

Scale-up vs. scale-out architectures for SAP HANA – part 2

S/4HANA is enabled for scale-out up to 4 nodes plus one hot-standby.  Enablement does not mean it is easy or advisable.  SAP states clearly: “We recommend using scale-up configurations as long as this is economically justifiable, taking operational costs and drawbacks into account.”[i]  This same note goes on to say: “Limited knowledge about S/4HANA customer scenarios using scale-out is currently available.”

For very large customers, e.g. those for which an S/4HANA system’s memory is predicted to be larger than 24TB currently, scale-out may be the best option.  Best, of course, implies that there may be other options which will be discussed later in this post.

It is reasonable to ask why SAP offers such conditional advice.  We can only speculate since SAP does not provide a direct explanation.  Some insight may be gained by reading the SAP note on scale-out sizing.[ii]  Unlike analytical applications such as BW/4HANA, partitioning of S/4HANA tables across nodes is not permitted.  Instead, all tables of a particular module are grouped together and the entire group must be placed on an individual node in the cluster.

Let’s consider a simple example of three commonly used modules, FI, MM and SD (Financial, Materials, Sales).  The tables associated with each module belong to their respective groups.  Placing each on a different node may help to minimize the size of any one node, but several issues arise.

  • Each will probably be a different size.  This is fully supported, but the uneven load distribution may result in one node running at high utilization while another is barely using any capacity.  Not only does this mean wasted computing capacity, power and cooling, but it could also result in inferior performance on the hot node.
  • Since most customers prefer to size all nodes in a cluster the same way, considerable overcapacity of memory might result, further driving up infrastructure costs.
  • Transactions often do not fit comfortably within a single module, e.g. a sales order might result in financial tables being updated with billing, accounts receivable and revenue data and materials tables being adjusted with a decrement of available stock.  If a transaction is running on node 1 (the master node) and needs to access/update tables on nodes 2 and 3, those communications run across a network.  As with the BW example in the previous blog post, each communication is at least 30 times slower across a network than across memory.

It is important to consider that every transaction that comes into an S/4HANA system connects to the index server on the master node with queries distributed by the master node to the appropriate index server.  This means that every transaction not handled directly by the master node must involve at least one send and one receive with the associated 30 times slower latency.

Some cross-node latency may be reduced by collocating appropriate groups resulting in fewer total nodes and/or by replicating some tables.  Unfortunately, if a table is replicated, this would result in the breaking of a fundamental SAP rule noted in SAP Note 2408419 (see footnote #1 below), that all tables of a group must be located on the same node.

As with the BW example, what works well for one scenario may not work well for another.  One of the significant advantages of S/4HANA over Business Suite 7 is the consolidation and dramatic reduction of tables resulting in fewer, much larger tables.  Conversely, this makes table distribution in a scale-out cluster much more challenging. It is not hard to imagine that performance management could be quite a task in a scale-out scenario.

So, if scale-out is not an option for many/most customers, what should be done if approaching a significant memory barrier?  Options include:

  • Cleanup, use of hybrid LOBs, index optimization, etc
  • Archiving data to reduce the size of the system
  • Eliminating duplicate data or easily reproduced data, e.g. iDocs, data from Hadoop
  • Usage of Data Aging[iii]
  • Sizing memory smaller than predicted
  • Request an exception to size the system larger than officially supported

Cleaning up your system and getting rid of various unnecessary memory consumers should be the first approach undertaken.[iv]  Remember, what might have been important with a conventional DB may either be not needed with S/4HANA or a better technique may exist.  The expected memory reduction is usually shown as part of an ERP sizing report.

Archiving is another obvious approach but since the data is kept on very slow media, compared to in-memory data, and cannot be changed, the decision as to what to archive and where to place it can be very challenging for some organizations.

iDocs are, by definition, intermediate documents and are used primarily for sending and receiving documents to/from third parties, e.g. sales orders, purchase orders, invoices, shipping notices.  Every iDoc sent or received should have a corresponding transaction within the SAP system which means that it is essentially a duplicate record once processed by the SAP system.  Many customers keep these documents indefinitely just in case any disputes occur with those third parties.  Often, these iDocs just sit around collecting digital dust and may be prime candidates for deletion or archival.  Likewise, data from an external source, e.g. Hadoop, should still exist in that source and could potentially be deleted from HANA.

Data Aging only covers a subset of data objects and requires some effort to utilize it.[v]  By default, the ABAP server adds “WITH RANGE RESTRICTION (‘CURRENT’)” to all queries to prevent unintended access to aged or cold partitions, which means that to access aged data, a query must specify which aged partition to access.  This implies special transactions or at least different training for users to access aged data.  Data Aging does allow aged data to be updated, so it may be more desirable than archiving in some cases.  Aged data is stored on storage devices, which are many orders of magnitude slower than memory; however, this can be mitigated, to some extent, by faster media, e.g. NVMe drives on PCIe cards. Unfortunately, Data Aging has not been implemented by many customers, meaning a potentially steep learning curve.

Deliberately undersizing a system is not recommended by SAP and I am not recommending it either.  That said, if an implementation is approaching a memory boundary and scaling to a larger VM or platform is not possible (physically, politically or financially), then this technique may be considered. It comes with some risk, however, so should be considered a last resort only.  HANA enables “lazy loading” of columns[vi] whereby columns are not loaded until needed. If your system has a large number of columns which consume space on disk but are never or rarely accessed, the memory reserved for these columns will, likewise, go unused or underused.  HANA also will attempt to unload columns when the system runs out of allocable memory based on a least frequently used algorithm. Unless a problem occurs, a system configured with less memory than the sizing report predicts will start without problem and unload columns when needed.  The penalty comes when those columns that are not memory resident are accessed at which time other column(s) must first be unloaded and the entire requested column loaded, i.e. significant latency for the first access is incurred. As mentioned earlier, this should be considered only in a worst case scenario and only if scaling up/out is not desired or an option.
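To make the unload-and-reload penalty a little more concrete, here is a toy model in Python.  It is emphatically not how HANA is implemented; the class name, the column sizes and the eviction rule (unload the least frequently accessed column, as described above) are assumptions made purely for illustration.

```python
from collections import Counter


class ToyColumnStore:
    """Toy model of lazy column loading under a fixed memory budget."""

    def __init__(self, budget_gb, column_sizes_gb):
        self.budget_gb = budget_gb
        self.sizes = dict(column_sizes_gb)  # column name -> size in GB
        self.loaded = set()                 # columns currently in memory
        self.hits = Counter()               # access count per column
        self.loads = 0                      # number of slow loads from disk

    def used_gb(self):
        return sum(self.sizes[c] for c in self.loaded)

    def access(self, column):
        """Access a column; return True if it had to be loaded from disk."""
        self.hits[column] += 1
        if column in self.loaded:
            return False  # already memory resident, no penalty
        # Unload least frequently used columns until the new one fits.
        while self.loaded and self.used_gb() + self.sizes[column] > self.budget_gb:
            victim = min(self.loaded, key=lambda c: self.hits[c])
            self.loaded.remove(victim)
        self.loaded.add(column)
        self.loads += 1
        return True


if __name__ == "__main__":
    # Hypothetical system: 6GB of columns on disk, only 4GB of memory.
    store = ToyColumnStore(4, {"a": 2, "b": 2, "c": 2})
    for name in ["a", "b", "a", "c", "b"]:
        slow = store.access(name)
        print(name, "loaded from disk" if slow else "served from memory")
```

Every "loaded from disk" line in the output stands for a first access that pays the full column load time, which is the latency an undersized system trades for its smaller memory footprint.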

Lastly, requesting an exception from SAP to allow a system size greater than officially supported may be a viable choice for the customers that are expected to exceed current maximums. This may not be without difficulty as when you embark on a journey where few or none have gone before, inevitably, you will run into obstacles that others have not yet encountered.  Dispatch mechanisms, delta merge operations, transactional log latency, savepoint I/O throughput, system startup times, backup/recovery and system replication are among some of the more significant areas that would be stressed and some might break.

My advice: Scale-up only in all S/4HANA cases unless the predicted memory for the immediate planning horizon exceeds the official SAP maximum supported size.  Before considering scale-out solutions, use every available tool to reduce the size of the system and ask SAP for an exception if the resulting size is still above the maximum.  Lastly, remember that SAP and its hardware partners are constantly working to enable larger HANA system sizes.  If the size required today fits within the largest supported system but is expected to exceed the limit over time, it may be reasonable to start your implementation or migration effort today with the expectation that the maximum will be increased by the time you need it.  Admittedly, this is taking a risk, but one that may be tolerable and if the limit is not raised in time, scale-out is still an option.

[i]2408419 – SAP S/4HANA – Multi-Node Support
[ii]2428711 – S/4HANA Scale-Out Sizing
[iii]2416490 – FAQ: SAP HANA Data Aging in SAP S/4HANA
[iv]1999997 – FAQ: SAP HANA Memory, FAQ 5
[v]1872170 – Business Suite on HANA and S/4HANA sizing report.
[vi]https://www.sap.com/germany/documents/2016/08/205c8299-867c-0010-82c7-eda71af511fa.html

July 16, 2018 | Uncategorized

Scale-up vs. scale-out architectures for SAP HANA – part 1

Dozens of articles, blog posts, how-to guides and SAP notes have been written about this subject.  One of the best was by John Appleby, now Global Head of DDM/HANA COEs @ SAP.[i]  Several others have been written by vendors with a vested interest in the proposed option. The vendor for which I work, IBM, offers excellent solutions for both options, so my perspective is based on both my own experience and that of our many customers, some of whom have chosen one option or the other and, in some cases, both.

Scale-out for BW is well established, understood, fully supported by SAP and can be cost effective from the perspective of systems acquisition costs.  Scale-out for S/4HANA, by comparison, is in use by very few customers, not well understood, yet is supported by SAP for configurations up to 4 nodes.  Does this mean that a scale-out architecture should always be used for BW and that a scale-up architecture is the only viable choice for S/4HANA?  This blog post will discuss only BW and similar analytical environments including BW/4HANA, data marts, data lakes, etc.  The next will discuss S/4HANA and the third in the series will discuss vendor selection and where one might have an advantage over the others.

Scale-out has 3 key advantages over scale-up:

  • Every vendor can participate therefore competitive bidding of “commodity” level systems can result in optimal pricing.
  • High availability, using host auto-failover, requires nothing more than n+1 systems as the hot standby node can take over the role of any other node (some customers choose n+2 or group nodes and standby nodes).
  • Some environments are simply too large to fit in even the largest supported scale-up systems.

Scale-up, likewise, has 3 key advantages over scale-out:

  • Performance is, inevitably, better as joins across memory are always faster than joins across a network
  • Management is much simpler as query analysis and data distribution decisions need not be performed on a regular basis plus fewer systems are involved with the corresponding decrease in monitoring, updating, connectivity, etc.
  • TCO can be lower when the costs of systems, storage, network and basis management are included.

Business requirements, as always, should drive the decision as to which to use.  As mentioned, when an environment is simply too large, unless a customer is willing to ask for an exception from SAP (and SAP is willing to grant it), then scale-out may be the only option.  Currently, SAP supports BW configurations of up to 6TB on many 8-socket Intel Skylake based systems (up to 12TB on HPE’s 16-socket system) and up to 16TB on IBM Power Systems.

The next most important issue is usually cost.  Let’s take a simple example of an 8TB BW HANA requirement.  With scale-out, 4 @ 2TB nodes may be used with a single 2TB node for hot standby for a total of 10TB of memory.  If scale-up is used, the primary system must be 8TB and the hot-standby another 8TB for a total of 16TB of memory.  Considering that memory is the primary driver of the cost of acquisition, 16TB, from any vendor, will cost more than 10TB.  If the analysis stops there, then the decision is obvious. However, I would strongly encourage all customers to examine all costs, not just TCA.

In the above example, 5 systems are required for the scale-out configuration vs. 2 for scale-up. The scale-out config could be reduced to 4 systems if 3TB nodes are used with 1TB left unused although the total memory requirement would go up to 12TB.  At a minimum, twice the management activities, trouble-shooting and connectivity would be required. Also, remember, prod rarely exists on its own with some semblance of the configuration existing in QA, often DR and sometimes other non-prod instances.

The other set of activities is much more intensive.  To distribute load amongst the systems, first data must be distributed.  Some data must reside on the master node, e.g. all row-store tables, ABAP tables, general operations tables.  Other data such as Fact, DataStore Object (DSO), Persistent Staging Area (PSA) is distributed evenly across the slave nodes based on the desired partitioning specification, e.g. hash, round robin or range.  (There are also more complex options where specifications can be mixed to get around hash or range limitations and create a multi-level partitioning plan.)  And, of course, you can partition different tables using different specifications.  Which set of distribution specifications you use is highly dependent on how data is accessed and this is where it gets really complicated.  Most customers start with a simple specification, then begin monitoring placement using the table distribution editor and performance using ST03N, plus getting feedback from end users (read that as complaints to the help desk).  After some period of time and analysis of performance, many customers elect to redistribute data using a better or more complex set of specifications. Unfortunately, what is good for one query, e.g. distribute data based on month, is bad for another which looks for data based on zipcode, customer name or product number.  Some customers report that the above set of activities can consume part or all of one or more FTEs.
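The tension between partitioning specifications is easier to see with a toy example.  The Python sketch below is not HANA's partitioning logic; the three-node cluster, the sample rows and the function names are all made up, and it only shows how many nodes a query has to touch under hash, round robin and range placement.

```python
import zlib

NODES = 3  # hypothetical number of nodes holding partitions


def hash_node(key, nodes=NODES):
    """Hash partitioning: the node is derived from a hash of the key."""
    return zlib.crc32(key.encode("utf-8")) % nodes


def round_robin_node(row_number, nodes=NODES):
    """Round robin partitioning: rows are dealt out in turn."""
    return row_number % nodes


def range_node(month, nodes=NODES):
    """Range partitioning on month (1-12), split into equal ranges."""
    months_per_node = -(-12 // nodes)  # ceiling division
    return (month - 1) // months_per_node


# Made-up sample rows: (customer, month)
ROWS = [
    ("ACME", 1), ("ACME", 2), ("ACME", 11),
    ("Globex", 1), ("Globex", 6), ("Initech", 12),
]

PLACEMENTS = {
    "hash(customer)": lambda i, row: hash_node(row[0]),
    "round robin": lambda i, row: round_robin_node(i),
    "range(month)": lambda i, row: range_node(row[1]),
}


def nodes_touched(placement, predicate, rows=ROWS):
    """Set of nodes holding at least one row that matches the predicate."""
    return {placement(i, row) for i, row in enumerate(rows) if predicate(row)}


if __name__ == "__main__":
    queries = {
        "month = 1": lambda row: row[1] == 1,
        "customer = ACME": lambda row: row[0] == "ACME",
    }
    for spec, placement in PLACEMENTS.items():
        for label, predicate in queries.items():
            print(spec, "|", label, "->", sorted(nodes_touched(placement, predicate)))
```

Range on month keeps the month query on a single node but scatters the customer query, while hashing on customer does the reverse, which is the trade-off that keeps basis administrators busy.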

Back to the above example. 10TB vs. 16TB which we will assume is replicated in QA and DR, for sake of argument, i.e. the scale-up solution requires 18TB more memory.  If the price per TB is $35,000 then the cost difference in TCA would be $630,000.  The average cost of a senior basis administrator (required for this sort of complex task) in most western countries is in the $150,000 range.  That means that over the course of 5 years, the TCO of the scale-up solution, considering only TCA and basis admin costs, would be roughly equivalent to the cost of the scale-out solution.  Systems, storage and network administration costs could push the TCO of the scale-out solution up relative to the scale-up solution.
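The comparison above is simple enough to put into a few lines of Python.  This is only a sketch using the numbers from the example; the function names are mine, and the assumption that scale-out consumes one full additional basis FTE is just that, an assumption (customers report anywhere from part of one to more than one).

```python
import math

PRICE_PER_TB = 35_000          # USD, from the example above
ADMIN_COST_PER_YEAR = 150_000  # USD, senior basis administrator
YEARS = 5
ENVIRONMENTS = 3               # prod, QA and DR
EXTRA_SCALE_OUT_FTES = 1       # assumption: one additional FTE


def scale_out_memory_tb(data_tb, node_tb, standby_nodes=1):
    """Total memory for a scale-out cluster with hot standby node(s)."""
    worker_nodes = math.ceil(data_tb / node_tb)
    return (worker_nodes + standby_nodes) * node_tb


def scale_up_memory_tb(data_tb):
    """Total memory for a scale-up primary plus an equal hot standby."""
    return 2 * data_tb


def extra_scale_up_tca(data_tb, node_tb):
    """Additional acquisition cost of scale-up across all environments."""
    extra_tb = scale_up_memory_tb(data_tb) - scale_out_memory_tb(data_tb, node_tb)
    return extra_tb * ENVIRONMENTS * PRICE_PER_TB


def extra_scale_out_admin_cost():
    """Additional basis administration cost of scale-out over the period."""
    return EXTRA_SCALE_OUT_FTES * ADMIN_COST_PER_YEAR * YEARS


if __name__ == "__main__":
    print("scale-out memory, 2TB nodes:", scale_out_memory_tb(8, 2), "TB")
    print("scale-out memory, 3TB nodes:", scale_out_memory_tb(8, 3), "TB")
    print("scale-up memory:", scale_up_memory_tb(8), "TB")
    print("extra scale-up TCA: $", extra_scale_up_tca(8, 2))
    print("extra scale-out admin cost: $", extra_scale_out_admin_cost())
```

Plug in your own price per TB, salary figures and FTE estimate; the point is that the acquisition gap and the administration gap are in the same ballpark.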

And then there is performance.  Some very high performance network adapter companies have been able to drive TCP latency across a 10Gb Ethernet down to 3.6us which sounds really good until you consider memory latency is around 120ns, i.e. 30 times faster.  Joining tables across nodes not only is substantially slower, but also results in more CPU and memory overhead.[ii]  A retailer in Switzerland, Coop Group, reported 5 times quicker analytics while using 85% fewer cores after migrating from an 8-node x86 scale-out BW HANA cluster with 320 total cores to a single scale-up 96-core IBM Power Systems.[iii]  While various benchmarks suggest 2x or better per core performance of Power Systems vs. x86, the results suggest far higher, much of which can, no doubt, be attributed to the effect of using a scale-up architecture.
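A rough feel for what a 30x latency gap means can be had from the following Python sketch.  It uses the two latency figures quoted above and a deliberately naive model, in which every lookup is serial and nothing is batched or overlapped; the lookup count is an arbitrary number picked for illustration.

```python
NETWORK_LATENCY_NS = 3_600  # 3.6us TCP latency over 10Gb Ethernet
MEMORY_LATENCY_NS = 120     # approximate local memory latency


def latency_ratio():
    """How many times slower a network hop is than a memory access."""
    return NETWORK_LATENCY_NS / MEMORY_LATENCY_NS


def serial_lookup_seconds(lookups, latency_ns):
    """Time spent waiting if every lookup is serial (naive model)."""
    return lookups * latency_ns / 1e9


if __name__ == "__main__":
    lookups = 1_000_000  # arbitrary illustrative workload
    print("ratio:", latency_ratio())
    print("local join wait (s):", serial_lookup_seconds(lookups, MEMORY_LATENCY_NS))
    print("cross-node join wait (s):", serial_lookup_seconds(lookups, NETWORK_LATENCY_NS))
```

Real systems pipeline and batch requests, so the actual gap will be smaller than this worst case, but the direction of the effect is the same.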

Of course, performance is relative.  BW queries run with scale-out HANA will usually outperform BW on a conventional DB by an order of magnitude or more.  If this is sufficient for business purposes, then it may be hard to build a case for why faster is required.  But end users have a tendency to soak up additional horsepower once they understand what is possible.  They do this in the form of more what-if analyses, interactive drill downs, more frequent mock-closes, etc.

If the TCO is similar or better and a scale-up approach delivers superior performance with many fewer headaches and calls to the help desk for intermittent performance problems, then it would be very worthwhile to investigate this option.

 

To recap: for BW HANA and similar analytical environments, scale-out architectures usually offer the lowest TCA and scalability beyond the largest scale-up environment.  Scale-up architectures offer significantly easier administration, much better performance and competitive to superior TCO.

[i]https://blogs.saphana.com/2014/12/10/sap-hana-scale-scale-hardware/

[ii]https://launchpad.support.sap.com/#/notes/2044468 (see FAQ 8)

[iii]https://www.ibm.com/case-studies/coop-group-technical-reference

July 9, 2018 | Uncategorized

Persistent Memory for HANA @ SapphireNow Orlando 2018

Once again, Intel and the companies that utilize their processors were all abuzz at Sapphire about Intel Optane DC Persistent Memory (PMEM).  This is the second year in a row that they have been touting this future technology and its ability to fit into a DIMM form factor and take the place of some of the main memory currently supplied by DRAM. I was intrigued until I saw Hasso Plattner, at SapphireNow 2018 Orlando, explain how HANA would utilize this technology.  He showed a chart where a 6TB HANA DB startup time of 50 minutes was reduced to 4 minutes with a 50/50 mix of standard DRAM DIMMs and the new Intel PMEM DIMMs.  As he explained it, HANA column store would reside in PMEM, while working space and delta store tables would reside in DRAM.

50 minutes down to 4 minutes sounds outstanding, but let’s see if we can pull back the veil a bit.

Who created the chart?  Dr. Plattner was vague about this. He suggested that it might be from an internal test.  When I asked multiple vendors, including Intel, if any parts were available for customer testing, I was told no and that it would require Cascade Lake, Intel’s next version of the current Skylake chips, to drive these new DIMMs.  I suspect that Dr. Plattner was referring to the 50 minutes as being from an internal test.  This means that the 4 minute projection with PMEM may have come from another source, e.g. Intel.

Why 50 minutes?  It might seem reasonable to assume that if this was an internal test, SAP knows how to configure a system properly, so it was probably using best of breed SSD technology, e.g. Intel’s SSD 750.  50 minutes works out to roughly 122GB/min after HANA SW load.  IBM published a white paper in which Power Systems achieved approximately 172GB/min (about 40% faster) with a typical mid-range SSD subsystem and almost 1 TB/min with NVMe based SSDs, i.e. 740% faster.[i]  In other words, if 50 minutes for 6TB is longer than acceptable, Power Systems can already deliver radically faster startup time without using esoteric and untested memory concepts.
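Here is the same startup arithmetic as a short Python sketch, for anyone who wants to try other database sizes.  The function names are mine, 1TB is taken as 1024GB, and the load rates are the approximate figures quoted above, with "almost 1 TB/min" rounded up to 1024GB/min.

```python
GB_PER_TB = 1024


def load_rate_gb_per_min(db_tb, minutes):
    """Average load rate implied by a given startup time."""
    return db_tb * GB_PER_TB / minutes


def startup_minutes(db_tb, rate_gb_per_min):
    """Startup time implied by a given sustained load rate."""
    return db_tb * GB_PER_TB / rate_gb_per_min


if __name__ == "__main__":
    db_tb = 6
    print("implied rate at 50 min (GB/min):", round(load_rate_gb_per_min(db_tb, 50), 2))
    print("minutes at 172 GB/min:", round(startup_minutes(db_tb, 172), 1))
    print("minutes at ~1 TB/min:", round(startup_minutes(db_tb, 1024), 1))
```

These are column store load times only and ignore everything else that happens during a restart, so treat the results as rough lower bounds.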

For the Intel world, getting down from 50 minutes to 4 minutes would be quite a feat, but how often is this sort of restart likely to happen?  Assuming an SAP client is not using one of the near-zero maintenance options, this depends on the frequency of such updates but most typically a couple of times per year.  More often, for Intel customers, predictive failure analysis on memory will call out memory DIMMs for replacement once or twice a month, so more frequent reboots may be required.  Of course, using a more reliable memory technology such as that offered in Power Systems could alleviate this requirement.  It is ironic that the use of unreliable DRAM memory options in x86 systems could be the very cause of why faster restarts are needed.

Speaking of reliability, is PMEM RAID protected like disk?  The answer, based on what has been published so far, appears to be no.  In other words, if a PMEM DIMM were to fail, not only could this cause the system to fail, but since this would result in an incomplete or corrupted memory image, reload from the storage subsystem would still be required.  Even more irony that the fast restart functionality of PMEM would be of no use when PMEM itself is the cause of the outage.  Also, this would be the first commercial use of this technology. Good thing that Version 1.0 of anything usually works perfectly!

Next, let us consider the effect of having column store in PMEM.  If SAP has said it once, they have said it 1,000 times, the “H” in “HANA” refers to “High Performance”.  If you slow down access to column store by a factor of 5x or 10x, you get a cascading effect on just about every possible KPI in the system.  Wait, did I say 5x or 10x?  I hate it when I have to resort to quoting the source: “Intel senior vice president Rob Crooke and Micron CEO Mark Durcan declared 3D Xpoint to be 1,000 times faster and 1,000 times more durable than NAND”[ii] and third party reviews: “latencies should push down into the 1-3us range, splitting the difference between current generation DRAM (~80-100ns) and PCIe-based Optane parts (~10us)”[iii] or “As an NVDIMM, 3D XPoint memory would have approximately 20% of the speed of standard volatile DRAM.”[iv]

I just get a chuckle out of Intel’s official comment: “Unlike traditional DRAM, Intel Optane DC persistent memory will offer the unprecedented combination of high-capacity, affordability and persistence,” Lisa Spelman. Notice, Lisa does not say High Performance … good thing that is not a goal of HANA!

Ok, enough of the facts and other analysts’ comments, we want speculation!  Got it. Let us speculate on what happens when memory is 5x slower (10x is just twice as bad).  Let us also assume that the 5x or 10x slower is true and that Linux does not utilize a pseudo memory mapped file system (which it does with additional overhead).

In a highly optimized and largely hypothetical world, we will only have analytics using HANA.  (Yes, I get it, that is BW or a data lake, not S/4HANA or C/4HANA but I get to determine the parameters of this made-up world.)  Let’s consider an overly simplistic query example, e.g.  select customer where revenue > 100000.  The first 64 byte block of data movement to L3 cache only takes 1us, which at current Skylake 2.5GHz speeds means a wait of 2,500 processor cycles.  For sake of argument, we are going to assume no additional latency getting into the processor, no ccNUMA effects or any other delays.  The good news is that modern architectures will predict the next access and start loading subsequent data blocks, while query processing of the first block is occurring.  Unfortunately, since the DIMM is already busy, this preload has to wait 2,500 processor cycles before it can start transferring the next block of data and 2,500 processor cycles is usually more than sufficient to start and finish any portion of the work against only 64 bytes of data.  It is really hard to imagine how query speed will not be significantly affected by this additional latency.  Imagine taking a current HANA BW query that runs in 10 seconds and telling the users to now expect the same result in 50 or 100 seconds.  Can you imagine the revolt?
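The cycle counts in this made-up world are easy to reproduce.  The Python sketch below uses the 2.5GHz clock and 1us latency from the example; the 100ns DRAM figure is my own round number for comparison (the review quoted earlier puts current DRAM at roughly 80-100ns), and the query projection assumes the query is entirely bound by memory latency, which is the simplification this whole thought experiment rests on.

```python
CLOCK_GHZ = 2.5          # Skylake clock speed used in the example
PMEM_LATENCY_NS = 1_000  # 1us for the first 64 byte block
DRAM_LATENCY_NS = 100    # assumed round number for comparison


def stall_cycles(latency_ns, clock_ghz=CLOCK_GHZ):
    """Processor cycles spent waiting on one memory access."""
    return latency_ns * clock_ghz  # GHz is cycles per nanosecond


def projected_query_seconds(current_seconds, slowdown):
    """Query time if it scales directly with memory latency (assumption)."""
    return current_seconds * slowdown


if __name__ == "__main__":
    print("cycles stalled on PMEM:", stall_cycles(PMEM_LATENCY_NS))
    print("cycles stalled on DRAM:", stall_cycles(DRAM_LATENCY_NS))
    for slowdown in (5, 10):
        print(f"10s query at {slowdown}x:", projected_query_seconds(10, slowdown), "s")
```

Caches and prefetching will hide some of this in practice, so the real slowdown for any given query could be well below the raw latency ratio.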

Or consider transaction processing. A typical transaction might require access to data with no preload possible since these accesses are usually random.  So, this access gets a 5x or 10x delay, which is radically faster than disk access but much slower than previously experienced with in-DRAM computing.  The trouble is that while the transaction incurs that penalty, the processor is just sitting there waiting since this is “main memory”, which means that it does not issue a query and wait for an I/O interrupt to remind it to continue processing.  So, this access and every access behind it waits 2,500 cycles before continuing, assuming that everything that is required comes across in that first 64 byte chunk. Unfortunately, a transaction accesses data in rows, not in columns, which means that each row transaction may involve dozens of individual column accesses, each of which will experience a 5x or 10x delay.  Now, extend that to intensely random operations such as delta merge where, instead of a sub-second interruption to individual transaction response time, there is now a 5x or 10x increase on all related columnar memory writes.  I could continue with extrapolations to savepoints, batch and external interfaces, but you get the general idea.  At currently projected speeds, this sort of slowdown in transactional performance could result in project failure.

One other point that must not be overlooked is Intel’s claim about density.  While the initial press suggested up to 10x greater density, the DIMM specifications that are currently circulating show up to 512GB per DIMM, a significant increase from the 128GB DIMM max size today (4x not 10x).  But can HANA take advantage of that increased density? Prior to Skylake, SAP certified appliances with 8-sockets could only support 8TB of memory despite many having configuration maximums of 12TB.  SAP certifications are dependent on meeting performance KPIs and there has always been a pretty direct correlation between numbers of sockets, performance per socket and amount of memory supported.  In other words, it takes more and faster cores to support more memory.  So, is it reasonable to expect that SAP will discard those KPIs and accept 5x or 10x slower speeds while also jamming 4 times as much memory per socket as is currently supported?

This is not to say that persistent memory has no place in the HANA world.  There are many places in which a 5x or 10x memory penalty is worthwhile.  Consider the case of a non-prod instance, e.g. test. If it takes 5x or 10x longer, there is little impact to the business operations of most companies, just an increase in the cost of IT and applications professionals.  This may be offset against the cost of memory and, in some cases, the math may work.  How about HA or DR? No, that does not work as HA and DR must operate like production in the case of a failure or disaster.  Certainly aged data that might otherwise reside on disk would see a radical improvement from PMEM or a radically lower cost when compared to DRAM memory in  BW extension nodes.

Also, consider that aggressive research is occurring in this field and that future technologies may reduce the penalty to only 2x slower than DRAM.  Would that be close enough to make it worthwhile?

One final thought: The co-inventor of 3D XPoint memory is Micron. Earlier this year, Micron and Intel decided to go their separate ways with Micron using this technology in their QuantX solutions.[v]  Micron is a member of the OpenPower Consortium.  Is it possible that they could use this technology to build their own PMEM solutions for Power Systems?  If that happens, it would certainly be fascinating to see IBM harvesting the value of PMEM without the marketing and research investment that Intel has put into this.

[i]https://www-01.ibm.com/common/ssi/cgi-bin/ssialias?htmlfid=POS03155USEN

[ii]https://www.youtube.com/watch?v=O0JUCjd_t_0

[iii]https://www.pcper.com/news/Storage/Intel-Launches-Optane-DC-Persistent-Memory-DIMMs-Talks-20TB-QLC-SSDs

[iv]https://searchstorage.techtarget.com/feature/3D-XPoint-memory-stumbles-in-race-to-ditch-DRAM-RRAM-may-step-up

https://www.anandtech.com/bench/product/1967?vs=2067

[v]https://www.anandtech.com/show/12258/intel-and-micron-to-discontinue-flash-memory-partnership

https://www.micron.com/products/advanced-solutions/3d-xpoint-technology

June 22, 2018 Posted by | Uncategorized | 3 Comments

Power Systems – Delivering best of breed scalability for SAP HANA

SAP quietly revised a SAP Note last week but it certainly made a loud sound for some.  Version 47 of https://launchpad.support.sap.com/#/notes/2188482 now says that OLTP workloads, such as Suite on HANA or S/4HANA are now supported on IBM Power Systems up to 24TB.  OLAP workloads, like BW HANA may be implemented on IBM Power Systems with up to 16TB for a single scale-up instance.  As noted in https://launchpad.support.sap.com/#/notes/2055470, scale-out BW is supported with up to 16 nodes bringing the maximum supported BW environment to a whopping 256TB.

As impressive as those stats are, it should also be noted that SAP also provided new core-to-memory (CTM) guidance with the 24TB OLTP system sized at 176-cores which results in 140GB/core, up from the previous 113.7GB/core at 16TB.  The 16TB OLAP system, sized at 192-cores, translates to 85.3GB/core, up from the previous 50GB/core for 4-socket and above systems.

By comparison, the maximum supported sizes for Intel Skylake systems are 6TB for OLAP and 12TB for OLTP which correlates to 27.4GB/core OLAP and 54.9GB/core OLTP.  In other words, SAP has published numbers which suggest Power Systems can handle workloads that are 2.7x (OLAP) and 2x (OLTP) the size of the maximum supported Skylake systems.  On the CTM side, this works out to a maximum of 3.1x (OLAP) and 2.6x (OLTP) better performance per core for Power Systems over Skylake.
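For anyone who wants to check the arithmetic, here is a minimal sketch that recomputes the core-to-memory figures quoted above.  It assumes 1TB = 1024GB, and the Skylake core count of 224 (8 sockets of 28 cores) is my inference from the quoted GB/core values rather than something taken from the SAP notes.  The ratios land close to the figures in the text, with small differences depending on where you round.

```python
# Recompute the GB/core (CTM) figures from supported memory size and core count.
# Assumptions: 1 TB = 1024 GB; 224 Skylake cores (8 sockets x 28 cores) is inferred
# from the quoted GB/core values, not taken from an SAP note.

GB_PER_TB = 1024


def gb_per_core(memory_tb: float, cores: int) -> float:
    """Memory per core in GB for a system of the given size."""
    return memory_tb * GB_PER_TB / cores


power = {"OLTP": gb_per_core(24, 176), "OLAP": gb_per_core(16, 192)}
skylake = {"OLTP": gb_per_core(12, 224), "OLAP": gb_per_core(6, 224)}

for workload in ("OLAP", "OLTP"):
    ratio = power[workload] / skylake[workload]
    print(
        f"{workload}: Power {power[workload]:.1f} GB/core, "
        f"Skylake {skylake[workload]:.1f} GB/core, ratio {ratio:.2f}x"
    )

# Maximum supported size ratios (Power vs. Skylake)
print(f"Size ratio OLAP: {16 / 6:.1f}x, OLTP: {24 / 12:.1f}x")
```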

Full disclosure, these numbers do not represent the highest scaling Intel systems.  In order to find them, you must look at the previous generation of systems.  Some may consider them obsolete, but for customers that must scale beyond 6TB/12TB (OLAP/OLTP) and are unwilling or unable to consider Power Systems, an immediate sunk investment may be their only choice.  (Note to customers in this undesirable predicament, if you really want to get an independent, third party verification of potential obsolescence, ask your favorite leasing companies, not associated or owned by the vendor, what residual value they would assume after 1 year for these systems vs. what they would assume for similar Skylake systems after 1 year.)

The previous “generation” of HPE Superdome, “X”, which as discussed in my last blog post shares 0% technology with Skylake based HPE Superdome “Flex”, was supported up to 8TB/16TB (OLAP/OLTP) with 384 cores for both, resulting in CTM of 21.3GB/42.7GB/core.  The SGI derived HPE MC990 X, which is the real predecessor to the new “Flex” system, was supported up to 4TB/20TB, with 192 cores for OLAP and 480 cores for OLTP.

Strangely, “Flex” is only supported for HANA with 2 nodes or chassis where “MC990 X” was supported with up to 5 nodes.  It has been over 4 months since “Flex” was announced and at announcement date, HPE loudly proclaimed that “Flex” could support 48TB with 8 chassis/32 sockets https://news.hpe.com/hewlett-packard-enterprise-unveils-the-worlds-most-scalable-and-modular-in-memory-computing-platform/.  Since that time, some HPE reps have been telling customers that 32TB support with HANA was imminent.  One has to wonder what the hold up is.  First it took a couple of months just to get 128GB DIMM support. Now, it is taking even longer to get more than 2-node support for HANA.  If I were a potential HPE customer, I would be very curious and asking my rep about these delays (and I would have my BS detector set to high sensitivity).

Customers have now been presented with a stark contrast.  On one side, Power Systems has been on a roll; growing market share in HANA, regular increases in supported memory sizes, the ability to handle the largest single image HANA memory sizes of any vendor, outstanding mainframe derived reliability and radically better flexibility with built in virtualization and support for a maximum of 8 concurrent production HANA instances or 7 production with many dozens of non-prod HANA, application servers, non-HANA DBs and/or a wide variety of other applications supported in a shared pool, all at competitive price points.

On the other hand, Intel based HANA systems seem to be stuck in a rut with decreased maximum memory sizes (admittedly, this may be temporary), anemic increases in CTM, improved RAS but not yet to the league of Power Systems and a very questionable VMware based virtualization support filled with caveats, limitations, overhead and poor, at best, sharing of resources.

March 28, 2018 Posted by | Uncategorized | Leave a comment

HPE Superdome is dead, but HPE marketing continues its deceptive ways.

Today, 11/6/17, HPE announced the “New” Superdome Flex.  If you did not look too closely, you would think that this was some sort of descendant of Superdome.  After all, the Integrity Superdome took the original Superdome and replaced PA-RISC chips and the SX1000 cell controller with Itanium chips and a faster SX2000 cell controller.  Superdome 2 took this further by upgrading to the latest Itanium chips, an even faster SX3000 cell controller and moved from a cell board to a blade configuration.  Superdome X changed out the Itanium chips for Intel Xeon chips which it upgraded over several generations.  So, it would be only natural to think that Superdome Flex did something similar and that is exactly what HPE wants you to think.

Except, this is not even remotely like any prior Superdome and has inherited almost nothing from it.  In fact, this is a very straightforward descendant of the SGI UV 300H, which HPE renamed the MC990 X after the acquisition.  A glance at the front of the “new” system shows the same basic design, a 4-socket, 5U chassis even down to the unique diagonal handles on the fans, but they apparently moved the NUMAlink fabric ports (no longer called that; renamed Superdome Flex ports) from the back to the front, perhaps to get rid of a little of the rat’s nest of cables which defined the SGI UV 300H.  This means there is no SX3000 or cross bar switch in the Flex and the blade design is gone.  Even the memory DIMMs are different which implies that nothing could be moved from an old Superdome X to a new “Flex” other than perhaps some old PCIe adapters.

So, if the entire design is based on an SGI acquired technology and it shares nothing with its “namesake”, one would need to have avoided that course in ethics in high school or college to find it appropriate to suggest to customers that this is a related technology.  Imagine if Honda changed the engine, frame, transmission, trim and body style but called their new car an Accord “Flex” because it used the same bumper and tire sizes, would you feel as if they were trying to manipulate you?

Back to the more important topic, Superdome is now dead!  I have been saying this for a while and blogged about this several months ago.  I suggested that any customer considering investing in this technology view it as instantly obsolete and a sunk investment.  I pointed out the huge investment in ccNUMA interconnect technologies and how it was hard to imagine how HPE could afford to invest in 2 different ones at the same time, so only one system was likely to survive.  I explained that the SGI technology offered more space and power to host the new, larger, higher wattage and heat dissipating Skylake processors.  It appears that my projections were correct.  For customers that ignored that advice, I just hope you got a really great price and don’t mind paying a lot for old technology for any upgrades or dumping your old systems at a huge financial loss.  For any customer still considering a Superdome X, the writing is no longer on the wall.  It is on HPE’s web site.  https://news.hpe.com/hewlett-packard-enterprise-unveils-the-worlds-most-scalable-and-modular-in-memory-computing-platform/

Currently, no white papers have been published showing the architecture and detailed specs of this “new” system, only a relatively high level “Spec” sheet.  Perhaps HPE is too embarrassed to publish this since it would likely resemble the SGI UV 300H in way too many ways, including the old rat’s nest of 4-bit wide interconnect cables.  Once they do, I will investigate and will likely publish a separate post to share what I find.

On the SAP front, new HANA appliance specs have been published for “Flex”.   It is interesting, and again embarrassing for HPE, that only up to 8-socket configs are shown, with less BWoH memory support @ 6TB max than the old, and now obsolete, Superdome X.  Even more interesting is the lack of SoH and S/4 configs, and I have a suspicion as to why.  Turns out that the spec sheet does have one interesting point after all.  It shows the maximum size memory DIMMs are 64GB and the number of DIMM slots is 48 with a max supported memory of 3TB per chassis, i.e. half of what is necessary to support the 6TB per 4-sockets that other competing Intel vendors support.
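The per-chassis limit follows directly from the DIMM arithmetic.  A minimal sketch, assuming 1TB = 1024GB; the 128GB row is a what-if for comparison only, not something shown on the spec sheet:

```python
# Memory ceiling of a 4-socket chassis from DIMM slot count and DIMM size.
# Assumption: 1 TB = 1024 GB. The 128 GB row is a hypothetical comparison,
# not a configuration listed on the spec sheet.


def chassis_memory_tb(dimm_slots: int, dimm_gb: int) -> float:
    """Maximum memory per chassis in TB."""
    return dimm_slots * dimm_gb / 1024


REQUIRED_TB_PER_4_SOCKETS = 6  # what competing Intel vendors support for SoH/S4

for dimm_gb in (64, 128):
    capacity = chassis_memory_tb(48, dimm_gb)
    status = "meets" if capacity >= REQUIRED_TB_PER_4_SOCKETS else "falls short of"
    print(f"48 x {dimm_gb}GB = {capacity:.0f}TB, {status} the 6TB per 4-socket mark")
```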

So, if you need a supported HANA configuration today with current generation processors for BWoH beyond 6TB, look at any vendor other than HPE with 8-socket Skylake systems or IBM Power Systems.  If you need a supported SoH or S/4 configuration with current gen processors, look at any vendor other than HPE and beyond 12TB, only IBM Power Systems is supported at this level.

November 6, 2017 Posted by | Uncategorized | Leave a comment

Support for HANA on Power with RedHat is finally here!!

SAP made a very exciting announcement this past Friday.  While it took a bit longer than expected, support for RHEL 7.3 with HANA on Power was announced by SAP in their usual overwhelming manner, i.e. they updated a SAP note:   SAP HANA 2235581: Supported Operating Systems.  RHEL is only supported on Power in Little Endian mode, i.e. only works with HANA 2.0.  This support is incredibly important for customers that have established RHEL as their standard for Linux and were either reluctant to introduce a different Linux distribution or were outright forbidden to by their corporate standards.  Taken together with the TDI Phase 5 SAPS based sizing announcement, yet another element that was inhibiting the already explosive growth of HANA on Power was removed.  I described that announcement as allowing the use of 5th gear after being limited to only 4.  Taking this metaphor a step further, RHEL support is like disengaging the parking brake.  I should mention that IBM does not develop a Linux variant of their own nor do they endorse any particular variety.  As such, I will not suggest any advantage of running RHEL or SLES for HANA and recommend, if a company has no firm policy either way, that you ask each distro partner to explain why they think theirs is better for HANA than the other.  At this time, both are supported only with PowerVM on Power Systems and with the exact same limits and multi-tenant/multi-VM flexibility.

October 9, 2017 Posted by | Uncategorized | 4 Comments

TDI Phase 5 – SAPS based sizing bringing better TCO to new and existing Power Systems customers

SAP made a fundamental and incredibly important announcement this week at SAP TechEd in Las Vegas: TDI Phase 5 – SAPS based sizing for HANA workloads.  Since its debut, HANA has been sized based on a strict memory to core ratio determined by SAP based on workloads and platform characteristics, e.g. generation of processor, MHz, interconnect technology, etc.  This might have made some sense in the early days when much was not known about the loads that customers were likely to experience and SAP still had high hopes for enabling all customer employees to become knowledge workers with direct access to analytics.  Over time, with very rare exception, it turned out that CPU loads were far lower than the ratios might have predicted.

I have only run into one customer in the past two years that was able to drive a high utilization of their HANA systems and that was a customer running an x86 BW implementation with an impressively high number of concurrent users at one point in their month.  Most customers have experienced just the opposite, consistently low utilization regardless of technology.

For many customers, especially those running x86 systems, this has not been an issue.  First, it is not a significant departure from what many have experienced for years, even those running VMware.  Second, to compensate for relatively low memory and socket-to-socket bandwidth combined with high latency interconnects, many x86 systems work best with an excess of CPU.  Third, many x86 vendors have focused on HANA appliances which are rarely utilized with virtualization and are therefore often single instance systems.

IBM Power Systems customers, by comparison, have been almost universal in their concern about poor utilization.  These customers have historically driven high utilization, often over 65%.  Power has up to 5 times the memory bandwidth per socket of x86 systems (without compromising reliability) and very wide and parallel interconnect paths with very low latencies.  HANA has never been offered as an appliance on Power Systems, instead being offered only using a Tailored Datacenter Infrastructure (TDI) approach.  As a result, customers view on-premise Power Systems as a sort of utility, i.e. that they should be able to use them as they see fit and drive as much workload through them as possible while maintaining the Service Level Agreements (SLA) that their end users require.  The idea of running a system at 5%, or even 25%, utilization is almost an affront to these customers, but that is what they have experienced with the memory to core restrictions previously in place.

IBM’s virtualization solution, PowerVM, enabled SAP customers to run multiple production workloads (up to 8 on the largest systems) or a mix of production workloads (up to 7) with a shared pool of CPU resources within which an almost unlimited mix of VMs could run including non-prod HANA, application servers, as well as non-SAP and even other OS workloads, e.g. AIX and IBM i.  In this mixed mode, some of the excess CPU resource not used by the production workloads could be utilized by the shared-pool workloads.  This helped drive up utilization somewhat, but not enough for many.

These customers would like to do what they have historically done.  They would like to negotiate response time agreements with their end user departments then size their systems to meet those agreements and resize if they need more capacity or end up with too much capacity.

The newly released TDI Overview document http://bit.ly/2fLRFPb describes the new methodology: “SAP HANA quicksizer and SAP HANA sizing reports have been enhanced to provide separate CPU and RAM sizing results in SAPS”.  I was able to verify Quicksizer showing SAPS, but not the sizing reports.  An SAP expert I ran into at TechEd suggested that getting the sizing reports to determine SAPS would be a tall order since they would have to include a database of SAPS capacity for every system on the market as well as number of cores and MHz for each one.  (In a separate blog post, I will share how IBM can help customers to calculate utilized SAPS on existing systems).  Customers are instructed to work with their hardware partner to determine the number of cores required based on the SAPS projected above.  The document goes on to state: “The resulting HANA TDI configurations will extend the choice of HANA system sizes; and customers with less CPU intensive workloads may have bigger main memory capacity compared to SAP HANA appliance based solutions using fixed core to memory sizing approach (that’s more geared towards delivery of optimal performance for any type of a workload).”

Using a SAPS based methodology will be a good start and may result in fewer cores required for the same workload than would have been previously calculated based on a memory/core ratio.  Customers that wish to allocate more or less CPU to those workloads will now have this option, meaning that an even more significant reduction of CPU may be possible.  This will likely result in much more efficient use of CPU resources, more capacity available to other workloads and/or the ability to size systems with fewer resources to drive down the cost of those systems.  Either way helps drive much better TCO by reducing numbers and sizes of systems with the associated datacenter and personnel costs.

Existing Power customers will undoubtedly be delighted by this news.  Those customers will be able to start experimenting with different core allocations and most will find they are able to decrease their current HANA VM sizes substantially.  With the resources no longer required to support production, other workloads currently implemented on external systems may be consolidated to the newly, right sized, system.  Application servers, central services, Hadoop, HPC, AI, etc. are candidates to be consolidated in this way.

Here is a very simple example:  A hypothetical customer has two production workloads, BW/4HANA and S/4HANA which require 4TB and 3TB respectively.  For each, HA is required as is Dev/Test, Sandbox and QA.  Prior to TDI Phase 5, using Power Systems, the 4TB BW system would require roughly 82-cores due to the 50GB/core ratio and the S/4 workload would require roughly 33 cores due to the 96GB/core ratio.  Including HA and non-prod, the systems might look something like:

TDI Phase 4

Note the relatively small number of cores available in the shared pool (might be less than optimal) and the total number of cores in the system. Some customers may have elected to increase to an even larger system or utilize additional systems as a result.  As this stood, this was already a pretty compelling TCO and consolidation story to customers.

With SAPS based sizing, the BW workload may require only 70 cores and S/4 21 cores (both are guesses based on early sizing examples and proper analysis of the SAP sizing reports and per core SAPS ratings of servers is required to determine actual core requirements).  The resulting architecture could look like:

TDI Phase 5 est

Note the smaller core count in each system.  By switching to this methodology, lower cost CPU sockets may be employed and processor activation costs decreased by 24 cores per system.  But the number of cores in the shared pool remains the same, so still could be improved a bit.
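The 24-core figure is simply the difference between the two sizing approaches.  A minimal sketch using the core counts from this example; remember that the Phase 5 numbers are my guesses, and the implied GB/core values assume 1TB = 1024GB:

```python
# Compare production core counts under ratio-based (Phase 4) and SAPS-based
# (Phase 5, estimated) sizing for the hypothetical customer in this example.
# Assumption: 1 TB = 1024 GB; the Phase 5 core counts are estimates, not SAP figures.

memory_gb = {"BW/4HANA": 4 * 1024, "S/4HANA": 3 * 1024}
phase4_cores = {"BW/4HANA": 82, "S/4HANA": 33}
phase5_cores = {"BW/4HANA": 70, "S/4HANA": 21}

cores_saved = sum(phase4_cores.values()) - sum(phase5_cores.values())
print(f"Cores saved per system: {cores_saved}")

# Memory per core that each workload effectively runs at under the estimate
for workload, cores in phase5_cores.items():
    print(f"{workload}: {memory_gb[workload] / cores:.1f} GB/core with {cores} cores")
```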

During a landscape session at SAP TechEd in Las Vegas, an SAP expert stated that customers will be responsible for performance and CPU allocation will not be enforced by SAP through HWCCT as had been the case in the past.  This means that customers will be able to determine the number of cores to allocate to their various instances.  It is conceivable that some customers will find that instead of the 70 cores in the above example, 60, 50 or fewer cores may be required for BW with decreased requirements for S/4HANA as well.  Using this approach, a customer choosing this more hypothetical approach might see the following:

TDI Phase 5 hyp

Note how the number of cores in the shared pool has increased substantially, allowing for more workloads to be consolidated to these systems, further decreasing costs by eliminating those external systems as well as being able to consolidate more SAN and Network cards, decreasing computer room space and reducing energy/cooling requirements.

A reasonable question is whether these same savings would accrue to an x86 implementation.  The answer is not necessarily.  Yes, fewer cores would also be required, but to take advantage of a similar type of consolidation, VMware must be employed.  And if VMware is used, then a host of caveats must be taken into consideration.  1) Overhead, reportedly 12% or more, must be added to the capacity requirements.  2) I/O throughput must be tested to ensure load times, log writes, savepoints, snapshots and backup speeds which are acceptable to the business.  3) Limits must be understood, e.g. max memory in a VM is 4TB which means that BW cannot grow by even 1KB. 4) Socket isolation is required as SAP does not permit the sharing of a socket in a HANA production/VMware environment meaning that reducing core requirements may not result in fewer sockets, i.e. this may not eliminate underutilized cores in an Intel/VMware system.  5) Non-prod workloads can’t take advantage of capacity not used by production for several reasons, not the least of which is that SAP does not permit sharing of sockets between VM prod and non-prod instances, not to mention the reluctance of many customers to mix prod and non-prod using a software hypervisor such as VMware even if SAP permitted this.  Bottom line is that most customers, through an abundance of caution, or actual experience with VMware, choose to place production on bare-metal and non-prod, which does not require the same stack as prod, on VMware.  Workloads which do require the same stack as prod, e.g. QA, also are usually placed on bare-metal.  After closer evaluation, this means that TDI Phase 5 will have limited benefits to x86 customers.
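To see how caveats 1 and 4 interact, here is a rough sketch of the socket math.  It uses the 12% overhead mentioned above, but the 28 cores per socket is purely my assumption for illustration (a top-bin Skylake part), and the function name is hypothetical.  Integer arithmetic is used for the rounding so that floating point noise cannot push a result up by a core.

```python
# Rough sketch: cores needed under VMware with overhead, rounded up to whole
# dedicated sockets. Assumptions (not from SAP or VMware documentation):
# 12% overhead as quoted above, 28 cores per socket, whole-socket granularity.


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


def vmware_footprint(cores_needed: int, overhead_pct: int = 12,
                     cores_per_socket: int = 28) -> tuple[int, int, int]:
    """Return (cores incl. overhead, sockets required, cores left stranded)."""
    with_overhead = ceil_div(cores_needed * (100 + overhead_pct), 100)
    sockets = ceil_div(with_overhead, cores_per_socket)
    stranded = sockets * cores_per_socket - with_overhead
    return with_overhead, sockets, stranded


for cores in (82, 70, 60):
    needed, sockets, stranded = vmware_footprint(cores)
    print(f"{cores} cores -> {needed} with overhead, "
          f"{sockets} sockets, {stranded} cores stranded")
```

Under these assumptions, dropping the BW requirement from 70 to 60 cores leaves the socket count unchanged and simply strands more cores, which is the point of caveat 4.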

This announcement is the equivalent of finally being allowed to use 5th gear on your car after having been limited to only 4 for a long time.  HANA on IBM Power Systems already had the fastest adoption in recent SAP history with roughly 950 customers selecting HANA on Power in just 2 years. TDI Phase 5 uniquely benefits Power Systems customers which will continue the acceleration of HANA on Power.  Those individuals that recommended or made decisions to select HANA on Power will look like geniuses to their CFOs as they will now get the equivalent of new systems capacity at no cost.

September 29, 2017 Posted by | Uncategorized | 3 Comments

HPE, still playing fast and loose with the facts about SAP HANA on Power

Writing a blog post would be so much simpler if IBM permitted me to lie, but that is prohibited.  That is clearly not the case at HPE, see this recent blog post: https://community.hpe.com/t5/Alliances/SAP-HANA-runs-best-on-x86-Period/ba-p/6971659#.WZxDPa01TxW

It contains so many lies, it is hard to know where to start.   Let’s start with the biggest one.  There is a 10x difference in performance KPIs required by SAP to certify and ship a HANA appliance vs. a solution certified for TDI only.

You really have to love those lies that are refuted by such easily obtained facts from documentation that is apparently not used by HPE called SAP Notes.  SAP note 1943937 specifically states: “All HWCCT tests of appliances (compute servers) certified with scenario HANA-HWC-AP SU 1.1 or HANA-HWC-AP RH 1.1 must use HWCCT of SAP HANA SPS10 or higher or a related SAP HANA revision”  Interesting that appliances must use the same HWCCT test of SAP KPIs as used by TDI.  So, based on HPE’s blog post, does this mean that if an HPE appliance compute server is used for TDI, it will perform 10x worse than if it is used in an appliance? That would imply that the secret sauce of HPE’s appliances is so incredible that it acts like a dual turbocharger on a car!

The blog post goes on to say “It (Power) ‘works’ … but it is just held to a ~10x lower standard without any of the performance optimizations attributed to SAP’s co-innovation efforts with Intel.”  OK, two lies in one sentence, obviously going for the gold here.  As we have already discussed the 10x lie, let us just look at the second one, i.e. performance optimizations.  Specifically, the blog calls out AVX and TSX as the performance optimizations for the Intel platform.  They are correct, those optimizations don’t work on Power as, instead of AVX (Advanced Vector Extensions), Power has two fully symmetric vector pipelines called via VSX (Vector-Scalar eXtensions) instructions which HANA has been “optimized” to use in the same manner as AVX.  And TSX, a.k.a. Transactional Synchronization Extensions, came out after POWER8 Transactional Memory, but HANA was optimized for both at the same time.

The blog post also stated “Intel E7v3 CPUs for HANA (TSX and AVX) that offer a 5x performance boost over older Intel E7 or Power 8 CPUs.”  Awesome, but where is the proof behind this statement?  Perhaps a benchmark?  Nope, not one published on SAP’s site; not even the old SD benchmark backs up this claim (it shows, by the way, almost the same SAPS/core for E7v3 vs. E7v2 and way less than POWER8), but maybe that benchmark does not use those optimizations?  Ok, then maybe the sizing certifications show this?  Nope.  A 4 socket CS500 Ivy Bridge system (E7v2) is published as supporting up to a 2TB SoH solution where a 4 socket CS500 Haswell system (E7v3) is shown as supporting up to a 3TB SoH solution.  So far, that is just 50% more, not 5x, but perhaps HPE can’t tell the difference between 0.5 and 5.0?  But didn’t Haswell have more cores per socket?  Yes, it had 18 cores/socket vs. 15 cores/socket for Ivy Bridge, i.e. 20% more cores/socket.  So, E7v3 based systems could actually host 25% more memory and associated workload than E7v2 based systems per core.  Of course, I am sure that the switch from DDR3 to DDR4 from E7v2 to E7v3 had nothing to do with this performance improvement.  So, 5x performance boost is clearly nothing but a big fat lie.
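For the skeptics, the percentages above can be reproduced in a few lines.  This is just a sketch of the arithmetic using the certified sizes and core counts quoted in this post; the helper name is mine.

```python
# Compare certified SoH memory for 4-socket CS500 systems, E7v2 vs. E7v3,
# in total and per core. All inputs are the figures quoted in the post.
import math

SOCKETS = 4


def relative_gain(new: float, old: float) -> float:
    """Fractional improvement of new over old (0.5 means 50% more)."""
    return new / old - 1


memory_gain = relative_gain(3, 2)    # 3TB vs. 2TB supported SoH size
core_gain = relative_gain(18, 15)    # cores per socket, Haswell vs. Ivy Bridge

tb_per_core_v2 = 2 / (SOCKETS * 15)
tb_per_core_v3 = 3 / (SOCKETS * 18)
per_core_gain = relative_gain(tb_per_core_v3, tb_per_core_v2)

print(f"Supported memory: {memory_gain:.0%} more")
print(f"Cores per socket: {core_gain:.0%} more")
print(f"Memory per core:  {per_core_gain:.0%} more (a 5x boost would be 400% more)")
```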

And the hits just keep coming.  The next statement is just lovely “Naturally, this only matters if you want to be able to call SAP support to get help on nuance performance issues impacting your productive SAP HANA deployment.”  I guess he is trying to suggest that SAP won’t help you with performance problems if you are running any TDI solution including Power Systems, except this is contradicted by all of the SAP notes about TDI and HWCCT, not to mention the experience of customers who have implemented HANA on Power.

“IBM wants to be the king of legacy businesses like mainframe and UNIX. That’s pretty much the only platforms they have left. So now that they can state that HANA “works” on Power, they can make a case to their AIX/Power customers that they should stay on AIX/Power for SAP and HANA and avoid what IBM claims to be an ‘oh so painful’ Unix to x86 migration.”  HPE, in suggesting that you can run HANA on AIX/Power when HANA only runs on Linux/Power, might not be telling a lie but might just be expressing ignorance and the inability to use sophisticated and obscure search tools like Google.   The suggestion that IBM has said that a SAP heterogeneous migration is “oh so painful” ignores the fact that we have done hundreds of such migrations from HP/UX among others.  Perhaps the author is reflecting HPE’s migration experience with what every other migration provider sees as a very well understood and fully supported SAP process.  As to IBM’s motivation, HPE is trying to suggest that IBM is only in the HANA business to support its legacy SAP on AIX/Power.  Just looking at the thriving business of HANA on Power, over 850 HANA on Power wins since becoming a supported HANA provider, might suggest otherwise.  How about the complete absence of any quotes, marketing materials or other documentation that shows that IBM is in this market for any other reason than it is the future of SAP and IBM intends to remain a premier partner of SAP and our customers?

Taken together, this blog post shows that HPE must be using the advanced Skylake processors in their Superdome-X and MC990 X (SGI UV 300H) to generate lies at an astounding pace.  Oh wait, I forgot (not really) that HPE still has not announced support for Skylake in their high end systems and is only certified for BWoH, not SoH or S/4HANA, with Skylake and only up to 3TB at that, … one month after announcement of Skylake!  Wow, I guess this shows simultaneously how much their pace of technology innovation has slowed down and partnership with SAP has decreased!

And, let’s end on their last amazing sentence, “Every day you put off that UNIX to x86 migration, you are running HANA in a performance degraded mode with production support limitations.”  Yes, HPE seems to be recommending that you migrate your UNIX based systems, i.e. those running Business Suite 7, to x86.  In other words, do two migrations, once from UNIX to x86 and a second to HANA.  Sounds like a total disconnect from business reality!  As to the second part of that sentence, it simply does not make sense, so we are not going to attribute that to a lie, but to a simple logic error.

What are we to conclude about this blog post?  Taken on its own, it is the work of a rogue employee.  Taken with the deluge of other similarly misleading statements and outright lies emerging from HPE, we see a trend.  HPE has gone from being a well-respected systems supplier to a struggling company that promotes and condones lies and hires less than competent individuals to propagate misinformation.  Sounds more like a failing communist state than a company worthy of a customer’s trust.

August 22, 2017 Posted by | Uncategorized | 4 Comments

Intel Skylake has been announced and the self-described HANA “market leader”, HPE, is curiously trailing the field

Intel announced general availability of their “Skylake” processor on the “Purley” platform last week.  Soon after, SAP posted certified HANA configurations for Lenovo and Fujitsu up to 8 sockets and 12TB memory for Suite on HANA (SoH) and S/4HANA (S4) and 6TB for BW on HANA (BWoH).  They also posted certified configurations for Dell and Cisco up to 4-socket systems with 6TB SoH/S4 and 3TB BWoH.  The certified configurations posted for HPE, which describes itself as the HANA market leader, only included up to 4-socket/3TB BWoH configurations, no configurations for SoH/S4 and nothing for any larger systems.

It is still early and more certified configurations will no doubt emerge over time, but these early results do beg the question, “what is going on with HPE?”  I checked the most recent press releases for HPE and they did not even mention the Skylake debut much less their certification with SAP HANA.  If you Google using the keywords, HPE, Skylake and HANA, you may find a few discussions about HPE’s acquisition of SGI and my previous blog posts with my speculation about Superdome’s demise and HPE’s misleading of customers about this impending event, but nothing from HPE.

So, I will share a little more speculation as to what this slow start for HPE in the Skylake space might portend.

Option 1 – HPE is not investing the funds necessary to certify all of their possible configurations and SoH/S4.  Anyone that has been involved with the HANA certification process will tell you that it is very time consuming and expensive.  As you can see from HPE’s primary Intel based competitors, they are all very eager to increase their market share and acted quickly.  Is HPE becoming complacent?  Are they having financial restrictions that have not been made public?

Option 2 – HPE’s technology limitations are becoming apparent.  The Converged System 500 is based on ProLiant DL560/580 systems which support a maximum of 4 sockets.  These systems utilize Intel QPI and now UPI interconnect technologies, i.e. no custom ASICs or ccNUMA switches are required.  The CS900 based on the Superdome X and the MC990 X (SGI UV 300H) utilize custom ASICs and, in the case of Superdome X, a set of ccNUMA switches.  As I speculated previously, Superdome X is probably at end of life, so it may never see another certification on SAP’s HANA site.  As to the MC990 X, the crystal ball is a bit more hazy.  Perhaps HPE is trying to shoot for the moon and hit a number beyond the 20TB for SoH/S4 that is currently supported, meaning a much longer and more complex set of certification tests.  Or perhaps they are running into technical challenges with the new ASICs required to support UPI.

Option 3 – MC990 X is going to officially become HPE’s only high end offering to support Skylake and subsequent processors and Superdome X is going to be announced at end of life.  If this were to happen, it would mean that anyone that had recently purchased such a system would have purchased a system that is immediately obsolete.

If Option 1 turns out to be true, one would have to be concerned about HPE’s future in the HANA space.  If Option 2 turns out to be true, one would have to be really concerned about HPE’s future in the HANA space.  And if Option 3 turns out to be true, why would HPE be waiting?  The answer may be inventory.  If HPE has a substantial inventory of “old” Broadwell based blades and Superdome X chassis, they will undoubtedly want to unload these at the highest price possible and they know that the value of obsolete systems after such an announcement would drop below the cost of manufacturing.

So, you pick the most likely scenario.  Worst case for HPE is that they are just a little slow or shooting too high.  Worst case for customers is that they purchase a HANA system based on Superdome X and end up with a few hundred thousand dollar boat anchor.  If you work for a company considering the purchase of an HPE Superdome X solution, you may want to ask about its future and, if you find it is at end of life, select another solution for your SAP HANA requirements.

Inevitably, more systems will be published on SAP’s certification page, https://www.sap.com/dmc/exp/2014-09-02-hana-hardware/enEN/appliances.html#viewcount=100&categories=certified%2CIntel%20Skylake%20SP .  When that happens, especially if any of my predictions turn out to be true or if they are all wrong and another scenario emerges, I will post an update.

July 20, 2017 Posted by | Uncategorized | Leave a comment