SAPonPower

An ongoing discussion about SAP infrastructure

Support for HANA on Power with Red Hat is finally here!!

SAP made a very exciting announcement this past Friday.  While it took a bit longer than expected, support for RHEL 7.3 with HANA on Power was announced by SAP in their usual overwhelming manner, i.e. they updated an SAP note: SAP Note 2235581 – SAP HANA: Supported Operating Systems.  RHEL is only supported on Power in Little Endian mode, i.e. it only works with HANA 2.0.  This support is incredibly important for customers that have established RHEL as their standard for Linux and were either reluctant to introduce a different Linux distribution or were outright forbidden to by their corporate standards.  Taken together with the TDI Phase 5 SAPS based sizing announcement, yet another element that was inhibiting the already explosive growth of HANA on Power has been removed.  I described that announcement as allowing the use of 5th gear after being limited to only 4.  Taking this metaphor a step further, RHEL support is like disengaging the parking brake.  I should mention that IBM does not develop a Linux variant of its own, nor does it endorse any particular one.  As such, I will not suggest any advantage of running RHEL or SLES for HANA and recommend, if a company has no firm policy either way, that you ask each distro partner to explain why they think theirs is better for HANA than the other.  At this time, both are supported only with PowerVM on Power Systems and with the exact same limits and multi-tenant/multi-VM flexibility.


October 9, 2017 | Uncategorized | 3 Comments

TDI Phase 5 – SAPS based sizing bringing better TCO to new and existing Power Systems customers

SAP made a fundamental and incredibly important announcement this week at SAP TechEd in Las Vegas: TDI Phase 5 – SAPS based sizing for HANA workloads.  Since its debut, HANA has been sized based on a strict memory to core ratio determined by SAP based on workloads and platform characteristics, e.g. generation of processor, MHz, interconnect technology, etc.  This might have made some sense in the early days when much was not known about the loads that customers were likely to experience and SAP still had high hopes for enabling all customer employees to become knowledge workers with direct access to analytics.  Over time, with very rare exception, it turned out that CPU loads were far lower than the ratios might have predicted.

I have only run into one customer in the past two years that was able to drive high utilization of their HANA systems, and that was a customer running an x86 BW implementation with an impressively high number of concurrent users at one peak point in their monthly cycle.  Most customers have experienced just the opposite: consistently low utilization regardless of technology.

For many customers, especially those running x86 systems, this has not been an issue.  First, it is not a significant departure from what many have experienced for years, even those running VMware.  Second, to compensate for relatively low memory and socket-to-socket bandwidth combined with high latency interconnects, many x86 systems work best with an excess of CPU.  Third, many x86 vendors have focused on HANA appliances which are rarely utilized with virtualization and are therefore often single instance systems.

IBM Power Systems customers, by comparison, have been almost universal in their concern about poor utilization.  These customers have historically driven high utilization, often over 65%.  Power has up to 5 times the memory bandwidth per socket of x86 systems (without compromising reliability) and very wide and parallel interconnect paths with very low latencies.  HANA has never been offered as an appliance on Power Systems, instead being offered only using a Tailored Datacenter Infrastructure (TDI) approach.  As a result, customers view on-premise Power Systems as a sort of utility, i.e. that they should be able to use them as they see fit and drive as much workload through them as possible while maintaining the Service Level Agreements (SLA) that their end users require.  The idea of running a system at 5%, or even 25%, utilization is almost an affront to these customers, but that is what they have experienced with the memory to core restrictions previously in place.

IBM’s virtualization solution, PowerVM, enabled SAP customers to run multiple production workloads (up to 8 on the largest systems) or a mix of up to 7 production workloads plus a shared pool of CPU resources, within which an almost unlimited mix of VMs could run, including non-prod HANA, application servers, non-SAP workloads and even other operating systems, e.g. AIX and IBM i.  In this mixed mode, some of the excess CPU resource not used by the production workloads could be utilized by the shared-pool workloads.  This helped drive up utilization somewhat, but not enough for many.

These customers would like to do what they have historically done.  They would like to negotiate response time agreements with their end user departments then size their systems to meet those agreements and resize if they need more capacity or end up with too much capacity.

The newly released TDI Overview document (http://bit.ly/2fLRFPb) describes the new methodology: “SAP HANA quicksizer and SAP HANA sizing reports have been enhanced to provide separate CPU and RAM sizing results in SAPS”.  I was able to verify Quicksizer showing SAPS, but not the sizing reports.  An SAP expert I ran into at TechEd suggested that getting the sizing reports to determine SAPS would be a tall order since they would have to include a database of SAPS capacity for every system on the market as well as the number of cores and MHz of each one.  (In a separate blog post, I will share how IBM can help customers calculate utilized SAPS on existing systems.)  Customers are instructed to work with their hardware partner to determine the number of cores required based on the SAPS projected above.  The document goes on to state: “The resulting HANA TDI configurations will extend the choice of HANA system sizes; and customers with less CPU intensive workloads may have bigger main memory capacity compared to SAP HANA appliance based solutions using fixed core to memory sizing approach (that’s more geared towards delivery of optimal performance for any type of a workload).”

Using a SAPS based methodology will be a good start and may result in fewer cores being required for a given workload than would previously have been calculated from a memory/core ratio.  Customers that wish to allocate more or less CPU to those workloads will now have this option, meaning that an even more significant reduction of CPU may be possible.  This will likely result in much more efficient use of CPU resources, more capacity available to other workloads and/or the ability to size systems with fewer resources to drive down the cost of those systems.  Either way helps drive much better TCO by reducing the number and size of systems along with the associated datacenter and personnel costs.
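To make the difference between the two approaches concrete, here is a minimal sketch. The SAPS demand and per-core rating are hypothetical placeholders, not SAP or IBM published figures; the 50GB/core ratio and 4TB size are taken from the example later in this post.

```python
import math

def cores_from_saps(required_saps, saps_per_core, headroom=0.0):
    """TDI Phase 5 style: convert the CPU requirement (in SAPS) reported by
    Quick Sizer / the sizing reports into a core count for a given server model."""
    return math.ceil(required_saps / (saps_per_core * (1.0 - headroom)))

def cores_from_memory_ratio(memory_gb, gb_per_core):
    """Pre-Phase 5 style: fixed memory-to-core ratio, independent of actual CPU load."""
    return math.ceil(memory_gb / gb_per_core)

# Hypothetical workload: the sizing tools report 120,000 SAPS of CPU demand,
# and the chosen server model is (hypothetically) rated at 3,000 SAPS per core.
print(cores_from_saps(120_000, 3_000, headroom=0.2))   # 50 cores with a 20% utilization buffer
print(cores_from_memory_ratio(4 * 1024, 50))           # 82 cores under the old 50 GB/core BW ratio
```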

Existing Power customers will undoubtedly be delighted by this news.  Those customers will be able to start experimenting with different core allocations and most will find they are able to decrease their current HANA VM sizes substantially.  With the resources no longer required to support production, other workloads currently implemented on external systems may be consolidated onto the newly right-sized system.  Application servers, central services, Hadoop, HPC, AI, etc. are candidates to be consolidated in this way.

Here is a very simple example:  A hypothetical customer has two production workloads, BW/4HANA and S/4HANA, which require 4TB and 3TB of memory respectively.  For each, HA is required, as are Dev/Test, Sandbox and QA.  Prior to TDI Phase 5, using Power Systems, the 4TB BW system would require roughly 82 cores due to the 50GB/core ratio and the S/4 workload would require roughly 33 cores due to the 96GB/core ratio.  Including HA and non-prod, the systems might look something like:

TDI Phase 4

Note the relatively small number of cores available in the shared pool (might be less than optimal) and the total number of cores in the system. Some customers may have elected to increase to an even larger system or utilize additional systems as a result.  As this stood, this was already a pretty compelling TCO and consolidation story to customers.

With SAPS based sizing, the BW workload may require only 70 cores and S/4 only 21 cores (both are estimates based on early sizing examples; proper analysis of the SAP sizing reports and the per-core SAPS ratings of the servers is required to determine actual core requirements).  The resulting architecture could look like:

TDI Phase 5 est

Note the smaller core count in each system.  By switching to this methodology, lower cost CPU sockets may be employed and processor activation costs decreased by 24 cores per system.  But the number of cores in the shared pool remains the same, so there is still room for improvement.
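For readers who want to check the arithmetic, this short sketch reproduces the core counts used in the example above; the Phase 5 numbers are the same rough guesses quoted earlier, not the output of an actual sizing.

```python
import math

TB = 1024  # GB

# TDI Phase 4: fixed memory-to-core ratios
bw_phase4 = math.ceil(4 * TB / 50)   # 4TB BW/4HANA at 50GB/core -> 82 cores
s4_phase4 = math.ceil(3 * TB / 96)   # 3TB S/4HANA at 96GB/core -> 32 cores (the post rounds to ~33)

# TDI Phase 5: the early SAPS-based estimates quoted above (guesses, not real sizings)
bw_phase5, s4_phase5 = 70, 21

print(bw_phase4, s4_phase4)                               # 82 32
print((bw_phase4 + s4_phase4) - (bw_phase5 + s4_phase5))  # ~23 fewer activated cores (~24 using the post's 33)
```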

During a landscape session at SAP TechEd in Las Vegas, an SAP expert stated that customers will be responsible for performance and that CPU allocation will not be enforced by SAP through HWCCT as had been the case in the past.  This means that customers will be able to determine the number of cores to allocate to their various instances.  It is conceivable that some customers will find that instead of the 70 cores in the above example, 60, 50 or even fewer cores may be required for BW, with decreased requirements for S/4HANA as well.  A customer choosing this more aggressive, hypothetical approach might see the following:

TDI Phase 5 hyp

Note how the number of cores in the shared pool has increased substantially, allowing more workloads to be consolidated onto these systems, further decreasing costs by eliminating those external systems as well as consolidating more SAN and network cards, decreasing computer room space and reducing energy/cooling requirements.

A reasonable question is whether these same savings would accrue to an x86 implementation.  The answer is: not necessarily.  Yes, fewer cores would also be required, but to take advantage of a similar type of consolidation, VMware must be employed.  And if VMware is used, then a host of caveats must be taken into consideration:

  1) Overhead, reportedly 12% or more, must be added to the capacity requirements.
  2) I/O throughput must be tested to ensure load times, log writes, savepoints, snapshots and backup speeds that are acceptable to the business.
  3) Limits must be understood, e.g. the maximum memory in a VM is 4TB, which means the 4TB BW instance above could not grow by even 1KB.
  4) Socket isolation is required, as SAP does not permit sharing a socket in a HANA production/VMware environment, meaning that reducing core requirements may not result in fewer sockets, i.e. this may not eliminate underutilized cores in an Intel/VMware system.
  5) Non-prod workloads can’t take advantage of capacity not used by production, for several reasons, not the least of which is that SAP does not permit sharing of sockets between prod and non-prod VMs, not to mention the reluctance of many customers to mix prod and non-prod under a software hypervisor such as VMware even if SAP permitted it.

The bottom line is that most customers, through an abundance of caution or actual experience with VMware, choose to place production on bare-metal and non-prod, which does not require the same stack as prod, on VMware.  Workloads which do require the same stack as prod, e.g. QA, are also usually placed on bare-metal.  After closer evaluation, this means that TDI Phase 5 will have limited benefits for x86 customers.

This announcement is the equivalent of finally being allowed to use 5th gear on your car after having been limited to only 4 for a long time.  HANA on IBM Power Systems already had the fastest adoption in recent SAP history with roughly 950 customers selecting HANA on Power in just 2 years. TDI Phase 5 uniquely benefits Power Systems customers which will continue the acceleration of HANA on Power.  Those individuals that recommended or made decisions to select HANA on Power will look like geniuses to their CFOs as they will now get the equivalent of new systems capacity at no cost.

September 29, 2017 | Uncategorized | 1 Comment

HPE, still playing fast and loose with the facts about SAP HANA on Power

Writing a blog post would be so much simpler if IBM permitted me to lie, but that is prohibited.  That is clearly not the case at HPE, see this recent blog post: https://community.hpe.com/t5/Alliances/SAP-HANA-runs-best-on-x86-Period/ba-p/6971659#.WZxDPa01TxW

It contains so many lies that it is hard to know where to start.  Let’s start with the biggest one: the claim that there is a 10x difference in the performance KPIs required by SAP to certify and ship a HANA appliance vs. a solution certified for TDI only.

You really have to love those lies that are refuted by such easily obtained facts from documentation that is apparently not used by HPE, called SAP Notes.  SAP Note 1943937 specifically states: “All HWCCT tests of appliances (compute servers) certified with scenario HANA-HWC-AP SU 1.1 or HANA-HWC-AP RH 1.1 must use HWCCT of SAP HANA SPS10 or higher or a related SAP HANA revision”.  Interesting that appliances must pass the same HWCCT tests of SAP KPIs as used by TDI.  So, based on HPE’s blog post, does this mean that if an HPE appliance compute server is used for TDI, it will perform 10x worse than if it is used in an appliance?  That would imply that the secret sauce of HPE’s appliances is so incredible that it acts like a dual turbocharger on a car!

The blog post goes on to say “It (Power) ‘works’ … but it is just held to a ~10x lower standard without any of the performance optimizations attributed to SAP’s co-innovation efforts with Intel.”  OK, two lies in one sentence; obviously going for the gold here.  As we have already discussed the 10x lie, let us just look at the second one, i.e. performance optimizations.  Specifically, the blog calls out AVX and TSX as the performance optimizations for the Intel platform.  They are correct that those specific features don’t exist on Power: instead of AVX (Advanced Vector Extensions), Power has two fully symmetric vector pipelines driven by VSX (Vector-Scalar eXtension) instructions, which HANA has been “optimized” to use in the same manner as AVX.  And TSX, a.k.a. Transactional Synchronization Extensions, came out after POWER8’s Transactional Memory, but HANA was optimized for both at the same time.

The blog post also stated “Intel E7v3 CPUs for HANA (TSX and AVX) that offer a 5x performance boost over older Intel E7 or Power 8 CPUs.“  Awesome, but where is the proof behind this statement?  Perhaps a benchmark?  Nope, not one published on SAP’s site, and not even the old SD benchmark backs up this claim (which shows, by the way, almost the same SAPS/core for E7v3 vs. E7v2 and far less than POWER8; but maybe that benchmark does not use those optimizations?).  OK, then maybe the sizing certifications show this?  Nope.  A 4-socket CS500 Ivy Bridge system (E7v2) is published as supporting up to a 2TB SoH solution whereas a 4-socket CS500 Haswell system (E7v3) is shown as supporting up to a 3TB SoH solution.  So far, that is just 50% more, not 5x, but perhaps HPE can’t tell the difference between 0.5 and 5.0?  But didn’t Haswell have more cores per socket?  Yes, it had 18 cores/socket vs. 15 cores/socket for Ivy Bridge, i.e. 20% more cores/socket.  So, E7v3 based systems could actually host about 25% more memory and associated workload per core than E7v2 based systems.  Of course, I am sure that the switch from DDR3 to DDR4 from E7v2 to E7v3 had nothing to do with this improvement.  So, the 5x performance boost is clearly nothing but a big fat lie.
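Here is the arithmetic behind that 25% per-core figure, using the appliance sizes and core counts cited above; treat it as a back-of-the-envelope check rather than an official sizing comparison.

```python
# Appliance sizes and core counts cited above for HPE's 4-socket CS500 models
ivy_bridge = {"memory_gb": 2 * 1024, "cores": 4 * 15}   # E7v2: 2TB SoH, 15 cores/socket
haswell    = {"memory_gb": 3 * 1024, "cores": 4 * 18}   # E7v3: 3TB SoH, 18 cores/socket

v2 = ivy_bridge["memory_gb"] / ivy_bridge["cores"]      # ~34.1 GB/core
v3 = haswell["memory_gb"] / haswell["cores"]            # ~42.7 GB/core

print(round(v2, 1), round(v3, 1), f"{v3 / v2 - 1:.0%}")  # 34.1 42.7 25%  (more memory per core, not 5x)
```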

And the hits just keep coming.  The next statement is just lovely “Naturally, this only matters if you want to be able to call SAP support to get help on nuance performance issues impacting your productive SAP HANA deployment.“  I guess he is trying to suggest that SAP won’t help you with performance problems if you are running any TDI solution including Power Systems, except this is contradicted by all of the SAP notes about TDI and HWCCT, not to mention the experience of customers who have implemented HANA on Power.

“IBM wants to be the king of legacy businesses like mainframe and UNIX. That’s pretty much the only platforms they have left. So now that they can state that HANA “works” on Power, they can make a case to their AIX/Power customers that they should stay on AIX/Power for SAP and HANA and avoid what IBM claims to be an ‘oh so painful’ Unix to x86 migration.“  HPE’s suggestion that you can run HANA on AIX/Power, when HANA runs only on Linux on Power, might not be a lie but simply an expression of ignorance and an inability to use sophisticated and obscure search tools like Google.  As to the suggestion that IBM has called an SAP heterogeneous migration “oh so painful”, this ignores the fact that we have done hundreds of such migrations from HP-UX among others.  Perhaps the author is reflecting HPE’s own migration experience with what every other migration provider sees as a very well understood and fully supported SAP process.  As to IBM’s motivation, HPE is trying to suggest that IBM is only in the HANA business to support its legacy SAP on AIX/Power installed base.  The thriving business of HANA on Power, with over 850 HANA on Power wins since becoming a supported HANA provider, might suggest otherwise.  How about the complete absence of any quotes, marketing materials or other documentation showing that IBM is in this market for any reason other than that it is the future of SAP and IBM intends to remain a premier partner of SAP and our customers?

Taken together, this blog post shows that HPE must be using the advanced Skylake processors in their Superdome-X and MC990 X (SGI UV 300H) to generate lies at an astounding pace.  Oh wait, I forgot (not really) that HPE still has not announced support for Skylake in their high end systems and is only certified for BWoH, not SoH or S/4HANA, with Skylake and only up to 3TB at that, … one month after announcement of Skylake!  Wow, I guess this shows simultaneously how much their pace of technology innovation has slowed down and partnership with SAP has decreased!

And, let’s end on their last amazing sentence: “Every day you put off that UNIX to x86 migration, you are running HANA in a performance degraded mode with production support limitations.“  Yes, HPE seems to be recommending that you migrate your UNIX based systems, i.e. those running Business Suite 7, to x86.  In other words, do two migrations, one from UNIX to x86 and a second to HANA.  Sounds like a total disconnect from business reality!  As to the second part of that sentence, it simply does not make sense, so we will not attribute it to a lie, but to a simple logic error.

What are we to conclude about this blog post?  Taken on its own, it is the work of a rogue employee.  Taken together with the deluge of similar misleading statements and outright lies emerging from HPE, it is a trend.  HPE has gone from being a well-respected systems supplier to a struggling company that promotes and condones lies and hires less than competent individuals to propagate misinformation.  Sounds more like a failing communist state than a company worthy of a customer’s trust.

August 22, 2017 | Uncategorized | 2 Comments

There they go again, HPE misleading customers about Superdome X futures

When the acquisition of SGI by HPE was announced last year, I was openly skeptical about HPE’s motives and wrote a blog post to discuss the potential reasons and implications behind this decision. I felt pretty certain that HPE would not retain both Superdome X (SD-X) and the high end SGI system, now called MC990 X, as they play in the same space. I speculated that the MC990 X would not be the winner because it has inferior memory reliability technology and uses a hand-wired matrix to interconnect its nodes, compared to SD-X, which uses a backplane to connect nodes to switches for connectivity to other nodes.

At SapphireNow this past week, I learned, from a couple of different sources, that Intel’s next generation platform, “Purley”, using “Skylake” processors, includes so many significant changes that SD-X will very likely not be able to support them. Those changes include a new chip-level interconnect technology, “UltraPath” (UPI), as a follow-on to QPI, faster memory and a larger footprint which, according to pictures available on the internet, appears to be at least 25% larger. This makes sense since the high end Skylake processor will have up to 28 cores versus Broadwell’s 24 cores and includes 25% more L3 cache, but uses the same 14nm lithography. Though specs are limited, my sources also believe this new platform will require more power and cooling than the “Broadwell” chips used in SD-X today. SD-X cell boards are already pretty compact nodes, which made sense when HPE was trying to deliver a “converged” solution, but which means that any increase in the footprint of any component could push these boards beyond their limits. Clearly, the UPI change alone would require HPE’s XNC2 cell controller to be redesigned even if no power or cooling limits were exceeded. Considering how few of these chips are used per year, it would be impractical for HPE to maintain a development program for both the XNC2 (and the associated SX3000 switch) and the SGI-acquired NUMAlink7 cell controllers. The speculation is therefore that SD-X will, in fact, be the loser and HPE will utilize the MC990 X as the go-forward, high end platform.

If this turns out to be a correct conclusion, then a lot of very large customers are likely to be in for a major shock since they have purchased SD-X with the expectation of being able to upgrade it and add on like architecture systems for their SAP HANA workloads as they grow over time as well as for other large memory applications. Had they been informed about the upcoming obsolescence of SD-X, many of these customers may have made a different purchasing decision. I am not convinced they would have purchased the MC990 X instead for a variety of reasons, not the least of which is the lack of robust memory protection, available with SD-X, known as RAS, Lockstep or DDDC+ mode, but not with MC990 X. Lacking this, the failure of a chip on a memory DIMM could result in a HANA failure, system failure or the need to immediately shut down the system to diagnose and fix the failed DIMM. Most SAP customers running Suite on HANA or S/4HANA are unlikely to find this level of protection to be acceptable, especially when they are running massive HANA transactional systems.

On the subject of reliability, one of the highest exposures any system has is its physical connections, and the MC990 X has a lot of them to form the ccNUMA mesh. Each cable has a connector on each end, which is then inserted into the appropriate port on each pair of node controllers. Any time a cable is inserted into any sort of receptacle, there is a possibility that it will fail due to the pressure required to insert it or to corrosion. A system with 2 nodes has 8 such cables: 2 between the pair of controllers on each node and 4 between the nodes. A system with 5 nodes has approximately 50 such cables. While not necessarily catastrophic, a failure of one of these cables would require a planned outage as soon as possible as there would be a performance impact from the loss of the connection.
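A quick sketch of the cable count implied by that description. The per-node-pair figure is extrapolated from the two data points in the paragraph above, so treat the formula as an approximation rather than SGI/HPE's documented topology.

```python
from math import comb  # Python 3.8+

def mc990x_cable_estimate(nodes, per_node=2, per_node_pair=4):
    """Cable count implied by the description above: 2 cables between the
    controller pair on each node, plus (assumed) 4 between every pair of nodes."""
    return per_node * nodes + per_node_pair * comb(nodes, 2)

print(mc990x_cable_estimate(2))   # 8, matching the 2-node example
print(mc990x_cable_estimate(5))   # 50, matching the "approximately 50" figure
```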

Another limitation of the MC990 X is the lack of virtualization technology. As each node in a MC990 X contains 4 sockets, the granularity of this system is so coarse that very poor utilization is likely to result, and a massive increase in footprint will be required to handle the same set of workloads that might previously have been handled by SD-X. Physical partitioning is available, but only in increments of one or more 4-socket nodes, which is so wasteful as to be practically irrelevant.

If this speculation proves to be true, and HPE has not been sharing this roadmap with customers, then those customers have been convinced to invest in an obsolete platform and they should be furious. For those in the process of making a decision, they should be asking HPE the right questions about SD-X product futures and then taking that into account as well as the product weaknesses of MC990 X.

I should note that IBM has but one go forward architecture for Power Systems based on the on-chip interconnect technology that has been included and evolving since POWER4. Customers that invested in Power technology have seen a consistent 3 to 4 year cycle to the next generation with those who purchased high end models often having the option to upgrade from one generation to the next.

If you read my blog post last week, you may recall that I pointed out how HPE was lying to customers about HANA on IBM Power Systems.  Now we learn that they may also be doing the same sort of thing about their own products.  Methinks a trend is emerging here!?!?

May 23, 2017 | Uncategorized | 2 Comments

HPE, in an act of desperation, is spreading misinformation about SAP HANA on IBM Power Systems

Misinformation is too mild a characterization of HPE’s behavior.  HPE, or some of its employees, are showing customers charts with a variety of statements which are simply untrue.  In any normal definition, this is called a lie.  This is unethical and unprofessional.  I will repeat what it says in my profile: these are my opinions, not a reflection of those of IBM.

You may have seen the blog post from Vicente Moranta: https://www.linkedin.com/pulse/truth-wild-lies-being-told-hanaonpower-vicente-moranta or my own post on IBM Systems Blog: https://www.ibm.com/blogs/systems/can-you-tell-hana-on-ibm-power-systems-fact-from-fiction/ . At IBM, we have this set of ethical rules called “IBM Business Conduct Guidelines” (BCG).  This 15 to 20 page document is required reading every year with a mandatory test to ensure comprehensive understanding of these rules.  I can boil down one of the most important themes into two words: DON’T LIE!

For those of you who have been reading this blog for a while, you may question whether I am too verbose and that may be fair as I thoroughly research each subject and include attribution for claims, usually including direct links to the source of those claims.  I would never think of making up “facts” and, on rare occasion when a reader has informed me of a mistake, I always correct the mistake as well as include a comment to that effect.

Some background:  A few weeks ago, a customer sent us a list of questions about SAP HANA on IBM Power Systems.  At first, the questions seemed bizarre as they included some very pointed misunderstandings about HANA and SAP in general and IBM’s role with SAP in particular.  As I read them more thoroughly, I realized that someone or some entity had coached the customer.  This was confirmed when I received a copy of an HPE presentation from a completely different source with almost identically worded statements.  By the way, back to the BCG: IBM employees are not allowed to view, much less share, information from a competitor that is marked confidential, and this presentation was not marked at all, meaning it was being shown to customers with, or without, HPE management’s official knowledge.

Some of the lies it shares:

  • HPE has 99%+ share of the HANA market. It is kind of funny to note that this claim is contradicted in the same table, which shows 80% share for Intel.  I guess they are confusing the SAP and SAP HANA markets, which is misleading at best.  More importantly, SAP does not release market share information, and even if they did, I think Lenovo, Cisco, Dell and Fujitsu together might claim more than 1% of the market.
  • IBM, not SAP “delivers” HANA code to customers because they have access to SAP code and have created a “special” version of SAP HANA. Wow, it is hard to figure out where to start here.  SAP owns HANA and only they distribute code.  They refused to support other operating systems than Linux, including AIX, for the very reason of wanting a common code tree for all platforms.  HPE is correct that IBM works closely with SAP to optimize HANA code, a fact which should be lauded not criticized.  Apparently, HPE must not have such a relationship and are jealous?  What HPE does not understand is that regardless of who, IBM, Intel or other, contributes code to SAP or suggests modifications to code, SAP makes all decisions regarding that code, including support, and incorporates it into the common code tree meaning all platforms can benefit if the code is not related to a specific, proprietary instruction set.  When Intel contributed code for TSX, Power HANA was not able to use this code, but with appropriate modifications, SAP was able to add the code to call IBM’s similar “Transactional Memory” calls.  Now, there is simple logic which ensures the appropriate call is made depending on the underlying processor architecture.  Likewise, when IBM saw that the huge number of threads in its architecture might push limits in HANA, it worked with SAP to improve the thread and workload dispatch mechanisms in HANA.  When Intel released their Broadwell-EX 24-core chips and SAP approved large socket counts, these systems would have hit the same threading issues, but with the new mechanisms already in place, were able to benefit from IBM & SAP’s joint effort.  Maybe HPE means that SAP has to compile the same code as used for Intel systems on the Power platform.  Well duh, it is a different chip architecture, so this is computer science 101, but hardly a different “version”.
  • Release priority – #1 Intel, #2 Power. Wrong again, HPE!  HANA 2.0 was released simultaneously on Intel and Power, as was S/4HANA 1610 on-premise edition, support for SoH with HANA 2.0, etc.  Where do you get your misinformation, HPE?  This information is widely available on SAP’s Service Marketplace and in the SAP PAM.
  • Sizes supported – HPE shows Power support of “only” 4.8TB for BW and 9TB for SoH vs. 24TB for Intel, and no scale-out HANA on Power. I will give HPE the benefit of the doubt on the 4.8TB statement as 6TB support just came out, but the “only” is strange in that the same table shows “only” 4TB support on Intel. As to 9TB SoH and the lack of scale-out HANA, both are wrong and have been for a while, with 16TB SoH available since December 4th, 2016 (see SAP Note 2188482) and scale-out HANA since November 2015. As to the 24TB claim for Intel, the largest supported HANA appliance is 20TB, so HPE, once again, seems to be making up facts.

There were other lies, but I think you get the idea.  Here are a few suggestions:

To HPE management: Shame on you for permitting such behavior or if done with your knowledge, for encouraging it.  If you have any “integrity” (pun intended), you will fire the employees and managers responsible for knowingly spreading lies and will print a retraction in appropriate press sources and on your web site.  If you don’t, then you are demonstrating, loud and clear, that your company is not to be trusted.

To HPE employees: Unless your management takes the above suggestions to heart with appropriate action to rectify this wrong, I am not sure how you can sleep well working for a company that considers truth to be something to be sacrificed at their convenience.  Hope they are truthful about your benefits.

To SAP customers: I can only speak for myself; when a restaurant, retailer or manufacturer lies to me and/or the public, I refuse to ever do business with them again.  The old saying applies: “fool me once, shame on you; fool me twice, shame on me.”  When you consider the minimal differences, if any, in cost of acquisition between all HANA system providers on the market, including IBM Power Systems, the trustworthiness of the vendor deserves at least as much weight in your decision.

May 15, 2017 | Uncategorized | Leave a comment

Is your company ready to put S/4HANA into the cloud? – Part 3

The third part of a 4-part discussion about corporate requirements for S/4HANA and the questions to ask of cloud providers before placing this landscape in the cloud.

  • What backups must be performed? Some cloud providers might include daily, weekly, incremental or no backups.  They may include raw image backups vs. database aware backups. Just make sure that whatever backups you require are supported and included in the price for the cloud services.
    • Should your corporate backup solution be used? For flexibility as well as visibility reasons, you may prefer that a backup solution you have already tested and approved be used in the cloud environment.  Or, perhaps, it is an audit requirement.
    • Are extra server(s) required for backup solution? Your security and audit departments may not permit your backups to share infrastructure, including the network, with any other clients in a provider’s cloud environment.  One or more servers with or without dedicated network infrastructure may be required.
    • How quickly must backups be performed and restored, and what is the RTO after database corruption? SAP HANA backups can generally take their time as long as the aggregate transfer rate is sufficient to back up the entire database prior to the next backup.  Well, that is unless you want to be able to restore to the prior day in the event of database corruption, in which case you may want the backup finished prior to a specific time on the same day.   Just make sure you think about this and have the infrastructure necessary to meet your backup speed priced.  Even more important is how quickly the backup can be restored, as well as what services are offered to restore the backup and roll forward any logs created since that backup was initiated, i.e. the RTO for getting back up and running (see the sketch after this list).
    • Will backups be available to DR systems? Not to be overlooked are backups in DR.  Not only will you want to be able to take backups in DR, but you would also need to be able to restore from a backup in the primary site to the DR site, not so easily done if the primary site is truly down and unavailable.  This means that you would need the backup server to have bi-directional replication with DR site as well as testing to ensure this works correctly.  What incremental costs are required for the replicated backup bandwidth?
  • Security – First a disclaimer: I am not a security expert, so I may only be addressing a subset of the real requirements.
    • How will corporate single sign-on operate with the cloud solution? Whether you use Microsoft Active Directory, CA SSO, IBM Tivoli Access Manager or one of the dozens of other products on the market, you are probably using this solution to authenticate and authorize users in SAP.  Make sure it can integrate with the potential S/4HANA system in the cloud.  Make sure that your security administrators can control policies, assign and revoke privileges and audit as necessary.
    • Must communications to/from cloud be encrypted and what solution will be used? We all know that hackers want to access your data for malicious reasons, financial gain or industrial espionage.  Do you want your key strokes and data to and from the cloud to transmit in clear text?  If not, which solution will you use and how might the use of that solution impact performance?  How about between application servers and database servers at the cloud provider?
    • How will data stored in cloud be secured? It is one thing to have your personal email stored on storage devices shared with millions of other users, but do your corporate policies allow for corporate databases to be located on storage devices that are shared with other customers?  If not, do you require dedicated devices, of what kind and at what cost?
    • How will backups be secured? We touched on backups earlier, but this is now specific to the physical media on which those backups are stored, not to mention replicated to the DR site, as well as any external media that you might require, e.g. tapes, DVDs or removable disks.  How can you be assured that no one makes a copy, removes a disk, etc.?
  • What are the non-production requirements? All of the above was just talking about production, but most customers have an even more extensive non-production landscape.  Many, if not most, of those same questions can be applied to non-production.  Remember, there are few employees that command a higher salary than your developers, whether internal or external.  They create corporate intellectual property and often work with copies of production data.  Their workloads vary based on project demands, phases of implementation or problems to be addressed.  Many customers utilize DR capacity or underutilized capacity on HA systems to address non-prod requirements, however this may not be an option in a cloud environment, or if it is, at what cost?
    • How will images be created/copied, managed, isolated, secured? You may use SAP LaMa (Landscape Management, previously known as Landscape Virtualization Management (LVM)), backup/restore, disk replication, TDMS, BDLS and/or custom scripts to populate non-prod systems.  Will those tools and techniques work in the cloud and at what cost?
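As promised above, here is a minimal sketch of the backup-window and restore-time arithmetic worth running before signing a cloud contract. All of the sizes and throughput figures are hypothetical placeholders; substitute the numbers from your own landscape and your provider's quoted rates.

```python
def required_backup_throughput_gbs(db_size_tb, window_hours):
    """Aggregate transfer rate (GB/s) needed to back up the full database within the window."""
    return db_size_tb * 1024 / (window_hours * 3600)

def restore_rto_hours(db_size_tb, restore_gbs, log_gb, replay_gbs):
    """Rough RTO after corruption: restore the last backup, then roll the logs forward."""
    return (db_size_tb * 1024 / restore_gbs + log_gb / replay_gbs) / 3600

# Hypothetical 8TB database, 6-hour backup window, 1 GB/s restore rate, 200GB of logs replayed at 0.5 GB/s
print(round(required_backup_throughput_gbs(8, 6), 2))   # ~0.38 GB/s sustained, end to end
print(round(restore_rto_hours(8, 1.0, 200, 0.5), 1))    # ~2.4 hours before the business is back up
```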


The last part of this discussion will deal with migration challenges when moving to the cloud and, lastly, a few of the reasons that are often used to justify a move to the cloud.

May 5, 2017 | Uncategorized | Leave a comment

Is your company ready to put S/4HANA into the cloud? – Part 2

And now, the details and rationale behind the questions posed in Part 1.

  • What is the expected memory size for the HANA DB? Your HANA instances may fit comfortably within the provider’s offerings, or may force a bare-metal option, or may not be offered at all.  Equally important is expected growth as you may start within one tier and end in another or may be unable to fit in a provider’s cloud environment.
  • What are your performance objectives and how will they be measured/enforced? This may not be that important for some non-production environments, but production is used to run part, or all, of a company.  The last thing you want is to find out that transaction performance is not measured, or that no enforcement exists for missing an objective.  Even worse, what happens if these are measured, but only up to the edge of the provider’s cloud, not inclusive of WAN latency?  Sub-second response time is usually required, but if the WAN adds 0.5 seconds, your end users may not find this acceptable.  How about if the WAN latency varies?  The only thing worse than poor performance is unpredictable performance.
    • Who is responsible for addressing any performance issues? No one wants finger pointing so is the cloud provider willing to be responsible for end-user performance including WAN latency and at what cost?
    • Is bare-metal required or, if shared, how much overhead and how much over-commitment? One of the ways that some cloud providers offer a competitive price is by using shared infrastructure, virtualized with VMware or PowerVM for example.  Each of these has different limits and overhead, with VMware noted by SAP as having a minimum of 12% overhead and PowerVM with 0%, as the benchmarks were run under PowerVM to start with.  Likewise, VMware environments are limited to 4TB per instance, and often multiple different instances may not run on shared infrastructure based on a very difficult to understand set of rules from SAP.  PowerVM has no such limits or rules and allows up to 8 concurrent production instances, each up to 16TB for S/4 or SoH, up to the physical limits of the system.  If the cloud provider is offering a shared environment, are they running under SAP’s definition of “supported” or are they taking the chance and running “unsupported”?  Lastly, if it is a shared environment, is it possible that your performance or security may suffer because of another client’s use of that shared infrastructure?
  • What availability is required? 99.8%? 99.9%?  99.95%? 4 nines or higher?  Not all cloud providers can address the higher limits, so you should be clear about what your business requires (see the sketch after this list for what each target implies in downtime per year).
  • Is HA mandatory? HA is usually an option, at a higher price.  The type of HA you desire may, or may not, be offered by each cloud provider.  Testing of that HA solution periodically may, or may not be offered so if you need or expect this, make sure you ask about it.
    • For HA, what are the RPO, RTO and RTP time limits? Not all HA solutions are created equal.  How much data loss is acceptable to your business and how quickly must you be able to get back up and running after a failure?  RTP is a term that you may not have heard too often and refers to “Return to Processing”, i.e. it is not enough to get the system back to a point of full data integrity and ready to work; the system must be at a point that the business expects, with a clear understanding of which transactions have or have not been committed.  Imagine a situation where a customer places an order or pays a bill but it gets lost, or where you pay a supplier and mistakenly pay them a second time.
  • Is DR mandatory and what are the RPO, RTO and RTP time limits? Same rationale for these questions as for HA; once again DR, when available, is always offered at an additional charge and is highly dependent on the type of replication used, with disk based replication usually less expensive than HANA System Replication but with a longer RTO/RTP.
    • Incremental costs for DR replication bandwidth? Often overlooked is the network costs for replicating data from the primary site to the DR site, but clearly a line item that should not be overlooked.  Some customers may decide to use two different cloud providers for primary and DR in which case not only may pricing for each be different but WAN capacity may be even more critical and pricey.
    • Disaster readiness assessment, mock drills or full, periodic data center flips? Having a DR site available is wonderful provided that, when you actually need it, everything works correctly.  As this is an entire discussion unto itself, let it be said that every business recovery expert will tell you to plan and test thoroughly.  Make sure you discuss this with a potential cloud provider and have the price to support whatever you require included in their bid.
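To put those availability percentages in perspective, the sketch below converts each target into the maximum unplanned downtime per year it implies; it is simple arithmetic, not a statement of what any particular provider offers.

```python
def downtime_hours_per_year(availability_pct):
    """Maximum unplanned downtime per year implied by an availability objective."""
    return (1 - availability_pct / 100) * 365 * 24

for target in (99.8, 99.9, 99.95, 99.99):
    print(f"{target}% -> {downtime_hours_per_year(target):.1f} hours/year")
# 99.8%  -> 17.5 hours/year
# 99.9%  ->  8.8 hours/year
# 99.95% ->  4.4 hours/year
# 99.99% ->  0.9 hours/year (roughly 53 minutes)
```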


I said this would be a two part post, but there is simply too much to include in only 2 parts, so the parts will go on until I address all of the questions and issues.

May 4, 2017 | Uncategorized | Leave a comment

What should you do when your LoBs say they are not ready for S/4HANA – part 2, Why choice of Infrastructure matters

Conversions to S/4HANA usually take place over several months and involve dozens to hundreds of steps.  With proper care and planning, these projects can run on time, within budget and result in a final Go-Live that is smooth and occurs within an acceptable outage window.  Alternately, horror stories abound of projects delayed, errors made, outages far beyond what was considered acceptable by the business, etc.  The choice of infrastructure to support a conversion may be the last thing on anyone’s mind, but it can have a dramatic impact on achieving a successful outcome.

A conversion includes running many pre-checks[i], which can run for quite a while[ii], meaning they can drive CPU utilization to high levels for a significant duration and impact other running workloads.  As a result, consultants routinely recommend that you make a copy of any system, especially production, against which you will run these pre-checks.  SAP recommends that you run these pre-checks against every system to be converted, e.g. Development, Test, Sandbox, QA, etc.  If those systems are being used for ongoing work, it may be advisable to also make copies of them and run those copies on other systems or within virtual machines which can be throttled to avoid causing performance issues for other co-resident virtual machines.

In order to find issues and correct them, conversion efforts usually involve a phased approach with multiple conversions of support systems, e.g. Dev, Test, Sandbox, QA, using a tool such as SAP’s Software Update Manager with the Database Migration Option (SUM w/DMO).  One of the goals of each run is to figure out how long it will take and what actions need to be taken to ensure the Go-Live production conversion completes within the required outage window, including any post-processing, performance tuning, validation and backups.

In an attempt to keep expenses low, many customers will choose to use existing systems or VMs in addition to a new “target” system (or systems, if HA is to be tested).  This means that the customer’s network will likely be used in support of these connections.  Taken together, the use of shared infrastructure components means that events among those shared components can impact these tests and activities.  For example, if a VM is used but not enough CPU or network bandwidth is provided, the duration of the test may extend well beyond what was planned, meaning more cost for per-hour consulting, and it may not provide the insight into what needs to be fixed and how long the actual migration will take.  How about if you have plenty of CPU capacity or even a dedicated system, but the backup group decides to initiate a large database backup at the same time and on the same network as your migration test is using?  Or maybe you decide to run a test at a time when another group, e.g. operations, needs to test something that impacts it, or when new equipment or firmware is being installed and modifications to shared infrastructure are occurring, etc.  Of course, you can have good change management and carefully arrange when your conversion tests will occur, which means that you may have restricted windows of opportunity at times that are not always convenient for your team.

Let’s not forget that the application/database conversion is only one part of a successful conversion.  Functional validation tests are often required which could overwhelm limited infrastructure or take it away from parallel conversion tasks.  Other easily overlooked but critical tasks include ensuring all necessary interfaces work; that third party middleware and applications install and operate correctly; that backups can be taken and recovered; that HA systems are tested with acceptable RPO and RTO; that DR is set up and running properly also with acceptable RPO and RTO.  And since this will be a new application suite with different business processes and a brand new Fiori interface, training most likely will be required as well.

So, how can the choice of infrastructure make a difference to this almost overwhelming set of issues and requirements?  It comes down to flexibility.  Infrastructure which is built on virtualization allows many of these challenges to be easily addressed.  I will use an existing IBM Power Systems customer running Oracle, DB2 or Sybase to demonstrate how this would work.

The first issue dealt with running pre-checks on existing systems.  If those existing systems are Power Systems and enough excess capacity is available, PowerVM, the IBM virtualization hypervisor, allows a VM to be started with an exact copy of production, which can then be passed through normal post-processing (such as BDLS) to create a non-production copy, or cloned and placed behind a network firewall.  This VM could be fenced off logically and throttled such that production running on the same system would always be given preference for CPU resources.  By comparison, a similar database located on an x86 system would likely not be able to use this process as the database is usually running on bare-metal systems for which no VM can be created.

Alternately, for Power Systems, the exact same process could be utilized to carve out a VM on a new HANA target system and this is where the real value starts to emerge.  Once a copy or clone is available on the target HANA on Power system, as much capacity can be allocated to the various pre-checks and related tasks as needed without any concern for the impact on production or the need to throttle these processes thereby optimizing the duration of these tasks.  On the same system, HANA target VMs may be created.  As mock conversions take place, an internal virtual network may be utilized.  Not only is such a network faster by a factor of 2 or more, but it is dedicated to this single purpose and completely unaffected by anything else going on within the datacenter network.  No coordination is required beyond the conversion team which means that there is no externally imposed delay to begin a test or constraints on how long such a test may take or, for that matter, how many times such a test may be run.

The story only gets better.  Remember, SAP suggests you run these pre-checks on all non-prod landscapes.  With PowerVM, you may fire up any number of different copy/clone VMs and/or HANA VMs.  This means that as you move from one phase to the next, from one instance to the next and from conversion to production (running a static validation environment while other tasks continue, conducting training classes, or running different phases for different projects at the same time), PowerVM enables the system to respond to your changing requirements.  This helps avoid the need to purchase extra interim systems and lets you buy when needed, not significantly ahead of time due to the inflexibility of other platforms.  You can even simulate an HA environment to allow you to test your HA strategy without needing a second system, up to the physical limits of the system, of course.  This is where a tool like SAP’s TDMS, Test Data Migration Server, might come in very handy.

And when it comes time for the actual Go-Live conversion, the running production database VM may be moved live from the “old” system, without any downtime, to the “new” system and the migration may now proceed using the virtual, in-memory network at the fastest possible speed and with all external factors removed.  Of course, if the “old” system is based on POWER8, it may then be used/upgraded for other HANA purposes.  Prior Power Systems as well as current generation POWER8 systems can be used for a wide variety of other purposes, both SAP and those that are not.

Bottom line: The choice of infrastructure can help you eliminate external influences that cause delays and complications to your conversion project, optimize your spend on infrastructure, and deliver the best possible throughput and lowest outage window when it comes to the Go-Live cut-over.  If complete control over your conversion timeline was not enough, avoidance of delays keeps costs for non-fixed cost resources to a minimum.  For any SAP customer not using Power Systems today, this flexibility can provide enormous benefits, however the process of moving between systems would be somewhat different.  For any existing Power Systems customer, this flexibility makes a move to HANA on Power Systems almost a no-brainer, especially since IBM has so effectively removed TCA as a barrier to adoption.

[i] https://blogs.sap.com/2017/01/20/system-conversion-to-s4hana-1610-part-2-pre-checks/

[ii] https://uacp.hana.ondemand.com/http.svc/rc/PRODUCTION/pdfe68bfa55e988410ee10000000a441470/1511%20001/en-US/CONV_OP1511_FPS01.pdf page 19

April 26, 2017 | Uncategorized | Leave a comment

HANA on Power hits the Trifecta!

Actually, trifecta would imply only 3 big wins at the same time and HANA on Power Systems just hit 4 such big wins.

Win 1 – HANA 2.0 was announced by SAP with availability on Power Systems at the same time as on Intel based systems.[i]  Previous announcements by SAP had indicated that Power was now on an even footing with Intel for HANA from an application support perspective; however, until this announcement, some customers may still have been unconvinced.  I noticed this on occasion when presenting to customers: I would make such an assertion and see a little disbelief on some faces.  This announcement leaves no doubt.

Win 2 – HANA 2.0 is only available on Power Systems with SUSE SLES 12 SP1 in Little Endian (LE) mode.  Why, you might ask, is this a “win”?  Because true database portability is now a reality.  In LE mode, it is possible to pick up a HANA database built on Intel, make no modifications at all, and drop it on a Power box.  This removes a major barrier to customers that might have considered a move but were unwilling to deal with the hassle, time requirements, effort and cost of an export/import.  Of course, the destination will be HANA 2.0, so an upgrade from HANA 1.0 to 2.0 on the source system will be required prior to a move to Power among various other migration options.   This subject will likely be covered in a separate blog post at a later date.  This also means that customers that want to test how HANA will perform on Power compared to an incumbent x86 system will have a far easier time doing such a PoC.

Win 3 – Support for BW on the IBM E850C @ 50GB/core allowing this system to now support 2.4TB.[ii]  The previous limit was 32GB/core meaning a maximum size of 1.5TB.  This is a huge, 56% improvement which means that this, already very competitive platform, has become even stronger.

Win 4 – Saving the best for last, SAP announced support for Suite on HANA (SoH) and S/4HANA of up to 16TB with 144 cores on IBM Power E880 and E880C systems.[ii]  Several very large customers were already pushing the previous 9TB boundary and/or had run the SAP sizing tools and realized that more than 9TB would be required to move to HANA.  This announcement now puts IBM Power Systems on an even footing with HPE Superdome X.  Only the lame duck SGI UV 300H has support for a larger single image size @ 20TB, but not by much.  Also notice that to get to 16TB, only 144 cores are required for Power, which means that there are still 48 cores unused in a potential 192-core system, i.e. room for growth to a future limit once appropriate KPIs are met.  Consider that the HPE Superdome X requires all 16 sockets to hit 16TB … makes you wonder how they will achieve a higher size prior to a new chip from Intel.

Win 5 – Oops, did I say there were only 4 major wins?  My bad!  Turns out there is a hidden win in the prior announcement, easily overlooked.  Prior to this new, higher memory support, a maximum of 96GB/core was allowed for SoH and S/4HANA workloads.  If one divides 16TB by 144 cores, the new ratio works out to 113.8GB/core or an 18.5% increase.  Let’s do the same for HPE Superdome X.  16 sockets times 24 core/socket = 384 cores.  16TB / 384 cores = 42.7GB/core.  This implies that a POWER8 core can handle 2.7 times the workload of an Intel core for this type of workload.  Back in July, I published a two-part blog post on scaling up large transactional workloads.[iii]  In that post, I noted that transactional workloads access data primarily in rows, not in columns, meaning they traverse columns that are typically spread across many cores and sockets.  Clearly, being able to handle more memory per core and per socket means that less traversing is necessary resulting in a high probability of significantly better performance with HANA on Power compared to competing platforms, especially when one takes into consideration their radically higher ccNUMA latencies and dramatically lower ccNUMA bandwidth.
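The arithmetic behind Win 5, for anyone who wants to reproduce it; the core counts and memory limits are the ones cited in this post and the referenced SAP notes.

```python
def gb_per_core(memory_tb, cores):
    return memory_tb * 1024 / cores

power_e880  = gb_per_core(16, 144)       # 16TB SoH/S4HANA limit on 144 POWER8 cores
superdome_x = gb_per_core(16, 16 * 24)   # 16TB across all 16 sockets of 24-core Broadwell-EX

print(round(power_e880, 1))                  # ~113.8 GB/core, an ~18.5% lift over the prior 96 GB/core
print(round(superdome_x, 1))                 # ~42.7 GB/core
print(round(power_e880 / superdome_x, 1))    # ~2.7x more memory (and workload) per core
```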

Taken together, these announcements have catapulted HANA on IBM Power Systems from being an outstanding option for most customers, but with a few annoying restrictions and limits especially for larger customers, to being a best-of-breed option for all customers, even those pushing much higher limits than the typical customer does.

[i] https://launchpad.support.sap.com/#/notes/2235581

[ii] https://launchpad.support.sap.com/#/notes/2188482

[iii] https://saponpower.wordpress.com/2016/07/01/large-scale-up-transactional-hana-systems-part-1/

December 6, 2016 | Uncategorized | 3 Comments

ASUG Webinar next week – Scale-Up Architecture Makes Deploying SAP HANA® Simple

On October 4, 2016, Joe Caruso, Director – ERP Technical Architecture at Pfizer, will join me presenting on an ASUG Webinar.  Pfizer is not only a huge pharmaceutical company but, more importantly, has implemented SAP throughout their business including just about every module and component that SAP has to offer in their industry.  Almost 2 years ago, Pfizer decided to begin their journey to HANA starting with BW.  Pfizer is a leader in their industry and in the world of SAP and has never been afraid to try new things including a large scale PoC to evaluate scale-out vs. scale-up architectures for BW.  After completion of this PoC, Pfizer made a decision regarding which one worked better, proceeded to implement BW HANA and had their go-live just recently.  Please join us to hear about this fascinating journey.  For those that are ASUG members, simply follow this link.

If you are an employee of an ASUG member company, either an Installation or Affiliate member, but not registered for ASUG, you can follow this link to join at no cost.  That link also offers the opportunity for companies to join ASUG, a very worthwhile organization that offers chapter meetings all over North America, a wide array of presentations at their annual meeting during Sapphire, the BI+Analytics conference coming up in New Orleans, October 17 – 20, 2016, hundreds of webinars, not to mention networking opportunities with other member companies and the ability to influence SAP through their combined power.

This session will be recorded and made available at a later date for those not able to attend.  When the link to the recording is made available, I will amend this blog post with that information.

September 29, 2016 | Uncategorized | 5 Comments