SAPonPower

An ongoing discussion about SAP infrastructure

There they go again, HPE misleading customers about Superdome X futures

When the acquisition of SGI by HPE was announced last year, I was openly skeptical about HPE’s motives and wrote a blog post to discuss the potential reasons and implications behind this decision. I felt pretty certain that HPE would not retain both Superdome X (SD-X) and the high-end SGI system, now called MC990 X, as they play in the same space. I speculated that the MC990 X would not be the winner because it has inferior memory reliability technology and uses a hand-wired cable matrix to interconnect its nodes, compared to SD-X, which uses a backplane to connect nodes to switches for connectivity to other nodes.

At SapphireNow this past week, I learned, from a couple of different sources, that Intel’s next generation platform, “Purley”, using “Skylake” processors, includes so many significant changes that SD-X will very likely not be able to support the new processors. Those changes include a new chip-level interconnect technology, UltraPath Interconnect (UPI), as a follow-on to QPI, faster memory and a larger footprint which, according to pictures available on the internet, appears to be at least 25% larger. This makes sense since the high-end Skylake processor will have up to 28 cores versus Broadwell’s 24 cores and includes 25% more L3 cache but uses the same 14nm lithography. Though specs are limited, my sources also believe this new platform will require more power and cooling than the “Broadwell” chips used in SD-X today. SD-X cell boards are already pretty compact nodes, which made sense when HPE was trying to deliver a “converged” solution, but which means that any increase in the footprint of any component could push these boards beyond their limits. Clearly, the UPI change alone would require HPE’s XNC2 cell controller to be redesigned even if no power or cooling limits were exceeded. Considering how few of these chips are used per year, it would be impractical for HPE to maintain development programs for both the XNC2 (and the associated SX3000 switch) and the SGI-acquired NUMAlink 7 cell controllers. The speculation is therefore that SD-X will, in fact, be the loser and that HPE will use the MC990 X as its go-forward, high-end platform.

If this turns out to be the correct conclusion, then a lot of very large customers are likely to be in for a major shock, since they purchased SD-X with the expectation of being able to upgrade it and add like-architecture systems for their SAP HANA workloads as they grow over time, as well as for other large-memory applications. Had they been informed about the upcoming obsolescence of SD-X, many of these customers might have made a different purchasing decision. I am not convinced they would have purchased the MC990 X instead, for a variety of reasons, not the least of which is the robust memory protection, known as RAS, Lockstep or DDDC+ mode, that is available with SD-X but not with the MC990 X. Lacking this, the failure of a chip on a memory DIMM could result in a HANA failure, a system failure or the need to immediately shut down the system to diagnose and replace the failed DIMM. Most SAP customers running Suite on HANA or S/4HANA are unlikely to find this level of protection acceptable, especially when they are running massive HANA transactional systems.

On the subject of reliability, one of the highest exposures any system has is its physical connections, and the MC990 X has a lot of them forming its ccNUMA mesh. Each cable has a connector on each end, each of which is inserted into the appropriate port on one of a pair of node controllers. Any time a cable is inserted into any sort of receptacle, there is a possibility that it will fail, whether from the pressure required to insert it or from corrosion. A system with 2 nodes has 8 such cables: 2 between the pair of controllers on each node and 4 between the nodes. A system with 5 nodes has approximately 50 such cables. While not necessarily catastrophic, a failure of one of these cables would require a planned outage as soon as possible, as there would be a performance impact from the loss of the connection.
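
For a sense of how quickly those cable counts grow, here is a minimal sketch of a model that is consistent with the two figures above: 2 cables between the controller pair within each node, plus 4 cables between every pair of nodes. This is only an illustrative approximation inferred from the numbers quoted here, not an official SGI wiring specification.

```python
from math import comb

def mc990x_cable_estimate(nodes, intra_node=2, per_node_pair=4):
    """Rough cable-count model: 2 cables within each node's controller pair,
    plus 4 cables between every pair of nodes (illustrative assumption only)."""
    return nodes * intra_node + comb(nodes, 2) * per_node_pair

for n in (2, 3, 4, 5):
    print(f"{n} nodes: {mc990x_cable_estimate(n)} cables")
# 2 nodes -> 8 cables and 5 nodes -> 50 cables, matching the figures above;
# the count grows roughly with the square of the node count.
```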

Another limitation of the MC990 X is the lack of virtualization technology. As each node in an MC990 X contains 4 sockets, the granularity of this system is so coarse that very poor utilization is likely to result, and a massive increase in footprint will be required to handle the same set of workloads that might previously have been handled by SD-X. Physical partitioning is available, but only at the granularity of one or more 4-socket nodes, which is so wasteful as to be largely irrelevant.

If this speculation proves to be true, and HPE has not been sharing this roadmap with customers, then those customers have been convinced to invest in an obsolete platform and they should be furious. For those in the process of making a decision, they should be asking HPE the right questions about SD-X product futures and then taking the answers into account, along with the product weaknesses of the MC990 X.

I should note that IBM has but one go-forward architecture for Power Systems, based on the on-chip interconnect technology that has been included and evolving since POWER4. Customers that invested in Power technology have seen a consistent 3 to 4 year cycle to the next generation, with those who purchased high-end models often having the option to upgrade from one generation to the next.

If you read my blog post last week, you may recall that I pointed out how HP was lying to customers about HANA on IBM Power Systems.  Now we learn that they may also be doing the same sort of thing about their own products.  Me thinks a trend is emerging here!?!?

May 23, 2017

HPE, in an act of desperation, is spreading misinformation about SAP HANA on IBM Power Systems

Misinformation is a poor characterization of HPE’s behavior.  HPE, or some of its employees, are showing customers charts with a variety of statements which are simply untrue.  In any normal definition, this is called a lie.  This is unethical and unprofessional.  I will repeat what it says in my profile, these are my opinions, not a reflection of those of IBM.

You may have seen the blog post from Vicente Moranta: https://www.linkedin.com/pulse/truth-wild-lies-being-told-hanaonpower-vicente-moranta or my own post on IBM Systems Blog: https://www.ibm.com/blogs/systems/can-you-tell-hana-on-ibm-power-systems-fact-from-fiction/ . At IBM, we have this set of ethical rules called “IBM Business Conduct Guidelines” (BCG).  This 15 to 20 page document is required reading every year with a mandatory test to ensure comprehensive understanding of these rules.  I can boil down one of the most important themes into two words: DON’T LIE!

For those of you who have been reading this blog for a while, you may question whether I am too verbose and that may be fair as I thoroughly research each subject and include attribution for claims, usually including direct links to the source of those claims.  I would never think of making up “facts” and, on rare occasion when a reader has informed me of a mistake, I always correct the mistake as well as include a comment to that effect.

Some background:  A few weeks ago, a customer sent us a list of questions about SAP HANA on IBM Power Systems.  At first, the questions seemed bizarre as they included some very pointed misunderstandings about HANA and SAP in general and IBM’s role with SAP in particular.  As I read them more thoroughly, I realized that someone or some entity had coached the customer.  This was confirmed when I received a copy of an HPE presentation, from a completely different source, with almost identically worded statements.  By the way, back to the BCG: IBM employees are not allowed to view, much less share, information from a competitor that is marked confidential, and this presentation was not marked at all, meaning it was being shown to customers, with or without HPE management’s official knowledge.

Some of the lies it shares:

  • HPE has 99%+ share of the HANA market. It is kind of funny to note that this claim is contradicted in the same table, which shows 80% share for Intel.  I guess they are confusing the SAP and SAP HANA markets, which is misleading at best.  More importantly, SAP does not release market share information, and even if they did, I think Lenovo, Cisco, Dell and Fujitsu together might claim more than 1% of the market.
  • IBM, not SAP, “delivers” HANA code to customers because they have access to SAP code and have created a “special” version of SAP HANA. Wow, it is hard to figure out where to start here.  SAP owns HANA and only SAP distributes its code.  They refused to support operating systems other than Linux, including AIX, for the very reason of wanting a common code tree for all platforms.  HPE is correct that IBM works closely with SAP to optimize HANA code, a fact which should be lauded, not criticized.  Apparently, HPE must not have such a relationship and is jealous?  What HPE does not understand is that regardless of who contributes code or suggests modifications, whether IBM, Intel or anyone else, SAP makes all decisions regarding that code, including support, and incorporates it into the common code tree, meaning all platforms can benefit if the code is not tied to a specific, proprietary instruction set.  When Intel contributed code for TSX, Power HANA was not able to use this code, but with appropriate modifications, SAP was able to add code to call IBM’s similar “Transactional Memory” instructions.  Now, simple logic ensures the appropriate call is made depending on the underlying processor architecture.  Likewise, when IBM saw that the huge number of threads in its architecture might push limits in HANA, it worked with SAP to improve the thread and workload dispatch mechanisms in HANA.  When Intel released their Broadwell-EX 24-core chips and SAP approved larger socket counts, these systems would have hit the same threading issues, but with the new mechanisms already in place, they were able to benefit from IBM and SAP’s joint effort.  Maybe HPE means that SAP has to compile the same code used for Intel systems on the Power platform.  Well, duh, it is a different chip architecture, so this is computer science 101, but hardly a different “version”.
  • Release priority – #1 Intel, #2 Power. Wrong again, HPE!  HANA 2.0 was released simultaneously on Intel and Power, as was S/4HANA 1610 on-premise edition, support for SoH with HANA 2.0, etc.  Where do you get your misinformation, HPE?  This information is widely available on SAP’s Service Marketplace and in the SAP PAM.
  • Sizes supported – HPE shows Power support of “only” 4.8TB for BW and 9TB for SoH vs. 24TB for Intel, and no scale-out HANA on Power. I will give HPE the benefit of the doubt on the 4.8TB statement as 6TB support just came out, but the “only” part is strange in that the same table shows “only” 4TB support on Intel. As to 9TB SoH and the lack of scale-out HANA, both are wrong and have been for a while: 16TB SoH has been available since December 4, 2016 (see SAP Note 2188482) and scale-out HANA since November 2015. As to the 24TB claim for Intel, the largest supported HANA appliance is 20TB, so HPE, once again, seems to be making up facts.

There were other lies, but I think you get the idea.  Here are a few suggestions:

To HPE management: Shame on you for permitting such behavior or if done with your knowledge, for encouraging it.  If you have any “integrity” (pun intended), you will fire the employees and managers responsible for knowingly spreading lies and will print a retraction in appropriate press sources and on your web site.  If you don’t, then you are demonstrating, loud and clear, that your company is not to be trusted.

To HPE employees: Unless your management takes the above suggestions to heart with appropriate action to rectify this wrong, I am not sure how you can sleep well working for a company that considers truth to be something to be sacrificed at their convenience.  Hope they are truthful about your benefits.

To SAP customers: I can only speak for myself; when a restaurant, retailer or manufacturer lies to me and/or the public, I refuse to ever do business with them again.  The old saying applies, “fool me once, shame on you; fool me twice, shame on me.”  When you consider the minimal differences, if any, in cost of acquisition between all of the HANA system providers on the market, including IBM Power Systems, choosing a vendor you can trust should be an easy decision.

May 15, 2017

Is your company ready to put S/4HANA into the cloud? – Part 3

The third of a 4-part discussion about corporate requirements in support of S/4HANA and questions to be asked of cloud providers when considering placing this landscape in the cloud.

  • What backups must be performed? Some cloud providers might include daily, weekly, incremental or no backups.  They may include raw image backups vs. database aware backups. Just make sure that whatever backups you require are supported and included in the price for the cloud services.
    • Should your corporate backup solution be used? For flexibility reasons as well as visibility, you may prefer to use a backup solution that you have already tested and approved, provided it works in the cloud environment.  Or, perhaps, it is an audit requirement.
    • Are extra server(s) required for backup solution? Your security and audit departments may not permit your backups to share infrastructure, including the network, with any other clients in a provider’s cloud environment.  One or more servers with or without dedicated network infrastructure may be required.
    • How quickly must backups be performed and restored, and what is the RTO after database corruption? SAP HANA backups can generally take their time as long as the aggregate transfer rate is sufficient to back up the entire database prior to the next backup (see the quick throughput sketch after this list).  Well, that is unless you want to be able to restore to the prior day in the event of database corruption, in which case you may want the backup finished prior to a specific time on the same day.   Just make sure you think about this and have the infrastructure necessary to meet your backup speed included in the price.  Even more important is how quickly the backup can be restored, as well as what services are offered to restore the backups and roll forward any logs created since that backup was initiated, i.e. the RTO for getting back up and running.
    • Will backups be available to DR systems? Not to be overlooked are backups in DR.  Not only will you want to be able to take backups in DR, but you would also need to be able to restore from a backup in the primary site to the DR site, not so easily done if the primary site is truly down and unavailable.  This means that you would need the backup server to have bi-directional replication with DR site as well as testing to ensure this works correctly.  What incremental costs are required for the replicated backup bandwidth?
  • Security – First a disclaimer. I am not a security expert, so I may only be addressing a subset of the real requirements.
    • How will corporate single sign-on operate with the cloud solution? Whether you use Microsoft Active Directory, CA SSO, IBM Tivoli Access Manager or one of the dozens of other products on the market, you are probably using this solution to authenticate and authorize users in SAP.  Make sure it can integrate with the potential S/4HANA system in the cloud.  Make sure that your security administrators can control policies, assign and revoke privileges and audit as necessary.
    • Must communications to/from the cloud be encrypted and what solution will be used? We all know that hackers want to access your data for malicious reasons, financial gain or industrial espionage.  Do you want your keystrokes and data to and from the cloud to be transmitted in clear text?  If not, which solution will you use and how might the use of that solution impact performance?  How about between application servers and database servers at the cloud provider?
    • How will data stored in the cloud be secured? It is one thing to have your personal email stored on storage devices shared with millions of other users, but do your corporate policies allow for corporate databases to be located on storage devices that are shared with other customers?  If not, do you require dedicated devices, of what kind and at what cost?
    • How will backups be secured? We touched on backups earlier, but this is now specific to the physical media on which those backups are stored, not to mention replicated to the DR site, as well as any external media that you might require, e.g. tapes, DVDs or removable disks.  How can you be assured that no one makes a copy, removes a disk, etc.?
  • What are the non-production requirements? All of the above was just talking about production, but most customers have an even more extensive non-production landscape.  Many, if not most, of those same questions can be applied to non-production.  Remember, there are few employees that command a higher salary than your developers, whether internal or external.  They create corporate intellectual property and often work with copies of production data.  Their workloads vary based on project demands, phases of implementation or problems to be addressed.  Many customers utilize DR capacity or underutilized capacity on HA systems to address non-prod requirements, however this may not be an option in a cloud environment, or if it is, at what cost?
    • How will images be created/copied, managed, isolated and secured? You may use SAP LaMa (Landscape Management, previously known as Landscape Virtualization Management (LVM)), backup/restore, disk replication, TDMS, BDLS and/or custom scripts to populate non-prod systems.  Will those tools and techniques work in the cloud and at what cost?
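
Returning to the backup-speed question above, the minimum sustained throughput is simply the database size divided by the allowed backup window. Here is a minimal sketch using hypothetical sizes and an assumed 8-hour window; the numbers are purely illustrative, not recommendations.

```python
def required_backup_rate_gb_per_s(db_size_tb, window_hours):
    """Minimum sustained throughput (GB/s) to back up the database within the window."""
    return db_size_tb * 1024 / (window_hours * 3600)

# Hypothetical examples only: 9TB and 16TB HANA databases, 8-hour backup window
for size_tb in (9, 16):
    rate = required_backup_rate_gb_per_s(size_tb, 8)
    print(f"{size_tb}TB in 8 hours needs ~{rate:.2f} GB/s sustained")
# 9TB -> ~0.32 GB/s and 16TB -> ~0.57 GB/s, before any compression,
# incrementals or log backups are taken into account.
```

The same arithmetic, run against the restore throughput a provider actually delivers, gives a rough lower bound on the restore time, which feeds directly into the RTO discussion above.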

 

The last part of this discussion will deal with migration challenges when moving to the cloud and lastly, a few of the reasons that are often used to justify a move to the cloud.

May 5, 2017

Is your company ready to put S/4HANA into the cloud? – Part 2

And now, the details and rationale behind the questions posed in Part 1.

  • What is the expected memory size for the HANA DB? Your HANA instances may fit comfortably within the provider’s offerings, may force a bare-metal option, or may not fit within the provider’s offerings at all.  Equally important is expected growth, as you may start within one tier and end in another, or may become unable to fit in a provider’s cloud environment.
  • What are your performance objectives and how will they be measured/enforced? This may not be that important for some non-production environments, but production is used to run part, or all, of a company.  The last thing you want is to find out that transaction performance is not measured or that no enforcement exists when an objective is missed.  Even worse, what happens if these are measured, but only up to the edge of the provider’s cloud, not inclusive of WAN latency?  Sub-second response time is usually required, but if the WAN adds 0.5 seconds, your end users may not find this acceptable.  How about if the WAN latency varies?  The only thing worse than poor performance is unpredictable performance.
    • Who is responsible for addressing any performance issues? No one wants finger pointing so is the cloud provider willing to be responsible for end-user performance including WAN latency and at what cost?
    • Is bare-metal required or, if shared, how much overhead and how much over-commitment? One of the ways that some cloud providers offer a competitive price is by using shared infrastructure, virtualized with VMware or PowerVM for example.  Each of these has different limits and overhead, with VMware noted by SAP as having a minimum of 12% overhead and PowerVM with 0%, as the benchmarks were run under PowerVM to start with.  Likewise, VMware environments are limited to 4TB per instance, and often multiple different instances may not run on shared infrastructure based on a very difficult to understand set of rules from SAP.  PowerVM has no such limits or rules and allows up to 8 concurrent production instances, each up to 16TB for S/4 or SoH, up to the physical limits of the system.  If the cloud provider is offering a shared environment, are they running under SAP’s definition of “supported” or are they taking the chance and running “unsupported”?  Lastly, if it is a shared environment, is it possible that your performance or security may suffer because of another client’s use of that shared infrastructure?
  • What availability is required? 99.8%?  99.9%?  99.95%?  4 nines or higher?  Not all cloud providers can address the higher limits, so you should be clear about what your business requires (a quick conversion of these levels into allowable downtime per year appears after this list).
  • Is HA mandatory? HA is usually an option, at a higher price.  The type of HA you desire may, or may not, be offered by each cloud provider.  Testing of that HA solution periodically may, or may not be offered so if you need or expect this, make sure you ask about it.
    • For HA, what are the RPO, RTO and RTP time limits? Not all HA solutions are created equal.  How much data loss is acceptable to your business and how quickly must you be able to get back up and running after a failure?  RTP is a term that you may not have heard too often; it refers to “Return to Processing”, i.e. it is not enough to get the system back to a point of full data integrity and ready to work, but the system must be at the point the business expects, with a clear understanding of which transactions have or have not been committed.  Imagine a situation where a customer places an order or pays a bill but it gets lost, or where you pay a supplier and mistakenly pay them a second time.
  • Is DR mandatory and what are the RPO, RTO and RTP time limits? The same rationale applies to these questions as for HA; once again, DR, when available, is always offered at an additional charge and is highly dependent on the type of replication used, with disk-based replication usually less expensive than HANA System Replication but with a longer RTO/RTP.
    • Incremental costs for DR replication bandwidth? Often overlooked are the network costs for replicating data from the primary site to the DR site, but this is clearly a line item that should not be missed.  Some customers may decide to use two different cloud providers for primary and DR, in which case not only may pricing for each be different, but WAN capacity may be even more critical and pricey.
    • Disaster readiness assessment, mock drills or full, periodic data center flips? Having a DR site available is wonderful, provided that everything works correctly when you actually need it.  As this is an entire discussion unto itself, let it be said that every business recovery expert will tell you to plan and test thoroughly.  Make sure you discuss this with a potential cloud provider and have the price to support whatever you require included in their bid.
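
To put the availability levels mentioned above into concrete terms, here is a small sketch converting an availability percentage into allowable downtime per year; the percentages are the ones discussed in the list, and the conversion itself is simple arithmetic.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours

def downtime_per_year_hours(availability_pct):
    """Allowable downtime per year, in hours, for a given availability percentage."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)

for pct in (99.8, 99.9, 99.95, 99.99):
    hours = downtime_per_year_hours(pct)
    print(f"{pct}% availability allows ~{hours:.1f} hours (~{hours * 60:.0f} minutes) of downtime per year")
# 99.8% -> ~17.5 h, 99.9% -> ~8.8 h, 99.95% -> ~4.4 h, and 4 nines -> ~53 minutes per year.
```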

 

I said this would be a two part post, but there is simply too much to include in only 2 parts, so the parts will go on until I address all of the questions and issues.

May 4, 2017

What should you do when your LoBs say they are not ready for S/4HANA – part 2, Why choice of Infrastructure matters

Conversions to S/4HANA usually take place over several months and involve dozens to hundreds of steps.  With proper care and planning, these projects can run on time, within budget and result in a final Go-Live that is smooth and occurs within an acceptable outage window.  Alternately, horror stories abound of projects delayed, errors made, outages far beyond what was considered acceptable by the business, etc.  The choice of infrastructure to support a conversion may be the last thing on anyone’s mind, but it can have a dramatic impact on achieving a successful outcome.

A conversion includes running many pre-checks[i], which can run for quite a while[ii], meaning they can drive CPU utilization to high levels for a significant duration and impact other running workloads.  As a result, consultants routinely recommend that you make a copy of any system, especially production, against which you will run these pre-checks.  SAP recommends that you run these pre-checks against every system to be converted, e.g. Development, Test, Sandbox, QA, etc.  If those systems are being used for on-going work, it may be advisable to also make copies of them and run those copies on other systems or within virtual machines that can be throttled to avoid causing performance issues for other co-resident virtual machines.

In order to find issues and correct them, conversion efforts usually involve a phased approach with multiple conversions of supporting systems, e.g. Dev, Test, Sandbox, QA, using a tool such as SAP’s Software Update Manager with the Database Migration Option (SUM w/DMO).  One of the goals of each run is to figure out how long it will take and what actions need to be taken to ensure the Go-Live production conversion completes within the required outage window, including any post-processing, performance tuning, validation and backups.

In an attempt to keep expenses low, many customers will choose to use existing systems or VMs in addition to a new “target” system, or systems if HA is to be tested.  This means that the customer’s network will likely be used in support of these connections.  Taken together, the use of shared infrastructure components means that events elsewhere on those shared components can impact these tests and activities.  For example, if a VM is used but not enough CPU or network bandwidth is provided, the duration of the test may extend well beyond what is planned, meaning more cost for per-hour consulting, and may not provide the insight into what needs to be fixed and how long the actual migration may take.  How about if you have plenty of CPU capacity, or even a dedicated system, but the backup group decides to initiate a large database backup at the same time and on the same network that your migration test is using?  Or maybe you decide to run a test at a time that another group, e.g. operations, needs to test something that impacts it, or when new equipment or firmware is being installed and modifications to shared infrastructure are occurring, etc.  Of course, you can have good change management and carefully arrange when your conversion tests will occur, which means that you may have restricted windows of opportunity at times that are not always convenient for your team.

Let’s not forget that the application/database conversion is only one part of a successful conversion.  Functional validation tests are often required which could overwhelm limited infrastructure or take it away from parallel conversion tasks.  Other easily overlooked but critical tasks include ensuring all necessary interfaces work; that third party middleware and applications install and operate correctly; that backups can be taken and recovered; that HA systems are tested with acceptable RPO and RTO; that DR is set up and running properly also with acceptable RPO and RTO.  And since this will be a new application suite with different business processes and a brand new Fiori interface, training most likely will be required as well.

So, how can the choice of infrastructure make a difference to this almost overwhelming set of issues and requirements?  It comes down to flexibility.  Infrastructure which is built on virtualization allows many of these challenges to be easily addressed.  I will use an existing IBM Power Systems customer running Oracle, DB2 or Sybase to demonstrate how this would work.

The first issue dealt with running pre-checks on existing systems.  If those existing systems are Power Systems and enough excess capacity is available, PowerVM, the IBM virtualization hypervisor, allows a VM to be started with an exact copy of production, which can then be passed through normal post-processing, such as BDLS, to create a non-production copy, or simply cloned and placed behind a network firewall.  This VM can be fenced off logically and throttled such that production running on the same system is always given preference for CPU resources.  By comparison, a similar database located on an x86 system would likely not be able to use this process, as the database is usually running on a bare-metal system for which no VM can be created.

Alternately, for Power Systems, the exact same process can be used to carve out a VM on a new HANA target system, and this is where the real value starts to emerge.  Once a copy or clone is available on the target HANA on Power system, as much capacity can be allocated to the various pre-checks and related tasks as needed, without any concern for the impact on production or any need to throttle these processes, thereby optimizing the duration of these tasks.  On the same system, HANA target VMs may be created.  As mock conversions take place, an internal virtual network may be used.  Not only is such a network faster by a factor of 2 or more, but it is dedicated to this single purpose and completely unaffected by anything else going on within the datacenter network.  No coordination is required beyond the conversion team, which means that there is no externally imposed delay to begin a test, no constraint on how long such a test may take nor, for that matter, on how many times such a test may be run.

The story only gets better.  Remember, SAP suggests you run these pre-checks on all non-prod landscapes.  With PowerVM, you may fire up any number of different copy/clone VMs and/or HANA VMs.  This means that as you move from one phase to the next, from one instance to the next, and from conversion to production, you can run a static validation environment while other tasks continue, conduct training classes, or run many different phases for different projects at the same time; PowerVM enables the system to respond to your changing requirements.  This helps avoid the need to purchase extra interim systems, letting you buy when needed rather than significantly ahead of time, as the inflexibility of other platforms often requires.  You can even simulate an HA environment to allow you to test your HA strategy without needing a second system, up to the physical limits of the system, of course.  This is where a tool like SAP’s TDMS, Test Data Migration Server, might come in very handy.

And when it comes time for the actual Go-Live conversion, the running production database VM may be moved live, without any downtime, from the “old” system to the “new” system, and the migration may then proceed over the virtual, in-memory network at the fastest possible speed and with all external factors removed.  Of course, if the “old” system is based on POWER8, it may then be used or upgraded for other HANA purposes.  Prior-generation Power Systems as well as current-generation POWER8 systems can be used for a wide variety of other purposes, both SAP and non-SAP.

Bottom line: The choice of infrastructure can help you eliminate external influences that cause delays and complications to your conversion project, optimize your spend on infrastructure, and deliver the best possible throughput and lowest outage window when it comes to the Go-Live cut-over.  If complete control over your conversion timeline was not enough, avoidance of delays keeps costs for non-fixed cost resources to a minimum.  For any SAP customer not using Power Systems today, this flexibility can provide enormous benefits, however the process of moving between systems would be somewhat different.  For any existing Power Systems customer, this flexibility makes a move to HANA on Power Systems almost a no-brainer, especially since IBM has so effectively removed TCA as a barrier to adoption.

[i] https://blogs.sap.com/2017/01/20/system-conversion-to-s4hana-1610-part-2-pre-checks/

[ii] https://uacp.hana.ondemand.com/http.svc/rc/PRODUCTION/pdfe68bfa55e988410ee10000000a441470/1511%20001/en-US/CONV_OP1511_FPS01.pdf page 19

April 26, 2017

HANA on Power hits the Trifecta!

Actually, a trifecta would imply only 3 big wins at the same time, and HANA on Power Systems just hit 4 such big wins.

Win 1 – HANA 2.0 was announced by SAP with availability on Power Systems at the same time as on Intel-based systems.[i]  Previous announcements by SAP had indicated that Power was now on an even footing with Intel for HANA from an application support perspective; however, until this announcement, some customers may have still been unconvinced.  I noticed this on occasion when presenting to customers: I would make such an assertion and see a little disbelief on some faces.  This announcement leaves no doubt.

Win 2 – HANA 2.0 is only available on Power Systems with SUSE SLES 12 SP1 in Little Endian (LE) mode.  Why, you might ask, is this a “win”?  Because true database portability is now a reality.  In LE mode, it is possible to pick up a HANA database built on Intel, make no modifications at all, and drop it on a Power box.  This removes a major barrier to customers that might have considered a move but were unwilling to deal with the hassle, time requirements, effort and cost of an export/import.  Of course, the destination will be HANA 2.0, so an upgrade from HANA 1.0 to 2.0 on the source system will be required prior to a move to Power among various other migration options.   This subject will likely be covered in a separate blog post at a later date.  This also means that customers that want to test how HANA will perform on Power compared to an incumbent x86 system will have a far easier time doing such a PoC.

Win 3 – Support for BW on the IBM E850C @ 50GB/core, allowing this system to now support 2.4TB.[ii]  The previous limit was 32GB/core, meaning a maximum size of 1.5TB.  This is a huge, 56% improvement, which means that this already very competitive platform has become even stronger.

Win 4 – Saving the best for last, SAP announced support for Suite on HANA (SoH) and S/4HANA of up to 16TB with 144 cores on IBM Power E880 and E880C systems.[ii]  Several very large customers were already pushing the previous 9TB boundary and/or had run the SAP sizing tools and realized that more than 9TB would be required to move to HANA.  This announcement now puts IBM Power Systems on an even footing with HPE Superdome X.  Only the lame-duck SGI UV 300H has support for a larger single image size @ 20TB, but not by much.  Also notice that to get to 16TB, only 144 cores are required for Power, which means that there are still 48 cores unused in a potential 192-core system, i.e. room for growth to a future limit once appropriate KPIs are met.  Consider that the HPE Superdome X requires all 16 sockets to hit 16TB … which makes you wonder how they will achieve a higher size prior to a new chip from Intel.

Win 5 – Oops, did I say there were only 4 major wins?  My bad!  Turns out there is a hidden win in the prior announcement, easily overlooked.  Prior to this new, higher memory support, a maximum of 96GB/core was allowed for SoH and S/4HANA workloads.  If one divides 16TB by 144 cores, the new ratio works out to 113.8GB/core or an 18.5% increase.  Let’s do the same for HPE Superdome X.  16 sockets times 24 core/socket = 384 cores.  16TB / 384 cores = 42.7GB/core.  This implies that a POWER8 core can handle 2.7 times the workload of an Intel core for this type of workload.  Back in July, I published a two-part blog post on scaling up large transactional workloads.[iii]  In that post, I noted that transactional workloads access data primarily in rows, not in columns, meaning they traverse columns that are typically spread across many cores and sockets.  Clearly, being able to handle more memory per core and per socket means that less traversing is necessary resulting in a high probability of significantly better performance with HANA on Power compared to competing platforms, especially when one takes into consideration their radically higher ccNUMA latencies and dramatically lower ccNUMA bandwidth.
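
For anyone who wants to reproduce the Win 5 arithmetic, here is a minimal sketch using only the figures cited above (16TB across 144 POWER8 cores, 16TB across 16 sockets of 24 Broadwell-EX cores on Superdome X, and the prior 96GB/core limit); the script itself is purely illustrative.

```python
GB_PER_TB = 1024

def gb_per_core(total_tb, cores):
    """Supported memory per core in GB for a given system configuration."""
    return total_tb * GB_PER_TB / cores

power_e880 = gb_per_core(16, 144)        # 16TB across 144 POWER8 cores
superdome_x = gb_per_core(16, 16 * 24)   # 16TB across 16 sockets x 24 cores

print(f"Power E880/E880C: {power_e880:.1f} GB/core")         # ~113.8
print(f"Superdome X:      {superdome_x:.1f} GB/core")        # ~42.7
print(f"Ratio:            {power_e880 / superdome_x:.1f}x")  # ~2.7
print(f"Increase over the prior 96 GB/core limit: {(power_e880 / 96 - 1):.1%}")  # ~18.5%
```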

Taken together, these announcements have catapulted HANA on IBM Power Systems from being an outstanding option for most customers, but with a few annoying restrictions and limits especially for larger customers, to being a best-of-breed option for all customers, even those pushing much higher limits than the typical customer does.

[i] https://launchpad.support.sap.com/#/notes/2235581

[ii] https://launchpad.support.sap.com/#/notes/2188482

[iii] https://saponpower.wordpress.com/2016/07/01/large-scale-up-transactional-hana-systems-part-1/

December 6, 2016

ASUG Webinar next week – Scale-Up Architecture Makes Deploying SAP HANA® Simple

On October 4, 2016, Joe Caruso, Director – ERP Technical Architecture at Pfizer, will join me presenting on an ASUG Webinar.  Pfizer is not only a huge pharmaceutical company but, more importantly, has implemented SAP throughout their business including just about every module and component that SAP has to offer in their industry.  Almost 2 years ago, Pfizer decided to begin their journey to HANA starting with BW.  Pfizer is a leader in their industry and in the world of SAP and has never been afraid to try new things including a large scale PoC to evaluate scale-out vs. scale-up architectures for BW.  After completion of this PoC, Pfizer made a decision regarding which one worked better, proceeded to implement BW HANA and had their go-live just recently.  Please join us to hear about this fascinating journey.  For those that are ASUG members, simply follow this link.

If you are an employee of an ASUG member company, either an Installation or Affiliate member, but not registered for ASUG, you can follow this link to join at no cost.  That link also offers the opportunity for companies to join ASUG, a very worthwhile organization that offers chapter meetings all over North America, a wide array of presentations at their annual meeting during Sapphire, the BI+Analytics conference coming up in New Orleans, October 17 – 20, 2016, hundreds of webinars, not to mention networking opportunities with other member companies and the ability to influence SAP through their combined power.

This session will be recorded and made available at a later date for those not able to attend.  When the link to the recording is made available, I will amend this blog post with that information.

September 29, 2016

SAP HANA on Power @ IBM Edge

SAP HANA on Power is building momentum at a breakneck pace at IBM, and IBM’s commitment to HANA is quite apparent when you look at the amount of time and attention being dedicated to this topic at IBM’s premier systems event, IBM Edge 2016.  The information presented below is copied from a newsletter that my colleague, Bob Wolf, shared with his distribution list.

________________________

In an odd coincidence, SAP, IBM, and even Oracle are all holding major customer conferences in the same week in September.  IBM’s Edge 2016 conference will be held in Las Vegas on September 19-22 at the MGM Grand. It will feature over 1,000 technical sessions revolving around infrastructure, Linux, cloud, and yes, SAP.

There will be a total of 25 sessions related to SAP at the Edge 2016 conference. Many of these sessions will discuss various aspects of, and customer experiences with, HANA on Power. There will also be SAP-related sessions covering cloud, IBM Flash and NVMe storage, as well as DB2. In addition to IBM presenters, a number of HANA on Power sessions will be presented by customers including Bosch, Ctac, CenturyLink, CPFL Energia, Deloitte, and South Shore. Two of IBM’s business partners, Meridian and Ciber, will each be presenting a session.

Many of the IBM speakers covering the SAP topics will be shuttling back and forth between the IBM Edge conference at the MGM Grand and SAP’s TechEd conference at the Venetian. If you see any of the presenters wearing running shoes, rollerblades, or riding on a unicycle or hoverboard, they are not trying to make a fashion statement as much as they are just trying to run the 2 miles between the two conferences.

In case you would like to attend IBM’s Edge 2016 conference and haven’t registered already, here are some links for more information on the conference, including how to register. Further down are the descriptions of the 25 SAP-related sessions.

Attend IBM Edge 2016 and join the IT professionals, IT leaders and business leaders who will design, build and deliver infrastructure for the cognitive era.

Here’s what you won’t want to miss:

  • Over 1,000 technical sessions, demonstrations, deep dives, certifications, and labs
  • Birds of a Feather panel discussions to learn from the experiences of fellow IT professionals
  • Client success stories and provocative presentations from industry thought leaders
  • Networking opportunities with experts, peers and solution providers in our state of the art Solution EXPO
  • “Edge at Night” on Tuesday evening:
    • Enjoy a concert by Train – the GRAMMY Award-winning band.
    • Before and after the concert, meet with other attendees pool-side for hors d’oeuvres, beverages and networking.

Register now!

Attached below are short descriptions of all 25 of the SAP-related sessions featured at Edge 2016.

1) PAD-1623: Architecting SAP HANA for Availability, Flexibility and Scalability to Handle Large Databases
SAP HANA is one of the hottest products around, and no system offers better characteristics to enable large-scale transaction processing and concurrent analytics than IBM Power Systems. This session will offer a deep-dive into the issues that must be addressed to deliver maximum availability while keeping costs under control through effective use of PowerVM. Of significant importance is the dilemma that many very large customers must consider: how to determine the best large-scale system for SAP Business Suite 4 SAP HANA (or SAP S/4HANA) lacking any specific benchmarks. This session will provide a methodology by which to evaluate both IBM and non-IBM systems for this purpose.
Speakers
Alfred Freudenberger, IBM

2) PAD-1922: Benefits of Deploying SAP HANA on IBM Power Systems and Future Roadmap
The joint program between IBM and Bosch will present a case study highlighting the benefits of running SAP HANA on IBM Power Systems. Bosch will review why they chose IBM Power Systems as their platform of choice, their experience with the platform, and the benefits they received. Following this, we will present the roadmap for SAP HANA on Power to showcase what you can expect in the near future.
Speakers
Sara Cohen, IBM
Erik Thorwirth, BOSCH

3) PAD-1921: Client Roundtable: SAP HANA on IBM Power Systems Experiences
Join this roundtable of clients who have deployed IBM Power Systems for SAP HANA. The panelists will discuss the benefits they have already achieved by using Power as the flexible infrastructure for HANA deployments, and why they chose Power to replace existing x86 servers running HANA. Attendees will also hear views on market trends and future joint IBM/SAP plans for HANA.
Speakers
Vicente Moranta, IBM
Niek Verhaar, Ctac N.V.
Erik Thorwirth, BOSCH
Volker Fischer, BOSCH
Claude Bernier, South Shore

4) CCL-2387: Create a Fully Managed SAP HANA Environment in Minutes!
Participate in an interactive demo that walks you through entering a provisioning request process for a fully managed SAP Cloud environment. Then work with the IBM SAP on Cloud Benefits Estimator tool, where you can gauge your potential benefits in the areas of economics, availability, customer reach, innovation and security. See how you can deploy SAP HANA quicker than you ever thought possible.
Speakers
Brian Burke, IBM

5) PAD-1163: Customer Experiences with Installing and Running SAP HANA on Linux on IBM Power Systems
Based on over a dozen customer installs of SAP HANA on Linux on IBM Power Systems, this talk will discuss: why customers choose Linux on Power for HANA; what customers like about Linux on Power for HANA; what customers are using for their High Availability solution; what storage strategies customers have implemented; what applications customers are running; and what their production workloads look like. Come find out what to plan for with SAP HANA on Power, and how to architect a successful solution.
Speakers
Kurt Koehle, IBM

6) PAD-1509: Dynamic Power Cloud Manager: Concurrent Hybrid Operation of AIX, IBM i, Linux and SAP HANA on Power
In 2014, FRITZ & MACZIOL won an IBM Beacon Award for “Outstanding IT Transformation Solution – Power Systems” for the Dynamic Power Cloud Manager (DPCM). This comprehensive administration and automation solution for IBM Power environments enables you to manage and automate AIX, IBM i, Linux and even SAP HANA concurrently on one or more IBM Power systems via the same, convenient GUI. It can deliver significant cost and time savings, reduce errors—and in case of failure, a restore takes only a few mouse-clicks. Attend this session to learn how easy it can be to manage Power environments in this “smarter” way, and hear about its potential compliance, cost reduction and standardization benefits.
Speakers
Rainer Schilling, FRITZ & MACZIOL Software und Computervertrieb GmbH

7) PAD-2543: Enterprise HANA
SAP HANA on IBM Power Systems offers enterprise virtualization capabilities such as capacity on-demand and dynamic resizing of compute and memory resources for on-premise and cloud deployments. These capabilities, along with Live Partition Mobility, high availability and disaster recovery features, bring unprecedented flexibility and allow customers to achieve lower TCO. Come hear client perspectives on these benefits.
Speakers
Niek Verhaar, Ctac N.V.
Erik Thorwirth, BOSCH
Mysore S. Srinivas, IBM

8) PAD-1044: Extended Solutions for SAP HANA and IBM Power Systems Landscapes
This presentation will provide an extended overview of the most important enterprise solutions encapsulating the SAP HANA and IBM Power and Storage landscapes. It will discuss the portfolio of architectural elements needed to increase enterprise-class high availability and simplify disaster recovery using different data replication methods or more common backup/restore/recovery solutions. This session is an extension of the session “Optimal Deployment of SAP HANA on IBM Power and IBM Storage Platforms Based on Customer Experience.”
Speakers
Damir Rubic, IBM

9) PAD-2509: How CenturyLink Optimized Its SAP and SAP HANA Deployments with a Hybrid SAP Infrastructure
At CenturyLink, we strategically leveraged a hybrid SAP infrastructure on IBM Power Systems, giving us investment protection in addressing today’s SAP business requirements while positioning us for our future SAP HANA requirements. We weren’t exactly sure when we would be ready for SAP HANA; however, we knew we needed to add capacity and upgrade our current SAP environment. Our POWER8 solution gave us flexibility: regardless of the direction and timing we chose, it allowed us to run our traditional SAP workload while giving us the ability to evaluate benefits of HANA on POWER8. Once the decision was made to move ECC to HANA, the infrastructure investments offered significant value over the alternatives.

Speakers
Odell Riley, CenturyLink
Connie Walden, CenturyLink

10) WET-2626: How to Choose the Best Cloud Provider to Maximize the Value of Your SAP Deployment
More and more companies are moving their SAP systems to the cloud—and they’re turning to third-party experts for help. But you can’t risk partnering with the wrong cloud services provider (CSP). How do you choose the right CSP with the right expertise for your business, SAP landscape and workloads, while ensuring that all stakeholder concerns (CIO, CFO, LOB managers and SAP project leads) are addressed? You’ll walk away from this session with a list of what to look for when choosing a CSP, what questions to ask, and the determining factors to help you make that decision.
Speakers
Brian Burke, IBM
John Harris, IBM

11) CDE-1383: IBM POWER8 and DB2 BLU: The Foundation of a Cognitive Analytics Platform
In this session, we will share a real-world customer example showcasing the advantages of IBM POWER8 in combination with IBM DB2 BLU. We will discuss the decision process and consider alternatives, such as SAP HANA and other options. We also will provide information on the sizing process and the architectural planning, including TCO/ROI and delivery considerations. In addition, we will explore other advantages, such as IBM PowerHA, AIX Enterprise Edition (EE) and Capacity on Demand (CoD), which provided additional value. The solution was deployed on four IBM Power System E850 servers.
Speakers
Ulrich Walter, IBM

12) PAD-2305: Implementing S/4HANA on IBM Power Systems to Maximize Performance for Analytics and Big Data
This session covers the current implementation of S/4HANA and reviews lessons learned on the IBM Power Systems platform for analytics. We will also share the performance benefits of the Power Systems platform for enterprise SAP workloads that can help reduce cost and increase productivity. In addition, we will describe how the virtualization built into the Power Systems architecture can allow multiple discrete production instances in virtual machines on the same physical server, thus reducing TCO. Finally, we will review the current state of SAP on Power Systems in the IBM cloud, and its future roadmap.
Speakers
Balbir Wadhwa, Deloitte
James Seaman, IBM

13) PAD-2342: Leverage the Full Capabilities of IBM Power Systems to Run SAP HANA on a Flexible, In-Memory Cloud
The session will offer insight into how to run SAP HANA environments on a cloud infrastructure built with IBM PowerVM and IBM hardware. Find out why you should choose IBM technology to create a highly scalable platform that is optimized to run in-memory SAP solutions, and what the design and architecture concepts look like. You’ll also hear an overview of how Ctac built their platform with IBM components, the project approach (smooth cooperation with IBM), challenges and necessary SAP performance tests. Ctac will also share the capabilities of their platform and how they can cope with scalability and capacity on-demand.
Speakers
Niek Verhaar, Ctac N.V.

14) SRA-2554: Leverage the Full Capabilities of IBM Technology to Run SAP HANA on a Flexible In-Memory Cloud
Ctac, a business consultant and cloud provider in the Netherlands, wanted to enhance analytics support for its customers with an in-memory cloud solution that could deliver high performance and simple scalability. This session will give attendees a technical deep-dive into how to run SAP HANA environments on a cloud infrastructure built on IBM PowerVM and IBM FlashSystem. We will describe how to design the architecture of a highly scalable platform that is optimized to run in-memory SAP solutions, share our business challenges and project approach, present the necessary SAP performance test, and demonstrate the capabilities that help us cope with scalability and capacity changes on-demand.
Speakers
Eric Sperley, IBM
Niek Verhaar, Ctac N.V.

15) PAD-1243: Major Client SAP Landscape Transformation to SAP HANA on POWER8 in a Private Cloud Environment
This talk describes a first-of-a-kind SAP HANA on IBM Power Systems private cloud solution in the datacenter of a large Automotive customer. The solution includes a fully automated installation of Linux LPARs through PowerVC, as well as the unattended installation of SAP HANA with the required storage, CPU and memory. Now the customer’s development team can deploy SAP HANA installations on-demand. As an orchestration tool, we use IBM Cloud Orchestrator. For the infrastructure, the client chose IBM Power System E880 servers and IBM Elastic Storage Server (ESS) with an InfiniBand network. The session will also cover different use cases we implemented to modify the deployed SAP HANA installations (SAP BW, SAP Business Suite on HANA).
Speakers
Carsten Dieterle, IBM
Dietmar Wierzimok, IBM

16) PBD-1262: Meeting Ultra Stringent HA and DR SLAs for an SAP Ecosystem with IBM Power Systems and Storage
A global manufacturing company needed its SAP ecosystem to be available 24×7, with SLAs of one hour for disaster recovery (DR) and ten minutes for high availability (HA). Hear a real-world case study on how these requirements were met using IBM Power System S824 servers and IBM Storwize V7000 storage–while lowering TCO. After the rollout the client successfully ran a live DR drill on the production environment within the defined SLA. This IBM solution is now a reference for this client’s sister companies.
Speakers
Hoson Rim, Meridian IT

17) PAD-2545: NVMe Saves Customers Time and Money
New NVMe non-volatile memory technology on IBM Power Systems makes SAP HANA more efficient by saving customers time and money when restarting HANA databases. SAP HANA databases on Power Systems now leverage the latest in server-side flash caching technology that revolutionizes server I/O performance and reliability.
Speakers
Vicente Moranta, IBM

18) PAD-1043: Optimal Deployment of SAP HANA on IBM Power and IBM Storage Platforms Based on Customer Experience
The presentation will provide a detailed description of the core SAP HANA architectural elements and use cases that are being considered for optimal deployment on the IBM Power Systems platform. It will also discuss the infrastructure architectural building blocks utilized to achieve an accelerated implementation and improved performance synergy between SAP HANA and IBM Linux on Power-based enterprise landscapes.

Speakers
Damir Rubic, IBM

19) PAD-1045: Planning for a Successful SAP HANA on IBM Power Systems Proof of Concept
This presentation consolidates the basic and advanced SAP HANA and IBM Power Systems architectural elements described in other Edge 2016 sessions into the optimal set of steps required for the successful execution of a Proof of Concept (PoC) for this enterprise landscape.
Speakers
Damir Rubic, IBM

20) PAD-1353: Running SAP Faster on FlashSystem and POWER8
Facing a 25% increase in the number of customers, CPFL Energia was close to disrupting its business operations. See how POWER8 plus IBM FlashSystem reduced the overnight billing process from eight hours to five hours (a reduction of 37.5%), which released the capacity to receive two million additional customers and improved call center responsiveness. Attendees will see how a high-performance computing solution can turn a problem into a revolution within the enterprise. Also, you will gain insight into how to extract the maximum from your infrastructure to obtain better SAP HANA performance.
Speakers
Marcio Felix, CPFL Energia
Tiago Machado, CPFL Energia

21) PAD-1112: SAP HANA Data Protection Capabilities and Customer Experiences
HANA is SAP’s strategic in-memory computing platform that is designed to deliver better performance for analytic and transactional applications. HANA’s platforms and business environments continue to expand. IBM’s Spectrum Protect (formerly TSM) has a long tradition of protecting SAP databases. SAP has announced HANA on the IBM Power Systems platform, and again IBM Spectrum Protect was the first partner on the market to support this new platform. The number of reference customers and production accounts is constantly increasing on all HANA platforms. Come learn about this exciting solution, and hear about selected worldwide customer scenarios and best practices for meeting SAP HANA data protection challenges on Intel and Power platforms.
Speakers
Gerd Munz, IBM
Carsten Dieterle, IBM

22) PAD-2385: SAP HANA and IBM Power Systems: A Real-Time View into Your Business
The new SAP HANA on IBM Power Systems solution is ideal for IBM Power Systems users who are moving to SAP HANA, as well as for existing SAP HANA and SAP BI Accelerator users running on x86 who are looking to take advantage of the latest POWER8 technologies. Join us to hear IBM’s foremost expert on SAP HANA and IBM Power Systems share key information on: The SAP/IBM alliance–the best of both worlds as you move to SAP HANA; unique SAP HANA attributes delivered on IBM Power Systems, including performance, architecture, and low TCA/TCO; and a comparison of the on-premise IBM Power Systems for SAP HANA with cloud solutions for SAP HANA.
Speakers
Brian Burke, IBM
John Wise, IBM

23) PAD-2544: SAP S/4HANA on IBM Power Systems: Resiliency and Scalability
Memory resiliency is key for large, multi-terabyte, in-memory databases such as SAP Business Suite 4 SAP HANA (SAP S/4HANA). As system memory capacity grows, memory failure rates go up, resulting in unplanned outages of mission-critical deployments. IBM Power Systems offers resilient memory subsystems as good as mainframes with DRAM sparing and dynamic memory deallocation for predictive failures. Come hear more about Power Systems’ memory features, scalability and resiliency.
Speakers
Mysore S. Srinivas, IBM

24) PAD-1658: Technical and Financial Benefits of Running SAP HANA on IBM POWER8 Versus Intel
This session starts with an overview of what SAP supports with S/4HANA running SUSE Linux on IBM POWER8 servers, followed by an in-depth overview of how POWER8 technology delivers the solution for less with higher performance and greater availability. We will contrast the technical and financial benefits of IBM POWER versus Intel options to put them into perspective. Understanding the technical and financial advantages is critical for line-of-business, C-level and SAP stakeholders. Some stakeholders may overlook the benefit provided by the foundation of the SAP solution by viewing infrastructure as playing no role in the success of SAP, which can lead to overpaying while accepting an inferior solution.
Speakers
Brett Murphy, Ciber

25) PAD-2254: Why SAP HANA on IBM Power Systems is More Efficient and Cost-Effective than on Intel
This session demonstrates how SAP HANA on IBM Power Systems not only provides a higher quality of service, but also reduces IT cost. You will see how the specific platform characteristics of IBM Power Systems translate into cost savings compared to x86. The discussion will include detailed total cost of ownership examples from real client case studies covering SAP Business Suite and Business Warehouse applications that demonstrate efficiencies and savings to businesses.
Program: Accelerate Cloud Innovation with Power Systems
Track: /ENHANCE/ Big Data and In-Memory Analytics
Speakers
Christopher von Koschembahr, IBM

September 2, 2016 | Posted in Uncategorized

HPE acquisition of SGI: implications for HANA customers

Last week, HPE announced its intention to acquire SGI for $275M, a 30% premium over SGI’s market cap prior to the announcement.  This came as a surprise to most people, including me, both that HPE would want to do this and how little SGI was worth, a fact that eWeek called “Embarrassing”, http://www.eweek.com/innovation/hpe-buys-sgi-for-275-million-how-far-the-mighty-have-fallen.html.  This raises a whole host of questions that HANA customers might want to consider.

First and foremost, there would appear to be three possible reasons why HPE made this acquisition: 1) eliminate a key competitor and undercut Dell and Cisco at the same time, 2) acquire market share, or 3) obtain access to superior technology and resources and/or keep them out of the hands of a competitor; or some combination of the above.

If HPE considered SGI a key competitor, it means that HPE was losing a large number of deals to SGI, a fact that has not been released publicly but which would imply that customers are not convinced of Superdome X’s strength in this market.  As many are aware, both Dell and Cisco have resell agreements with SGI for their high end UV HANA systems.  It would seem unlikely that either Dell or Cisco would continue such a relationship with their arch nemesis, and as such, this acquisition will seriously undermine the prospects of Dell and Cisco to compete for scale-up HANA workloads such as Suite on HANA and S/4HANA among customers that may need more than 4TB, in the case of Dell, or 6TB, in the case of Cisco.

On the other hand, market share is a reasonable goal, but SGI’s total revenue in the year ending 6/26/15 was only $512M, which would barely be a rounding error against HPE’s revenue of $52B for the year ending 10/31/15.  It is hard to imagine that HPE could be this desperate for a potential 1% increase in revenue, and even that assumes 0% overlap in markets.  Of course, they compete in the same markets, so the likely revenue increase is considerably less than even that paltry number.

That brings us to the third option, that the technology of SGI is so good that HPE wanted to get its hands on it.  If that is the case, then HPE would be admitting that the technology in Superdome X is inadequate for the demands of Big Data and Analytics.  I could not agree more and made such a case in a recent post on this blog.  In that post, I noted the latency inherent in HPE’s minimum 8-hop round trip access to any off-board resources (remote memory accesses add another two hops); remember, there are only two Intel processors per board in a Superdome X system, which can accommodate up to 16 processors. Scale-up transactional workloads typically access data in rows dispersed across NUMA-aligned columns, i.e. they will constantly traverse this high latency network.  Of course, this is not surprising since the architecture used in this system is INCREDIBLY OLD, having been developed in the early 2000s, i.e. way before the era of Big Data.  But the surprising thing is that this would imply that HPE believes SGI’s architecture is better than its own.  Remember, SGI’s UV system uses a point to point, hand wired, 4-bit wide mesh of wires between every two NUMA ASICs in the system, which, for the potential 32 sockets in a single cabinet system, means 16 ASICs and 136 wires, if I have done the math correctly.

HPE has been critical of the memory protection employed in systems like SGI’s UV, which is based on SDDC (Single Device Data Correction).  In HPE’s own words about their DDDC+1 memory protection: “This technology delivers up to a 17x improvement in the number of DIMM replacements versus those systems that use only Single-Chip Sparing technologies. Furthermore, DDDC +1 significantly reduces the chances of memory related crashes compared to systems that only have Single-Chip Sparing capabilities” (HP Integrity Superdome X system architecture and RAS Whitepaper).  What is really interesting is that SDDC does not even imply Single-Chip Sparing.

The only one of those options which makes sense to me is the first one, but of course, I have no way of knowing which of the above is correct. One thing is certain; customers considering implementing HANA on a high-end scale-up architecture from either HPE or SGI, and Dell and Cisco by extension, are going to have to rethink their options.  HPE has not stated which architecture will prevail or whether it will keep both, which is hard to imagine but not out of the question either.  Without concrete direction from HPE, it is possible that a customer decision for either architecture could result in almost immediate obsolescence.  I would not enjoy being in the board room of a company, or meeting with my CFO, to explain how I made a decision for a multi-million dollar solution which is dead-ended, worth 1/2 or less overnight and for which a complete replacement will be required sooner rather than later.  Likewise, it would be hard to imagine basing an upcoming decision for a strategic system to run the company’s entire business on a single flip of a coin.

Now I am going to sound like a commercial for IBM, sorry.  There is an alternative which is rock solid, increasing, not decreasing, in value, and has a strong and very clear roadmap: IBM Power Systems.  POWER8 for HANA has been seeing one of the fastest market acceptance rates of any solution in recent memory, with hundreds of customers implementing HANA on Power Systems and/or purchasing systems in preparation for such an implementation, ranging from medium sized businesses in high growth markets to huge brand names.  The roadmap was revealed earlier this year at the OpenPower Summit, http://www.nextplatform.com/2016/04/07/ibm-unfolds-power-chip-roadmap-past-2020/.  This roadmap was further backed up with Google’s announcement of their plans for POWER9, http://www.nextplatform.com/2016/04/06/inside-future-google-rackspace-power9-system/, the UK STFC’s plans, http://www.eweek.com/database/ibm-powers-uk-big-data-research.html, worth £313M (roughly $400M based on current exchange rates), and the US DOE’s decision to base its Summit and Sierra supercomputers on IBM POWER9 systems, http://www.anandtech.com/show/8727/nvidia-ibm-supercomputers, a $375M investment, either of which is worth more than the entire value of SGI, interestingly enough.  More importantly, these two major wins mean IBM is now contractually obligated to deliver POWER9, thereby ensuring the Power roadmap for a long time to come.  And, of course, IBM has a long history of delivering mission critical systems to customers, evolving and improving the scaling and workload handling characteristics over time while simultaneously improving systems availability.

August 15, 2016 | Posted in Uncategorized

Large scale-up transactional HANA systems – part 2

Part 1 of this subject detailed the challenges of sizing large scale-up transactional HANA environments.  This part will dive into the details and methodology by which customers may select a vendor in the absence of an independent transactional HANA benchmark.

Past history with large transactional workloads

Before I start down this path, it would be useful to understand why this is relevant.  HANA transaction processing utilizes many of the same techniques as a conventional database.  It accesses rows; although each column is physically separate, the transaction does not know this and assembles all of the data in one place prior to presenting the results to the dialog step calling it.  Likewise, a write must follow ACID properties, including the rule that only one update against a piece of data can occur at any time, which requires cache coherency mechanisms to enforce.  And a write to a log must occur in addition to the update of the memory location holding the data being changed.  This sounds an awful lot like a conventional DB, which is why past history handling these sorts of transactional workloads makes plenty of sense as a selection criterion.
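To make that concrete, here is a toy sketch, in Python, of what a single-row insert looks like when each column is stored separately.  This is purely illustrative and is not HANA’s actual implementation, but it shows why a transactional write still involves serialization, a log entry and every column of the row, just as in a conventional database.

# Toy illustration only, not HANA code: a single logical row spans several
# physically separate column arrays, and a transactional write touches a
# serialization mechanism, a log entry and every one of those columns.
import threading

columns = {"customer": [], "amount": [], "status": []}  # each column stored separately
redo_log = []                                           # the log written alongside the data
write_lock = threading.Lock()                           # stand-in for coherency/serialization

def insert_row(row):
    # Only one writer may change a given piece of data at a time (ACID),
    # and the log entry is written in addition to the in-memory columns.
    with write_lock:
        redo_log.append(("insert", dict(row)))
        for name, values in columns.items():
            values.append(row[name])

def read_row(row_id):
    # A transactional read reassembles the row from every column before returning it.
    return {name: values[row_id] for name, values in columns.items()}

insert_row({"customer": "C1", "amount": 100.0, "status": "open"})
print(read_row(0))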

HPE has a long history with large scale transactional workloads on Superdome systems, but this was primarily based on Integrity Superdome systems using Itanium processors and HP-UX, not on Intel x86 systems and Linux.  Among the Fortune 100, approximately 20 customers utilized HPE’s systems for their SAP database workloads, almost entirely based on Oracle with HP-UX.  Not bad, and coming in second place to IBM Power Systems with approximately 40 of the Fortune 100 customers that use SAP.  SGI has exactly 0 of those customers.  Intel x86 systems represent 8 of that customer set, with 2 being on Exadata, not even close to a standard x86 implementation with its Oracle RAC and highly proprietary storage environment.  Three of the remaining x86 systems are utilized by vendors whose very existence is dependent on x86, so running on anything else would be contradictory to their mission, and these customers must make this solution work no matter what the expense and complexity might be.  That leaves 3 customers, none of which utilize Superdome X technology for their database systems.  To summarize: IBM Power has a robust set of high end current SAP transactional customers; HPE has a smaller set, entirely based on a different chip and OS than is offered with Superdome X; SGI has no experience in this space whatsoever; and x86 in general has limited experience confined to designs that have nothing in common with today’s high end x86 technology.

Industry Standard Benchmarks

A bit of background.  Benchmarks are lab experiments, open to optimization and exploitation by experts in the area, and have little resemblance to reality.  Unfortunately, they are the only third party metric by which systems can be compared.  Benchmarks fall into two general categories, those that are horrible and those that are not horrible (note I did not say good).  Horrible ones sometimes test nothing but the speed of CPUs by placing the entire running code in instruction cache and the entire read-only dataset upon which the code executes in data cache, meaning no network or disk I/O, much less any memory access or cache coherency traffic.  SPEC benchmarks such as SPECint2006 and SPECint_rate2006 fall into this category.  They are uniquely suited to ccNUMA systems as there is absolutely no communication between any sockets, meaning this represents the best case scenario for a ccNUMA system.

It is therefore revealing that SGI, with 32 sockets and 288 cores, was only able to achieve 11,400 on this ideal ccNUMA benchmark (SPECint_rate2006), slightly beating HPE Superdome X’s result of 11,100, also with 288 cores.  By comparison, the IBM Power Systems E880, with only 192 cores, i.e. 2/3 of the cores, achieved 14,400, i.e. 26% better performance.
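For those who prefer the per-core view, a quick back-of-the-envelope calculation using the published figures quoted above looks like this (a sketch only; the official results live on spec.org):

# Per-core SPECint_rate2006 throughput from the published figures cited above.
results = {
    "SGI UV (32 sockets, 288 cores)": (11_400, 288),
    "HPE Superdome X (288 cores)":    (11_100, 288),
    "IBM Power E880 (192 cores)":     (14_400, 192),
}
for system, (score, cores) in results.items():
    print(f"{system}: {score / cores:.1f} per core")
# The E880 delivers roughly 1.9x the per-core throughput of either x86 result,
# on a benchmark that deliberately avoids any cross-socket traffic.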

In descending order from horrible to not as bad, there are other benchmarks which can be used to compare systems.  The list includes SAP SD 2-tier, SAP BW-EML, TPC-C and SAP SD 3-tier.  Of those, the SD 2-tier has the most participation among vendors and includes real SAP code and a real database, but suffers from the database being a tiny percentage of the workload, approximately 6 to 8%, meaning that on ccNUMA systems, multiple app servers can be placed on each system board, resulting in only database communication going across the pretty darned fast network represented by the ccNUMA fabric.  SGI is a no-show on this benchmark.  HPE did show with Superdome X @ 288 cores and achieved 545,780 SAPS (100,000 users, Ref# 2016002), still the world record.  IBM Power showed up with the E870, an 80 core system (28% of the cores of the HPE system), and achieved 436,100 SAPS (79,750 users, Ref# 2014034), i.e. 80% of the SAPS of the HPE result.  Imagine what IBM would have been able to achieve with this almost linearly scalable benchmark had they attempted to run it on the E880 with 192 cores (probably close to 436,100 * 192/80; it is not allowed for any vendor to publish the “results” of any extrapolations of SAP benchmarks, but no one can stop a customer from inputting those numbers into a calculator).
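For anyone who wants to punch those numbers into a calculator, here is the arithmetic as a small Python sketch.  It simply illustrates the extrapolation mentioned above and is not a published result:

# SAP SD 2-tier figures cited above and the linear extrapolation the text alludes to.
# Extrapolated numbers are illustrative only; SAP rules prohibit vendors from
# publishing extrapolated benchmark results.
hpe_saps, hpe_cores = 545_780, 288   # Superdome X, Ref# 2016002
ibm_saps, ibm_cores = 436_100, 80    # Power E870,  Ref# 2014034

print(f"HPE SAPS per core: {hpe_saps / hpe_cores:,.0f}")
print(f"IBM SAPS per core: {ibm_saps / ibm_cores:,.0f}")
print(f"E870 result scaled linearly to 192 cores: {ibm_saps * 192 / ibm_cores:,.0f} SAPS")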

BW-EML was SAP’s first benchmark designed for HANA, although not restricted to it.  As the name implies, it is a BW benchmark, so it is difficult to derive any correlation to transaction processing, but at least it does show some aspect of performance with HANA, analytic if nothing else, and concurrent analytics is one of the core value propositions of HANA.  HPE was a frequent contributor to this benchmark, but always with something other than Superdome X.  It is important to note that Superdome X is the only Intel based system to utilize RAS mode, or Intel Lockstep, by default, not as an option.  That mode has a memory throughput impact of 40% to 60% based on published numbers from a variety of vendors, but, to date, no published benchmarks, of any sort, have been run in this mode.  As a result, it is impossible to predict how well Superdome X might perform on this benchmark.  Still, kudos to HPE for their past participation.  Much better than SGI, which is, once again, a no-show on this benchmark.  IBM Power Systems, as you might predict, still holds the record for best performance on this benchmark with the 40-core E870 system at 2 billion rows.

TPC-C was a transaction processing benchmark that, at least for some time period, had good participation, including from HP Superdome.  That is, until IBM embarrassed HPE by delivering 50% more performance with half the number of cores.  After this, HPE never published another result on Superdome … and that was back in the 2007/2008 time frame.  TPC-C was certainly not a perfect benchmark, but it did have real transactions with real updates, and about 10% of the benchmark involved remote accesses.  Still, SGI was a no-show, and HPE stopped publishing on this level of system in 2007, while IBM continued publishing through 2010 until there was no one left to challenge its results.  A benchmark is only interesting when multiple vendors are vying for the top spot.

Last, but certainly not least, is the SAP SD 3-tier benchmark.  In this one, the database was kept on a totally separate server and there was almost no way to optimize it to remove any ccNUMA effects.  Only IBM had the guts to participate in this benchmark at a large scale with a 64-core POWER7+ system (the previous generation to POWER8).  There was no submission from HPE that came even remotely close and, once again, SGI was MIA.

Architecture

Where IBM Power Systems utilizes a “glueless” interconnect up to 16 sockets, meaning all processor chips connect to each other directly, without the use of specialized hub chips or switches, Intel systems beyond 8 sockets utilize a “glued” architecture.  Currently, only HPE and SGI offer solutions beyond 8 sockets.  HPE is using a very old architecture in the Superdome X, first deployed for PA-RISC (remember those?) in the Superdome introduced in 2000.  Back then, they were using a cell controller (a.k.a. hub chip) on each system board.  When they introduced the Itanium processor in 2002, they replaced this hub chip with a new one called SX1000, basically an ASIC that connected the various components on the system board together and to the central switch by which it communicates with other system boards.  Since 2002, HPE has moved through three generations of ASICs and is now using the SX3000, which features considerably faster speeds, better reliability, some ccNUMA enhancements and connectivity to multiple interconnect switches.  Yes, you read that correctly; where Intel has delivered a new generation of x86 chips just about every year over the last 14 years, HPE has delivered 3 generations of hub chips.  The pace of innovation is clearly tied to volume, and Superdome has never achieved sufficient volume of its own, nor use by other vendors, to increase that pace.  This means that while HPE may have delivered a major step forward at a particular point in time, it suffers from a long lag and diminishing returns as time and Intel chip generations progress.

The important thing to understand is that every remote access, from either of the two Intel EX chips on each system board, to cache, memory or I/O connected to another system board, must pass through 8 hops at a minimum, i.e. from the calling socket, to its SX3000, to the central switch, to the remote SX3000, to the remote socket, and the same trip in return, and that is assuming the data was resident in an on-board cache.
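To make the hop count concrete, here is a minimal sketch that simply enumerates the path described above; the per-hop latency is a placeholder chosen for illustration, not a published HPE figure:

# Counting the hops for one off-board access in the Superdome X topology described
# above. The per-hop latency is an illustrative placeholder, not an HPE number.
path = [
    "calling socket -> local SX3000",
    "local SX3000 -> central switch",
    "central switch -> remote SX3000",
    "remote SX3000 -> remote socket",
    "remote socket -> remote SX3000",     # and the same trip back
    "remote SX3000 -> central switch",
    "central switch -> local SX3000",
    "local SX3000 -> calling socket",
]
ASSUMED_NS_PER_HOP = 50                   # placeholder value for illustration only
print(f"{len(path)} hops, roughly {len(path) * ASSUMED_NS_PER_HOP} ns of fabric latency per remote access")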

SGI, the other player in the beyond-8-socket space, is using a totally different approach, derived from their experience in the HPC space.  They are also using a hub chip, called a HARP ASIC, but rather than connecting through one or more central switches, in the up to 32-socket UV 300H system, each system board, featuring 4 Intel EX chips and a proprietary ASIC per memory riser, includes two hub chips which are linked directly to each of the other hub chips in the system.  This mesh is hand wired, with a separate physical cable for every single connection.  Again, you read that correctly, hand wired.  This means that not only are physical connections made for every hub chip to hub chip connection, with the inherent potential for an insertion or contact problem on each end of that wire, but as implementation size increases, say from 8-sockets/2 boards to 16-sockets/4 boards or to 32-sockets/8 boards, the number of physical, hand wired connections grows roughly with the square of the number of hub chips, as the sketch below illustrates.

OK, assuming that does not make you just a little bit apprehensive, consider this:  Where HPE uses a memory protection technology called Double Device Data Correction + 1 (DDDC+1) in their Superdome X system, basically the ability to handle not just a single memory chip failure but at least 2 (though not at the same time), SGI utilizes SDDC, i.e. Single Device Data Correction.  This means that after detection of the first failure, customers must rapidly decide whether to shut down the system and replace the failing memory component (assuming it has been accurately identified), or hope their software based page deallocation technology works fast enough to avert a catastrophic system failure due to a subsequent memory failure.  Even with that software, if a memory fault occurs in a different page, the SGI system would still be exposed.  My personal opinion is that memory protection is so important in any system, but especially in large-scale scale-up HANA systems, that anything short of true enterprise memory protection of at least DDDC does nothing other than increase customer risk.
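Returning to the cabling, here is a quick way to see how fast it grows.  This sketch assumes, as described above, two hub ASICs per 4-socket board with every pair of hub ASICs directly cabled; the actual UV cabling topology may use additional cables between some pairs, so treat these counts as an illustrative lower bound:

# Illustrative cable counts for a full mesh of hub ASICs (two per 4-socket board).
# The real UV topology may add cables between some pairs, so these are lower bounds.
def full_mesh_links(boards, asics_per_board=2):
    n = boards * asics_per_board
    return n * (n - 1) // 2          # every pair of ASICs directly connected

for boards in (2, 4, 8):             # 8, 16 and 32 sockets
    print(f"{boards * 4:>2} sockets: at least {full_mesh_links(boards)} point-to-point cables")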

Summary

SGI is asking customers to accept their assertion that SAP’s certification of the SGI UV 300H at 20TB implies they can scale better than any other platform and perform well at that level, but they are providing no evidence in support of that claim.  SAP does not publish the criteria with which it certifies a solution, so it is possible that SGI has been able to “prove” addressability at 20TB, the ability to initialize a HANA system and maybe even to handle a moderate number of transactions.  Lacking any sort of independent, auditable proof via a benchmark, lacking any reasonable body of reference customers (even one would be nice) driving high transaction volumes with HANA or a conventional database, and offering nothing better than a 4-bit wide, hand wired ccNUMA mesh that would seem prone to low throughput and high error rates, especially with substandard memory protection, it is hard to imagine why anyone would find this solution appealing.

HPE, by comparison, does have some history with transactional systems at high volumes, but with a completely different CPU, OS and memory architecture, and nothing with Superdome X.  HPE has a few benchmarks, however poor, once again on systems from long ago, plus mediocre results with the current generation and an architecture that requires a minimum of 8 hops round trip for every remote access.  On the positive side, at least HPE gets it regarding proper memory protection, but does not address how much performance degradation results from that protection.  Once again, SAP’s certification of Superdome X at 16TB must be taken with the same grain of salt as SGI’s.

IBM Power Systems has an outstanding history with transactional systems at very high transaction volumes using current generation POWER8 systems.  Power also dominates the benchmark space and continued to deliver better and better results until no competitor dared risk the fight.  Lastly, POWER8 is the latest generation of a chip designed from the ground up with ccNUMA optimization in mind and with reliability as its cornerstone, i.e. the results already include any overhead necessary to support this level of RAS.  Yes, POWER8 is only supported at 9TB today for SAP SoH and S/4HANA, but lest we forget, it is the new competitor in the HANA market, and the other guys only achieved their higher supported numbers after extensive customer and internal benchmark testing, both of which are underway with Power.

July 7, 2016 | Posted in Uncategorized