SAPonPower

An ongoing discussion about SAP infrastructure

There is HoP for HANA – HANA on Power T&E program begins

At SAPPHIRE NOW 2014 in Orlando, Bernd Leukert (member of the SAP Executive Board with global responsibility for development and delivery of all products) made an announcement that most of the community of IBM Power Systems customers, sellers and integrators have been awaiting for a long time.  He announced the testing and evaluation program for HANA on IBM Power Systems (HoP).  As SAP has done with virtually every product it has introduced in the past, it will work with a select group of customers that have the interest, skills, experience and willingness to dedicate the time to thoroughly put a product through its paces to ensure that it will deliver the necessary value to customers, operate efficiently and without error, and perform at an acceptable level.  Often, this is called Ramp-up and usually, unless critical problems are found, it is followed by an announcement of General Availability (GA).  During this announcement, Mr. Leukert mentioned that customers are welcome to request a full disclosure of SAP’s plans to support HANA on Power.

 

Many people came through our booth, joined us in meetings and used other channels to ask one simple question: “Once HoP reaches GA, why should a customer prefer this solution over one from the plethora of x86 based vendors?”  I will boil down this subject into 5 key points: 1) Performance, 2) Reliability, 3) Flexibility, 4) Efficiency and 5) Cost.

 

1)      Performance – HANA, as most know, is an in-memory database.  It replaces complex and cumbersome indexes and aggregates with highly compressed columns of data which can be traversed very quickly (full column scans, in effect) and in ways that don’t have to be considered ahead of time by architects and DBAs.  The speed comes at the cost of memory and can be limited by the bandwidth of the memory subsystem in addition to the throughput and number of processor cores in a system.  IBM recently announced 1 and 2 socket POWER8 systems capable of supporting a maximum of 1TB of memory (with both sockets populated).  A radically faster backplane and up to 128MB of L4 cache (with all memory slots populated) result in a maximum bandwidth of 384GB/s.   This compares to a two socket IvyBridge EX system (with 30 cores) which has a maximum bandwidth of 68GB/s in full RAS mode (85GB/s is reached only in benchmark configurations; no sane customer would ever run HANA in partial RAS mode).  In other words, the 2-socket Power System can deliver 5.6 times the memory bandwidth.  Not bad, but not the whole story.  High memory throughput without high CPU performance results in a simple shifting of bottlenecks.
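The 5.6x figure follows directly from the bandwidth numbers quoted above. As a minimal sketch (the constant and function names are illustrative, not taken from any vendor tool), the ratio can be checked like this:

```python
# Peak memory bandwidth figures quoted in the paragraph above, in GB/s.
# The names below are illustrative assumptions, not vendor terminology.
POWER8_2S_GBPS = 384
IVYBRIDGE_EX_2S_FULL_RAS_GBPS = 68
IVYBRIDGE_EX_2S_BENCHMARK_GBPS = 85


def bandwidth_ratio(a_gbps, b_gbps):
    """Return how many times larger bandwidth a is than bandwidth b."""
    return a_gbps / b_gbps


full_ras_ratio = bandwidth_ratio(POWER8_2S_GBPS, IVYBRIDGE_EX_2S_FULL_RAS_GBPS)
benchmark_ratio = bandwidth_ratio(POWER8_2S_GBPS, IVYBRIDGE_EX_2S_BENCHMARK_GBPS)

print(f"POWER8 vs IvyBridge EX, full RAS mode:  {full_ras_ratio:.1f}x")
print(f"POWER8 vs IvyBridge EX, benchmark mode: {benchmark_ratio:.1f}x")
```

Even against the benchmark-only 85GB/s figure, the quoted numbers still give the Power system roughly 4.5 times the bandwidth.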

 

Each POWER8 core supports a robust cache hierarchy with twice the L1, L2 and L3 compared to Ivy Bridge systems (in full RAS mode),  up to 8 simultaneous threads (SMT8) and dual SIMD (Single Instruction Multiple Data) pipelines, compared to 2 threads (Hyperthreading) on IvyBridge EX and a single SIMD pipeline.  HANA takes full advantage of lots of cache, threads and SIMD pipelines.

 

As HoP is not yet GA, no formal benchmarks are available.  That said, the Power S824 (POWER8, 2 socket with 24 cores, 4U system, Cert # 2014016) running AIX achieved 21,212 Users, 116,727 SAPS on the SD 2-tier benchmark.  Not bad, especially when you consider that the best IvyBridge EX 2-socket, 30-core system SD 2-tier benchmark result to date achieved 12,280 Users, 67,020 SAPS (Cisco UCS B260 M4, Cert # 2014018).  On this benchmark, the Power System delivered almost 73% more performance per system and 116% more performance per core (measured in benchmark users).  While performance with a conventional DB and app servers cannot be correlated with predicted HANA performance, it is a reasonable measurement to consider relative performance of the two types of systems.
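For readers who want to reproduce the percentages, here is a small sketch using only the benchmark figures quoted above. The dictionary layout and helper name are my own; the per-system and per-core figures in the text match the calculation based on benchmark users, while the SAPS-based figures come out slightly higher.

```python
# SD 2-tier results quoted in the paragraph above.
# The dictionary layout and function name are illustrative assumptions.
results = {
    "power_s824": {"users": 21212, "saps": 116727, "cores": 24},
    "cisco_b260_m4": {"users": 12280, "saps": 67020, "cores": 30},
}


def percent_more(a, b):
    """Return by how many percent a exceeds b."""
    return (a / b - 1) * 100


power = results["power_s824"]
x86 = results["cisco_b260_m4"]

users_per_system = percent_more(power["users"], x86["users"])
users_per_core = percent_more(power["users"] / power["cores"],
                              x86["users"] / x86["cores"])
saps_per_system = percent_more(power["saps"], x86["saps"])
saps_per_core = percent_more(power["saps"] / power["cores"],
                             x86["saps"] / x86["cores"])

print(f"Users per system: {users_per_system:.1f}% more")
print(f"Users per core:   {users_per_core:.1f}% more")
print(f"SAPS per system:  {saps_per_system:.1f}% more")
print(f"SAPS per core:    {saps_per_core:.1f}% more")
```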

 

2)      Reliability – As I have written about in previous blog entries, IBM Power Systems have a long history of providing mission critical reliability.  Data analysis companies have evaluated extensive outage data from customers and determined that Power Systems, on average, have experienced up to 60% fewer outages than x86 systems with up to 5 times faster recovery from outages when they do occur.  That said, Intel has made many strides forward in reliability features.  If we were to assume equivalent reliability at the processor core level (Intel claims, not backed up by customer data yet), there are still a whole host of components that can result in outages, not to mention a major difference in the underlying RAS architecture for problem detection and resolution.  As this has been a discussion in previous blog entries, let me instead focus on the new improvements in POWER8 which leapfrog x86 systems.  As mentioned earlier, HANA loves, eats and breathes memory.  So, let’s consider memory reliability.  POWER8 memory cards on these new scale-out systems include features previously available only on Enterprise class systems such as the Power 770.  Specifically, each rank of memory features an inactive chip that is included at no extra cost and can take the place of a failing chip on the fly.  This is in addition to ChipKill (ECC on steroids), i.e. a recovery mechanism across chips, not only within each one.  By comparison, x86 systems may offer ChipKill or even double ChipKill, but to avoid “recovery” after a chip failure, x86 systems would require memory mirroring (available on most high end x86 systems), which effectively reduces memory capacity by 50% as ALL memory is mirrored.  This means that a 1.5TB, 30-core IvyBridge EX system would only be able to deliver 768GB of memory with similar memory redundancy to POWER8 with a full complement of 1TB of memory.

 

Given this huge penalty, most customers will stick with non-mirrored memory at a considerably lower reliability level.  But the story does not end there.  POWER8 adds extra protection on the memory bus with built in redundancy and automatic failover in the event of a memory lane failure.
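The mirroring penalty described above is simple arithmetic: with full mirroring, half of the installed memory holds a copy of the other half. A minimal sketch, assuming 1TB = 1024GB and using a hypothetical helper name:

```python
# Usable memory under full memory mirroring, as described above.
# The helper name and the 1 TB = 1024 GB convention are assumptions.
GB_PER_TB = 1024


def usable_memory_gb(installed_gb, mirrored):
    """Return usable capacity; full mirroring halves the installed capacity."""
    return installed_gb / 2 if mirrored else installed_gb


x86_installed_gb = 1.5 * GB_PER_TB      # 1.5TB IvyBridge EX system
power8_installed_gb = 1 * GB_PER_TB     # 1TB POWER8 system

x86_mirrored_gb = usable_memory_gb(x86_installed_gb, mirrored=True)
power8_usable_gb = usable_memory_gb(power8_installed_gb, mirrored=False)

print(f"x86 with mirroring: {x86_mirrored_gb:.0f} GB usable")
print(f"POWER8 (spare chip, no mirroring): {power8_usable_gb:.0f} GB usable")
```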

 

Taken together, some estimates suggest that POWER8 systems will experience 2.5 to 8 times fewer system outages caused by the memory subsystem than similarly configured x86 systems.   Put another way, if we were to assume that a 1TB x86 system would experience one outage every 10 years (just a random number for comparison purposes, not based on any actual data), then 10 such systems would result in 1 outage every year based on memory alone.  That may not be a huge problem for BW systems with scale-out/HA configs, non-mission critical analytics systems or non-prod systems, but it could be a big problem for mission critical business suite systems on separate nodes, let alone larger ones that place multiple SAP components on one node (yes, SAP seems to be envisioning a future of MCOD or Multiple Components on One DB).  Remember, however, that this is just the memory subsystem associated failure and there are many other potential areas of system failure for which Power has its long track record of superior protection and recovery.
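To make the fleet arithmetic concrete, here is a rough sketch. The one-outage-per-10-years rate is the same made-up number used above, not measured data, and applying the 2.5x to 8x range to that hypothetical fleet is purely illustrative; all names are my own.

```python
# Expected memory-related outages per year for a fleet of systems.
# The per-system rate is the arbitrary illustration from the text,
# not actual field data; function and variable names are hypothetical.
def fleet_outages_per_year(systems, years_between_outages):
    """Expected outages per year across the fleet (simple linear estimate)."""
    return systems / years_between_outages


x86_fleet = fleet_outages_per_year(systems=10, years_between_outages=10)

# Estimated reduction range quoted above: 2.5 to 8 times fewer outages.
power8_fleet_high = x86_fleet / 2.5
power8_fleet_low = x86_fleet / 8

print(f"x86 fleet:    {x86_fleet:.2f} outages/year")
print(f"POWER8 fleet: {power8_fleet_low:.3f} to {power8_fleet_high:.2f} outages/year")
```

Under these assumed numbers, the same 10-system fleet would see one memory-related outage every 2.5 to 8 years instead of one per year.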

 

3)      Flexibility – POWER has virtualization built in.  It is included by default at the hardware/firmware level (unless one were to remove the PowerVM hypervisor and replace it with PowerKVM, not the intended hypervisor for HANA).  At SAPPHIRE, SAP announced GA support for HANA production on VMware vSphere 5.5.  Just because you CAN do something does not mean that you SHOULD.  You could run across a busy highway without the assistance of lights or overpasses, but most people would consider you crazy to do so.  Likewise, vSphere 5.5 still utilizes the same I/O infrastructure that it has in the past, resulting in all I/O writes having to pass from a VM through its virtual device driver to the memory space of the vSphere hypervisor and then on to the I/O subsystem through another “real” device driver, with reads following the same path in reverse.   If that were not bad enough to dissuade all but the most ardent VMware proponents from placing any large scale transactional DB on VMware, VMware’s own published study shows only 65% scaling on 32 virtual processors, which when extrapolated to 64 or 128 vps shows no incremental benefit at all.   Can you then imagine an incredibly memory intensive HANA system running with a ton of memory but only being able to utilize 32 threads/16 cores effectively?  Would this break the KPIs for HANA?

 

Another area of flexibility comes from configuration options.  We have recommended a TDI approach (Tailored Datacenter Integration: http://www.saphana.com/docs/DOC-4380) where SAP defines the KPIs for HANA and allows customers to determine how they will satisfy those KPIs.  In other words, if a customer wants to use an LPAR on a POWER7+ or POWER8 system, their choice of storage based on their current standards and install base (e.g. EMC, HDS, NetApp, IBM), an existing network infrastructure, on-board or off-board flash drives, XFS or GPFS, etc., a TDI approach would allow for the flexibility to select the components and configurations that work best for them.

 

4)      Efficiency – If running VMware is not a good option, as suggested above, then customers would be forced to employ islands of automation: multiple different HANA implementations on separate servers, each with its own dedicated HA and DR servers, and ABAP and JAVA app servers on their own systems, probably virtualized, but still separate from the HANA system.  Yes, you might virtualize non-prod, but if you virtualize QA or a stress test system, then you have violated one of the fundamental rules in system architecture, i.e. never utilize a stack in the critical QA path that differs from the one used in production.

 

By comparison, Power has no such limitations. Anything can be mixed and matched as desired, with obvious constraints of CPU, Memory and I/O considered, of course.  In other words, customers will have the flexibility to put together anything in almost any combination based on their unique needs.  Higher performance and comprehensive virtualization may result in fewer systems being required since it allows customers to place many different partitions (VMs) together, including HANA, app servers, non-prod and HA, into a single cluster using shared, virtualized resources.  As a result, data center footprint, power, cooling, administration, problem resolution, etc. can all see substantial reductions.

 

5)      Cost – With the introduction of POWER8 and its Linux only models, IBM entered a new era on pricing.  Full disclosure: I am not a wizard at configurations so I am certain that these would not pass a proper design review but should suffice for this comparison.  I selected the IBM Power S822L, a Linux only system that includes 2 sockets, 24 cores @ 3.026 GHz, 1TB of memory, PowerVM, a pair of 146GB drives (just for boot and paging), a pair of dual-port 40Gb Ethernet adapters, 3 years of 24/7 warranty and Linux support.  The price came out a little over $90K.  I then went to the HP web site and selected a DL580 Gen8 with a pair of E7-4480v2 2.8GHz/15 core processors, 1TB of memory, a couple of 120GB disk drives and 4 dual port 10Gb Ethernet adapters, since they apparently don’t support the newer 40Gb ones.  Before I even added OS and Virtualization, this config was already close to $62K.  Add VMware vSphere 5.5 and a 4 Guest, 2 socket RedHat license with 3 yr 24/7 support and the total is closing on $78K.  So, for a little over 15% higher cost, at list, you could get a system with over 50% more performance, radically better virtualization and far better reliability.  The reliability advantage alone, especially for Suite on HANA customers, would justify a far higher price tag.  Of course, this is just the first wave of systems and many customers will need far larger configurations.  As IBM has not announced those, as of yet, I can’t speculate where they will come in, price wise, but if you consider how far things have come in a very short time for Power, the future looks awfully bright.
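The price gap can be checked with the rounded list prices quoted above. This is only a sketch: the dollar amounts are the approximate figures from the text ("a little over $90K", "close to $62K", "closing on $78K"), and the variable names are illustrative.

```python
# Approximate list prices quoted above, in US dollars (rounded figures).
# Variable names are illustrative; exact configured prices were not given.
power_s822l_total = 90_000      # "a little over $90K"
hp_dl580_hardware = 62_000      # "close to $62K" before OS and virtualization
hp_dl580_total = 78_000         # "closing on $78K" with vSphere and RedHat


def percent_premium(price, baseline):
    """Return by how many percent price exceeds baseline."""
    return (price / baseline - 1) * 100


software_and_support = hp_dl580_total - hp_dl580_hardware
premium = percent_premium(power_s822l_total, hp_dl580_total)

print(f"x86 software and support add-on: ${software_and_support:,}")
print(f"Power S822L premium over the x86 config: {premium:.1f}%")
```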

 

One thing that I did not discuss above, other than in passing, is GPFS.  Not that GPFS is to be ignored, but remember, this will hopefully be an option, not a requirement.  This amazing scale-out technology is shared with our System x brethren.  It is a game changer for customers, offering not only outstanding performance but built in redundancy for data and log volumes for HANA.  It has been extensively tested by SAP, IBM and customers.  It is already certified to scale out to 56 nodes with System x and SAP has tested it in configurations with over 100 nodes.  If GPFS were added to the mix, for scale-out configurations, this would add substantial additional value over similar x86 configurations, other than those from IBM System x.

 

HANA on Power will enable not just freedom of choice for customers, but the mission critical reliability and performance that may have been holding them back from trying Suite on HANA.  We are looking forward to working with SAP and our customers to explore this exciting new offering.

 

Many thanks to my colleague, Bob Wolf, for his outstanding editing.

June 6, 2014 - Posted by | Uncategorized

17 Comments »

  1. […] There is HoP for HANA – HANA on Power T&E program begins […]

    Pingback by Why SAP HANA on IBM Power Systems « SAPonPower | June 6, 2014 | Reply

  2. […] There is HoP for HANA – HANA on Power T&E program begins […]

    Pingback by SAP HANA on Power – status update « SAPonPower | June 6, 2014 | Reply

  3. Reblogged this on All Things BOBJ BI Blog and commented:
    SAP HANA is moving beyond the x86 architecture with upcoming support for IBM Power Systems.

    Comment by Jonathan Haun | June 6, 2014 | Reply

  4. I’ll be impressed if you can get SAP to publish an SD benchmark for HANA, on Power or any other platform!

    Comment by Paul | June 9, 2014 | Reply

    • Actually, SAP publishes any benchmark which fulfills all of the criteria which SAP has clearly outlined in their benchmark Ts & Cs. After running the benchmark, each vendor fills out the necessary paperwork and then SAP certifies it, assuming it obeys all of their rules. It is up to each vendor to provision the necessary servers, storage, OS and DB environments. As HANA acts primarily as the DB in the SD/ERP environment, any vendor that decides to embark upon this journey would probably have to run in 3-tier mode. That entails a huge amount of app servers (usually 12 to 14 times as much as the DB server in raw SAPS, but may be different with HANA due to the, as yet, unknown SAPS requirements to handle the SD database workload) at a pretty significant cost to the vendor, not to mention a large amount of time and resources to install, customize and tune the environment. That said, I would love to see such a benchmark. No clue as to IBM’s or anyone else’s plans.

      Comment by Alfred Freudenberger | June 9, 2014 | Reply

      • Remarkable then that there is no SD benchmark for HANA database, and would SAP *really* publish if the result looks that bad?

        Comment by Paul | July 4, 2014

      • Also, I was forgetting the following statement by SAP, according to ASUG at http://www.asugnews.com/article/whats-the-sd-benchmark-for-erp-apps-running-on-sap-hana

        “We will not use the classical SD benchmark for Suite on HANA. This benchmark is made to analyze transaction throughput and does not reflect the specific benefits Suite on HANA has in terms of analytics, search and agility.”

        So it appears that SAP disallows SD benchmark on HANA, probably because the price performance is poor.

        Comment by Paul | July 4, 2014

  5. what abt HoP with AIX… is that coming into the equation soon ?

    Comment by Zaid Umer Farooqui | June 9, 2014 | Reply

    • The only thing that we know, at this time, is what SAP has said publicly, i.e. that the T&E program will utilize Linux on Power. If customers would like to see HoP on AIX, they need to share that desire with SAP and/or request an NDA to see what SAP’s plans are for HoP.

      Comment by Alfred Freudenberger | June 9, 2014 | Reply

  6. I think you are a bit confused. The benchmark you quote here is a 4 socket benchmark not 2 socket.
    In fact Intel 4 socket benchmarks are 135,000 SAPS vs. Power8 with 115,000 SAPS.

    So it would seem Intel IvyB EX benchmarks well ahead of Power8 (given the same number of sockets).

    Power S824 (POWER8, 2 socket with 24 cores, 4U system, Cert # 2014016)

    Central server: IBM Power System S824, 4 processors / 24 cores / 192 threads,
    POWER8, 3.52 GHz, 32 KB (I) and 64 KB (D) L1 cache and
    512 KB L2 cache per core, 8 MB L3 cache per core,
    512 GB main memory
    Certification number:

    Comment by factual error in this blog | June 19, 2014 | Reply

    • I really appreciate it when my readers find factual errors in my blog as I really strive to get things right. The reader, in this case, is not correct but his/her misunderstanding is actually quite reasonable. I described the S824 as being a 2-socket system. It is, in fact, a 2-socket system. IBM has developed very advanced CPU modules which allow them to place more than one chip on a module. Each module, in turn, fits into one socket. The S824 quoted utilizes two 6-core POWER8 chips mounted on a DCM (Dual chip module), of which there are two such modules, each inserted into a separate socket. POWER8 is designed for up to 12 cores on a single chip and IBM can use multiple different variations of this chip based on a wide variety of factors, including price, GHz, power, heat dissipation, cpu, memory and/or I/O throughput.

      Comment by Alfred Freudenberger | June 19, 2014 | Reply

  7. I recommend you check the power 8 benchmark you quote (incorrectly) as being 2 socket.
    It is a 4 socket benchmark

    In your blog you quote that a “Power S824 (POWER8, 2 socket with 24 cores, 4U system, Cert # 2014016) running AIX achieved 21,212 Users, 116,727 SAPS on the SD 2-tier benchmark”

    This statement is false. It is 4 socket. look at the summary for benchmark 2014016

    Here is what it says:
    Central server: IBM Power System S824, 4 processors / 24 cores / 192 threads,
    POWER8, 3.52 GHz, 32 KB (I) and 64 KB (D) L1 cache and
    512 KB L2 cache per core, 8 MB L3 cache per core,
    512 GB main memory

    so you are quoting that a 2 socket IBM Power8 has more SAPS than a 2 socket Intel (which is probably wrong, but there are no data points available).

    What is a fact is that a 4 socket Intel IvyB EX has more SAPS than a 4 socket POWER8

    yet at the same time your blog is saying pSeries out performs Intel servers

    It does not quite add up. It might be best to make a correction to the blog

    BTW: congratulations to IBM for getting HANA on Power. Competition is always good for customers.

    Comment by factual error in this blog | June 19, 2014 | Reply

    • I mistakenly thought that the original response was directed at the number of sockets in the Cisco server, but now see that it refers to the Power server. I have revised the original comment to reflect the Power Server sockets. I stand by the performance claims in the blog entry in that the S824 is absolutely a 2-socket server. Thank you for your comment.

      Comment by Alfred Freudenberger | June 20, 2014 | Reply

  8. I agree with this opinion. In fact each POWER8 core supports a robust cache hierarchy with twice the L1, L2 and L3 compared to Ivy Bridge systems and dual SIMD (Single Instruction Multiple Data) pipelines, compared to 2 threads on IvyBridge EX and a single SIMD pipeline. In fact HANA takes full advantage of lots of cache, threads and SIMD pipelines.

    Comment by ekspercisap | June 20, 2014 | Reply

  9. Very informative entry, thank you Alfred.
    I’ve witnessed first hand how well IBM product performed with SAP in the HANA lead up products (BIA, later BWA) and customers voted with their feet once they saw how much time, angst, & integration budget could be saved ‘simply’ by choosing IBM infrastructure. Time to business benefit was, and is, critical but so is reliability & TCO once running.
    GPFS helped these implementations also to scale easily & meet the demands of even the largest global customers so I’m really interested to see how much better HANA can satisfy business requirements when running on Power and with GPFS.
    I appreciate the information as I was once an x-86 only ‘true believer’ and even though I continually pushed the products & found them wanting it was still many years before I discovered systems truly capable of meeting & exceeding business critical requirements. Education is key, as it is to many things.
    btw, the communities.vmware.com/blogs… entry highlighted in your previous post has been removed, guess vmware wanted to keep things factual also!

    Comment by Phil R | June 22, 2014 | Reply

  10. In regards to “Comment by factual error in this blog” I would tend to agree with person who stated there were some factual errors in this blog.

    This blog would not conform to the policies for SAP benchmarking here:
    http://global.sap.com/campaigns/benchmark/ppv_overview.epx

    It would be best to fully disclose the facts when comparing SAP benchmarks. Also it is forbidden to correlate SAP benchmarks to costs.

    The key factual error in this blog is the comparison of a 2 socket/4 processor benchmark to a 2 socket/2 processor benchmark

    In order to ensure this blog is accurate I would recommend mentioning in the text of the blog:

    IBM Power8 2 socket/4 processor = 116k SAPS
    Intel IvyB 2 socket/2 processor = 67k SAPS
    Intel IvyB 4 socket/4 processor = 135k SAPS

    A customer can make their own evaluation of this data.

    Comment by best to fully disclose | July 7, 2014 | Reply

    • Actually, if you want to get technical, I should have mentioned OS, DBMS and SAP Version level, i.e. AIX 7.1, DB2 10.5 for the IBM Power System and Windows Server 2012 Datacenter Edition, SQL Server 2012 for the Cisco System with both running SAP EHP5 for ERP 6.0 version of the benchmark. As to correlation to costs, no such claim was made. The performance section refers to the IBM S824 and the Cisco B260. The cost section refers to the IBM S822L and the HP DL580. Furthermore, no attempt was made to derive a $/SAPS. There is no factual error when the number of cores and sockets were accurately reported, however it is correct that the number of processors and threads should also have been noted (4/192 for the S824 and 2/60 for the B260). There is no rule that states that one cannot compare by number of sockets and in fact, section 4.2.5 allows for “other system configuration options.” As to who makes what evaluation of data, there would be little point in a blog about personal observations if I did not make any personal observations and SAP rules most definitely support making comparative claims.

      Comment by Alfred Freudenberger | July 7, 2014 | Reply

