SAPonPower

An ongoing discussion about SAP infrastructure

3D XPoint Memory – The best thing for SAP HANA since HANA was invented?

At #SapphireNow, the Intel booth was all atwitter about the new “game changer”, “revolutionary”, “future of computing”, “best thing since the wheel” (ok, I made that last one up).  Yes, they were thrilled with 3D XPoint Optane memory.[i]  It is being positioned as persistent memory, like SSD but much faster, which can take the place of real, a.k.a. DRAM, memory … eventually.  Paraphrasing them, “You will be able to replace conventional memory with 3D XPoint memory at almost the same speed but which gives you the ability to restart your system after failure in a matter of seconds, not minutes or hours, because the entire HANA image will be stored in persistent memory, not on disk or SSDs.”

This sounds fantastic as long as we completely ignore reality.  Let’s dissect the above sentence.

“almost the same speed” – current speculation is that 3D XPoint memory will be about 10 times slower than conventional memory.  That is WAY better than external SSD storage, which is around 1,000 times slower, but for a memory-resident application like HANA, memory that is 10 times slower will result in at least a 10x performance reduction.  Remember, we have no idea how this might affect an application which expects very fast access to memory.
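To put those ratios in perspective, here is a quick back-of-the-envelope sketch in Python. Only the ~10x and ~1,000x ratios come from the discussion above; the absolute latency figures are rough, assumed orders of magnitude, not measurements.

```python
# Back-of-the-envelope latency comparison (illustrative figures only).
# Only the ~10x and ~1000x ratios are taken from the discussion above;
# the absolute numbers are assumed orders of magnitude.

DRAM_NS = 100                 # assumed DRAM access latency, ~100 ns
XPOINT_NS = DRAM_NS * 10      # 3D XPoint positioned as roughly 10x slower
SSD_NS = DRAM_NS * 1000       # external SSD storage, roughly 1000x slower

for name, ns in [("DRAM", DRAM_NS), ("3D XPoint", XPOINT_NS), ("SSD", SSD_NS)]:
    print(f"{name:>10}: {ns:>8,} ns  ({ns / DRAM_NS:,.0f}x DRAM)")
```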

“restart your system after failure” – silly me, I thought the idea was to prevent failure in the first place.  I am curious how often system failure is caused by memory errors or any other cause for which diagnostics might be required to evaluate the underlying problem, as well as a repair action to fix that problem.  Then the question becomes: in which scenario is a customer willing to skip those diagnostics and return the system to productive use?  This also assumes that customers are willing to run mission-critical systems without any sort of HA solution such as HANA System Replication or HANA Host Auto-Failover.  An HA solution would fail over production to a secondary system, which means that any memory image on the primary system would be out of date almost instantly.

“restart … in seconds” – So, your system has failed for unknown reasons and you are willing to forgo any sort of evaluation of the underlying cause.  So far so good.  So, Linux is capable of restarting, keeping the memory image as it was beforehand, and utilizing persistent main memory?  Not entirely, but with RHEL 7.3 (not supported for HANA yet) and special device drivers, applications may be rewritten to utilize “pmem” pseudo storage devices (a minimal sketch of this programming model appears below).[ii]  And HANA is capable of restarting from whatever point it was at when the failure occurred?  I did not know HANA could do this and am surprised that SAP prioritized fast restart ahead of the long laundry list of customer-provided requirements … which I doubt they did.  And HANA can figure out which transactions were in flight at the time of failure and which ones had made some changes to memory, but not all, e.g. had started to insert data into a delta table but perhaps not completed this action at the time of failure?  Totally wicked!! … and total fantasy, at least for now.
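For readers curious what that “pmem” programming model looks like, here is a minimal, hypothetical sketch in Python. It assumes a pmem device already exposed by the kernel and formatted with a DAX-capable filesystem mounted at a made-up path (/mnt/pmem); it simply memory-maps a file and stores into it. This illustrates the general model described in the Red Hat post, not anything HANA does today.

```python
# Minimal sketch of the "pmem as memory-mapped storage" model.
# Assumes a DAX-capable filesystem is already mounted at a hypothetical
# /mnt/pmem, e.g.:  mount -o dax /dev/pmem0 /mnt/pmem
import mmap
import os

PMEM_FILE = "/mnt/pmem/example.dat"   # hypothetical file on the DAX mount
SIZE = 4096

fd = os.open(PMEM_FILE, os.O_CREAT | os.O_RDWR, 0o600)
os.ftruncate(fd, SIZE)                # size the backing file

# Map the file; on a DAX mount, loads and stores go to the persistent
# media without a page-cache copy in between.
buf = mmap.mmap(fd, SIZE)
buf[0:13] = b"hello, pmem!\n"         # ordinary store into mapped memory
buf.flush()                           # msync; real pmem code would also need
                                      # CPU cache flushes for full durability
buf.close()
os.close(fd)
```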

You can easily imagine a variety of other conditions where columns are being updated, e.g. during a delta merge, and the update has not finished, leaving some columns with updated elements and others without.  I am not saying these are insurmountable problems, but considering that you can’t even change the size of a HANA system today without restarting HANA, it is a massive stretch to imagine that SAP has invested, or is willing to invest, the time and effort to make this work for a highly questionable benefit with likely severe performance degradation.

So, 3D XPoint memory as a replacement for conventional memory is clearly all hype, but don’t expect anyone from Intel or their proponents to tell you this.  How about as a technology for much faster SSDs?  Now we are talking!  I see no reason why this would not be quickly adopted by disk subsystem vendors and offered from multiple sources.

As to whether HANA workloads will benefit, that is a different story.  Remember, HANA is a read-once workload: once a column is loaded into memory, it is never read from storage again until it is unloaded, which should only happen if the memory subsystem is undersized or the system is restarted after maintenance.  So, fast storage is useful for restarts, but super-fast storage is only needed when a system must return to full operation after maintenance very quickly and without any performance degradation, i.e. with every column loaded into memory within 10 minutes or so.  Just as a point of comparison, IBM ran a test with 10 NVMe cards and delivered about 1TB per minute when restarting HANA.  To the best of my knowledge, few customers have expressed more than a passing interest in this capability.  I could imagine a scenario in which customers are willing to put a somewhat recent tier of data, e.g. 1 to 2 year old data, in persistent main memory, with perhaps external, and orders of magnitude slower, storage used for older data.  Once again, this is a nice concept, but until SAP writes or adopts code to enable it, it is just a theory.
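As a rough illustration of what ~1TB per minute means for restart times, here is a trivial calculation; the throughput figure is the NVMe test result mentioned above, while the system sizes are arbitrary examples.

```python
# Rough restart-time estimate: time to reload the column store from storage.
# The ~1 TB/minute rate is the NVMe test figure mentioned above; the
# system sizes below are arbitrary examples.

RELOAD_TB_PER_MIN = 1.0

for system_tb in (1, 3, 6, 12):
    minutes = system_tb / RELOAD_TB_PER_MIN
    print(f"{system_tb:>3} TB in-memory data -> ~{minutes:.0f} min to reload")
```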

As to writes, most enterprise storage subsystems can deliver response times that are twice as fast as SAP requires.  IBM SVC (SAN Volume Controller) connected to an IBM Power System has been tested in real customer installations and has delivered the fastest times of any storage subsystem in the industry, with a peak latency of only 161us (microseconds) for 4K block size log writes as measured by HWCCT, or over 6 times better latency than SAP requires.  SVC is part of a family of products including V7000, V9000 and Spectrum Virtualization Software, all of which utilize similar concepts and software.
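Working backwards from those figures (161us measured, “over 6 times better” than required) implies a log-write KPI in the neighborhood of 1,000us. The sketch below just makes that arithmetic explicit; the exact SAP KPI value is an inference from the figures above, not an official number.

```python
# Log-write latency headroom, using the figures quoted above.
# The ~1,000 us requirement is inferred from "over 6 times better";
# it is an assumption, not quoted directly in the post.

MEASURED_US = 161      # peak 4K log-write latency measured with HWCCT
REQUIRED_US = 1000     # inferred SAP requirement (assumption)

print(f"Headroom: {REQUIRED_US / MEASURED_US:.1f}x better than required")
```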

In other words, you don’t have to wait for tomorrow to get fast restarts and minimized transactional log writes; you just need to select the right infrastructure partner, IBM.

[i] https://www.theregister.co.uk/2017/05/17/coming_xeon_sps_will_run_sap_hana_16_times_faster/
[ii] https://developers.redhat.com/blog/2016/12/05/configuring-and-using-persistent-memory-rhel-7-3/

June 12, 2017 - Posted by | Uncategorized

4 Comments »

  1. Great article. Completely agree

    Comment by Pradip bhatt | June 12, 2017 | Reply

  2. I too visited the Intel booth @SAPPHIRENOW 2017 and discussed 3D XPoint memory technology and how HANA harnesses it. As you outlined so well in this post, the only perceived benefit is quick restarts. Good point regarding HA and DR systems that would NOT have the same data persisted in the associated host’s 3D XPoint memory. That could be an enhancement. The quick restart will definitely help for HANA maintenance operations during a planned outage. I do want to point out that this technology offers more than fast restarts. The primary benefit is a much larger memory-to-core ratio: 3 TB per socket. This will allow single HANA nodes to manage much larger volumes of data. From a HANA application perspective, the data targeted for 3D XPoint memory is the MAIN Store part of a columnar table. The DELTA Store would still reside in traditional DRAM, which lends itself to frequent updates. So, the majority of the data (residing in the MAIN Store) will reside in 3D XPoint persistent memory. The unknown, right now, is the performance degradation introduced as a result of this data sitting in a memory store that is 10X slower. (reference https://www.flashmemorysummit.com/English/Collaterals/Proceedings/2016/20160810_FR21_Caklovic.pdf)

    Comment by Scott Groth | June 15, 2017 | Reply

    • I love the concept of increased memory density, but for BW, SAP has been unwilling to approve higher densities despite the availability of high-density real DRAM. I am not sure how or when they will change their memory sizing methodology to accommodate 3D XPoint and allow for higher densities while simultaneously delivering 10x slower performance. This would seem to run counter to their current approach.

      Comment by Alfred Freudenberger | June 15, 2017 | Reply

    • I contemplated including a similar statement in my original post. Glad you chimed in. It will be interesting to observe how SAP embraces this concept of higher memory:core ratios for BW on HANA with the advent of higher density persistent memory. Based on history this would be a significant change in HANA guidelines.

      Comment by Scott Groth | June 16, 2017 | Reply

