SAPonPower

An ongoing discussion about SAP infrastructure

Does Intel’s Optane DC Persistent Memory decrease TCO for SAP?

When this new type of persistent memory DIMM (PMEM) was announced by Intel about a year ago, improving restart times was the most important factor cited by Intel and vendors of systems that utilize Intel Cascade Lake processors.  Some of my previous blog posts have discussed the performance issues of PMEM and despite numerous searches, I can find no data presented by Intel or any other vendor to suggest that any improvement has occurred since this technology was made generally available.  Over time, and perhaps as more customers realized that faster restarts at the cost of slower operational performance might not be very compelling, the message started to morph into saving money.

Regarding TCO specifically for SAP Suite on HANA (SoH) and S/4HANA, let’s start with the basic assertion, i.e. PMEM is less expensive than DRAM.  This is documented by pricing which shows a 128GB PMEM DIMM costs approximately 60% of the cost of a 128GB DRAM DIMM[i] on one site and 40%[ii] on another site.  This discrepancy may arise because one vendor shows effective prices and the other shows list prices, with the list price example showing the larger cost savings for PMEM.

I was interested to see what would happen with actual SAP instances.  For comparison, let us start with a conventional DRAM memory system and assume that after using appropriate sizing tools, we have determined that an SoH or S/4HANA system requires a total of 6TB of memory to support 3TB of data with 3TB dedicated to system and HANA working memory.  I chose 6TB because this fits perfectly on most Intel systems using 4 processors and 48 x 128GB memory DIMMs.  This config also has the added bonus of no waste at all and maximized performance since parallelism is optimized when every memory channel is used.

By comparison, we need to figure out how much memory is required if we utilize PMEM.  The SAP note on persistent memory[iii] describes ratios of DRAM to PMEM ranging from 2:1 to 1:4. For SoH and S/4HANA, the advice given is to run QuickSizer, /SDF/HDB_SIZING or ZNEWHDB_SIZE depending on where you are starting from.  I asked 3 different customers, one small, one medium and one very large, to provide me with the output of their sizing reports based on existing ECC systems.  I have included two key sections for the midsized customer:

[Screenshot: two key sections of the midsized customer’s sizing report]

The Persistent Memory FAQ[iv] says: “Persistent memory can be used for the main storage of column store table that is typically the dominating factor of data space consumption in SAP HANA environments. Other areas like delta storage, caches, intermediate result sets or row store remain solely in dynamic RAM (DRAM). Disk LOBs (SAP Note 2220627) are also not part of the persistent memory.”  If you add up the numbers above using this rule, you may notice that this means that when using persistent memory, the amount of data housed in PMEM vs. DRAM does not fit with any of the ratios mentioned earlier.  Looking at the sizing reports that I obtained, the amount of PMEM vs. DRAM was more in the range of 1:1.5[v].

Now, let’s apply the very best ratio of the three reports, i.e. the very large customer, to our 6TB example above and we see that we need 6TB x .433 = 2.53TB PMEM and 6TB x .567 = 3.47TB DRAM.  Assuming 128GB DIMMs, this translates to 20.2 PMEM DIMMs and 27.8 DRAM DIMMs which rounded up comes to 21 and 28 DIMMs, i.e. 49 DIMMs total.  Clearly, this is one more than the max number (48) in a 4-socket system.  In addition, SAP note 2786237 states that a configuration must have: “Homogeneous symmetrical assembly of DRAM and PMEM DIMMs with maximum utilization of all memory channels per processor”, so the minimum configuration would be 28 of each type of DIMM for a total of 56 DIMM slots.

To the best of my knowledge, no Cascade Lake system supports this number of DIMM slots.  Several vendors support 64 DIMM slots on a 6 or 8-socket system.  Those that do not would require a 96 DIMM slot configuration.  At 64 DIMM slots, this configuration would waste the difference between the HANA memory requirement and the system configuration requirement, i.e. 540GB of DRAM and 1,508GB of PMEM would be wasted.  At 96 DIMM slots, the waste would be 2,588GB and 3,556GB respectively.  With either a 64 DIMM slot or 96 DIMM slot configuration, instead of a relatively affordable 4-socket system, a significantly more expensive 6 or 8-socket system would be required.
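The DIMM arithmetic above can be checked with a short sketch.  It is only an illustration under stated assumptions: the 2,588GB PMEM and 3,556GB DRAM requirements are the figures implied by the waste numbers quoted in the text (roughly 2.53TB and 3.47TB at 1,024GB per TB), and the function names are hypothetical.

```python
import math

DIMM_GB = 128  # DIMM size used throughout the example


def dimms_needed(required_gb, dimm_gb=DIMM_GB):
    """Whole DIMMs needed to hold required_gb."""
    return math.ceil(required_gb / dimm_gb)


def symmetric_waste(pmem_gb, dram_gb, total_slots, dimm_gb=DIMM_GB):
    """Unused (DRAM, PMEM) GB when half of total_slots holds each DIMM type."""
    per_type_gb = (total_slots // 2) * dimm_gb
    return per_type_gb - dram_gb, per_type_gb - pmem_gb


# Assumption: requirement split implied by the waste figures in the text.
PMEM_GB, DRAM_GB = 2588, 3556

pmem_dimms = dimms_needed(PMEM_GB)
dram_dimms = dimms_needed(DRAM_GB)

# The symmetry rule means both types must match the larger of the two counts.
min_symmetric_slots = 2 * max(pmem_dimms, dram_dimms)
print(pmem_dimms, dram_dimms, min_symmetric_slots)

for slots in (64, 96):
    dram_waste, pmem_waste = symmetric_waste(PMEM_GB, DRAM_GB, slots)
    print(f"{slots} slots: {dram_waste}GB DRAM and {pmem_waste}GB PMEM unused")
```

Rounding each type up separately gives 49 DIMMs, but the symmetry requirement is what pushes the minimum to 56 slots and, from there, to the next slot count that real systems offer.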

I chose to use the best pricing that I could find for DIMM prices assuming that other vendors would be able to match these prices.  I then applied that pricing to those vendors that can utilize  64 DIMM slots on a 6 or 8-socket configuration.  After a simple calculation[vi], the cost of just the memory of the DRAM+PMEM system came out $7,648 higher than the DRAM only system.  And remember, this is before adding in any additional costs for more processors and a system which can support more processors.
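As a minimal sketch of that calculation, the following reproduces the memory-only comparison using the per-DIMM prices given in footnote [vi].  The function name is hypothetical, and the prices are simply the ones quoted there, not current market prices.

```python
DRAM_DIMM_PRICE = 2670  # USD per 128GB DRAM DIMM, from footnote [vi]
PMEM_DIMM_PRICE = 1574  # USD per 128GB PMEM DIMM, from footnote [vi]


def memory_cost(dram_dimms, pmem_dimms=0):
    """Total memory cost in USD for a given DIMM mix."""
    return dram_dimms * DRAM_DIMM_PRICE + pmem_dimms * PMEM_DIMM_PRICE


dram_only = memory_cost(48)   # 4-socket system, 48 DRAM DIMMs
mixed = memory_cost(32, 32)   # 64-slot system, 32 DRAM + 32 PMEM DIMMs

print(f"DRAM only:   ${dram_only:,}")
print(f"DRAM + PMEM: ${mixed:,}")
print(f"Difference:  ${mixed - dram_only:,}")
```

The mixed configuration buys 16 fewer DRAM DIMMs but 32 PMEM DIMMs, so the lower PMEM price per DIMM does not offset the extra modules.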

Of course, 256GB DRAM DIMMs could be used, reducing the DRAM DIMM count to 15, but this raises a thorny issue: no appliance has been certified by SAP[vii] with 256GB DRAM DIMMs.  Even if we ignored that issue and went out on a limb using TDI V5 relaxed rules, the significantly higher cost of 256GB DRAM DIMMs over 128GB DRAM DIMMs[viii] plus the need to round up to 24 DIMM slots would result in a configuration that still cost substantially more than the DRAM only configuration.

Any way that you cut it, the use of PMEM in a realistic SoH or S/4HANA configuration results in a higher cost of acquisition than a DRAM only configuration.  In other words, performance takes a major hit when using PMEM for HANA, as shown in the previous blog posts, and it does not save any money; it actually costs more.  The only potential gain comes from faster restarts.

[i] https://www.dell.com/en-us/work/shop/cty/pdp/spd/poweredge-r940/pe_r940_12229_vi_vp?configurationid=0163c707-0003-46a0-808a-3b55c864ba70
[ii] https://dcsc.lenovo.com/#/configuration/cto/7X13CTO1WW?hardwareType=server
[iii] https://launchpad.support.sap.com/#/notes/2786237
[iv] https://launchpad.support.sap.com/#/notes/2700084
[v] actual range was 41.5% to 43.3% for PMEM versus 58.5% to 56.7% for DRAM based on the small to very large reports
[vi] 48 x $2,670 = $128,160 (DRAM only), 32 x $1,574 + 32 x $2,670 = $135,808 (DRAM + PMEM)
[vii] https://www.sap.com/dmc/exp/2014-09-02-hana-hardware/enEN/appliances.html#viewcount=100
[viii] https://dcsc.lenovo.com/#/configuration/cto/7X13CTO1WW?hardwareType=server

February 4, 2020 | Uncategorized | 2 Comments

Optane DC Persistent Memory – Proven, industrial strength or full of hype?

“Intel® Optane™ DC persistent memory represents a groundbreaking technology innovation” says the press release from Intel.  They go on to say that it “represents an entirely new way of managing data for demanding workloads like the SAP HANA platform. It is non-volatile, meaning data does not need to be re-loaded from persistent storage to memory after a shutdown. Meanwhile, it runs at near-DRAM speeds, keeping up with the performance needs and expectations of complex SAP HANA environments, and their users.”  and “Total cost of ownership for memory for an SAP HANA environment can be reduced by replacing expensive DRAM modules with non-volatile persistent memory.”  In other words, they are saying that it performs well, lowers cost and improves restart speeds dramatically.  Let’s take a look at each of these potential benefits, starting with performance, examine their veracity and evaluate other options to achieve these same goals.

I know that some readers appreciate the long and detailed posts that I typically write.  Others might find them overwhelming.  So, I am going to start with my conclusions and then provide the reasoning behind them in separate posts.

Conclusions

Performance

Storage class memory is an emerging type of memory that has great potential, but in its current form, Intel Optane DC Persistent Memory, it is unproven.  It could have a moderate performance impact on highly predictable, low complexity workloads, will likely have a much higher impact on more complex workloads, and could degrade OLTP workloads so significantly that meeting performance SLAs becomes impossible.

Some workloads, e.g. aged data in the form of extension nodes, data aging objects, HANA native storage extensions, data tiering or archives could be placed on this type of storage to improve speed of access.  On the other hand, if the SLAs for access to aged data do not require near in-memory speeds, then the additional cost of persistent memory over old, and very cheap, spinning disk may not be justified.

Highly predictable, simple, read-only query environments, such as canned reporting from a BW system, may derive some value from this class of memory.  However, data load speeds will need to be carefully examined to ensure that data ingestion throughput to encrypted persistent storage allows for daily updates within the allowed outage window.

Restart Speeds

Intel’s Storage Class memory is clearly orders of magnitude faster than external storage, whether SSD or other types of media.  Assuming this was the only issue that customers were facing, there were no performance or reliability implications and no other way to address restart times, then this might be a valuable technology.  As SAP has announced DRAM based HANA Fast Restart with HANA 2.0 SPS04 and most customers use HANA System Replication when they have high uptime requirements, the need for rapid restarts may be significantly diminished.  Also, this may be a solution to a problem of Intel’s own making as IBM Power Systems customers rarely share this concern, perhaps because IBM invested heavily in fast I/O processing in their processor chips.

TCO

On a GB to GB comparison, Optane is indeed less expensive than DRAM … assuming you are able to use all of it.  Guidance from several vendors and from SAP suggests that you populate the same number of slots with NVDIMMs as are used for DRAM DIMMs.  SAP recommends only using NVDIMMs for columnar storage, and historic memory/slot limitations are largely based on performance.  As a result, some of this new storage may go unused, so the cost per used GB may not be as low as the cost per installed GB.
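The gap between cost per installed GB and cost per used GB can be illustrated with a small sketch.  The inputs are purely illustrative assumptions borrowed from the 6TB example in the February 2020 post on this page (32 x 128GB PMEM DIMMs at $1,574 each, of which 2,588GB is actually used); the function name is hypothetical.

```python
def cost_per_gb(price_per_dimm, dimm_gb, dimms, used_gb):
    """Return (cost per installed GB, cost per used GB) in USD."""
    total_cost = price_per_dimm * dimms
    installed_gb = dimm_gb * dimms
    return total_cost / installed_gb, total_cost / used_gb


# Illustrative inputs only; substitute figures from your own sizing report.
per_installed, per_used = cost_per_gb(
    price_per_dimm=1574, dimm_gb=128, dimms=32, used_gb=2588
)
print(f"per installed GB: ${per_installed:.2f}")
print(f"per used GB:      ${per_used:.2f}")
```

With these inputs, the cost per used GB is noticeably higher than the cost per installed GB, which is the effect described above: the lower the share of installed PMEM that HANA can actually use, the smaller the price advantage over DRAM.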

And if reducing TCO is the goal, there are dozens of other ways in which TCO can be minimized, not just lowering the cost of DIMMs.  For customers that are really focused on reducing TCO, effective virtualization, different HA/DR methodologies, optimized storage and other associated IT cost optimization may have as much or more impact on TCO as may be possible with the use of storage class memory.  In addition, the cost of downtime should be included in any TCO analysis.  Since this type of memory is unproven in widespread and/or large memory installations, and the available memory protection is less than is available for DRAM based DIMMs, this potential cost to the enterprise may currently dwarf the savings from using this technology.

May 13, 2019 | Uncategorized | 1 Comment