SAPonPower

An ongoing discussion about SAP infrastructure

Optane DC Persistent Memory – Proven, industrial strength or full of hype?

“Intel® Optane™ DC persistent memory represents a groundbreaking technology innovation,” says the press release from Intel.  They go on to say that it “represents an entirely new way of managing data for demanding workloads like the SAP HANA platform. It is non-volatile, meaning data does not need to be re-loaded from persistent storage to memory after a shutdown. Meanwhile, it runs at near-DRAM speeds, keeping up with the performance needs and expectations of complex SAP HANA environments, and their users.” and “Total cost of ownership for memory for an SAP HANA environment can be reduced by replacing expensive DRAM modules with non-volatile persistent memory.”  In other words, they are saying that it performs well, lowers cost and improves restart speeds dramatically.  Let’s take a look at each of these potential benefits, starting with performance, examine their veracity and evaluate other options for achieving the same goals.

I know that some readers appreciate the long and detailed posts that I typically write.  Others might find them overwhelming.  So, I am going to start with my conclusions and then provide the reasoning behind them in separate posts.

Conclusions

Performance

Storage class memory is an emerging type of memory with great potential, but in its current form, Intel DC Persistent Memory, it is unproven.  It could have a moderate performance impact on highly predictable, low complexity workloads, will likely have a much higher impact on more complex workloads and could degrade OLTP workloads so severely that meeting performance SLAs becomes impossible.

Some data, e.g. aged data held in extension nodes, data aging objects, HANA native storage extensions, data tiering targets or archives, could be placed on this type of storage to improve access speed.  On the other hand, if the SLAs for access to aged data do not require near in-memory speeds, then the additional cost of persistent memory over old, and very cheap, spinning disk may not be justified.

Highly predictable, simple, read-only query environments, such as canned reporting from a BW system, may derive some value from this class of memory; however, data load speeds will need to be carefully examined to ensure that data ingestion throughput to encrypted persistent storage allows for daily updates within the allowed outage window.

Restart Speeds

Intel’s storage class memory is clearly orders of magnitude faster than external storage, whether SSD or other types of media.  If this were the only issue customers faced, if there were no performance or reliability implications, and if there were no other way to address restart times, then this might be a valuable technology.  But SAP has announced DRAM based HANA Fast Restart with HANA 2.0 SPS04, and most customers use HANA System Replication when they have high uptime requirements, so the need for rapid restarts may be significantly diminished.  Also, this may be a solution to a problem of Intel’s own making, as IBM Power Systems customers rarely share this concern, perhaps because IBM invested heavily in fast I/O processing in their processor chips.

TCO

On a GB to GB comparison, Optane is indeed less expensive than DRAM … assuming you are able to use all of it.  Several vendors’ and SAP’s guidance suggests populating the same number of slots with NVDIMMs as are used for DRAM DIMMs.  SAP recommends only using NVDIMMs for columnar storage, and historic memory/slot limitations are largely based on performance.  This means that some of this new storage may go unused, which in turn means the cost per used GB may not be as low as the cost per installed GB, as the simple sketch below illustrates.
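To make the installed-versus-used distinction concrete, here is a minimal sketch of the arithmetic; the prices and utilization fractions are purely hypothetical assumptions, not vendor figures.

# Illustrative only: prices and utilization fractions below are assumptions, not published figures.
def cost_per_used_gb(price_per_gb, installed_gb, used_fraction):
    # Effective cost per GB actually consumed, given how much of the installed capacity is used.
    used_gb = installed_gb * used_fraction
    return (price_per_gb * installed_gb) / used_gb

# Hypothetical example: NVDIMMs at half the per-GB price of DRAM, but only 60% of the capacity usable.
dram = cost_per_used_gb(price_per_gb=10.0, installed_gb=1024, used_fraction=0.95)
nvdimm = cost_per_used_gb(price_per_gb=5.0, installed_gb=1024, used_fraction=0.60)
print(f"DRAM   effective cost per used GB: {dram:.2f}")    # ~10.53
print(f"NVDIMM effective cost per used GB: {nvdimm:.2f}")  # ~8.33
# In this made-up case the installed price per GB was 5.00, but the effective price per used GB
# rises to 8.33, narrowing the apparent savings over DRAM.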

And if lowering TCO is the goal, there are dozens of other ways in which TCO can be minimized, not just lowering the cost of DIMMs.  For customers that are really focused on reducing TCO, effective virtualization, different HA/DR methodologies, optimized storage and other associated IT cost optimizations may have as much or more impact on TCO as the use of storage class memory.  In addition, the cost of downtime should be included in any TCO analysis; since this type of memory is unproven in widespread and/or large memory installations, and the available memory protection is less than that available for DRAM based DIMMs, this potential cost to the enterprise may dwarf the savings from using this technology today.

May 13, 2019 Posted by | Uncategorized | , , , , , , , , , , , , , | 1 Comment

Why SAP HANA on IBM Power Systems

This entry has been superseded by a new one: https://saponpower.wordpress.com/2014/06/06/there-is-hop-for-hana-hana-on-power-te-program-begins/

 

Before you get the wrong impression, SAP has not announced the availability of HANA on Power and in no way should you interpret this posting as any sort of pre-announcement.  This is purely a discussion about why you should care whether SAP decides to support HANA on Power.

 

As you may be aware, during SAP’s announcement in early January of the availability of HANA as the Business Suite DB for ramp-up customers, Vishal Sikka, Chief Technology Officer and member of the Executive Board at SAP, stated, “We have been working heavily with [IBM].  All the way from lifecycle management and servers to services even Cognos, Business Intelligence on top of HANA – and also evaluating the work that we have been doing on POWER.  As to see how far we can go with POWER – the work that we have been doing jointly at HPI.  This is a true example of open co-innovation that we have been working on.”  Ken Tsai, VP of SAP HANA product marketing, later added in an interview with IT Jungle, “Power is something that we’re looking at very closely now.” http://www.itjungle.com/fhs/fhs011513-story01.html.  And from Amit Sinha, head of database and technology product marketing: “[HANA] on Power is a research project currently sponsored at Hasso Plattner Institute. We await results from that to take next meaningful steps jointly with IBM.”  Clearly, something significant is going on.  So, why should you care?

 

Very simply, the reasons why customers chose Power Systems (and perhaps HP Integrity and Oracle/Fujitsu SPARC/Solaris) for SAP DBs in the past, i.e. scalability, reliability and security, are just as relevant with HANA as they were with conventional databases, perhaps even more so.  Why more so?  Because once the promise of real time analytics on an operational database is realized, not necessarily in version 1.0 of the product but undoubtedly in the future, the value obtained from this capability would be lost just as surely if the system were unavailable or could not respond with the speed necessary for real time analytics.

 

A little known fact is that HANA as the Business Suite DB is currently limited to a single node.  This means that scale out options, common in the BW HANA space and others, are not available for this implementation of the product.  Until that changes, customers that wish to host large databases may require a larger number of cores than x86 vendors currently offer.

 

A second known but often overlooked fact is that parallel transactional database systems for SAP are often complex, expensive and have so many limitations that only two types of customers consider this option: those which need continuous or near continuous availability, and those that want to move away from a robust UNIX solution and realize that, to attain the same level of uptime as a single node UNIX system with conventional HA, an Oracle RAC or DB2 PureScale cluster is required.  Why is it so complex?  Without getting into too much detail, we need to look at the way SAP applications work and interact with the database.  As most are aware, when a user logs on to SAP, they are connected to a unique application server and, until they log off, will remain connected to that server.  Each application server is, in turn, connected to one node of a parallel DB cluster.  Each request to read or write data is sent to that node and, if the data is local, i.e. in the memory of that node, the processing occurs very rapidly.  If, on the other hand, the data is on another node, that data must be moved from the remote node to the local node.  Oracle RAC and DB2 PureScale take two different approaches: Oracle RAC uses Cache Fusion to move the data across an IP network, while DB2 PureScale uses Remote DMA to move the data across the network without using an IP stack, thereby improving speed and reducing overhead.  Though there may be benefits of one over the other, this posting is not intended to debate that point, but instead to point out that even with the fastest, lowest overhead transfer on an InfiniBand network, access to remote memory is still thousands of times slower than access to local memory, as the rough sketch below illustrates.
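A minimal, back-of-the-envelope model of why locality dominates in such a cluster; the latency figures are illustrative assumptions, not measurements of RAC, PureScale or any specific interconnect.

# Illustrative model of average data-access time in a two-node shared-data cluster.
# The latency values are assumed round numbers, not benchmarks of any product.
LOCAL_ACCESS_NS = 100          # assumed: data already in the local node's memory
REMOTE_ACCESS_NS = 200_000     # assumed: data must be shipped from the remote node

def average_access_ns(local_hit_ratio):
    # Expected access time given how often the requested data happens to be local.
    return local_hit_ratio * LOCAL_ACCESS_NS + (1 - local_hit_ratio) * REMOTE_ACCESS_NS

for ratio in (1.0, 0.99, 0.90, 0.50):
    print(f"local hit ratio {ratio:4.0%}: average access {average_access_ns(ratio):>9,.0f} ns")
# Even a small fraction of remote accesses drags the average far above local-memory speed,
# which is why a cluster-unaware application pays a heavy penalty on a parallel database.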

 

Some applications are “cluster aware”, i.e. application servers connect to multiple DB nodes at the same time and direct traffic based on data locality, which is only possible if the DB and application servers work cooperatively to communicate which data is located where.  SAP Business Suite is not currently cluster aware, meaning that without a major change in the NetWeaver stack, replacing a conventional DB with a HANA DB will not result in cluster awareness, and the HANA DB for Business Suite may need to remain a single node implementation for some time.

 

Reliability and security have been the subjects of previous blog posts and will be reviewed in some detail in an upcoming post.  Clearly, while some level of outage may be tolerable for application servers thanks to an n+1 architecture, few customers consider outages of a DB server acceptable unless they have implemented a parallel cluster, and even then outages may be mitigated but are still not considered tolerable.  Also, as mentioned above, in order to achieve this, one must deal with the complexity, cost and limitations of a parallel DB.  Since HANA for Business Suite is a single node implementation, at least for the time being, an outage or security intrusion would result in a complete outage of that SAP instance, perhaps more depending on the interaction and interfaces between SAP components.  Power Systems has a proven track record among medium and large enterprise SAP customers of delivering the lowest level of both planned and unplanned outages and security vulnerabilities of any open system.

 

Virtualization and partition mobility may also be important factors to consider.  As all Power partitions are by their very definition “virtualized”, it should be possible to dynamically resize a HANA DB partition, host multiple HANA DB partitions on the same system and even move those partitions around using Live Partition Mobility.  By comparison, an x86 environment lacking VMware or similar virtualization technology could do none of the above.  Though, in theory, SAP might support x86 virtualization at some point for production HANA Business Suite DBs, they don’t currently, and there are a host of reasons why they should not, which are the same reasons why any production SAP databases should not be hosted on VMware, as I discussed in my blog posting:  https://saponpower.wordpress.com/2011/08/29/vsphere-5-0-compared-to-powervm/   Lacking x86 virtualization, a customer might conceivably need a DB/HA pair of physical machines for each DB instance, compared to potentially a single DB/HA pair for a Power based virtualized environment.

 

And now a point of pure speculation: with a conventional database, basis administrators and DBAs weigh the cost/benefit of different levels in a storage hierarchy, including main memory, flash and HDDs.  Usually, main memory is sized to contain upwards of 95% of commonly accessed data, with flash used for logs and some hot data files and HDDs for everything else.  For some customers, 30% to 80% of an SAP database is utilized so infrequently that keeping aged items in memory makes little sense and would add cost without any associated benefit.  Unlike conventional DBs, with HANA there is no choice: 100% of an SAP database must reside in memory, with flash used for logs and HDDs used for a copy of the data in memory.  Not only does this mean radically larger amounts of memory must be used, but as a DB grows, more memory must be added over time.  Also, more memory means more DIMMs, with an associated increase in DIMM failure rates, power consumption and heat dissipation.  Here Power Systems once again shines.  First, IBM offers Power Systems with much larger memory capacities and also offers Memory on Demand on Power 770 and above systems.  With this, customers can pay for just the memory they need today and incrementally and non-disruptively add more as they need it.  That is not speculation, but the following is.  Power Systems using AIX offers Active Memory Expansion (AME), a unique feature which allows infrequently accessed memory pages to be placed into a compressed pool which occupies much less space than uncompressed pages.  AIX then transparently moves pages between the uncompressed and compressed pools based on page activity, using a hardware accelerator in POWER7+.  In theory, a HANA DB could take advantage of this in an unprecedented way.  Tests with DB2 have shown a 30% to 40% expansion rate (i.e. 10GB of real memory looks like 13GB to 14GB to the application), and since potentially far more of a HANA DB would have low use patterns, it may be possible to size the memory of a HANA DB at a small fraction of the actual data size and consequently at a much lower cost, with the associated lower rates of DIMM failures, less power and less cooling.  The sketch below works through that arithmetic.
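A minimal sketch of the arithmetic behind that speculation; the expansion factors are the DB2 figures quoted above, applied hypothetically to a HANA-like workload.

# Hypothetical sizing under Active Memory Expansion (AME).
# expansion_factor = how much larger memory appears to the application than the real
# memory installed; 1.3-1.4 corresponds to the 30%-40% DB2 results cited above.
def real_memory_gb(apparent_gb, expansion_factor):
    # Real memory required to present 'apparent_gb' to the application under AME.
    return apparent_gb / expansion_factor

print(real_memory_gb(apparent_gb=1024, expansion_factor=1.3))  # ~788 GB real for 1 TB apparent
print(real_memory_gb(apparent_gb=1024, expansion_factor=1.4))  # ~731 GB real for 1 TB apparent
# If a largely cold in-memory database compressed even better (a speculative assumption),
# the real-memory footprint, and with it DIMM count, power and cooling, would shrink further.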

 

If you feel that these potential benefits make sense and that you would like to see a HoP option, it is important that you share this desire with SAP, as they are the only ones that can make the decision to support Power.  Sharing your desire does not imply that you are ready to pull the trigger or that you won’t consider all available options, simply that you would like to be informed about SAP’s plans.  In this way, SAP can gauge customer interest and you can have the opportunity to find out which of the above suggested benefits might actually be part of a HoP implementation, or even get SAP to consider supporting one or more of them that you consider to be important.  Customers interested in receiving more detailed information on the HANA on Power effort should approach their local SAP Account Executive in writing, requesting disclosure information on this platform technology effort.

March 14, 2013 Posted by | Uncategorized | , , , , , , , , | 5 Comments

Get all the benefits of SAP HANA 2.0+ today with no risk and at a fraction of the cost!

I know, that sounds like a guy on an infomercial trying to sell you a set of knives, but it may surprise you that many of the benefits planned for the eventual HANA 2.0 transactional database from SAP are available today, with tried and proven technology and no rewriting of any SAP applications.

Let’s take a quick step back.  I just got back from SAP TechEd 2011 in Las Vegas.  As with Sapphire, just about every other word out of SAP employees’ mouths was HANA or in-memory computing.  SAP is moving rapidly on this exciting project.  They have now released two new HANA applications, Smart Meter Analytics and the COPA Accelerator.  SAP shared that BW on HANA would be entering ramp-up in November.  They are busy coding other solutions as well.  Some people have the misconception that HANA is going to be just like BWA, i.e. plug and play: just select the data you want moved to it, transparent to the application other than being faster.  But that is not how HANA works.  Applications have to be rewritten for HANA, not ported from existing SAP systems.  Data must be modeled based on specific application requirements.  Though the benefits can be huge, the effort to get there is not trivial.

It is going to be a gradual process by which specific applications are rolled out on HANA.  BW is an obvious next step since many customers have both a BW system and a BWA.  BW on HANA promises a single device with radical improvements in speed, and not just for pre-selected InfoCubes but for the entire database.  Side note: HANA provides an additional benefit for text queries as it does not have the 60 character limitation of BW.  It is not quite so clear whether customers will be willing to pay the price, as this would place even old, infrequently accessed data in memory, with the incremental price for systems, memory and SAP software for this aged data.

While this may be obvious, it is worthwhile summarizing the basic benefits of HANA.  HANA, an in-memory database, has three major benefits: 1) all data and indexes reside in memory, eliminating any disk access for query purposes and resulting in dramatic gains in query speed; 2) application code executes on the same device as the database, thereby eliminating data transfers between application servers and database servers; 3) near realtime replication of data not just from SAP systems, but from just about any other data source a customer might choose, which is indeed a great thing.

So, these are great goals and eventually something that should benefit not just query based applications, but a wide variety of applications including transactional processing.  As mentioned above, it is not simply a question of exporting the data from the current ERP database, for instance, and dropping it on HANA.  Every one of the 80,000 or so tables must be modeled in HANA.  All code, including thousands of ABAP and Java programs, must be rewritten to run on HANA.  And, as mentioned in a previous blog entry, SAP must significantly enhance HANA to deal with cache coherency, transactional integrity, lock management and discrete data recovery, to name just a few items on a long laundry list of challenges.  In other words, despite some overenthusiastic individuals’ assertions or innuendos that the transactional in-memory database will be available in 2 to 3 years, in reality it will likely take far longer.

This means you have to wait to attain those benefits, right?  Wrong.  The technology exists today, with no code changes, no data modeling and full support, to place entire databases, including indexes, in memory along with application servers, thereby achieving two of the three goals of HANA.  Let’s explore that a little further.  HANA today allows for 5 to 1 compression of most uncompressed data, but to make HANA work, an amount of memory equal to the compressed data must be allocated as temporary work space.  For example, a 1TB uncompressed database should require about 200GB for data and another 200GB for work space, resulting in a total memory requirement of 400GB.  A 10TB uncompressed database would require 4TB of memory, but since the supported configurations only allow for a total of 1TB per node, this would require a cluster of 4 @ 1TB systems.  Fortunately, IBM provides the System x3850, which is certified for exactly this configuration.  The sizing arithmetic is sketched below.  But remember, we are talking about a future in-memory transactional system, not today’s HANA.
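A minimal sketch of that sizing arithmetic, using the 5:1 compression ratio and the equal-sized work space rule described above; these are the rules of thumb from the text, not an official SAP sizing formula.

# Rule-of-thumb HANA memory sizing from the figures quoted above (not an official sizing tool).
def hana_memory_tb(uncompressed_db_tb, compression_ratio=5.0, workspace_factor=1.0):
    # Data memory after compression, plus an equal-sized work space.
    data_tb = uncompressed_db_tb / compression_ratio
    return data_tb * (1 + workspace_factor)

print(hana_memory_tb(1))   # 0.4 TB -> 400GB total for a 1TB uncompressed database
print(hana_memory_tb(10))  # 4.0 TB -> e.g. a cluster of 4 nodes at 1TB each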

DB2 offers data compression of approximately 60% today, which means that a 10TB database, if buffered completely in memory, would require 4TB of memory.  A tall requirement, but achievable since IBM offers a 4TB system called the Power 795.  However, IBM also offers a feature called Active Memory Expansion (AME), available only with Power Systems and AIX, which uses memory compression to make real memory appear to applications as if it were larger than it really is.  DB2 is fully supported with this feature and can see an additional 20% to 35% compression using AME.  In other words, that same 10TB database may be able to fit within 3TB or less of real memory, as the sketch below shows.
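The same back-of-the-envelope arithmetic for the DB2 path, combining the roughly 60% compression and 20% to 35% AME figures quoted above; again, rules of thumb rather than a sizing guarantee.

# Back-of-the-envelope DB2-in-memory sizing using the percentages quoted above.
def db2_real_memory_tb(uncompressed_db_tb, db2_compression=0.60, ame_expansion=1.30):
    # Memory after DB2 compression, further reduced by the AME expansion factor.
    compressed_tb = uncompressed_db_tb * (1 - db2_compression)  # 10TB -> 4TB
    return compressed_tb / ame_expansion                        # 4TB -> ~3.1TB real

print(db2_real_memory_tb(10))                      # ~3.1 TB with a 30% AME benefit
print(db2_real_memory_tb(10, ame_expansion=1.35))  # ~3.0 TB at the optimistic end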

Some customers might not need such a large system from a processing perspective, so two options exist.  One, with Capacity on Demand, customers can activate a small fraction of the available processors while still having access to all available memory.  This would significantly reduce the cost of activated processors as well as of AIX and related software licenses and maintenance.  For those customers that purchase their DB2 software on a per core basis, this would further reduce those costs, but it would clearly have no effect on customers who purchase the SAP OEM edition of DB2.

A second option is to use DB2 PureScale, now a certified database option available for approved pilot customers of SAP.  With this option, using the same AME feature, a customer could cluster 3 @ 1TB Power Systems or 6 @ 500GB systems.

By the same token, SAP application servers can co-reside in the same OS instance or instances as DB2.  While this would add to the memory requirement, ABAP servers get even more benefit from AME, with 50% or more compression.

So, it is entirely possible to build a full in-memory database using IBM Power Systems and DB2, housing both the database and the application server, with no code changes, no data modeling and an established database system that is proven and handles all of the transactional database requirements noted above, today.  Assuming you already have a DB2 license, you would not even see any incremental software cost unless you move to DB2 PureScale, which does come at an additional cost of 2% of SAV.  For those with licenses for Oracle or SQL Server, while I am no expert on DB2 cost studies, those that I have seen show very good ROI over relatively short periods, and that is before you consider the ability to achieve near HANA-like performance as described in this blog post.  Lastly, this solution is available under existing SAP license agreements, unlike HANA, which comes with a pretty significant premium.

September 18, 2011 Posted by | Uncategorized | , , , , , , , | Leave a comment