SAPonPower

An ongoing discussion about SAP infrastructure

The top 3 things that SAP needs are memory, memory and I can’t remember the third. :-) A review of the IBM Power Systems announcements with a focus on the memory enhancements.

While this might not exactly be new news, it is worthwhile to consider the value of the latest Power Systems announcements for SAP workloads. On October 12, 2011, IBM released a wide range of enhancements to the Power Systems family. The ones that received the most publicity, not to mention new model numbers, were valuable but, from my point of view, not the most important part of the announcement. Yes, the new higher-clocked Power 770 and 780, and the ability to order a 780 with 2 chips per socket, thereby allowing the system to grow to 96 cores, were certainly very welcome additions to the family. Especially nice was that the 3.3 GHz processors in the new MMC model of the 770 came in at the same price as the 3.1 GHz processors in the previous MMB model. So, about 6.5% higher clock speed, and the performance that comes with it, at no additional cost.

For SAP, however, raw performance often plays second fiddle to memory. The old rule is that for SAP workloads, we run out of memory long before we run out of CPU. IBM started to address this issue in 2010 with the announcement of the Active Memory Expansion (AME) feature of POWER7 systems. This feature allows for dynamic compression/decompression of memory pages, thereby making memory appear to be larger than it really is. The administrator of a system selects the target “expansion” and the system then builds a “compressed” pool in memory into which pages are compressed and placed, starting with the least frequently accessed pages and working toward the more frequently accessed ones. As pages are touched, they are uncompressed and moved into the regular memory pool, from which they are accessed normally. Applications run unchanged, as AIX performs all of the moves without any interaction or awareness required by the application. The point at which response time or throughput starts to degrade, or CPU overhead starts to climb sharply, is the “knee of the curve”; the expansion should be set slightly below that point. A tool called AMEPAT allows the administrator to “model” the workload prior to turning AME on, or for that matter on older hardware, as long as the OS level is AIX 6.1 TL4 SP2 or later.
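To make the “knee of the curve” idea concrete, here is a minimal Python sketch. It assumes the modeling results have already been reduced to pairs of expansion factor and estimated CPU overhead. The table values, the function name and the 5% budget are invented for illustration; they are not AMEPAT output or measured data.

```python
# Hypothetical modeling results: (expansion factor, estimated CPU overhead in %).
# These numbers are invented for illustration, not measured data.
MODELED = [(1.0, 0.0), (1.2, 1.0), (1.4, 2.0), (1.6, 4.0), (1.8, 12.0), (2.0, 35.0)]


def pick_expansion_factor(modeled, max_cpu_overhead_pct):
    """Return the largest modeled factor whose CPU overhead stays within budget."""
    acceptable = [factor for factor, cpu in modeled if cpu <= max_cpu_overhead_pct]
    if not acceptable:
        raise ValueError("no modeled factor fits the CPU budget")
    return max(acceptable)


# With a 5% CPU budget, the sketch stops just before overhead jumps at 1.8.
chosen = pick_expansion_factor(MODELED, max_cpu_overhead_pct=5.0)
print(chosen)
```

In practice the budget would be whatever overhead the administrator is willing to trade for the extra effective memory.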

Some workloads will see more benefit than others. For instance, during internal tests run by IBM, the 2-tier SD benchmark showed outstanding opportunities for compression and hit 111% expansion, e.g. 10GB of real memory appears to be 21GB to the application, before response time or throughput showed any negative effect from the compression/decompression activity. During testing of a retail BW workload, 160% expansion was reached. Even database workloads tend to benefit from AME. DB2 databases, which already feature outstanding compression, have seen another 30% or 40% expansion. The reason for this difference comes from the different approaches to compression. In DB2, if 1,000 residences or businesses have an address on Main Street, Austin, Texas (had to pick a city, so I selected my own), DB2 replaces Main Street, Austin, Texas in each row with a pointer to another table that has a single row entitled Main Street, Austin, Texas. AME, by comparison, is more of an inline compression, e.g. if it sees a repeating pattern, it can replace that pattern with a symbol that represents the pattern and how often it repeats. Oracle recently announced that they would also support AME. The amount of expansion with AME will likely vary from something close to DB2, if Oracle Advanced Compression is used, to significantly higher if Advanced Compression is not used, since many more opportunities for compression will likely exist.
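The expansion percentages quoted above translate into effective memory with simple arithmetic. The following Python sketch shows the conversion in both directions; the function names are my own and the 64GB target is just an example.

```python
def effective_memory_gb(real_gb, expansion_pct):
    """Memory the application sees when real memory is expanded by expansion_pct percent."""
    return real_gb * (1 + expansion_pct / 100)


def real_memory_needed_gb(target_gb, expansion_pct):
    """Real memory required to present target_gb to the application."""
    return target_gb / (1 + expansion_pct / 100)


# The 2-tier SD case from the text: 10GB real at 111% expansion.
print(round(effective_memory_gb(10, 111), 1))
# The retail BW case: 10GB real at 160% expansion.
print(round(effective_memory_gb(10, 160), 1))
# Going the other way: real memory needed to present 64GB at 40% expansion.
print(round(real_memory_needed_gb(64, 40), 1))
```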

So, AME can help SAP workloads close the capacity gap between memory and CPU. Another way to view this is that this technology can decrease the cost of Power Systems by allowing customers either to purchase less memory or to place more workloads on the same system, thereby driving up utilization and decreasing the cost per workload. It is worthwhile to note that many x86 systems have also tried to address this gap, but as none offer anything even remotely close to AME, they have instead resorted to more DIMM slots. While this is a good solution, it should be noted that twice the number of DIMMs requires twice the amount of power and cooling and suffers from twice the failures, i.e. TANSTAAFL: there ain’t no such thing as a free lunch.

In the latest announcements, IBM introduced support for the new 32GB DIMMs. This effectively doubled the maximum memory on most models, from the 710 through the 795. Combined with AME, this decreases or eliminates the gap between memory capacity and CPU and makes these models even more cost effective, since more workloads can share the same hardware. Two other systems received similar enhancements recently, but these were not part of the formal announcement. The two latest blades in the Power Systems portfolio, the PS703 and the PS704, were announced earlier this year with twice the number of cores but the same memory as the PS701 and PS702 respectively. Now, using 16GB DIMMs, the PS703/PS704 can support up to 256GB/512GB of memory, making these blades very respectable, especially for application server workloads. In addition, with the Systems Director Management Console (SDMC), AME can be implemented on blades, allowing for even more effective memory per blade. Combined, these enhancements have closed the price difference even further compared to similar x86 blades.

One last memory-related announcement may have been largely overlooked by many because it involved an enhancement to the Active Memory Sharing (AMS) feature of PowerVM. AMS has historically been a technology that allowed for overcommitment of memory. While CPU overcommitment is now routine, memory overcommitment means that some percentage of memory pages will have to be paged out to solid state or other types of disk. The performance penalty is well understood, which makes this inappropriate for production workloads but potentially beneficial for many non-prod, HA or DR workloads. That said, few SAP customers have implemented this technology due to the complexity and performance variability that can result. The new announcement introduces Active Memory™ Deduplication for AMS implementations. Using this new technology, PowerVM will scan partitions after they finish booting and locate identical pages within and across all partitions on the system. When identical pages are detected, all copies except one will be removed and all memory references will point to the same “first copy” of the page. Since PowerVM is doing this, even the OSs can be unaware of this action. Instead, as this post-processing proceeds, the PowerVM free memory counter will increase until a steady state has been reached. Once enough memory is freed up in this manner, new partitions may be started. It is quite easy to imagine that a large number of pages are duplicates, e.g. each instance of an OS has many read-only pages which are identical, and multiple instances of an application, e.g. SAP app servers, will likewise have executable pages which are identical. The expectation is that another 30% to 40% effective memory expansion will occur for many workloads using this new technology.
One caveat, however: since the scan happens after a partition boots, operationally it will be important to have a phased booting schedule that allows the dedupe process to free up pages prior to starting more partitions, thereby avoiding the possibility of paging. Early testing suggests that the dedupe process should arrive at a steady state approximately 20 minutes after partitions are booted.
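The idea behind page deduplication can be illustrated with a small Python sketch. This is a toy model only, not how PowerVM implements it: the page size, the use of SHA-256 digests to find identical pages, and all names are assumptions made for the illustration. A real implementation would also confirm that the page contents match byte for byte before sharing them.

```python
import hashlib

PAGE_SIZE = 4096  # assumed page size for the illustration


def deduplicate(partitions):
    """Keep one shared copy per distinct page content.

    partitions: dict of partition name -> list of bytes objects (pages).
    Returns (page_map, store, freed_pages).
    """
    store = {}     # digest -> the single retained copy of the page
    page_map = {}  # (partition, page index) -> digest of the shared copy
    freed = 0
    for name, pages in partitions.items():
        for index, page in enumerate(pages):
            digest = hashlib.sha256(page).hexdigest()
            if digest in store:
                freed += 1  # duplicate: its frame can be handed out again
            else:
                store[digest] = page
            page_map[(name, index)] = digest
    return page_map, store, freed


# Two toy partitions sharing an "OS" page and an "application" page.
os_page = b"\x01" * PAGE_SIZE
app_page = b"\x02" * PAGE_SIZE
partitions = {
    "lpar1": [os_page, app_page, b"\x03" * PAGE_SIZE],
    "lpar2": [os_page, app_page, b"\x04" * PAGE_SIZE],
}
page_map, store, freed = deduplicate(partitions)
print(freed, len(store))
```

Of the six pages in the toy example, four distinct copies remain and two frames are freed, which is the effect that shows up as a rising free memory counter.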

The bottom line is that with the larger DIMMs, AME and AMS Memory Deduplication, IBM Power Systems are in a great position to allow customers to fully exploit the CPU power of these systems by combining even more workloads on fewer servers. This will effectively drive down the total cost of acquisition (TCA) for customers and remove what little difference there might be between Power Systems and systems from various x86 vendors.

November 29, 2011 | Uncategorized

4 Comments »

  1. Is there a SAP NOTE for using Active Memory Sharing (AMS) feature of PowerVM for SAP systems?

    Comment by Srilaxmi Rao | December 24, 2013 | Reply

    • There are no SAP notes on AMS that I could find. There is a SAP note on the use of shared memory in the VMware environment, 1122388 – Linux: VMware vSphere configuration guidelines, which says that this is not recommended. For the same reasons as in that SAP note, I would not recommend using overcommitment of memory under AMS in production environments. For non-prod, however, overcommitment of memory may be possible depending on the performance requirements of each partition. For example, developers are expensive resources, so I would avoid anything that would impact their productivity, and dev systems are in use pretty much every day. Many others, e.g. sandbox, training, test and QA, are used less frequently. As long as the LPARs sharing a physical system are not all active simultaneously, the only impact would be when a partition that has memory paged out becomes active again, in which case AMS will swap out pages of less active LPARs and swap in pages of the LPAR that is becoming active. During this time, performance will be sluggish.

      My personal recommendation would be to avoid this entirely. Though there would be savings on memory, the individuals using those partitions will not be aware of the cost savings and will complain to the help desk or management during those sluggish times. That will translate into headaches for the IT team, which may not justify the savings on memory.

      On the other hand, AMS memory deduplication would have virtually no performance impact as long as proper start-up procedures are followed. The dedupe function operates as follows. Five minutes after an LPAR is started, the VIO server starts to scan its memory and compare each page to pages in use by other LPARs. Each time it finds an exact duplicate, it changes the pointer for that page to the location of the first copy of the page, then wipes the duplicate copy clean and makes it available for allocation to other partitions. This only takes a minute or so and has very little impact on any partition, since it is done by the VIO server and only affects the memory map in the hypervisor. The LPARs are not even aware of the actions that have been taken. If an LPAR tries to modify a deduped page, the VIO server grabs a free page and copies the original page into it, after which the LPAR is free to modify its own copy as needed. There might be a short delay during this process, but it should not be noticeable to anyone since it should be a very rare occurrence.

      So, the procedure that needs to be followed is fairly simple. 1) Never start more LPARs than the system has physical memory to support. 2) Wait for the dedupe process to occur. 3) Check the amount of free memory against the memory requirements of the new LPARs to be started. 4) Start one or more LPARs, which will in turn be deduped eventually. In this way, there will be no memory overcommitment and little to no performance impact. This also means that in the extremely rare event of a full system failure, the start-up procedures have to follow the above suggestions.
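The four steps above can be sketched as a small planning routine in Python. This is only an illustration of the logic: the function name, the LPAR names, the memory sizes and the per-LPAR dedupe savings are all hypothetical, and in practice the free memory figure would be read from the system after each dedupe pass rather than estimated in advance.

```python
def phased_startup(physical_gb, lpars, dedupe_savings_gb):
    """Group LPARs into start-up waves that never exceed free physical memory.

    lpars: list of (name, required_gb) in the desired start order.
    dedupe_savings_gb: name -> GB assumed to be freed once dedupe settles.
    """
    free_gb = physical_gb
    pending = list(lpars)
    waves = []
    while pending:
        wave, still_pending = [], []
        for name, required_gb in pending:
            # Steps 1 and 3: only start what fits in currently free memory.
            if required_gb <= free_gb:
                free_gb -= required_gb
                wave.append(name)  # Step 4: start the LPAR
            else:
                still_pending.append((name, required_gb))
        if not wave:
            names = ", ".join(name for name, _ in still_pending)
            raise RuntimeError("not enough physical memory for: " + names)
        # Step 2: wait for dedupe, then count the memory it freed.
        free_gb += sum(dedupe_savings_gb.get(name, 0) for name in wave)
        waves.append(wave)
        pending = still_pending
    return waves


# Hypothetical 100GB system; sizes and savings are made up.
lpars = [("prd_app1", 40), ("prd_app2", 40), ("qa", 30), ("sandbox", 10)]
savings = {"prd_app1": 10, "prd_app2": 12, "qa": 5}
waves = phased_startup(100, lpars, savings)
print(waves)
```

In this example the QA partition does not fit in the first wave and is started only after dedupe has freed enough memory.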

      With AMS, you can select which partitions may participate and those for which AMS should not touch any pages.

      As for AME, this is highly recommended for ABAP app servers and may have some benefit for Java app servers, as long as the systems utilize Power7+ chips rather than Power7, which offers only software AME, not hardware AME. You should run AMEPAT, the AME modeling tool, against any partition that you are considering compressing with AME to determine the optimal expansion setting. Though AME can be used with database partitions and 2-tier app/db partitions, AME does not support large pages, which Oracle prefers for production systems. That said, this is a performance issue only, so you may decide that the performance tradeoff is acceptable in some non-prod LPARs. The good news is that you can selectively turn on AME for each partition. If AME has never been used on a system, I believe that it takes a reboot to enable AME. Likewise, an LPAR that has never been activated for AME must be rebooted to enable it for AME (which turns off large page support as a result), but then it can be set to an expansion factor of 1 if no expansion is desired.

      Comment by Alfred Freudenberger | January 6, 2014 | Reply

  2. We have a great deal of positive experience using Linux HugePages for our Oracle Databases, supporting an array of SAP applications. Can you comment on the value or performance benefit of AIX Large Memory Pages on IBM Power Systems?

    Comment by Dave | October 17, 2014 | Reply

    • We see a similar benefit with AIX and Oracle. Databases, in general, tend to follow a consistent pattern, i.e. if a field is accessed on a given row, there is a very large chance of subsequent fields being accessed, and if one row is accessed, there is also a high likelihood that the following row or several rows will be accessed in the near future. As a result, large page support allows multiple rows to be brought into memory in a single IOP, resulting in dramatically better access times for those subsequent accesses, and memory management overhead is reduced as well. This essentially acts as a look-ahead buffer. The reverse is also true. When Oracle flushes changed rows from memory to disk, large page support allows the number of IOPs to be minimized, resulting in superior throughput and less overhead to manage memory pages. AIX currently offers variable page sizes, from 4K to 16GB, but most often database systems like Oracle are set up to use 64K page sizes. There can be diminishing returns with too large a page size, especially when those pages have to be flushed and are larger than can be accommodated with single IO operations.
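To give a feel for the memory management side of this, here is a quick Python calculation of how many pages have to be tracked for a memory region at different page sizes. The 64 GiB region is an assumed example and the helper name is my own. The three page sizes fall within the 4K to 16GB range mentioned above.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def pages_needed(region_bytes, page_bytes):
    """Number of pages required to map a region, rounding up."""
    return -(-region_bytes // page_bytes)


region = 64 * GIB  # assumed size of a database memory region
for label, size in [("4K", 4 * KIB), ("64K", 64 * KIB), ("16M", 16 * MIB)]:
    print(label, pages_needed(region, size))
```

Moving from 4K to 64K pages cuts the number of pages to manage by a factor of 16, which is where the reduced memory management overhead comes from.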

      Comment by Alfred Freudenberger | October 17, 2014 | Reply

