An ongoing discussion about SAP infrastructure

ERP Platform Selection – An analysis of Gartner Group’s presentation

Recently, a document from The Gartner Group found its way across my desk. Written by Philip Dawson and Donald Feinburg, it was called “Virtualizing SAP and Oracle: ERP Optimization”. The document seemingly has a single point: everything except x86 is dead, so all new implementations should go on x86 and almost everything else should be migrated to x86 when possible. Not only is this conclusion shortsighted and oblivious to the requirements of large SAP implementations, but we should remember that this is the same company that proclaimed the mainframe dead about 20 years ago, a claim that is as incorrect today as it was back then.

First, the argument that Gartner did not make, which is strangely absent even though, I think, most customers have it as their number one criterion: TCO. Actually, they did make it in a backhanded sort of way, but only in support of legacy UNIX systems and only for tier 2 ERP instances. I have worked directly with hundreds of SAP and Oracle customers. While it is a delight to work with one without financial constraints, these are few and far between. Most customers must find ways to save money. A complex ERP landscape requires a wide array of systems/partitions, high availability, DR, backup/recovery, storage devices, etc. Companies like International Technology Group (ITG) found that when TCO is analyzed, landscapes based on IBM Power Systems are less expensive than x86. Solitaire Interglobal Ltd analyzed TCO and many other factors without limiting their analysis to ERP systems and came to a similar conclusion. I have put together many landscape comparisons for SAP and have been able to show the same effect once the full cost of a mission critical environment is considered: the support that is required, the cost of enterprise level virtualization technologies and the cost of high availability software, all supported for a minimum of 3 years with 24×7 by 4 hour maintenance.

All by itself, a consultant’s report which purports to help customers choose their ERP platforms but does not take this vital issue into consideration is questionable at best.  That said, allow me to take a few of the arguments they did make in the document and explain where each falls short.

Argument #1 – “ISVs focus new application functionality first on x86 Tier 1 ports. Largest installed base. – Invest!”  At least in the case of SAP, with the exception of HANA, new functionality is available on all Tier 1 platforms simultaneously, not first on x86. AIX is considered a Tier 1 platform and shares in this “first” port. Oracle understandably supports its own OSs first, i.e. Linux and Solaris, but actually ports to AIX and other Tier 1 platforms during the same development process and simply limits other OS availability for marketing purposes. Second point: Gartner, as a company, certainly has the experience that these two individuals, either naive or misinformed, apparently lack, i.e. that the number of installs does not equate to the size of the install base. A Fortune 100 company may have 1,000 or more times the number of SAP users and hundreds of thousands more employees compared to a small customer, but the number of installed copies of an ISV’s software may be exactly the same. Install base is far more relevant when one considers the % of revenue/profit derived from a type of platform than the number of CDs cut. I know of a few food manufacturers that have thousands of retail customers, but sell the vast majority of their products through “just” Walmart and a few other mega retailers. By the authors’ logic, they should abandon the mega retailers, since the “largest install base”, counted as the individual companies they deal with, lies in the mom and pop grocery stores.

Argument #2 – “Unix/RISC & Itanium  Viable for mainstream applications and functions (but be prepared to accept delays in new features, functions and patches). Declining installed bases —Move to mainstream on system refresh!”  Some of this is certainly true for the Itanium environment due to Oracle’s explicit removal of Itanium support.  On the AIX side, however,  this statement could not be more incorrect.  Neither Oracle nor SAP has published any such guidance, made any public pronouncements or changed their level of investment with IBM and ongoing development and support efforts continue as before, i.e. AIX is a tier 1 platform and supported as such.  The install base for Power, by the way, is not declining and is benefiting from those customers moving away from HP/Itanium and Oracle/Solaris.

Argument #3 – “Move ERP application deployments toward Linux or Windows on x86. Everything else moving to niche.”  Their conclusion, as with the first argument, is based on market share numbers which are not published by SAP or Oracle and are unrelated to the value of installations. SAP, for example, has a product availability matrix and nothing in that matrix suggests anything remotely close to niche status for AIX.

Argument #4 – “x86 performance and RAS features reach parity, drive Windows and Linux volumes further.”  If only there were a shred of proof for that, this might be an interesting conclusion. The highest x86 result on the single system SAP SD 2-tier benchmark is 25,500 users/140,720 SAPS using the IBM x3850 x5 with 80 cores. Not bad, but the highest Power Systems result is 126,063 users/688,630 SAPS using the Power 795 with 256 cores. That is almost 5 times the size of the x86 result. Perhaps they meant per core. While Gartner published this last year and therefore did not have the benefit of the recent results published on the SAP benchmark web site, I will share them here: the best x86 result is for the IBM x3650 M4 with two E5-2690 processors/16 cores, which achieved 7,855 users/42,880 SAPS or 2,680 SAPS/core. The Power 730, with two Power7 3.55GHz chips/16 cores, achieved 8,704 users/47,600 SAPS or 2,975 SAPS/core. While the x86 results are impressive, they do not achieve parity with Power unless Gartner has submitted a new definition to Webster’s Dictionary of parity being 90% not 100%.
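For readers who want to check the arithmetic behind these comparisons, here is a minimal sketch in Python. It uses only the core counts and SAPS figures quoted above; the dictionary layout and the helper name `saps_per_core` are my own choices for illustration and not part of any SAP tool.

```python
# SAP SD 2-tier results as quoted in the text: system -> (cores, SAPS).
RESULTS = {
    "IBM x3850 x5": (80, 140_720),
    "Power 795": (256, 688_630),
    "IBM x3650 M4": (16, 42_880),
    "Power 730": (16, 47_600),
}


def saps_per_core(system: str) -> float:
    """Return SAPS divided by core count for one quoted result."""
    cores, saps = RESULTS[system]
    return saps / cores


# Largest single-system results: Power 795 versus x3850 x5.
system_ratio = RESULTS["Power 795"][1] / RESULTS["IBM x3850 x5"][1]

# Two-socket, 16-core results: x86 per-core throughput relative to Power.
per_core_ratio = saps_per_core("IBM x3650 M4") / saps_per_core("Power 730")

print(f"Power 795 vs x3850 x5 (total SAPS): {system_ratio:.2f}x")
print(f"x3650 M4 SAPS/core: {saps_per_core('IBM x3650 M4'):.0f}")
print(f"Power 730 SAPS/core: {saps_per_core('Power 730'):.0f}")
print(f"x86 per-core result relative to Power: {per_core_ratio:.1%}")
```

The first ratio is the "almost 5 times" figure and the last is the roughly 90% per-core figure referred to above.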

As to RAS, wow!, where do I start? Well, RAS starts with error detection. Since the mid 90’s, Power Systems have offered First Failure Data Capture (FFDC), previously only found on mainframe systems. This technology is used pervasively throughout the system down to the processors and memory subsystems. It is instrumented to detect soft and hard errors in virtually every part of the system and each chip such that the service processor can make intelligent decisions as to whether an error might cause a partition or system failure and then take appropriate actions to avoid or minimize such an event. In the unlikely event of an unpredicted failure that can’t be contained, this technology allows the service processor to pinpoint the location of the failure such that the failing component can be blacklisted, the system or partition rebooted around the failing component or, in a worst case scenario, a FRU number of the failing component sent to IBM for immediate replacement. Intel recently introduced Machine Check Architecture Recovery (MCA Recovery), which is less instrumented and less functional than FFDC was back in the mid 90’s. Currently, it can pinpoint only memory errors, and it does not utilize a service processor but instead asks the operating system or virtualization manager to handle the error. This is sort of like GM instrumenting their engines such that when they detect a problem such as poor fuel quality or low air temperature, the car does nothing, or perhaps fails and suggests the user call a repair shop, instead of automatically detecting the problem and adjusting the fuel injection system to compensate.

Add to this features like dynamic processor deallocation, L2 and L3 cache line delete, and enhanced error detection and recovery for PCI adapters, to name just a few, for which there is no equivalent in x86 systems. Or go one step further and look at Power Systems’ ability to retry an instruction on any processor in the event of a processor core error detection, or to pick up the actual thread and drop it on another processor in the system, with no interruption to application execution, if the core is determined to be failing. No such thing exists in x86. How about memory protection keys, which lock down memory such that, for instance, a failing device driver cannot corrupt the OS kernel’s memory and a misbehaving application cannot corrupt the file system’s memory? No such thing exists in x86 land. High end Power Systems even offer the ability to recover from a system clock failure through a redundant clock, as well as to add/remove a processor card while the system is running, without any interruption to the running partitions, such that a failing component can be replaced. Once again, x86 systems do not offer these sorts of features. So, it makes you wonder on which planet these Gartner guys reside when they make such statements!

Argument #5 – “Consider x86 virtualization for HA.” Where do I start with such an absurd statement? First of all, Oracle has stated that if you run with VMware or other non-Oracle VM products and have a problem, you must prove that the problem was not caused by the virtualization software by recreating the problem on a standalone system. If that was not enough to cause a customer to avoid placing a database under virtualization, then the high I/O overhead and increased latencies probably will. That said, some customers might still put a DB under x86 virtualization, but then the issue of HA must be considered. VMware HA does a great job of detecting a failed system or partition and recovering one or more on other systems, but VMware HA does not do system stack recovery. For customers that run SAP, for instance, in addition to the database running, one must also ensure that the messaging and enqueue servers are running, that all network files that must be accessed are available, that all file systems are available and that, if something is not, retry or repair actions are initiated prior to restarting the entire partition. VMware does provide an API by which third parties might write application APIs, as Symantec has done, but few proof points exist as to how well this works in production SAP environments.
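To make the idea of stack recovery concrete, here is a minimal sketch in Python of the ordering described above: check each component, retry or repair whatever is down, and restart the whole partition only as a last resort. Everything in it is hypothetical; the component names, the `recover_stack` function and the callables passed to it are illustrations only and do not correspond to any VMware, Symantec, SAP or IBM interface.

```python
from typing import Callable, Dict


def recover_stack(
    checks: Dict[str, Callable[[], bool]],
    repairs: Dict[str, Callable[[], None]],
    restart_partition: Callable[[], None],
    max_retries: int = 2,
) -> str:
    """Check every component; repair failures before restarting anything.

    Returns "healthy", "repaired" or "restarted".
    """
    repaired = False
    for name, is_up in checks.items():
        if is_up():
            continue
        for _ in range(max_retries):
            repairs[name]()          # hypothetical repair action
            if is_up():
                repaired = True
                break
        else:
            # Repair attempts were exhausted: fall back to a full restart.
            restart_partition()
            return "restarted"
    return "repaired" if repaired else "healthy"


# Simulated stack: the enqueue server is down but can be repaired.
state = {"database": True, "message_server": True, "enqueue_server": False}
checks = {name: (lambda n=name: state[n]) for name in state}
repairs = {name: (lambda n=name: state.__setitem__(n, True)) for name in state}
restarts = []

outcome = recover_stack(checks, repairs, lambda: restarts.append("restart"))
print(outcome, restarts)
```

A monitor that only knows whether the virtual machine is alive would skip everything inside the loop, which is the gap described above.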

Argument #6 – “All UNIX for Large Oracle DBMS deployments greater than 64 cores.”  So, if a Power platform running AIX is less expensive, more reliable and more secure, Gartner’s argument is that you should ignore these facts and, for DBMS instances smaller than 64 cores, place these on x86. Interesting, and hard to argue with a statement that makes no logical sense.

In conclusion, Gartner has produced a document which suggests a system selection based on incorrect facts and assumptions and which ignores the criteria that are important to customers. By comparison, there is a wealth of arguments that can be made as to why Power Systems continues to be the most robust, cost effective, reliable and secure platform for ERP and other applications.


March 23, 2012 - Posted by | Uncategorized


    • I am going to guess that you are trying to suggest that AIX is not invulnerable, as the word “Secure?” by itself does not tell me exactly what you are asking. I am not aware of any vendor that is offering an OS or virtualization technology that claims invulnerability. AIX is much less vulnerable than other OSs, but not invulnerable. I just queried and entered a few different search words. For AIX, it came back with 291 entries, for Windows, 2,387 and for Linux, 1,845. Likewise, I entered PowerVM, which shows 0 entries, VIOS with 10, VMware with 508 and KVM with 33. So, AIX/VIOS/PowerVM might have a few, but they pale in comparison to x86 environments. These numbers are not an indication that the vulnerability still exists or that the searched OS or virtualization technology was responsible for the problem, just that the word exists within the searched document. You can also view the analysis of security vulnerabilities by IBM’s Security team (X-Force), which, in their mid-2010 report, found AIX to be far more secure than any other listed OS.

      Comment by Alfred Freudenberger | April 4, 2012 | Reply
