SAPonPower

An ongoing discussion about SAP infrastructure

The hype and the reality of HANA

Can you imagine walking into a new car dealership and, before you can say anything about your current vehicle and needs, a salesperson immediately offers to show you the latest, greatest and most popular new car?  Of course you can, since this is what that person gets paid to do.  Now, imagine the above scenario where the salesperson instead asks “how is your current car not meeting your needs?” and follows it up with “I don’t want you to buy anything from me unless it brings you substantial value”.  After smelling salts have been administered, you might recover enough to act like a cartoon character checking his ears to make sure they are functioning properly and ask the salesperson to repeat what he or she said.

The first scenario is occurring constantly with SAP account execs, systems integrators and consultants playing the above role of new car salesperson.  The second rarely happens, but that is exactly the role that I will play in this blog post.

The hype around HANA could not be much louder or deeper than it is currently.  As bad as the hype might be, the FUD (Fear, Uncertainty and Doubt) is worse.  The hype suggests that HANA can do everything except park your car, since that is a future capability (not really, I just made that up).  At its worst, the hype suggests a vision for the future that, while not solving world hunger or global warming, might improve the operations and profitability of companies.  The FUD is more insidious.  It suggests that unless you act like a lamb and follow the lead of the individual telling this tale, you will be a lost sheep, out of support and further out of the mainstream.

I will address the second issue first.  As of today, the beginning of August, SAP has made absolutely no statement indicating it will discontinue support for any platform, OS or DB.  In fact, a review of SAP notes shows support for most OSes with no end date, and even DB2 9.7 has an end-of-support date that is several years past that of direct standard support from IBM!  So, what gives???  Is SAP saying one thing internally and another externally?  I have been working with SAP for far too long and know their business practices too well to believe that they would act in such a two-faced manner, not to mention expose themselves to another round of expensive and draining lawsuits.  Instead, I place the arrow of shame squarely on those rogue SAP account execs who are perpetuating this story.  The next time one of them makes this sort of suggestion, turn the tables on them.  Ask them to provide you with a statement, in writing, backed up with official press releases or SAP notes, showing that this is the case.  If they can’t, it is reasonable to conclude that they are simply using the age-old FUD tactic to get you to spend more money with them now rather than waiting until/if SAP actually decides to stop supporting a particular type of HW, OS or DB.

And now for the first issue: the hype around HANA.  HANA offers dramatic benefits to some SAP customers.  Some incarnation of HANA may indeed be inevitable for the vast majority.  However, the suggestion that HANA is the be-all and end-all flies in the face of many other solutions on the market, many of which are radically less expensive and often carry dramatically lower risk.  Here is a very simple example.

Most customers would like to reduce the time and resources required to run batch jobs.  It seems as if there is not a CFO anywhere who does not want to reduce the month-end/quarter close from multiple days down to a day or less.  CFOs are not the only ones with that desire, as certain functions must come to a halt during a close and/or the availability requirements go sky high during this period, requiring higher IT investments.  SAP has suggested that HANA can achieve exactly this, however it is not quite clear whether this will require BW HANA, Suite on HANA, some combination of the two or even another as yet unannounced HANA variant.  I am sure that if you ask a dozen consultants, you will get a dozen different answers on how to achieve these goals with HANA, and it is entirely possible that each of them is correct in their own way.  One thing is certain, however: it won’t come cheaply.  Not only will a company have to buy HANA HW and SW, but it will also have to pay for a migration and a boatload of consulting services.

It will also not come without risk.  BW HANA and Suite on HANA require a full migration.  Those systems become the exclusive repository of business-critical data.  HANA is currently in its 58th revision in a little over two years.  HA, DR and backup/recovery tools are still evolving.  No benchmarks for Suite on HANA have been published, which means that sizing guidelines are based purely on the size of the DB, not on throughput or even users.  Good luck finding extensive large-scale customer references, or even medium-sized ones, in your industry.  To make matters worse, a migration to HANA is a one-way path.  There is no published migration methodology to move from HANA back to a conventional DB.  It is entirely possible that Suite on HANA will be much more stable than BW HANA was, that these systems will scream on benchmarks, that all of those HA, DR, backup/recovery and associated tools will mature in short order and that monkeys will fly.  Had the word risk not been invented previously, Suite on HANA would probably be the first definition in the dictionary for it.

So, is there another way to achieve those goals, maybe one that is less expensive and does not require a migration, software licenses or consulting services?  Of course not, because that would be as impossible to believe as the above-mentioned flying monkeys.  Well, strap on your red shoes and welcome to Oz, because it is not only possible but many customers are already achieving exactly those gains.  How?  By utilizing high-performance flash storage subsystems like the IBM FlashSystem.  Where transaction processing typically accesses a relatively small amount of data cached in database buffers, batch, month-end and quarter-close jobs tend to be very disk-intensive.  A well-tuned disk subsystem can deliver access times of around 5 milliseconds.  SSDs can drop this to about 1 millisecond.  A FlashSystem can deliver incredible throughput while accessing data in as little as 100 microseconds.  Many customers have seen batch times reduced to a third or less of what they experienced before implementing FlashSystem.  Best of all, there is no migration, recoding or consulting effort and no software license cost.  A FlashSystem is “just another disk subsystem” to SAP.  If an IBM SVC (SAN Volume Controller) or V7000 is placed in front of a FlashSystem, data can be transparently replicated from a conventional disk subsystem to FlashSystem without even a system outage.  If the subsystem does not produce the expected results, it can be repurposed or, if tried out via a POC, simply returned at no cost.  To date, few, if any, customers have returned a FlashSystem after completing a POC; the results have been so universally impressive that the typical outcome is an order for more units.

Another super simple, no-risk option is to consider the old 2-tier approach to SAP systems.  In this situation, instead of utilizing separate database and application server systems/partitions, the database and app server instances are housed within a single OS system/partition.  Some customers don’t realize how “chatty” app servers are, with an amazing number of very small queries and results running back and forth to the DB server.  As fast as Ethernet is, it is as slow as molasses compared to the speed of inter-process communication within an OS.  As crazy as it may seem, simply by consolidating DB and app servers into a single OS, batch and close activity may speed up dramatically.  And here is the no-risk part.  Most customers have QA systems, and from an SAP architecture perspective there is no difference between running app servers in the same OS as the DB and running them in separate OSes.  As a result, customers can simply give it a shot and see what happens.  No pain other than a little time to set up and test the environment.  Yes, this is the salesman telling you not to spend any money with him.

This is not the only business case for HANA.  Others involve improving reporting or even doing away with reporting in favor of real-time analytics.  Here is the interesting part.  Before Suite on HANA or even BW HANA became available, SAP had introduced real-time replication into side-car HANA appliances.  With these devices, the source of business-critical data is kept on conventional databases.  You remember those archaic old systems: reliable, secure, scalable, surrounded by the best-practices environment you have built, and covered by a DB license you have already purchased and are simply paying maintenance on.  Perhaps naively, I call this the 95-5 rule, not 80-20.  You may be able to achieve 95% of your business goals with such a side-car without risking a migration or the integrity of your data.  Also, since you will be dealing with a subset of data, the cost of the SW license for such a device will likely be a small fraction of the cost of an entire DB.  Even better, as an appliance, if it fails, you just replace the appliance, as the data source has not been changed.  Sounds too good to be true?  Ask your SAP AE and see what sort of response you get.  Or make it a little more interesting: suggest that you may be several years away from being ready to go to Suite on HANA but could potentially do a side-car in the short term, and observe the way the shark smells blood in the water.  By the way, you have to be on current levels of SAP software in order to migrate to Suite on HANA, and reportedly 70% of customers in North America are not current (no idea about the rest of the world), so this may not even be much of a stretch.

And I have not even mentioned DB2 BLU yet but will leave that for a later blog posting.


August 5, 2013 | Uncategorized | 4 Comments

Is the SAP 2-tier benchmark a good predictor of database performance?

Answer: Not even close, especially for x86 systems.  Sizings for x86 systems based on the 2-tier benchmark can be as much as 50% smaller for database-only workloads than what the 3-tier benchmark would predict.  Bottom line, I recommend that any database-only sizing for an x86 system or partition be at least doubled to ensure that enough capacity is available for the workload.  At the same time, IBM Power Systems sizings are extremely conservative and have built-in allowances for reality vs. hypothetical 2-tier-based sizings.  What follows is a somewhat technical and detailed analysis, but this topic cannot, unfortunately, be boiled down to a simple set of assertions.

 

The details: The SAP Sales and Distribution (S&D) 2-tier benchmark is absolutely vital to SAP sizings, as workloads are measured in SAPS (SAP Application Performance Standard)[i], a unit of measurement based on the 2-tier benchmark.  The goal of this benchmark is to be hardware independent and useful for all types of workloads, but the reality is quite different.  The capacity required for the database server portion of the workload is 7% to 9% of the total, with the remainder used by multiple instances of dialog/update servers and a message/enqueue server.  This contrasts with the real world, where the ratio of app server to DB server capacity is more in the 4:1 range for transactional systems and 2:1 or even 1:1 for BW.  In other words, this benchmark is primarily an application server benchmark with a relatively small database server.  Even if a particular system or database software delivered 50% higher performance for the DB server than the 2-tier benchmark would predict, the overall 2-tier result would only change by about 0.07 * 0.5 = 3.5%.
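Just to make the arithmetic above explicit, here is a minimal Python sketch; the 7% DB share and 50% DB speedup are the illustrative figures from the text, not measured values:

```python
# Back-of-the-envelope check: if the DB tier consumes only ~7% of the total
# capacity in the 2-tier benchmark, even a large DB-side improvement barely
# moves the overall benchmark score.
db_share = 0.07      # DB server portion of total 2-tier capacity (7-9% per the text)
db_speedup = 0.50    # hypothetical 50% better DB performance

overall_gain = db_share * db_speedup
print(f"Overall 2-tier score change: ~{overall_gain:.1%}")  # ~3.5%
```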

 

How then is one supposed to size database servers when the SAP Quicksizer shows the capacity requirements based on 2-tier SAPS?  A clue may be found by examining another, closely related SAP benchmark, the S&D 3-tier benchmark.  The workload used in this benchmark is identical to that used in the 2-tier benchmark, the difference being that in the 2-tier benchmark all instances of DB and app servers must be located within one operating system (OS) image, whereas with the 3-tier benchmark, DB and app server instances may be distributed across multiple OS images and servers.  Unfortunately, the unit of measurement is still SAPS, but here it represents the total SAPS handled by all servers working together.  Fortunately, 100% of the SAPS must be funneled through the database server, i.e. this SAPS measurement, which I will call DB SAPS, represents the maximum capacity of the DB server.

 

Now, we can compare different SAPS and DB SAPS results or sizing estimates for various systems to see how well 2-tier and 3-tier SAPS correlate with one another.  It turns out this is easier said than done, as there are precious few published 3-tier results compared to the hundreds published for the 2-tier benchmark.  But I would not be posting this blog entry if I had not found a way to accomplish this, would I?  I first wanted to find two 3-tier publications that achieved similar results.  Fortunately, HP and IBM both published results within a month of one another back in 2008, with HP hitting 170,200 DB SAPS[ii] on a 16-core x86 system and IBM hitting 161,520 DB SAPS[iii] on a 4-core Power system.

 

While the stars did not line up precisely, it turns out that 2-tier results were published by both vendors just a few months earlier, with HP achieving 17,550 SAPS[iv] on the same 16-core x86 system and IBM achieving 10,180 SAPS[v] on a 4-core Power system with a slightly higher clock speed (4.7GHz, or 12% faster) than the one used in the 3-tier benchmark.

 

Notice that the HP 2-tier result is 72% higher than the IBM result, even though the IBM result used the faster processor.  Clearly, this lead would have been even higher had IBM published a result on the slower processor.  While SAP benchmark rules do not allow vendors to estimate from slower to faster processors, and even though I am posting this as an individual and not on behalf of IBM, I will err on the side of caution and give you only the formula, not the estimated result:  17,550 / (10,180 * 4.2 / 4.7) = the ratio of the published HP result to the IBM result projected onto the slower processor.  At the same time, HP achieved only a 5.4% higher 3-tier result.  How does one go from almost twice the performance to essentially tied?  Easy answer: the IBM system was designed for database workloads, with a whole boatload of attributes that go almost unused in application server workloads, e.g. extremely high I/O throughput and advanced cache coherency mechanisms.
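For readers who want to plug the numbers in themselves, here is the formula above as a small Python sketch; the inputs are the published results cited in the footnotes, and the linear clock-speed scaling is my simplifying approximation, not an SAP-sanctioned projection:

```python
# Formula from the text, left for the reader to evaluate.
hp_2tier_saps = 17_550         # HP BL680c G5, 16 cores, cert 2007055
ibm_2tier_saps = 10_180        # IBM System p 570, 4 cores @ 4.7GHz, cert 2007037
slow_ghz, fast_ghz = 4.2, 4.7  # clock of the 3-tier p 550 vs. the 2-tier p 570

ibm_projected = ibm_2tier_saps * slow_ghz / fast_ghz   # crude downclock projection
ratio_2tier = hp_2tier_saps / ibm_projected            # HP vs. projected IBM, 2-tier
ratio_3tier = 170_200 / 161_520                        # HP vs. IBM, published 3-tier
print(ratio_2tier, ratio_3tier)
```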

 

One might point out that Intel has really turned up its game since 2008 with the introduction of the Nehalem and Westmere chips and closed the gap, somewhat, against IBM’s Power Systems.  There is some truth in that, but let’s take a look at a more recent result.  In late 2011, HP published a 3-tier result of 175,320 DB SAPS[vi].  A direct comparison of old and new results shows that the new result delivered 3% more performance than the old with 12 cores instead of 16, which works out to about 37% more performance per core.  Admittedly, this is not completely correct, as the old benchmark utilized SAP ECC 6.0 with ASCII and the new one used SAP ECC 6.0 EP4 with Unicode, which is estimated to be a 28% higher-resource workload, so in reality this new result is closer to 76% more performance per core.  By comparison, a slightly faster DL380 G7[vii], otherwise an almost identical system to the BL460c G7, delivered 112% more SAPS/core on the 2-tier benchmark compared to the BL680c G5, and almost 171% more SAPS/core once the 28% factor mentioned above is taken into consideration.  Once again, one would need to adjust these numbers for the difference in MHz, and the formula for that would be: either of the above numbers * 3.06/3.33 = estimated SAPS/core.
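Here are those per-core comparisons as a Python sketch, using the published results in the footnotes; the 1.28 Unicode/EP4 factor and the linear MHz scaling are the approximations stated above, not official conversion figures:

```python
# Per-core comparisons described in the text.
per_core = lambda saps, cores: saps / cores

# 3-tier: BL460c G7 (2011) vs. BL680c G5 (2008)
gain_3tier = per_core(175_320, 12) / per_core(170_200, 16)   # ~1.37 (37% more per core)
gain_3tier_unicode = gain_3tier * 1.28                       # ~1.76 with the Unicode factor

# 2-tier: DL380 G7 vs. BL680c G5, then normalized to the BL460c G7's 3.06GHz clock
gain_2tier = per_core(27_880, 12) / per_core(17_550, 16)     # ~2.12 (112% more per core)
gain_2tier_unicode = gain_2tier * 1.28                       # ~2.71 with the Unicode factor
gain_2tier_clock_adj = gain_2tier_unicode * 3.06 / 3.33      # crude MHz adjustment

print(gain_3tier_unicode, gain_2tier_clock_adj)
```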

 

After one does this math, one finds that the improvement in 2-tier results was almost 3 times the improvement in 3-tier results, further calling into question whether the 2-tier benchmark has any relevance to the database tier.  And there is just one more complicating factor: how vendors interpret SAP Quicksizer output.  The Quicksizer conveniently breaks down the amount of workload required of both the DB and app tiers.  Unfortunately, experience shows that this breakdown does not hold in reality, so vendors can make modifications to the ratios based on their experience.  Some, such as IBM, have found that DB loads are significantly higher than the Quicksizer estimates and have made sure that this tier is sized higher.  Remember, while app servers can scale out horizontally, the DB server cannot unless a parallel DB is used, so making sure that you don’t run out of capacity is essential.  What happens when you compare the sizing from IBM to that of another vendor?  That is hard to say, since each can use whatever ratio it believes is correct.  If you don’t know what ratios the different vendors use, you may be comparing apples and oranges.

 

Great!  So what is a customer to do now that I have completely destroyed any illusion that database sizing based on 2-tier SAPS is even remotely close to reality?

 

One option is to say, “I have no clue” and simply add a fudge factor, perhaps 100%, to the database sizing.  One could not be faulted for such a decision, as there is no other simple answer.  But one could also not be certain that this sizing was correct.  For example, how does I/O throughput fit into the equation?  It is possible for a system to be able to handle a certain amount of processing but not be able to feed data in at the rate necessary to sustain that processing.  Some virtualization managers, such as VMware, have to transfer data first to the hypervisor and then to the partition, or in the other direction on the way to the disk subsystem.  This adds latency and overhead that may be hard to estimate.

 

A better option is to start with IBM.  IBM Power Systems is the “gold standard” for SAP open systems database hosting.  A huge population of very large SAP customers, some of which have decided to utilize x86 systems for the app tier, use Power for the DB tier.  This has allowed IBM to gain real-world experience in how to size DB systems, which has been incorporated into its sizing methodology.  As a result, customers can place a great deal of trust in the sizing that IBM delivers, and once you have this sizing, you can work backwards to what an x86 system should require.  Then you can compare this to the sizing delivered by the x86 vendor and have a good discussion about why there are differences.  How do you work backwards?  A fine question, for which I will propose a methodology.

 

Ideally, IBM would have a 3-tier benchmark result for a current system from which you could extrapolate, but that is not the case.  Instead, you could extrapolate from the published result for the Power 550 mentioned above using IBM’s rperf, an internal estimate of relative performance for database-intensive environments that is published externally.  The IBM Power Systems Performance Report[viii] includes rperf ratings for current and past systems.  If we multiply the size of the database system as estimated by the IBM ERP sizer by the ratio of per-core performance of the IBM and x86 systems, we should be able to estimate how much capacity is required on the x86 system.  For simplicity, we will assume the sizer has determined that the database requires 10 cores of a 16-core IBM Power 740 3.55GHz.  Here is the proposed formula:

 

Power 550 DB SAPS x 1/1.28 (old SAPS to new SAPS conversion) x rperf of 740 / rperf of 550

161,520 DB SAPS x 1/1.28 x 176.57 / 36.28 = estimated DB SAPS of 740 @ 16 cores

Then we can divide the above number by the number of cores to get a per-core DB SAPS estimate.  By the same token, you can divide the published HP BL460c G7 DB SAPS number by its number of cores.  Then:

Estimated Power 740 DB SAPS/core / Estimated BL460c G7 DB SAPS/core = ratio to apply to sizing

The result is a ratio of 2.6, e.g. if a workload requires 10 IBM Power 740 3.55GHz cores, it would require 26 BL460c G7 cores.  This contrasts with the per-core estimated SAPS based on the 2-tier benchmark, which suggests that the Power 740 would have just 1.4 times the performance per core.  In other words, a 2-tier-based sizing would suggest that the x86 system requires just 14 cores, where the 3-tier comparison suggests it actually needs almost twice that.  This assumes the I/O throughput is sufficient.  It also assumes that both systems have the same target utilization; in reality, where x86 systems are usually sized for no more than 65% utilization, Power Systems are routinely sized for up to 85% utilization.
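Here is the back-calculation above as a Python sketch.  The rperf figures come from the IBM Power Systems Performance Report cited above, the 1.28 SAPS conversion and linear scaling are the stated approximations, and the 10-core Power 740 requirement is simply the hypothetical sizing used for illustration:

```python
# Back-calculation from the Power 550 3-tier result to an x86 core estimate.
p550_db_saps = 161_520           # published 3-tier result, cert 2008001
old_to_new_saps = 1 / 1.28       # old SAPS -> ECC 6.0 EP4 Unicode SAPS conversion
rperf_740, rperf_550 = 176.57, 36.28

est_740_db_saps = p550_db_saps * old_to_new_saps * rperf_740 / rperf_550
per_core_740 = est_740_db_saps / 16        # 16-core Power 740 3.55GHz
per_core_bl460c = 175_320 / 12             # published BL460c G7 3-tier result

ratio = per_core_740 / per_core_bl460c     # ~2.6 per the text
x86_cores_needed = 10 * ratio              # if the sizer calls for 10 Power 740 cores
print(round(ratio, 1), round(x86_cores_needed))
```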

 

If this workload were planned to run under VMware, the number of vCPUs, which is twice the number of cores, must be considered, i.e. this workload would require 52 vCPUs, which is over the 32-vCPU limit of VMware 5.0.  Even once VMware can handle 64 vCPUs, the overhead of VMware and its ability to sustain the high I/O of such a workload must be factored into any sizing.

 

Of course, technology moves on and Intel is into its Gen8 processors.  So you may have to adjust what you believe to be the effective throughput of the x86 system based on its relative performance to the BL460c G7 above, but now, at least, you have a frame of reference for doing the appropriate calculations.  Clearly, we have shown that the 2-tier benchmark is an unreliable basis for sizing database-only systems or partitions and that it can easily be off by 100% for x86 systems.

 


[ii] 170,200 SAPS/34,000 users, HP ProLiant BL680c G5, 4 Processor/16 Core/16 Thread, E7340, 2.4 GHz, Windows Server 2008 Enterprise Edition, SQL Server 2008, Certification # 2008003

[iii] 161,520 SAPS/32,000 users, IBM System p 550, 2 Processor/4 Core/8 Thread, POWER6, 4.2 GHz, AIX 5.3, DB2 9.5, Certification # 2008001

[iv] 17,550 SAPS/3,500 users, HP ProLiant BL680c G5, 4 Processor/16 Core/16 Thread, E7340, 2.4 GHz, Windows Server 2008 Enterprise Edition, SQL Server 2008, Certification # 2007055

[v] 10,180 SAPS/2,035 users, IBM System p 570, 2 Processor/4 Core/8 Thread, POWER6, 4.7 GHz, AIX 6.1, Oracle 10G, Certification # 2007037

[vi] 175,320 SAPS/32,125 users, HP ProLiant BL460c G7, 2 Processor/12 Core/24 Thread, X5675, 3.06 GHz, Windows Server 2008 R2 Enterprise on VMware ESX 5.0, SQL Server 2008, Certification # 2011044

[vii] 27,880 SAPS/5,110 users, HP ProLiant DL380 G7, 2 Processor/12 Core/24 Thread, X5680, 3.33 GHz, Windows Server 2008 R2 Enterprise, SQL Server 2008, Certification # 2010031

July 30, 2012 | Uncategorized | 1 Comment

Oracle M9000 SAP ATO Benchmark analysis

SAP has a large collection of different benchmark suites.  Most people are familiar with the SAP Sales and Distribution (SD) 2-tier benchmark, as the vast majority of results have been published using this suite.  A lesser-known benchmark suite is called ATO, or Assemble-to-Order.  When the ATO benchmark was designed, it was intended to replace SD as a more “realistic” workload.  As the benchmark is a little more complicated to run and SAP Quicksizer sizings are based on the SD workload, the ATO benchmark never got much traction, and from 1998 through 2003 only 19 results were published.  Prior to September 2, 2011, this benchmark had seemed to be extinct.  On that date, Oracle and Fujitsu published a 2-tier result for the SPARC M9000, along with the predictable claim of a world record.  Oracle should be commended for having beaten results published in 2003.  Of course, we might want to consider that a 2-processor/12-core, 2U Intel-based system of today has already surpassed the TPC-C results of the 64-core HP Itanium Superdome that “set the record” back in 2003, at a tiny fraction of the cost and floor space.

 

So we give Oracle a one-handed clap for this “accomplishment”.  But if I left it at that, you might question why I would even bother to post this blog entry.  Let’s delve a little deeper to find the story within the story.  First, let me remind the reader: these are my opinions, they in no way reflect the opinions of IBM, and IBM has neither endorsed nor reviewed them.

 

In 2003, Fujitsu-Siemens published a couple of ATO results using a predecessor of today’s SPARC64 VII chip, the SPARC64™ V at 1.35GHz, on SAP 4.6C.  The just-published M9000 result used the SPARC64 VII at 3.0GHz and SAP EP4 for SAP ERP 6.0 with Unicode.  If one were to divide the results achieved by both systems by their number of cores and compare them, one might find that the new result delivers only a very small increase in throughput per core, roughly 6%, over the old results.  Of course, this does not account for the changes in SAP software, Unicode or benchmark requirements.  SAP rules do not allow for extrapolations, so I will instead provide you with the data from which to make your own calculations.  100 SAPS using SAP 4.6C is equal to about 55 SAPS using Business Suite 7 with Unicode.  If you were to multiply the old result by 55/100 and then divide by the number of cores, you could determine the effective throughput per core of the old system if it were running the current benchmark suite.  I can’t show you the result, but will show you the formula that you can use to determine it yourself at the end of this posting.

 

For comparison, I wanted to figure out how Oracle did on the SD 2-tier benchmark compared to systems back in 2003.  It turns out that almost identical systems were used in both 2003 and 2009, with the exception that the Sun M9000 used 2.8GHz processors, each of which had half the L2 cache of the 3.0GHz system used in the ATO benchmark.  If you were to use a formula similar to the one described above and then perhaps multiply by the difference in MHz, i.e. 3.0/2.8, you could derive a similar per-core performance comparison of the new and old systems.  Prior to performing any extrapolations, the benchmark users per core actually decreased between 2003 and 2009 by roughly 10%.

 

I also wanted to take a look at similar systems from IBM, then and now.  Fortunately, IBM published SD 2-tier results for the 8-core 1.45GHz pSeries 650 in 2003 and for a 256-core 4.0GHz Power 795 late last year, with the SAP levels being identical to the ones used by Sun and Fujitsu-Siemens respectively.  Using the same calculations as were done for the SD and ATO comparisons above, IBM achieved 223% more benchmark users per core than it achieved in 2003, prior to any extrapolations.

 

No, that was not a typo.  While IBM’s results improved by 223% on a per-core basis, the Fujitsu-processor-based systems either improved by only 9% or decreased by 10%, depending on which benchmark you choose.  Interestingly enough, IBM had only a 9% per-core advantage over Fujitsu-Siemens in 2003, which grew to a 294% advantage in 2009/2010 based on the SD 2-tier benchmark.
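For anyone who wants to check these per-core numbers, here is a small Python sketch using the published SD 2-tier results listed in the tables below; no SAPS-level extrapolations or conversion factors are applied, just benchmark users divided by cores:

```python
# Raw users-per-core comparisons from the published SD 2-tier results below.
def users_per_core(users, cores):
    return users / cores

ibm_2003 = users_per_core(1_220, 8)       # IBM eServer pSeries 650
fsc_2003 = users_per_core(1_120, 8)       # Fujitsu Siemens PrimePower 900, 8 cores
sun_2009 = users_per_core(32_000, 256)    # Sun M9000
ibm_2010 = users_per_core(126_063, 256)   # IBM Power 795

print(ibm_2010 / ibm_2003 - 1)   # IBM improvement per core, 2003 -> 2010 (~223%)
print(ibm_2003 / fsc_2003 - 1)   # IBM advantage over Fujitsu-Siemens in 2003 (~9%)
print(ibm_2010 / sun_2009 - 1)   # IBM advantage over Sun in 2009/2010 (~294%)
```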

 

It is remarkable that since November 18, 2009, Oracle (Sun) has not published a single SPARC-based SAP SD benchmark result, while over 70 results were published by a variety of vendors, including two by Sun for its Intel systems.  When Oracle finally decided to get back into the game to try to prove its relevance, despite a veritable flood of analyst and press suggestions to the contrary, rather than competing on the established and vibrant SD benchmark, it chose to stand on top of a small heap of dead carcasses to say it is better than the rotting husks upon which it stands.

 

For full disclosure, here are the actual results:

SD 2-tier Benchmark Results

Certification Date   System                                                # Benchmark Users   SAPS      Cert #
1/16/2003            IBM eServer pSeries 650, 8-cores                      1,220               6,130     2003002
3/11/2003            Fujitsu Siemens Computers PrimePower 900, 8-cores     1,120               5,620     2003009
3/11/2003            Fujitsu Siemens Computers PrimePower 900, 16-cores    2,200               11,080    2003010
11/18/2009           Sun Microsystems M9000, 256-cores                     32,000              175,600   2009046
11/15/2010           IBM Power 795, 256-cores                              126,063             688,630   2010046

 

ATO 2-tier results:

Certification Date   System                                                Fully Processed Assembly Orders/Hr   Cert #
3/11/2003            Fujitsu Siemens Computers PrimePower 900, 8-cores     6,220                                2003011
3/11/2003            Fujitsu Siemens Computers PrimePower 900, 16-cores    12,170                               2003012
9/2/2011             Oracle M9000, 256-cores                               206,360                              2011033

 

Formulas that you might use, assuming you agree with the assumptions:

 

Performance of old system / number of cores * 55/100 = effective performance per core on new benchmark suite (EP)

 

(Performance of new system / cores ) / EP = relative ratio of performance per core of new system compared to old system

 

Improvement per core = relative ratio – 1

 

This can be applied to both the SD and ATO results using the appropriate throughput measurements.
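As one possible illustration, here is the formula set above expressed as a small Python sketch, plugging in the published figures from the tables; the 55/100 conversion factor is the approximation stated earlier, and whether to apply further MHz or cache adjustments is left to the reader:

```python
# Formulas from the text: convert an old-benchmark result to the newer suite
# per core, then compare a new result's per-core throughput against it.
def effective_per_core(old_result, old_cores, conversion=55/100):
    """Old-benchmark throughput converted to the newer benchmark suite, per core (EP)."""
    return old_result / old_cores * conversion

def improvement_per_core(new_result, new_cores, old_result, old_cores):
    ep = effective_per_core(old_result, old_cores)
    relative_ratio = (new_result / new_cores) / ep
    return relative_ratio - 1   # e.g. 1.0 means 100% more throughput per core

# SD 2-tier, benchmark users: IBM Power 795 (2010) vs. pSeries 650 (2003)
print(improvement_per_core(126_063, 256, 1_220, 8))

# ATO, assembly orders/hour: Oracle M9000 (2011) vs. PrimePower 900 16-core (2003)
print(improvement_per_core(206_360, 256, 12_170, 16))
```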

September 9, 2011 | Uncategorized