Large scale-up transactional HANA systems – part 1
Customers that require Suite on HANA (SoH) and S/4HANA systems with 6TB of memory or less will find a wide variety of available options. Those options do not require any specialized type of hardware, just systems that can scale up to 8 sockets with Intel-based systems and up to 64 cores with IBM Power Systems (socket count depends on the number of active cores per socket, which varies by system). If you require 6TB or less, or can’t imagine ever needing more, then sizing is a fairly easy process, i.e. look at the sizing matrix from SAP and select a system which meets your needs. If you need to plan for more than 6TB, this is where it gets a bit more challenging. The list of options narrows to five vendors between 6TB and 8TB (IBM, Fujitsu, HPE, SGI and Lenovo) and gets progressively smaller beyond that.
All systems with more than one socket today are ccNUMA, i.e. remote cache, memory and I/O accesses are delivered with higher latency and lower bandwidth than accesses local to the processor. HANA is highly optimized for analytics, which most of you probably already know. The way it is optimized may not be as obvious. Most tables in HANA are columnar, i.e. every column in a table is kept in its own structure with its own dictionary, and the elements of the column are replaced with a very short dictionary pointer, resulting in outstanding compression in most cases. Each column is placed in as few memory pages as possible, which means that queries which scan through a column can run at crazy fast speeds, as all of the data in the column is packed as “close” together as possible. This columnar structure is beautifully suited for analytics on ccNUMA systems, since different columns will typically be placed behind different sockets, which means that only queries that cross columns and joins will have to access columns that may not be local to a socket and, even then, usually only the results have to be sent across the ccNUMA fabric. There was a key word in the previous sentence that might have easily been missed: “analytics”. Where analytical queries scan down columns, transactional queries typically go across rows in which, due to the structure of a columnar database, every element is located in a different column, potentially spanning the entire ccNUMA system. As a result, minimized latency and high cross-system bandwidth may be more important than ever.
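To make the columnar idea concrete, here is a minimal Python sketch of a dictionary-encoded columnar table. It is purely illustrative, and none of the class or method names correspond to anything inside HANA, but it shows why an analytical scan touches only one column’s memory while retrieving a single row has to visit and decode every column:

```python
# Minimal sketch of a dictionary-encoded columnar store (illustrative only,
# not HANA internals). Each column keeps its own dictionary; row values are
# replaced by small integer codes, which is where the compression comes from.

class Column:
    def __init__(self):
        self.dictionary = []   # distinct values, in insertion order
        self.codes = []        # one small integer code per row
        self._index = {}       # value -> code, for fast encoding

    def append(self, value):
        code = self._index.get(value)
        if code is None:
            code = len(self.dictionary)
            self.dictionary.append(value)
            self._index[value] = code
        self.codes.append(code)

    def decode(self, row):
        return self.dictionary[self.codes[row]]


class ColumnarTable:
    def __init__(self, column_names):
        self.columns = {name: Column() for name in column_names}

    def insert(self, row_dict):
        for name, col in self.columns.items():
            col.append(row_dict[name])

    # Analytical pattern: scan a single column end to end.
    # Only one column's memory (ideally one socket's local memory) is touched.
    def average(self, column_name):
        col = self.columns[column_name]
        return sum(col.dictionary[c] for c in col.codes) / len(col.codes)

    # Transactional pattern: materialize one full row.
    # Every column must be visited and decoded, so the access pattern
    # potentially spans every socket the columns are spread across.
    def get_row(self, row):
        return {name: col.decode(row) for name, col in self.columns.items()}


# Example: a tiny slice of the utility table discussed below.
table = ColumnarTable(["usage", "date", "time"])
table.insert({"usage": 31.5, "date": "2016-05-01", "time": "18:00"})
table.insert({"usage": 28.0, "date": "2016-05-01", "time": "19:00"})
print(table.average("usage"))  # scans one column
print(table.get_row(0))        # touches all three columns
```

On a ccNUMA system, picture each Column object living behind a different socket: average() can stay local to one socket, while get_row() hops across the fabric once per column.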
Let me stop here and give an example so that I don’t lose the readers that aren’t system nerds like myself. I will use a utility company as an example, as everyone is a utility customer. For analytics, an executive might want to know the average usage of electricity on a given day at a given time, meaning the query is composed of three elements, all contained in one table: usage, date and time. Unless these columns are enormous, i.e. over 2 billion rows, they are very likely stored behind a single socket with no remote accesses required. Now, take that same company’s customer care center, where a utility consumer wants to turn on service, report an outage or find out what their last few months or years of bills have been. In this case, all sorts of information is required to populate the appropriate screens: first name, last name, street address, city, state, meter number, account number, usage, billed amount and on and on. Scans of columns are not required and a simple index lookup suffices, but every element is located in a different column which has to be resolved by an independent dictionary lookup/replacement of the compressed elements, meaning several or several dozen communications across the system as the columns are most likely distributed across it. While an individual remote access may take longer, almost 5x in a worst case scenario[i], we are still talking nanoseconds here, and even 100 of those still results in a “delay” of 50 microseconds. I know, what are you going to do while you are waiting! Of course, a utility company is more likely to have hundreds, thousands or tens of thousands of transactions in flight at any given point in time, and therein lies the problem. An increased latency of 5x for every remote access may severely diminish the scalability of the system.
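For those who like to see the arithmetic, here is the back-of-envelope version of that 50-microsecond figure. The 500 ns remote latency is an illustrative round number consistent with the roughly 5x worst-case penalty mentioned above, not a measured value:

```python
# Back-of-envelope math for the remote-access "delay" described above.
# The 500 ns remote latency is an assumed round number consistent with a
# ~5x worst-case penalty over a ~100 ns local access, not a measurement.

REMOTE_ACCESS_NS = 500   # assumed worst-case remote access latency
REMOTE_ACCESSES = 100    # remote dictionary lookups in one transaction

delay_us = REMOTE_ACCESSES * REMOTE_ACCESS_NS / 1_000
print(f"Delay for one transaction: {delay_us:.0f} microseconds")  # ~50 us

# 50 microseconds is invisible to a single user, but it is paid on every
# transaction. At high concurrency, the aggregate time spent crossing the
# ccNUMA fabric adds up quickly:
for concurrent in (100, 1_000, 10_000):
    print(f"{concurrent:>6} concurrent transactions -> "
          f"{concurrent * delay_us / 1_000:,.1f} ms of remote-access time")
```

The point is not the absolute numbers, which will vary by system, but that the penalty is multiplied by every concurrent transaction rather than paid once.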
Does this mean that it is not possible to scale up a HANA transactional environment? Not at all, but it does take more than being able to physically place a lot of memory in a system to be able to utilize it in a productive manner with good scalability. How can you evaluate vendor claims then? Unfortunately, the old tried-and-true SAP SD benchmark has not been made available to run in HANA environments. Lacking that, you could flip a coin, believe vendor claims without proof, or demand proof. Clearly, demanding proof is the most reasonable approach, but what proof? There are three types of proof to look at: past history with large transactional workloads, industry-standard benchmarks and architecture.
In the over-8TB HANA space, there are three competitors: HPE Superdome X, SGI UV 300H and IBM Power Systems E870/E880. I will address those systems and these three proof points in part 2 of this blog post.