HANA Performance:
Efficient Speed and Scale-out for Real-time BI
Introduction
SAP HANA enables organizations to optimize their business operations by analyzing large amounts of data in real time. HANA runs on inexpensive, commodity hardware and requires no proprietary add-on components. It achieves very high performance without requiring the tuning mechanisms necessary for traditional disk-based databases to perform acceptably.
A One Petabyte (1PB) performance test was developed to demonstrate that HANA is extremely efficient and scalable, and can deliver breakthrough analytic performance for real-time business intelligence (BI) on a very large database that is representative of the data businesses use to analyze their operations. A 1PB1 data set was generated in the format that would be extracted from an SAP ERP system (e.g., data records with multiple fields) for analysis in a BW system.2
This paper describes the test environment, then presents and analyzes the test results.
The Test Environment
The performance test environment was developed to represent a Sales Distribution (SD) BI query environment that supports a broad range of SQL queries and business users.
Database Schema
The star-schema design in Figure 1 shows the SD test data environment and each table’s cardinality. No manual tuning structures were used in this design; there were no indexes, materialized views, summary tables or other redundant structures added for the purposes of achieving better performance.
1 1 PB = raw data set size before compression.
2 These tests did not involve BW.
Figure 1 - SD Schema
Test Data
The test data consisted of one large fact table and several smaller dimension tables, as seen in Figure 1. There were 1,200 billion records in the fact table, representing 10 years' worth of SD data. The data was hash-partitioned equally across the 95 nodes using Customer_ID. Within each node, the data was further partitioned into one-month intervals. This resulted in 120 partitions per node and approximately 105 million records per partition. (See also Figure 2 and “Loading”.)
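The partitioning arithmetic can be sanity-checked with a quick sketch. The modulo placement function below is illustrative only; HANA's actual internal hash function is not described in this paper.

```python
TOTAL_RECORDS = 1_200_000_000_000   # 1,200 billion fact-table rows
NODES = 95                          # data nodes in the cluster
MONTHS = 120                        # 10 years of monthly range partitions per node

def node_for(customer_id: int) -> int:
    # Illustrative hash placement by Customer_ID (not HANA's real hash).
    return customer_id % NODES

# Two-level partitioning: hash across nodes, then monthly ranges within a node.
records_per_partition = TOTAL_RECORDS // (NODES * MONTHS)
print(f"{records_per_partition:,} records per partition")  # ~105 million
```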
System Configuration
The test system configuration (Figure 2) was a cluster of 100 IBM X5 servers with 100TB of total RAM. 95 nodes were configured as data nodes in the SAP HANA engine and 5 nodes were configured as failover nodes. Each node had:
• 4 CPUs with 10 cores and 2 hyper-threads per core, totaling
  - 40 cores
  - 80 hyper-threads
• 1 TB of RAM
• 3.3 TB of disk storage
Figure 2 - Test Configuration
Setup
These performance tests did not use pre-caching of results or manual tuning structures of any kind and therefore validated HANA’s load-then-query ability. A design that is completely free of tuning structures (internal or external) is important for building a sustainable real-time BI environment; it not only speeds implementation, it also provides ongoing flexibility for ad hoc querying, while eliminating the time and maintenance cost that tuning structures require.
Loading
The loading was done in parallel using HANA’s bulk insert command with 9 concurrent load streams. Loading was automatically parallelized across all of the nodes, and used the distribution and partitioning scheme defined for each table3 (in this case, hash distribution and monthly range partitions). The bulk insert rate was measured at 53 million records per minute, or almost one million records per second. This load performance is sufficient to load 329 million records (representing one business day’s activity) in about six minutes.
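A quick back-of-the-envelope check of these load figures, using the measured rate and daily volume quoted above:

```python
RATE_PER_MINUTE = 53_000_000        # measured bulk-insert rate (records/minute)

# "Almost one million records per second"
records_per_second = RATE_PER_MINUTE / 60
print(f"{records_per_second:,.0f} records/second")

# One business day's SD activity loaded at that rate
DAILY_RECORDS = 329_000_000
minutes_to_load = DAILY_RECORDS / RATE_PER_MINUTE
print(f"{minutes_to_load:.1f} minutes to load one day")
```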
3 As defined with the CREATE TABLE command.
Compression
Data compression occurs during the data loading process. HANA demonstrated a greater-than-20X compression rate4 using advanced compression techniques. The 1 PB SD data set was reduced to a trim 49.2TB HANA database, consuming only 517 GB of RAM on each node in the cluster, i.e., only about half of the available RAM per node.
Queries
The query suite consisted of 18 distinct SQL queries, including 10 base queries plus a number of variations for time interval (month, quarter, etc.). The queries were chosen to represent a mixed-workload environment running on data that is representative of its native form (i.e., not indexed, tuned or otherwise de-normalized to avoid joins, as would be customary in a conventional database system). They cover a range of BI activities from departmental to enterprise level, including:
• general reporting,
• iterative querying (drill-downs),
• ranking, and
• year-over-year analysis.
The resulting queries range from moderately complex to very complex, including SQL constructs such as:
• multiple joins,
• in-list,
• subqueries,
• correlated subqueries (CSQ), and
• union all.
The queries are grouped into three general categories of BI usage:
• Reporting – calculations on business performance measures for ranges of materials and/or customers for a time period
• Drilldown – iterative user-initiated queries to gather details for a given individual or group of materials and/or customers
• Analytical – periodic deep historical analysis across customers and/or materials

Table 1 documents each query in terms of its business description, SQL constructs, ranges and qualifiers, and time period variations. If a query is described with multiple time periods, each time period is run as a distinct query; for instance, query R1 is run as three distinct queries, covering monthly, quarterly, and semi-annual time periods, to show its performance relative to changes in data volume (semi-annual data is 6X the volume of monthly data). Items in bold denote query complexity factors.
4 Compression rates depend heavily on characteristics of the actual data to be compressed. Individual results may vary.
BI Query | Business Description | SQL Constructs | Ranges & Qualifiers | Time Period(s)

R1 – Sales and distribution report by month for: a) given range of materials; b) given range of customers
  SQL constructs: star-schema query with multiple joins to dimensions, with group by and order by
  Ranges & qualifiers: range of materials; range of customers
  Time period(s): three variations: 1. one month; 2. three months; 3. six months

R2 – Sales and distribution report by month for given list of customers
  SQL constructs: star-schema query with multiple joins to dimensions, with group by and order by
  Ranges & qualifiers: range of customers defined using “in list”
  Time period(s): one month

R3 – Sales and distribution report by month for: a) given range of materials; b) given customer country
  SQL constructs: star-schema query with multiple joins to dimensions, group by and order by
  Ranges & qualifiers: range of materials; customer country
  Time period(s): one month

R4 – Top 100 customers report for: a) one or several material groups; b) one or several customer countries
  SQL constructs: star-schema query with multiple joins to dimensions, group by and order by
  Ranges & qualifiers: one or several material groups; one or several customer countries
  Time period(s): one month of data for two variations: 1. one country and one product group; 2. three countries and two product groups

R5 – Top 100 customers report for: a) given material group; b) given customer country
  SQL constructs: star-schema query with multiple joins to dimensions, group by and order by
  Ranges & qualifiers: material group; customer country
  Time period(s): two variations: 1. three months; 2. six months

D1 – Sales and distribution report for: a) single customer; b) given material group
  SQL constructs: star-schema query with multiple joins to dimensions, group by and order by
  Ranges & qualifiers: single customer; material group
  Time period(s): one month

D2 – Sales and distribution report for: a) single customer; b) single material
  SQL constructs: star-schema query with multiple joins to dimensions, group by and order by
  Ranges & qualifiers: single material; single customer
  Time period(s): one month

D3 – Sales and distribution report for: a) single material; b) given customer country
  SQL constructs: star-schema query with multiple joins to dimensions, group by and order by
  Ranges & qualifiers: single material; customer country
  Time period(s): one month

A1 – Year-over-year (YOY) Top 100 customers analytical report: a) top 100 customers identified for a current time period; b) business measures calculated for the previously identified top 100 customers in a given historical period; c) a final result set calculated that combines current and historical business measures for the top 100 customers
  SQL constructs: star-schema query with a fact-to-fact join as a correlated subquery that includes multiple joins to dimensions, and then group by and order by
  Ranges & qualifiers: range of materials; customer country
  Time period(s): three variations: 1. one month with YOY comparison; 2. three months with YOY comparison; 3. six months with YOY comparison

A2 – YOY trending report for Top 100 customers: a) top 100 customers calculated for a given time period of the current year; b) top 100 customers calculated for the identical time period of the previous year (or multiple years); c) a final result set calculated that combines business measures for the top 100 customers over several years
  SQL constructs: several star-schema subqueries, each with multiple joins, group by and order by, combined by a union all operator
  Ranges & qualifiers: material group; customer country
  Time period(s): three variations: 1. three months over two years; 2. six months over two years; 3. three months over five years
Table 1 - Query Descriptions
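To make the query shapes in Table 1 concrete, here is a toy-scale sketch of an R1-style reporting query. SQLite stands in for HANA, and all table and column names are invented for illustration; the actual SD schema is the one shown in Figure 1.

```python
import sqlite3

# Toy star schema: one fact table and two dimensions (invented names).
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE material (material_id INTEGER PRIMARY KEY, mat_group TEXT);
    CREATE TABLE sales_fact (
        customer_id INTEGER, material_id INTEGER,
        sales_month TEXT, revenue REAL);
    INSERT INTO customer VALUES (1, 'US'), (2, 'DE');
    INSERT INTO material VALUES (10, 'A'), (11, 'B');
    INSERT INTO sales_fact VALUES
        (1, 10, '2012-01', 100.0),
        (1, 11, '2012-01',  50.0),
        (2, 10, '2012-01', 200.0),
        (2, 10, '2012-02', 300.0);
""")

# R1-style report: multiple joins to dimensions for a range of materials
# in one month, with group by and order by -- no indexes or summary tables.
rows = cur.execute("""
    SELECT f.sales_month, c.country, SUM(f.revenue) AS total_revenue
      FROM sales_fact f
      JOIN customer c ON c.customer_id = f.customer_id
      JOIN material m ON m.material_id = f.material_id
     WHERE f.material_id BETWEEN 10 AND 11
       AND f.sales_month = '2012-01'
     GROUP BY f.sales_month, c.country
     ORDER BY total_revenue DESC
""").fetchall()
print(rows)  # [('2012-01', 'DE', 200.0), ('2012-01', 'US', 150.0)]
```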
Figure 3 - Baseline Performance, in Seconds
The Reporting and Drill-down queries (160 milliseconds to 640 milliseconds) demonstrate HANA’s excellent ability to aggregate data. For instance, query R1-3 ran in 580 milliseconds, only 52% longer than its monthly equivalent (R1-1) despite crunching through six times as much data.
The Drill-down queries (160 to 240 milliseconds) demonstrate HANA’s aggressive support for ad hoc joins and, therefore, its ability to provide unrestricted “slice and dice” capabilities to users without having to first involve the technical staff to create indexes or aggregates to support it (as would be the case with a conventional database).
The Analytic queries (770 milliseconds to 3.2 seconds) efficiently performed sophisticated joins (fact-to-fact, subqueries, CSQ, union all) and analysis across a sliding time window (year over year). The queries with one- to six-month date ranges (A1-1 through A2-2) ran in 1.3 seconds or less. Query A2-3, which analyzed 15 months of data spread over a 5-year range, ran in under 3.5 seconds. All analytic query times are well within the timeframes needed to support iterative, speed-of-thought analysis.
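The fact-to-fact correlated-subquery pattern behind the A1 queries can be sketched at toy scale. Again SQLite stands in for HANA and all names are invented; a top-2 list stands in for the top-100 list.

```python
import sqlite3

# Minimal fact table (invented names) to sketch the A1-style YOY pattern.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
    CREATE TABLE sales_fact (customer_id INTEGER, year INTEGER, revenue REAL);
    INSERT INTO sales_fact VALUES
        (1, 2012, 500), (2, 2012, 300), (3, 2012, 100),
        (1, 2011, 400), (2, 2011, 350), (3, 2011,  90);
""")

# Rank customers for the current period, then use a correlated subquery
# against the same fact table (a fact-to-fact lookup) to fetch each top
# customer's prior-year measure.
rows = cur.execute("""
    SELECT f.customer_id,
           SUM(f.revenue) AS rev_2012,
           (SELECT SUM(p.revenue)
              FROM sales_fact p
             WHERE p.customer_id = f.customer_id
               AND p.year = 2011) AS rev_2011
      FROM sales_fact f
     WHERE f.year = 2012
     GROUP BY f.customer_id
     ORDER BY rev_2012 DESC
     LIMIT 2
""").fetchall()
print(rows)  # [(1, 500.0, 400.0), (2, 300.0, 350.0)]
```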
Across the board, the baseline tests show that HANA scaled efficiently (better-than-linear) as the data volumes increased for a given query.
Throughput
The throughput tests are summarized in Table 2 and show that, in the face of increasing and mixed BI workloads, HANA scales very well.
Test Case      Throughput (Queries per Hour)
1 stream       7,547
10 streams     57,202
20 streams     89,123
30 streams     103,189
40 streams     108,849
50 streams     110,994
60 streams     112,602
Table 2 - Throughput Tests
At 50 streams, the average query response time was 1.6 seconds, only 3.4X the single-stream baseline (see Appendix), an indicator of HANA’s excellent internal efficiencies and its ability to manage concurrency and mixed workloads.
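As a back-of-the-envelope consistency check (not part of the original test), Little's law (concurrency = throughput × response time) relates the Table 2 throughput to this response time:

```python
# Little's law: L = lambda * W (in-flight queries = arrival rate * response time)
QUERIES_PER_HOUR = 110_994   # 50-stream throughput from Table 2
AVG_RESPONSE_S = 1.6         # average response time at 50 streams

arrival_rate = QUERIES_PER_HOUR / 3600     # ~30.8 queries/second
in_flight = arrival_rate * AVG_RESPONSE_S  # queries concurrently in flight
print(f"{in_flight:.0f} concurrent queries")
```

The result comes out close to 50, matching the 50 streams driving the load.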
A rough estimate of BI user concurrency can be derived by dividing total queries per hour by an estimated average number of queries per user per hour. For instance, dividing the 60-stream rate of 112,602 queries per hour by 20 (a brisk per-user rate of one query every three minutes) yields a reasonable estimate of 5,630 concurrent BI users across a mixture of reporting, drill-down and analytic query types.
Results Summary
In these performance tests, HANA demonstrated the ability to deliver real-time BI query performance as workloads increase in terms of capacity (up to 1 PB of raw data), complexity (queries with complex join constructs and significant intermediate results typically ran in under 2 seconds), and concurrency (60-stream throughput representing roughly 5,600 active users).
HANA’s performance is based on comprehensive, well-thought-out design elements, including, for example:
• an in-memory design
• smart internal data structures (e.g., native columnar storage, advanced compression and powerful partitioning)
• a clever, cost-based optimizer that can meld these smart data structures into an efficient query plan
• efficient query execution that smartly executes the query plan to take advantage of internal components (e.g., advanced algorithms, multi-level caching, optimization in the CPU and hyper-threading).
Much has been written on HANA’s design, and it is not necessary to repeat it here. For more information on HANA’s technical features, please refer to “SAP HANA Technical Overview” and “SAP HANA for Next-Generation Business Applications and Real-Time Analytics”.
Real-world Experiences
These tests were run in a simulated environment to isolate HANA performance7; however, you can expect your own BI query performance to be significantly better on HANA than on your existing conventional DBMS, potentially thousands of times better. Most importantly, SAP customer testimonials confirm HANA’s performance. Here are a few examples.
7 Disclaimer: These performance tests were designed to isolate HANA system-level performance independent of the variety of applications that may be used in SAP environments.
“We have seen massive system speed improvements and increased ability to analyze the most detailed levels of customers and products.” –Colgate Palmolive
“…replacing our enterprise-wide Oracle data mart and resulting in over 20,000X speed improvement processing our most complex freight transportation cost calculation. ...our stand-alone mobile applications that were previously running on Oracle are now running on HANA. Our 2000 local sales representatives can now interact with real-time data instead, and have the ability to make on-the-fly promotion decisions to improve sales.” –Nongfu Spring
“…our internal technical comparison demonstrated that SAP HANA outperforms traditional disk-based systems by a factor of 408,000...”–Mitsui Knowledge Industry Co., Ltd.
“With SAP HANA, we see a tremendous opportunity to dramatically improve our enterprise data warehouse solutions, drastically reducing data latency and improving speed when we can return query results in 45 seconds from BW on HANA vs. waiting up to 20 minutes for empty results from BW on a traditional disk-based database.” –Shanghai Volkswagen
Conclusion
Real-time BI query environments should be able to support new queries as soon as users formulate them. This speed-of-thought capability is only possible if consistently excellent performance is freely available in the underlying database platform.
The rigorous performance tests and results presented here confirm that, without tuning and without massive or proprietary hardware components, HANA delivers leading performance and scale-out ability, enabling real-time BI for businesses that must support a range of analytic workloads, massive volumes of data and thousands of concurrent users.
These test results should nevertheless be used only as a general guideline; results will vary from implementation to implementation due to the variability of platforms, applications and data.
Appendix – Query Runtimes
* 10-, 20-, 30-, 40-, 50-, and 60-stream runs reflect 10 ms of think time between queries.
Average Runtime (seconds)
Query Name  1 stream  10 streams  20 streams  30 streams  40 streams  50 streams  60 streams
R1-1        0.378     0.479       0.653       0.847       1.006       1.180       1.436
R1-2        0.462     0.639       0.782       1.015       1.248       1.613       1.828
R1-3        0.584     0.810       0.990       1.289       1.742       2.043       2.424
D3          0.244     0.315       0.448       0.590       0.721       0.862       1.020
R2          0.169     0.238       0.331       0.461       0.568       0.696       n/a
R3          0.421     0.647       0.758       1.045       1.349       1.695       2.122
A1-1        0.770     0.984       1.443       1.744       2.294       2.864       3.338
A1-2        0.943     1.557       1.841       2.640       3.104       3.936       4.459
A1-3        1.172     1.961       2.425       3.366       3.978       4.653       5.347
R4-1        0.605     0.670       0.769       0.935       1.141       1.379       1.696
R4-2        0.635     0.740       0.815       1.023       n/a         1.572       1.927
R5-1        0.592     0.709       0.889       1.138       1.494       1.902       2.295
R5-2        0.568     0.854       1.073       1.384       1.738       2.188       2.563
A2-1        1.142     1.337       1.755       2.348       3.345       4.198       4.878
A2-2        1.246     1.591       2.118       2.725       3.558       4.442       5.197
A2-3        3.201     3.477       5.010       5.778       7.785       9.945       11.575
D1          0.208     0.298       n/a         0.518       0.641       0.727       0.861
D2          0.157     0.248       0.349       0.466       0.574       0.658       0.784